Randomisation

Random assignment is what makes A/B testing work. Without it, control and variant are just two non-equivalent groups made up of whatever traffic happened to land in each. With it, the groups are statistically interchangeable in expectation, so any outcome difference larger than chance can explain is plausibly the effect of the change.

The mechanism is usually a hash. Take the user ID (or session ID, or visitor ID from a cookie), hash it to a number, take that number modulo 100, and assign a variant based on the resulting bucket (for example, buckets 0-49 to control and 50-99 to the variant for a 50/50 split). The same input always gives the same hash, so the same user keeps getting the same variant across visits.
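The hash-and-bucket step can be sketched in a few lines of Python. This is a minimal illustration, not any particular platform's implementation: the function names, the choice of SHA-256, and the idea of mixing an experiment name into the key (so that buckets are not identical across experiments) are all assumptions made for the example.

```python
import hashlib

def bucket(unit_id: str, experiment: str, buckets: int = 100) -> int:
    """Map an ID to a stable bucket in [0, buckets)."""
    # Assumption: the experiment name is mixed into the key so the same
    # user does not land in the same bucket in every experiment.
    key = f"{experiment}:{unit_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Read the first 8 bytes as an unsigned integer, then reduce modulo buckets.
    return int.from_bytes(digest[:8], "big") % buckets

def assign(unit_id: str, experiment: str, control_share: int = 50) -> str:
    """Buckets below control_share go to control, the rest to the variant."""
    return "control" if bucket(unit_id, experiment) < control_share else "variant"

if __name__ == "__main__":
    # The same ID always maps to the same arm.
    print(assign("user-123", "checkout-button"))
    print(assign("user-123", "checkout-button"))
```

A cryptographic hash is used here rather than Python's built-in hash(), because the built-in is randomised per process for strings by default and would not give the same bucket across restarts.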

What “consistent” assignment means in practice

A user assigned to variant B on Monday should still see variant B on Friday. Otherwise the visitor sees both experiences, which confuses them and corrupts the test, because their behaviour can no longer be attributed to a single variant. The mechanisms that preserve consistency:

Cookie-based assignment - works as long as the cookie sticks. Falls apart across devices, after cookie clearing, and in private windows.
User-ID based assignment - works for logged-in users. Bulletproof in product analytics, useless for pre-login traffic.
Hybrid - cookie for anonymous, user-ID after login, with a stitching mechanism to reconcile. Standard in most modern A/B testing platforms.
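One possible stitching policy for the hybrid approach is sketched below. It is an assumption for illustration only: the in-memory dictionary stands in for whatever persistent store a real platform would use, and the rule "keep hashing the ID the user was first bucketed with" is just one of several reasonable choices. The names stitched and assignment_key are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical stitching store: user ID -> the ID first used to bucket that user.
# A real system would persist this; a dict keeps the sketch self-contained.
stitched: Dict[str, str] = {}

def assignment_key(cookie_id: str, user_id: Optional[str]) -> str:
    """Return the ID that should be hashed to pick a variant."""
    if user_id is None:
        # Anonymous traffic: the cookie is the only stable handle available.
        return cookie_id
    # First login seen for this user: remember the cookie ID they were
    # bucketed with, so the pre-login variant carries over after login.
    # Later logins (including from other devices) reuse that stored ID.
    return stitched.setdefault(user_id, cookie_id)

if __name__ == "__main__":
    print(assignment_key("cookie-a", None))      # anonymous on device A
    print(assignment_key("cookie-a", "user-1"))  # logs in on device A
    print(assignment_key("cookie-b", "user-1"))  # logs in on device B
```

Under this policy a user can still see a different variant on a second device before logging in there, which is the cross-device gap that cookie-based assignment cannot close on its own.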

Sample ratio mismatch

If you assign 50/50 and 100,000 visitors arrive, you’d expect roughly 50,000 per variant. Chance alone moves each count by only a few hundred (the standard deviation is about 158), so if the split is 47,000 / 53,000, something is broken. The usual causes:

The variant takes longer to load and bots or impatient visitors bail before assignment is recorded
A redirect-based test loses traffic in the redirect itself
Assignment happens after rather than before a tracking event, so the variant gets undercounted
Bots are routed differently from humans by an intermediate CDN rule

The sample ratio mismatch (SRM) check is the most important diagnostic. Every platform should run it automatically and flag any split that is statistically significantly off the planned ratio. A test with SRM is invalid regardless of the result.
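An automated check of this kind is commonly done with a chi-squared goodness-of-fit test on the observed counts. The sketch below uses only the standard library and covers the two-arm case; the function name srm_check and the 0.001 significance threshold are assumptions chosen for the example, not values taken from any specific platform.

```python
import math

def srm_check(control: int, variant: int,
              expected_control_share: float = 0.5,
              alpha: float = 0.001):
    """Chi-squared goodness-of-fit test for a two-arm split.

    Returns (chi2, p_value, flagged).
    """
    total = control + variant
    expected_control = total * expected_control_share
    expected_variant = total - expected_control
    chi2 = ((control - expected_control) ** 2 / expected_control
            + (variant - expected_variant) ** 2 / expected_variant)
    # With two arms there is one degree of freedom, and the chi-squared
    # tail probability reduces to erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value, p_value < alpha

if __name__ == "__main__":
    print(srm_check(47_000, 53_000))  # the split from the example above
    print(srm_check(49_900, 50_100))  # a split well within chance
```

For the 47,000 / 53,000 split the statistic is 360, far beyond any conventional threshold, while a 49,900 / 50,100 split gives 0.4 and is not flagged.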

Things people get wrong

IP-based assignment. Multiple users behind one corporate or household IP get the same variant. Breaks randomisation in B2B and shared-network contexts.
Time-based bucketing. Splitting traffic by hour of day means day-of-week and time-of-day effects map directly onto variant. Not random.
Forgetting to lock assignment before the user’s first interaction. If the variant changes mid-session because assignment was racing against the page render, the user sees both and the data is junk.
Ignoring that the unit of randomisation needs to match the unit of analysis. Randomise on users but measure on sessions, and sessions from the same user are correlated, so a variance estimate that treats them as independent is wrong, usually too small (see ratio metrics).
Ignoring SRM warnings because the test “looks like it’s working”. Whatever is causing the SRM usually also affects the outcome metric, so the result is not trustworthy.
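The unit-mismatch point in the list above can be made concrete with a small simulation. The sketch below generates synthetic users (the session counts and per-user conversion propensities are invented for the example), then compares a naive per-session variance for the conversion rate with a user-level estimate based on the delta method. The delta method is one common way to handle this kind of ratio metric; it is named here as an illustration, not as the approach any given platform uses.

```python
import random

def covariance(xs, ys):
    """Sample covariance; covariance(xs, xs) is the sample variance."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def simulate_users(n_users, seed=0):
    """Synthetic data: each user has their own conversion propensity."""
    rng = random.Random(seed)
    sessions, conversions = [], []
    for _ in range(n_users):
        n = rng.randint(1, 10)       # sessions for this user
        p = rng.uniform(0.0, 0.5)    # this user's per-session propensity
        sessions.append(n)
        conversions.append(sum(rng.random() < p for _ in range(n)))
    return sessions, conversions

def naive_variance(sessions, conversions):
    """Treats every session as independent (the mistake described above)."""
    rate = sum(conversions) / sum(sessions)
    return rate * (1 - rate) / sum(sessions)

def delta_variance(sessions, conversions):
    """Delta-method variance of sum(conversions) / sum(sessions) over users."""
    n = len(sessions)
    rate = sum(conversions) / sum(sessions)
    mean_sessions = sum(sessions) / n
    per_user = (covariance(conversions, conversions)
                - 2 * rate * covariance(conversions, sessions)
                + rate ** 2 * covariance(sessions, sessions))
    return per_user / (n * mean_sessions ** 2)

if __name__ == "__main__":
    s, c = simulate_users(2000)
    print("naive:", naive_variance(s, c))
    print("delta:", delta_variance(s, c))
```

In data like this, where sessions from the same user share a propensity, the user-level estimate comes out larger than the naive one, which is why session-level analysis of a user-randomised test tends to overstate significance.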