Active Directory security · pre-registered study
The remediation ranking problem
Security teams are routinely advised to rotate credentials, lock down delegation, and segment the network, but rarely told in what order. We applied each of these interventions at graded coverage on a live Active Directory forest and measured the real change in attack surface. At full coverage, all three completely close the reachable-and-undetected surface — they are equally effective. Because the effects are equal, the only thing that distinguishes them is implementation cost, and on that basis credential rotation removes the same risk as network segmentation for roughly a sixth of the effort.
01 The question
The standard hardening advice for Active Directory is a list, not a ranking. A team with limited time has to choose what to do first, and the usual instinct is to rank by raw impact — to lead with the intervention that closes the most attack paths. We wanted to test whether that instinct is right, by measuring the actual effect of each intervention and weighing it against what it costs to deploy.
02 How we measured it
We used a validated counterfactual engine to apply three interventions — credential rotation, delegation lockdown, and network segmentation — at graded coverage across the held-out defensive postures on a live forest. For each, we executed the real attack before and after and recorded the change in the reachable-and-undetected surface. We scored cost using a coarse implementation-effort tier: low counts as one unit, medium as three, high as six. The value of an intervention is its risk reduction divided by its effort.
03 What we found
At full coverage, every intervention drove the reachable-and-undetected surface to zero. Their absolute effects are identical. This is the key observation: because the interventions are equally effective, ranking them by impact is uninformative, and the decision collapses entirely onto cost.
[Figure: interventions grouped by effort tier: low cost (best value), medium cost, high cost.]
Ranked by value — risk reduction per unit of effort — the order is clear and well separated.
| intervention | effort tier | mean risk reduction | per unit effort | n (environments) |
|---|---|---|---|---|
| credential rotation | low (1) | 1.00 | 1.000 | 26 |
| delegation lockdown | medium (3) | 1.00 | 0.333 | 43 |
| network segmentation | high (6) | 1.00 | 0.167 | 2 |
The interventions also differ in how they respond to partial deployment. Credential rotation and delegation lockdown are linear in coverage: hardening a third of the accounts closes about a third of the surface, so partial work buys proportional protection. Segmentation, by contrast, is effectively all-or-nothing.
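To make the value metric of section 02 and the dose-response distinction above concrete, here is an added Python example (not the study's own code) that computes the effort-adjusted ranking from constructed per-environment before/after records and classifies each intervention's partial-coverage behaviour. The record layout, the tier-to-intervention mapping and the 0.1 linearity tolerance are assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean

# Effort tiers as defined in the study: low = 1, medium = 3, high = 6 units.
EFFORT = {"low": 1, "medium": 3, "high": 6}
TIER = {"credential rotation": "low",
        "delegation lockdown": "medium",
        "network segmentation": "high"}

# Constructed sample records, not the study's data. One row per environment and
# coverage level: (intervention, environment, coverage, surface before, surface after).
RECORDS = [
    ("credential rotation", "env-a", 1.0, 30, 0),
    ("credential rotation", "env-a", 0.33, 30, 20),
    ("credential rotation", "env-b", 1.0, 12, 0),
    ("credential rotation", "env-b", 0.5, 12, 6),
    ("delegation lockdown", "env-a", 1.0, 30, 0),
    ("delegation lockdown", "env-a", 0.5, 30, 15),
    ("network segmentation", "env-c", 1.0, 40, 0),
    ("network segmentation", "env-c", 0.66, 40, 40),
]

def risk_reduction(before, after):
    """Fraction of the reachable-and-undetected surface closed; an empty surface counts as no change."""
    return 0.0 if before == 0 else (before - after) / before

def rank_by_value(records):
    """Rank by mean full-coverage reduction per effort unit, best first: (name, mean, value, n)."""
    full = defaultdict(list)
    for name, _env, coverage, before, after in records:
        if coverage == 1.0:
            full[name].append(risk_reduction(before, after))
    rows = []
    for name, reductions in full.items():
        m = mean(reductions)
        value = m / EFFORT[TIER[name]]
        rows.append((name, round(m, 3), round(value, 3), len(reductions)))
    return sorted(rows, key=lambda r: r[2], reverse=True)

def dose_response(records, name, tol=0.1):
    """'linear' if partial reduction tracks coverage, 'all-or-nothing' if partial coverage closes nothing."""
    partial = [(c, risk_reduction(b, a))
               for n, _e, c, b, a in records if n == name and c < 1.0]
    if not partial:
        return "unmeasured"
    if all(abs(r - c) <= tol for c, r in partial):
        return "linear"
    if all(r == 0 for _c, r in partial):
        return "all-or-nothing"
    return "mixed"
```

With the constructed records the ranking reproduces the study's order (rotation, lockdown, segmentation at 1.000, 0.333 and 0.167 per unit effort) and segmentation is classified as all-or-nothing.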
04 What this means
When two remediations achieve the same result, the only rational basis for choosing between them is cost. A program that leads with network segmentation spends its most expensive resource — a network re-architecture — to achieve a closure it could have obtained from a password rotation. The linear dose-response adds a practical point: because credential rotation pays off proportionally, a team does not need to finish to benefit, which is rarely true of segmentation.
Read together with our structural study, the recommendation is consistent. The reachable surface is driven by permission and delegation density, and credential rotation closes the credential-reuse portion of it at the lowest cost available. It is a reasonable default to do first.
05 Limitations
The segmentation arm has only two environments, so its full-coverage result is sound but its dose-response is unmeasured. Because the full-coverage effects are deterministic — every intervention closes the surface completely, with no variance — a standardized effect size is not meaningful here, and we rank by effort-adjusted value instead.
The cost tiers are a coarse heuristic, not measured person-hours, so the value ranking inherits that coarseness. Absolute risk reduction is also specific to this forest and will not transfer unchanged to an environment with a different baseline surface. Only these three interventions are validated counterfactuals; others would be estimates.
Real before-and-after attack execution on a live multi-domain forest. The model is a frozen black box, and the analysis plan was committed before the run. Every measurement is reconstructable from the per-environment records.