Active Directory security · pre-registered, now run

Step-by-step: training the sequential world model

Can a model learn P(next_state | current_state, action) for Active Directory attack trajectories? We ran the pre-registered experiment on 200 real GOAD kill chains. The model clears the frozen 0.70 step-accuracy bar, scoring 0.87 on held-out trajectories, but a stricter test that holds out entire defensive postures drops it to 0.54, below the by-class baseline of 0.70. The honest reading: it predicts the configurations it has seen, and does not yet generalize to ones it hasn't.

object: a sequential transition model
bar: ≥ 0.70 step-level accuracy
result: 0.87 (passes) · 0.54 on unseen postures (fails)

01 From a snapshot to a sequence

Our earlier work validated a static read of a network: given a topology and a defensive posture, a model ranks which crown jewels are reachable and how loud each path would be. That model has no sense of time. It cannot tell you what the attacker controls after their next move. The sequential world model is the next object — it learns the transition itself: given the attacker's current state and a chosen action, what is the distribution over the next state, and does this step reach the crown or trip a detector?

02 What we built

The system represents the attacker's situation as an explicit, evolving state — what has been compromised, which credentials are held, where the attacker currently stands, and how much detection risk has accumulated — and predicts how a chosen action changes that state and whether it reaches a privileged target or trips a detector. Each prediction carries its own measure of confidence, so the model can signal when it is uncertain rather than guess. It is framed deliberately for defenders: the objective it optimizes is a defender's, and it has no representation of an attacker's goal to reach Domain Admin.
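
One way to picture that state and its predictions is the sketch below. It is illustrative only: the names `AttackerState`, `Prediction`, and `Outcome`, the field types, and the use of the top outcome probability as the confidence measure are all assumptions, not the model's actual representation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, FrozenSet


class Outcome(Enum):
    """The three step outcomes named in the pre-registration."""
    CONTINUE = "continue"
    SUCCESS = "success"
    DETECTED = "detected"


@dataclass(frozen=True)
class AttackerState:
    """Hypothetical explicit state; field names are assumptions."""
    compromised_hosts: FrozenSet[str]
    held_credentials: FrozenSet[str]
    current_host: str
    detection_risk: float  # accumulated detection risk, assumed to lie in [0, 1]


@dataclass
class Prediction:
    """Predicted next state plus a distribution over step outcomes."""
    next_state: AttackerState
    outcome_probs: Dict[Outcome, float]

    def outcome(self) -> Outcome:
        # The most probable outcome is the model's call for this step.
        return max(self.outcome_probs, key=self.outcome_probs.get)

    def confidence(self) -> float:
        # Assumed confidence measure: probability of the most likely outcome.
        return max(self.outcome_probs.values())
```

Under this framing, a low `confidence()` value is the signal that the model is uncertain about a step rather than guessing.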

The model has been pre-trained on a thousand synthetic trajectories. That is a starting point, not a result: the synthetic trajectories come from our own oracle, so scoring the model against them is in-distribution and proves nothing about the real world.

03 The prediction we registered

Before collecting any real data, we wrote down the bar. On held-out real GOAD trajectories — networks and postures the model never trained on — the model must predict the correct next-step outcome (continue, success, or detected) at 70% of steps or better. Stating that threshold in advance is what will let the eventual number mean something, whether it passes or fails.
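
The registered metric can be written down in a few lines. This is a minimal sketch: the function names `step_accuracy` and `clears_bar` are hypothetical, and outcomes are assumed to be plain strings; only the three outcome labels and the 0.70 threshold come from the pre-registration.

```python
BAR = 0.70  # pre-registered step-accuracy threshold


def step_accuracy(predicted, actual):
    """Fraction of steps whose predicted outcome matches the real outcome.

    Both arguments are equal-length sequences of outcome labels
    ("continue", "success", or "detected"), one entry per step.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot score an empty set of steps")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def clears_bar(accuracy, bar=BAR):
    """True when accuracy meets the bar ("70% of steps or better")."""
    return accuracy >= bar
```

For example, calling three of four steps correctly gives `step_accuracy(...) == 0.75`, which `clears_bar` accepts.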

built · model + harness: the model, its training pipeline, and the evaluation harness.

built · synthetic pre-training: 1,000 synthetic kill chains; a diagnostic, not a result.

done · real trajectories: 200 real multi-step runs collected on the live GOAD forest, 25 per held-out posture; 40 held out for scoring.

Figure 1. All three boxes are now done. Only the third, the 40 held-out real trajectories, is scored against the bar.

04 How it will be measured

We extend the real-execution harness to record the intermediate states of each attack, not only its terminal outcome, turning every real run into a sequence of steps. We then collect two hundred real trajectories across the held-out postures and score, step by step, whether the model called the next outcome correctly. If the harness cannot record intermediate states on a live forest, we report the experiment as not run and state the blocker plainly — never a fabricated number, and never the synthetic pre-training accuracy dressed up as the real result.
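
Turning a recorded run into scoreable steps might look like the sketch below. The `Step` record, the `run_to_steps` helper, and the labelling rule (every intermediate step is "continue" and the final step carries the run's terminal outcome) are assumptions made for illustration; the harness's actual recording format is not described here.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    """One transition in a recorded run (hypothetical record layout)."""
    state: str
    action: str
    next_state: str
    outcome: str  # "continue", "success", or "detected"


def run_to_steps(states: List[str], actions: List[str],
                 terminal_outcome: str) -> List[Step]:
    """Split one recorded run into per-step transitions.

    `states` holds the state before each action plus the final state,
    so it must be exactly one element longer than `actions`.
    """
    if len(states) != len(actions) + 1:
        raise ValueError("need exactly one more state than actions")
    last = len(actions) - 1
    steps = []
    for i, action in enumerate(actions):
        # Assumed rule: only the last step carries the terminal outcome.
        outcome = terminal_outcome if i == last else "continue"
        steps.append(Step(states[i], action, states[i + 1], outcome))
    return steps
```

A run with two actions therefore yields two scored steps, one labelled "continue" and one labelled with how the run ended.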

pre-registration

The full protocol — the research question, the ≥ 0.70 step-accuracy bar, the 200-trajectory design, and the rule for reporting a blocked run rather than fabricating a result — was committed before any real data existed. Stating it in advance is what lets the result, pass or fail, carry weight.

05 What we found

We collected two hundred real trajectories on the live forest, with no run blocked, and scored the model, step by step, on the forty we held out. On that held-out set it called the next outcome correctly at 87% of steps (95% CI 0.78–0.94), beating an always-“continue” baseline of 0.75 and correctly flagging most of the rare success and detection steps. By the letter of the pre-registration, that clears the 0.70 bar. We are reporting it as a pass — and refusing to leave it there.
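
The baseline and an interval of this kind can be computed as sketched below. The text does not say how the reported interval was derived, so the Wilson score interval here is an assumption, as are the function names. It also treats steps as independent, which steps inside one trajectory are not, so a trajectory-level bootstrap may be the more appropriate choice.

```python
import math


def always_continue_accuracy(actual):
    """Accuracy of a baseline that predicts "continue" at every step."""
    if not actual:
        raise ValueError("cannot score an empty set of steps")
    return sum(a == "continue" for a in actual) / len(actual)


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need 0 <= correct <= total and total > 0")
    p = correct / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half
```

Comparing the model against `always_continue_accuracy` matters because "continue" dominates the step labels: a model can score well on raw accuracy without ever flagging a success or a detection.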

Because our oracle is near-deterministic for a given posture and kill chain, a held-out trajectory usually shares its configuration with one in training. So we ran the harder test the question really demands: hold out entire postures, train on the rest, and predict the ones never seen. Accuracy fell to 0.54, below the by-class baseline of 0.70, and the model got every kerberoast step on the unseen postures wrong. It had not learned how locking delegation or segmenting the network flips a kerberoast's outcome; it had memorized the postures it saw. Credential reuse transferred better (0.79), because its rule is simpler.
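
The difference between the two splits can be made concrete with a small sketch. The dictionary keys `posture` and `kill_chain` and both helper names are hypothetical; the point is only that a posture-level split guarantees no configuration appears on both sides, while a trajectory-level split does not.

```python
def split_by_posture(trajectories, held_out_postures):
    """Hold out every trajectory from the given postures.

    Each trajectory is assumed to be a dict with a "posture" key.
    Returns (train, test); no posture appears on both sides.
    """
    held = set(held_out_postures)
    train = [t for t in trajectories if t["posture"] not in held]
    test = [t for t in trajectories if t["posture"] in held]
    return train, test


def shared_configs(train, test):
    """Configurations (posture, kill chain) present in both splits.

    A non-empty result means test trajectories repeat a configuration
    the model already saw during training.
    """
    def config(t):
        return (t["posture"], t["kill_chain"])

    return {config(t) for t in train} & {config(t) for t in test}
```

Running `shared_configs` on a trajectory-level split is a quick leakage check; on a posture-level split it returns an empty set by construction.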

the honest bottom line

The sequential world model passes its pre-registered bar but does not yet generalize to unseen defensive postures. A real-world claim needs more diverse real postures and networks, and a pre-registered held-out-config bar rather than a held-out-trajectory one. We report both the pass and the failing stress test in full.

06 Limitations we already know

scope

Synthetic pre-training may bias the model toward GOAD-like trajectory patterns. Real enterprise Active Directory can have a different transition distribution — timing, partial failures, defender response, noisy detection — that our synthetic oracle does not capture.

scope

The real oracle exercises a small set of high-signal techniques. Step-level accuracy on those should not be assumed to carry over to stealthier techniques, and a lab forest is not enterprise scale. A pass on GOAD is necessary, not sufficient: enterprise validation needs a design partner.

honesty

This page began as a pre-registration, not a finding. We published the question and the bar in advance so that the work was on record and the result, pass or fail, could not be reverse-fit to a threshold chosen after the fact.