The journal · field notes · Mar 28, 2026 · 6 min read

The first 90 days of a junior engineer in 2026.

Notes from a few dozen conversations with engineering managers onboarding the first cohort that learned to code after the AI tooling wave. The patterns are consistent. So are the fixes.

[Figure: a 90-day onboarding timeline with three marked inflection points at weeks one, six, and twelve]

Over the past six months we've spoken with several dozen engineering managers about one specific subject: what it's like to onboard a junior engineer in 2026. These are the first hires who did most or all of their learning after agentic coding tools became the default way to write software. The conversations were informal and off the record, so what follows is pattern, not data. But the patterns repeat with a consistency that surprised us, and they're worth writing down.

Week one: the best first PRs anyone has ever seen

Almost every manager opened with a version of the same observation: the first pull requests from this cohort are remarkable. Clean, idiomatic, well-tested, often shipped on day two or three instead of week two or three. One manager told us her most recent junior closed a ticket in his first week that she had privately budgeted a month for.

This is real and it should be said plainly: these engineers are productive immediately in a way no previous cohort was. The agent handles the codebase archaeology, the framework conventions, the test scaffolding. The new hire steers. The output is good.

The trouble is that week one is the high-water mark of the manager's confidence, and the next inflection point runs the other way.

Week six: the debugging cliff

The second pattern shows up when something breaks in a way the agent can't immediately resolve. A flaky integration test. A production incident with a misleading stack trace. A bug that lives in the interaction between two services rather than in any single diff.

Managers described the same scene independently: the junior pastes the error into the agent and applies the suggested fix; the fix doesn't work, so they paste the new error, apply the next fix, and loop. What's missing is the thing debugging actually consists of: forming a hypothesis about the system, finding the cheapest way to test it, and updating. Several managers used nearly identical words: "they've never had to sit with a problem."

This isn't a character flaw and it isn't fragility. It's exactly what our 12-month study predicted at the cohort level: juniors showed the steepest retention gaps on AI-authored code, because the cognitive steps that build a mental model of the system (retrieval, elaboration, self-explanation) were never required of them. You can't debug a system you never had to model.

The vocabulary gap

A third pattern, subtler than the first two: vocabulary outruns understanding. The new cohort talks fluently about idempotency keys, connection pooling, write-ahead logs, and backpressure, because the agent narrates its work using those words and the words stick. Several managers said they were initially fooled by this in interviews and early one-on-ones.

The tell comes when you ask a question one level beneath the term. "You said the retries are idempotent. What makes them idempotent?" The answer is often a restatement of the word rather than a description of the mechanism. One manager started calling this "fluent surface": the engineer sounds two levels more senior than they can operate.
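To make the gap concrete, here is a minimal sketch of the kind of answer the second question is looking for. It assumes a hypothetical in-memory payment handler; the names `PaymentService`, `charge`, and `idempotency_key` are illustrative and not taken from any particular library. The mechanism is that the server stores the result of the first request under a client-supplied key, so a retry with the same key returns the stored result instead of repeating the side effect.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class PaymentService:
    """Hypothetical in-memory service, for illustration only."""

    balance_cents: int
    # Results of completed requests, indexed by the client-supplied key.
    _completed: dict[str, int] = field(default_factory=dict)

    def charge(self, idempotency_key: str, amount_cents: int) -> int:
        # A retry carries the same key, so return the stored result
        # instead of applying the side effect a second time.
        if idempotency_key in self._completed:
            return self._completed[idempotency_key]
        self.balance_cents -= amount_cents
        self._completed[idempotency_key] = self.balance_cents
        return self.balance_cents


service = PaymentService(balance_cents=10_000)
key = str(uuid.uuid4())  # generated once per logical operation, reused on retry

first = service.charge(key, 2_500)
retry = service.charge(key, 2_500)  # e.g. the client timed out and resent
```

Here the retry leaves the balance where the first call put it, because the second call never reaches the deduction. A production version would also have to persist the key atomically with the side effect and handle concurrent requests carrying the same key; this sketch ignores both.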

A related tell: code review replies that are clearly pasted from the agent. A reviewer asks why a lock was taken in a particular order, and the reply arrives in thirty seconds, four paragraphs long, in a register no human uses in a review thread. The reviewer learns nothing about what the author understands, which defeats the entire purpose of review as a teaching channel.

What the good managers are doing about it

The encouraging half of these conversations: a handful of managers have converged, mostly independently, on the same small set of fixes. None of them involve banning the tools, which everyone agreed would be both futile and counterproductive. All of them involve putting structured friction back into the loop at chosen moments.

1. Explain before merge

The simplest intervention: for any non-trivial agent-authored diff, the junior writes two or three sentences in the PR description explaining what the change does and why, before requesting review. Reviewers are told to read the explanation first and the diff second. If the explanation doesn't match the diff, that's the review conversation. Managers who do this report that it surfaces, within days, misunderstandings that would otherwise surface in incidents.
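Teams that want a mechanical nudge alongside the norm could sketch a pre-review check along these lines. This is an illustration under stated assumptions, not something the managers described: the function names `count_sentences` and `has_explanation`, the two-sentence threshold, and the idea of running it before review is requested are all hypothetical. It only checks that an explanation of roughly the right size exists; whether the explanation matches the diff is still the reviewer's job.

```python
import re

MIN_SENTENCES = 2  # hypothetical threshold, taken from "two or three sentences"


def count_sentences(text: str) -> int:
    """Roughly count sentences: split on . ! or ? followed by whitespace or end of text."""
    parts = re.split(r"[.!?](?:\s+|$)", text.strip())
    return sum(1 for part in parts if part.strip())


def has_explanation(pr_description: str) -> bool:
    """Return True if the description contains at least MIN_SENTENCES sentences."""
    return count_sentences(pr_description) >= MIN_SENTENCES
```

A description such as "Fix bug." would fail this check, while two full sentences about what changed and why would pass. The sentence count is deliberately crude; it is a prompt to write the explanation, not a judge of its quality.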

2. Fly-solo Fridays

One protected block per week, usually a half day, where the junior works without agent assistance on a real ticket scoped to fit. The point isn't throughput on that day; it's the trend. In our study, twelve weeks of weekly solo practice improved unassisted throughput by 24%, against 4% for the control group. Managers running a version of this said that the first few sessions are demoralising, and that by week five the engineers themselves ask to keep them.

3. Pairing on reading, not writing

Traditional pairing has the senior watch the junior write. Several managers have inverted it: the pair reads an unfamiliar part of the codebase together, agent closed, and the junior narrates what they think each piece does before checking. Reading code you didn't write, and predicting its behaviour, is the muscle the agent workflow exercises least.

4. Asking the second question

The cheapest fix of all: in reviews and one-on-ones, ask the question one level beneath the vocabulary. Not to catch anyone out, but because the gap between the word and the mechanism is exactly where the learning needs to happen, and the junior usually can't see the gap themselves.

The part that's our job

Everything above can be done with management discipline and zero new tooling, and teams are doing it. The honest limitation is that it depends on heroic, consistent effort from individual managers, and it treats every junior identically regardless of what they individually do and don't understand.

That's the gap aiworklab is built for: explain-to-merge as a default rather than a norm someone has to enforce, friction metered by a per-person skill graph rather than applied uniformly, fly-solo sessions with a solo-throughput trend line the engineer can watch move. The managers in these conversations were, in effect, hand-building the product. We'd like to give them the instrument instead.

If you're onboarding this cohort and seeing patterns we missed, or fixes that work better than these, write to team@aiworklab.com. Field notes improve with more fields.