The teaching layer for agentic coding

AI Work Lab — Ship code. Master the craft.

Agentic coding tools make you faster. Most of them also make you worse at your job. aiworklab is the engineering workspace that closes tickets and closes the skill gap, built on the coding agents you already trust.

Join the early-access waitlist → · See how it works

Private beta now · GA early Q4 2026

BYO model · cloud or local

Built on Claude Code, Codex, T3 Code, OpenCode

Local-first by default

auth/middleware.ts · +8 -2 jwt lifecycle

  if (req.headers.authorization) {
-   const token = req.headers.authorization.split(' ')[1];
-   return jwt.verify(token, SECRET);
+   const t = parseBearer(req.headers.authorization);
+   try { return jwt.verify(t, SECRET); }
+   catch (e) {
+     if (e.name === 'TokenExpiredError' && req.cookies.refresh) {
+       return rotate(req.cookies.refresh);
+     }
+     throw new Unauthorized();
+   }
  }

explain to merge · in 2 to 3 sentences, what does this change do and why?

It now distinguishes an expired access token from an invalid one. If the token is expired but a refresh cookie is present, it rotates instead of failing; otherwise it 401s. Closes the loop that was logging users out mid-session.

PASS jwt-lifecycle advanced to demonstrated
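The diff above calls parseBearer and Unauthorized without showing them. As a rough sketch of what such helpers could look like (both bodies are assumptions for illustration, not the code behind the preview):

```typescript
// Hypothetical helpers: the diff names parseBearer and Unauthorized but does
// not show them, so everything below is an assumption for illustration.
class Unauthorized extends Error {
  readonly status = 401;
  constructor(message = "Unauthorized") {
    super(message);
    this.name = "Unauthorized";
    // Keeps instanceof working when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Extracts the token from an "Authorization: Bearer <token>" header value.
// Throws Unauthorized for anything that is not a well-formed Bearer header,
// where split(' ')[1] would have silently produced undefined.
function parseBearer(header: string): string {
  const match = /^Bearer\s+(\S+)$/i.exec(header.trim());
  if (!match) throw new Unauthorized("Malformed Authorization header");
  return match[1];
}
```

The scheme match is case-insensitive because HTTP authentication scheme names are; a missing or extra token fails fast instead of reaching jwt.verify.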

Stand on the shoulders of giants

aiworklab is a teaching layer on top of the agent harnesses you already trust.

The skill atrophy crisis

The category is optimising for the wrong thing.

Every AI coding tool measures itself by lines accepted and time-to-first-diff. Nobody is measuring whether the human is getting better. Five years from now, that's going to bite.

68%

of developers in our 12-month study of 200 working engineers reported a "shipped it but couldn't rebuild it" feeling.

24 to 60 months

until the cohort that learned via tab-complete hits its first promotion review and the gap becomes visible.

spent today on tools that prove your team is getting better, not just faster. We're closing that gap.

Three modes, one workspace

Friction where it teaches. Nowhere else.

Pick a mode per task. The agent is the same; your relationship with it changes. Mastered concepts pass through silently. Novel ones get the right amount of friction at the right moment.

01 / Autopilot

Ship it.

Standard agent loop. Plans, executes, edits. The agent works, you ship. Concepts you encounter are silently logged to your skill graph for later review. No interruptions.

when · production work, infra chores, deadline mode

02 / Copilot

Ship it, but explain it.

Same speed, with one beat of friction. Before any non-trivial diff is applied, you answer a 15-second comprehension check tied to that exact change. Pass and merge. Most learning happens here.

when · default mode, real work in unfamiliar territory

03 / Coach

You write. It questions.

The agent withholds. You write the code; it reviews, points to bugs, asks Socratic questions, refuses to fix things for you. Slowest mode, deepest learning.

when · onboarding, weekly fly-solo, deload weeks

The killer feature

Every PR becomes a coaching session.

For agent-authored changes above a novelty threshold, the merge button stays disabled until you write a 2 to 3 sentence explanation of what the diff does and why. An LLM judges it against the diff.

Forces active processing

Active recall is one of the most-studied learning techniques in cognitive science. Tab-to-accept skips it entirely. We put it back.

Adapts to the user

Concepts already marked demonstrated on your skill graph never trigger a check. The friction shrinks as you grow.

Force-merge always available

When you genuinely don't have time, you skip. We log it. Your weekly retention report shows the trade-offs honestly.
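One way to picture the gate described above is as a pure decision function. This is a minimal sketch under stated assumptions: the type names, the numeric novelty score, and the synchronous judge callback are all hypothetical stand-ins, not aiworklab's actual API.

```typescript
// Illustrative only: every name, type, and threshold below is an assumption.
type ConceptStatus = "encountered" | "demonstrated";

interface Change {
  concepts: string[]; // concept tags attached to the diff
  novelty: number;    // score in [0, 1]; how it is computed is out of scope
}

interface GateInput {
  change: Change;
  skillGraph: Record<string, ConceptStatus>;
  noveltyThreshold: number;
  explanation?: string; // the 2 to 3 sentence answer, if one was written
  // Stand-in for the LLM judge; a real judge would be an async model call.
  judge: (explanation: string, change: Change) => boolean;
  forceMerge?: boolean;
}

type GateResult =
  | { merge: true; reason: "below-threshold" | "already-demonstrated" | "forced" }
  | { merge: true; reason: "passed"; advanced: string[] }
  | { merge: false; reason: "explanation-required" | "explanation-rejected" };

function explainToMergeGate(input: GateInput): GateResult {
  const { change, skillGraph, noveltyThreshold, explanation, judge, forceMerge } = input;

  // The check only applies above the novelty threshold.
  if (change.novelty <= noveltyThreshold) return { merge: true, reason: "below-threshold" };

  // Concepts already demonstrated never trigger a check.
  const pending = change.concepts.filter((c) => skillGraph[c] !== "demonstrated");
  if (pending.length === 0) return { merge: true, reason: "already-demonstrated" };

  // Force-merge is always available; the caller is expected to log the skip.
  if (forceMerge) return { merge: true, reason: "forced" };

  if (!explanation || explanation.trim() === "") {
    return { merge: false, reason: "explanation-required" };
  }
  return judge(explanation, change)
    ? { merge: true, reason: "passed", advanced: pending }
    : { merge: false, reason: "explanation-rejected" };
}
```

Keeping the function free of side effects means the same decision can be replayed later, for example when building a retention report from logged inputs.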

db/migrations.ts · +9 -1 pg advisory locks

explain to merge · what does the advisory lock buy us here?

It guarantees only one app instance runs a given migration at a time, even during a rolling deploy. The lock is per migration name, so different migrations can still run in parallel. The try/finally is critical: if the migration throws, we still release the lock instead of holding it forever.

PASS pg-advisory-locks advanced to demonstrated

Product preview with sample data. The check appears only for concepts you haven't yet demonstrated.
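For readers who want to see the pattern the sample answer describes, here is a minimal sketch of a per-migration advisory lock with try/finally. The Queryable interface, lockKey, and withMigrationLock are hypothetical names; only pg_advisory_lock and pg_advisory_unlock are real PostgreSQL functions. The key derivation (an FNV-1a-style hash of the migration name) is one possible choice, not the one used in the preview.

```typescript
// Minimal structural type for a database client. A node-postgres client is
// assumed to satisfy it, but any object with this method works.
interface Queryable {
  query(text: string, values?: unknown[]): Promise<unknown>;
}

// Derives a signed 32-bit lock key from the migration name (FNV-1a style).
// Two names that collide would simply serialize against each other.
function lockKey(name: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < name.length; i++) {
    hash ^= name.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash | 0;
}

// Runs `run` while holding a session-level advisory lock for this migration.
// Session-level locks belong to one connection, so `db` must be a single
// dedicated connection, not a pool that may hand out a different one.
async function withMigrationLock<T>(
  db: Queryable,
  migrationName: string,
  run: () => Promise<T>
): Promise<T> {
  const key = lockKey(migrationName);
  await db.query("SELECT pg_advisory_lock($1)", [key]); // blocks until acquired
  try {
    return await run();
  } finally {
    // Released even if the migration throws, so the lock is never orphaned.
    await db.query("SELECT pg_advisory_unlock($1)", [key]);
  }
}
```

Because the key comes from the migration name, instances running different migrations take different locks and do not wait on each other, while two instances attempting the same migration run one at a time.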

Bring your own model

We don't sell tokens. You don't pay markup.

Sign in with the LLM provider you already pay for, or point us at a local model running on your laptop. We never proxy or mark up inference traffic. This is a permanent design choice, not a launch limitation.

Cloud key

Paste your API key. Validated client-side and stored in the OS keychain. Nothing transits our servers.

Anthropic · OpenAI · Google · OpenRouter

OAuth sign-in

One click. We never see your raw credentials. Tokens stay on-device. The easiest path for non-power users.

Anthropic Console · OpenAI

Local model

Auto-detects models running on localhost. Works fully offline. The only tier where regulated industries can ship.

Ollama · vLLM · LM Studio

Org gateway

Your admin provides a gateway URL; the team signs in via SSO. Standard for Team and Enterprise tiers.

LiteLLM · Bedrock · Vertex
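As an illustration of how local auto-detection might work, the sketch below probes a short list of localhost endpoints and reports which ones answer. The function and type names are hypothetical, and the URLs are assumptions based on the commonly documented default ports of each server; a real setup may use different ports. The network call is injected so the logic can run without any server present.

```typescript
// Hypothetical candidate list. Ports and paths are assumed defaults,
// not values confirmed by aiworklab.
interface LocalEndpoint {
  name: string;
  url: string;
}

const CANDIDATES: LocalEndpoint[] = [
  { name: "Ollama", url: "http://localhost:11434/api/tags" },
  { name: "vLLM", url: "http://localhost:8000/v1/models" },
  { name: "LM Studio", url: "http://localhost:1234/v1/models" },
];

// Returns true if the endpoint answered successfully. Injected so that the
// detection logic does not depend on a particular HTTP client.
type Probe = (url: string) => Promise<boolean>;

// Probes every candidate in parallel and returns the names that responded.
// A probe that throws (for example, connection refused) counts as "not found".
async function detectLocalServers(
  probe: Probe,
  candidates: LocalEndpoint[] = CANDIDATES
): Promise<string[]> {
  const results = await Promise.all(
    candidates.map(async (c) => {
      try {
        return (await probe(c.url)) ? c.name : null;
      } catch {
        return null;
      }
    })
  );
  return results.filter((n): n is string => n !== null);
}

// Example wiring with the built-in fetch (not executed here):
//   detectLocalServers(async (url) => (await fetch(url)).ok)
```

Treating a thrown probe as "not found" keeps detection quiet on machines where nothing is running, which is the expected case for most users.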

For engineering leaders

Finally, an honest answer to "is the team getting better?"

An anonymised, aggregated dashboard of skill coverage across your engineering organisation. Concept retention curves. Bus-factor warnings on knowledge held by 2 or fewer engineers. The artefact that closes the budget conversation.

Tour the dashboard →

Backend guild · Q1 2027 sample data

142

concepts mastered, 12 engineers

+18%

solo throughput vs Q4

bus-factor warnings

kafka exactly-once semantics · 1 engineer

postgres logical replication · 2 engineers

terraform state surgery · 3 engineers

From early users

People who've felt the gap.

"We gave aiworklab to two teams running a head-to-head experiment. After eight weeks the coached group had higher solo throughput and better retention numbers. The other group closed more tickets. Leadership picked the wrong metric for a decade."

Jordan S.

Engineering director · Series C SaaS · Austin, TX

setup · 2 teams, 8 weeks, same backlog

coached group · solo throughput up, retention up

control group · more tickets closed, gap unchanged

takeaway · measure the human, not the tab key

skill graph

"I've been shipping TypeScript for six years. aiworklab surfaced a gap in my understanding of how the compiler resolves conditional types that I genuinely didn't know was there. That's the product working as designed, and it's uncomfortable in exactly the right way."

Thomas W.

Staff engineer · infrastructure · Berlin

coach mode

"Coach mode on a Saturday morning is the closest I've come to real deep work since Copilot became default. The agent withholding is a feature, not a bug."

Nadia K.

Senior engineer · fintech · London

explain to merge

"The explain-to-merge gate caught a race condition in a diff my agent produced that I had just accepted without thinking. That's worth the subscription alone."

Rafael M.

Founding engineer · seed-stage startup · São Paulo

skill graph

"The skill graph surfaced three concepts I'd been avoiding for months by leaning on the agent. Seeing them labeled 'encountered, not demonstrated' was genuinely humbling."

Priya L.

Mid-level engineer · platform team · Seattle

Quotes from private alpha participants, shared with permission and lightly edited for length. Roles and locations are as self-described; surnames are abbreviated for privacy.

Frequently asked

Questions, answered.

Are you another agent? Another IDE?

No. We're a teaching layer that wraps the agent harnesses you already use: Claude Code, Codex CLI, OpenCode, T3 Code. We don't rebuild the agent loop; we contribute upstream where the licence allows and add the layer above it.

Why bring-your-own-model?

Because we believe pricing for the teaching layer should be honest and predictable. Token resellers have a structural conflict between their margin and your wallet. We avoid that conflict permanently: you bring inference, we charge for the pedagogy and the org features.

Is this paternalistic?

It would be, if friction were uniform. The skill graph means concepts you've already demonstrated never trigger a check. Senior engineers feel near-zero overhead. Friction shrinks as you grow, the opposite curve from current AI tools, which get stickier the more you depend on them.

What does the org dashboard actually show?

Concept-level skill coverage across your team, anonymised and aggregated. Retention curves. Solo-throughput trend (the user-only metric). Bus-factor warnings on concepts demonstrated by 2 or fewer engineers. No source code ever leaves your boundary; only concept tags and outcomes.
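The bus-factor rule above is simple enough to sketch. Assuming the dashboard receives only an aggregated count of engineers per concept tag (the input shape and function name here are hypothetical), a warning list could be derived like this:

```typescript
// Hypothetical aggregated input: concept tag -> number of engineers who have
// demonstrated it. No names or code are needed for this calculation.
type ConceptCoverage = Record<string, number>;

interface BusFactorWarning {
  concept: string;
  engineers: number;
}

// Flags every concept demonstrated by `threshold` or fewer engineers,
// ordered so the most fragile knowledge comes first.
function busFactorWarnings(coverage: ConceptCoverage, threshold = 2): BusFactorWarning[] {
  return Object.keys(coverage)
    .filter((concept) => coverage[concept] <= threshold)
    .map((concept) => ({ concept, engineers: coverage[concept] }))
    .sort((a, b) => a.engineers - b.engineers || (a.concept < b.concept ? -1 : 1));
}

// Example with made-up counts:
const warnings = busFactorWarnings({
  "jwt-lifecycle": 5,
  "pg-advisory-locks": 2,
  "conditional-types": 1,
});
// warnings lists conditional-types (1) first, then pg-advisory-locks (2).
```

Working from counts alone is consistent with the anonymised, aggregated framing: the calculation never needs to know which engineers hold the concept.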

Where does the data live?

Local-first by default. SQLite per device. Cloud sync is opt-in for paid tiers and replicates only concept-level facts, never code. Enterprise customers can self-host the entire skill-graph store.
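To make "concept-level facts, never code" concrete, here is a minimal sketch of projecting a local record into a sync payload. The record shape and field names are assumptions for illustration; the actual schema is not described on this page.

```typescript
// Hypothetical local record, as it might sit in the per-device store.
interface LocalEvent {
  concept: string;
  outcome: "encountered" | "demonstrated" | "force-merged";
  at: string;          // ISO 8601 timestamp
  filePath: string;    // stays on the device
  diff: string;        // stays on the device
  explanation: string; // stays on the device (assumed, since it may quote code)
}

// The only shape allowed to leave the device in this sketch.
interface SyncFact {
  concept: string;
  outcome: LocalEvent["outcome"];
  at: string;
}

// Builds the payload field by field (an allow-list) instead of deleting
// fields from a copy, so a field added to LocalEvent later cannot leak
// unless someone deliberately adds it here.
function toSyncFact(event: LocalEvent): SyncFact {
  return { concept: event.concept, outcome: event.outcome, at: event.at };
}
```

An allow-list fails closed: forgetting to update it means a new field does not sync, which is the safer default for a design that promises code never leaves the device.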

When can I use it?

Private beta is in flight and we're ahead of schedule. Public beta opens in Q3 2026, and general availability across all tiers (Pro, Team, Enterprise, and Education) lands by early Q4 2026, alongside our SOC 2 Type I report. Get on the waitlist below.

Early access

Don't atrophy.

Join the engineers and teams who think the next phase of AI tooling should be measured by what it does to the human, not just for them.

Email team@aiworklab.com → · Talk to us

No commitment · we'll write back personally