The teaching layer for agentic coding

Ship code.
Master the craft.

Agentic coding tools make you faster. Most of them also make you worse at your job. aiworklab is the first AI engineering workspace that closes tickets and closes the skill gap — built on the open-source agents you already trust.

BYO model · cloud or local
Built on Claude Code, Codex, T3 Code, OpenCode
Local-first by default
your skill graph · live tracking
seen → explained → demonstrated → mastered
Stand on the shoulders of giants

aiworklab is a teaching layer on top of the agent harnesses you already trust.

The skill atrophy crisis

The category is optimising for the wrong thing.

Every AI coding tool measures itself by lines accepted and time-to-first-diff. Nobody is measuring whether the human is getting better. Five years from now, that's going to bite.

68% of developers report a "shipped it but couldn't rebuild it" feeling, according to internal research with 200+ engineers in 2026.
24–60 months until the cohort that learned-via-tab-complete hits its first promotion review and the gap becomes visible.
$0 spent today on tools that prove your team is getting better, not just faster. We're closing that gap.
Three modes, one workspace

Friction where it teaches. Nowhere else.

Pick a mode per task. The agent is the same — your relationship with it changes. Mastered concepts pass through silently; novel ones get the right amount of friction at the right moment.

01 / Autopilot

Ship it.

Standard agent loop. Plans, executes, edits. The agent works. You ship. Concepts encountered are silently logged to your skill graph for later review — no interruptions.

When · production work, infra chores, deadline mode
02 / Copilot

Ship it — but explain it.

Same speed, with one beat of friction. Before any non-trivial diff is applied, you get a 15-second comprehension check tied to that exact change. Pass and merge. Most learning happens here.

When · default mode, real work in unfamiliar territory
03 / Coach

You write. It questions.

The agent withholds. You write the code; it reviews, points to bugs, asks Socratic questions, refuses to fix things for you. Slowest mode, deepest learning.

When · onboarding, weekly fly-solo, deload weeks
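
A minimal sketch of how the three modes could map to behaviour, assuming a per-concept skill graph. Every name here (`decideAction`, `logEncounter`, the `Action` labels) is hypothetical and chosen for illustration; this is not aiworklab's actual API.

```typescript
// Hypothetical types: a sketch of the mode logic described above.
type Mode = "autopilot" | "copilot" | "coach";
type SkillState = "seen" | "explained" | "demonstrated" | "mastered";
type Action = "apply_and_log" | "comprehension_check" | "withhold_and_review";

interface Change {
  concept: string;     // e.g. "jwt-lifecycle"
  nonTrivial: boolean; // how "non-trivial" is decided is out of scope here
}

// Decide what the workspace does with an agent-proposed change.
function decideAction(mode: Mode, change: Change, graph: Map<string, SkillState>): Action {
  if (mode === "coach") return "withhold_and_review"; // you write, the agent only reviews
  if (mode === "autopilot") return "apply_and_log";   // no interruptions
  // Copilot: check only non-trivial changes on concepts not yet demonstrated.
  const state = graph.get(change.concept);
  const known = state === "demonstrated" || state === "mastered";
  return change.nonTrivial && !known ? "comprehension_check" : "apply_and_log";
}

// Silent logging: a first encounter is recorded as "seen"; existing states are kept.
function logEncounter(graph: Map<string, SkillState>, concept: string): void {
  if (!graph.has(concept)) graph.set(concept, "seen");
}
```

In this sketch the agent is identical across modes; only the decision made before a change is applied differs.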
The killer feature

Every PR becomes a coaching session.

For agent-authored changes above a novelty threshold, the merge button stays disabled until you write a 2–3 sentence explanation of what the diff does and why. An LLM judges it against the diff.

01

Forces active processing

Active recall is one of the most-studied learning techniques in cognitive science. Tab-to-accept skips it entirely. We put it back.

02

Adapts to the user

Concepts already marked demonstrated on your skill graph never trigger a check. The friction shrinks as you grow.

03

Force-merge always available

When you genuinely don't have time, you skip. We log it. Your weekly retention report shows the trade-offs honestly.

auth/middleware.ts · +11 -3 · concept · jwt lifecycle
  if (req.headers.authorization) {
-   const token = req.headers.authorization.split(' ')[1];
-   return jwt.verify(token, SECRET);
+   const t = parseBearer(req.headers.authorization);
+   try { return jwt.verify(t, SECRET); }
+   catch (e) {
+     if (e.name === 'TokenExpiredError' && req.cookies.refresh) {
+       return rotate(req.cookies.refresh);
+     }
+     throw new Unauthorized();
+   }
  }
Explain to merge · in 2–3 sentences, what does this change do and why?
It now distinguishes an expired access token from an invalid one. If the access token is expired but a refresh cookie is present, it rotates instead of failing; otherwise it 401s. Closes the silent re-auth loop that was logging users out mid-session.
verdict pass · concept jwt-lifecycle → demonstrated · merge unlocked
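
The gate above can be sketched as a small pure function. This is an illustration under stated assumptions, not the product's implementation: the 0 to 1 novelty score, the `NOVELTY_THRESHOLD` value, and all type and function names are hypothetical, and the LLM judge is represented only by the verdict it returns.

```typescript
// Hypothetical sketch of the explain-to-merge decision.
type Verdict = "pass" | "fail";

interface GateInput {
  novelty: number;              // assumed score in [0, 1] for the agent-authored change
  alreadyDemonstrated: boolean; // from the skill graph
  verdict?: Verdict;            // judge's verdict on the written explanation, if any
  forceMerge?: boolean;         // user chose to skip the check
}

interface GateResult {
  mergeUnlocked: boolean;
  promoteTo?: "demonstrated";   // skill-graph update on a passed explanation
  skipLogged: boolean;          // force-merges are recorded for the retention report
}

const NOVELTY_THRESHOLD = 0.5;  // assumed value, for illustration only

function evaluateGate(input: GateInput): GateResult {
  // Familiar or low-novelty changes never trigger a check.
  if (input.alreadyDemonstrated || input.novelty <= NOVELTY_THRESHOLD) {
    return { mergeUnlocked: true, skipLogged: false };
  }
  // A passed explanation unlocks the merge and promotes the concept.
  if (input.verdict === "pass") {
    return { mergeUnlocked: true, promoteTo: "demonstrated", skipLogged: false };
  }
  // Force-merge is always available, but it is logged.
  if (input.forceMerge) {
    return { mergeUnlocked: true, skipLogged: true };
  }
  // No explanation yet, or the explanation failed: merge stays disabled.
  return { mergeUnlocked: false, skipLogged: false };
}
```

In the example above, a passing verdict on the jwt lifecycle explanation would correspond to `mergeUnlocked: true` with `promoteTo: "demonstrated"`.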
Bring your own model

We don't sell tokens. You don't pay markup.

Sign in with the LLM provider you already pay for, or point us at a local model running on your laptop. We never proxy or mark up inference traffic. This is a permanent design choice, not a launch limitation.

Cloud key

Paste your API key. Validated client-side and stored in the OS keychain. Nothing transits our servers.

Anthropic · OpenAI · Google · OpenRouter

OAuth sign-in

One click. We never see your raw credentials. Tokens stay on-device. Easiest path for non-power users.

Anthropic Console · OpenAI

Local model

Auto-detects models running on localhost. Works fully offline. The only tier where regulated industries can ship.

Ollama · vLLM · LM Studio
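
One way localhost auto-detection could work is to probe each runtime's model-listing endpoint. The sketch below assumes the runtimes are on their usual default ports (Ollama on 11434 with `/api/tags`, vLLM on 8000 and LM Studio on 1234 with the OpenAI-compatible `/v1/models`); any of these can be reconfigured, and the function names and `FetchLike` type are hypothetical. How aiworklab actually detects models is not specified here.

```typescript
// Hypothetical sketch: probe well-known local endpoints and list the models found.
interface Runtime {
  name: string;
  url: string;                 // assumed default address for this runtime
  shape: "ollama" | "openai";  // which response format to expect
}

const LOCAL_RUNTIMES: Runtime[] = [
  { name: "Ollama", url: "http://localhost:11434/api/tags", shape: "ollama" },
  { name: "vLLM", url: "http://localhost:8000/v1/models", shape: "openai" },
  { name: "LM Studio", url: "http://localhost:1234/v1/models", shape: "openai" },
];

// Minimal fetch signature so the probe can be tested without a network.
type FetchLike = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Ollama answers { models: [{ name }] }; OpenAI-compatible servers answer { data: [{ id }] }.
function parseModelNames(shape: Runtime["shape"], body: unknown): string[] {
  const listKey = shape === "ollama" ? "models" : "data";
  const nameKey = shape === "ollama" ? "name" : "id";
  const list = (body as Record<string, unknown> | null)?.[listKey];
  if (!Array.isArray(list)) return [];
  return list
    .map((m) => (m as Record<string, unknown> | null)?.[nameKey])
    .filter((v): v is string => typeof v === "string");
}

async function detectLocalModels(fetchImpl: FetchLike): Promise<Map<string, string[]>> {
  const found = new Map<string, string[]>();
  await Promise.all(
    LOCAL_RUNTIMES.map(async (rt) => {
      try {
        const res = await fetchImpl(rt.url);
        if (res.ok) found.set(rt.name, parseModelNames(rt.shape, await res.json()));
      } catch {
        // Connection refused or timed out: treat the runtime as not running.
      }
    }),
  );
  return found;
}
```

Because only `localhost` is contacted, a probe like this works with no internet connection, in line with the fully offline claim above.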

Org gateway

Your admin provides a gateway URL; your team signs in via SSO. Standard for Team and Enterprise tiers.

LiteLLM · Bedrock · Vertex
For engineering leaders

Finally, an honest answer to "is the team getting better?"

An anonymised, aggregated dashboard of skill coverage across your engineering organisation. Concept retention curves. Bus-factor warnings on knowledge held by ≤ 2 engineers. The artefact that closes the budget conversation.

Backend guild · Q1 2027 · live
142 concepts mastered, 12 engineers
+18% solo throughput vs Q4
bus-factor warnings
kafka exactly-once semantics · 1 engineer
postgres logical replication · 2 engineers
terraform state surgery · 3 engineers
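
A bus-factor warning of this kind can be computed from concept-level facts alone, with no source code involved. The sketch below is an assumption-laden illustration: the `DemonstratedFact` record, the anonymised `engineerId`, and the `maxHolders` parameter (defaulting to the ≤ 2 rule stated above) are hypothetical names, not the dashboard's actual schema.

```typescript
// Hypothetical sketch: flag concepts demonstrated by too few engineers.
interface DemonstratedFact {
  concept: string;
  engineerId: string; // assumed to be an anonymised identifier
}

interface BusFactorWarning {
  concept: string;
  holders: number;    // distinct engineers who have demonstrated the concept
}

function busFactorWarnings(facts: DemonstratedFact[], maxHolders = 2): BusFactorWarning[] {
  // Group distinct engineers per concept.
  const byConcept = new Map<string, Set<string>>();
  for (const fact of facts) {
    const engineers = byConcept.get(fact.concept) ?? new Set<string>();
    engineers.add(fact.engineerId);
    byConcept.set(fact.concept, engineers);
  }
  // Keep concepts at or below the threshold, most at-risk first.
  return Array.from(byConcept, ([concept, engineers]) => ({ concept, holders: engineers.size }))
    .filter((w) => w.holders <= maxHolders)
    .sort((a, b) => a.holders - b.holders || a.concept.localeCompare(b.concept));
}
```

Raising `maxHolders` widens the warning list; the right threshold for a given team is a judgment call.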
From early users

People who've felt the gap.

We gave aiworklab to two teams running a head-to-head experiment. After eight weeks the coached group had higher solo throughput and better retention numbers. The other group closed more tickets. Leadership picked the wrong metric for a decade.
Jordan S.
Engineering director · Series C SaaS · Austin, TX
I've been shipping TypeScript for six years. aiworklab surfaced a gap in my understanding of how the compiler resolves conditional types that I genuinely didn't know was there. That's the product working as designed, and it's uncomfortable in exactly the right way.
Thomas W.
Staff engineer · infrastructure · Berlin
Coach mode on a Saturday morning is the closest I've come to real deep work since Copilot became default. The agent withholding is a feature, not a bug.
Nadia K.
Senior engineer · fintech · London
The explain-to-merge gate caught a race condition in a diff my agent produced that I had just accepted without thinking. That's worth the subscription alone.
Rafael M.
Founding engineer · seed-stage startup · São Paulo
The skill graph surfaced three concepts I'd been avoiding for months by leaning on the agent. Seeing them labeled "encountered — not demonstrated" was genuinely humbling.
Priya L.
Mid-level engineer · platform team · Seattle
Frequently asked

Questions, answered.

Are you another agent? Another IDE?
No. We're a teaching layer that wraps the agent harnesses you already use — Claude Code, Codex CLI, OpenCode, T3 Code. We don't rebuild the agent loop; we contribute upstream and add the layer above it.
Why bring-your-own-model?
Because we believe pricing for the teaching layer should be honest and predictable. Token resellers have a structural conflict between their margin and your wallet. We avoid that conflict permanently — you bring inference, we charge for the pedagogy and the org features.
Is this paternalistic?
It would be, if friction were uniform. The skill graph means concepts you've already demonstrated never trigger a check. Senior engineers feel near-zero overhead. Friction shrinks as you grow — the opposite curve from current AI tools, which get stickier the more you depend on them.
What does the org dashboard actually show?
Concept-level skill coverage across your team, anonymised and aggregated. Retention curves. Solo-throughput trend (the user-only metric). Bus-factor warnings on concepts demonstrated by ≤ 2 engineers. No source code ever leaves your boundary — only concept tags and outcomes.
Where does the data live?
Local-first by default. SQLite per device. Cloud sync is opt-in for paid tiers and replicates only concept-level facts, never code. Enterprise customers can self-host the entire skill-graph store.
When can I use it?
Private alpha is in flight. Public beta opens Q4 2026. Team tier GA is targeted for Q1 2027. Enterprise and Education tiers follow in Q2 2027. Get on the waitlist below.
Early access

Don't atrophy.

Join the engineers and teams who think the next phase of AI tooling should be measured by what it does to the human, not just for them.

No commitment · we'll write back personally