
A senior engineer's day with Claude Code

Where AI coding assistants save hours, where they don't, and how to structure the day so the gains compound.

Yash Shah · March 13, 2026 · 9 min read

A staff engineer named Reema tracked her work for two weeks at our suggestion. She works on payments infrastructure at a fintech I shouldn't name. On half of those days she used Claude Code aggressively: kept it open in a side panel, handed it tasks during pauses, used it for code-review pre-passes. On the other half she didn't. She kept careful notes in a spreadsheet I still have.

The data was clearer than I expected. AI-assisted days produced more code, but the gain wasn't distributed evenly. Some tasks saved an hour. Some saved nothing. A few saved an hour up front but introduced a subtle bug she caught only on the next morning's review. The net was good (about 30% more lines merged per AI-assisted day, with comparable bug rates after the morning re-read pass), but the shape of where the wins came from was not what either of us had assumed at the start.

The senior engineer's day with Claude Code isn't "use AI for everything." It's "know which work AI accelerates and which work it complicates." The skill is the calibration.

The shape of a productive day

Most senior engineers' days break into rough phases:

Morning (deep work). The hour or two of focus where complex problems get attention. Architecture, debugging, design.

Mid-morning (review and unblock). PR reviews, customer issue triage, helping junior teammates.

Afternoon (build). Coding the feature, writing the tests, integrating the pieces.

End of day (admin and prep). Tomorrow's priorities, blockers to surface, documentation.

AI coding assistants change the shape of three of these. The fourth — deep work — they barely touch.

Where AI saves real time

Morning deep work — partial gain. Some debugging is faster with AI: rubber-duck dialogue, log-pattern recognition, "have you seen this stack trace before?". Architecture work is faster when you've already decided the shape and need to translate it into scaffolding. But the actual thinking — the part where you understand a hard problem from first principles — isn't faster with an AI in the loop. AI accelerates the moves you already know how to make. It doesn't make a hard call easier.

Reema's diary captured this well. Her best AI-assisted morning was a database-deadlock investigation: she pasted the deadlock graph, the recent migration history, and the surrounding application code into Claude Code, and within twenty minutes she had three plausible hypotheses ranked by likelihood. She picked the right one, tested it, fixed it. Her worst AI-assisted morning was an architecture call about whether to introduce a queue between two services. The AI had opinions. They were generic. The decision required org context the AI didn't have. She made the call without it.

Mid-morning review — meaningful gain. PR reviews go faster with the AI doing the first pass: description-vs-diff alignment, test coverage check, scope-creep detection.

# Reema's actual PR-review pre-pass command (simplified)
{ gh pr view 4471 --json title,body; gh pr diff 4471; } \
  | claude -p "Pre-review this PR. Check that the description matches \
the diff, flag missing test coverage and scope creep, and apply the \
conventions in docs/style.md. Output a markdown checklist."

That command takes ten seconds. The output is a structured pre-review: questions about the diff, missing test coverage, scope-creep flags. Reema reads it in two minutes, then opens the PR in GitHub and reviews the substance. Her time per PR went from a typical 25 minutes to a typical 10 — not because she was cutting corners, but because the mechanical checks were already done.

Afternoon build — biggest gain. Scaffolding new code, writing tests for existing code, refactoring well-tested code, generating boilerplate, drafting documentation. This is where the multi-hour gains compound. A typical "implement this endpoint with tests against the existing API patterns" prompt, fed with proper context, produces output that's 80% shippable on first try. The remaining 20% — naming, error semantics, edge cases the team's conventions cover differently — is the engineer's actual skill.
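For a sense of that 80/20 split, here's an invented miniature (none of these names are from Reema's codebase): the first-pass structure is the AI's strength, and the flagged line is the kind the engineer rewrites to match team conventions.

# Hypothetical sketch: a scaffolded function plus its tests, the shape a
# well-contexted "implement this with tests" prompt tends to produce.
import pytest

def get_invoice(invoice_id: str, store: dict) -> dict:
    """Look up an invoice record by id."""
    invoice = store.get(invoice_id)
    if invoice is None:
        # the engineer's 20%: the team likely wants its own typed
        # NotFoundError and error code here, not a bare KeyError
        raise KeyError(invoice_id)
    return invoice

def test_get_invoice_found():
    store = {"inv_123": {"id": "inv_123", "amount_cents": 4200}}
    assert get_invoice("inv_123", store)["amount_cents"] == 4200

def test_get_invoice_missing():
    with pytest.raises(KeyError):
        get_invoice("nope", {})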

End of day admin — small gain. Drafting status updates, generating tomorrow's priority list from today's commits and meetings, filling out the timesheet. Not transformative. But it eliminates a 20-minute drag at the end of the day, which mattered to Reema because that 20 minutes was the difference between leaving on time and not.
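The mechanics are simple enough to script. A sketch of the kind of helper this implies (invented, not from Reema's notes; assumes the claude CLI is on PATH):

# Hypothetical end-of-day helper: hand today's commits to Claude Code
# and ask for a draft of tomorrow's priorities.
import subprocess

commits = subprocess.run(
    ["git", "log", "--since=midnight", "--oneline"],
    capture_output=True, text=True, check=True,
).stdout

subprocess.run(["claude", "-p",
    "From today's commits, draft tomorrow's top three priorities "
    "and flag anything that looks blocked:\n" + commits])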

Where AI complicates the day

A few patterns where AI usage costs more than it saves. Reema's notes were emphatic on these.

Production debugging under time pressure. AI-suggested fixes during a fire need extra-careful review. The temptation to ship the AI's first plausible answer is real, especially at 11pm on a Friday. The cost of shipping the wrong answer in production is higher than the time saved. Reema's rule, learned the hard way: during incidents, AI is for thinking-aloud, not for generating diffs.

Decisions that depend on org context. "Should we use library X or Y?" includes considerations the AI doesn't see — what's already in the team's stack, what skills the team has, what the platform team's preferences are, what the security team has approved. The AI's suggestion is a starting point. Treating it as the answer leads to PRs that get pushed back in review for reasons the AI couldn't have known.

Designing the public API. APIs that other teams will consume need careful naming, careful argument shapes, careful error semantics. The AI helps with the mechanics. The design is human. We have an unwritten rule on our team: any API surface another engineer is going to depend on gets designed by a senior engineer with a whiteboard, not by an AI in a chat window.

Reviewing AI-generated code. This one surprised me when Reema flagged it. Sometimes it's faster to write the code than to review the AI's version, especially for narrow, well-understood tasks. If the task is "format these dates consistently," writing the helper takes two minutes. Reading the AI's version, checking it handles UTC correctly, checking it handles the locale edge case the team cares about — that takes five minutes. The AI is slower for trivial work.
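For scale, the two-minute version of that helper really is tiny (a made-up example, not Reema's code):

# Small enough that writing it is faster than auditing an AI draft for
# the UTC and locale edge cases the team cares about.
from datetime import datetime, timezone

def format_timestamp(ts: float) -> str:
    """Render a Unix timestamp in the team's one blessed format, UTC."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime(
        "%Y-%m-%d %H:%M UTC")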

The senior engineer's skill is calibrating which work belongs in which bucket. There's no magic rule. It's the kind of thing you build a feel for over a few weeks of paying attention to your own patterns.

How to hand off context

The biggest source of AI productivity loss is handing off context badly. The patterns that work:

Lead with the goal. "I'm trying to make X happen, in this codebase, with these constraints" before any code or files. The model needs the why before it can help with the what.

Show the constraints. Style guide, pinned versions, performance budgets, things the team has decided not to do. Constraints prevent the model from suggesting a refactor the team has already chosen to defer.

Surface the why. "I picked this approach because the platform team prefers it" saves a round of rework where the AI suggests a "better" approach the team has explicitly rejected.

Read the AI's plan before approving the work. Catch misunderstandings cheap. The number-one source of wasted AI cycles is letting the model run on a misread of what you wanted.

A good context handoff looks like this:

I'm refactoring the Stripe webhook handler in services/billing/webhooks.py.

Goal: extract the per-event-type logic into separate handler functions so
each one has its own test file. Current code is one big switch statement,
about 600 lines.

Constraints:
- Keep the existing public function signature handle_webhook(event, ctx).
- We use pytest with fixtures from tests/conftest.py.
- We have a pattern for handler registration in services/billing/registry.py.
- Don't introduce any new dependencies.
- Each handler should be unit-testable without hitting Stripe's API.

I'd like you to:
1. Propose the new file/function layout (no code yet).
2. After I approve the layout, generate the refactored code one handler at a time.
3. Write the test scaffolding for each handler.

Files for context: services/billing/webhooks.py, services/billing/registry.py,
tests/billing/test_webhooks.py.

That handoff produces dramatically better output than "refactor my webhook handler." It also catches misunderstandings before the AI generates 600 lines of the wrong thing.
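For concreteness, the registry pattern the prompt points the model at might look something like this (a sketch under assumptions; the real services/billing/registry.py is whatever Reema's team already has):

# Hypothetical handler-registry sketch. Each extracted handler registers
# itself by Stripe event type; handle_webhook keeps its public signature.
from typing import Callable, Dict

Handler = Callable[[dict, dict], None]
HANDLERS: Dict[str, Handler] = {}

def register(event_type: str):
    """Decorator mapping a Stripe event type to its handler function."""
    def wrap(fn: Handler) -> Handler:
        HANDLERS[event_type] = fn
        return fn
    return wrap

@register("invoice.payment_failed")
def handle_payment_failed(event: dict, ctx: dict) -> None:
    """One extracted handler: unit-testable without hitting Stripe."""
    ...

def handle_webhook(event: dict, ctx: dict) -> None:
    """Existing public entry point; signature unchanged, per the constraints."""
    handler = HANDLERS.get(event["type"])
    if handler is not None:
        handler(event, ctx)

Whether registration happens via a decorator or an explicit table is a team-convention call, which is exactly why the prompt names the existing registry file instead of letting the model pick.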

The end-of-day loop

A practice that compounds: spend 5 minutes at end of day reviewing what AI helped with, what it didn't, and what tomorrow needs. Build a personal corpus of patterns where AI shines and patterns where it doesn't. Over a quarter, your hand-off speed and your review speed both improve sharply.

Reema's spreadsheet is now ten weeks long. It's the most-referenced file in her personal notes. The patterns it surfaced are the ones she gives to junior engineers who ask how she "uses AI so well." She doesn't use it well. She uses it carefully.

This is the engineering equivalent of the eval-set discipline that AI products use. You're building your own eval set on your own AI usage.

Close

The senior engineer with Claude Code isn't a different kind of engineer. They're the same engineer with a sharper toolkit. The toolkit is most powerful in the hands of someone who already understands the work. AI doesn't make a junior engineer senior; it makes a senior engineer faster. The discipline is in knowing which work to delegate and which work to own.

Reema's productivity is up about 30% on the work where AI helps, and unchanged on the work where it doesn't. The honest answer is that her overall throughput is up around 18%. That's not the 5x demo number you see on Twitter. It's also durable, and her bug rates haven't moved. That's the trade most senior engineers should expect to land on once the novelty wears off.

We build AI-enabled software and help businesses put AI to work. If you're optimising your engineering team's AI usage, we'd love to hear about it. Get in touch.
