Process

What a one-week audit looks like.

Specific. Surgical. Not a 6-week engagement. Here's exactly what happens, day by day, from kickoff to delivery walkthrough.

Book a 15-min discovery call

$5,000 fixed fee · 1-week turnaround · 100% refund within 7 days of delivery

Week 1, by day

Day 1
Intake call · access setup
- 60-min kickoff. I learn your stack, your call mix, your constraints.
- Mutual NDA signed if you want one (template ready).
- Read-only API access to OpenAI / Anthropic / Vertex / wherever your LLM spend lives. CSV export works too.
- You hand over ~50 representative prompts (anonymized is fine).
Day 2
Spend pull · call-type classification
- Pull the last 30–90 days of spend from each provider dashboard.
- Sample ~1,000 representative requests from your traffic.
- Classify each into one of seven task types (classify, extract, translate, brand_voice, vision, grounded, embed) via Claude Sonnet 4.6.
- Output: spend breakdown table + call-type mix percentage + concentration risk number.
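
To make the Day 2 outputs concrete, here is a minimal sketch of how a spend breakdown, a call-type mix, and a concentration figure could be computed from classified samples. The `SampledCall` shape, the function names, and the sample numbers are hypothetical. Summarizing concentration as a Herfindahl-Hirschman index over provider spend shares is an assumption made for illustration; the audit may define its risk number differently.

```typescript
// Hypothetical shape of one sampled request after classification.
type TaskType =
  | "classify" | "extract" | "translate" | "brand_voice"
  | "vision" | "grounded" | "embed";

interface SampledCall {
  provider: string; // e.g. "openai", "anthropic"
  task: TaskType;   // label assigned during classification
  costUsd: number;  // billed cost of this request
}

// Share of total spend per key, as fractions that sum to 1.
function spendShare(
  calls: SampledCall[],
  key: (c: SampledCall) => string,
): Record<string, number> {
  const totals: Record<string, number> = {};
  let grand = 0;
  for (const c of calls) {
    const k = key(c);
    totals[k] = (totals[k] ?? 0) + c.costUsd;
    grand += c.costUsd;
  }
  const shares: Record<string, number> = {};
  for (const k of Object.keys(totals)) {
    shares[k] = grand > 0 ? (totals[k] ?? 0) / grand : 0;
  }
  return shares;
}

// Assumption: concentration is the sum of squared spend shares
// (1.0 means all spend sits with a single provider).
function concentrationIndex(shares: Record<string, number>): number {
  let hhi = 0;
  for (const k of Object.keys(shares)) {
    const s = shares[k] ?? 0;
    hhi += s * s;
  }
  return hhi;
}

// Made-up sample data, for illustration only.
const sample: SampledCall[] = [
  { provider: "openai", task: "extract", costUsd: 60 },
  { provider: "openai", task: "classify", costUsd: 20 },
  { provider: "anthropic", task: "brand_voice", costUsd: 20 },
];

const byProvider = spendShare(sample, (c) => c.provider);
const byTask = spendShare(sample, (c) => c.task);
console.log(byProvider, byTask, concentrationIndex(byProvider));
```

Passing a different `key` function gives the provider breakdown or the call-type mix from the same helper.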
Day 3
Routing analysis
- Map each task type to its current provider/model and to candidate alternatives.
- Compute per-task cost-per-1M-token deltas across Groq, Cerebras, Vertex Gemini Flash, Anthropic, OpenAI.
- Identify the top three routing changes by dollar impact.
- Output: per-call-type analysis (§3 of your final report).
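
The ranking step above can be sketched as follows. This is an illustrative sketch, not the audit's actual tooling: the `TaskUsage` shape, the `topRoutingChanges` name, and every price and volume in the sample are placeholders, not real provider pricing. It assumes a single blended price per 1M tokens for each task.

```typescript
interface TaskUsage {
  task: string;
  monthlyTokensM: number;   // millions of tokens per month
  currentUsdPerM: number;   // blended current price per 1M tokens
  candidateUsdPerM: number; // blended candidate price per 1M tokens
}

interface RoutingChange {
  task: string;
  deltaUsdPerM: number;      // positive means the candidate is cheaper
  monthlySavingsUsd: number; // delta times monthly volume
}

// Rank candidate swaps by monthly dollar impact and keep the top n.
function topRoutingChanges(usage: TaskUsage[], n: number = 3): RoutingChange[] {
  return usage
    .map((u) => {
      const deltaUsdPerM = u.currentUsdPerM - u.candidateUsdPerM;
      return {
        task: u.task,
        deltaUsdPerM,
        monthlySavingsUsd: deltaUsdPerM * u.monthlyTokensM,
      };
    })
    .filter((c) => c.monthlySavingsUsd > 0) // drop swaps that save nothing
    .sort((a, b) => b.monthlySavingsUsd - a.monthlySavingsUsd)
    .slice(0, n);
}

// Placeholder numbers, for illustration only.
const usage: TaskUsage[] = [
  { task: "classify", monthlyTokensM: 400, currentUsdPerM: 5, candidateUsdPerM: 1 },
  { task: "extract", monthlyTokensM: 200, currentUsdPerM: 10, candidateUsdPerM: 4 },
  { task: "translate", monthlyTokensM: 100, currentUsdPerM: 3, candidateUsdPerM: 2 },
  { task: "embed", monthlyTokensM: 1000, currentUsdPerM: 0.1, candidateUsdPerM: 0.1 },
  { task: "vision", monthlyTokensM: 50, currentUsdPerM: 2, candidateUsdPerM: 3 },
];

console.log(topRoutingChanges(usage));
```

Ranking by dollar impact rather than by per-token delta matters because a small delta on a high-volume task can outweigh a large delta on a rare one.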
Day 4
Quality vetting
- Side-by-side benchmarks on your sample prompts. Old provider vs proposed provider, scored.
- Reject any swap that regresses on accuracy, latency tail, or brand tone.
- Brand-visible writing stays on Anthropic Sonnet by default. That's a hard rule, not a recommendation.
- Output: routing recommendation table with quality-confidence column.
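
The rejection rule above could be expressed as a simple gate. This is a sketch under stated assumptions: the `BenchmarkScore` fields, the `vetSwap` name, and the scoring scales are hypothetical, and the default tolerance of zero reflects a literal reading of "reject any swap that regresses". A real benchmark would likely need a noise margin.

```typescript
interface BenchmarkScore {
  accuracy: number;     // 0..1, higher is better
  p95LatencyMs: number; // tail latency, lower is better
  brandTone: number;    // 0..1 rubric score, higher is better
}

// How much worse a candidate may score before it counts as a regression.
interface GateTolerance {
  accuracy: number;
  p95LatencyMs: number;
  brandTone: number;
}

interface Verdict {
  accepted: boolean;
  reasons: string[]; // empty when accepted
}

function vetSwap(
  task: string,
  current: BenchmarkScore,
  candidate: BenchmarkScore,
  tol: GateTolerance = { accuracy: 0, p95LatencyMs: 0, brandTone: 0 },
): Verdict {
  const reasons: string[] = [];
  // Hard rule from the process: brand-visible writing is never re-routed.
  if (task === "brand_voice") {
    reasons.push("brand-visible writing stays on the default provider");
  }
  if (candidate.accuracy < current.accuracy - tol.accuracy) {
    reasons.push("accuracy regression");
  }
  if (candidate.p95LatencyMs > current.p95LatencyMs + tol.p95LatencyMs) {
    reasons.push("latency tail regression");
  }
  if (candidate.brandTone < current.brandTone - tol.brandTone) {
    reasons.push("brand tone regression");
  }
  return { accepted: reasons.length === 0, reasons };
}

// Made-up scores, for illustration only.
const baseline: BenchmarkScore = { accuracy: 0.92, p95LatencyMs: 1200, brandTone: 0.9 };
const proposed: BenchmarkScore = { accuracy: 0.93, p95LatencyMs: 800, brandTone: 0.9 };
console.log(vetSwap("extract", baseline, proposed));
```

Returning the list of reasons alongside the verdict keeps the recommendation table auditable: every rejected swap says why.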
Day 5
Code drop · fallback chain
- Drop in the routeByTask() entry point and five typed provider clients.
- Retry with jittered backoff, a 30s timeout, and graceful degradation when the primary is rate-limited or down.
- Telemetry hooks: per-task counter, latency histogram, cost-USD counter.
- Output: implementation-snippets/ directory + README ready to drop into your repo.
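
For a sense of what a fallback chain like this involves, here is a minimal sketch. It is not the delivered code: the `routeByTask` signature, the `ProviderCall` and `RouteOptions` shapes, and the "full jitter" backoff formula are all assumptions made for illustration, and telemetry hooks are left out to keep it short.

```typescript
type TaskType =
  | "classify" | "extract" | "translate" | "brand_voice"
  | "vision" | "grounded" | "embed";

// Hypothetical provider client shape: takes a prompt, resolves to text.
type ProviderCall = (prompt: string) => Promise<string>;

interface RouteOptions {
  maxAttempts: number; // tries per provider before falling through
  baseDelayMs: number; // backoff base; the delay ceiling doubles each retry
  timeoutMs: number;   // per-attempt timeout (30000 matches the 30s above)
}

// Full jitter: a random delay in [0, base * 2^attempt).
function backoffDelayMs(
  attempt: number,
  baseDelayMs: number,
  rand: () => number = Math.random,
): number {
  return Math.floor(rand() * baseDelayMs * Math.pow(2, attempt));
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Rejects if the call has not settled within ms. Note: this stops waiting
// but does not cancel the underlying request.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${ms} ms`)),
      ms,
    );
    p.then(
      (v) => { clearTimeout(timer); resolve(v); },
      (e) => { clearTimeout(timer); reject(e); },
    );
  });
}

// Try each provider in the task's chain in order; retry each one with
// jittered backoff, then degrade to the next provider in the chain.
async function routeByTask(
  task: TaskType,
  prompt: string,
  chains: Partial<Record<TaskType, ProviderCall[]>>,
  opts: RouteOptions,
): Promise<string> {
  let lastError: unknown = new Error(`no providers configured for ${task}`);
  for (const call of chains[task] ?? []) {
    for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
      try {
        return await withTimeout(call(prompt), opts.timeoutMs);
      } catch (err) {
        lastError = err;
        if (attempt < opts.maxAttempts - 1) {
          await sleep(backoffDelayMs(attempt, opts.baseDelayMs));
        }
      }
    }
  }
  throw lastError;
}
```

A caller would pass an ordered chain per task, for example `{ classify: [primaryClient, fallbackClient] }`, where both client names are placeholders. Randomizing the delay spreads retries out so that many clients hitting the same rate limit do not retry in lockstep.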
Day 6
ROI · risk · report
- ROI calculator: payback weeks, year-1 net savings, year-2 net savings.
- Risk + mitigation matrix (drift, rate-limit, brand-voice, bill shock, policy change).
- 30/60/90-day rollout: shadow mode → 50/50 → 100% with safety nets.
- Output: report draft (PDF + Markdown).
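
The three ROI figures can be illustrated with a short calculation. This is a simplified sketch, not the report's actual calculator: it assumes savings are constant from day one, that all costs are one-time and land in year 1, and that a year has 52 weeks. The `RoiInputs` shape and the monthly savings figure are hypothetical; the $5,000 cost is the audit fee quoted above.

```typescript
interface RoiInputs {
  monthlySavingsUsd: number; // projected savings after routing changes
  oneTimeCostUsd: number;    // e.g. the audit fee plus any internal effort
}

interface RoiResult {
  paybackWeeks: number; // weeks until cumulative savings cover the cost
  year1NetUsd: number;  // first-year savings minus the one-time cost
  year2NetUsd: number;  // second-year savings, no further one-time cost
}

function roi(inputs: RoiInputs): RoiResult {
  const annualSavings = inputs.monthlySavingsUsd * 12;
  const weeklySavings = annualSavings / 52;
  return {
    paybackWeeks:
      weeklySavings > 0 ? inputs.oneTimeCostUsd / weeklySavings : Infinity,
    year1NetUsd: annualSavings - inputs.oneTimeCostUsd,
    year2NetUsd: annualSavings,
  };
}

// Hypothetical savings of $2,600/month against the $5,000 audit fee.
const example = roi({ monthlySavingsUsd: 2600, oneTimeCostUsd: 5000 });
console.log(example);
```

With these placeholder inputs, savings run at $600 per week, so payback takes a little over eight weeks and the year-1 net is $26,200.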
Day 7
Delivery walkthrough
- 30-minute call with your engineering team. I walk through the report and the code line by line.
- Open-question Q&A. If you want the $15K turnkey implementation, we discuss scope here.
- One week of email Q&A starts after delivery: no clock-watching, no hourly fees.

What I need from you

- Read-only API keys (or CSV exports) from each LLM provider you use.
- ~50 representative prompts. Anonymize whatever you want; I don't need PII.
- 30 minutes of engineering time across the week to answer clarifying questions.
- A 30-min calendar slot on Day 7 for the delivery walkthrough.

That's it. No multi-stakeholder workshops, no questionnaires, no 50-page intake form. Most clients spend about two hours total on their side of the audit.

Ready when you are

A 15-minute discovery call to see if the audit's a fit. I'll ask about your current providers, your monthly spend, and your call mix. If we're a fit, we book the audit start date on the same call.

Book a 15-min discovery call
