aimargin.dev
Cut your AI bill 40–60% in two weeks.
A fixed-fee audit of your LLM provider routing. Per-call-type cost breakdown, fallback chain code, and a 30/60/90-day rollout plan — built by an operator who runs the same stack in production.
$5,000 fixed-fee · 1 week turnaround · 100% refund within 7 days of delivery
What you get
1.
A 12-section audit report
Per-call-type cost breakdown (text, vision, grounded search, brand voice, embeddings) with specific routing recommendations and dollar-savings projections per task type.
2.
Production-ready TypeScript
A drop-in
routeByTask()entry point + 5 typed provider clients (OpenAI, Anthropic, Vertex, Groq, Cerebras) with retry, jittered backoff, timeout, and graceful fallback. Yours to keep.3.
A 30/60/90-day rollout plan
Shadow-mode plan, 10/50/100 traffic ramp, telemetry hooks for cost + latency + fallback-rate, and the daily-cost Slack digest you can wire up day one.
4.
A 30-min delivery walkthrough
I walk your team through the report, the code, and the open questions. After delivery, a week of email Q&A is included at no extra charge.
Why me
I run a consumer app that classifies ~40,000 events a month with text LLMs and ~10,000 images a month with vision models. Routing the right call to the right provider — Groq for plain-text classify, Vertex Gemini Flash for vision, Anthropic Sonnet for brand-visible writing, with Cerebras as failover — cut the LLM bill ~60% versus the OpenAI-only baseline I started with.
The same pattern reclaims 40–60% on most AI startups in the seed-to- Series-A range. The audit takes a week because the analysis is surgical — not because it's hard, but because each call type needs its own quality vetting before the routing flips on.
I learned this the hard way after a GCP free-tier suspension burned a weekend of scraping work. There's a right way to multi-source. I'll show you.
Pricing
Most popular
Audit
$5,000
fixed · 1 week · refund within 7 days
- · 12-section audit report (PDF + Markdown)
- · Drop-in routeByTask() implementation
- · 5 typed provider clients
- · 30-min delivery walkthrough
- · 1 week of email Q&A
Optional add-on
Implementation
$15,000
fixed · 2 weeks · post-audit
- · I ship the routing in your repo
- · Shadow mode + 10/50/100 ramp
- · Observability counters wired up
- · 30 days of post-launch tuning
- · Final eng-team walkthrough
Discussed at the end of the audit, not before. Pay only if the audit makes the case.
FAQ
- How do you access our cost data?
- Read-only API access (or a CSV export if you'd rather). I work from your provider dashboards plus a sample of representative prompts. No PII retained; everything deleted post-audit.
- Do you sign an NDA?
- Standard mutual NDA, yes. I'll send my template or work from yours.
- What if your routing recommendation degrades quality?
- The audit specifies quality vetting for every routing change — shadow mode + side-by-side sample comparison before any provider swap. I refuse to swap a category if quality regresses. Brand-visible writing stays on Anthropic for that reason.
- Refund?
- 100% within 7 days of delivery, no questions. I'd rather the project not work for you than collect an unhappy fee.
- Who are you?
- Paul Plawin— solo operator. Run a Latin dance discovery app called Bailar (mobile + web), built top-to-bottom. Burned through enough LLM budgets to know exactly where the leaks are.