The $49 answer to $1,000-a-month legal AI
CiteSight validates every cite in a brief against the authoritative case-law record - no LLM, no probability, no enterprise contract.
The brief
A business-law attorney described the same chore every week: opposing counsel files a brief, and someone on her side has to read every citation and confirm the case exists, the cite is to the right reporter, and the holding is actually what the brief claims. The big tools that promise to do this - Harvey.ai, CoCounsel, the rest of the enterprise legal-AI stack - start at roughly $1,000 per attorney per month. For a solo practice or a 10-attorney firm, that pricing is a wall. So she did what most small-firm attorneys do: she kept doing it by hand.
That conversation was the spec. CiteSight exists because the job she actually needed done - is this citation real, and does it point to the case the brief says it does? - does not require an LLM, a vector database, or a $50k annual contract. It requires a deterministic lookup against an authoritative source and a way to show the answer that a busy attorney can scan in fifteen seconds.
The thesis
Citations are facts. Facts don’t need probability.
Every enterprise legal-AI tool sells the same shape of answer: a confident paragraph generated by a model, with the citation embedded as supporting evidence. That shape is wrong for this job. A citation is either in CourtListener’s authoritative database under the case the brief names, or it isn’t. There is no “the model thinks this is 87% likely to be Brown v. Board.” There is only a green light, a yellow light, or a red light - and an attorney whose career depends on which one she’s looking at.
CiteSight refuses the LLM stack on purpose. The validation path is a deterministic call to CourtListener’s v4 citation-lookup endpoint. The result is normalized into one of four states: valid, warning, invalid, or info (the last reserved for proprietary LEXIS/Westlaw cites). The UI shows that state without editorializing. That decision is what makes $49/month possible - and it is what makes the output defensible in front of a judge.
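To make the normalization step concrete, here is a minimal TypeScript sketch. The LookupResult shape, the normalizeResult name, and the mapping rules are assumptions made for illustration (a per-citation status code plus a count of matched cases); they are not taken from CiteSight's source, and the proprietary-cite pattern is deliberately simplistic.

```typescript
// Hypothetical sketch: none of these names come from the CiteSight repo.
type CitationState = "valid" | "warning" | "invalid" | "info";

// Assumed shape of one lookup result: the cite as written in the brief,
// an HTTP-style status for that cite, and how many cases it matched.
interface LookupResult {
  citation: string;
  status: number;
  matchCount: number;
}

// Assumed heuristic for proprietary database cites, e.g. "2020 WL 1234567"
// or "2019 U.S. Dist. LEXIS 4567". A public case-law source cannot check these.
const PROPRIETARY_CITE = /\b(?:WL|LEXIS)\b/;

function normalizeResult(result: LookupResult): CitationState {
  // Proprietary cites are flagged for information, not judged.
  if (PROPRIETARY_CITE.test(result.citation)) return "info";
  // Exactly one matching case: the cite resolves cleanly.
  if (result.status === 200 && result.matchCount === 1) return "valid";
  // More than one candidate case: a human should look (assumed rule).
  if (result.status === 200 && result.matchCount > 1) return "warning";
  if (result.status === 300) return "warning";
  // Everything else (not found, malformed reporter, empty match list).
  return "invalid";
}
```

The point of the shape is that every branch is a plain comparison. Given the same lookup result, the function returns the same state every time, which is the property the rest of the product leans on.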
How we built it
Stack at the surface. React 18 + TypeScript on Vite, shadcn/ui over Radix for the interface, Supabase for auth, Postgres, Storage, and Edge Functions, Stripe for the subscription, Cloudflare Turnstile on the auth flow, Netlify for the deploy. The whole frontend is ~16,000 lines of TypeScript across ~100 files. No LLM SDK appears anywhere in the repo. That absence is the architecture.
Where the work was: the extractor. The hard part of validating citations is not the validation - it is finding them in a 40-page brief without sending the entire document to an external API. src/utils/citationSentenceExtractor.ts is a sentence-anchored extractor with explicit 50-character context padding, written to hit “85–95% character reduction while maintaining 99%+ citation recall.” Every line in that file is there to protect the user from paying for tokens (or API quota) on text that contains no citations. It is the single piece of IP that makes the unit economics work at $49/month.
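A simplified TypeScript sketch of the idea, not the contents of citationSentenceExtractor.ts: it finds reporter-style citations with a small regex, keeps 50 characters of context on each side, and merges overlapping windows. The function names and the reporter pattern are hypothetical, the pattern covers only a handful of federal reporters, and the sketch pads by characters only rather than anchoring on sentence boundaries.

```typescript
// Hypothetical sketch of context-padded citation extraction.
const CONTEXT_PADDING = 50; // characters kept on each side of a citation

interface Span {
  start: number;
  end: number;
}

function citationWindows(text: string, padding: number = CONTEXT_PADDING): Span[] {
  // Simplified pattern: volume, reporter abbreviation, first page.
  // A real extractor would need a far larger reporter list.
  const pattern =
    /\b\d{1,4}\s+(?:U\.S\.|S\.\s?Ct\.|F\.(?:2d|3d|4th)|F\.\s?Supp\.(?:\s?[23]d)?)\s+\d{1,5}\b/g;
  const spans: Span[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    const start = Math.max(0, match.index - padding);
    const end = Math.min(text.length, match.index + match[0].length + padding);
    const last = spans[spans.length - 1];
    if (last !== undefined && start <= last.end) {
      // Overlapping or touching windows collapse into one.
      last.end = Math.max(last.end, end);
    } else {
      spans.push({ start, end });
    }
  }
  return spans;
}

// Joins the kept windows; everything outside them is never sent anywhere.
function extractCitationContext(text: string): string {
  return citationWindows(text)
    .map((span) => text.slice(span.start, span.end))
    .join("\n");
}
```

On a long brief with sparse citations, the kept windows are a small fraction of the original text, which is where the character reduction comes from. Recall depends entirely on how complete the citation pattern is.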
Where the work was: the PDF path. Attorneys do not paste briefs as text; they upload PDFs, including scanned ones. The pipeline is layered for that reality. A pasted brief goes straight to the extractor. A digital PDF is parsed client-side, scrubbed through pdfTextCleaner.ts, then extracted. An image-only PDF falls back to an OCR edge function (supabase/functions/ocr-pdf) that proxies to an external OCR service, then re-enters the same pipeline. Every branch lands in the same deterministic validator. There is one source of truth for what counts as a valid citation, no matter how the brief arrived.
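The layering can be sketched as one function with three exits that all return plain text for the extractor. This is a hedged illustration: the PipelineSteps interface, the briefToText name, and the 200-character threshold for deciding that a PDF is image-only are assumptions, not details from the repo. The real parse, clean, and OCR steps are injected so the routing logic stays visible.

```typescript
// Hypothetical sketch of the input routing; names and threshold are assumed.
type BriefInput =
  | { kind: "text"; text: string }
  | { kind: "pdf"; bytes: Uint8Array };

type ExtractionPath = "paste" | "digital-pdf" | "ocr";

// Stand-ins for the client-side PDF parser, the text cleaner,
// and the OCR edge function.
interface PipelineSteps {
  parsePdf(bytes: Uint8Array): Promise<string>;
  cleanText(raw: string): string;
  ocrPdf(bytes: Uint8Array): Promise<string>;
}

// Assumed heuristic: a PDF that yields almost no text is treated as scanned.
const MIN_TEXT_CHARS = 200;

async function briefToText(
  input: BriefInput,
  steps: PipelineSteps
): Promise<{ path: ExtractionPath; text: string }> {
  // Pasted text goes straight through.
  if (input.kind === "text") {
    return { path: "paste", text: input.text };
  }
  // Digital PDF: parse, then clean.
  const parsed = steps.cleanText(await steps.parsePdf(input.bytes));
  if (parsed.trim().length >= MIN_TEXT_CHARS) {
    return { path: "digital-pdf", text: parsed };
  }
  // Image-only PDF: fall back to OCR, then clean the same way.
  const recognized = steps.cleanText(await steps.ocrPdf(input.bytes));
  return { path: "ocr", text: recognized };
}
```

Whatever path is taken, the caller receives the same shape, so the extractor and validator downstream never need to know how the brief arrived.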
Where the work was: the subscription gate. Four edge functions (check-subscription, create-checkout, customer-portal, stripe-webhook) form a single Stripe loop that decides whether a logged-in user is on a trial, active, or locked out. The most recent commits in the repo tighten that loop - forcing non-active subscribers into the trial flow rather than into the product - because the business case for CiteSight is the price, and the price has to be enforceable.
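A minimal sketch of the gate's decision, assuming it keys off the Stripe subscription status. The status strings are Stripe's documented subscription statuses; the accessFor and routeFor names, the route paths, and the choice to let trialing users into the product are illustrative assumptions rather than the behavior of check-subscription itself.

```typescript
// Stripe's documented subscription statuses.
type StripeStatus =
  | "incomplete"
  | "incomplete_expired"
  | "trialing"
  | "active"
  | "past_due"
  | "canceled"
  | "unpaid"
  | "paused";

type Access = "trial" | "active" | "locked";

// Hypothetical mapping: null means the user has no subscription record yet.
function accessFor(status: StripeStatus | null): Access {
  if (status === "active") return "active";
  if (status === "trialing") return "trial";
  // Every other status, and no subscription at all, is locked out.
  return "locked";
}

// Hypothetical routes: locked users are sent to the trial flow,
// never into the product.
function routeFor(access: Access): "/app" | "/trial" {
  return access === "locked" ? "/trial" : "/app";
}
```

Defaulting to "locked" means a status the code has never seen before fails toward the trial flow instead of toward free access.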
What the integrations had to survive. CourtListener is the entire reason CiteSight can exist; the Netlify Content Security Policy (CSP) hard-codes its origin alongside Supabase so a misconfigured deploy fails closed, not open. Stripe is gated behind webhook signature verification because a customer-facing billing bug would cost more than the product earns. Supabase row-level security (RLS) is on from migration 002 so a leaked anon key can’t read another attorney’s brief.
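For readers unfamiliar with webhook signature verification, the check can be sketched as follows. It follows the scheme Stripe documents (a Stripe-Signature header carrying a timestamp t and one or more v1 signatures, each an HMAC-SHA256 over the timestamp, a dot, and the raw request body). The function name and the use of Node's crypto module are assumptions made to keep the example self-contained; it is not the code in stripe-webhook, and in practice Stripe's official libraries ship a helper that performs this check.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Reject events whose timestamp is more than five minutes from "now".
const TOLERANCE_SECONDS = 300;

// Hypothetical helper illustrating the documented scheme.
function verifyStripeSignature(
  rawBody: string,
  signatureHeader: string,
  secret: string,
  nowSeconds: number
): boolean {
  // Header looks like "t=1700000000,v1=abc123...,v1=def456..."
  let timestamp = "";
  const signatures: string[] = [];
  for (const part of signatureHeader.split(",")) {
    const eq = part.indexOf("=");
    if (eq < 0) continue;
    const key = part.slice(0, eq).trim();
    const value = part.slice(eq + 1).trim();
    if (key === "t") timestamp = value;
    if (key === "v1") signatures.push(value);
  }
  const sentAt = Number(timestamp);
  if (timestamp === "" || !Number.isFinite(sentAt) || signatures.length === 0) {
    return false;
  }
  // Stale or future-dated events are refused to limit replay.
  if (Math.abs(nowSeconds - sentAt) > TOLERANCE_SECONDS) return false;

  // Signed payload is "<timestamp>.<raw body>", keyed with the endpoint secret.
  const expected = createHmac("sha256", secret)
    .update(`${timestamp}.${rawBody}`, "utf8")
    .digest();

  // Constant-time comparison against every v1 signature in the header.
  return signatures.some((signature) => {
    const candidate = Buffer.from(signature, "hex");
    return candidate.length === expected.length && timingSafeEqual(candidate, expected);
  });
}
```

The raw body matters: verifying against a re-serialized JSON object instead of the exact bytes received would change the signed payload and make valid events fail.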
What it shipped
- Paste a brief, get every citation in it validated against CourtListener in seconds, color-coded by status.
- Upload a PDF - digital or scanned - and the pipeline picks the right extraction path (parse, clean, or OCR) without asking the user to choose.
- Export a court-ready PDF report (jsPDF) showing every citation, its verdict, and a recommendation when the cite is wrong.
- A subscription gate that forces non-active users back into the trial flow, so the $49/month price is the actual price.
- A /test-original route alongside the live / route, so the team can A/B the validation UX without redeploying.
What it changed
The weekly chore that started this cost the attorney several hours of spot-checking by hand, plus the implicit risk of every citation she didn’t have time to look up. CiteSight collapses that work into the time it takes to drag a PDF onto a page. The honest framing: the early metric is not adoption - it is the price differential itself. Charging $49/month for a job the market charges $1,000+ for is the change. Whether the market accepts that price is the next chapter of this case study, not this one.
What we’d still tell you
CiteSight is deliberately narrow. It validates that a cited case exists and matches the brief’s reference; it does not - yet - read the holding and tell you whether the brief characterized it accurately. That second step is where an LLM would actually earn its keep, and a v2 is likely to add it as a separate, optional pass clearly labeled as model-generated. The first version’s job is to be the deterministic floor underneath whatever else gets layered on. Sell the floor first; do not sell the ceiling.