LEANDERANTONY/AI_Job_Application_Agent

AI Job Application Agent

License: MIT

A grounded job-application copilot. Search live listings across four ATS providers, paste a job description and see it parsed into hard / soft / must-have skills, and run a five-stage supervised pipeline that produces a tailored resume + cover letter — every claim anchored to evidence from the source resume.

Live: job-application-copilot.xyz · Workspace: app.job-application-copilot.xyz

Architecture overview


Visual tour

Screenshot: Job Search step with 12 live matches and saved jobs drawer
Screenshot: Step 01, Resume builder (chat one into existence)
Screenshot: Step 03, Job Detail (parsed JD with match score + skill chips)
Screenshot: Output, classic_ats theme
Screenshot: Output, Cover letter

What's actually inside

| System | What it does |
| --- | --- |
| Live job search | Cached index of ~14,000 open roles from Greenhouse, Lever, Ashby, and Workday, refreshed every 4 hours. Hybrid relevance ranking fuses Postgres full-text search (with deterministic synonym / abbreviation expansion) and pgvector semantic similarity. Filter by company, work mode, role type, posted-within. Sort by relevance, recency, or alphabetical. |
| Resume intake | Upload PDF / DOCX / TXT, or chat one into existence with the agentic builder: it asks questions, reads your GitHub README to pull in projects you forgot, web-searches for context when needed, and shows a live themed preview as you build. Parsed into a normalized profile with skills, experience timeline, projects, publications, and certifications. |
| JD review | LLM-first JD parser with regex fallback. Surfaces hard skills, soft skills, and must-haves; shows match score against the loaded resume. |
| Supervised pipeline | Matchmaker → Forge (tailoring) → Gatekeeper (review) → Resume Generation → Cover Letter. Three-layer LLM retry stack with per-agent fallback isolation, deterministic floor on every stage. |
| Artifact export | 12 résumé themes in DOCX or PDF, with a matching cover letter: 6 single-column (ATS-safe) + 6 bespoke two-column layouts, all from one typed ThemeSpec registry. The same source data feeds either pathway. |
| Grounded assistant | Floating workspace chat with full context of the loaded resume, JD, analysis state, and saved jobs. Streams answers as they generate. |
| Command palette | ⌘K / Ctrl+K from anywhere: jump between steps, load a saved job, re-ask a recent assistant question, or run the analysis. |
| Tier enforcement | Per-(user, period, counter) atomic quota gates on every gated action (tailored applications, premium applications, assistant turns, resume parses, resume-builder sessions, job searches, saved jobs, saved workspaces). Free / Pro / Business cap matrix. Premium opt-in routes review + resume-gen + cover-letter to gpt-5.5 while keeping tailoring on mini for COGS reasons. Refund-on-failure so a transient workflow error doesn't burn a credit. A weekly per-user token meter (visible as a usage bar in the workspace) provides finer-grained accounting on top of the per-action quota. |
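
The quota gate with refund-on-failure described under Tier enforcement can be illustrated with a minimal in-memory sketch. Everything here is hypothetical: the `QuotaGate` class, the `run_gated` helper, and the cap values are illustrative names, and a lock stands in for whatever atomic database operation the real implementation uses (see ADR-021 for the actual design).

```python
import threading
from collections import defaultdict


class QuotaExceeded(Exception):
    """Raised when a counter is already at its cap for the period."""


class QuotaGate:
    """In-memory stand-in for an atomic per-(user, period, counter) quota store."""

    def __init__(self, caps):
        self._caps = caps                  # counter name -> cap per period (hypothetical values)
        self._used = defaultdict(int)      # (user, period, counter) -> credits used
        self._lock = threading.Lock()      # makes check-and-increment a single atomic step

    def consume(self, user, period, counter):
        key = (user, period, counter)
        with self._lock:
            if self._used[key] >= self._caps[counter]:
                raise QuotaExceeded(counter)
            self._used[key] += 1

    def refund(self, user, period, counter):
        key = (user, period, counter)
        with self._lock:
            self._used[key] = max(0, self._used[key] - 1)

    def used(self, user, period, counter):
        with self._lock:
            return self._used[(user, period, counter)]


def run_gated(gate, user, period, counter, action):
    """Consume one credit, run the action, and refund the credit if it raises."""
    gate.consume(user, period, counter)
    try:
        return action()
    except Exception:
        gate.refund(user, period, counter)
        raise
```

The check and the increment happen under one lock so that two concurrent requests cannot both pass the cap check; the refund runs only when the gated action itself fails, never when the quota check rejects the request.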

How job discovery works

The cached jobs layer lives in Postgres (cached_jobs table) and is refreshed by a scheduled worker that fans out across all four sources. Highlights:

  • 79 Greenhouse boards + 6 Lever sites + 36 Ashby boards + 11 Workday Fortune-500 tenants in the active source pool.
  • 4-hour refresh cadence: a pg_cron schedule uses pg_net to call the /admin/refresh-cache endpoint (6 refreshes per day at 00:00 / 04:00 / 08:00 / 12:00 / 16:00 / 20:00 UTC).
  • Hybrid relevance ranking — a Postgres RPC fuses lexical full-text search (with deterministic synonym / abbreviation query expansion) and pgvector semantic similarity via Reciprocal Rank Fusion, so a role surfaces whether it matches on keywords or on meaning. Gracefully degrades to pure lexical if the semantic side is unavailable.
  • Saved-jobs drawer with a 24-hour TTL — bookmarks survive page reloads but expire if you don't act on them, with an EXPIRED badge so nothing silently disappears.
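
As a rough illustration of the Reciprocal Rank Fusion step mentioned above, here is a minimal Python sketch. The real fusion runs inside a Postgres RPC; the function names below are hypothetical, and the constant `k=60` is a commonly used default rather than a value stated in this README.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked ID lists: score(d) = sum over lists of 1 / (k + rank of d)."""
    scores = {}
    for ranking in rankings:
        for rank, job_id in enumerate(ranking, start=1):
            scores[job_id] = scores.get(job_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda job_id: (-scores[job_id], job_id))


def hybrid_rank(lexical, semantic=None, k=60):
    """Use the lexical order alone when the semantic ranking is unavailable."""
    if not semantic:
        return list(lexical)
    return reciprocal_rank_fusion([lexical, semantic], k=k)


lexical_hits = ["a", "b", "c"]     # hypothetical IDs from full-text search
semantic_hits = ["b", "d", "a"]    # hypothetical IDs from vector similarity
fused = hybrid_rank(lexical_hits, semantic_hits)
```

A job that appears in only one list still receives a score, which is why a role can surface on keywords or on meaning alone; jobs that both sides rank highly accumulate the largest totals.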

See ADR-013 and ADR-014 for the load-bearing decisions.
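
The saved-jobs expiry rule can be sketched in a few lines. This is an assumption-laden illustration: the helper names are hypothetical, and treating a bookmark as expired at exactly 24 hours (rather than just after) is a guess about boundary behavior.

```python
from datetime import datetime, timedelta, timezone

SAVED_JOB_TTL = timedelta(hours=24)  # the 24-hour TTL described above


def is_expired(saved_at, now=None):
    """True once the bookmark is at least SAVED_JOB_TTL old."""
    now = now or datetime.now(timezone.utc)
    return now - saved_at >= SAVED_JOB_TTL


def badge_for(saved_at, now=None):
    """Expired bookmarks stay visible and are labeled instead of being dropped."""
    return "EXPIRED" if is_expired(saved_at, now) else None
```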

How the supervised pipeline works

ApplicationOrchestrator._run_pipeline runs five stages with progress callbacks, per-stage duration logging, JSON-contracted agent outputs, and per-agent fallback isolation:

  1. Matchmaker (deterministic) — build_fit_analysis() compares the candidate profile against the JD and produces matched / missing skills.
  2. Forge (TailoringAgent) — rewrites the deterministic baseline into role-specific resume guidance.
  3. Gatekeeper (ReviewAgent) — checks grounding, reports unsupported claims, and returns corrected tailoring when repairs are possible.
  4. Resume generation (ResumeGenerationAgent) — builds the final tailored resume artifact from the reviewed output.
  5. Cover letter (CoverLetterAgent) — runs only after review approval and produces a role-specific cover letter.
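
The five stages above can be sketched as a small sequential runner. This is not the code of ApplicationOrchestrator._run_pipeline; the `run_pipeline` function, the stage stubs, and the dictionary keys are all hypothetical stand-ins that only show the control flow: progress callbacks, per-stage timing, review-corrected output flowing downstream, and the cover letter gated on approval.

```python
from dataclasses import dataclass, field
from time import perf_counter


@dataclass
class PipelineResult:
    outputs: dict = field(default_factory=dict)    # stage name -> stage output
    durations: dict = field(default_factory=dict)  # stage name -> seconds


def run_pipeline(stages, on_progress=None):
    """Run (name, fn) stages in order; each fn receives all outputs so far."""
    result = PipelineResult()
    for index, (name, fn) in enumerate(stages, start=1):
        if on_progress:
            on_progress(index, len(stages), name)
        started = perf_counter()
        result.outputs[name] = fn(result.outputs)
        result.durations[name] = perf_counter() - started
    return result


def matchmaker(outputs):
    return {"matched": ["python"], "missing": ["kubernetes"]}


def forge(outputs):
    # Deliberately over-claims a missing skill so the review stage has work to do.
    return {"skills": ["python", "kubernetes"]}


def gatekeeper(outputs):
    missing = set(outputs["matchmaker"]["missing"])
    draft = outputs["forge"]["skills"]
    issues = [s for s in draft if s in missing]
    corrected = {"skills": [s for s in draft if s not in missing]} if issues else None
    return {"approved": True, "grounding_issues": issues, "corrected_tailoring": corrected}


def resume_generation(outputs):
    # Prefer the reviewed, corrected tailoring over the raw draft.
    return outputs["gatekeeper"]["corrected_tailoring"] or outputs["forge"]


def cover_letter(outputs):
    if not outputs["gatekeeper"]["approved"]:
        return None  # gated on review approval
    return {"letter": "Dear hiring team, ..."}


STAGES = [
    ("matchmaker", matchmaker),
    ("forge", forge),
    ("gatekeeper", gatekeeper),
    ("resume_generation", resume_generation),
    ("cover_letter", cover_letter),
]
```

Calling `run_pipeline(STAGES, on_progress=print)` would print one progress line per stage and return the outputs and durations keyed by stage name.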

Each agent follows the same operating shape: deterministic baseline first, LLM-assisted refinement second, structured JSON output, and a deterministic fallback when assisted execution is unavailable. Per-agent fallback isolation means a single failing agent falls back independently — the other three keep their LLM-quality output. See ADR-018 for the three-layer retry stack (SDK retry × 2 + app-level retry + per-agent retry).
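
A minimal sketch of that operating shape, under stated assumptions: `run_agent`, `run_agents`, and the stub agents are hypothetical, only the per-agent retry layer is modeled (the SDK and app-level layers are omitted), and the agents run independently here even though the real stages feed each other.

```python
def run_agent(baseline, refine, max_attempts=2):
    """Deterministic baseline first, assisted refinement second, baseline as the floor."""
    draft = baseline()
    for _ in range(max_attempts):
        try:
            return {"output": refine(draft), "source": "llm"}
        except Exception:
            continue  # per-agent retry; other retry layers are not modeled here
    return {"output": draft, "source": "fallback"}


def run_agents(agents):
    """Each agent falls back on its own; one failure does not affect the others."""
    return {name: run_agent(baseline, refine) for name, (baseline, refine) in agents.items()}


def unavailable(draft):
    raise TimeoutError("model unavailable")  # simulates a failing assisted call


AGENTS = {
    "tailoring": (lambda: "baseline tailoring", str.upper),
    "review": (lambda: "baseline review", unavailable),
    "resume_generation": (lambda: "baseline resume", str.upper),
    "cover_letter": (lambda: "baseline letter", str.upper),
}
results = run_agents(AGENTS)
```

In this run only the `review` entry reports `"source": "fallback"`; the other three keep their refined output, which is the isolation property the paragraph describes.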

How grounding works

  • Deterministic services build the candidate profile, JD summary, fit analysis, and first-pass tailored draft before the agent layer runs.
  • ReviewAgent returns grounding_issues, unresolved_issues, revision_requests, and an optional corrected_tailoring payload.
  • The orchestrator uses corrected_tailoring as the downstream source of truth when review repairs the draft.
  • Cover-letter generation is gated on review approval.
  • The fallback review path checks whether the output references missing hard skills that aren't evidenced in the source profile.
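
The fallback check in the last bullet could look roughly like the following. This is a hedged sketch, not the project's implementation: `unsupported_skills` is a hypothetical name, and whole-token, case-insensitive matching is an assumed definition of "references" and "evidenced".

```python
import re


def unsupported_skills(output_text, missing_hard_skills, profile_text):
    """Return missing hard skills the output mentions without evidence in the profile."""

    def mentions(text, skill):
        # Match the skill as a whole token so "Go" does not match inside "Django"
        # and "Java" does not match inside "JavaScript".
        pattern = r"(?<![\w+#])" + re.escape(skill) + r"(?![\w+#])"
        return re.search(pattern, text, flags=re.IGNORECASE) is not None

    return [
        skill
        for skill in missing_hard_skills
        if mentions(output_text, skill) and not mentions(profile_text, skill)
    ]


issues = unsupported_skills(
    output_text="Built services in Python and Kubernetes.",
    missing_hard_skills=["Kubernetes", "Terraform"],
    profile_text="Python developer with Django experience.",
)
```

Here `issues` contains only `"Kubernetes"`: the output claims it and the profile never mentions it, while `"Terraform"` is missing but not claimed, so it is not a grounding problem.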

Engineering notes

  • 75 Python test files cover parsing, normalization, fitting, tailoring, orchestration, builders, exports, auth, quotas, persistence, the Lemon Squeezy webhook, voice transcription, artifact feedback, prompt-registry byte-identity, error handling, hybrid job search, the four ATS adapters, and the cache-refresh healthcheck.
  • Quality runners in tests/quality/ produce evidence for each LLM-driven stage (parser, tailoring, review, resume gen, cover letter, assistant, JD parser, latency baseline). backend/nightly_eval.py wraps them into a single regression-checked batch — manual-only at pre-revenue stage by design, see ADR-026.
  • Every LLM prompt loads from a versioned JSON registry (prompts/<name>/v1.json) — all 11 builders migrated off Python f-string concats, each guarded by a byte-identity test so a template can't silently drift from its original.
  • 33 ADRs in docs/adr/ record the architectural decisions, including the Streamlit-first → Next.js + FastAPI transition (ADR-012), DOCX-first export (ADR-015), conversational builder (ADR-016), state-aware assistant (ADR-017), three-layer retry stack (ADR-018), independent step navigation (ADR-019), tier resolution shim (ADR-020), atomic quota with refund (ADR-021), tier-aware model selection (ADR-022), Lemon Squeezy as Merchant of Record for v1 (ADR-023), the observability stack (ADR-024), the EU cookie consent banner (ADR-025), manual-only nightly eval (ADR-026), the tier-gated export entitlement (ADR-027), LLM provider failover + premium reasoning tier (ADR-028), the single-source ThemeSpec registry (ADR-029), the résumé-builder agentic architecture (ADR-031), the six bespoke two-column résumé themes (ADR-032), and hybrid lexical + semantic job search (ADR-033).
  • Architecture details live in docs/architecture.md; the day-2 operational runbook in docs/deployment.md.
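
To make the prompt-registry byte-identity guard concrete, here is a small sketch. The JSON schema (a single `template` key), the `str.format` rendering, and every function name are assumptions for illustration; only the `prompts/<name>/v1.json` layout comes from the notes above.

```python
import json
import tempfile
from pathlib import Path

TEMPLATE = "You are reviewing a resume for the role: {role}.\nReturn JSON only."


def load_prompt(registry_root, name, version="v1"):
    """Read <registry_root>/<name>/<version>.json (layout taken from the README)."""
    path = Path(registry_root) / name / f"{version}.json"
    return json.loads(path.read_text(encoding="utf-8"))


def render(prompt, **values):
    # Assumes the registry entry stores a str.format-style template.
    return prompt["template"].format(**values)


def legacy_builder(role):
    """Stand-in for an original f-string prompt builder."""
    return f"You are reviewing a resume for the role: {role}.\nReturn JSON only."


def byte_identical(registry_root, name, legacy, **values):
    """Compare the registry rendering with the legacy builder, byte for byte."""
    rendered = render(load_prompt(registry_root, name), **values)
    return rendered.encode("utf-8") == legacy(**values).encode("utf-8")


# Usage: write a throwaway registry entry and check it against the legacy builder.
with tempfile.TemporaryDirectory() as root:
    prompt_dir = Path(root) / "review"
    prompt_dir.mkdir()
    (prompt_dir / "v1.json").write_text(json.dumps({"template": TEMPLATE}), encoding="utf-8")
    matches = byte_identical(root, "review", legacy_builder, role="Data Engineer")
```

One caveat with this particular rendering choice: a template containing literal braces (for example an inline JSON sample) would need them doubled for `str.format`, so a real registry may well use a different substitution scheme.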

Deployment

  • app.job-application-copilot.xyz → Vercel-hosted Next.js workspace
  • api.job-application-copilot.xyz → VPS-hosted FastAPI backend
  • frontend/ → Next.js + React 19 + Turbopack
  • backend/ → FastAPI + Uvicorn, async OpenAI client, Supabase Postgres
  • backend/vps/ → Docker Compose + Caddy bundle for the backend stack
  • src/ → shared Python core (orchestrator, agents, builders, schemas, services)

Production stack

| Layer | Choice | Notes |
| --- | --- | --- |
| Frontend | Next.js on Vercel | Auto-deploys from main; source maps uploaded to Sentry on every deploy |
| Backend | FastAPI in Docker on an EU VPS, fronted by Caddy | The reverse-proxy block is committed in backend/vps/Caddyfile; runtime-only Caddy config is wiped on restart |
| Data | Supabase (EU region) | Auth (Google OAuth), per-user persistence, the cached_jobs index, quota counters, subscriptions, and the aijobagent_run_traces cost-attribution table |
| Scheduled work | Supabase pg_cron + pg_net | cached_jobs refresh every 4h, expired resume-builder-session cleanup every 5 min, and a daily cached_jobs retention healthcheck. Nothing scheduled spends OpenAI tokens: the only LLM-spending job (nightly_eval) is deliberately manual-only |
| Error + perf | Sentry (jobagent-backend + jobagent-frontend) | Error tracking + traces + AI Agents Monitoring (token/cost/latency spans, no prompt-body PII) + Logs + errors-only session replay + Sentry Crons (cached-jobs-refresh, cached-jobs-healthcheck) + a 5-min EU Uptime monitor. Always-on as legitimate interest |
| Product analytics | PostHog (EU, free Developer plan) | Autocapture + heatmaps + consent-gated session replay; server-side funnel events (job_searched → resume_uploaded → analysis_started → artifact_exported, plus quota_blocked and feedback_submitted) feeding a "Job Agent — Product Health" dashboard; every event tagged product: "jobagent". Consent-gated per GDPR/ePrivacy |
| Payments | Lemon Squeezy (Merchant of Record) | Scaffolded + HMAC-verified webhook live; env-gated behind a "Coming soon" frontend fallback until the dashboard's final variant IDs land |
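
For the HMAC-verified webhook in the Payments row, a generic verification sketch is shown below. The README only says the webhook is HMAC-verified; the choice of SHA-256, the hex encoding, and the function names are assumptions, and the header that carries the signature is not specified here.

```python
import hashlib
import hmac


def sign(secret: str, raw_body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw request body (hash and encoding are assumptions)."""
    return hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()


def is_valid_signature(secret: str, raw_body: bytes, received_signature: str) -> bool:
    """Compare in constant time so the check does not leak how many characters matched."""
    expected = sign(secret, raw_body)
    return hmac.compare_digest(expected.encode("utf-8"), received_signature.encode("utf-8"))


body = b'{"event": "subscription_created"}'      # hypothetical payload
signature = sign("example-secret", body)         # what the sender would attach
accepted = is_valid_signature("example-secret", body, signature)
```

The signature must be computed over the raw bytes exactly as received; verifying against a re-serialized JSON body is a common source of false rejections.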

GDPR posture: a custom in-house cookie consent banner is the gate. Sentry error tracking + traces + Feedback load always (legitimate interest, GDPR Art. 6(1)(f) — crash reporting is operationally necessary). PostHog analytics + PostHog replay + Sentry Session Replay load only after explicit opt-in (ePrivacy Art. 5(3)). No third-party JS loads before consent. See ADR-024 and ADR-025.