Samaritan

Samaritan is a skill intelligence system that turns your learning library into career momentum. Ingest high-signal sources, extract skill evidence, map role-fit gaps, and ship proof projects with grounded answers.

Documentation: https://scaleborg.mintlify.app

Product Overview

Most systems stop at content retrieval. Samaritan is built around decision support, combining three inputs:

  • Evidence supply: what you have demonstrated through real content consumption
  • Market demand: what your target roles and enterprise use cases currently reward
  • Strategic recommendations: what to do next, based on both

Core Loop

  1. Ingest high-signal content (YouTube, PDFs, web articles, Dropbox media).
  2. Extract learnings and map to a canonical skill taxonomy.
  3. Score domain capability and emergent composites.
  4. Generate role-fit and job recommendations.
  5. Track proof projects that close the highest-value gaps.

Composite Scoring Contract

  • readiness_pct: coverage-based readiness score for each capability composite.
    • Formula: 0.8 * required_coverage + 0.2 * supporting_coverage.
    • Used for activation and ranking.
  • confidence_pct: evidence-depth confidence score for covered skills.
    • Reflects how strongly skills are supported by extracted/declared evidence.
  • score_pct: compatibility alias of readiness_pct for existing integrations.
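
The contract above can be sketched in a few lines of Python. This is an illustration, not the repository's implementation: the function names are hypothetical, and it assumes coverage inputs are fractions in [0, 1] and that scores are reported on a 0-100 scale.

```python
def readiness_pct(required_coverage: float, supporting_coverage: float) -> float:
    """Blend coverage fractions (assumed 0..1) into a 0..100 readiness score."""
    for name, value in (
        ("required_coverage", required_coverage),
        ("supporting_coverage", supporting_coverage),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    # Weights come from the documented formula: 0.8 required, 0.2 supporting.
    return round(100.0 * (0.8 * required_coverage + 0.2 * supporting_coverage), 1)


def composite_payload(
    required_coverage: float, supporting_coverage: float, confidence_pct: float
) -> dict:
    """Hypothetical response shape carrying all three documented fields."""
    readiness = readiness_pct(required_coverage, supporting_coverage)
    return {
        "readiness_pct": readiness,
        "confidence_pct": confidence_pct,  # computed elsewhere from evidence depth
        "score_pct": readiness,  # compatibility alias of readiness_pct
    }
```

Under these assumptions, a composite with half of its required skills and all of its supporting skills covered scores 60.0, while full required coverage with no supporting coverage scores 80.0.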

Product Surfaces

  • / - Landing page with mission framing (Why: compounding capability, How: Compound + Harness, Outcome: superpowers profile)
  • /landing - Alias route for /
  • /$marketingPage - Dynamic marketing pages
  • /mighty-god-mode - Command-layer surface for the active execution cycle (state snapshot, ingest packs, startup slices, and weekly protocol)
  • /organizer - Pre-ingest planning with day-bucketed time windows
  • /personal - Personal hub (tabbed: Dump, Renewals & Deadlines, Trip Ideas (Someday), Insured Equipment, Web Gallery, Website TODO)
  • /ingest - Ingestion control center (submit, monitor, inspect activity)
  • /signal-policy - Signal policy viewer (relevance gate criteria, career goal, recent verdicts)
  • /career-accelerator - Career Accelerator (signal-gated placeholder until enough evidence is ingested)
  • /career-foundations - Career Foundations (11 tabs, 34 companies, 8 career paths, diagnostic assessment, study planner, mock interviews, AI critique)
  • /math-bridge - Math Bridge Program (4 levels: Core Numeracy, High School, Pre-University, Engineering Prep; skill ladder with confidence scoring and micro-checks)
  • /high-performance - High Performance operating layer (sport, look, social capital, discipline, weekly protocol)
  • /projects/enterprise-projects - Enterprise proof-project planning and tracking
  • /projects/ai-generated-projects - AI-generated project ideas saved for execution
  • /projects/:projectId - Dedicated project detail workspace
  • /product-lab - Core-adjacent standalone builds tracker (project name, platform, core connection, repo/path, status, next milestone)
  • /startup-challenge - Challenge workflow and execution tracking
  • /library - Source management, filtering, and concept/tool views
  • /library/concepts/:concept - Concept-focused library view
  • /library/tools/:tool - Tool-focused library view
  • /library/:sourceId - Source detail page (metadata, learnings, related sources, chunk previews)
  • /oss-projects - Personalized open-source project discovery, search, and bookmarking
  • /chat - Source-grounded Q&A with provenance, harness controls, and optional Focus Chat mode
  • /chat/:conversationId - Persisted conversation route
  • /engine - Profile, superpowers, roles, and enterprise-pattern diagnostics
  • /taxonomy - Full taxonomy inventory with profile overlays
  • /monitor - Runtime health, architecture, evals, test-case browser, database topology, tracing, and quality scorecard
  • /elite-toolbox - Curated tool radar with core leverage, adjacent tools, and operator backlog
  • /codex-productivity - Codex workflow alignment matrix
  • /claude-productivity - Claude workflow alignment (sub-tabs: Claude Code Tips, 7 Prompting Rules)
  • /cowork-productivity - Cowork best-practices alignment matrix
  • /obsidian-productivity - Obsidian vault workflow alignment matrix
  • /gmail-productivity - Gmail inbox cleanup alignment matrix
  • /gemini-productivity - Gemini workflow alignment matrix
  • /excel-productivity - Claude for Excel workflow alignment matrix
  • /powerpoint-productivity - Claude for PowerPoint workflow alignment matrix
  • /presentations-productivity - Presentation tools workflow alignment matrix
  • /how-it-works - Execution model walkthrough with Compound + Harness operating principle
  • /execution-playbook - Standalone sidebar tab for the Claude Code execution playbook (checklists, prompt tactics, parallel ops, remote access)
  • /linkedin-network - LinkedIn audience audit with keep/cut scoring, snapshot persistence, and weekly plan export
  • /shopify-architect - Shopify implementation reference for architecture, APIs, and ecommerce execution
  • /module-48 - Module 48 fashion ecommerce (tabbed: Operations, P&L, Pennylane)
  • /sonic-dna - Sonic identity and aesthetic lineage mapping
  • /reference-tracks - Reference Track Vault (BPM, key, energy, arrangement, mix notes)
  • /brand-studio - Fashion design asset library and collection management
  • /brand-studio/share/$token - Public share link for collections/garments (no sidebar)
  • /events - Professional events tracker
  • /apps - Personal app inventory with AI enrichment
  • /dev-ref - Developer reference (37 tabs: 8 languages + 29 stack tools)
  • /prep - Interview prep (10 tabs: FAANG, SQL, System Design, NLP, Docker, K8s, Take-Home, Feedback, Physical AI, Distributed Systems)
  • /applied-systems - Applied systems tracks (7 tabs: LLMOps, RecSys, DataOps, Evals, World Models, 3D Vision, Distributed ML)
  • /chinese - Chinese language learning (dashboard, vocab, lessons, SRS review)
  • /cantonese - Cantonese language learning (dashboard, vocab, lessons, SRS review)
  • /math-refresh - Math curriculum (tracks: Zero to One, Prepa ML; interactive exercises)
  • /culture-generale - Culture Generale (tracks: Sciences, Humanites, Sciences Sociales; Learn/Quiz/Practice)
  • /embodied-ai - Embodied AI reference (7 tabs: World Models, Core, Humanoid, Service, Autonomous, Agentic, Edge Inference)
  • /bio-augmentation - Bio-Augmentation reference (6 tabs: Foundations, Neurotech, Wearables, Biohacking, Translation, Convergence)
  • /cognitive-toolkit - Mental models and reasoning frameworks (7 tabs)
  • /behavioral-design - Behavioral design patterns (8 tabs: Frameworks, Feed Design, Social Loops, Variable Rewards, Friction, Notifications, Gamification, Case Studies)
  • /elite-freelance - Elite freelance AI/ML positioning (5 tabs: Real-Time Systems, APIs at Scale, AI Agent Infra, Production Hardening, Positioning)
  • /agents - Agent roadmap and resources (tabbed: Roadmap, Resources)
  • /mcp - MCP server dashboard and documentation base (tabbed: Server Dashboard, Doc Base)
  • /skills - Claude Code skills inventory (My Skills, Official Anthropic, Community)
  • /changelog - GitHub release notes stream

Legacy redirects: /career-plan -> /career-accelerator, /shopify -> /shopify-architect, /social -> /linkedin-network, /linkedin-reset -> /linkedin-network, /house-rules -> /execution-playbook

Note: GET /api/courses exists as an API endpoint; there is no dedicated Courses tab in the current UI.

Note: The landing page (/) keeps its dedicated visual treatment; all other product surfaces run in the white app theme.

Core Capabilities

  • Career intelligence: role-fit scoring, capability composites, strategic recommendations
  • Career foundations: diagnostic assessment, study planner with persistence, mock interviews with AI critique, drill practice
  • Math bridge: 4-level curriculum (26 topics) with skill ladder, confidence scoring, micro-checks, and prerequisite locking
  • Ingestion and evidence extraction: multi-source ingestion with relevance gating
  • Source-grounded chat: hybrid retrieval (FTS5 + vectors + reranking) with provenance
  • Skill intelligence: canonical taxonomy, alias resolution, unmatched-term discovery
  • Project execution layer: proof projects linked to capability gaps
  • Operator controls: health, eval runs, architecture visibility, tracing

Architecture (High Level)

  • Frontend: React + Vite
  • Backend: FastAPI
  • Retrieval: FTS5 + Chroma + reranking
  • Storage: SQLite + ChromaDB (data/)
  • Interfaces: HTTP + SSE + WebSocket progress streams
  • External providers: OpenAI, Groq, Cohere, Supadata, Mistral, Tavily, Gladia, Dropbox
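
For readers unfamiliar with SSE, the sketch below parses a stream of `text/event-stream` lines into events using only the standard wire format. It is not taken from the Samaritan codebase; the `progress` event name and the payloads in the sample are hypothetical.

```python
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield {"event", "data"} dicts from raw Server-Sent Events lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the buffered event, if it has data.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # one leading space is not part of the value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


# Hypothetical stream: one named progress event, a keep-alive, one default event.
sample = ["event: progress\n", "data: 40\n", "\n", ": keep-alive\n", "data: done\n", "\n"]
events = list(iter_sse_events(sample))
```

The same function works on any iterable of decoded lines, for example the lines of a streaming HTTP response body.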

Quick Start (Local)

1) Clone and Configure

git clone https://github.com/scaleborg/second-brain.git
cd second-brain
cp .env.example .env

Add required provider keys in .env.

2) Install Dependencies

# backend
uv venv .venv
source .venv/bin/activate
make install

# frontend
cd frontend
npm install
cd ..

3) Run the Stack

Single-terminal mode (recommended):

source .venv/bin/activate
make dev

Split-terminal mode:

# terminal 1
source .venv/bin/activate
make dev-backend

# terminal 2
make dev-frontend

4) Open

  • Frontend: http://localhost:5174
  • API docs: http://localhost:8000/docs
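
To confirm both processes are reachable without opening a browser, a small smoke check along these lines may help. It is a sketch using only the Python standard library; the `is_up` helper is hypothetical, and the URLs are the two listed above.

```python
import urllib.error
import urllib.request


def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx/3xx status, else False."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False


if __name__ == "__main__":
    targets = {
        "frontend": "http://localhost:5174",
        "api docs": "http://localhost:8000/docs",
    }
    for name, url in targets.items():
        print(f"{name:10s} {'up' if is_up(url) else 'DOWN'}  {url}")
```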

Canonical Commands

make test
make eval
make eval-career
make lint
make dev
make dev-stop
make dev-status
make dev-backend
make dev-frontend

Advanced workflows: docs/engineering.md

Common Operations

# ingest one URL
python ingest.py --url "https://youtube.com/watch?v=VIDEO_ID"

# ingest a local folder
python file_ingest.py --folder /path/to/docs

# Dropbox media sync
python dropbox_courses.py --root "/courses" --namespace global --dry-run
python dropbox_courses.py --root "/courses" --namespace global

# career recommendation eval
make eval-career

# demand-signal corpus builders
python scripts/build_job_ads_corpus.py \
  --input /path/to/job_ads_input.csv \
  --output evals/datasets/job_ads_corpus.jsonl

python scripts/build_enterprise_use_cases_corpus.py \
  --input /path/to/enterprise_use_cases_input.csv \
  --output evals/datasets/enterprise_use_cases_corpus.jsonl
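
The corpus builders take a CSV file and write JSONL, one JSON object per line. Their actual columns and cleaning rules are not described in this README, so the following is only a generic sketch of that CSV-to-JSONL shape; the `csv_to_jsonl` helper and the `title`/`skills` columns are hypothetical.

```python
import csv
import io
import json


def csv_to_jsonl(csv_text: str) -> str:
    """Convert CSV text with a header row into JSONL text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        # Trim whitespace; ignore overflow cells that have no header (key None).
        record = {
            key: (value or "").strip()
            for key, value in row.items()
            if key is not None
        }
        if any(record.values()):  # skip rows whose cells are all empty
            lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines) + ("\n" if lines else "")


# Hypothetical two-column input.
sample_csv = "title,skills\nML Engineer, python \n,\n"
jsonl = csv_to_jsonl(sample_csv)
```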

API Surface (Primary)

  • /api/career/* - capability, taxonomy, composites, patterns, and recommendation endpoints (/overview, /taxonomy, /taxonomy/inventory, /composites, /patterns, /recommend/jobs*)
  • /api/chat, /api/chat/stream - source-grounded chat
  • /api/library/* - source lifecycle and search
  • /api/ingest* - URL/upload/Dropbox ingestion flows
  • /api/projects/* - project planning and tracking
  • /api/social/* - OAuth account connect/disconnect + one-click publishing to X/LinkedIn
  • /api/network-ops/* - LinkedIn network snapshot persistence and KPI summary deltas
  • /api/stats/calls - raw per-call LLM trace records (provider/model/call-site/tokens/latency)
  • /api/changelog - GitHub release notes for the marketing changelog page
  • /api/chinese/* - Mandarin Chinese learning dashboard, vocab, lessons, and review sessions
  • /api/cantonese/* - Cantonese learning dashboard, vocab, lessons, and review sessions
  • /api/learnings* - extracted evidence and metadata
  • /api/user-skills* - declared skill profile
  • /api/evals/* - evaluation runs and comparisons

OpenAPI reference: http://localhost:8000/docs
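
Because the list above uses wildcards, the exact routes under a prefix are easiest to read from the OpenAPI schema. The sketch below assumes the backend keeps FastAPI's default schema location (`/openapi.json`); `routes_under` and `fetch_schema` are hypothetical helper names, and the sample schema is made up for illustration.

```python
import json
import urllib.request

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def routes_under(schema: dict, prefix: str) -> list[str]:
    """List 'METHOD /path' entries from an OpenAPI schema for a path prefix."""
    routes = []
    for path, path_item in schema.get("paths", {}).items():
        if not path.startswith(prefix):
            continue
        for key in path_item:
            if key in HTTP_METHODS:  # skip non-operation keys such as "parameters"
                routes.append(f"{key.upper()} {path}")
    return sorted(routes)


def fetch_schema(base_url: str = "http://localhost:8000") -> dict:
    """Download the schema; assumes FastAPI's default /openapi.json route."""
    with urllib.request.urlopen(f"{base_url}/openapi.json", timeout=5) as response:
        return json.load(response)


# Made-up schema fragment showing the expected shape.
sample_schema = {
    "paths": {
        "/api/career/overview": {"get": {}},
        "/api/career/composites": {"get": {}, "parameters": []},
        "/api/chat": {"post": {}},
    }
}
career_routes = routes_under(sample_schema, "/api/career")
```

With the backend running, `routes_under(fetch_schema(), "/api/career")` would list the live career endpoints instead of the sample.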

License

MIT. See LICENSE.
