FastAPI backend for CortexAI: auth, chat CRUD, streaming AI responses (Groq + LangChain), optional web search (Tavily), and PostgreSQL-backed history with smart summarization.
- Auth: sign up and log in via `/api/auth`; JWT access tokens, bcrypt password hashing.
- Chats: create, list, get, and delete chats at `/api/chats`; each chat has a title and messages ordered by `position`.
- Messages: append user messages and stream assistant replies; long conversations use a summary of older messages so the AI keeps context without exceeding the context window.
- AI: Groq LLM via LangChain; optional Tavily web search when the query needs up-to-date info.
- Database: PostgreSQL (e.g. Neon); SQLAlchemy ORM; tables: `users`, `chats`, `messages`.
```
backend/
├── README.md               ← This file
├── main.py                 ← FastAPI app, CORS, lifespan, routers
├── requirements.txt        ← Pip dependencies
├── pyproject.toml          ← Project metadata and Python deps (uv/pip)
├── .env                    ← Not in git: DATABASE_URL, GROQ_API_KEY, JWT_SECRET_KEY, etc.
│
├── config/
│   └── database_config.py  ← SQLAlchemy engine, SessionLocal, Base, get_db
│
├── models/
│   └── __init__.py         ← SQLAlchemy models: User, Chat, Message
│
├── routes/
│   ├── user_routes.py      ← POST /api/auth/signup, POST /api/auth/login
│   └── agent_routes.py     ← /api/chats: list, create, get, delete, stream message
│
├── agent/
│   ├── prompts.py          ← System prompts (CortexAI, web search, summarization)
│   └── tools.py            ← get_llm, summarize_history_text, web_search, needs_web_search, etc.
│
└── utils/
    └── auth.py             ← create_access_token, get_current_user, get_password_hash, verify_password
```
- Entry point: `main.py`; runs with `uvicorn main:app`.
- Auth: JWT in `Authorization: Bearer <token>`; `get_current_user` is used on protected routes.
- Agent: `agent_routes` uses `agent/tools` and `agent/prompts` for the LLM and summarization.
- Python: 3.12+ recommended.
- PostgreSQL: create a database (e.g. Neon); you need the connection string for `DATABASE_URL`.
- Virtual env and install:

```
cd backend
python -m venv .venv
.venv\Scripts\activate        # Windows
# source .venv/bin/activate   # macOS/Linux
pip install -r requirements.txt
```

Or with uv: `uv sync` (if using pyproject.toml).

- Environment: copy `.env.example` to `.env` (or create `.env`) and fill in the variables below. Do not commit `.env`.
- Run:

```
uvicorn main:app --reload
```

- API: http://localhost:8000
- Interactive docs: http://localhost:8000/docs
| Variable | Required | Description |
|---|---|---|
| `DATABASE_URL` | Yes | PostgreSQL connection string (e.g. `postgresql://user:pass@host/db?sslmode=require`). |
| `GROQ_API_KEY` | Yes | Groq API key for the LLM. |
| `JWT_SECRET_KEY` | Yes | Secret used to sign JWT access tokens. |
| `JWT_ALGORITHM` | No | Default `HS256`. |
| `ACCESS_TOKEN_EXPIRE_MINUTES` | No | Token expiry (default in code). |
| `BACKEND_CORS_ORIGINS` | No | Comma-separated origins (e.g. `http://localhost:5173`). Default allows common localhost URLs. |
| `TAVILY_API_KEY` | No | If set, enables web search for the agent. |
| `GROQ_MODEL` | No | Model name for Groq (e.g. `llama-3.1-70b-versatile`). |
| `PORT` | No | Port for uvicorn when run via `main.py` (default 8000). |
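A sample `.env` with the keys from the table above; every value here is a placeholder to replace with your own (the `60` for token expiry is illustrative, the real default lives in the code):

```env
DATABASE_URL=postgresql://user:pass@host/db?sslmode=require
GROQ_API_KEY=your-groq-api-key
JWT_SECRET_KEY=change-me-to-a-long-random-string
JWT_ALGORITHM=HS256
ACCESS_TOKEN_EXPIRE_MINUTES=60
BACKEND_CORS_ORIGINS=http://localhost:5173
TAVILY_API_KEY=your-tavily-api-key
GROQ_MODEL=llama-3.1-70b-versatile
PORT=8000
```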
| Method | Path | Description |
|---|---|---|
| GET | `/` | Health/hello; returns app name and version. |
| POST | `/api/auth/signup` | Register; body: `{ "email", "password" }`. |
| POST | `/api/auth/login` | Login (form or JSON); returns `access_token`. |
| GET | `/api/chats` | List current user's chats (JWT required). |
| POST | `/api/chats` | Create a new chat (JWT required). |
| GET | `/api/chats/{id}` | Get one chat with all messages (JWT required). |
| DELETE | `/api/chats/{id}` | Delete a chat and its messages (JWT required). |
| POST | `/api/chats/{id}/messages` | Send a user message; response is streaming (JWT required). |
Streaming: the frontend consumes the response body as a stream; the backend uses the LLM and (when needed) web search, then streams the assistant reply.
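A minimal client-side sketch of consuming the stream, using only the standard library. The endpoint path and `Authorization` header follow the table above; the `{"content": ...}` body shape is an assumption, so check the route's request model before relying on it:

```python
import json
import urllib.request


def stream_assistant_reply(base_url, token, chat_id, content, chunk_size=1024):
    """POST a user message and yield the assistant reply chunk by chunk.

    Assumes the body shape {"content": ...}; adjust to the route's real schema.
    """
    req = urllib.request.Request(
        f"{base_url}/api/chats/{chat_id}/messages",
        data=json.dumps({"content": content}).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        # Read the body incrementally instead of waiting for the full reply.
        while chunk := resp.read(chunk_size):
            yield chunk.decode()


def join_chunks(chunks):
    """Accumulate streamed chunks into the full assistant message."""
    return "".join(chunks)
```

A frontend would render each chunk as it arrives and keep the joined result as the final assistant message.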
The application uses a relational database with three main tables that form a hierarchical structure for managing users, their chat sessions, and individual messages.
```
┌─────────────┐
│    User     │
├─────────────┤
│ id (PK)     │
│ email       │
│ password    │
│ created_at  │
└──────┬──────┘
       │
       │ 1:N
       │
       ▼
┌─────────────┐
│    Chat     │
├─────────────┤
│ id (PK)     │
│ user_id (FK)│◄───┐
│ title       │    │
│ summary     │    │
│ created_at  │    │
│ updated_at  │    │
│ last_msg_at │    │
└──────┬──────┘    │
       │           │
       │ 1:N       │
       │           │
       ▼           │
┌─────────────┐    │
│   Message   │    │
├─────────────┤    │
│ id (PK)     │    │
│ chat_id (FK)├────┘
│ role        │
│ content     │
│ position    │
│ created_at  │
└─────────────┘
```
- Purpose: Stores user authentication and account information
- Key Fields:
  - `id`: Primary key (auto-increment integer)
  - `email`: Unique email address (indexed for fast lookups)
  - `password_hash`: Hashed password for authentication
  - `created_at`: Timestamp of account creation
- Purpose: Represents a conversation session between a user and the AI assistant
- Key Fields:
  - `id`: Primary key (auto-increment integer)
  - `user_id`: Foreign key to `users.id` (CASCADE delete: if a user is deleted, all their chats are deleted)
  - `title`: Optional title for the chat (auto-generated from the first user message)
  - `summary`: Critical field; stores the AI-generated summary of older messages (see the History Management section)
  - `created_at`: When the chat was created
  - `updated_at`: Automatically updated when the chat is modified
  - `last_message_at`: Timestamp of the most recent message (used for sorting chats)
- Purpose: Stores individual messages within a chat conversation
- Key Fields:
  - `id`: Primary key (auto-increment integer)
  - `chat_id`: Foreign key to `chats.id` (CASCADE delete: if a chat is deleted, all its messages are deleted)
  - `role`: Either `"user"` or `"assistant"` (string, max 50 chars)
  - `content`: The actual message text (TEXT type for unlimited length)
  - `position`: Critical field; integer that determines message order within a chat
  - `created_at`: Timestamp of when the message was created
- User → Chats: One-to-Many relationship
  - One user can have multiple chat sessions
  - When a user is deleted, all their chats are automatically deleted (CASCADE)
- Chat → Messages: One-to-Many relationship
  - One chat contains multiple messages
  - When a chat is deleted, all its messages are automatically deleted (CASCADE)
  - Messages are ordered by the `position` field (ascending)
- Foreign Key Constraints:
  - `chats.user_id` references `users.id` with `ON DELETE CASCADE`
  - `messages.chat_id` references `chats.id` with `ON DELETE CASCADE`
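The schema above can be sketched in SQLAlchemy's declarative style. This is a hedged reconstruction from the diagram and field lists, not the backend's actual code (which lives in `models/__init__.py` and may differ in column sizes and relationship options):

```python
from datetime import datetime, timezone

from sqlalchemy import (Column, DateTime, ForeignKey, Integer, String, Text,
                        create_engine)
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()


def utcnow():
    # Timezone-aware timestamps, per the design notes below.
    return datetime.now(timezone.utc)


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    email = Column(String(255), unique=True, index=True, nullable=False)
    password_hash = Column(String(255), nullable=False)
    created_at = Column(DateTime(timezone=True), default=utcnow)
    chats = relationship("Chat", cascade="all, delete", passive_deletes=True)


class Chat(Base):
    __tablename__ = "chats"
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey("users.id", ondelete="CASCADE"), index=True)
    title = Column(String(255), nullable=True)
    summary = Column(Text, nullable=True)  # AI-generated summary of older messages
    created_at = Column(DateTime(timezone=True), default=utcnow)
    updated_at = Column(DateTime(timezone=True), default=utcnow, onupdate=utcnow)
    last_message_at = Column(DateTime(timezone=True), nullable=True)
    messages = relationship("Message", order_by="Message.position",
                            cascade="all, delete", passive_deletes=True)


class Message(Base):
    __tablename__ = "messages"
    id = Column(Integer, primary_key=True)
    chat_id = Column(Integer, ForeignKey("chats.id", ondelete="CASCADE"), index=True)
    role = Column(String(50), nullable=False)   # "user" or "assistant"
    content = Column(Text, nullable=False)      # TEXT: unlimited length
    position = Column(Integer, nullable=False)  # ordering within the chat
    created_at = Column(DateTime(timezone=True), default=utcnow)


# The schema can be created against any engine, e.g. SQLite in memory for a quick check:
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
```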
The application implements a sophisticated history management system that balances complete data preservation with efficient context handling for the AI model.
The system maintains 100% of all messages in the database while using intelligent summarization to manage context window limitations.
Every message is assigned a sequential position integer that ensures correct ordering:
```python
# When a new user message is added (db, chat, Message come from the route handler):
last_position = (
    db.query(func.max(Message.position))
    .filter(Message.chat_id == chat.id)
    .scalar()
    or 0
)
user_position = last_position + 1
assistant_position = last_position + 2
```

Why this matters:
- Messages are always retrieved in order: `ORDER BY position ASC`
- Even if messages are created simultaneously, position ensures the correct sequence
- No reliance on timestamps for ordering (which can have race conditions)
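The position bookkeeping can be expressed as a tiny pure function; `next_position` and `append_turn` are hypothetical helper names for illustration, not the backend's actual functions:

```python
def next_position(existing_positions):
    """Return the position for the next message in a chat.

    An empty chat starts at position 1.
    """
    last = max(existing_positions, default=0)
    return last + 1


def append_turn(existing_positions):
    """Assign positions for a new user message and its assistant reply."""
    user_pos = next_position(existing_positions)
    assistant_pos = user_pos + 1
    return user_pos, assistant_pos
```

Because positions are assigned from the current maximum rather than from timestamps, two messages written in the same millisecond still get a deterministic order.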
The system divides message history into two categories:

Recent messages (the last 5):
- Stored: full, complete message content
- Usage: sent directly to the AI model as context
- Why: recent messages contain the most relevant context for generating responses

Older messages:
- Stored: full, complete message content (still in the database)
- Usage: summarized into the `chat.summary` field
- Why: reduces token usage while preserving important information
When a chat exceeds 5 messages, the system automatically summarizes older messages:
```python
if len(all_messages) > 5:
    older_messages = all_messages[:-5]  # Everything except the last 5
    history_text = format_messages(older_messages)
    # Generate or update the summary
    chat.summary = summarize_history_text(
        history_text,
        existing_summary=chat.summary,
    )
```

Summary Update Logic:
- First time (no existing summary): Creates a new summary from older messages
- Subsequent times: Updates existing summary by incorporating new older messages
- Result: Summary stays concise (few paragraphs) but accumulates important context
When generating a response, the system constructs the context as follows:
```
[System Prompt]
       ↓
[Conversation Summary] (if exists)
       ↓
[Recent 5 Messages] (full content)
       ↓
[New User Message]
       ↓
[AI Generates Response]
```
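The assembly above can be sketched as a small pure function; `build_context` and `RECENT_WINDOW` are illustrative names, not the backend's real identifiers:

```python
RECENT_WINDOW = 5  # recent messages sent verbatim, per the README


def build_context(system_prompt, summary, messages, new_user_message):
    """Return the (role, content) pairs sent to the LLM, in order."""
    context = [("system", system_prompt)]
    if summary:
        # The summary of older messages rides along as extra system context.
        context.append(("system", f"Conversation summary so far: {summary}"))
    for msg in messages[-RECENT_WINDOW:]:  # recent messages, full content
        context.append((msg["role"], msg["content"]))
    context.append(("user", new_user_message))
    return context
```

The token budget therefore stays roughly constant regardless of chat length: one summary plus a fixed window of full messages.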
Example Flow:
Chat has 25 messages total:
- Messages 1-20: Summarized into chat.summary
- Messages 21-24: Recent messages (full content)
- Message 25: New user message
- Message 26: AI response (to be generated)
Context sent to AI:
1. System prompt
2. Summary of messages 1-20
3. Full messages 21-25
4. AI generates message 26
- Complete Data Preservation
  - All messages are permanently stored
  - Users can always view the full conversation history
  - No data loss, even for very long conversations
- Efficient Context Management
  - Only recent messages consume the full token budget
  - Older messages are compressed into summaries
  - Enables handling of very long conversations (100+ messages)
- Incremental Summary Updates
  - The summary is updated incrementally, not regenerated from scratch
  - More efficient than re-summarizing the entire history each time
  - Maintains continuity of context
- Correct Message Ordering
  - Position-based ordering prevents race conditions
  - Guarantees messages are processed in the correct sequence
  - Works correctly even with concurrent requests
Fetch a chat's messages in order:

```sql
SELECT * FROM messages
WHERE chat_id = ?
ORDER BY position ASC
```

Get the next position:

```sql
SELECT MAX(position) FROM messages
WHERE chat_id = ?
```

Split recent vs. older messages (SQLAlchemy):

```python
all_messages = (
    db.query(Message)
    .filter(Message.chat_id == chat.id)
    .order_by(Message.position)
    .all()
)

recent_messages = all_messages[-5:]  # Last 5
older_messages = all_messages[:-5]   # Everything before the last 5
```

- Atomic Operations: User message is saved before generating the AI response
- Transaction Safety: All related updates (message, summary, title) happen in single transaction
- Cascade Deletes: Deleting a user/chat automatically cleans up related data
- Indexed Queries: Foreign keys and frequently queried fields are indexed
- Timezone Awareness: All timestamps use timezone-aware datetime objects
The `summarize_history_text()` function:
- Uses AI Model: Leverages the same LLM (Groq) to generate summaries
- Incremental Updates: When summary exists, it updates rather than replaces
- Concise Output: Prompts ensure summaries stay brief (few paragraphs)
- Context Preservation: Important details, decisions, and user preferences are maintained
Example Summary Update:

```
Existing Summary: "User asked about Python programming. Discussed functions and classes."

New Messages to Add: "User asked about async/await. Explained event loops."

Updated Summary: "User asked about Python programming including functions, classes,
and async/await patterns. Discussed event loops and asynchronous programming concepts."
```
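The incremental-update behavior can be sketched with the LLM call abstracted into a pluggable `llm` callable; the backend calls Groq via LangChain, and the prompt wording here is illustrative (the real prompts live in `agent/prompts.py`):

```python
def summarize_history_text(history_text, existing_summary=None, llm=None):
    """Build the summarization prompt and return the (updated) summary.

    With an existing summary, the prompt asks the model to update it rather
    than regenerate from scratch; otherwise it asks for a fresh summary.
    """
    if existing_summary:
        prompt = (
            "Update this conversation summary with the new messages below. "
            "Keep it to a few concise paragraphs and preserve important "
            "details, decisions, and user preferences.\n\n"
            f"Current summary:\n{existing_summary}\n\n"
            f"New messages:\n{history_text}"
        )
    else:
        prompt = (
            "Summarize the conversation below in a few concise paragraphs, "
            "preserving important details, decisions, and user preferences.\n\n"
            f"{history_text}"
        )
    return llm(prompt)
```

Passing the previous summary back into the prompt is what keeps the summary's size roughly constant while its coverage grows with the conversation.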
This architecture provides a robust, scalable solution for managing chat history that:
- ✅ Preserves all data permanently
- ✅ Handles conversations of any length efficiently
- ✅ Maintains correct message ordering
- ✅ Provides rich context to AI while managing token limits
- ✅ Enables fast retrieval and display of chat history
The combination of full message storage with intelligent summarization ensures both data integrity and optimal performance for AI interactions.