ThemisDB provides integrated documentation assistance through AQL functions that leverage a pre-compiled documentation database and advanced AI capabilities. These functions enable users to query documentation, get configuration help, and troubleshoot issues directly from AQL queries.
NEW: Unified HELP() Function - A single intelligent function that uses three-tier intent detection to automatically determine what you need and provide the most appropriate response. This is the recommended way to access documentation assistance.
Three-Tier Intent Detection:
- Native NLP - Uses ThemisDB's CLASSIFY() function (primary, fastest)
- LLM-Based - Uses embedded LLM for semantic understanding (secondary, most accurate)
- Regex Fallback - Pattern matching for guaranteed reliability (tertiary, always works)
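The fall-through behaviour of the three tiers can be illustrated with a short Python sketch. It is not ThemisDB's implementation: `detect_intent` and the three stand-in classifiers are hypothetical names, and the sketch assumes a tier signals "unavailable" by returning `None` or raising an exception.

```python
from typing import Callable, Optional, Sequence

# Each tier takes the query text and returns an intent label, or None when
# the tier is unavailable or cannot decide (an assumption of this sketch).
Classifier = Callable[[str], Optional[str]]


def detect_intent(query: str, tiers: Sequence[Classifier], default: str = "general") -> str:
    """Try each tier in order and return the first intent that is produced."""
    for tier in tiers:
        try:
            intent = tier(query)
        except Exception:
            # A failing tier is treated like an unavailable one.
            intent = None
        if intent is not None:
            return intent
    return default


# Hypothetical stand-ins for the three tiers described above.
def native_nlp(query: str) -> Optional[str]:
    return None  # placeholder: tier not available, so fall through


def llm_classifier(query: str) -> Optional[str]:
    raise RuntimeError("no model loaded")  # simulate an unavailable LLM


def regex_fallback(query: str) -> Optional[str]:
    return "troubleshooting" if "error" in query.lower() else None


TIERS = [native_nlp, llm_classifier, regex_fallback]
```

Calling `detect_intent('Connection error on port 8529', TIERS)` skips the first two tiers and takes its answer from the regex stand-in. A query that no tier recognises falls back to the default intent.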
Key Features:
- Native AI Integration - Leverages ThemisDB's built-in NLP capabilities
- LLM-Powered Classification - Uses embedded LLM for complex cases
- Regex Fallback - Ensures reliability when AI is unavailable
- SSE Compatible - Supports Server-Sent Events for streaming
- MCP Integration - Works with Model Context Protocol
- User Feedback - Can learn from corrections over time
The unified intelligent helper function with three-tier intent detection.
Syntax:
SELECT HELP(question_or_query) AS answer;
Parameters:
- question_or_query (string): Any question, problem description, or search request
Returns:
- String containing the appropriate response (answer, guidance, search results, or solution)
How it works:
The HELP() function uses a three-tier approach for maximum reliability and accuracy:
1. Primary: Native NLP Classification ⚡ Fastest
   - Uses ThemisDB's built-in CLASSIFY() function
   - Zero-shot classification with no training required
   - Native implementation for best performance
   - Currently in development (for now, falls through to the next tier)
2. Secondary: LLM-Based Classification 🧠 Most Accurate
   - Sends your query to an embedded LLM for intelligent classification
   - LLM analyzes semantic meaning and context
   - Returns intent: configuration, troubleshooting, search, or general
   - Highly accurate and context-aware
   - Supports multiple languages
3. Tertiary: Regex Pattern Matching 🛡️ Always Reliable
   - If both AI methods are unavailable, uses keyword-based detection
   - Ensures reliability even without AI
   - Pattern matching on common keywords
   - Fast and predictable
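To make the tertiary tier concrete, here is a minimal Python sketch of keyword-based intent detection. The patterns are hypothetical, since the keywords HELP() actually matches are not listed in this document. The ordering (first matching intent wins) is likewise an assumption.

```python
import re

# Hypothetical keyword patterns; the actual patterns used by HELP() are not
# documented here. Order matters: the first matching intent wins.
INTENT_PATTERNS = [
    ("troubleshooting",
     re.compile(r"\b(error|fails?|failed|hangs?|crash\w*|not responding)\b", re.I)),
    ("configuration",
     re.compile(r"\b(configure|configuration|setup|set up|settings?)\b", re.I)),
    ("search",
     re.compile(r"\b(search|find|look for)\b", re.I)),
]


def classify_by_keywords(query: str) -> str:
    """Return the first intent whose pattern matches, else 'general'."""
    for intent, pattern in INTENT_PATTERNS:
        if pattern.search(query):
            return intent
    return "general"
```

With these patterns, 'What is vector search?' is classified as a search request because it contains the word "search", although it reads as a general question. Misroutes of this kind are a likely reason to keep keyword matching as the last tier.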
Intent Routing:
- Configuration → Configuration help with topic extraction
- Troubleshooting → Error diagnosis and solutions
- Search → Document search with ranked results
- General → RAG-powered query with context
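The routing table above maps naturally onto a dispatch dictionary. The Python sketch below is illustrative only: the four handlers are hypothetical stand-ins for the specialized functions, and sending unknown intents to the general handler is an assumption.

```python
from typing import Callable, Dict


# Hypothetical handlers standing in for the specialized functions.
def config_help(text: str) -> str:
    return f"[config] {text}"


def troubleshoot(text: str) -> str:
    return f"[troubleshoot] {text}"


def search_docs(text: str) -> str:
    return f"[search] {text}"


def rag_query(text: str) -> str:
    return f"[general] {text}"


ROUTES: Dict[str, Callable[[str], str]] = {
    "configuration": config_help,
    "troubleshooting": troubleshoot,
    "search": search_docs,
    "general": rag_query,
}


def route(intent: str, text: str) -> str:
    # Unknown intents fall back to the general handler (an assumption).
    handler = ROUTES.get(intent, rag_query)
    return handler(text)
```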
Examples:
-- General documentation questions (RAG-powered)
SELECT HELP('How do I enable sharding?') AS answer;
SELECT HELP('What is vector search?') AS info;
SELECT HELP('Explain RAID configuration') AS explanation;
-- Configuration help (AI detects setup intent)
SELECT HELP('Configure security settings') AS guide;
SELECT HELP('Setup replication') AS setup_guide;
SELECT HELP('How to configure sharding?') AS config;
-- Troubleshooting (AI detects problem/error)
SELECT HELP('Server hangs at startup') AS solution;
SELECT HELP('Connection error on port 8529') AS fix;
SELECT HELP('Database fails to start') AS troubleshooting;
-- Document search (AI detects information retrieval)
SELECT HELP('Search for RAID documentation') AS search_results;
SELECT HELP('Find information about vector embeddings') AS docs;
SELECT HELP('Look for security best practices') AS references;
-- Works with compound queries
SELECT HELP('I have an error: server not responding') AS solution;
SELECT HELP('Need to configure TLS, how?') AS guide;
The following functions are available when you need explicit control over the type of operation. The HELP() function uses them internally.
Query the documentation database with natural language and get an AI-generated answer.
Syntax:
SELECT DOCS_QUERY(query_text) AS answer;
Parameters:
- query_text (string): Natural language question about ThemisDB
Returns:
- String containing the generated answer
Examples:
-- Basic usage
SELECT DOCS_QUERY('How do I enable sharding?') AS answer;
-- Multiple queries
SELECT
DOCS_QUERY('What is the default port?') AS port_info,
DOCS_QUERY('How to configure TLS?') AS tls_info;
-- Use inside a FOR loop
FOR doc IN :document
FILTER doc.type == 'configuration'
RETURN {
title: doc.title,
help: DOCS_QUERY(CONCAT('Explain ', doc.title))
};
DOCS_SEARCH(query: string, limit: int = 5) -> array
Search the documentation database without LLM generation. Returns relevant documents ordered by relevance score.
Syntax:
SELECT DOCS_SEARCH(query_text, max_results) AS results;
Parameters:
- query_text (string): Search query
- max_results (int, optional): Maximum number of results (default: 5)
Returns:
- JSON array of document objects with:
  - file_name: Document filename
  - file_path: Full path to document
  - relevance_score: Relevance score (0.0 to 1.0)
  - content_type: MIME type
  - content_preview: First 200 characters of content
  - metadata: Additional metadata
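For client code that receives this array as JSON text, a small Python sketch of post-filtering by relevance_score may help. The helper name `filter_results`, the 0.7 threshold, and the sample payload are all made up for illustration. Only the field names come from the list above.

```python
import json
from typing import Any, Dict, List


def filter_results(results_json: str, min_score: float = 0.7) -> List[Dict[str, Any]]:
    """Keep documents at or above min_score, highest score first."""
    docs = json.loads(results_json)
    kept = [d for d in docs if d["relevance_score"] >= min_score]
    return sorted(kept, key=lambda d: d["relevance_score"], reverse=True)


# Made-up sample payload using the documented field names.
SAMPLE = json.dumps([
    {"file_name": "raid.md", "file_path": "docs/raid.md", "relevance_score": 0.91,
     "content_type": "text/markdown", "content_preview": "RAID levels ...", "metadata": {}},
    {"file_name": "intro.md", "file_path": "docs/intro.md", "relevance_score": 0.42,
     "content_type": "text/markdown", "content_preview": "Welcome ...", "metadata": {}},
])
```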
Examples:
-- Basic search
SELECT DOCS_SEARCH('RAID configuration', 10) AS relevant_docs;
-- Search with filtering
LET docs = DOCS_SEARCH('vector embeddings', 5)
FOR doc IN docs
FILTER doc.relevance_score > 0.7
RETURN {
title: doc.file_name,
score: doc.relevance_score,
preview: doc.content_preview
};
-- Combine search with LLM
LET docs = DOCS_SEARCH('sharding', 3)
LET answer = DOCS_QUERY('Explain sharding configuration')
RETURN {
generated_answer: answer,
source_documents: docs
};
DOCS_CONFIG_HELP(topic: string) -> string
Get configuration assistance for a specific topic. This function is optimized for configuration-related queries and returns structured guidance.
Syntax:
SELECT DOCS_CONFIG_HELP(topic) AS help;
Parameters:
- topic (string): Configuration topic (e.g., "sharding", "security", "replication")
Returns:
- String containing configuration guidance
Examples:
-- Get help on specific topics
SELECT DOCS_CONFIG_HELP('security') AS security_config;
SELECT DOCS_CONFIG_HELP('sharding') AS sharding_config;
SELECT DOCS_CONFIG_HELP('replication') AS replication_config;
-- Batch configuration help
FOR topic IN ['security', 'replication', 'caching']
RETURN {
topic: topic,
configuration_guide: DOCS_CONFIG_HELP(topic)
};
-- Dynamic configuration help
FOR setting IN :configuration
FILTER setting.needs_documentation == true
RETURN {
setting_name: setting.name,
help: DOCS_CONFIG_HELP(setting.category)
};
DOCS_TROUBLESHOOT(error: string) -> string
Get troubleshooting assistance for errors or issues. This function analyzes the error description and provides potential solutions.
Syntax:
SELECT DOCS_TROUBLESHOOT(error_description) AS solution;
Parameters:
- error_description (string): Description of the error or issue
Returns:
- String containing troubleshooting guidance and potential solutions
Examples:
-- Troubleshoot specific errors
SELECT DOCS_TROUBLESHOOT('Server hangs at startup') AS solution;
SELECT DOCS_TROUBLESHOOT('Connection refused on port 8529') AS fix;
SELECT DOCS_TROUBLESHOOT('Out of memory error during query') AS help;
-- Log and troubleshoot errors
FOR error IN :error_log
FILTER error.severity == 'HIGH'
LIMIT 10
RETURN {
error_message: error.message,
timestamp: error.timestamp,
solution: DOCS_TROUBLESHOOT(error.message)
};
-- Interactive troubleshooting
LET error_msg = 'Failed to create vector index'
LET initial_help = DOCS_TROUBLESHOOT(error_msg)
LET related_docs = DOCS_SEARCH(error_msg, 3)
RETURN {
error: error_msg,
troubleshooting_steps: initial_help,
related_documentation: related_docs
};
DOCS_STATS() -> object
Get statistics about the documentation database including total documents, database size, and cache information.
Syntax:
SELECT DOCS_STATS() AS stats;
Returns:
- JSON object containing:
  - total_documents: Total number of documents in database
  - database_version: Version of the documentation database
  - generation_time: When the database was generated
  - themisdb_version: ThemisDB version the docs are for
  - cache_stats: Cache hit/miss statistics
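As one way to use cache_stats on the client side, the Python sketch below derives a hit ratio from the returned object. The inner key names `hits` and `misses` are assumptions: this document says only that cache_stats holds hit/miss statistics, so check the real payload before relying on them.

```python
from typing import Any, Dict, Optional


def cache_hit_ratio(stats: Dict[str, Any]) -> Optional[float]:
    """Return hits / (hits + misses), or None when no lookups were recorded."""
    cache = stats.get("cache_stats") or {}
    hits = cache.get("hits", 0)      # assumed key name
    misses = cache.get("misses", 0)  # assumed key name
    total = hits + misses
    return hits / total if total else None
```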
Example:
SELECT DOCS_STATS() AS documentation_info;
Advanced Usage Patterns
1. Contextual Help System
-- Create a help function based on user context
LET user_query = 'How do I backup my data?'
LET search_results = DOCS_SEARCH(user_query, 5)
LET llm_answer = DOCS_QUERY(user_query)
RETURN {
query: user_query,
answer: llm_answer,
related_documents: search_results,
follow_up_queries: [
'How often should I backup?',
'What is the backup format?',
'Can I restore from backup?'
]
};
2. Configuration Validation with Help
-- Validate configuration and provide help for invalid settings
FOR config IN :configuration
LET is_valid = config.value != null
LET help = is_valid ? null : DOCS_CONFIG_HELP(config.name)
RETURN {
setting: config.name,
value: config.value,
valid: is_valid,
help_if_invalid: help
};
3. Error Analysis Pipeline
-- Analyze recent errors and group by category
LET recent_errors = (
FOR error IN :error_log
FILTER error.timestamp > DATE_NOW() - 3600000 // Last hour
RETURN error
)
LET analyzed_errors = (
FOR error IN recent_errors
LET solution = DOCS_TROUBLESHOOT(error.message)
RETURN {
error: error,
solution: solution,
category: error.category
}
)
LET grouped = (
FOR item IN analyzed_errors
COLLECT category = item.category INTO group
RETURN {
category: category,
count: LENGTH(group),
solutions: group[*].item.solution
}
)
RETURN grouped;
4. Interactive Documentation Explorer
-- Search-driven documentation browsing
LET search_term = 'performance tuning'
LET matching_docs = DOCS_SEARCH(search_term, 10)
FOR doc IN matching_docs
LET summary = DOCS_QUERY(
CONCAT('Summarize in 2 sentences: ', doc.content_preview)
)
RETURN {
document: doc.file_name,
path: doc.file_path,
relevance: doc.relevance_score,
summary: summary,
full_content_available: doc.file_path
};
5. Batch Configuration Assistance
-- Get help for all configuration categories
LET categories = ['security', 'performance', 'networking', 'storage', 'clustering']
FOR category IN categories
LET config_help = DOCS_CONFIG_HELP(category)
LET related_docs = DOCS_SEARCH(category, 3)
RETURN {
category: category,
configuration_guide: config_help,
documentation_references: related_docs
};
Performance Considerations
Caching
The documentation assistant uses intelligent caching to improve performance:
- Response Cache: Frequently asked questions are cached
- Document Cache: Recently accessed documents stay in memory
- Search Cache: Recent search queries are cached
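A bounded response cache of this kind can be sketched in a few lines of Python. This is an illustration, not ThemisDB's cache: the class name `ResponseCache` is hypothetical, and least-recently-used eviction is an assumption, since the document does not state an eviction policy. The default of 1000 entries mirrors the limit given under Limitations.

```python
from collections import OrderedDict
from typing import Callable


class ResponseCache:
    """Bounded cache mapping a question to its answer (LRU eviction assumed)."""

    def __init__(self, max_entries: int = 1000) -> None:
        self.max_entries = max_entries
        self._entries: "OrderedDict[str, str]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, question: str, compute: Callable[[str], str]) -> str:
        if question in self._entries:
            self.hits += 1
            self._entries.move_to_end(question)  # mark as most recently used
            return self._entries[question]
        self.misses += 1
        answer = compute(question)
        self._entries[question] = answer
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict least recently used
        return answer
```

Repeating the same question reaches `compute` only once while the entry stays cached. Once the cache is full, the entry that has gone unused longest is dropped first.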
Query Optimization
-- GOOD: Single query with multiple operations
LET help = DOCS_QUERY('sharding configuration')
LET docs = DOCS_SEARCH('sharding', 5)
RETURN { help, docs };
-- AVOID: Multiple separate queries in a loop without need
FOR i IN 1..100
RETURN DOCS_QUERY('same question') // This will be cached, but is unnecessary
Batch Processing
-- GOOD: Batch related queries
LET topics = ['security', 'networking', 'storage']
FOR topic IN topics
RETURN DOCS_CONFIG_HELP(topic);
-- AVOID: Individual queries when batch is possible
RETURN {
security: DOCS_CONFIG_HELP('security'),
networking: DOCS_CONFIG_HELP('networking'),
storage: DOCS_CONFIG_HELP('storage')
}; // Better to use FOR loop
Error Handling
All documentation functions throw exceptions with descriptive messages if:
- Documentation database is not loaded
- Query fails
- Database is corrupted
Example error handling:
-- Graceful fallback
LET help = (
DOCS_QUERY('How to configure X?') OR
'Documentation not available. Please check docs.themisdb.com'
)
RETURN help;
Configuration
Environment Variables
- THEMIS_DOCS_DATABASE_PATH: Path to documentation database
- THEMIS_DOCS_DATABASE_TYPE: Database type ("json" or "rocksdb")
- THEMIS_ENABLE_DOCS_ASSISTANT: Enable/disable documentation assistant
Auto-Discovery
If not explicitly configured, the system searches for the documentation database in this order:
1. data/docs.db (RocksDB)
2. data/docs_database.json (JSON)
3. ./docs.db (Current directory, RocksDB)
4. ./docs_database.json (Current directory, JSON)
5. ../data/docs.db (Parent directory, RocksDB)
6. ../data/docs_database.json (Parent directory, JSON)
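The search order can be expressed as an ordered list of candidates that is scanned until the first existing path is found. The Python sketch below is illustrative only: `discover_docs_database` is a hypothetical helper, and it covers just the case where no explicit path has been configured.

```python
from pathlib import Path
from typing import Optional, Tuple

# Candidate locations in the documented search order, with their database type.
CANDIDATES = [
    ("data/docs.db", "rocksdb"),
    ("data/docs_database.json", "json"),
    ("./docs.db", "rocksdb"),
    ("./docs_database.json", "json"),
    ("../data/docs.db", "rocksdb"),
    ("../data/docs_database.json", "json"),
]


def discover_docs_database(base: Path = Path(".")) -> Optional[Tuple[Path, str]]:
    """Return (path, type) for the first candidate that exists, else None."""
    for relative, db_type in CANDIDATES:
        candidate = base / relative
        if candidate.exists():  # true for both files and directories
            return candidate, db_type
    return None
```

Because the list is scanned in order, a RocksDB database under data/ takes precedence over a JSON file in the current directory when both are present.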
Building Documentation Database
The documentation database is automatically generated during build when THEMIS_ENABLE_LLM=ON:
# Build with documentation database
cmake -B build -DTHEMIS_ENABLE_LLM=ON
cmake --build build
# Documentation database will be at: build/data/docs.db
Manual generation:
# Generate JSON database
python3 scripts/generate_docs_database.py --output data/docs_database.json
# Generate RocksDB database
python3 scripts/generate_docs_rocksdb.py --output data/docs.db --method cpp
g++ -std=c++17 data/import_docs_rocksdb.cpp -o import_docs_rocksdb -lrocksdb
./import_docs_rocksdb data/docs_database.json data/docs.db
Limitations
- Database Size: Documentation database is ~2-3 MB
- LLM Requirement: DOCS_QUERY, DOCS_CONFIG_HELP, and DOCS_TROUBLESHOOT require a loaded LLM model
- Cache Size: Response cache limited to 1000 entries by default
- Language: Currently supports English documentation only