Perplexica is an AI-powered answering engine.
High-performance, lightweight proxy and load balancer for LLM infrastructure. Intelligent routing, automatic failover, and unified model discovery across local and remote inference backends.
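As a rough illustration of the failover idea (a sketch, not this project's code), the Python snippet below tries each configured backend in order and falls back to the next on connection errors or 5xx responses. The backend URLs and the OpenAI-compatible `/v1/chat/completions` path are assumptions for illustration.

```python
# Failover sketch: walk the backend list until one answers successfully.
import requests

BACKENDS = [
    "http://localhost:11434/v1",  # e.g. a local Ollama instance (hypothetical)
    "http://gpu-box:8000/v1",     # e.g. a remote vLLM server (hypothetical)
]

def chat(messages: list[dict], model: str = "llama3") -> str:
    last_err = None
    for base in BACKENDS:
        try:
            r = requests.post(
                f"{base}/chat/completions",
                json={"model": model, "messages": messages},
                timeout=30,
            )
            r.raise_for_status()
            return r.json()["choices"][0]["message"]["content"]
        except requests.RequestException as err:
            last_err = err  # backend down or erroring: try the next one
    raise RuntimeError(f"all backends failed: {last_err}")
```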
Small Language Model Inference, Fine-Tuning and Observability. No GPU, no labeled data needed.
Emotional AI companions for personal relationships.
Run IBM Granite 4.0 locally on a Raspberry Pi 5 with Ollama. This is privacy-first AI: your data never leaves your device because everything runs 100% locally, with no cloud uploads and no third-party tracking.
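For anyone reproducing this setup, here is a minimal Python sketch that queries a locally running Ollama server over its REST API. The `/api/generate` endpoint is Ollama's standard generation route; the `granite4:micro` model tag is an assumption, so substitute whichever Granite 4.0 tag you actually pulled.

```python
# Query a local Ollama server; nothing leaves the machine.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "granite4:micro",  # assumed tag; use the tag you pulled
        "prompt": "Summarize why local inference helps privacy.",
        "stream": False,            # return one JSON object, not a stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```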
A private, local RAG (Retrieval-Augmented Generation) system using Flowise, Ollama, and open-source LLMs to chat with your documents securely and offline.
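A bare-bones version of the same local RAG loop, sketched directly against Ollama's HTTP API rather than through Flowise. The `nomic-embed-text` and `llama3` model names are placeholders for whatever embedding and chat models you run locally.

```python
# Minimal offline RAG: embed docs, retrieve by cosine similarity,
# stuff the best match into the prompt for a local chat model.
import requests
import numpy as np

OLLAMA = "http://localhost:11434"

def embed(text: str) -> np.ndarray:
    r = requests.post(f"{OLLAMA}/api/embeddings",
                      json={"model": "nomic-embed-text", "prompt": text})
    r.raise_for_status()
    return np.array(r.json()["embedding"])

docs = ["Invoices are due net-30.", "Backups run nightly at 02:00."]
doc_vecs = np.stack([embed(d) for d in docs])

def answer(question: str) -> str:
    q = embed(question)
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
    context = docs[int(sims.argmax())]  # top-1 retrieval, for brevity
    r = requests.post(f"{OLLAMA}/api/generate", json={
        "model": "llama3",
        "prompt": f"Context: {context}\n\nQuestion: {question}\nAnswer:",
        "stream": False,
    })
    r.raise_for_status()
    return r.json()["response"]

print(answer("When do backups run?"))
```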
Recallium is a local, self-hosted universal AI memory system providing a persistent knowledge layer for developer tools (Copilot, Cursor, Claude Desktop). It eliminates "AI amnesia" by automatically capturing, clustering, and surfacing decisions and patterns across all projects. It uses the Model Context Protocol (MCP) for universal compatibility and preserves privacy by keeping all data local.
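As a hedged sketch of the MCP side of such a system (not Recallium's actual code), a toy memory server built with the official Python MCP SDK's `FastMCP` helper might look like this; the tool names and the naive substring recall are purely illustrative.

```python
# Toy MCP memory server: exposes remember/recall tools that any
# MCP-aware client (Copilot, Cursor, Claude Desktop) could call.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("memory")
notes: list[str] = []  # in-memory store; a real system would persist

@mcp.tool()
def remember(note: str) -> str:
    """Store a decision or pattern for later recall."""
    notes.append(note)
    return f"stored ({len(notes)} notes total)"

@mcp.tool()
def recall(query: str) -> list[str]:
    """Return stored notes matching the query (naive substring search)."""
    return [n for n in notes if query.lower() in n.lower()]

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default
```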
🚀 7 Ways to Run Any LLM Locally - Simple Methods
🌳 Open-source RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval - Complete open-source implementation with 100% local LLMs (Granite Code 8B + mxbai-embed-large)
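RAPTOR's core loop is compact enough to sketch: embed the chunks, cluster them, summarize each cluster with a local LLM, and recurse on the summaries until a single root remains. The simplified Python below uses hard k-means where RAPTOR proper uses soft GMM clustering, and `embed`/`generate` are placeholders for local model calls.

```python
# Simplified RAPTOR tree builder: each level is the cluster
# summaries of the level below; retrieval can search every level.
import numpy as np
from sklearn.cluster import KMeans

def build_tree(chunks: list[str], embed, generate, branch: int = 4) -> list[list[str]]:
    levels = [chunks]
    while len(levels[-1]) > 1:
        layer = levels[-1]
        k = max(1, len(layer) // branch)
        labels = KMeans(n_clusters=k, n_init="auto").fit_predict(
            np.stack([embed(c) for c in layer]))
        summaries = []
        for cluster in range(k):
            members = [c for c, l in zip(layer, labels) if l == cluster]
            summaries.append(generate("Summarize:\n" + "\n".join(members)))
        levels.append(summaries)  # next level up is the cluster summaries
    return levels
```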
A web-based Q&A tool that lets users extract and query website content using FastAPI, FAISS, and a local TinyLlama-1.1B model, with no external APIs. Built with React, it offers a minimal UI for seamless AI-driven search.
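A hedged sketch of what such a FastAPI + FAISS retrieval path can look like (not this project's actual code); the sentence-transformers embedder is an assumption, since the project may embed pages differently.

```python
# Index page chunks in FAISS and expose a nearest-neighbor query endpoint.
import faiss
import numpy as np
from fastapi import FastAPI
from sentence_transformers import SentenceTransformer

app = FastAPI()
model = SentenceTransformer("all-MiniLM-L6-v2")  # small local embedder (assumed)

chunks = ["Page text chunk one.", "Page text chunk two."]
vecs = model.encode(chunks).astype("float32")
index = faiss.IndexFlatL2(vecs.shape[1])         # exact L2 search, no training
index.add(vecs)

@app.get("/query")
def query(q: str, k: int = 2):
    qv = model.encode([q]).astype("float32")
    dists, ids = index.search(qv, k)             # k nearest chunks
    return {"matches": [chunks[i] for i in ids[0]]}
```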
LocalPrompt is an AI-powered tool designed to refine and optimize AI prompts, helping users run locally hosted AI models like Mistral-7B for privacy and efficiency. Ideal for developers seeking to run LLMs locally without external APIs.
Production-ready test-time compute optimization framework for LLM inference. Implements Best-of-N, Sequential Revision, and Beam Search strategies. Validated with models up to 7B parameters.
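Best-of-N is the simplest of the three strategies to illustrate: sample N candidate answers and keep the one a scorer ranks highest. In the sketch below, `generate` and `score` are placeholders for a local LLM call and a verifier or reward model.

```python
# Best-of-N test-time compute: spend inference budget on N samples,
# then select the candidate the scorer likes best.
from typing import Callable

def best_of_n(prompt: str,
              generate: Callable[[str], str],
              score: Callable[[str, str], float],
              n: int = 8) -> str:
    candidates = [generate(prompt) for _ in range(n)]        # independent samples
    return max(candidates, key=lambda c: score(prompt, c))   # highest-scoring wins
```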
Powers the local RAG pipeline in the BrainDrive Chat w/ Docs plugin.
The high-performance brain for Turbo Cloud Gallery. Features Smart RAM caching, aggressive WebP compression, AI-powered memories (Ollama), and direct Telegram file smuggling. Optimized to run fast on low-end hardware.