A full-featured Chromium-based browser built with Electron that includes advanced component extraction and composition capabilities, plus a powerful AI Agent system for natural language system control.
Control your computer using natural language! The AI Agent system provides 39 powerful capabilities across 5 modules:
- System Control (18 capabilities): Open/close apps, manage services, analyze performance
- File Management (10 capabilities): Create, edit, view, organize files intelligently
- Process Management (3 capabilities): Monitor and control processes and services
- Task Automation (6 capabilities): Schedule tasks, create workflows, save macros
- AI Chat (2 capabilities): Natural conversation and intelligent Q&A
- Click the 🤖 Agent Chat button in the bottom right
- Type natural language commands like:
  - system info - View CPU, RAM, disk usage
  - find slow processes - Find resource hogs
  - list processes - Show running processes
  - open calculator - Launch apps
  - check disk - View disk space
  - help - See all 39 capabilities
- QUICK_START.md - Get started in 60 seconds
- AGENT_COMMANDS.md - Complete command reference
- AGENT_SYSTEM.md - Architecture details
The system shows capability names like open_application, but you need to use natural language:
- ❌ Don't type: open_application
- ✅ Type instead: open notepad
See QUICK_START.md for full details!
The Challenge: Modern workflows require monitoring multiple web applications, dashboards, and data sources simultaneously. Users often need to switch between dozens of tabs, losing focus and productivity while trying to compose insights from scattered information.
The Solution: ComponentFlow eliminates tab chaos by letting you extract specific components (charts, metrics, feeds) from any website and compose them into unified, live-updating dashboards - all while maintaining full browsing capabilities.
Full-featured Chromium browser with integrated AI summarizer for intelligent page analysis and insights.
Compose live components from multiple websites into unified dashboards with real-time data updates.
Elegant home interface with smart extraction, live composition, and AI analysis capabilities.
Built-in system explorer for seamless file management and workspace organization.
This project provides:
- Chromium Browser: A fully functional web browser with tabs, navigation, and all standard browser features
- Component Extraction: Extract specific HTML components (divs, sections, etc.) from any website using CSS selectors
- Component Composition: Combine extracted components from different websites into a single unified view
- AI-Powered Summarizer: Page summarization and analysis
- Multi-Tab Browsing: Open and manage multiple tabs just like Chrome
- Full Navigation: Address bar, back/forward buttons, refresh, home button
- Browse Any Website: Navigate to any URL on the internet
- Webview Integration: Each tab runs in its own isolated webview
- Offline Speech Recognition: Local Python server with Whisper for voice commands
- No Internet Dependency: Works completely offline once models are downloaded
- Natural Voice Commands: Speak commands like "open aws", "new tab", "refresh"
- Audio Recording: Built-in microphone recording with MediaRecorder API
- Privacy-Focused: All audio processing happens locally on your machine
📖 Local Speech Recognition Setup →
- Natural Language Commands: Control your system using plain English
- File Operations: Create, read, delete, move, and copy files
- Directory Management: Browse, search, and organize folders
- Command Execution: Run system commands safely with output capture
- Smart Command Parser: Understands intent from natural language
📖 Full System Control Documentation →
- CSS Selector Extraction: Extract any component using CSS selectors (e.g., .header, #main-content, div.card)
- Visual Preview: See extracted components before saving
- HTML Inspection: View the full HTML of extracted components
- Multi-Source Composition: Combine components from different websites
- Live Dynamic Components: Each component is a live webview that maintains its functionality
- Real-Time Updates: Components make actual API calls and update dynamically
- Isolated Rendering: Each component runs independently with its own context
- Auto-Sizing: Components automatically adjust height based on content
- Component Management: Save, remove, and organize extracted components
chromium-browser/
├── main.js # Electron main process
├── src/
│ ├── browser.html # Browser UI
│ ├── browser.css # Browser styles
│ └── browser.js # Browser logic (with local speech recognition)
├── speech_server.py # Python Flask server for speech recognition
├── requirements.txt # Python dependencies for speech server
├── start_speech_server.bat # Windows batch script to start speech server
├── LOCAL_SPEECH_README.md # Detailed speech recognition documentation
├── start-all.js # Start all apps at once
├── package.json
└── README.md
- Clone the repository:
git clone <your-repo-url>
cd chromium-browser
- Install dependencies:
npm install
- Environment Setup:
# Copy the example environment file
cp .env.example .env
# Edit .env and add your API keys
# Required: OPENAI_API_KEY for AI summarization features
- Set up Local Speech Recognition (Optional but recommended):
# Install Python dependencies for speech recognition
pip install -r requirements.txt
# Or use the provided setup script (Windows)
# Double-click start_speech_server.bat
- Optional - Set up local development apps:
The project includes sample applications for testing component extraction. These are optional for general browser use.
# Create the apps directory if you want to use demo apps
mkdir apps
# See DEVELOPMENT.md for instructions on setting up demo applications
Start everything with one command:
npm run start-apps
This will launch all three demo web apps and the Chromium browser automatically.
- Start the Speech Server (for voice commands):
# Windows - double-click this file
start_speech_server.bat
# Or manually:
python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
python speech_server.py
- Start the Chromium browser:
npm start
- Enter URL: Type any URL in the address bar and press Enter or click "Go"
- New Tab: Click "+ New Tab" to open additional tabs
- Navigate: Use ← → ⟳ ⌂ buttons for back, forward, refresh, and home
- Switch Tabs: Click on any tab to switch to it
- Close Tabs: Click the × on any tab to close it
- Start the Speech Server: Run start_speech_server.bat or python speech_server.py
- Enable Voice Commands: Click the 🎤 microphone icon in the toolbar
- Start Recording: Click the microphone again to begin recording
- Speak Clearly: Say a command like "open aws", "new tab", or "refresh"
- Wait for Processing: The system will transcribe and execute your command
Supported Voice Commands:
- "open aws" → Opens AWS cost app
- "open azure" → Opens Azure cost app
- "open store" → Opens e-commerce app
- "new tab" → Creates new tab
- "close tab" → Closes current tab
- "go home" → Navigates to home
- "refresh" → Reloads page
- "show system" → Opens system monitor
Voice Command Tips:
- Speak clearly and close to your microphone
- Wait for the recording to complete (5 seconds)
- Check the browser console if commands aren't working
- The speech server must be running on port 5000
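As a rough illustration of how a transcript could be matched against the supported voice commands above, here is a minimal sketch. The action names (openAwsApp, newTab, and so on) and both function names are assumptions made for this example and are not taken from browser.js. The transcript is normalized first because a speech model may return capitalization and punctuation, for example "Open AWS.".

```javascript
// Phrase -> action lookup built from the supported voice commands.
// The action names are hypothetical placeholders.
const VOICE_COMMANDS = new Map([
  ['open aws', 'openAwsApp'],
  ['open azure', 'openAzureApp'],
  ['open store', 'openStoreApp'],
  ['new tab', 'newTab'],
  ['close tab', 'closeTab'],
  ['go home', 'goHome'],
  ['refresh', 'refresh'],
  ['show system', 'showSystemMonitor'],
]);

// Lowercase the transcript, drop punctuation, and collapse whitespace so
// that "Open AWS." and "open aws" compare equal.
function normalizeTranscript(text) {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, '')
    .replace(/\s+/g, ' ')
    .trim();
}

// Return the action name for a transcript, or null when nothing matches.
function matchVoiceCommand(transcript) {
  const normalized = normalizeTranscript(transcript);
  return VOICE_COMMANDS.has(normalized) ? VOICE_COMMANDS.get(normalized) : null;
}

console.log(matchVoiceCommand('Open AWS.'));  // openAwsApp
console.log(matchVoiceCommand('play music')); // null
```

An exact-match table keeps the behavior predictable; a real implementation might also accept partial matches or synonyms.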
Method 1: Hover Selection (Recommended)
- Navigate to a website (e.g., http://localhost:3003 for the E-commerce dashboard)
- Click "📦 Extract" button to open the extraction panel
- Click "🎯 Pick Element by Hovering" button
- Hover over any element on the page - it will be highlighted with a green outline
- Click the element you want to extract
- The selector will be auto-filled and extraction will start automatically
- Click "Save to Composition" to add it to your collection
Method 2: Manual CSS Selector
- Navigate to a website (e.g., http://localhost:3003 for the E-commerce dashboard)
- Click "📦 Extract" button to open the extraction panel
- Enter a CSS selector manually:
  - .metrics - Extract the metrics section from e-commerce app
  - .product-table - Extract the product table
  - .stat-card - Extract individual stat cards from AWS app
  - .service-list - Extract the service list
- Click "Extract Component" to preview the extracted HTML
- Click "Save to Composition" to add it to your collection
- Click "🎨 Compose" button to open the composition view
- View all saved components from different websites
- See the live preview of all components combined
- Remove components you don't want
- Click "Back to Browser" to return to browsing
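The save and remove steps above can be pictured as a small in-memory store that keeps only a URL and a selector per component. This is a sketch under that assumption; the ComponentStore class and its method names are hypothetical and do not come from the project source.

```javascript
// In-memory store for saved components. Each entry holds just the page URL
// and the CSS selector. Class and method names are illustrative only.
class ComponentStore {
  constructor() {
    this.components = new Map();
    this.nextId = 1;
  }

  // Validate the inputs and register a new component.
  save(url, selector) {
    const normalizedUrl = new URL(url).href; // throws on a malformed URL
    if (typeof selector !== 'string' || selector.trim() === '') {
      throw new Error('A non-empty CSS selector is required');
    }
    const component = { id: this.nextId++, url: normalizedUrl, selector: selector.trim() };
    this.components.set(component.id, component);
    return component;
  }

  // Remove a component by id; returns true if something was removed.
  remove(id) {
    return this.components.delete(id);
  }

  // List saved components in insertion order.
  list() {
    return [...this.components.values()];
  }
}

const store = new ComponentStore();
const metrics = store.save('http://localhost:3003', '.metrics');
store.save('http://localhost:3001', '.stat-card');
store.remove(metrics.id);
console.log(store.list().map((c) => `${c.url} ${c.selector}`));
```

Because nothing is written to disk in this sketch, the list is lost when the process exits.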
- Click the "System Explorer" button (folder icon) in the toolbar
- Select the "Commands" tab for natural language control
- Type a command like:
  - create file test.txt
  - list files in C:\Users
  - copy "document.txt" to "backup.txt"
  - run command "dir"
- Click "Execute" or press Enter
- View results with detailed output and status
Examples:
create file "notes.txt" with content "Hello World"
delete file old_data.txt
move "file1.txt" to "C:\Backup\file1.txt"
show files in Downloads
run command "ipconfig"
See SYSTEM_CONTROL.md for complete documentation and examples.
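To give a feel for how commands like the examples above could be turned into structured actions, here is a minimal parsing sketch. It is not the project's Smart Command Parser: the function names and the action names (createFile, moveFile, and so on) are assumptions for illustration, and it only covers the phrasings shown in this section.

```javascript
// Split a command into tokens, keeping double-quoted phrases together.
function tokenize(command) {
  const tokens = [];
  const pattern = /"([^"]*)"|(\S+)/g;
  let match;
  while ((match = pattern.exec(command)) !== null) {
    tokens.push(match[1] !== undefined ? match[1] : match[2]);
  }
  return tokens;
}

// Map a command to a hypothetical action object, or null if unrecognized.
function parseFileCommand(command) {
  const tokens = tokenize(command);
  const verb = (tokens[0] || '').toLowerCase();
  const noun = (tokens[1] || '').toLowerCase();

  if (verb === 'create' && noun === 'file' && tokens[2]) {
    const hasContent =
      tokens[3]?.toLowerCase() === 'with' && tokens[4]?.toLowerCase() === 'content';
    return { action: 'createFile', path: tokens[2], content: hasContent ? tokens[5] ?? '' : '' };
  }
  if (verb === 'delete' && noun === 'file' && tokens[2]) {
    return { action: 'deleteFile', path: tokens[2] };
  }
  if ((verb === 'move' || verb === 'copy') && tokens[1] &&
      tokens[2]?.toLowerCase() === 'to' && tokens[3]) {
    return { action: verb === 'move' ? 'moveFile' : 'copyFile', from: tokens[1], to: tokens[3] };
  }
  if ((verb === 'list' || verb === 'show') && noun === 'files' &&
      tokens[2]?.toLowerCase() === 'in' && tokens[3]) {
    return { action: 'listFiles', path: tokens[3] };
  }
  if (verb === 'run' && noun === 'command' && tokens[2]) {
    return { action: 'runCommand', command: tokens[2] };
  }
  return null;
}

console.log(parseFileCommand('create file "notes.txt" with content "Hello World"'));
console.log(parseFileCommand('move "file1.txt" to "C:\\Backup\\file1.txt"'));
console.log(parseFileCommand('make coffee')); // null
```

Returning a plain object rather than executing anything directly leaves room for a confirmation or safety check before a file is deleted or a command is run.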
Prerequisites:
- Start the speech recognition server (see LOCAL_SPEECH_README.md)
- Ensure your microphone is connected and permissions are granted
Test Procedure:
- Click the 🎤 microphone icon in the navigation bar
- Wait for "Recording..." status with red bars
- Speak clearly: "Show system resources"
- Wait for processing - the command will be transcribed
- System panel opens automatically with detailed information
Available Voice Commands:
"Show system resources" → Opens system panel with all info
"Show CPU" → Opens system panel (focused on CPU)
"Show disk" → Opens system panel (disk information)
"Show memory" → Opens system panel (RAM usage)
"Show services" → Opens system panel (Windows services)
"Show processes" → Opens system panel (running processes)
Test File Creation via Voice:
- Click the 🎤 microphone icon
- Say: "Create file hello.py on desktop"
- Check your Desktop for the newly created hello.py file
- File will contain Python template code with timestamp
Other supported filenames:
- "Create file test.txt on desktop" - Creates text file
- "Create file script.js on desktop" - Creates JavaScript file
- "Make file notes.txt on desktop" - Alternative phrasing
What You'll See:
- System Resources: CPU usage per core, total/used/free memory, disk space per drive
- Windows Services: List of running services with status
- Running Processes: Top processes sorted by memory usage
- Real-time Metrics: Visual bars showing CPU and memory percentages
- Desktop File: Python file with print statement and timestamp
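The template content of a voice-created file is described above only as a print statement with a timestamp. One way such a template could be chosen by file extension is sketched below; the buildTemplate function and the exact template text are assumptions, not the project's actual output.

```javascript
const path = require('path');

// Pick starter content based on the file extension. The template text here
// is a hypothetical example, not what the application actually writes.
function buildTemplate(filename, now = new Date()) {
  const stamp = now.toISOString();
  switch (path.extname(filename).toLowerCase()) {
    case '.py':
      return `# Created ${stamp}\nprint("Hello from ${filename}")\n`;
    case '.js':
      return `// Created ${stamp}\nconsole.log("Hello from ${filename}");\n`;
    default:
      // Plain text and unknown extensions get only the timestamp line.
      return `Created ${stamp}\n`;
  }
}

console.log(buildTemplate('hello.py'));
```

Passing the date as a parameter keeps the function easy to test with a fixed timestamp.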
E-commerce App (localhost:3003):
- .metrics - All metric boxes
- .metric-box - Individual metric
- .product-table - Product table
- .orders-chart - Weekly orders chart
AWS Cost App (localhost:3001):
- .stats-grid - All statistics
- .stat-card - Individual stat card
- .chart-section - Cost trend chart
- .service-list - Service cost list
Azure Cost App (localhost:3002):
- .azure-costs - Main container
- .cost-card - Individual cost cards
- .resource-table - Resource usage table
For development with auto-restart:
npm run dev
To build the Chromium browser app for distribution:
Quick Build (Recommended):
npm run build
This will:
- Clean any previous builds
- Package the application using electron-packager
- Create a distributable version in dist/chromium-browser-win32-x64/
Manual Build Commands:
# Clean previous builds
npm run clean
# Package for Windows
npm run package
# Package for all platforms (Windows, macOS, Linux)
npm run package:all
After building, you'll find:
- Executable: dist/chromium-browser-win32-x64/chromium-browser.exe
- Complete App: The entire dist/chromium-browser-win32-x64/ folder contains everything needed to run the app
Running the Built App:
cd dist/chromium-browser-win32-x64
./chromium-browser.exe
Or simply double-click chromium-browser.exe in the file explorer.
| Script | Description |
|---|---|
| npm run build | Complete build process with cleaning |
| npm run package | Package for Windows x64 |
| npm run package:all | Package for all platforms |
| npm run clean | Remove all build files |
The built application is portable and self-contained. You can:
- Zip the folder: Compress dist/chromium-browser-win32-x64/ and share
- Copy to other machines: The entire folder works on any Windows x64 machine
- Create installer: Use tools like NSIS or Inno Setup for proper installers
- Node.js: v16 or higher
- npm: v7 or higher
- Windows: x64 architecture (for Windows builds)
- Disk Space: ~200MB for build output
Common Issues:
- "Cannot create symbolic link": Run terminal as Administrator or disable code signing
- ENOENT errors: Ensure all dependencies are installed with npm install
- Permission errors: Check write permissions in the project directory
- Electron: Cross-platform desktop app framework
- Chromium: Full-featured browser engine (via Electron webview)
- Express.js: Web server for demo applications
- Cheerio: Server-side HTML parsing for component extraction
- Axios: HTTP client for fetching web pages
- Flask: Python web framework for speech recognition server
- OpenAI Whisper: Offline speech recognition model
- MediaRecorder API: Browser API for audio recording
- HTML/CSS/JavaScript: Frontend technologies
- Browser: Each tab creates an isolated <webview> element that loads websites independently
- Extraction: When you request extraction, the browser fetches the page HTML via Axios and parses it with Cheerio on the backend to validate the selector
- Selector Matching: Cheerio finds elements matching your CSS selector and extracts metadata
- Storage: Component metadata (URL + selector) is stored in memory
- Composition: Each saved component creates a live webview that:
  - Loads the full website in the background
  - Hides all elements except your selected component using CSS injection
  - Maintains full functionality (API calls, timers, WebSockets, etc.)
  - Updates dynamically just like the original website
  - Auto-adjusts height based on content
This means your composed dashboard shows live, real-time data from multiple sources!
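The CSS injection step described above could look roughly like the following. This is one possible approach, not necessarily the stylesheet the project injects: it hides everything with visibility rather than display so that the selected element can be made visible again even though its ancestors are hidden. The buildIsolationCss function name is hypothetical.

```javascript
// Build a stylesheet that hides every element in the page body except the
// elements matching `selector` and their descendants.
// Assumption: visibility-based hiding is acceptable. Hidden elements still
// take up layout space, so a real implementation may need extra positioning.
function buildIsolationCss(selector) {
  const trimmed = String(selector).trim();
  // Reject empty input and braces, which could break out of the rule.
  if (trimmed === '' || /[{}]/.test(trimmed)) {
    throw new Error(`Invalid selector: "${selector}"`);
  }
  return [
    'body * { visibility: hidden !important; }',
    // :is() keeps selector lists such as ".a, .b" working as one unit.
    `:is(${trimmed}), :is(${trimmed}) * { visibility: visible !important; }`,
  ].join('\n');
}

console.log(buildIsolationCss('.metrics'));
// In the Electron renderer, a string like this could be applied with the
// <webview> tag's insertCSS(css) method once the guest page has loaded.
```

The second rule comes last on purpose: when both rules have equal specificity (for example with a bare div selector), the later one wins.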
- Dashboard Composition: Combine metrics from multiple monitoring tools
- Report Generation: Extract data visualizations from different sources
- Component Library: Build a library of reusable UI components
- Web Scraping: Extract specific data from websites for analysis
- Learning: Study how different websites structure their HTML
- CORS: Some websites may block extraction due to CORS policies
- Dynamic Content: JavaScript-rendered content may not be captured
- Authentication: Cannot extract from pages requiring login (without additional setup)
- Styling: Extracted components may look different if they depend on external stylesheets
- Save compositions to disk
- Export compositions as HTML
- Screenshot capture for extraction
- JavaScript execution for dynamic content
- Component editing and styling
- Bookmark management
- History tracking
- Download manager
MIT License