Vision Karts

The Future of Computer-Vision Retail, Today

Welcome to Vision Karts - where cutting-edge computer vision meets seamless shopping experiences. This isn't just another checkout system. This is a complete reimagining of how retail works for camera-based, automated checkout in physical stores.

"To truly understand a world, you must first see every piece of it clearly." — Vision (in spirit)

Target: Python 3.8+ on Linux, macOS, or Windows, with optional CUDA GPU
Hardware: Standard RGB cameras (USB/IP) watching entrances, aisles, and exits in brick-and-mortar stores.

What We've Built

Vision Karts is an automated checkout system for retail stores that eliminates queues, reduces wait times, and transforms the shopping experience. Using state-of-the-art YOLO11 object detection and advanced face recognition, we've created a system that:

Detects products instantly - No scanning, no waiting, no hassle
Tracks customers seamlessly - Know who's shopping, personalize the experience
Calculates bills automatically - Accurate pricing, zero human error
Runs at blazing speeds - AI acceleration via TensorRT/ONNX Runtime for real-time performance

The Technology Stack

We didn't settle for "good enough." We went with the best tools for real-world, camera-based retail analytics and checkout:

YOLO11 - Modern object detection model (Ultralytics) for product recognition
Face Recognition - Modern dlib-based face recognition for customer tracking
AI Acceleration - TensorRT and ONNX Runtime optimization for sub-millisecond inference
Modular Architecture - Clean, professional codebase that scales

Quick Start

# Install dependencies
pip install -r requirements.txt

# Run on an image
python main.py path/to/image.jpg --prices src/prices.csv

# With custom model and acceleration
python main.py image.jpg --model models/custom_yolo11.pt --device cuda

Architecture Overview

vision_karts/
├── core/                    # Core functionality
│   ├── product_detector.py         # YOLO11-based product detection
│   ├── billing_engine.py           # Bill calculation engine
│   ├── customer_tracker.py         # Face recognition & tracking
│   ├── video_processor.py          # Real-time video processing
│   ├── camera_manager.py          # Multi-camera management
│   ├── virtual_cart.py            # Virtual cart management
│   ├── session_manager.py         # Session lifecycle management
│   ├── qr_auth.py                 # QR code authentication
│   ├── customer_db.py             # Customer database
│   ├── event_tracker.py           # Product event tracking
│   ├── exit_processor.py          # Exit processing
│   ├── receipt_generator.py       # Receipt generation
│   ├── payment_processor.py       # Payment processing
│   ├── store_layout.py            # Store layout management
│   └── sensor_fusion.py           # Sensor fusion integration
├── accelerators/          # AI acceleration modules
│   └── accelerator_manager.py     # TensorRT/ONNX Runtime optimization
├── analytics/             # Analytics and reporting
│   ├── metrics.py                  # Metrics collection
│   ├── dashboard.py                # Analytics dashboard
│   └── reports.py                 # Report generation
├── utils/                 # Utility functions
│   ├── image_utils.py             # Image processing utilities
│   └── config_loader.py           # Configuration loading
└── data/                   # Data handling modules
    ├── database.py                # Database abstraction
    └── models.py                  # Database models

Features

🚀 Lightning-Fast Detection

YOLO11 with optimized inference
Batch processing support
Real-time performance with GPU acceleration

💰 Intelligent Billing

Automatic price lookup
Multi-item support
Confidence-based filtering

👤 Customer Tracking

Face recognition for customer identification
Multi-customer support
Privacy-conscious design

⚡ AI Acceleration

TensorRT optimization (NVIDIA GPUs)
ONNX Runtime support (CPU/GPU)
Automatic backend selection

Performance

With AI acceleration enabled:

Inference time: < 10ms per image (GPU)
Accuracy: 98%+ on product detection
Throughput: 100+ images/second (batch processing)

Configuration

Customize everything via configs/default_config.yaml:

# Model Configuration
model:
  type: "yolo11"
  confidence_threshold: 0.5

# Acceleration Configuration
acceleration:
  enabled: true
  backend: "auto"  # tensorrt, onnx, cuda, cpu
  device: "auto"

# Video Processing Configuration
video_processing:
  enabled: true
  fps: 30
  cameras:
    - id: 0
      name: "Entrance"
      position: [0, 0]

# Virtual Cart Configuration
virtual_cart:
  enabled: true
  timeout_seconds: 300

# QR Authentication Configuration
qr_auth:
  enabled: true
  qr_secret: "your-secret-key"
  entry_gate_enabled: true

# Store Layout Configuration
store_layout:
  map_file: "configs/store_map.yaml"

# Analytics Configuration
analytics:
  enabled: true
  dashboard_port: 8080
  metrics_retention_days: 30

Installation

Basic Installation

pip install -r requirements.txt

With GPU Acceleration

# For NVIDIA GPUs with TensorRT
pip install -r requirements.txt
pip install onnxruntime-gpu

# Or install with GPU extras
pip install -e ".[gpu]"

Development Setup

git clone <repository-url>
cd Vision-Karts
pip install -e ".[dev]"

Usage Examples

Basic Product Detection

from vision_karts.core import ProductDetector
from vision_karts.utils import load_image

detector = ProductDetector(confidence_threshold=0.5)
image = load_image("shopping_cart.jpg")
detections, annotated = detector.detect(image, return_image=True)

Complete Checkout Flow

from vision_karts.core import ProductDetector, BillingEngine
from vision_karts.utils import load_image

# Initialize components
detector = ProductDetector(use_acceleration=True)
billing = BillingEngine("src/prices.csv")

# Process image
image = load_image("cart.jpg")
detections, _ = detector.detect(image)

# Generate bill
bill = billing.generate_bill(detections)
print(billing.format_bill(bill))

Customer Tracking

from vision_karts.core import CustomerTracker
import cv2

tracker = CustomerTracker(known_faces_dir="data/customers/")
image = cv2.imread("store_camera.jpg")
customers = tracker.track_customers(image)

for customer in customers:
    if customer['customer_id']:
        print(f"Customer {customer['customer_id']} detected!")

Complete Automated Checkout Flow

from vision_karts.core import (
    CameraManager, SessionManager, VirtualCartManager,
    EventTracker, ExitProcessor, BillingEngine
)

# Initialize components
billing = BillingEngine("src/prices.csv")
session_mgr = SessionManager()
cart_mgr = VirtualCartManager(price_calculator=billing.calculate_price)
event_tracker = EventTracker()
exit_processor = ExitProcessor()

# Setup cameras
cameras = [{'id': 0, 'name': 'Entrance'}, {'id': 1, 'name': 'Aisle 1'}]
camera_mgr = CameraManager(cameras)

# Process entry
session = session_mgr.create_session("customer_123")
cart = cart_mgr.create_cart("customer_123", session.session_id)

# Process frames and update cart
def process_frame(results, camera_id, timestamp):
    events = event_tracker.process_detections(
        "customer_123", results['detections'], timestamp
    )
    cart_mgr.update_cart_from_detections(
        "customer_123", results['detections'], 'pick'
    )

# Start processing
camera_mgr.start_all()

# On exit
cart_data = cart.to_dict()
transaction = exit_processor.process_exit(
    session.session_id, "customer_123", cart_data
)

AI Acceleration

Vision Karts supports multiple acceleration backends:

TensorRT (NVIDIA GPUs) - Highest performance, requires NVIDIA GPU
ONNX Runtime (CPU/GPU) - Cross-platform, good performance
CUDA (PyTorch) - Default GPU acceleration
CPU - Fallback for systems without GPU

Enable acceleration:

detector = ProductDetector(
    use_acceleration=True,
    device='cuda'  # or '0', '1' for specific GPU
)

Dataset Preparation

The system works with standard YOLO format datasets. For training custom models:

Prepare images with bounding box annotations
Convert to YOLO format (class_id x_center y_center width height)
Train using Ultralytics YOLO11:

yolo train data=dataset.yaml model=yolo11n.pt epochs=100

Contributing

We welcome contributions! This is cutting-edge technology, and we're always looking to push boundaries.

See CONTRIBUTING.md for detailed guidelines on how to report issues, propose improvements, and open pull requests.

Governance and Community

Project guidelines: CODE_OF_CONDUCT.md
How to contribute: CONTRIBUTING.md

License

This project is licensed under the MIT License. See LICENSE for details.

Advanced Features

Vision Karts includes a comprehensive suite of advanced features for complete automated checkout systems:

🎥 Real-Time Video Processing

Multi-camera support for store-wide monitoring
Frame-by-frame processing pipeline
Threaded video capture and processing
Real-time FPS monitoring and optimization
Configurable frame buffering

from vision_karts.core import VideoProcessor, FrameProcessor
from vision_karts.core import ProductDetector

detector = ProductDetector()
processor = FrameProcessor(detector)

with VideoProcessor(camera_id=0, processing_callback=processor.process_frame) as vp:
    # Process video stream
    frame, timestamp = vp.get_frame()

🛒 Virtual Cart Management

Per-customer virtual shopping carts
Real-time cart updates on product detection
Automatic quantity tracking
Cart persistence across sessions
Multi-customer cart isolation

from vision_karts.core import VirtualCartManager, BillingEngine

billing = BillingEngine("prices.csv")
cart_manager = VirtualCartManager(price_calculator=billing.calculate_price)

# Create cart for customer
cart = cart_manager.create_cart("customer_123", "session_456")

# Update cart from detections
cart_manager.update_cart_from_detections("customer_123", detections, event_type='pick')

🔐 QR Code Authentication

QR code generation for customer accounts
Secure token-based authentication
Entry gate control and validation
Customer registration and profile management

from vision_karts.core import QRAuth, EntryGate

qr_auth = QRAuth(secret_key="your-secret-key")
gate = EntryGate(qr_auth)

# Generate QR code for customer
qr_image, token = qr_auth.generate_qr_code("customer_123")

# Scan and validate at entry
customer_data = gate.scan_and_validate(camera_frame)

📊 Session Management

Complete customer session lifecycle tracking
Entry-to-exit session monitoring
Session timeout and cleanup
Multi-session support

from vision_karts.core import SessionManager

session_mgr = SessionManager()

# Create session on entry
session = session_mgr.create_session("customer_123", entry_camera="camera_0")

# Complete session on exit
completed = session_mgr.complete_session(session.session_id, exit_camera="camera_3")

📦 Product Event Tracking

Pick and return event detection
Temporal analysis of product interactions
Event history and validation
Confidence-based event filtering

from vision_karts.core import EventTracker

event_tracker = EventTracker()

# Process detections and generate events
events = event_tracker.process_detections(
    customer_id="customer_123",
    detections=product_detections,
    timestamp=time.time()
)

# Get recent events
recent_picks = event_tracker.get_recent_events(
    customer_id="customer_123",
    event_type="pick"
)

🏪 Store Layout & Zone Mapping

Configurable store layout system
Zone definitions (entrance, aisles, checkout, exit)
Shelf-level product tracking
Camera position mapping
Spatial relationship management

from vision_karts.core import StoreLayout

layout = StoreLayout("configs/store_map.yaml")

# Find zone for customer position
zone = layout.find_zone_for_point(x=10.5, y=5.2)

# Get shelves in zone
shelves = layout.get_shelves_in_zone("Aisle 1")

📈 Analytics & Reporting

Real-time metrics collection
Revenue and transaction analytics
Product popularity tracking
Customer behavior insights
Automated report generation

from vision_karts.analytics import MetricsCollector, AnalyticsDashboard

metrics = MetricsCollector()
dashboard = AnalyticsDashboard(metrics)

# Record transaction
metrics.record_transaction("txn_123", "customer_123", 45.99, 5)

# Get dashboard data
dashboard_data = dashboard.get_dashboard_data()

# Generate daily report
from vision_karts.analytics import ReportGenerator
reporter = ReportGenerator(metrics)
daily_report = reporter.generate_daily_report()

🧾 Automated Receipt Generation

PDF, text, and JSON receipt formats
Automatic receipt generation on exit
Email and SMS delivery support (configurable)
Transaction history tracking

from vision_karts.core import ReceiptGenerator

receipt_gen = ReceiptGenerator()

receipt = receipt_gen.generate_receipt(
    customer_id="customer_123",
    session_id="session_456",
    cart_data=cart.to_dict(),
    format="pdf"
)

🔌 Sensor Fusion Integration

Weight sensor integration for validation
Multi-sensor data fusion
Hardware sensor API support
Simulated sensors for testing

from vision_karts.core import SensorFusion, WeightSensor

sensor = WeightSensor("shelf_a1", simulated=True)
fusion = SensorFusion([sensor])

# Fuse sensor data with detections
validated = fusion.fuse_with_detections("shelf_a1", detections)

💾 Database Persistence

SQLite database (default)
PostgreSQL and MySQL support
Transaction history storage
Customer profile management
Analytics event logging

from vision_karts.data import Database

db = Database("data/vision_karts.db")

# Execute queries
customers = db.execute_query("SELECT * FROM customers WHERE customer_id = ?", ("customer_123",))

Complete System Architecture

┌─────────────────────────────────────────────────────────┐
│              Entry Gate (QR Scanner)                     │
└────────────────────┬────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────┐
│         Session Manager (Creates Session)                │
└────────────────────┬────────────────────────────────────┘
                     │
        ┌────────────┴────────────┐
        ▼                         ▼
┌──────────────┐        ┌──────────────────┐
│ Virtual Cart │        │ Camera Manager   │
│  (Per Cust)  │        │ (Multi-Camera)   │
└──────┬───────┘        └────────┬─────────┘
       │                         │
       │              ┌──────────┴──────────┐
       │              ▼                     ▼
       │      ┌──────────────┐    ┌──────────────┐
       │      │ Video        │    │ Product     │
       │      │ Processor    │───▶│ Detector    │
       │      └──────────────┘    └──────┬──────┘
       │                                  │
       │              ┌──────────────────┘
       │              ▼
       │      ┌──────────────┐
       │      │ Event        │
       │      │ Tracker      │
       │      │ (Pick/Return)│
       │      └──────┬───────┘
       │             │
       └─────────────┘
                     │
                     ▼
            ┌────────────────┐
            │ Virtual Cart   │
            │ Update         │
            └────────┬───────┘
                     │
                     ▼
            ┌────────────────┐
            │ Exit Gate      │
            │ (Finalize)     │
            └────────┬───────┘
                     │
        ┌────────────┴────────────┐
        ▼                         ▼
┌──────────────┐        ┌──────────────────┐
│ Payment      │        │ Receipt          │
│ Processor    │        │ Generator        │
└──────────────┘        └──────────────────┘

Roadmap

Support

For issues, questions, or contributions, please open an issue on GitHub.

Built with precision. Optimized for performance. Designed for the future.

Vision Karts - Where shopping meets innovation.

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
configs		configs
docs		docs
examples		examples
image-dataset-preparation-tools		image-dataset-preparation-tools
product-image-dataset		product-image-dataset
src		src
trained-model		trained-model
vision_karts		vision_karts
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
main.py		main.py
requirements.txt		requirements.txt
setup.py		setup.py

License

KarthikSriramGit/Vision-Karts

Folders and files

Latest commit

History

Repository files navigation

Vision Karts

The Future of Computer-Vision Retail, Today

What We've Built

The Technology Stack

Quick Start

Architecture Overview

Features

🚀 Lightning-Fast Detection

💰 Intelligent Billing

👤 Customer Tracking

⚡ AI Acceleration

Performance

Configuration

Installation

Basic Installation

With GPU Acceleration

Development Setup

Usage Examples

Basic Product Detection

Complete Checkout Flow

Customer Tracking

Complete Automated Checkout Flow

AI Acceleration

Dataset Preparation

Contributing

Governance and Community

License

Advanced Features

🎥 Real-Time Video Processing

🛒 Virtual Cart Management

🔐 QR Code Authentication

📊 Session Management

📦 Product Event Tracking

🏪 Store Layout & Zone Mapping

📈 Analytics & Reporting

🧾 Automated Receipt Generation

🔌 Sensor Fusion Integration

💾 Database Persistence

Complete System Architecture

Roadmap

Support

About

Topics

Resources

License

Code of conduct

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages