Feature: 001-project-documentation
Target Audience: New developers joining the metrics-processor project
Time Estimate: 30 minutes
By the end of this guide, you will:
- ✅ Understand what metrics-processor does and why it exists
- ✅ Have a working local development environment
- ✅ Successfully build and run both binaries (convertor and reporter)
- ✅ Identify the major components and their responsibilities
- ✅ Know where to find key documentation sections
Problem: Cloud monitoring produces many different metric types (latencies, status codes, rates). Visualizing overall service health from these disparate metrics is challenging.
Solution: metrics-processor converts raw time-series metrics into simple semaphore-like health indicators:
- 🟢 Green (0): Service up and running normally
- 🟡 Yellow (1): Service degraded (slow, errors)
- 🔴 Red (2): Service outage
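In code, the three states map naturally onto a small enum. A minimal sketch (the names here are illustrative; the project's real types live in `src/types.rs`):

```rust
/// Illustrative only: a semaphore-style health state and its numeric value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Semaphore {
    Green = 0,  // up and running normally
    Yellow = 1, // degraded (slow, errors)
    Red = 2,    // outage
}

impl Semaphore {
    /// Map a numeric weight back onto a state; anything above 1 is an outage.
    pub fn from_weight(w: u8) -> Self {
        match w {
            0 => Semaphore::Green,
            1 => Semaphore::Yellow,
            _ => Semaphore::Red,
        }
    }
}
```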
Architecture (high-level):
┌─────────────┐ ┌────────────────┐ ┌──────────────┐ ┌─────────────────┐
│ Graphite │────▶│ Convertor │────▶│ Reporter │────▶│ Status Dashboard│
│ (TSDB) │ │ (evaluates) │ │ (notifies) │ │ (displays) │
└─────────────┘ └────────────────┘ └──────────────┘ └─────────────────┘
Raw metrics Flag metrics Health status Semaphore UI
Two Main Components:
- Convertor: Evaluates health from raw metrics, exposes HTTP API
- Reporter: Polls convertor, sends updates to status dashboard
Before starting, ensure you have:
- Rust: Version 1.75 or later (check with `rustc --version`). Install with:
  `curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh`
- Git: For cloning the repository
- Text Editor: VSCode, IntelliJ, or vim with Rust support
- Optional: Docker (for containerized TSDB testing)
System Requirements:
- Linux, macOS, or WSL2 on Windows
- 4GB RAM minimum
- 500MB disk space for dependencies
# Clone the repository
git clone https://github.com/your-org/metrics-processor.git
cd metrics-processor
# Build all components
cargo build
# Expected output: Compiling cloudmon-metrics v0.2.0
# Should complete in 1-3 minutes depending on hardware
Verify: You should see two binaries created:
ls -lh target/debug/cloudmon-metrics-*
# cloudmon-metrics-convertor
# cloudmon-metrics-reporter
metrics-processor/
├── src/ # Rust source code
│ ├── lib.rs # Library root
│ ├── api.rs # HTTP API module
│ ├── api/
│ │ └── v1.rs # API v1 handlers
│ ├── config.rs # Configuration parsing
│ ├── types.rs # Domain data structures
│ ├── graphite.rs # Graphite TSDB integration
│ ├── common.rs # Shared utilities
│ └── bin/
│ ├── convertor.rs # Convertor binary entry point
│ └── reporter.rs # Reporter binary entry point
├── doc/ # Documentation (mdbook)
├── tests/ # Integration tests
├── Cargo.toml # Rust dependencies
├── openapi-schema.yaml # API specification
└── README.md # Project overview
| File | Purpose | When to Edit |
|---|---|---|
| `src/config.rs` | Configuration parsing & validation | Adding new config fields |
| `src/types.rs` | Core data structures (Config, FlagMetric, HealthMetric) | Changing data models |
| `src/api/v1.rs` | HTTP API handlers (`/v1/health`, `/v1/maintenances`) | Adding/modifying endpoints |
| `src/graphite.rs` | Graphite TSDB client | TSDB query changes |
| `openapi-schema.yaml` | API contract | Documenting API changes |
# Run all tests (unit + integration)
cargo test
# Expected output: test result: ok. X passed; 0 failed
What's being tested:
- Unit tests: Configuration parsing, metric evaluation logic
- Integration tests: HTTP endpoint behavior with mocked TSDB
If tests fail: Check the output for specific failures. Common issues:
- Missing dependencies: Run `cargo build` first
- Port conflicts: Ensure ports 3005+ are available
Create `config.yaml`:
datasource:
  url: "http://localhost:8080" # Mock TSDB (won't actually connect yet)
  type: graphite
server:
  address: "127.0.0.1"
  port: 3005
metric_templates:
  api_slow:
    query: "stats.timers.api.$environment.$service.mean"
    op: "gt"
    threshold: 500
environments:
  - name: "local-dev"
flag_metrics:
  - name: "api_slow"
    service: "test_service"
    template:
      name: "api_slow"
    environments:
      - name: "local-dev"
health_metrics:
  test_service:
    service: "test_service"
    component_name: "Test Service"
    category: "demo"
    metrics:
      - "test_service.api_slow"
    expressions:
      - expression: "test_service.api_slow"
        weight: 1
Start the convertor:
cargo run --bin cloudmon-metrics-convertor -- --config config.yaml
# Expected output:
# INFO cloudmon_metrics: Server starting on 127.0.0.1:3005
In another terminal:
# Query health endpoint
curl "http://localhost:3005/v1/health?from=2024-01-01T00:00:00Z&to=2024-01-01T01:00:00Z&service=test_service&environment=local-dev"
# Expected response (empty data since no real TSDB):
# {"name":"test_service","category":"demo","environment":"local-dev","metrics":[]}
Success! You've run the convertor binary and made your first API call.
Flag metrics are binary indicators (raised/lowered) based on raw TSDB queries:
// src/types.rs
pub struct FlagMetric {
pub name: String, // "api_slow"
pub query: String, // TSDB query template
pub comparison: Comparison, // gt, lt, eq
pub threshold: f64, // 500.0
}
Flow:
- Query the TSDB with variable substitution: `$service` → `test_service`
- Compare the result to the threshold: `mean_latency > 500ms`
- Set the flag: `true` (raised) or `false` (lowered)
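The three steps can be sketched as two small functions. This is a hedged illustration only: `render_query` and `flag_raised` are hypothetical helpers, not the crate's actual API (the real logic lives in the convertor and `src/graphite.rs`):

```rust
/// Step 1 (illustrative): substitute template variables into a Graphite query.
fn render_query(template: &str, environment: &str, service: &str) -> String {
    template
        .replace("$environment", environment)
        .replace("$service", service)
}

/// Steps 2-3 (illustrative): compare the fetched value and raise/lower the flag.
fn flag_raised(value: f64, op: &str, threshold: f64) -> bool {
    match op {
        "gt" => value > threshold,
        "lt" => value < threshold,
        _ => false, // "eq" and unknown operators omitted in this sketch
    }
}
```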
Health metrics combine flag metrics using boolean expressions:
// src/types.rs
pub struct HealthMetric {
pub service: String,
pub expressions: Vec<Expression>, // Boolean expressions
}
pub struct Expression {
pub expr: String, // "api_slow || api_error_rate_high"
pub weight: u8, // 0=healthy, 1=degraded, 2=outage
}
Flow:
- Evaluate each expression using flag states
- Take maximum weight of matching expressions
- Return semaphore value (0, 1, or 2)
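The max-weight rule can be shown in a minimal sketch, assuming each expression has already been evaluated to a matched/not-matched boolean (the function below is hypothetical, not the crate's evaluator):

```rust
/// Combine (matched, weight) pairs into a semaphore value:
/// take the maximum weight among matching expressions, or 0 if none match.
fn health_value(results: &[(bool, u8)]) -> u8 {
    results
        .iter()
        .filter(|(matched, _)| *matched)
        .map(|(_, weight)| *weight)
        .max()
        .unwrap_or(0) // no expression matched => healthy (green)
}
```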
- Edit the `metric_templates` section in `config.yaml`
- Add a new template with query, op, and threshold
- Reference it in the `flag_metrics` section
- Restart the convertor: `cargo run --bin cloudmon-metrics-convertor -- --config config.yaml`
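For example, a hypothetical second template flagging an elevated 5xx rate (the metric path and threshold here are made up for illustration):

```yaml
metric_templates:
  api_error_rate_high:
    query: "stats.counters.api.$environment.$service.5xx.rate"
    op: "gt"
    threshold: 10
```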
# Install cargo-watch
cargo install cargo-watch
# Auto-rebuild on file changes
cargo watch -x 'run --bin cloudmon-metrics-convertor -- --config config.yaml'
# Enable debug logging
RUST_LOG=debug cargo run --bin cloudmon-metrics-convertor -- --config config.yaml
# Filter to specific module
RUST_LOG=cloudmon_metrics::api=debug cargo run ...
cargo clippy
# Fix all warnings before committing (Constitution requirement)
Now that you have a working environment, explore these documentation sections:
| Section | What to Learn | Location |
|---|---|---|
| Architecture | System design, data flow | doc/architecture/ |
| API Reference | Endpoint details, authentication | doc/api/ |
| Configuration | All config fields, examples | doc/configuration/ |
| Module Docs | Rust module responsibilities | doc/modules/ |
| Troubleshooting | Common issues, solutions | doc/guides/troubleshooting.md |
Next Steps:
- Read Architecture Overview to understand component interactions
- Review Configuration Schema for full config reference
- Check patterns.json for coding conventions
# Build
cargo build # Debug build
cargo build --release # Production build
# Test
cargo test # All tests
cargo test --test integration # Integration tests only
# Run
cargo run --bin cloudmon-metrics-convertor -- --config config.yaml
cargo run --bin cloudmon-metrics-reporter -- --config config.yaml
# Lint
cargo clippy # Linter
cargo fmt # Auto-format
# Documentation
cargo doc --open # Generate and open rustdoc
mdbook build doc/ && mdbook serve doc/ # Build user documentation
Problem: `cargo build` fails
Cause: Missing system dependencies or outdated Rust version
Solution:
rustup update
cargo clean
cargo build
Problem: Convertor fails to start
Cause: Port 3005 is occupied
Solution:
# Change the port in config.yaml
server:
  port: 3006
# Or kill the existing process
lsof -i :3005
kill <PID>
Problem: Integration tests fail
Cause: Integration tests expect a mock TSDB (mockito setup issue)
Solution: Check that the `mockito` dependency is present in `Cargo.toml`
You've successfully completed onboarding if you can:
- ✅ Build the project without errors
- ✅ Run all tests successfully
- ✅ Start convertor binary and query the API
- ✅ Explain the difference between flag metrics and health metrics
- ✅ Identify which module to edit for different types of changes
Estimated Time: If you completed this guide in ~30 minutes, you're ready to contribute! 🎉
- Code Questions: Check `doc/modules/` for module-specific documentation
- Configuration Issues: See `doc/configuration/schema.md` for the field reference
- Architecture Questions: Read `doc/architecture/overview.md`
- Bugs: Check `doc/guides/troubleshooting.md` first, then file an issue
- Pick a starter issue from the issue tracker (look for "good first issue" label)
- Read the relevant module documentation
- Make your changes following the constitution guidelines (`.specify/memory/constitution.md`)
- Run tests and clippy before committing
- Submit a PR with a clear description
Welcome to the team! 🚀