This document defines the interface requirements for Time Series Database (TSDB) backends in the metrics-processor.
The metrics-processor retrieves time series data from external TSDBs to compute service health metrics and flag states. Any TSDB backend must implement the query execution and response parsing interfaces defined below.
TSDB backends must implement a data fetching function with the following signature pattern:
```rust
pub async fn get_tsdb_data(
    client: &reqwest::Client,
    url: &str,
    targets: &HashMap<String, String>, // alias -> query mapping
    from: Option<DateTime<FixedOffset>>,
    from_raw: Option<String>,
    to: Option<DateTime<FixedOffset>>,
    to_raw: Option<String>,
    max_data_points: u16,
) -> Result<Vec<TsdbData>, CloudMonError>
```

| Parameter | Type | Description |
|---|---|---|
| `client` | `&reqwest::Client` | Shared HTTP client from AppState |
| `url` | `&str` | Base URL of the TSDB instance |
| `targets` | `&HashMap<String, String>` | Map of alias names to query expressions |
| `from` | `Option<DateTime<FixedOffset>>` | Start time (parsed datetime) |
| `from_raw` | `Option<String>` | Start time (raw string, e.g., "now-1h") |
| `to` | `Option<DateTime<FixedOffset>>` | End time (parsed datetime) |
| `to_raw` | `Option<String>` | End time (raw string) |
| `max_data_points` | `u16` | Maximum data points to return |

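
As an illustration of how the parsed and raw time parameters might be combined, the sketch below builds Graphite-style query parameters (`from`, `until`, `maxDataPoints`). The helper name `time_range_params` is hypothetical, plain Unix seconds stand in for `DateTime<FixedOffset>` to keep the example dependency-free, and the rule that a parsed value wins over the raw string is an assumption, not something this document specifies.

```rust
/// Hypothetical helper: builds Graphite-style time-range query parameters.
///
/// Assumption: a parsed timestamp (Unix seconds here, standing in for
/// `DateTime<FixedOffset>`) takes precedence over the raw string.
pub fn time_range_params(
    from: Option<i64>,
    from_raw: Option<&str>,
    to: Option<i64>,
    to_raw: Option<&str>,
    max_data_points: u16,
) -> Vec<(&'static str, String)> {
    let mut params = Vec::new();

    // Prefer the parsed value; otherwise pass the raw expression through unchanged.
    if let Some(value) = from
        .map(|ts| ts.to_string())
        .or_else(|| from_raw.map(str::to_owned))
    {
        params.push(("from", value));
    }
    if let Some(value) = to
        .map(|ts| ts.to_string())
        .or_else(|| to_raw.map(str::to_owned))
    {
        params.push(("until", value));
    }

    // The point limit is always sent.
    params.push(("maxDataPoints", max_data_points.to_string()));
    params
}
```

Returning key/value pairs rather than a formatted string leaves URL encoding to the HTTP client. A backend with different parameter names would only need to change the keys.
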
All TSDB backends must return data in a normalized format compatible with the processor:
```rust
pub struct TsdbData {
    /// Target/metric name (used as lookup key)
    pub target: String,
    /// Array of (value, timestamp) tuples
    pub datapoints: Vec<(Option<f32>, u32)>,
}
```

- Value: `Option<f32>` - The metric value, `None` for null/missing data
- Timestamp: `u32` - Unix timestamp in seconds

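
To show how a consumer might use this normalized format, here is a minimal sketch that looks up the most recent non-null value per target. The function `latest_values` is hypothetical and not part of the processor. `TsdbData` is redefined locally so the example is self-contained.

```rust
use std::collections::HashMap;

/// Local copy of the normalized structure described above.
#[derive(Debug, Clone, PartialEq)]
pub struct TsdbData {
    pub target: String,
    pub datapoints: Vec<(Option<f32>, u32)>,
}

/// Hypothetical helper: returns the most recent non-null value for each
/// target, keyed by target name. Targets with only null data are omitted.
pub fn latest_values(series: &[TsdbData]) -> HashMap<&str, f32> {
    series
        .iter()
        .filter_map(|s| {
            s.datapoints
                .iter()
                // Drop null datapoints, keeping (timestamp, value) pairs.
                .filter_map(|(value, ts)| value.map(|v| (*ts, v)))
                // Pick the pair with the highest timestamp.
                .max_by_key(|(ts, _)| *ts)
                .map(|(_, v)| (s.target.as_str(), v))
        })
        .collect()
}
```

Because `target` is the lookup key, two series with the same target would overwrite each other in the map, which is one reason backends need to preserve alias names exactly.
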
Backends must return CloudMonError for failures:
```rust
pub enum CloudMonError {
    ServiceNotSupported,
    EnvNotSupported,
    ExpressionError,
    GraphiteError, // Rename to generic TsdbError for new backends
}
```

| Scenario | Error Type | Handling |
|---|---|---|
| HTTP client errors | `CloudMonError::GraphiteError` | Log and return error |
| 4xx response codes | `CloudMonError::GraphiteError` | Log response body, return error |
| JSON parse failures | `CloudMonError::GraphiteError` | Return error |
| Connection timeout | `CloudMonError::GraphiteError` | Retry logic in client |

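
The mapping in the table can be sketched with two small helpers. Both `check_status` and `to_tsdb_error` are hypothetical names. `eprintln!` stands in for whatever logging facility the processor uses, and treating every non-2xx status (not only 4xx) as an error is an assumption of this sketch.

```rust
/// Local copy of the error enum so the example is self-contained.
#[derive(Debug, PartialEq)]
pub enum CloudMonError {
    ServiceNotSupported,
    EnvNotSupported,
    ExpressionError,
    GraphiteError,
}

/// Hypothetical helper: logs the response body and returns an error for
/// any non-success status code.
pub fn check_status(status: u16, body: &str) -> Result<(), CloudMonError> {
    if (200..300).contains(&status) {
        return Ok(());
    }
    eprintln!("TSDB request failed with status {}: {}", status, body);
    Err(CloudMonError::GraphiteError)
}

/// Hypothetical helper: logs any displayable error (HTTP client failure,
/// JSON parse failure) and converts it to the backend error variant.
pub fn to_tsdb_error<E: std::fmt::Display>(context: &str, err: E) -> CloudMonError {
    eprintln!("{}: {}", context, err);
    CloudMonError::GraphiteError
}
```

A backend could then write `.map_err(|e| to_tsdb_error("parse failure", e))?` at each fallible step, so all failures surface as the same variant.
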
- Parse JSON response into the standard `TsdbData` structure
- Preserve target names exactly as aliased in the query
- Handle null values by setting `None` in the datapoints
- Maintain timestamp ordering (typically ascending)

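
The parsing requirements above can be sketched for a backend that reports samples as (timestamp, string value) pairs, as Prometheus range queries do. The function `normalize` is hypothetical. Mapping unparseable or non-finite values such as `NaN` to `None` is an assumption of this sketch, and JSON decoding is left out to keep it dependency-free.

```rust
/// Local copy of the normalized structure so the example is self-contained.
#[derive(Debug, Clone, PartialEq)]
pub struct TsdbData {
    pub target: String,
    pub datapoints: Vec<(Option<f32>, u32)>,
}

/// Hypothetical helper: converts already-decoded samples into `TsdbData`.
///
/// `alias` is used verbatim as the target name. Each sample is a
/// (timestamp in seconds, value as string) pair.
pub fn normalize(alias: &str, samples: &[(f64, &str)]) -> TsdbData {
    let mut datapoints: Vec<(Option<f32>, u32)> = samples
        .iter()
        .map(|(ts, raw)| {
            // Assumption: unparseable or non-finite values count as missing.
            let value = raw.parse::<f32>().ok().filter(|v| v.is_finite());
            // Fractional seconds are truncated.
            (value, *ts as u32)
        })
        .collect();

    // Enforce ascending timestamp order regardless of input order.
    datapoints.sort_by_key(|(_, ts)| *ts);

    TsdbData {
        target: alias.to_string(),
        datapoints,
    }
}
```

Sorting inside the normalizer means downstream code can rely on ordering without knowing which backend produced the data.
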
The configuration must include TSDB connection details:
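```yaml
datasource:
  url: 'https://graphite.example.com'
  timeout: 10 # seconds, optional (default: 10)
```

```rust
#[derive(Clone, Debug, Deserialize)]
pub struct Datasource {
    pub url: String,
    #[serde(default = "default_timeout")]
    pub timeout: u16,
}

// Referenced by the serde attribute above; 10 matches the documented default.
fn default_timeout() -> u16 {
    10
}
```

For multi-backend support, extend the configuration: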
```yaml
datasource:
  type: graphite # or prometheus, influxdb
  url: 'https://tsdb.example.com'
  timeout: 10
```

```rust
#[derive(Debug, Deserialize)]
#[serde(rename_all = "lowercase")]
pub enum DatasourceType {
    Graphite,
    Prometheus,
    InfluxDB,
}
```

The TSDB client is accessed via `AppState`:
```rust
pub struct AppState {
    pub config: Config,
    pub req_client: reqwest::Client, // Shared HTTP client
    // ... other fields
}
```

The `get_service_health` function in `common.rs` calls the TSDB:
```rust
let raw_data: Vec<graphite::GraphiteData> = graphite::get_graphite_data(
    &state.req_client,
    state.config.datasource.url.as_str(), // as_str() already yields &str
    &graphite_targets,
    from_datetime,
    from_raw,
    to_datetime,
    to_raw,
    max_data_points,
)
.await?;
```

When implementing a new TSDB backend:
- Implement async data fetching function
- Return data in `TsdbData` format (target + datapoints)
- Handle all HTTP error cases
- Parse TSDB-specific response format
- Support both raw string and parsed datetime parameters
- Implement query aliasing for target name preservation
- Add configuration options if needed
- Update `DatasourceType` enum
- Add integration tests with mocked responses
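
One way to connect the `DatasourceType` step to backend selection is an exhaustive `match`, sketched below. The helpers `query_path` and `query_url` are hypothetical. The endpoint paths are the usual query endpoints for Graphite (`/render`), Prometheus (`/api/v1/query_range`), and InfluxDB 1.x (`/query`), and they should be checked against the target deployment. The enum is redefined locally with a hand-written `FromStr` in place of the serde derive so the example has no dependencies.

```rust
use std::str::FromStr;

/// Local copy of the enum; the real one derives `Deserialize`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DatasourceType {
    Graphite,
    Prometheus,
    InfluxDB,
}

impl FromStr for DatasourceType {
    type Err = String;

    /// Mirrors the lowercase names used in the configuration file.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "graphite" => Ok(Self::Graphite),
            "prometheus" => Ok(Self::Prometheus),
            "influxdb" => Ok(Self::InfluxDB),
            other => Err(format!("unsupported datasource type: {}", other)),
        }
    }
}

/// Hypothetical helper: query endpoint path for each backend.
pub fn query_path(kind: DatasourceType) -> &'static str {
    match kind {
        DatasourceType::Graphite => "/render",
        DatasourceType::Prometheus => "/api/v1/query_range",
        DatasourceType::InfluxDB => "/query", // InfluxDB 1.x
    }
}

/// Hypothetical helper: joins the configured base URL and the backend path.
pub fn query_url(base: &str, kind: DatasourceType) -> String {
    format!("{}{}", base.trim_end_matches('/'), query_path(kind))
}
```

Because the `match` has no wildcard arm, adding a variant to the enum produces a compile error at every dispatch site until the new backend is handled.
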

Related documentation:

- Graphite Implementation - Reference implementation
- Adding Backends - Step-by-step guide