Add baseline Grafana observability and stage timing telemetry #3411

Draft
anth-volk wants to merge 3 commits into master from feat/grafana-observability

@anth-volk
Collaborator

Fixes #3410

Summary

  • add baseline OTLP/Grafana-capable observability in shadow mode for legacy simulation submission
  • instrument the Modal submission client and request lifecycle with coarse logs, traces, and metrics
  • add stage-level timing telemetry for setup, submission, and failure handling
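For reviewers, the stage-timing idea above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `build_event`, `timed_stage`, the `duration_s` field, and the `emit` callback are hypothetical names, and "shadow mode" is assumed here to mean that a telemetry failure never changes the outcome of the request being measured.

```python
import time
from contextlib import contextmanager
from typing import Any, Callable, Dict, Iterator, Optional


def build_event(
    stage: str, status: str, service: str, telemetry: Dict[str, Any]
) -> Dict[str, Any]:
    """Hypothetical stand-in for an event builder; field names are assumptions."""
    return {"stage": stage, "status": status, "service": service, **telemetry}


@contextmanager
def timed_stage(
    stage: str,
    service: str,
    emit: Callable[[Dict[str, Any]], None],
    telemetry: Optional[Dict[str, Any]] = None,
) -> Iterator[None]:
    """Time one stage and emit a single event when it finishes or fails."""
    start = time.perf_counter()
    status = "ok"
    try:
        yield
    except Exception:
        # Record the failure, then let the original exception propagate.
        status = "error"
        raise
    finally:
        event = build_event(stage, status, service, dict(telemetry or {}))
        event["duration_s"] = time.perf_counter() - start
        try:
            emit(event)
        except Exception:
            # Assumed shadow-mode behavior: a broken exporter is ignored
            # so it cannot affect the request being instrumented.
            pass


# Usage: collect events in a list instead of exporting them.
events = []
with timed_stage("job.setup", "policyengine-api", events.append, {"run_id": "r1"}):
    time.sleep(0.01)  # placeholder for the real setup work
```

Emitting from the `finally` branch means setup, submission, and failure handling each produce exactly one event with a duration, whether the stage succeeds or raises.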

Testing

  • uv run ruff check policyengine_api/observability.py policyengine_api/libs/simulation_api_modal.py policyengine_api/services/economy_service.py tests/unit/test_observability.py tests/unit/libs/test_simulation_api_modal.py tests/fixtures/libs/simulation_api_modal.py
  • uv run python -c "from policyengine_api.observability import get_observability, build_lifecycle_event; obs = get_observability('policyengine-api'); print(type(obs).__name__); print(build_lifecycle_event(stage='job.setup', status='ok', service='policyengine-api', telemetry={'run_id':'r1'})['stage'])"

Development

Successfully merging this pull request may close these issues.

Implement baseline Grafana observability and stage timing for legacy simulation submission
