fix: include all failing dimension values in anomaly alert description by joostboon · Pull Request #1032 · elementary-data/dbt-data-reliability

joostboon · 2026-06-28T11:14:34Z

Summary

Anomaly alerts for dimension-based tests (e.g. column_anomalies with dimensions) previously showed only a single arbitrary dimension value in the result message
Now the alert lists all failing dimension values with their metric value and average, truncated to 5 with an "and N more" suffix when there are many failures
Example new format: In column INCIDENTS_OPENED, 2 anomalous max values for dimension account_name: Joejuice (214.0, avg 21.0), Acme (15.0, avg 6.3).
Non-dimension tests are unaffected (fall back to existing single-row description)

Test plan

Added assertions to test_column_anomalies_group_by to verify the description format for 1 and 2 failing dimension values
Verified locally against a real column_anomalies test with dimensions -- alert now shows all failing dimension values

before and after screenshots:

Previously, dimension-based anomaly tests only showed a single (arbitrary) dimension value in the alert description. Now the description lists all failing dimension values with their metric and average values, truncated to 5 with an "and N more" suffix when there are many failures. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…by tests Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

github-actions · 2026-06-28T11:14:43Z

👋 @joostboon
Thank you for raising your pull request.
Please make sure to add tests and document all user-facing changes.
You can do this by editing the docs files in the elementary repository.

coderabbitai · 2026-06-28T11:14:48Z

📝 Walkthrough

Walkthrough

The SQL macro get_anomaly_test_result_row is updated to collect anomalous rows and render a dimension-aware test_results_description listing up to 5 anomalous dimension values with metric details. Integration tests add assertions verifying the new description format for grouped anomaly scenarios.

Changes

Dimension-aware anomaly description

Layer / File(s)	Summary
Anomalous row collection and description rendering `macros/edr/data_monitoring/anomaly_detection/store_anomaly_test_results.sql`	Adds preprocessing to filter `anomaly_scores_rows` into `filtered_anomaly_scores_rows` and `anomalous_rows`, increments `failures.data` per anomalous row, and replaces the flat `anomaly_description` fallback with a conditional block that, when a dimension exists and anomalous rows are present, formats a description with up to 5 dimension entries showing `dimension_value`, rounded `metric_value`, and rounded `training_avg`, plus an "and N more" suffix.
Integration test assertions `integration_tests/tests/test_column_anomalies.py`	Adds `assertIn` checks on `test_results_description` in `test_column_anomalies_group_by` for both sensitivity scenarios, verifying anomalous count ("1 anomalous"/"2 anomalous"), the "for dimension dimension" phrase, and expected dimension values ("dim1", "dim2").

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

A rabbit checks each dimension with care,
"dim1 and dim2—anomalies found there!"
The description now lists them, up to five,
With metric and average, keeping count alive.
🐇 Hop along, the tests all pass!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly matches the main change: expanding anomaly alert descriptions to include all failing dimension values.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/dimension-anomaly-alert-description

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

integration_tests/tests/test_column_anomalies.py (1)
320-323: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add one case that exercises the max_shown / "and N more" branch.

These assertions only cover 1- and 2-dimension failures, but the new formatter has separate behavior once there are more than 5 anomalous dimensions. A >5 case would keep the truncation and suffix logic from regressing silently.

Also applies to: 343-347
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration_tests/tests/test_column_anomalies.py` around lines 320 - 323, The
current assertions in the anomaly description tests only cover the
small-dimension cases, so add a new test case in test_column_anomalies that
drives the formatter past the max_shown threshold and verifies the “and N more”
suffix. Use the existing description checks around
test_result["test_results_description"] to assert the truncated output and the
extra-count branch, ensuring the formatter logic in the test results description
path does not regress.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@macros/edr/data_monitoring/anomaly_detection/store_anomaly_test_results.sql`:
- Around line 79-107: The summary in store_anomaly_test_results.sql is counting
and listing duplicate dimension_value entries from anomalous_rows, which can
overstate failures for a single dimension. Update the test_results_description
construction to deduplicate anomalous_rows by dimension_value before generating
dim_parts and total, keeping the latest or highest-priority row per value. Then
apply the existing max_shown cap to the unique list so the reported count and
examples reflect distinct failing dimensions.

---

Nitpick comments:
In `@integration_tests/tests/test_column_anomalies.py`:
- Around line 320-323: The current assertions in the anomaly description tests
only cover the small-dimension cases, so add a new test case in
test_column_anomalies that drives the formatter past the max_shown threshold and
verifies the “and N more” suffix. Use the existing description checks around
test_result["test_results_description"] to assert the truncated output and the
extra-count branch, ensuring the formatter logic in the test results description
path does not regress.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 64f4da88-9811-415a-88a8-6e8bd30533e4

📥 Commits

Reviewing files that changed from the base of the PR and between 244f9db and df48e27.

📒 Files selected for processing (2)

integration_tests/tests/test_column_anomalies.py
macros/edr/data_monitoring/anomaly_detection/store_anomaly_test_results.sql

coderabbitai · 2026-06-28T11:18:22Z

+    {% set anomalous_rows = [] %}
    {% for row in anomaly_scores_rows %}
        {% if row.anomaly_score is not none %}
            {% do filtered_anomaly_scores_rows.append(row) %}
            {% if row.is_anomalous %}
                {% set failures.data = failures.data + 1 %}
+                {% do anomalous_rows.append(row) %}
            {% endif %}
        {% endif %}
    {% endfor %}
+    {% set test_results_description %}
+      {% if rows_with_score %}
+        {% set first_scored_row = rows_with_score[0] %}
+        {% set dimension = elementary.insensitive_get_dict_value(first_scored_row, 'dimension') %}
+        {% if dimension and anomalous_rows %}
+          {% set max_shown = 5 %}
+          {% set shown_rows = anomalous_rows[:max_shown] %}
+          {% set dim_parts = [] %}
+          {% for row in shown_rows %}
+            {% set dim_val = elementary.insensitive_get_dict_value(row, 'dimension_value') %}
+            {% set dim_val_str = dim_val if dim_val is not none else 'NULL' %}
+            {% set m_val = elementary.insensitive_get_dict_value(row, 'metric_value') %}
+            {% set t_val = elementary.insensitive_get_dict_value(row, 'training_avg') %}
+            {% set m_str = (m_val | float | round(3)) if m_val is not none else 'N/A' %}
+            {% set t_str = (t_val | float | round(3)) if t_val is not none else 'N/A' %}
+            {% do dim_parts.append(dim_val_str ~ " (" ~ m_str ~ ", avg " ~ t_str ~ ")") %}
+          {% endfor %}
+          {% set total = anomalous_rows | length %}
+          {% if column_name %}In column {{ column_name | upper }}, {% endif %}{{ total }} anomalous {{ metric_name }} value{% if total > 1 %}s{% endif %} for dimension {{ dimension }}: {{ dim_parts | join(", ") }}{% if total > max_shown %}, and {{ total - max_shown }} more{% endif %}.


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Deduplicate dimension_value before building the summary.

anomalous_rows collects every anomalous scored row in the (full_table_name, column_name, metric_name) group. Because those rows are bucket-based, the same dimension can appear multiple times here, so total and dim_parts can repeat a single failing dimension and overstate the “N anomalous … values” count. Build the description from unique dimension_values first (keeping the latest or highest-priority row per value), then calculate total and apply the 5-item cap.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@macros/edr/data_monitoring/anomaly_detection/store_anomaly_test_results.sql` around lines 79 - 107, The summary in store_anomaly_test_results.sql is counting and listing duplicate dimension_value entries from anomalous_rows, which can overstate failures for a single dimension. Update the test_results_description construction to deduplicate anomalous_rows by dimension_value before generating dim_parts and total, keeping the latest or highest-priority row per value. Then apply the existing max_shown cap to the unique list so the reported count and examples reflect distinct failing dimensions.

joostboon and others added 2 commits June 28, 2026 13:22

test: assert dimension summary in test_results_description for group_…

df48e27

…by tests Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai Bot reviewed Jun 28, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: include all failing dimension values in anomaly alert description#1032

fix: include all failing dimension values in anomaly alert description#1032
joostboon wants to merge 2 commits into
masterfrom
fix/dimension-anomaly-alert-description

joostboon commented Jun 28, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 28, 2026

Uh oh!

coderabbitai Bot commented Jun 28, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

joostboon commented Jun 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Uh oh!

github-actions Bot commented Jun 28, 2026

Uh oh!

coderabbitai Bot commented Jun 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

joostboon commented Jun 28, 2026 •

edited

Loading

coderabbitai Bot commented Jun 28, 2026 •

edited

Loading