
fix: change message indexes to sort created_at descending #147

Open
smoreinis wants to merge 9 commits into main from stas/messages-index

Conversation


@smoreinis smoreinis commented Feb 9, 2026

Summary

  • Sort created_at descending in message indexes, matching the typical newest-first query pattern
  • Remove redundant standalone task_id_idx from both messages and task_states collections — the compound indexes with task_id as leftmost prefix already cover task_id-only queries
  • Add unique: true to task_agent_compound_idx on task_states to enforce the (task_id, agent_id) uniqueness invariant at the database level
  • Auto drop-and-recreate indexes on startup when options change (e.g. sort direction, uniqueness), with fail-fast if recreate fails after drop
  • Clean up legacy task_id_idx zombie indexes on existing databases during startup
  • Wrap sync find_one in run_in_executor to avoid blocking the event loop in get_by_task_and_agent
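
The run_in_executor change in the last bullet can be sketched as follows. This is a minimal, self-contained illustration, not the repository code: `_find_one_sync` is a hypothetical stand-in for the synchronous pymongo `find_one` call that `get_by_task_and_agent` wraps.

```python
import asyncio
from functools import partial

# Hypothetical stand-in for the repository's synchronous pymongo call.
def _find_one_sync(query):
    return {"task_id": query["task_id"], "agent_id": query["agent_id"], "state": {}}

async def get_by_task_and_agent(task_id, agent_id):
    loop = asyncio.get_running_loop()
    # Offload the blocking driver call to the default thread-pool executor
    # so the event loop stays responsive while MongoDB responds.
    return await loop.run_in_executor(
        None, partial(_find_one_sync, {"task_id": task_id, "agent_id": agent_id})
    )

print(asyncio.run(get_by_task_and_agent("t1", "a1"))["agent_id"])  # → a1
```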

Test plan

  • Verify indexes are recreated on startup with the new sort direction
  • Confirm message list queries return results in expected order
  • Repository unit tests pass (214/214)
  • Confirm index creation logs on startup show 3 indexes for messages, 2 for task_states
  • Verify unique constraint rejects duplicate (task_id, agent_id) inserts
  • Verify legacy task_id_idx is dropped from existing databases on startup

Greptile Summary

This PR hardens MongoDB index lifecycle management by implementing drop-and-recreate for indexes with changed options, cleaning up legacy task_id_idx indexes, flipping created_at sort direction to DESCENDING (matching all callers' default sort), adding a unique constraint to task_agent_compound_idx, and fixing a blocking synchronous find_one call in get_by_task_and_agent.

Confidence Score: 5/5

Safe to merge — previous P1 concerns (silent index mismatch, index-loss-on-recreate-failure) are fully addressed; remaining findings are P2 style suggestions.

All prior blocking concerns around index migration correctness have been resolved. The drop-and-recreate path properly aborts startup on recreate failure, and legacy cleanup is explicit. The two remaining comments are non-blocking style observations.

agentex/src/config/mongodb_indexes.py — the broad OperationFailure catch in legacy cleanup is the only open item worth addressing before the next release.

Important Files Changed

| Filename | Overview |
| --- | --- |
| agentex/src/config/mongodb_indexes.py | Adds drop-and-recreate logic for indexes with changed options and a legacy cleanup block; the OperationFailure catch in legacy cleanup is broader than IndexNotFound alone |
| agentex/src/domain/repositories/task_message_repository.py | Changes created_at sort to DESCENDING in compound indexes and removes the redundant task_id_idx; consistent with all callers defaulting to descending order |
| agentex/src/domain/repositories/task_state_repository.py | Adds unique: True to task_agent_compound_idx, removes the redundant task_id_idx, and fixes get_by_task_and_agent to use run_in_executor so it no longer blocks the event loop |

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[ensure_mongodb_indexes called at startup] --> B[Import repository classes]
    B --> C{For each repo_class}
    C --> D[collection = db[COLLECTION_NAME]]
    D --> E{For each index_spec in INDEXES}
    E --> F[collection.create_index]
    F -->|Success| G[log ✓ next index]
    G --> E
    F -->|OperationFailure: different options| H[drop_index name]
    H -->|drop fails| I[log ✗ continue]
    I --> E
    H -->|drop success| J[collection.create_index recreate]
    J -->|success| K[log ✓ next index]
    K --> E
    J -->|fails| L[log ✗ RAISE — aborts startup]
    F -->|other error| M[log ✗ next index]
    M --> E
    E -->|all indexes done| C
    C -->|all repos done| N[Legacy index cleanup]
    N --> O{For each legacy_indexes entry}
    O --> P[collection.drop_index]
    P -->|success| Q[log ✓ dropped]
    Q --> O
    P -->|OperationFailure| R[pass — already gone]
    R --> O
    P -->|other Exception| S[log ⚠ warning]
    S --> O
    O -->|done| T[log: index creation completed]
```
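The drop-and-recreate flow in the flowchart can be sketched in plain Python. This is a simplified stand-in, not the repository code: `OperationFailure` and the collection object mimic their pymongo counterparts, and only the control flow mirrors the PR (real code would also inspect the error and log each branch).

```python
class OperationFailure(Exception):
    """Stand-in for pymongo.errors.OperationFailure."""

def ensure_index(collection, keys, name, **options):
    """Create an index; if it exists with different options, drop and recreate."""
    try:
        collection.create_index(keys, name=name, **options)
        return "created"
    except OperationFailure:
        # Index exists with different options (e.g. sort direction, uniqueness).
        try:
            collection.drop_index(name)
        except Exception:
            return "drop-failed"  # old index stays in place; skip recreate
        # If recreate fails after a successful drop, let the exception
        # propagate so startup aborts rather than running without the index.
        collection.create_index(keys, name=name, **options)
        return "recreated"

# Fake collection whose first create_index call simulates the
# "already exists with different options" server error.
class FakeCollection:
    def __init__(self):
        self.create_calls = 0
    def create_index(self, keys, name, **options):
        self.create_calls += 1
        if self.create_calls == 1:
            raise OperationFailure("already exists with different options")
    def drop_index(self, name):
        pass

print(ensure_index(FakeCollection(), [("task_id", 1), ("created_at", -1)],
                   "task_id_created_at_idx"))  # → recreated
```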
This is a comment left during a code review.
Path: agentex/src/config/mongodb_indexes.py
Line: 134-135

Comment:
**OperationFailure catch is broader than IndexNotFound**

The comment says "Index doesn't exist — already clean", but `OperationFailure` covers many server-side errors beyond index-not-found (e.g., insufficient privileges, namespace issues). An unrelated `OperationFailure` during legacy cleanup would be silently dropped with `pass`. Scoping to error code 27 (`IndexNotFound`) makes the intent explicit and lets other failures surface:

```suggestion
        except OperationFailure as e:
            if e.code != 27:  # 27 = IndexNotFound — already clean
                logger.warning(
                    f"  ⚠ Unexpected OperationFailure dropping legacy index '{index_name}' "
                    f"from '{collection_name}': {e}"
                )
```


---

This is a comment left during a code review.
Path: agentex/src/domain/repositories/task_state_repository.py
Line: 43-50

Comment:
**`run_in_executor` pattern inconsistent with base class**

`get_by_task_and_agent` now correctly offloads its synchronous `find_one` call to a thread pool, avoiding event-loop blocking. However, the base class `MongoDBCRUDRepository` has many other `async` methods (`get`, `get_by_field`, `find_by_field`, `find_by_field_with_cursor`, `list`, etc.) that call synchronous pymongo operations directly without `run_in_executor`. It's worth tracking a follow-up to apply the same fix across all base-class read methods.



Reviews (5). Last reviewed commit: "Merge branch 'main' into stas/messages-i..."

@smoreinis smoreinis requested a review from a team as a code owner February 9, 2026 20:30
…k states

Drop standalone task_id_idx from both messages and task_states collections
since the compound indexes with task_id as leftmost prefix already cover
task_id-only queries. Add unique: true to task_agent_compound_idx to
enforce the (task_id, agent_id) uniqueness invariant at the database level.
@smoreinis smoreinis requested a review from danielmillerp April 3, 2026 17:17
Comment on lines 22 to 26

```python
"keys": [("task_id", pymongo.ASCENDING), ("agent_id", pymongo.ASCENDING)],
"name": "task_agent_compound_idx",
"description": "Compound index for get_by_task_and_agent queries",
},
{
"keys": [("task_id", pymongo.ASCENDING)],
"name": "task_id_idx",
"description": "Single index for task_id queries",
"unique": True,
"description": "Unique compound index for get_by_task_and_agent queries",
},
```
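
For reference, the resulting task_states index spec presumably reads as follows; this is reconstructed from the diff above, not copied from the merged file:

```python
import pymongo

# Presumed final INDEXES entry after this change: the compound index is
# unique and the standalone task_id_idx is gone.
INDEXES = [
    {
        "keys": [("task_id", pymongo.ASCENDING), ("agent_id", pymongo.ASCENDING)],
        "name": "task_agent_compound_idx",
        "unique": True,
        "description": "Unique compound index for get_by_task_and_agent queries",
    },
]
```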

P1 Unique constraint won't apply to existing deployments (agentex/src/domain/repositories/task_state_repository.py, lines 22-26)

Adding unique: True here will not enforce uniqueness on any environment where the task_agent_compound_idx index already exists as a non-unique index. When ensure_mongodb_indexes calls collection.create_index(keys, name="task_agent_compound_idx", unique=True) and an index with that name already exists without unique, MongoDB raises OperationFailure: "already exists with different options". The catch block in mongodb_indexes.py (lines 87–93) silently logs a warning and moves on — the old index remains in place and the uniqueness constraint is never applied.

The same applies to the sort-direction changes in task_message_repository.py: the pre-existing task_id_created_at_idx (with ASCENDING) won't be replaced with the DESCENDING variant on any running database.

To actually apply these changes, you need a one-time migration step (e.g., in ensure_mongodb_indexes or a separate script) that drops the old indexes before recreating them:

```python
# In ensure_mongodb_indexes or a migration script, before create_index:
try:
    collection.drop_index("task_agent_compound_idx")
except Exception:
    pass  # index might not exist yet
collection.create_index(keys, name="task_agent_compound_idx", unique=True)
```

Or expose the existing drop_all_indexes helper and run it once as part of the deploy process. Until this is addressed, the unique invariant described in the PR summary will only be enforced on fresh databases.


Previously, if an index already existed with different options (e.g.,
non-unique vs unique), the startup code just logged a warning and moved
on — meaning index changes never applied on existing deployments.
Now we auto-recover by dropping the old index and recreating it.
Separate drop and create into distinct try/except blocks so that:
- If drop fails, the old index stays in place and we skip recreate
- If drop succeeds but create fails, we raise to prevent the app
  from starting without the index (e.g., missing unique constraint)
- Wrap sync find_one in run_in_executor to avoid blocking the event loop
- Drop legacy task_id_idx indexes from existing databases on startup
- Add clarifying comment on exception propagation in index recreate path
