I believe that with the current tables, we can implement a trigger mechanism. I am thinking of something like this:
Trigger Evaluation Design
Overview
The controller holds an in-memory map of active periodic job timers. Each timer ticks at the job's trigger cadence. When a timer fires, the controller reschedules the job. No persistent scheduling state is needed beyond the existing job event log.
Data Structure
A map of JobId to a timer entry. Each entry holds:
A tokio::time::Interval configured with the trigger's period (or, for the cron case, a stream of fire times, since cron schedules are not fixed-period)
The trigger config (parsed from the descriptor, to avoid re-fetching on each tick)
The job's created_at (needed for interval anchor computation)
Boot Recovery
On startup, the controller runs a single query to find all jobs that:
Are in a terminal state (COMPLETED, ERROR, FATAL)
Have a periodic trigger in their descriptor (interval or cron)
For each, it parses the trigger config, computes the next fire time via Trigger::next_fire_time(now, created_at), and creates an Interval starting at that time with the appropriate period. If the next fire time is already in the past (the controller was down and missed ticks), the interval fires immediately on the first poll.
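To make the anchor computation concrete, here is a rough std-only sketch of the interval case. The function names, the use of Unix seconds, and the `reference` parameter are my assumptions for illustration; the real Trigger::next_fire_time may differ, in particular in which timestamp it measures from.

```rust
/// Hypothetical helper: first instant of the form `anchor + k * period`
/// (k >= 1) that lies strictly after `reference`. All values are Unix seconds.
/// `anchor` plays the role of the job's created_at.
fn next_aligned_fire(reference: u64, anchor: u64, period: u64) -> u64 {
    assert!(period > 0, "period must be non-zero");
    // Whole periods elapsed between the anchor and the reference time.
    // If the reference is before the anchor, treat the elapsed time as zero.
    let elapsed = reference.saturating_sub(anchor);
    let ticks = elapsed / period;
    anchor + (ticks + 1) * period
}

/// Hypothetical helper: how long to wait before the first tick.
/// Zero means the fire time is already in the past, so fire immediately.
fn initial_delay(now: u64, next_fire: u64) -> u64 {
    next_fire.saturating_sub(now)
}
```

With an anchor of 1000 and a 300-second period, a reference time of 1650 gives a next fire at 1900. If the controller only comes back at 2500, the delay is zero and the timer fires on the first poll.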
Steady-State Loop
A single async task drives all timers. On each fire:
Look up the job's current status
If ERROR: mark FATAL first (abandon current retry cycle), then reschedule fresh
If COMPLETED or FATAL: reschedule directly (new SCHEDULED event)
Compute the next fire time and let the interval tick again
The loop uses something like FuturesUnordered or a select!-based approach over the map entries, waking only when the next timer is due. Zero work between fires.
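The per-fire decision above could be sketched as a pure function, which keeps it easy to test separately from the timer plumbing. The enum and function names are hypothetical, and skipping non-terminal jobs is my assumption since the steps above only cover ERROR, COMPLETED and FATAL.

```rust
/// Hypothetical status enum. Completed/Error/Fatal mirror the terminal
/// states named in the design; Scheduled and Running are assumed names
/// for the non-terminal states.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum JobStatus {
    Scheduled,
    Running,
    Completed,
    Error,
    Fatal,
}

/// Hypothetical actions the controller takes when a timer fires.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action {
    /// Abandon the current retry cycle.
    MarkFatal,
    /// Emit a new SCHEDULED event.
    Reschedule,
}

/// Ordered list of actions for a timer fire, given the job's current status.
fn actions_on_fire(status: JobStatus) -> Vec<Action> {
    match status {
        // ERROR: mark FATAL first, then reschedule fresh.
        JobStatus::Error => vec![Action::MarkFatal, Action::Reschedule],
        // COMPLETED or FATAL: reschedule directly.
        JobStatus::Completed | JobStatus::Fatal => vec![Action::Reschedule],
        // Assumption: a job that is not in a terminal state is left alone.
        JobStatus::Scheduled | JobStatus::Running => Vec::new(),
    }
}
```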
Lifecycle Events
Job completes/fails (reaches terminal state):
If the job has a periodic trigger and isn't already in the map, compute next fire time and insert a new timer entry
Job is rescheduled (re-deploy, config change):
Remove the old timer entry
Parse the new trigger config from the updated descriptor
Compute next fire time and insert a new timer entry
This handles trigger cadence changes (e.g., interval changed from 5min to 10min)
Job is deleted or stopped:
Remove the timer entry from the map
Job has a one-shot trigger:
Never enters the map. One-shot jobs are unaffected by this system.
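The map operations behind these lifecycle events might look roughly like this. It is a std-only sketch: JobId as u64, the TimerEntry fields, and the method names are placeholders I made up, and the real entry would hold the tokio Interval and the parsed trigger config rather than plain numbers.

```rust
use std::collections::HashMap;

/// Hypothetical job identifier type.
type JobId = u64;

/// Simplified stand-in for a timer entry (times in Unix seconds).
#[derive(Debug, Clone, PartialEq, Eq)]
struct TimerEntry {
    period_secs: u64,
    next_fire: u64,
}

/// Hypothetical wrapper around the in-memory timer map.
#[derive(Default)]
struct TimerMap {
    entries: HashMap<JobId, TimerEntry>,
}

impl TimerMap {
    /// Job reached a terminal state: insert only if no entry exists yet.
    fn on_terminal(&mut self, job: JobId, entry: TimerEntry) {
        self.entries.entry(job).or_insert(entry);
    }

    /// Job was rescheduled (re-deploy, config change): replace the old entry,
    /// so a cadence change takes effect and no duplicate timer exists.
    fn on_rescheduled(&mut self, job: JobId, entry: TimerEntry) {
        self.entries.insert(job, entry);
    }

    /// Job was deleted or stopped: drop its timer, returning it if present.
    fn on_removed(&mut self, job: JobId) -> Option<TimerEntry> {
        self.entries.remove(&job)
    }
}
```

Because HashMap::insert overwrites an existing key, the "at most one timer entry per job" rule holds by construction. One-shot jobs never call any of these methods.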
What This Replaces
No next_fire_at column on jobs_status
No migration
No update_next_fire_at mutation
No polling query on every reconciliation tick
The get_jobs_for_trigger_evaluation query becomes a boot-only query (and can use the JSONB containment filter since it only runs once)
Invariants
The map is derived state -- it's fully reconstructable from the event log at any time. A crash and restart rebuilds it from scratch.
The map only contains periodic jobs in terminal states. Running/scheduled jobs are not in the map (they haven't fired their trigger yet).
Each job has at most one timer entry. Re-schedule replaces, never duplicates.
Edge Cases
Controller restart: boot recovery reconstructs the map. Missed ticks fire immediately (the computed next_fire_time will be in the past, so the interval fires on first poll).
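Long downtime: same as above. Only the next fire time matters, not how many ticks were missed. No catch-up backfill. Note that tokio::time::Interval defaults to MissedTickBehavior::Burst, which fires missed ticks back-to-back, so the interval needs MissedTickBehavior::Skip or MissedTickBehavior::Delay set explicitly to get this behavior.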
Trigger config change while running: the running job completes, reaches terminal state, and the lifecycle event picks up the new descriptor to compute the next fire time. The old timer (if any) is replaced.
next_fire_at column to jobs_status with a filtered index
get_jobs_for_trigger_evaluation: fetches jobs where next_fire_at <= now()
update_next_fire_at: sets/clears the next fire time for periodic triggers
get_attempt_count_since_last_completed: counts scheduling attempts since last completed run