feat: add evaluate_flags() API for single-call flag evaluation #539
Introduce posthog.evaluate_flags(distinct_id, ...) returning a FeatureFlagEvaluations snapshot. Branch on .is_enabled() / .get_flag() and pass the snapshot to capture() via a new flags option so events carry the exact values the code branched on, with no extra /flags request per capture.
Filtering helpers .only_accessed() and .only([keys]) narrow the flag set attached to events. Pass flag_keys=[...] to evaluate_flags() to scope the underlying /flags request. Local evaluation is transparent.
Deprecation (Phase 2 of the RFC):
- feature_enabled, get_feature_flag, get_feature_flag_payload, and capture(send_feature_flags=...) now emit DeprecationWarning.
- The deprecated surface remains functional and will be removed in the next major version.
Generated-By: PostHog Code
Task-Id: b8a45b11-b41c-4995-8622-acea525e7703
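As a rough illustration of the "events carry the exact values the code branched on" idea, here is a minimal sketch. It is not the SDK's implementation: `SnapshotSketch` and `capture_sketch` are hypothetical stand-ins, the `$feature/<key>` and `$active_feature_flags` property names are taken from commit messages later in this thread, and the choice to include disabled flags as `False` values is an assumption.

```python
from typing import Dict, Union

FlagValue = Union[bool, str]


class SnapshotSketch:
    """Hypothetical stand-in for FeatureFlagEvaluations; not the SDK class."""

    def __init__(self, flags: Dict[str, FlagValue]) -> None:
        self._flags = dict(flags)  # values are fixed at evaluation time

    def is_enabled(self, key: str) -> bool:
        # A variant string counts as enabled; False or a missing key does not.
        return bool(self._flags.get(key, False))

    def get_flag(self, key: str):
        return self._flags.get(key)

    def event_properties(self) -> Dict[str, object]:
        # Built from the stored values only, so no network call happens here.
        props: Dict[str, object] = {
            f"$feature/{k}": v for k, v in self._flags.items()
        }
        props["$active_feature_flags"] = sorted(
            k for k, v in self._flags.items() if v is not False
        )
        return props


def capture_sketch(event: str, flags: SnapshotSketch) -> Dict[str, object]:
    """Hypothetical capture(): attaches the snapshot's values to the event."""
    return {"event": event, "properties": flags.event_properties()}


snapshot = SnapshotSketch({"new-checkout": "variant-a", "beta-banner": False})
if snapshot.is_enabled("new-checkout"):
    message = capture_sketch("checkout_started", snapshot)
```

The point of the sketch is that the branch and the event read from the same stored dictionary, so they cannot disagree.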
`mock` is not in the project's test dependencies on CI.
Generated-By: PostHog Code
Task-Id: b8a45b11-b41c-4995-8622-acea525e7703
posthog-python Compliance Report
Date: 2026-05-01 19:53:32 UTC
| Test | Status | Duration |
|---|---|---|
| Format Validation.Event Has Required Fields | ✅ | 517ms |
| Format Validation.Event Has Uuid | ✅ | 1507ms |
| Format Validation.Event Has Lib Properties | ✅ | 1507ms |
| Format Validation.Distinct Id Is String | ✅ | 1506ms |
| Format Validation.Token Is Present | ✅ | 1507ms |
| Format Validation.Custom Properties Preserved | ✅ | 1507ms |
| Format Validation.Event Has Timestamp | ✅ | 1506ms |
| Retry Behavior.Retries On 503 | ✅ | 9519ms |
| Retry Behavior.Does Not Retry On 400 | ✅ | 3505ms |
| Retry Behavior.Does Not Retry On 401 | ✅ | 3507ms |
| Retry Behavior.Respects Retry After Header | ✅ | 9513ms |
| Retry Behavior.Implements Backoff | ✅ | 23530ms |
| Retry Behavior.Retries On 500 | ✅ | 7505ms |
| Retry Behavior.Retries On 502 | ✅ | 7507ms |
| Retry Behavior.Retries On 504 | ✅ | 7515ms |
| Retry Behavior.Max Retries Respected | ✅ | 23525ms |
| Deduplication.Generates Unique Uuids | ✅ | 1501ms |
| Deduplication.Preserves Uuid On Retry | ✅ | 7510ms |
| Deduplication.Preserves Uuid And Timestamp On Retry | ✅ | 14525ms |
| Deduplication.Preserves Uuid And Timestamp On Batch Retry | ✅ | 7508ms |
| Deduplication.No Duplicate Events In Batch | ✅ | 1503ms |
| Deduplication.Different Events Have Different Uuids | ✅ | 1507ms |
| Compression.Sends Gzip When Enabled | ✅ | 1507ms |
| Batch Format.Uses Proper Batch Structure | ✅ | 1507ms |
| Batch Format.Flush With No Events Sends Nothing | ✅ | 1004ms |
| Batch Format.Multiple Events Batched Together | ✅ | 1506ms |
| Error Handling.Does Not Retry On 403 | ✅ | 3508ms |
| Error Handling.Does Not Retry On 413 | ✅ | 3507ms |
| Error Handling.Retries On 408 | ✅ | 7514ms |
Feature_Flags Tests
| Test | Status | Duration |
|---|---|---|
| Request Payload.Request With Person Properties Device Id | ❌ | 514ms |
Failures
request_payload.request_with_person_properties_device_id
Field 'token' not found in /flags request body at path 'token'. Available keys: ['distinct_id', 'groups', 'person_properties', 'group_properties', 'geoip_disable', 'device_id', 'flag_keys_to_evaluate', 'sentAt', 'api_key']
Phase 2 deprecation warnings on `feature_enabled`, `get_feature_flag`, `get_feature_flag_payload`, and `capture(send_feature_flags=...)` are moved to a separate PR so this minor ships only the new API. Gives users one minor to migrate before runtime warnings start.
The deprecated methods are restored to their original implementations (no longer need to bypass each other to avoid cascading warnings).
Generated-By: PostHog Code
Task-Id: b8a45b11-b41c-4995-8622-acea525e7703
This is a comment left during a code review.
Path: posthog/test/test_evaluate_flags.py
Line: 42-44
Comment:
**Missing `tearDown` leaks Client background threads**
`TestEvaluateFlagsRemote` creates a `Client` in `setUp` but never calls `self.client.shutdown()`. The other two test classes in this file (`TestEvaluateFlagsFiltering`, `TestCaptureWithFlagsSnapshot`) both have correct `tearDown` implementations. Without shutdown, the client's background consumer thread is never joined, which can cause test-suite noise, delayed process exit, or flaky behaviour when mocks from one test bleed into the next.
```suggestion
class TestEvaluateFlagsRemote(unittest.TestCase):
    def setUp(self):
        self.client = Client(FAKE_TEST_API_KEY)

    def tearDown(self):
        self.client.shutdown()
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: posthog/feature_flag_evaluations.py
Line: 69-80
Comment:
**`flag_definitions_loaded_at` is accepted but never supplied**
`_record_access` (line 242) emits `$feature_flag_definitions_loaded_at` only when `self._flag_definitions_loaded_at is not None`, but `Client.evaluate_flags()` never passes this argument to the `FeatureFlagEvaluations` constructor — so the property is always `None` and the event property is never emitted for locally-evaluated flags. Either wire up the value from the poller or remove the dead parameter and the guarded block in `_record_access` until it's ready.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: posthog/test/test_evaluate_flags.py
Line: 42-321
Comment:
**Prefer parameterised tests**
The test suite exercises multiple flag types (boolean, variant, disabled, missing) through per-assertion branches inside single test methods — e.g. `test_is_enabled_returns_correct_values_and_fires_events` and `test_get_flag_returns_variant_or_bool_with_full_metadata`. The project's review standard is to prefer parameterised tests. Using `subTest` (or a library like `ddt`) would give each flag-type scenario its own named, independently-failing case and reduce the assertion density per method.
**Context Used:** Do not attempt to comment on incorrect alphabetica... ([source](https://app.greptile.com/review/custom-context?memory=instruction-0))
How can I resolve this? If you propose a fix, please make it concise.
Reviews (1): Last reviewed commit: "revert: drop deprecation warnings (Phase..."
- Add `tearDown` to `TestEvaluateFlagsRemote` so the Client's background consumer thread is joined between tests, matching the pattern in the other test classes in this file.
- Remove the dead `flag_definitions_loaded_at` constructor parameter from `FeatureFlagEvaluations`. The Python poller doesn't currently expose a definitions-loaded timestamp, so the parameter was always None and the gated branch in `_record_access` never fired. Trim it rather than leaving a confusing no-op; can be re-added with a real data source later.
- Convert the two flag-type-variation tests to parameterized form. `test_is_enabled` and `test_get_flag_known_keys` now run as independently-named cases per flag type, and the missing-key behavior is split into its own focused test.
Generated-By: PostHog Code
Task-Id: b8a45b11-b41c-4995-8622-acea525e7703
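The commit does not say which parameterization mechanism was used. As one possible shape, here is a sketch using the standard library's `subTest`, with the missing-key case in its own test as described above. The `is_enabled` helper and the flag fixtures are hypothetical and stand in for the real snapshot.

```python
import unittest


def is_enabled(flags, key):
    """Hypothetical helper under test: truthy flag values count as enabled."""
    return bool(flags.get(key, False))


class TestIsEnabled(unittest.TestCase):
    FLAGS = {"bool-flag": True, "variant-flag": "control", "off-flag": False}

    def test_is_enabled_by_flag_type(self):
        cases = [
            ("boolean", "bool-flag", True),
            ("variant", "variant-flag", True),
            ("disabled", "off-flag", False),
        ]
        for name, key, expected in cases:
            # Each subTest is reported separately, so one failing flag type
            # does not hide the results of the others.
            with self.subTest(flag_type=name):
                self.assertEqual(is_enabled(self.FLAGS, key), expected)

    def test_missing_key_is_disabled(self):
        # Missing-key behavior kept in its own focused test.
        self.assertFalse(is_enabled(self.FLAGS, "no-such-flag"))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestIsEnabled)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```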
dustinbyrne
left a comment
Looks good! I have a few questions/considerations
| if not self._accessed: | ||
| self._host.log_warning( | ||
| "FeatureFlagEvaluations.only_accessed() was called before any flags were accessed — " | ||
| "attaching all evaluated flags as a fallback. See " | ||
| "https://posthog.com/docs/feature-flags/server-sdks for details." | ||
| ) | ||
| return self._clone_with(self._flags) |
If this is a legitimate case of nothing being accessed, there's no way to avoid getting all flags. This could be more confusing than getting nothing.
E.g., I'm imagining a case where flags are first retrieved and consumed as needed. Should an early call to capture(flags=flags.only_accessed()) include all flags? I think that contradicts the name only_accessed.
Good catch — fixed in 95eb1e9. only_accessed() now honors its name and returns an empty snapshot when nothing has been accessed (no fallback, no warning). The previous behavior was a misguided safety net that ended up being more surprising than helpful for exactly the scenario you describe (early capture() before any branching). The Node-side equivalent has the same change queued.
Updated test: test_only_accessed_returns_empty_when_no_flags_accessed.
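To make the agreed semantics concrete, here is a minimal sketch of access tracking with the two filter helpers. `AccessTrackingSnapshot` is a hypothetical stand-in, not the SDK's `FeatureFlagEvaluations`. It only illustrates that `only_accessed()` yields an empty set before any access and that `only([keys])` silently drops unknown keys (the latter is an assumption).

```python
from typing import Dict, Iterable, List, Set, Union

FlagValue = Union[bool, str]


class AccessTrackingSnapshot:
    """Hypothetical stand-in for FeatureFlagEvaluations; not the SDK class."""

    def __init__(self, flags: Dict[str, FlagValue], accessed: Iterable[str] = ()) -> None:
        self._flags = dict(flags)
        self._accessed: Set[str] = set(accessed)

    @property
    def keys(self) -> List[str]:
        return sorted(self._flags)

    def is_enabled(self, key: str) -> bool:
        self._accessed.add(key)  # remember what the code branched on
        return bool(self._flags.get(key, False))

    def only_accessed(self) -> "AccessTrackingSnapshot":
        # No fallback: if nothing was accessed, nothing is attached.
        kept = {k: v for k, v in self._flags.items() if k in self._accessed}
        return AccessTrackingSnapshot(kept, self._accessed)

    def only(self, keys: Iterable[str]) -> "AccessTrackingSnapshot":
        wanted = set(keys)
        kept = {k: v for k, v in self._flags.items() if k in wanted}
        return AccessTrackingSnapshot(kept, self._accessed)


flags = AccessTrackingSnapshot({"a": True, "b": "variant", "c": False})
before = flags.only_accessed().keys  # nothing accessed yet
flags.is_enabled("b")
after = flags.only_accessed().keys
subset = flags.only(["a", "c", "unknown"]).keys
```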
| ) | ||
| return to_flags_and_payloads(resp) | ||
|
|
||
| def get_flags_decision( |
Should we deprecate the other methods?
Reversed course on this — shipped Phase 2 in eda573d alongside Phase 1 in this PR (rather than splitting it into a follow-up).
feature_enabled, get_feature_flag, get_feature_flag_payload, and capture(send_feature_flags=...) now emit DeprecationWarnings pointing at evaluate_flags(). The methods continue to return the same values; users who pin warnings to errors will get a heads-up on first use, and the rest see them surface via pytest / python -W / IDE inspectors.
feature_enabled and get_feature_flag were restructured to call _get_feature_flag_result directly so a single user-level call emits exactly one warning instead of cascading three.
Phase 3 (removal in next major) is separate.
| local_result, fallback_to_server = self._get_all_flags_and_payloads_locally( | ||
| distinct_id, | ||
| groups=dict(groups), | ||
| person_properties=person_properties, | ||
| group_properties=group_properties, | ||
| flag_keys_to_evaluate=flag_keys, | ||
| ) |
We should pass device_id here, as it may be present in the context (via tracing headers).
I don't think it's necessary to add it as a parameter of evaluate_flags as well, but I'll leave it up to you.
Good catch — wired up in 95eb1e9. evaluate_flags() now resolves device_id from context (get_context_device_id()) at the top of the method, then forwards it to the get_flags_decision(...) call so it lands in the /flags request body. I went with your suggestion and didn't add it as a method parameter — context-only is cleaner and matches the existing distinct_id resolution pattern.
| if self._evaluated_at and not (flag and flag.locally_evaluated): | ||
| properties["$feature_flag_evaluated_at"] = self._evaluated_at | ||
| if flag is None: | ||
| properties["$feature_flag_error"] = "flag_missing" |
Just noting that we're losing granularity of $feature_flag_error values here
You're right — fixed in 95eb1e9. The snapshot now tracks response-level errors (errors_while_computing_flags, quota_limited) at construction and _record_access builds a comma-joined $feature_flag_error matching the single-flag path's granularity. So a missing flag during a quota-limited response now reports quota_limited,flag_missing instead of just flag_missing.
New test: test_errors_while_computing_flags_propagates_to_event covers both the standalone (errors_while_computing_flags) and combined (errors_while_computing_flags,flag_missing) cases.
Not covered yet: TIMEOUT/CONNECTION_ERROR/api_error_NNN. Those manifest as exceptions on the request and currently just log + leave the snapshot empty (so flags would surface as flag_missing). Could fold those into the snapshot in a follow-up if useful, but the most common cases (errors_while_computing and quota_limited) are now covered.
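The comma-joined error value described above could be assembled roughly as follows. This is a sketch, not the code in `_record_access`: `build_flag_error` is a hypothetical helper, and the relative order of the two response-level codes when both apply is an assumption (the thread only shows response-level codes preceding `flag_missing`).

```python
from typing import List, Optional


def build_flag_error(
    errors_while_computing_flags: bool,
    quota_limited: bool,
    flag_found: bool,
) -> Optional[str]:
    """Hypothetical helper: join the applicable error codes with commas."""
    errors: List[str] = []
    # Response-level errors come first, matching the examples in the thread.
    if errors_while_computing_flags:
        errors.append("errors_while_computing_flags")
    if quota_limited:
        errors.append("quota_limited")
    # Per-flag error: the requested key was not in the response.
    if not flag_found:
        errors.append("flag_missing")
    # None means the $feature_flag_error property is left off the event.
    return ",".join(errors) if errors else None


standalone = build_flag_error(True, False, flag_found=True)
combined = build_flag_error(False, True, flag_found=False)
clean = build_flag_error(False, False, flag_found=True)
```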
How do auto-captured events and errors set
I'd say we should deprecate and add a warning now, otherwise there are multiple ways of doing the same thing and users and agents will be very confused.
Resolves a typing import conflict in client.py from main and bundles the review-feedback changes:
- Fix changeset format per RELEASING.md (`pypi/posthog: minor`).
- `only_accessed()` now returns empty when nothing was accessed, instead of falling back to all flags. The fallback contradicted the method name and surprised reviewers.
- Propagate response-level errors (`errors_while_computing_flags`, `quota_limited`) into `$feature_flag_called` events so each access carries the granular error code(s) the single-flag path emits.
- Resolve `device_id` from context in `evaluate_flags()` and pass it through to the `/flags` request. Important for experience-continuity flag matching when the device id flows in via tracing headers.
- Make the precedence between `flags` and `send_feature_flags` explicit: `flags` always wins, and we log a warning when both are passed.
- Clarify the `flag_keys` doc on the module-level `evaluate_flags`.
Generated-By: PostHog Code
Task-Id: b8a45b11-b41c-4995-8622-acea525e7703
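The `flags` versus `send_feature_flags` precedence rule can be sketched as below. `resolve_flag_properties` and `fake_fetch` are hypothetical names used only for illustration; the real `capture()` signature and log wording may differ.

```python
import logging
from typing import Any, Callable, Dict, Optional

log = logging.getLogger("flags_precedence_sketch")


def resolve_flag_properties(
    flags: Optional[Dict[str, Any]],
    send_feature_flags: bool,
    fetch_flags: Callable[[], Dict[str, Any]],
) -> Dict[str, Any]:
    """Hypothetical helper: decide where an event's flag values come from."""
    if flags is not None:
        if send_feature_flags:
            # Both were passed: the snapshot wins and the caller is told so.
            log.warning("Both flags and send_feature_flags were passed; using flags.")
        return dict(flags)
    if send_feature_flags:
        return fetch_flags()  # legacy path: one extra request per capture
    return {}


calls = []


def fake_fetch() -> Dict[str, Any]:
    calls.append("fetched")
    return {"remote-flag": True}


chosen = resolve_flag_properties({"snapshot-flag": "variant"}, True, fake_fetch)
```

In this sketch the fetch callable is never invoked when a snapshot is supplied, which is the "no extra /flags request" property the PR is after.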
Per reviewer feedback, ship Phase 2 in this PR alongside Phase 1 instead of splitting it into a follow-up. The deprecated methods continue to work; they just emit a `DeprecationWarning` pointing at `evaluate_flags()`:
- `feature_enabled()`
- `get_feature_flag()`
- `get_feature_flag_payload()`
- `capture(send_feature_flags=...)` (only when truthy)
`feature_enabled` and `get_feature_flag` are restructured to call `_get_feature_flag_result` directly instead of routing through each other, so a single user-level call emits exactly one warning instead of cascading.
Tests cover each warning's emission and the no-cascade behavior. Existing tests that use the legacy methods will now generate DeprecationWarnings but otherwise pass unchanged.
Generated-By: PostHog Code
Task-Id: b8a45b11-b41c-4995-8622-acea525e7703
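The no-cascade structure can be illustrated with a small sketch. `ClientSketch` is a hypothetical class, not the SDK client; only the method names and the idea of routing through `_get_feature_flag_result` come from the commit message, and the warning text is made up.

```python
import warnings


class ClientSketch:
    """Hypothetical client showing the no-cascade structure; not the SDK's code."""

    def __init__(self, flags):
        self._flags = flags

    def _get_feature_flag_result(self, key):
        # Shared, non-deprecated helper: it never warns.
        return self._flags.get(key)

    def get_feature_flag(self, key):
        warnings.warn(
            "get_feature_flag is deprecated; use evaluate_flags() instead.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return self._get_feature_flag_result(key)

    def feature_enabled(self, key):
        warnings.warn(
            "feature_enabled is deprecated; use evaluate_flags() instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        # Calls the helper directly instead of get_feature_flag(), so this
        # call emits one warning rather than two.
        return bool(self._get_feature_flag_result(key))


client = ClientSketch({"beta": "variant-a"})
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    enabled = client.feature_enabled("beta")
```

Had `feature_enabled` delegated to `get_feature_flag`, the recorded list would hold two warnings for one user-level call.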
Per PR review feedback (manoel): manual exception captures should be able to attach a `FeatureFlagEvaluations` snapshot the same way `capture()` can, so `$exception` events carry the same flag context as the rest of the request's events.
`capture_exception` already accepted `**kwargs` and forwarded select ones to `capture()`; `flags` was just missing from the forwarded set.
This doesn't yet solve the wider question of how *auto*-captured exceptions (sys.excepthook, context-block exception handler) attach flags. That requires a separate mechanism (likely context-stashed flags) and is a follow-up.
Generated-By: PostHog Code
Task-Id: b8a45b11-b41c-4995-8622-acea525e7703
Good question. The answer has two layers.
Manual error captures:
    flags = posthog.evaluate_flags(distinct_id)
    try:
        risky_thing()
    except Exception as e:
        posthog.capture_exception(e, distinct_id=distinct_id, flags=flags)
The same applies to anywhere the developer is the one calling
Auto-captured exceptions (the case you're really asking about —
Happy to do (1) as a separate PR; it felt out of scope for this one since it touches the context module and needs its own design pass. Worth tracking as a follow-up issue?
You're right — reversed course on this and shipped Phase 2 in eda573d alongside Phase 1 in this PR. The PR description has been updated.
|
|
|
||
| The returned `FeatureFlagEvaluations` snapshot exposes `is_enabled()`, `get_flag()`, `get_flag_payload()` for branching and `only_accessed()` / `only([keys])` filter helpers. Pass `flag_keys=[...]` to `evaluate_flags()` to scope the underlying `/flags` request itself. | ||
|
|
||
| Deprecates `feature_enabled()`, `get_feature_flag()`, `get_feature_flag_payload()`, and `capture(send_feature_flags=...)`. They continue to work but now emit a `DeprecationWarning` pointing at `evaluate_flags()`. Removal is planned for the next major version. |
Let's deprecate get_feature_flag_result as well. It lived a short life, but better to clean it up now.
@dustinbyrne you'll need to coordinate this with the @PostHog/team-docs-wizard as well, otherwise the wizard will be lost or will instrument new apps with deprecated APIs.
I think 'Stash flags on the context' makes sense |
Edwin gave me a short intro on how the wizard sources context for these types of things. As I understand it, as long as it's documented on the website, the wizard will automatically start doing the right thing. For Python context it sources:
So these are the documents that we will need to update. Same for every other language.
Updating the docs should do it. There isn't a feature flag specific thing that we run during setup, so it should be safe to get this in. There is a best practices skill, and I think this might change some of the feature flag best practices?
dustinbyrne
left a comment
I can spend some time tomorrow updating documentation for this. If you update the other SDK branches to match this spec I'll review those as well! 👍
Mirrors the changes from PostHog/posthog-python#539 (commit 95eb1e9):
- only_accessed() returns an empty snapshot when nothing was accessed, rather than falling back to all flags + a warning. The fallback contradicted the method's name and surprised reviewers; pre-access any flags you want attached.
- Propagate response-level errors (errors_while_computing_flags, quota_limited) into $feature_flag_called events as a comma-joined $feature_flag_error so each access carries the same granular error codes the single-flag path emits. quota_limited is now parsed from the v2 response.
- Drop the unused flag_definitions_loaded_at plumbing (dead code in Phase 1, replaced by the response-level error propagation).
- Clarify the flag_keys docstring on EvaluateFlagsOptions: it scopes the network call, distinct from the in-memory only([keys]) helper.
Generated-By: PostHog Code
Task-Id: 2b101877-6890-43d1-8dbd-306433cd9d25
Per reviewer feedback on PostHog/posthog-python#539, ship Phase 2 in this PR alongside Phase 1 instead of splitting into a follow-up. The deprecated methods continue to work; they just emit a `#[deprecated]` compile warning pointing at `evaluate_flags()`:
- `Client::get_feature_flag`
- `Client::is_feature_enabled`
- `Client::get_feature_flag_payload`
Both blocking and async clients are covered. `is_feature_enabled` allows the deprecation lint internally because it still routes through `get_feature_flag`. That's the implementation detail; user-level call sites still surface exactly one warning each (one per call to a deprecated method). The existing tests and examples that exercise these methods get module-level `#![allow(deprecated)]` with a comment noting the deprecation window.
The companion `evaluate_flags()` snapshot path covers all three methods' use cases without an extra `/flags` round-trip per call and emits a deduped `$feature_flag_called` event with full metadata.
Generated-By: PostHog Code
Task-Id: 2b101877-6890-43d1-8dbd-306433cd9d25
…ularity, captureException flags)
Mirrors fixes from PostHog/posthog-python#539:
- `onlyAccessed()` returns empty when nothing has been accessed (no fallback to all flags). The previous fallback contradicted the method name and surprised reviewers.
- Propagate response-level errors (`errors_while_computing_flags`, `quota_limited`) into `$feature_flag_called` events so each access carries the granular error code(s) the single-flag path emits.
- Make `flags` vs `sendFeatureFlags` precedence explicit on `capture()`: `flags` always wins, and we log a warning when both are passed.
- Phase 2 deprecation warnings: `getFeatureFlag`, `isFeatureEnabled`, `getFeatureFlagPayload`, and `capture({ sendFeatureFlags })` now log a deduped `[PostHog] ... is deprecated` console warning the first time they're used. `isFeatureEnabled` is restructured to call `_getFeatureFlagResult` directly so a single user-level call emits exactly one warning instead of cascading.
- `captureException` and `captureExceptionImmediate` accept an optional `flags` snapshot so `$exception` events carry the same flag context as the rest of the request's events.
Adds a process-wide dedup helper `emitDeprecationWarningOnce` matching Python's `warnings.warn` default-dedup behavior.
Generated-By: PostHog Code
Task-Id: b8a45b11-b41c-4995-8622-acea525e7703
| def evaluate_flags( | ||
| distinct_id=None, # type: Optional[str] | ||
| groups=None, # type: Optional[Dict[str, str]] | ||
| person_properties=None, # type: Optional[Dict[str, Any]] | ||
| group_properties=None, # type: Optional[Dict[str, Dict[str, Any]]] | ||
| only_evaluate_locally=False, # type: bool | ||
| disable_geoip=None, # type: Optional[bool] | ||
| flag_keys=None, # type: Optional[list] | ||
| ) -> FeatureFlagEvaluations: |
surfaced when documenting: we should include a device_id override for parity with the deprecated methods
this might be relevant for other sdks as well?
good shout! So most of the other SDKs didn't actually support this yet – it was only this one and JS that supported that operation. I do think we should support device_id in this method going forward for other SDKs, but given that only Python and JS have it now, it's only a concern for those SDKs. Definitely will update JS, though.
The deprecated single-flag methods (`feature_enabled`, `get_feature_flag`, `get_feature_flag_payload`) and `get_all_flags` / `get_all_flags_and_payloads` all accept an explicit `device_id` parameter that overrides the value resolved from context. Mirror that on `evaluate_flags()` so callers can bypass the context resolver when they need to.
Behavior: if `device_id=None` (the default), the value is resolved via `get_context_device_id()` as before; an explicit string takes priority.
Generated-By: PostHog Code
Task-Id: b8a45b11-b41c-4995-8622-acea525e7703
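The override rule described above can be sketched with the standard library's `contextvars`. Everything here is a stand-in: `_device_id_var`, `get_context_device_id_sketch`, and `resolve_device_id` are hypothetical names, and how the SDK's own `get_context_device_id()` stores its value is not shown in this thread.

```python
import contextvars
from typing import Optional

# Hypothetical stand-in for the SDK's context storage.
_device_id_var = contextvars.ContextVar("device_id", default=None)


def get_context_device_id_sketch() -> Optional[str]:
    return _device_id_var.get()


def resolve_device_id(device_id: Optional[str] = None) -> Optional[str]:
    # An explicit argument takes priority; None falls back to the context.
    if device_id is not None:
        return device_id
    return get_context_device_id_sketch()


# Simulate a device id arriving via tracing headers.
token = _device_id_var.set("device-from-headers")
from_context = resolve_device_id()
overridden = resolve_device_id("device-explicit")
_device_id_var.reset(token)
after_reset = resolve_device_id()
```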
* feat: add evaluate_flags() API for single-call flag evaluation
Adds a snapshot-based feature flag API mirroring posthog-python (#539) and
posthog-node (#3476). One call to evaluate_flags(distinct_id, options) reaches
/flags?v=2 once and returns a FeatureFlagEvaluations cache that:
- Resolves is_enabled / get_flag locally with full metadata propagation
($feature_flag_id, $feature_flag_version, $feature_flag_reason,
$feature_flag_request_id) on the deduplicated $feature_flag_called event
- Treats get_flag_payload as event-free
- Offers only_accessed() / only([keys]) filter helpers with warnings on
misuse, gated by a new feature_flags_log_warnings client option
- Short-circuits empty-distinct_id snapshots so accesses never emit events
Also adds Event::with_flags(&snapshot) so a captured event inherits
$feature/<key> and $active_feature_flags from the snapshot without an extra
/flags request.
Both blocking and async clients implement the host trait that owns the
per-distinct_id dedup cache (cap 50_000, full reset on overflow to match the
JS SDK).
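For readers more familiar with the Python SDK, the "cap with full reset on overflow" policy might look like the sketch below. It is written in Python for consistency with the rest of this thread and is not the Rust implementation. `FlagCalledDedupCache` is a hypothetical name, and both the unit the cap counts (distinct ids here) and the shape of the dedup key are assumptions.

```python
from typing import Dict, Set


class FlagCalledDedupCache:
    """Hypothetical dedup cache; the real one lives behind the SDK's host trait."""

    def __init__(self, max_distinct_ids: int = 50_000) -> None:
        self._max = max_distinct_ids
        self._seen: Dict[str, Set[str]] = {}

    def should_emit(self, distinct_id: str, flag_key: str, value: object) -> bool:
        entry = f"{flag_key}_{value}"
        seen_for_id = self._seen.get(distinct_id)
        if seen_for_id is not None and entry in seen_for_id:
            return False  # same flag value already reported for this user
        if seen_for_id is None:
            if len(self._seen) >= self._max:
                # Full reset on overflow rather than evicting one entry.
                self._seen.clear()
            seen_for_id = self._seen[distinct_id] = set()
        seen_for_id.add(entry)
        return True


cache = FlagCalledDedupCache(max_distinct_ids=2)
results = [
    cache.should_emit("u1", "flag", True),  # first sighting
    cache.should_emit("u1", "flag", True),  # duplicate, suppressed
    cache.should_emit("u2", "flag", True),  # second distinct id, cache now full
    cache.should_emit("u3", "flag", True),  # overflow triggers a full reset
    cache.should_emit("u1", "flag", True),  # forgotten by the reset, emitted again
]
```

A full reset is simpler than LRU bookkeeping, at the cost of occasionally re-emitting an event that had already been reported.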
The existing get_feature_flag / is_feature_enabled methods stay silent — a
Phase 2 follow-up will retrofit them onto the same dedup helper.
Generated-By: PostHog Code
Task-Id: 2b101877-6890-43d1-8dbd-306433cd9d25
* fix: address CodeQL unused-variable false positives in async spawn
CodeQL flags `e` as unused inside `tokio::spawn(async move { ... })` even
though tracing's `%e` shorthand uses Display on it. Switch to `{e}` capture
syntax so the use is unambiguous to the analyzer; same telemetry, slightly
less structured but the field was only consumed by the debug log anyway.
Generated-By: PostHog Code
Task-Id: 2b101877-6890-43d1-8dbd-306433cd9d25
* fix: bind async-spawn errors via to_string so CodeQL sees the use
The previous attempt swapped tracing's `%e` shorthand for `{e}` capture
syntax but CodeQL still flagged the variables as unused — its Rust
extractor doesn't track identifiers through the format-string macro
expansion inside `tokio::spawn(async move { ... })`. Explicitly bind the
error to a `String` via `.to_string()` and log that, which gives the
analyzer an unambiguous use.
Generated-By: PostHog Code
Task-Id: 2b101877-6890-43d1-8dbd-306433cd9d25
* feat: incorporate evaluate_flags review feedback from posthog-python
Mirrors the changes from PostHog/posthog-python#539 (commit 95eb1e9):
- only_accessed() returns an empty snapshot when nothing was accessed,
rather than falling back to all flags + a warning. The fallback
contradicted the method's name and surprised reviewers — pre-access
any flags you want attached.
- Propagate response-level errors (errors_while_computing_flags,
quota_limited) into $feature_flag_called events as a comma-joined
$feature_flag_error so each access carries the same granular error
codes the single-flag path emits. quota_limited is now parsed from
the v2 response.
- Drop the unused flag_definitions_loaded_at plumbing (dead code in
Phase 1 — replaced by the response-level error propagation).
- Clarify the flag_keys docstring on EvaluateFlagsOptions: it scopes
the network call, distinct from the in-memory only([keys]) helper.
Generated-By: PostHog Code
Task-Id: 2b101877-6890-43d1-8dbd-306433cd9d25
* feat: deprecate legacy single-flag methods (Phase 2)
Per reviewer feedback on PostHog/posthog-python#539, ship Phase 2 in this
PR alongside Phase 1 instead of splitting into a follow-up. The deprecated
methods continue to work — they just emit a `#[deprecated]` compile warning
pointing at `evaluate_flags()`:
- `Client::get_feature_flag`
- `Client::is_feature_enabled`
- `Client::get_feature_flag_payload`
Both blocking and async clients are covered. `is_feature_enabled` allows
the deprecation lint internally because it still routes through
`get_feature_flag` — that's the implementation detail; user-level call
sites still surface exactly one warning each (one per call to a deprecated
method). The existing tests and examples that exercise these methods get
module-level `#![allow(deprecated)]` with a comment noting the deprecation
window.
The companion `evaluate_flags()` snapshot path covers all three methods'
use cases without an extra `/flags` round-trip per call and emits a
deduped `$feature_flag_called` event with full metadata.
Generated-By: PostHog Code
Task-Id: 2b101877-6890-43d1-8dbd-306433cd9d25
* fix: address evaluate_flags review feedback from @dustinbyrne
- Skip the /flags round-trip when `flag_keys` is set and local evaluation
already resolved every requested key. Without `flag_keys` we still hit
the API since the local poller may not know every flag the project has.
- Don't lose successful local evaluations when /flags fails. If the
remote call errors but we already have local results, return a snapshot
built from those instead of propagating the error; flag the snapshot's
errors_while_computing_flags so $feature_flag_called events carry that
context.
- Normalize `metadata.payload` from /flags?v=2: when it arrives as a
JSON-encoded string (the API sometimes ships it that way) parse it
into the equivalent JSON value so callers branch on uniform shapes.
- Capture a tokio runtime Handle when constructing the async event host
so $feature_flag_called events can be spawned from any context the
snapshot is consumed in, including threads without an entered runtime.
Without this, a snapshot moved across threads would panic on access.
- Remove the `feature_flags_log_warnings` client option. The single
`only(...)` warning surfaces via tracing::warn! and is silenceable
through normal tracing-subscriber level filters (e.g. `posthog_rs=error`).
- Demote `EvaluatedFlagRecord`, `FlagCalledEventParams`, and
`FeatureFlagEvaluationsHost` from public re-exports to `pub(crate)` —
they were implementation details for the snapshot's host plumbing,
not user-facing API.
Also drops the now-dead `key` field on `EvaluatedFlagRecord` (the HashMap
key already serves that purpose).
Generated-By: PostHog Code
Task-Id: 2b101877-6890-43d1-8dbd-306433cd9d25
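The first two review items in the commit above (skipping the remote call when every requested key resolved locally, and keeping local results when the remote call fails) can be sketched in Python as follows. This is not the Rust code: `evaluate_sketch` is a hypothetical helper, and letting local values overwrite remote ones in the merged result is an assumption.

```python
from typing import Callable, Dict, List, Optional, Tuple

Flags = Dict[str, object]


def evaluate_sketch(
    local_flags: Flags,
    fetch_remote: Callable[[], Flags],
    flag_keys: Optional[List[str]] = None,
) -> Tuple[Flags, bool]:
    """Hypothetical helper: returns (flags, errors_while_computing_flags)."""
    if flag_keys is not None:
        local_flags = {k: v for k, v in local_flags.items() if k in flag_keys}
        if all(k in local_flags for k in flag_keys):
            # Every requested key resolved locally: skip the round-trip.
            return dict(local_flags), False
    try:
        remote = fetch_remote()
    except Exception:
        if local_flags:
            # Keep the successful local results and mark the snapshot as partial.
            return dict(local_flags), True
        raise
    merged = dict(remote)
    merged.update(local_flags)  # assumption: local results take precedence
    return merged, False


def failing_remote() -> Flags:
    raise ConnectionError("remote /flags unavailable")


def healthy_remote() -> Flags:
    return {"a": False, "b": "variant"}


scoped = evaluate_sketch({"a": True}, failing_remote, flag_keys=["a"])
degraded = evaluate_sketch({"a": True}, failing_remote)
merged = evaluate_sketch({"a": True}, healthy_remote)
```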
* feat: add EvaluateFlagsAsync() API for single-call flag evaluation
Adds a new EvaluateFlagsAsync(distinctId, options) method on the client that returns a FeatureFlagEvaluations snapshot. The snapshot powers IsEnabled / GetFlag / GetFlagPayload calls, fires $feature_flag_called lazily (deduped against the existing per-distinct-id cache), and can be forwarded to a new Capture(..., flags: snapshot) overload to attach $feature/<key> and $active_feature_flags to events without a second /flags request. Mirrors PostHog/posthog-js#3476 and PostHog/posthog-python#539.
Also fixes a long-standing bug where the legacy single-flag path hard-coded locally_evaluated=false on every $feature_flag_called event. Locally-evaluated flags now correctly carry locally_evaluated=true, $feature_flag_reason="Evaluated locally", and a new $feature_flag_definitions_loaded_at timestamp surfaced via LocalFeatureFlagsLoader.
The existing IsFeatureEnabledAsync / GetFeatureFlagAsync / Capture(..., sendFeatureFlags, ...) APIs are unchanged in this PR; a follow-up minor will mark them deprecated in favor of the snapshot API.
Generated-By: PostHog Code
Task-Id: 494d1c64-1b39-421a-9317-7ccd5992aa40
* review: thread-safety, drop dead reason, parameterize tests, parse JSON
- FeatureFlagEvaluations._accessed: HashSet<string> -> ConcurrentDictionary<string, byte> so callers may share a snapshot across parallel branches without corrupting it.
- ToRecord: leave EvaluatedFlagRecord.Reason null for locally-evaluated flags; the "Evaluated locally" string is hardcoded inside BuildFeatureFlagCalledProperties and the host gates record.Reason with !LocallyEvaluated, so it was unread.
- Collapse IsEnabledReturnsFalseForUnknownKey + GetFlagReturnsNullForUnknownKey into a parameterized [Theory] over the accessor under test.
- Replace the brittle substring match on $active_feature_flags with a parsed, order-independent comparison; Dictionary iteration order isn't a guarantee.
Generated-By: PostHog Code
Task-Id: 494d1c64-1b39-421a-9317-7ccd5992aa40
* review: DIMs, dedup fast-path, perf, coverage gaps
Address PR feedback:
- IPostHogClient: add default interface implementations for the new Capture(flags:), CaptureException(flags:), and EvaluateFlagsAsync members so external implementers don't see a source break. Conditionally compiled: DIMs only on netstandard2.1+ (the runtime requirement); netstandard2.0 keeps abstract members.
- FeatureFlagEvaluations.RecordAccess: early-return on repeat access, dropping per-call dedup-cache lookups + property allocation when a key has already been seen by this snapshot. Cross-snapshot dedup still flows through the MemoryCache.
- AddFeatureFlagsToCapturedEvent (snapshot path): single-pass enumeration over Records, skip the LINQ Where/Select/ToArray for $active_feature_flags.
- FeatureFlagEvaluations._records: tighten field type to Dictionary so Keys is a clean expression-bodied getter (no IReadOnlyDictionary cast).
- FeatureFlagEvaluations.Only(...): lazy missing-keys list, with no allocation when every requested key is present.
- EvaluationsHost: drop the redundant id/version/reason copy block; the values it would write are already populated by BuildFeatureFlagCalledProperties via the FeatureFlagWithMetadata pattern match.
- EvaluatedFlagRecord: remove the now-unused Id/Version/Reason fields. The property dict is built from record.Flag (typed as FeatureFlagWithMetadata when present) rather than from duplicated record-level state.
- EvaluateFlagsAsync: local-pass quota_limited preserves locally-evaluated records and surfaces FeatureFlagError.QuotaLimited (matches remote-pass behavior); previously it discarded local results entirely. Add a comment on the local-wins merge clarifying the divergence from GetAllFeatureFlagsAsync.
- IPostHogClient.EvaluateFlagsAsync: <remarks> contrasting FlagKeysToEvaluate (request-body scoping) with FeatureFlagEvaluations.Only(...) (in-memory).
- IFeatureFlagEvaluationsHost.TryCaptureFeatureFlagCalledEventIfNeeded -> CaptureFeatureFlagCalled (no return value, no try semantics).
Tests added:
- MixedLocalAndRemoteEvaluationMergesRecordsAndTagsSourceCorrectly: pins the local-wins merge with locally_evaluated tagged correctly per source.
- UnknownKeyAccessAppendsFlagMissingErrorOnFeatureFlagCalled: pins the $feature_flag_error wiring through to the emitted event.
- CaptureExceptionAttachesFeatureFlagsFromSnapshot: pins the new CaptureException(flags:) overload so a CaptureExceptionCore wiring mistake would be caught.
Generated-By: PostHog Code
Task-Id: 494d1c64-1b39-421a-9317-7ccd5992aa40
* chore: deprecate single-flag and sendFeatureFlags APIs
Mark the four legacy paths replaced by EvaluateFlagsAsync + snapshot [Obsolete(error: false)] so users see migration guidance the moment they update the package:
- IPostHogClient.IsFeatureEnabledAsync / .GetFeatureFlagAsync
- IPostHogClient.Capture(..., bool sendFeatureFlags, ...)
- IPostHogClient.CaptureException(..., bool sendFeatureFlags, ...)
Cascade [Obsolete] to wrapper extensions:
- FeatureFlagExtensions: 5 IsFeatureEnabledAsync + 5 GetFeatureFlagAsync overloads
- CaptureExtensions: bool-sendFeatureFlags overloads of Capture / CapturePageView / CaptureScreenView
- CaptureExceptionExtensions: bool-sendFeatureFlags overloads
Each extension delegates internally; suppress CS0618 inside the body so the warning surfaces at the user call site, not at the SDK call into itself.
Internal call sites that always passed sendFeatureFlags: false migrate to the new Capture(..., flags: null, ...) overload: no behavioral change, but stops the SDK from internally calling its own deprecated path.
Tests and samples that intentionally exercise the deprecated surface get a file-level #pragma warning disable CS0618. The new FeatureFlagEvaluationsTests cross-path dedup test wraps a single IsFeatureEnabledAsync call in a per-call pragma so the rest of the file still catches accidental new uses.
PostHog.AI's OpenAI handler keeps the legacy Capture(..., sendFeatureFlags: false, ...) call with a #pragma + TODO; its tests assert the legacy mock shape and migrating them is its own change.
PostHog.AspNetCore's PostHogVariantFeatureManager suppresses with a #pragma + TODO; the FeatureManager API is per-flag so a snapshot rewrite is non-trivial.
All 781 unit tests, 26 AspNetCore tests, and 19 AI tests pass.
Generated-By: PostHog Code
Task-Id: 494d1c64-1b39-421a-9317-7ccd5992aa40
* review: fix CI, propagate cancellation, tighten Records, polish comments
- sdk_compliance_adapter Program.cs: switch the lone Capture(..., sendFeatureFlags: false, ...) call to the new flags: null overload so the Docker publish in the SDK-compliance CI job no longer hits CS0618 → exit 1.
- EvaluateFlagsAsync: exclude OperationCanceledException from the catch-all so cancellation propagates instead of being logged as UnknownError. Matches GetFeatureFlagAsync's filter.
- FeatureFlagEvaluations.Records: typed as IReadOnlyDictionary so the one consumer (PostHogClient.AddFeatureFlagsToCapturedEvent) can iterate but cannot mutate the snapshot's underlying state.
- Local-quota comment in EvaluateFlagsAsync: clarify that `records` is always empty when the catch fires (the throwing call is the first inside the try).
- Capture / CaptureException / DIM bodies: name every trailing argument (timestamp:, flags:) so non-trailing-named-argument call sites don't trip future IDE/bot warnings even though the C# 7.2+ rules accept them.
- FeatureFlagEvaluationsTests: drop the unused Microsoft.Extensions.Options using directive.
Generated-By: PostHog Code
Task-Id: 494d1c64-1b39-421a-9317-7ccd5992aa40
* chore: drop stray BOM after pragma in FeatureFlagExtensionsTests
Generated-By: PostHog Code
Task-Id: 494d1c64-1b39-421a-9317-7ccd5992aa40
* chore: revert manual version bump; release workflow handles it
RELEASING.md confirms the auto-release workflow bumps Directory.Build.props on merge based on the bump-* PR label, so leaving 2.6.0 in source would cause it to compound (e.g. to 2.7.0 after bump-minor). Restore to main's 2.5.0.
Generated-By: PostHog Code
Task-Id: 494d1c64-1b39-421a-9317-7ccd5992aa40
…3476)

* feat(node): add evaluateFlags() API for single-call flag evaluation

Introduce `posthog.evaluateFlags(distinctId, options)` returning a `FeatureFlagEvaluations` snapshot. Branch on `isEnabled()` / `getFlag()` and pass the snapshot to `capture()` via a new `flags` option so events carry the exact values the code branched on, with no extra /flags request per capture.

Filtering helpers `onlyAccessed()` and `only([keys])` let callers shrink the flag set attached to events. A new `featureFlagsLogWarnings` option toggles the associated user-facing warnings.

Existing `isFeatureEnabled` / `getFeatureFlag` / `sendFeatureFlags` continue to work unchanged; `sendFeatureFlags` is marked deprecated in JSDoc ahead of a future major-version removal.

Generated-By: PostHog Code
Task-Id: b8a45b11-b41c-4995-8622-acea525e7703

* feat(node): support flagKeys option on evaluateFlags()

Allow callers to scope the underlying /flags request to a subset of flags. The chained `flags.only([...])` filter still exists for event-attachment scoping after evaluation; `flagKeys` reduces the network payload itself.

Generated-By: PostHog Code
Task-Id: b8a45b11-b41c-4995-8622-acea525e7703

* docs(node): expand evaluateFlags() JSDoc with flagKeys and filtering examples

Generated-By: PostHog Code
Task-Id: b8a45b11-b41c-4995-8622-acea525e7703

* fix(node): close parity gaps on FeatureFlagEvaluations $feature_flag_called events

- Plumb $feature_flag_definitions_loaded_at into the snapshot at construction so locally-evaluated flag access via the new API emits the same event schema as the existing single-flag path.
- Short-circuit $feature_flag_called emission when the snapshot has no resolvable distinctId, so the safety-fallback empty snapshot doesn't leak events with empty distinct_id values.
- Demote the shared dedup helper from public to protected; the only external caller is a closure with `this`-scoped access.
- Document the onlyAccessed() empty-fallback behavior and clarify that the local-evaluation flag definition has no version field.

Generated-By: PostHog Code
Task-Id: b8a45b11-b41c-4995-8622-acea525e7703

* fix(node): suppress flag_missing events on filtered FeatureFlagEvaluations slices

Address review feedback on PR #3476:
- Filtered snapshots from `only()` / `onlyAccessed()` no longer fire misleading `$feature_flag_called` events with `flag_missing` when branching on a key that was excluded from the slice. The slice tracks whether it's a filtered view via an `_isSlice` flag and short-circuits `_recordAccess` for absent keys. Document this behavior on the filter helpers' JSDoc; slices are intended for `capture()`, not branching. Add a regression test covering the path.
- Refactor `evaluate-flags.spec.ts` to extract a `setup(overrides)` helper used by all suites, replacing eight repeated `new PostHog(...)` blocks plus four duplicated capture-listener setups. Per-test deviations (`featureFlagsLogWarnings: false`, `personalApiKey: ...`) now stand out as explicit overrides.

Generated-By: PostHog Code
Task-Id: b8a45b11-b41c-4995-8622-acea525e7703

* feat(node): port Python PR feedback (deprecation warnings, error granularity, captureException flags)

Mirrors fixes from PostHog/posthog-python#539:
- `onlyAccessed()` returns empty when nothing has been accessed (no fallback to all flags). The previous fallback contradicted the method name and surprised reviewers.
- Propagate response-level errors (`errors_while_computing_flags`, `quota_limited`) into `$feature_flag_called` events so each access carries the granular error code(s) the single-flag path emits.
- Make `flags` vs `sendFeatureFlags` precedence explicit on `capture()`: `flags` always wins, and we log a warning when both are passed.
- Phase 2 deprecation warnings: `getFeatureFlag`, `isFeatureEnabled`, `getFeatureFlagPayload`, and `capture({ sendFeatureFlags })` now log a deduped `[PostHog] ... is deprecated` console warning the first time they're used. `isFeatureEnabled` is restructured to call `_getFeatureFlagResult` directly so a single user-level call emits exactly one warning instead of cascading.
- `captureException` and `captureExceptionImmediate` accept an optional `flags` snapshot so `$exception` events carry the same flag context as the rest of the request's events.

Adds a process-wide dedup helper `emitDeprecationWarningOnce` matching Python's `warnings.warn` default-dedup behavior.

Generated-By: PostHog Code
Task-Id: b8a45b11-b41c-4995-8622-acea525e7703

* fix(node): address PR review: drop unused methods, add JSDoc @deprecated tags

Per dustin's feedback on PR #3476:
- Remove unused public `_getDistinctId()` / `_getGroups()` methods on `FeatureFlagEvaluations`. They had no callers (verified via grep across the repo) and don't need to ship as part of the public surface.
- Add JSDoc `@deprecated` tags to `getFeatureFlag`, `isFeatureEnabled`, and `getFeatureFlagPayload` on both the `PostHogBackendClient` impl (client.ts) and the `IPostHog` interface (types.ts). The runtime `console.warn` was already in place; the JSDoc tag adds IDE strike-through and agent-tooling visibility: code agents reading the public surface will see the deprecation immediately rather than waiting for a runtime call.

Generated-By: PostHog Code
Task-Id: b8a45b11-b41c-4995-8622-acea525e7703
Problem
Phase 1 + Phase 2 of the Server SDK Feature Flag Evaluations RFC for `posthog-python`. Companion to the Node SDK PR (PostHog/posthog-js#3476).

Today every flag check fires its own `/flags` request, and `capture(send_feature_flags=True)` silently fires yet another on every captured event. The flag values on a captured event can diverge from the ones the code actually branched on when person/group properties differ between calls. `send_feature_flags` also attaches every evaluated flag to every event, which bloats properties on high-volume events.

Changes
New API (Phase 1)
`posthog.evaluate_flags(distinct_id, ...)` returns a `FeatureFlagEvaluations` snapshot.

A single `/flags` request powers both branching and event enrichment. `is_enabled()` and `get_flag()` fire `$feature_flag_called` events (deduped through the existing cache) with the full metadata (`$feature_flag_id`, `$feature_flag_version`, `$feature_flag_reason`, `$feature_flag_request_id`), so experiment exposure tracking keeps working.

Two layers of scoping
Network-level (`flag_keys` option): scopes the underlying `/flags` request itself.

Event-level (filter helpers `only_accessed()` and `only([keys])`): narrow which flags get attached to a captured event without re-fetching.
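The snapshot flow and the two scoping layers above can be sketched with a toy, in-memory model. `SnapshotSketch`, `evaluate_flags_sketch`, `event_properties`, the property naming, and the sample flag data are hypothetical stand-ins, not the SDK's classes or signatures; only the method names `is_enabled`, `get_flag`, `only`, `only_accessed` and the `flag_keys` option are taken from this PR.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Union

FlagValue = Union[bool, str]


@dataclass
class SnapshotSketch:
    """Toy stand-in for a flag-evaluation snapshot; not the SDK class."""

    flags: Dict[str, FlagValue]
    accessed: Set[str] = field(default_factory=set)

    def get_flag(self, key: str) -> Optional[FlagValue]:
        # Record the access so only_accessed() can narrow to it later.
        self.accessed.add(key)
        return self.flags.get(key)

    def is_enabled(self, key: str) -> bool:
        value = self.get_flag(key)
        # A variant string counts as enabled; False or a missing key does not.
        return value is not None and value is not False

    def only(self, keys: List[str]) -> "SnapshotSketch":
        # Event-level scoping: keep the requested keys that exist.
        return SnapshotSketch({k: self.flags[k] for k in keys if k in self.flags})

    def only_accessed(self) -> "SnapshotSketch":
        # Event-level scoping: keep only flags the code branched on.
        return self.only(sorted(self.accessed))

    def event_properties(self) -> Dict[str, FlagValue]:
        # Assumed property naming, for illustration only.
        return {f"$feature/{k}": v for k, v in self.flags.items()}


def evaluate_flags_sketch(
    server_flags: Dict[str, FlagValue],
    flag_keys: Optional[List[str]] = None,
) -> SnapshotSketch:
    # Network-level scoping: a real client would send flag_keys with the
    # single /flags request; here the "server" is just a dict.
    if flag_keys is not None:
        server_flags = {k: v for k, v in server_flags.items() if k in flag_keys}
    return SnapshotSketch(dict(server_flags))


server = {"new-checkout": "variant-b", "dark-mode": True, "beta-search": False}

# One evaluation, scoped at the request level to two flags.
snapshot = evaluate_flags_sketch(server, flag_keys=["new-checkout", "dark-mode"])

# Branch on the snapshot; this records "new-checkout" as accessed.
use_new_checkout = snapshot.is_enabled("new-checkout")

# Attach only the flags that were branched on to the event.
props = snapshot.only_accessed().event_properties()
```

In this model the event carries exactly the value the code branched on, and `dark-mode` stays off the event because it was fetched but never accessed.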
Deprecation warnings (Phase 2)
The legacy single-flag surface keeps working but now emits `DeprecationWarning`s pointing at `evaluate_flags()`:

- `feature_enabled()`
- `get_feature_flag()`
- `get_feature_flag_payload()`
- `capture(send_feature_flags=...)` (only when truthy)

`feature_enabled` and `get_feature_flag` are restructured to call `_get_feature_flag_result` directly instead of routing through each other, so a single user-level call emits exactly one warning instead of cascading.

Phase 3 (removal in next major) ships separately.
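The no-cascade restructuring can be illustrated with a small standalone sketch. The function bodies, the `_FLAGS` data, the `_warn_deprecated` helper, and the warning text are assumptions made for illustration; they are not the SDK's implementation, only the shape of the idea: both public functions call a shared private helper that never warns.

```python
import warnings
from typing import Optional, Union

_FLAGS = {"new-checkout": "variant-b", "dark-mode": True}  # toy data


def _get_flag_result(key: str) -> Optional[Union[bool, str]]:
    # Shared private helper: does the lookup and never warns.
    return _FLAGS.get(key)


def _warn_deprecated(name: str) -> None:
    warnings.warn(
        f"{name}() is deprecated; use evaluate_flags() instead.",
        DeprecationWarning,
        stacklevel=3,  # attribute the warning to the caller of the public function
    )


def get_feature_flag(key: str) -> Optional[Union[bool, str]]:
    _warn_deprecated("get_feature_flag")
    return _get_flag_result(key)


def feature_enabled(key: str) -> bool:
    _warn_deprecated("feature_enabled")
    # Calls the helper directly rather than get_feature_flag(),
    # so this user-level call produces a single warning.
    value = _get_flag_result(key)
    return value is not None and value is not False


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    enabled = feature_enabled("new-checkout")

deprecations = [w for w in caught if issubclass(w.category, DeprecationWarning)]
```

Had `feature_enabled` routed through `get_feature_flag`, the same call would have recorded two warnings. Callers who are not ready to migrate can suppress them with the standard library, for example `warnings.filterwarnings("ignore", category=DeprecationWarning)`.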
Local evaluation
Transparent. When the poller resolves a flag, the snapshot carries `locally_evaluated=True` and reason `"Evaluated locally"`, matching what `get_feature_flag()` emits today.

Backwards compatibility
No breaking changes. All existing call paths return the same values they did before. The only behavior change is the new `DeprecationWarning` emissions, which can be silenced via Python's standard warnings filter.

Internals
`_capture_feature_flag_called` was refactored: the dedup + capture portion is extracted into `_capture_feature_flag_called_if_needed`, which is shared between the single-flag path and the new `FeatureFlagEvaluations` object. Both paths now dedupe identically.

Response-level errors (`errors_while_computing_flags`, `quota_limited`) are propagated into `$feature_flag_called` events from the snapshot, matching the granularity of the single-flag path.

Tests
`posthog/test/test_evaluate_flags.py`: 27 tests covering remote evaluation, local evaluation, filtering helpers, capture integration, `flag_keys` round-trip, empty-`distinct_id` safety, error-granularity propagation, and deprecation warning emission (with no-cascade verification).

Full suite: 489 passed. `ruff format` and `ruff check` clean.

Created with PostHog Code