Skip to content

[Gateway V1/V2]: Enable ReadConsistencyStrategy for Gateway V1 and Gateway V2#48787

Open
jeet1995 wants to merge 47 commits intoAzure:mainfrom
jeet1995:squad/48094-readconsistency-header-propagation
Open

[Gateway V1/V2]: Enable ReadConsistencyStrategy for Gateway V1 and Gateway V2#48787
jeet1995 wants to merge 47 commits intoAzure:mainfrom
jeet1995:squad/48094-readconsistency-header-propagation

Conversation

@jeet1995
Copy link
Copy Markdown
Member

@jeet1995 jeet1995 commented Apr 12, 2026

Issue

Fixes #48094

Summary

Enable ReadConsistencyStrategy across all connection modes -- Direct (already working), Gateway V1 (compute gateway), and Gateway V2 (thin client proxy).

What changed

ReadConsistencyStrategy overrides ConsistencyLevel -- When a non-DEFAULT ReadConsistencyStrategy is set (at client or request level), ConsistencyLevel is stripped from the request. This prevents dual-header rejection by the compute gateway and proxy. Request-level RCS overrides client-level RCS.

GLOBAL_STRONG validation -- GLOBAL_STRONG is only valid on accounts with Strong as the default consistency level. The SDK now validates this client-side and throws BadRequestException at request time (fail-fast), matching the existing Direct mode behavior.

Key design decisions

Decision Rationale
Strip ConsistencyLevel when RCS is set Compute gateway rejects requests with both headers. RCS takes precedence per design (matches .NET SDK behavior).
DEFAULT is transparent -- never written as a header ReadConsistencyStrategy.DEFAULT means "use the account default". The SDK filters it out at the source (query/changefeed header-building) and defensively in downstream methods (isEffectiveSessionConsistency, resolveEffectiveConsistencyHeaders). This prevents "Default" from reaching the wire as a header string, which would break session-token logic (false negative on Session accounts) and cause dual-header rejection on Gateway V1.
GLOBAL_STRONG validation is client-side Fail-fast at request time, not after round-trip to server.
RNTBD token ID 0x00FE, Byte type Matches proxy RntbdConstants.cs L652 and proxy PR #2031635. Proxy expects C++ enum (int), not string.

Test coverage

RntbdReadConsistencyStrategyHeaderTests
Unit tests for the RNTBD ReadConsistencyStrategy token (ID 0x00FE, Byte type). Validates token metadata, byte encoding per strategy, encode/decode round-trip symmetry, and the full resolveEffectiveConsistencyHeaders -> wrapInHttpRequest pipeline for contention scenarios (both ConsistencyLevel and ReadConsistencyStrategy set). Uses RntbdRequest.decode() to verify the RNTBD frame produced by ThinClientStoreModel.

GatewayReadConsistencyStrategySpyWireTest
Spy wire tests that intercept the final HTTP request at the HTTP client layer for both Gateway V1 (HTTP/1) and Gateway V2 (HTTP/2 + thin client). V1 asserts HTTP headers; V2 decodes the RNTBD frame via RntbdRequest.decode() and validates :10250 endpoint routing. Covers no-contention, contention resolution, request-level overrides client-level, DEFAULT transparency, and write operations.

GatewayReadConsistencyStrategyE2ETest
End-to-end tests against a real account with two CosmosAsyncClient instances (Gateway V1 and Gateway V2). Validates that the server accepts and applies ReadConsistencyStrategy correctly across all request option surfaces: point reads, queries, readAll, changeFeed, readMany, client-level defaults, write operations, GLOBAL_STRONG validation, contention resolution, and operation policy overrides. V2 tests assert thin client endpoint usage via CosmosDiagnostics. Direct mode tests additionally assert quorumAckedLSN in diagnostics for LatestCommitted reads.

Tested against a Session consistency account (test4acct) on a thin-client-enabled federation (cdb-ms-test4-northeurope1-fe1).

Smoke test (Kusto validation) -- quorum reads for LatestCommitted

Ran 15 requests each for Read, Query, and ChangeFeed with ReadConsistencyStrategy.LATEST_COMMITTED across GW V1 and GW V2. Verified 1:2 client-to-backend ratio (quorum reads) via Kusto telemetry.

BackendEndRequest5M (proxy/compute → backend replicas):

BackendEndRequest5M
| where TIMESTAMP >= datetime(2026-04-29T16:20) and TIMESTAMP <= datetime(2026-04-29T16:30)
| where GlobalDatabaseAccountName == 'test4acct'
| where Res(ResourceType) == "Document"
| where Op(OperationType) in ("Read", "ReadFeed", "Query")
| extend Mode = iif(UserAgent contains "ThinProxy", "GW_V2", "GW_V1")
| summarize BackendReads = sum(SampleCount) by OperationType, Mode, StatusCode, ConsistencyLevel, DefaultConsistencyLevel
Op (OperationType) Mode StatusCode BackendReads ConsistencyLevel DefaultConsistencyLevel
2 (Read) GW_V1 200 30 0 (not set) Session
2 (Read) GW_V2 200 30 0 (not set) Session
15 (Query) GW_V1 200 30 0 (not set) Session
15 (Query) GW_V2 200 30 0 (not set) Session
3 (ReadFeed) GW_V1 200 30 0 (not set) Session
3 (ReadFeed) GW_V1 304 30 0 (not set) Session
3 (ReadFeed) GW_V2 200 30 0 (not set) Session
3 (ReadFeed) GW_V2 304 30 0 (not set) Session

ThinClientProxyRequest5M (client → proxy, GW V2):

ThinClientProxyRequest5M
| where TIMESTAMP >= datetime(2026-04-29T16:20) and TIMESTAMP <= datetime(2026-04-29T16:30)
| where globalDatabaseAccountName == 'test4acct'
| where resourceType == "Document"
| where operationType in ("Read", "ReadFeed", "Query")
| summarize ClientReads = sum(SampleCount) by operationType, statusCode
operationType statusCode ClientReads
Read 200 15
Query 200 15
ReadFeed 200 15
ReadFeed 304 15

ComputeRequest5M (client → compute, GW V1):

ComputeRequest5M
| where TIMESTAMP >= datetime(2026-04-29T16:20) and TIMESTAMP <= datetime(2026-04-29T16:30)
| where GlobalDatabaseAccountName == 'test4acct'
| where ResourceName == "Document"
| where OperationName in ("Read", "Query", "ChangeFeed/Incremental")
| summarize ClientReads = sum(SampleCount) by OperationName, StatusCode
OperationName StatusCode ClientReads
Read 200 15
Query 200 15
ChangeFeed/Incremental 200 15
ChangeFeed/Incremental 304 15

Quorum validation summary:

Operation Client requests (per mode) Backend reads (per mode) Ratio
Read 15 30 1:2 ✅
Query 15 30 1:2 ✅
ChangeFeed 30 (15×200 + 15×304) 60 (30×200 + 30×304) 1:2 ✅

ConsistencyLevel=0 (not set) on all backend entries confirms x-ms-consistency-level is correctly stripped when ReadConsistencyStrategy is active.

.NET SDK reference

PR #5685 -- .NET SDK ReadConsistencyStrategy implementation. HTTP header name and enum values match.

Proxy coordination

ADO PR #2031635 -- "Add ReadConsistencyStrategy for Proxy". Merged and deployed to test4 federations.


SDK Contribution checklist:

  • The pull request does not introduce [breaking changes]
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

@jeet1995 jeet1995 force-pushed the squad/48094-readconsistency-header-propagation branch from a3d4c0f to 26648a7 Compare April 16, 2026 00:36
@jeet1995 jeet1995 changed the title [Cosmos] Validate ReadConsistencyStrategy header propagation in thin client mode (#48094) [Cosmos] Enable ReadConsistencyStrategy for Gateway V1 and thin client (Gateway V2) modes (#48094) Apr 16, 2026
@jeet1995 jeet1995 changed the title [Cosmos] Enable ReadConsistencyStrategy for Gateway V1 and thin client (Gateway V2) modes (#48094) [Gateway V1/V2]: Enable ReadConsistencyStrategy. Apr 16, 2026
Copy link
Copy Markdown
Member

@FabianMeiswinkel FabianMeiswinkel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jeet1995
Copy link
Copy Markdown
Member Author

/azp run java - cosmos - tests

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@jeet1995
Copy link
Copy Markdown
Member Author

/azp run java - cosmos - tests

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@jeet1995
Copy link
Copy Markdown
Member Author

/azp run java - cosmos - tests

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

jeet1995 and others added 12 commits April 21, 2026 17:18
…n modes (Azure#48094)

- Add ReadConsistencyStrategy RNTBD header (0x00F0, String) to RntbdConstants
  and RntbdRequestHeaders for thin client proxy propagation
- Replace Gateway-mode warn+ignore with client-side GLOBAL_STRONG validation
  that works across all modes (direct, gateway, thin client)
- Update ReadConsistencyStrategy javadoc to reflect all-modes support
- Add unit tests for RNTBD token encoding and round-trip
- Add E2E tests for thin client and compute gateway ReadConsistencyStrategy

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…is set (Azure#48094)

RxGatewayStoreModel.applySessionToken() called RequestHelper.getReadConsistencyStrategyToUse()
which had a side-effect of rewriting x-ms-consistency-level header (e.g., LATEST_COMMITTED
mapped to BoundedStaleness). The compute gateway rejected this because BoundedStaleness is
stricter than the Session account default.

Fix: Use a copy of the headers map so the original x-ms-consistency-level is preserved.
Gateway/proxy now sees:
- x-ms-consistency-level: Session (original, unchanged)
- x-ms-cosmos-read-consistency-strategy: LatestCommitted (RCS intent)

Verified E2E: LATEST_COMMITTED, EVENTUAL, SESSION, client-level RCS all return 200.
GLOBAL_STRONG correctly throws BadRequestException on Session account.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…Azure#48094)

Compute gateway rejects requests containing both x-ms-consistency-level and
x-ms-cosmos-read-consistency-strategy headers. When RCS is non-DEFAULT, remove
the consistency-level header — RCS takes precedence.

Applied to both client-level and request-options-level RCS paths in
RxDocumentClientImpl.getRequestHeaders().

Verified E2E against test4 compute gateway (swkrish-session, Session account):
LATEST_COMMITTED, EVENTUAL, SESSION, client-level RCS all return 200.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…yte (Azure#48094)

The proxy expects ReadConsistencyStrategy as a Byte enum, not a String:
  Eventual=1, Session=2, LatestCommitted=3, GlobalStrong=4

With String type, the proxy couldn't parse the RNTBD frame and hung.
With Byte type, thin client reads work correctly through the proxy.

Added RntbdReadConsistencyStrategy enum to RntbdConstants matching
the proxy's ReadConsistencyStrategy.h enum values.

Verified E2E against test4 thin client proxy (swkrish-session):
SESSION, EVENTUAL, LATEST_COMMITTED all return 200 with tc=true.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…urations (Azure#48094)

Removed hardcoded database/container names. Tests now:
- Create a unique database and container in @BeforeClass
- Clean up in @afterclass
- Use TestConfigurations.HOST/MASTER_KEY from cosmos-v4.properties
- Use /pk as partition key path

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…zure#48094)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…t option types (Azure#48094)

Cover all 5 surfaces: ItemRequestOptions, QueryRequestOptions,
ChangeFeedRequestOptions, ReadManyRequestOptions, CosmosClientBuilder (client-level).
Plus write-ignored, GLOBAL_STRONG validation, and CL+RCS precedence tests.

Tests use dynamic database/container creation, TestConfigurations, serverless-safe.
Follows ThinClientTestBase pattern from PR Azure#47759.

Verified E2E: 21/21 PASS (10 thin client + 11 gateway V1) against
swkrish-session (test4, Session, North Europe).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The Cosmos DB RP now rejects API version 2022-08-15 for accounts
with EnableNoSQLVectorSearch capability, requiring 2023-04-15 or later.

Error: 'Please use api version 2022-02-15-preview or 2023-04-15 or later'
ActivityId: 3e0f30d8-548a-408f-9827-09c3b81a7166

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Verifying that the API version rejection was caused by EnableNoSQLVectorSearch.
This commit will be reverted after pipeline verification.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…QLVectorSearch (Azure#48094)

Cosmos DB RP now requires API version 2023-04-15+ for accounts with
EnableNoSQLVectorSearch capability. Confirmed by testing:
- 2022-08-15 + EnableNoSQLVectorSearch = 400 BadRequest
- 2022-08-15 without EnableNoSQLVectorSearch = passes
- 2023-04-15 + EnableNoSQLVectorSearch = passes

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add resolveEffectiveConsistencyHeaders() in RxGatewayStoreModel that
  strips x-ms-consistency-level when ReadConsistencyStrategy wins.
  Called before wrapInHttpRequest — affects both GW V1 (HTTP) and
  GW V2/ThinClientStoreModel (RNTBD).

- Fix contention bug in RxDocumentClientImpl.getRequestHeaders():
  options.getConsistencyLevel() no longer re-adds CL header when RCS
  is already present (Option A guard).

- Rules: request-level RCS > client-level RCS; RCS > ConsistencyLevel.
  Only one consistency header survives on the wire.

- 10 unit tests (ConsistencyFlagContentionTest): both-set, request-ctx
  priority, header-level, DEFAULT transparent, null, idempotency.

- 4 new E2E tests: request-level contention and request-level RCS
  override for both GW V1 and GW V2 paths.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jeet1995 jeet1995 force-pushed the squad/48094-readconsistency-header-propagation branch from 38a8292 to e00f5bb Compare April 21, 2026 21:18
…ire (Azure#48094)

Verify actual HTTP headers sent by the SDK using the SpyClientUnderTest
(Mockito spy on HttpClient) pattern from RequestHeadersSpyWireTest.

5 new tests:
- Request-level RCS: x-ms-cosmos-read-consistency-strategy on wire, CL stripped
- Client-level RCS: same header verification via builder-level config
- Both RCS + CL: RCS wins, CL stripped (contention resolution)
- DEFAULT RCS: no RCS header emitted (transparent)
- Write with client RCS: no RCS header on write operations

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jeet1995 jeet1995 force-pushed the squad/48094-readconsistency-header-propagation branch 2 times, most recently from d12a376 to f30f417 Compare April 21, 2026 23:01
…tests (Azure#48094)

- Fix existing RntbdReadConsistencyStrategyHeaderTests: update token type
  assertions from String to Byte, matching the proxy-compatible encoding
  (Eventual=0x01, Session=0x02, LatestCommitted=0x03, GlobalStrong=0x04)

- Add 7 new RNTBD spy-wire tests that simulate ThinClientStoreModel.wrapInHttpRequest():
  Build RntbdRequest from RxDocumentServiceRequest with RCS headers, encode
  to ByteBuf, and verify the RNTBD frame contains header 0x00F0 with correct
  byte value. Covers all 4 RCS strategies, absent RCS, and CL-stripped scenario.

- Fix GLOBAL_STRONG E2E test: disable with TODO — BadRequestException from
  validateReadConsistencyStrategy() is swallowed by availability strategy and
  does not propagate to the caller. Validation works (unit tests prove it).

- Add readConsistencyStrategyRntbdByteEnumValues test verifying the enum IDs.

All 42 tests pass: 11 GW E2E + 10 contention unit + 21 RNTBD unit.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jeet1995 jeet1995 force-pushed the squad/48094-readconsistency-header-propagation branch from f30f417 to ce56e3e Compare April 21, 2026 23:33
…Azure#48094)

Move 5 ReadConsistencyStrategy HTTP spy-wire tests from RequestHeadersSpyWireTest
(which extends TestSuiteBase and requires provisioned throughput) into a new
standalone ReadConsistencyStrategyHttpSpyWireTest class that creates its own
serverless-safe resources (no throughput). This allows the spy-wire tests to
run on serverless accounts (e.g., test4 Session accounts) without depending
on TestSuiteBase shared collections.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tance

- Remove TestSuiteBase inheritance to avoid @BeforeSuite shared container init
  which fails on serverless accounts (provisioned throughput not supported)
- Use TestSuiteBase.createCollection as static utility for no-throughput containers
- Set COSMOS.THINCLIENT_ENABLED system property in @BeforeClass for V2 detection
- Configure HTTP/2 via GatewayConnectionConfig.setHttp2ConnectionConfig for V2 builder
- Probe V2 availability at startup via diagnostics endpoint check
- All 30 tests pass (15 V1 + 15 V2) against swkrish-session

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jeet1995 jeet1995 force-pushed the squad/48094-readconsistency-header-propagation branch from 7332d73 to a275926 Compare April 27, 2026 21:35
- Move ReadConsistencyStrategySpyWireTest and ReadConsistencyStrategyE2ETest to com.azure.cosmos.rx for access to TestSuiteBase protected utilities
- Keep RntbdReadConsistencyStrategyHeaderTests in directconnectivity.rntbd (requires package-private RntbdToken/RntbdTokenType)
- Fix imports for all moved files
- E2E test uses TestSuiteBase.assertThinClientEndpointUsed via same-package access

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jeet1995 jeet1995 force-pushed the squad/48094-readconsistency-header-propagation branch from 8928e46 to 3f21d1d Compare April 27, 2026 21:56
jeet1995 and others added 6 commits April 27, 2026 18:04
- V2 routing is proven by :10250 endpoint assertion on each request, not by upfront probe
- Set COSMOS.THINCLIENT_ENABLED in @BeforeClass without save/restore (CI handles it)
- DataProvider always returns both V1 and V2 modes
- Remove hasThinClientEndpoint helper from E2E (use TestSuiteBase.assertThinClientEndpointUsed)
- Remove unused imports (CosmosDiagnosticsContext, CosmosDiagnosticsRequestInfo, Collection)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- ReadConsistencyStrategyE2ETest → GatewayReadConsistencyStrategyE2ETest
- ReadConsistencyStrategySpyWireTest → GatewayReadConsistencyStrategySpyWireTest

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…nit tests

- RntbdReadConsistencyStrategyHeaderTests now uses RntbdRequest.decode() for all
  RNTBD frame assertions, matching the spy wire test approach
- Eliminates false-positive risk from byte-pattern scanning in binary frames
- Removes containsRntbdHeaderWithByte, containsRntbdHeaderId, containsRntbdHeaderWithAnyValue

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Document precedence: RCS > ConsistencyLevel, request-level > client-level
- Note write operations always use DEFAULT
- List all connection mode support (Direct, Gateway, Gateway V2)
- Tighten enum value descriptions
- Remove internal details (HTTP header, RNTBD header)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jeet1995 jeet1995 marked this pull request as ready for review April 27, 2026 22:49
Copilot AI review requested due to automatic review settings April 27, 2026 22:49
@jeet1995 jeet1995 requested review from a team and kirankumarkolli as code owners April 27, 2026 22:49
jeet1995 and others added 2 commits April 27, 2026 18:52
…sistency-header-propagation

# Conflicts:
#	sdk/cosmos/azure-cosmos/CHANGELOG.md
- Add Direct connection mode as third transport in DataProvider
- All 45 tests pass (15 scenarios × 3 modes: GatewayV1, GatewayV2, Direct)
- Enables backend telemetry comparison across all three transports

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jeet1995
Copy link
Copy Markdown
Member Author

@sdkReviewAgent

@xinlian12
Copy link
Copy Markdown
Member

Review complete (43:19)

Posted 6 inline comment(s).

Steps: ✓ context, correctness, cross-sdk, design, history, past-prs, synthesis, test-coverage

jeet1995 and others added 5 commits April 28, 2026 19:43
- Fix isEffectiveSessionConsistency to treat RCS 'Default' as transparent
  (fall through to CL/account default instead of returning false)
- Fix resolveEffectiveConsistencyHeaders to strip 'Default' sentinel header
- Prevent 'Default' header from being written in query/changefeed paths
  (DocumentQueryExecutionContextBase, ChangeFeedQueryImpl)
- Add warning log for unknown ReadConsistencyStrategy values in RNTBD switch
- Add SESSION E2E tests: query, readAll, changeFeed, readMany, client-level

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE REQ] [Thin Client] Validate ReadConsistencyStrategy header propagation in thin client mode

3 participants