Skip to content

Comments

remove query comments #560#749

Closed
dev-lew wants to merge 807 commits intopgdogdev:mainfrom
dev-lew:dev-lew/Remove-query-comments-#560
Closed

remove query comments #560#749
dev-lew wants to merge 807 commits intopgdogdev:mainfrom
dev-lew:dev-lew/Remove-query-comments-#560

Conversation

@dev-lew
Copy link

@dev-lew dev-lew commented Feb 4, 2026

This PR addresses #560

It modifies the comment function (now parse_comment), which will capture a role and shard and then return the original query string with all comments removed except for those that match 0 or more regexes.

If there is a cache miss, we do another cache lookup after removing comments except for those used for pgdog metadata (pgdog_shard, etc) in the hope that we can increase cache hit rate.

Two associated helper functions were added as well.

Also, some tests were modified to reflect the new functionality.

levkk and others added 30 commits September 12, 2025 10:48
* Deliver NOTIFY outside of transactions only

* handle transaction error
* Numeric rough draft, includes binary protocol

* better matching logic for sign, clean up underscores

* pin the numeric tower implemetation work for now

* plan updates

* add float and double internal types for binary float4 and float8

* plan updates for numeric

* Remove planfiles

* float and numeric python tests

* Clean up clippy issues

* ensure sort test table is sharded

* Numeric: fix index too large when taking dscale digits by subtracting added digits

* Update warning message (numeric.rs)

Co-authored-by: Lev Kokotov <levkk@users.noreply.github.com>

* Update warning message (numeric.rs)

Co-authored-by: Lev Kokotov <levkk@users.noreply.github.com>

* Update warning caps (numeric.rs)

Co-authored-by: Lev Kokotov <levkk@users.noreply.github.com>

* Update warning capitalization (numeric.rs)

Co-authored-by: Lev Kokotov <levkk@users.noreply.github.com>

* Numeric: add NaN support and tests for error cases integration revealed

* Update vectors to f32, clean up double/float with tests

---------

Co-authored-by: Lev Kokotov <levkk@users.noreply.github.com>
* Update CLAUDE.md with new test commands and guidelines

* Clarify TDD workflow and test execution instructions

* Update CLAUDE.md with important coding guidelines

* Update CLAUDE.md with coding principles and workflow
* SIMD the vector calculations

* benchmarks and better unrolling/vector access

* Better memory usage in Vector implementation

* Update pgdog/src/frontend/router/sharding/distance_simd_rust.rs

* Update pgdog/src/frontend/router/sharding/vector.rs

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Move parsing outside lock lifetime

* Adjust test to match new default
* Handle cross-shard errors

* clippy

* Add tests

* Cross-shard check

* Ah!

* handle error

* refactor

* test

* fmt
* Process extended protocol messages in the right order

* Fix tests

* test

* test name
* Handle Close at the end of the request

* Handle close differently

* handle close
* Ignore DISCARD

* fix test + rubocop
* Add /* pgdog_role */ comment

* aha

* huh

* add test

* switch prepared statements to readwrite lock

* faster
* Handle SQLAlchemy empty queries better

* remove

* ok
* server_lifetime setting

* Human duration
* Fix comments with SET

* Add test case
* More java tests!

* set test
* generate coverage + codecov

* test arguments

* correct token

* coverage in integration?

* integration coverage fixes

* merge down to a single test run instead of test + coverage separately

* integration troubleshooting

* ensure cov generation uses --release

* remove redundant build

* explicit checks for coverage changes and manually kill pgdog before coverage run

* try stopping between sub-suites

* debug output in verify profiles

* pour on the debug logging

* further debugging

* path bug

* search anywhere for profiles

* whee!

* fail to bail when profiles havent changed

* add a new test to produce a diff

* add bigint binary test for more coverage there (just to check coverage is working)

* add codecov config, basic test for arrays
…AST, RewriteEngine that produces ghost columns (pgdogdev#502)

* Initial AVG + expression parser
AVG function implementation:
Detects paired count+avg functions on the same expression
Expression parser handles logic of "same expression" by assigning id through interning
Adds "ghost" column using rewriting if no count is paired with avg
removes "ghost" column from returned value of aggregates

* rollback transactions correctly

* Remove redundant `.to_owned()`

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* PR feedback - document distinct sort, avg with casts

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Add pure-sql comparison run to integration

* rework sql to be a bit cleaner, one query per file

* working sql suite, fully specified DSNs with gssenc disabled

* CI rework

* each integration in its own run

* revert to shared suite, distributing artifacts is colossally slow

* stub GSSEnc for psycopg

* try draining and fixing out of sync errors

* more draining

* move python to restart the server before run, tweak logic change around errored connections

* revert ensure_in_sync changes

* ok I guess we dont get to keep it running
dev-lew and others added 17 commits February 8, 2026 21:14
…ents in one query (pgdogdev#760)

### Race condition fix

Fixes race condition with sequelize and other similar ORMs that do the
following:

```
Parse
Describe
Flush
Bind
Execute
Flush
```

We treated this as query is finished and let our pool logic send `Sync`
asynchronously as a cleanup step. Meanwhile, Sequelize would send `Sync`
separately and we would immediately return `ReadyForQuery`, but the
cleanup is async and would happen _after_ we responded with
`ReadyForQuery`. The client then thought a record was created before it
was actually saved in Postgres. This is a likely fix for pgdogdev#685 cc
@mkreidenweis-schulmngr

### Multiple `SET` statements

Sequelize also sends multiple `SET` commands in one query. We handle
this now correctly, although it's not related to the race condition.
…inished sending request (pgdogdev#762)

- chore: add a bunch of sequelize and prisma tests
- fix: close server connections that have partially sent client requests
- those are harder to recover since we don't know which messages have
been successfully processed by the server
We were incorrectly releasing server back into the pool after receiving:

```
Execute(limit=x)
Flush
```

Not entirely sure what the implication is since the connection cleanup
logic kicked in by sending `Sync` and closing the transaction, but this
does have a smell of something was handled incorrectly.
…nt_connection_recovery setting (pgdogdev#764)

- fix: put `Sync` into its own `ClientRequest`. This affects sharded
spliced requests only (multiple `Execute`/`Bind` in one request).
Previously, Sync was only attached to the last request only. Now, it's
sent individually to all shards, making sure all shards execute
requests.
- feat: add `client_connection_recovery` setting. Allows to disconnect
clients on hard to recover connection errors, like pool timeouts.

```toml
[general]
client_connection_recovery = "drop"
```
Standardized the returned query if there are comments.
We prepend and sort the metadata comments so they are consistent.
Standardized the returned query if there are comments.
We prepend and sort the metadata comments so they are consistent.
…gdogdev#766)

- add a hook for enterprise edition when schema change is detected
- make the schema serializable, so we can send it over the wire
- feat: load foreign key constraints in schema on startup
- feat: add `load_schema` general setting: on/off/auto
Everytime an AST is created, call parse_comment
Everytime an AST is created, call parse_comment
@dev-lew dev-lew marked this pull request as ready for review February 14, 2026 20:29
dev-lew and others added 6 commits February 14, 2026 15:30
Simply always attempt to parse comments for non-parametrized
route functions
Simply always attempt to parse comments for non-parametrized
route functions
…ew/pgdog into dev-lew/Remove-query-comments-#560
…ew/pgdog into dev-lew/Remove-query-comments-#560

let result = remove_comments(query, &tokens, Some(&[&SHARD])).unwrap();

assert_eq!(result, "pgdog_shard: 4SELECT * FROM table WHERE id = 1");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not correct, it should be:

/* pgdog_shard: 4 */ SELECT * FROM table WHERE id = 1;


let result = remove_comments(query, &tokens, Some(&[&ROLE, &SHARD])).unwrap();

assert_eq!(result, "pgdog_role: primarypgdog_shard: 4SELECT 1 + 2 ;");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above. This query isn't valid SQL.

If there is duplicate whitespace due to comment removal,
remove it. Also, prepend the metadata string as a comment.
Finally, make sure metadata is not a Vec filled with empty strings.
If there is duplicate whitespace due to comment removal,
remove it. Also, prepend the metadata string as a comment.
Finally, make sure metadata is not a Vec filled with empty strings.
…ew/pgdog into dev-lew/Remove-query-comments-#560
@dev-lew dev-lew closed this Feb 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.