Skip to content

fix(connectors): unify extract_column_value into single free function#3199

Open
atharvalade wants to merge 1 commit into
apache:masterfrom
atharvalade:fix/unify-extract-column-value
Open

fix(connectors): unify extract_column_value into single free function#3199
atharvalade wants to merge 1 commit into
apache:masterfrom
atharvalade:fix/unify-extract-column-value

Conversation

@atharvalade
Copy link
Copy Markdown
Contributor

Which issue does this PR close?

Closes #3172

Rationale

Bug fixes or new type support added to one copy of extract_column_value silently won't apply to the other, causing divergence between sequential and parallel code paths.

What changed?

The Postgres source connector had extract_column_value as a method on PostgresSource (&self), making it impossible to call from static/parallel contexts without duplicating the function. Any future parallel/chunked path would need its own copy.

Converted it to a single free function callable from any context. Added missing DATE (via chrono::NaiveDate) and BPCHAR type handling that the parallel copy on the benchmark branch had but this version lacked.

Local Execution

  • Passed
  • Pre-commit hooks ran

AI Usage

  1. Opus 4.6
  2. exploration and implementation guidance
  3. Verified via cargo check, clippy -D warnings, fmt --check, all 19 unit tests passing, CI lint scripts passing
  4. Yes, all code can be explained

@atharvalade atharvalade changed the title fix(postgres): unify extract_column_value into single free function fix(connectors): unify extract_column_value into single free function Apr 29, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 29, 2026

Codecov Report

❌ Patch coverage is 0% with 101 lines in your changes missing coverage. Please review.
✅ Project coverage is 18.33%. Comparing base (2337883) to head (d8fc00e).
⚠️ Report is 3 commits behind head on master.

Files with missing lines Patch % Lines
core/connectors/sources/postgres_source/src/lib.rs 0.00% 101 Missing ⚠️

❌ Your patch check has failed because the patch coverage (0.00%) is below the target coverage (50.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@              Coverage Diff              @@
##             master    #3199       +/-   ##
=============================================
- Coverage     74.18%   18.33%   -55.86%     
  Complexity      943      943               
=============================================
  Files          1200     1198        -2     
  Lines        109645    94898    -14747     
  Branches      86534    71804    -14730     
=============================================
- Hits          81345    17396    -63949     
- Misses        25553    77084    +51531     
+ Partials       2747      418     -2329     
Components Coverage Δ
Rust Core 0.89% <0.00%> (-74.54%) ⬇️
Java SDK 58.44% <ø> (ø)
C# SDK 69.13% <ø> (-0.31%) ⬇️
Python SDK 81.43% <ø> (ø)
Node SDK 91.44% <ø> (ø)
Go SDK 39.91% <ø> (ø)
Files with missing lines Coverage Δ
core/connectors/sources/postgres_source/src/lib.rs 40.82% <0.00%> (-27.08%) ⬇️

... and 685 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@atharvalade atharvalade force-pushed the fix/unify-extract-column-value branch 2 times, most recently from 6a3e441 to 08153c5 Compare April 29, 2026 18:07
@lukaszzborek
Copy link
Copy Markdown
Contributor

lukaszzborek commented Apr 29, 2026

@atharvalade
In issue is Postgres source connector has two nearly identical functions and to remove one. But i don't see second one.
I'm jus asking for clarification, if i don't see something

Also remeber about #3196 where you also add Date

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 7, 2026

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs.

If you need a review, please ensure CI is green and the PR is rebased on the latest master. Don't hesitate to ping the maintainers - either @core on Discord or by mentioning them directly here on the PR.

Thank you for your contribution!

@github-actions github-actions Bot added S-stale Inactive issue or pull request and removed S-stale Inactive issue or pull request labels May 7, 2026
@github-actions github-actions Bot added the S-waiting-on-review PR is waiting on a reviewer label May 14, 2026
@hubcio
Copy link
Copy Markdown
Contributor

hubcio commented May 14, 2026

/author

@github-actions github-actions Bot added S-waiting-on-author PR is waiting on author response and removed S-waiting-on-review PR is waiting on a reviewer labels May 14, 2026
@github-actions
Copy link
Copy Markdown

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs.

If you need a review, please ensure CI is green and the PR is rebased on the latest master. Don't hesitate to ping the maintainers - either @core on Discord or by mentioning them directly here on the PR.

Thank you for your contribution!

@github-actions github-actions Bot added the S-stale Inactive issue or pull request label May 22, 2026
@atharvalade
Copy link
Copy Markdown
Contributor Author

@atharvalade In issue is Postgres source connector has two nearly identical functions and to remove one. But i don't see second one. I'm jus asking for clarification, if i don't see something

Also remeber about #3196 where you also add Date

Good catch, the second copy (extract_column_value_static) lives on the feat/pg-to-iceberg-benchmark branch (not yet merged to master), where the parallel/chunked polling paths were added. This PR preemptively converts the method to a free function so that when those paths land, they can call the same function instead of duplicating it. I'll drop the DATE addition here since #3196 already covers it, and rebase on latest master.

@github-actions github-actions Bot removed the S-stale Inactive issue or pull request label May 22, 2026
@atharvalade atharvalade force-pushed the fix/unify-extract-column-value branch from 08153c5 to d8fc00e Compare May 22, 2026 05:18
@atharvalade
Copy link
Copy Markdown
Contributor Author

/ready

@github-actions github-actions Bot added S-waiting-on-review PR is waiting on a reviewer and removed S-waiting-on-author PR is waiting on author response labels May 22, 2026
Comment thread Cargo.lock
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no reason to include these changes in this PR.

@github-actions github-actions Bot added S-waiting-on-author PR is waiting on author response and removed S-waiting-on-review PR is waiting on a reviewer labels May 22, 2026
.map(serde_json::Value::from)
.unwrap_or(serde_json::Value::Null))
}
"VARCHAR" | "TEXT" | "CHAR" | "BPCHAR" => {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BPCHAR is the only new behavior in this PR but nothing exercises it. codecov flags 0% patch coverage (101 missing lines). a single round-trip test reading a char(n) column closes the gap and prevents the same divergence the issue is trying to prevent.

if possible, just make it part of some existing tests instead of making a new tc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-waiting-on-author PR is waiting on author response

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Two diverging copies of extract_column_value

3 participants