feat: autodetect DATE/DATETIME fields in CSV type inference #154
Merged
- Extend ColumnType enum with DATE, DATE_EU, DATE_US, DATETIME, DATETIME_EU, DATETIME_US sub-variants; all map to TEXT affinity in DDL
- Add displayName() method: DATE*/DATETIME* display as DATE/DATETIME in --columns, --validate, and --sample output instead of internal tag name
- Add isDate() / isDateTime() detectors (length-gated; no overlap)
- Add SlashOrder enum + accumSlashOrder() for DD/MM vs MM/DD disambiguation per column: d1>12→EU, d2>12→US, both≤12→abstain, contradictory→TEXT
- Rewrite inferTypes() with 11 tracking arrays; DATETIME>DATE>INTEGER>REAL>TEXT priority; mixed ISO+slash format or mixed date+datetime → TEXT fallback
- Add normalizeDateToIso() / normalizeDateTimeToIso() helpers that reformat EU/US/dash/T-separator values into YYYY-MM-DD / YYYY-MM-DD HH:MM:SS
- Update insertRowTyped() with 6 new ColumnType cases; stack-buffer bind uses sqliteTransient() (SQLITE_TRANSIENT sentinel via @setRuntimeSafety(false))
- Add sqliteTransient() fn in sqlite.zig (replaces unrepresentable const)
- Add loader unit test binary to build.zig (unit-test step)
- Add 15 date/datetime integration tests (140-154) covering ISO, EU dash, EU slash, US slash, ambiguous, --columns, --validate, ORDER BY, --no-type-inference
- All 154 integration tests + CSV/XML/loader unit tests pass; ziglint clean
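
The per-column slash disambiguation rule above (d1>12→EU, d2>12→US, both≤12→abstain, contradictory→TEXT) can be illustrated with a short Python sketch. This is not the project's Zig code: the state names, the `column_order` helper, and the handling of a value where both fields exceed 12 are assumptions made for illustration.

```python
from enum import Enum


class SlashOrder(Enum):
    UNKNOWN = "unknown"    # no decisive value seen yet
    EU = "eu"              # DD/MM/YYYY
    US = "us"              # MM/DD/YYYY
    CONFLICT = "conflict"  # contradictory evidence: column falls back to TEXT


def accum_slash_order(state, value):
    """Fold one 'A/B/YYYY...' value into the column's running state."""
    if state is SlashOrder.CONFLICT:
        return state
    d1, d2 = (int(part) for part in value.split("/")[:2])
    if d1 > 12 and d2 > 12:
        # Assumption: neither field can be a month, so treat as a conflict.
        return SlashOrder.CONFLICT
    if d1 > 12:
        vote = SlashOrder.EU   # first field must be a day
    elif d2 > 12:
        vote = SlashOrder.US   # second field must be a day
    else:
        return state           # both <= 12: abstain
    if state is SlashOrder.UNKNOWN or state is vote:
        return vote
    return SlashOrder.CONFLICT


def column_order(values):
    """Hypothetical helper: run the accumulator over a whole column."""
    state = SlashOrder.UNKNOWN
    for value in values:
        state = accum_slash_order(state, value)
    return state
```

A column such as `01/02/2024, 25/12/2024` resolves to EU because only the second value is decisive. What a column resolves to when every value abstains is not stated in this commit message, so the sketch simply reports `UNKNOWN`.
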
…e variants
- EU-dash DD-MM-YYYY date → DATE
- Mixed ISO + EU-dash dates → DATE (bind-time distinction)
- Slash datetime with d1>12 → DATETIME_EU
- Slash datetime with d2>12 → DATETIME_US
- README: update type inference description to list DATE/DATETIME
- README: add La Liga season-lengths real-world example (julianday arithmetic on auto-detected DATE column, COVID and World Cup anomalies)
- README: update 'Date range filter' recipe and 'How it works' section
- man page: add DATE/DATETIME to DESCRIPTION, --columns, and --sample entries
…nit test
- README: fix julianday(2020-07-19)-julianday(2019-08-16) = 338, not 337
- loader.zig: add unit test for d_has_nonslash && d_has_slash -> TEXT path (ISO date + slash date in same column falls back to TEXT)
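
The corrected day count in the README fix can be checked independently. The following Python sketch uses the standard-library `sqlite3` module against an in-memory database; it is a verification aid, not part of the PR, and the date literals are quoted as SQL strings here.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")

# julianday() takes ISO 8601 text, so both dates are passed as quoted strings.
(days,) = conn.execute(
    "SELECT julianday('2020-07-19') - julianday('2019-08-16')"
).fetchone()

# Cross-check with Python's own calendar arithmetic.
python_days = (date(2020, 7, 19) - date(2019, 8, 16)).days

print(days, python_days)
```

Both computations give 338 days; the span includes 29 February 2020.
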
Summary
Closes #142
Extends type inference to detect DATE and DATETIME columns in CSV input and normalizes values to ISO 8601 on insert, enabling SQLite date functions (`date()`, `strftime()`, etc.) to work correctly on imported data.

What's new
- `ColumnType` sub-variants (DATE, DATE_EU, DATE_US, DATETIME, DATETIME_EU, DATETIME_US) carrying format info to bind time without extra parallel arrays
- `isDate()` and `isDateTime()` detectors for YYYY-MM-DD, DD-MM-YYYY, DD/MM/YYYY, MM/DD/YYYY, YYYY-MM-DD HH:MM:SS, YYYY-MM-DDTHH:MM:SS, DD/MM/YYYY HH:MM, MM/DD/YYYY HH:MM
- Values normalized on insert to `YYYY-MM-DD` or `YYYY-MM-DD HH:MM:SS`
- `--no-type-inference` backward compatibility maintained: dates stay as raw TEXT
- Unit tests in `src/loader.zig` for all detection, disambiguation, normalization, and inference paths

Definition of Done
- Tests pass (`zig build test -Dbundle-sqlite=true` + `zig build unit-test -Dbundle-sqlite=true`)
- Lint passes (`ziglint src build.zig` clean)
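
To make the normalization step in the description concrete, here is a minimal Python sketch of reformatting the listed input formats into `YYYY-MM-DD` and `YYYY-MM-DD HH:MM:SS`. It is an illustration only, not the Zig helpers from this PR: the function names and the `slash_order` parameter are hypothetical, inputs are assumed to be zero-padded and already validated by a detector, and padding `HH:MM` with `:00` seconds is an assumption.

```python
def normalize_date_to_iso(value, slash_order="eu"):
    """Return a date as YYYY-MM-DD.

    slash_order ("eu" or "us") only matters for slash-separated dates.
    """
    if "/" in value:
        first, second, year = value.split("/")
        day, month = (first, second) if slash_order == "eu" else (second, first)
    elif value[4] == "-":
        return value                      # already YYYY-MM-DD
    else:
        day, month, year = value.split("-")  # EU dash: DD-MM-YYYY
    return f"{year}-{month}-{day}"


def normalize_datetime_to_iso(value, slash_order="eu"):
    """Return a datetime as YYYY-MM-DD HH:MM:SS."""
    # Accept either a space or a 'T' between the date and time parts.
    date_part, time_part = value.replace("T", " ", 1).split(" ", 1)
    if len(time_part) == 5:
        time_part += ":00"                # assumed padding for HH:MM input
    return f"{normalize_date_to_iso(date_part, slash_order)} {time_part}"


raw = ["25/12/2024", "03/01/2024", "14/07/2024"]
iso_sorted = sorted(normalize_date_to_iso(v, "eu") for v in raw)
```

After normalization, plain text comparison matches chronological order (`iso_sorted` runs from January to December), which is why ORDER BY and SQLite's date functions behave sensibly on TEXT-affinity columns holding ISO 8601 values.
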