Skip to content

re-work get_multi_columns#298

Draft
gordthompson wants to merge 6 commits into
cockroachdb:masterfrom
gordthompson:issue_297
Draft

re-work get_multi_columns#298
gordthompson wants to merge 6 commits into
cockroachdb:masterfrom
gordthompson:issue_297

Conversation

@gordthompson
Copy link
Copy Markdown
Collaborator

@gordthompson gordthompson commented Apr 12, 2026

Fixes: #297
Fixes: #303

@gordthompson
Copy link
Copy Markdown
Collaborator Author

ping @rafiss

Copy link
Copy Markdown
Contributor

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this! I had a few questions/comments about this, some due to my lack of familiarity.

Also some potential additional test cases that might be useful to add:

  • A few type mapping tests: BYTEA→BLOB, NUMERIC→DECIMAL precision/scale propagation, REAL→FLOAT, SMALLINT→INTEGER, and all 8 ARRAY variants (those mutate fields in place, so worth having some specific testing for that).
  • Non-default schema path for get_multi_columns(schema=...).
  • Computed column reflection (Computed("a + b")). The old code emitted computed={sqltext, persisted}; so we could verify that happens still

Comment thread sqlalchemy_cockroachdb/base.py
Comment thread sqlalchemy_cockroachdb/base.py Outdated
Comment thread test/test_column_reflect.py
# Check if a sequence is being used and adjust the default value.
autoincrement = False
if default is not None:
nextval_match = re.search(r"""(nextval\(')([^']+)('.*$)""", default)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the old code rewrote bare nextval('seq') to become nextval('"schema".seq') when a schema was provided. The new code drops it entirely, unless I missed somehing.

Was it intentional? If so, a note in the commit message explaining why would be useful. And possibly even a mention in CHANGES.md?

Tools that depend on schema-qualified sequence names might get affected.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, cockroachdb/cockroach#170049 should take care of that.

(None, "spatial_ref_sys"),
]:
table_columns = (
connection.execute(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is doing a per-table information_schema.columns lookup inside the loop, on top of the one PGDialect already ran. Could we replace with one bulk SELECT keyed by (schema, table) that we do outside the loop and add to a Python dict?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to keep it simple for now. We can re-visit this if user feedback (re: performance) warrants.

Comment thread sqlalchemy_cockroachdb/base.py
Comment thread sqlalchemy_cockroachdb/base.py
self._test("inet", INET)


class UnknownTypeTest(fixtures.TestBase):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we no longer need this test?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That test was really to verify _type_map

_type_map = {
"bool": sqltypes.BOOLEAN, # introspection returns "BOOL" not boolean
"boolean": sqltypes.BOOLEAN,
"bigint": sqltypes.INT,

for completeness. The previous version of get_multi_columns() replaced the method in PGDialect, and it used _type_map to infer more generic column types from specific column types. The new version of get_multi_columns() extends the method in PGDialect and it doesn't use _type_map. Our extended version really just

  • handles hidden columns, and
  • enforces legacy type mappings like BIGINT → INT, which are necessary to prevent existing users of Alembic from getting hammered with spurious "changes" of column type.

In fact, now it appears that _type_map is no longer used at all, and should be removed too.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, now it appears that _type_map is no longer used at all, and should be removed too.

Done.

@gordthompson
Copy link
Copy Markdown
Collaborator Author

Switching this to draft pending resolution of cockroachdb/cockroach#170049

cc: @rafiss

@gordthompson gordthompson marked this pull request as draft May 11, 2026 13:23
@gordthompson
Copy link
Copy Markdown
Collaborator Author

re:

A few type mapping tests: BYTEA→BLOB, NUMERIC→DECIMAL precision/scale propagation, REAL→FLOAT, SMALLINT→INTEGER, and all 8 ARRAY variants (those mutate fields in place, so worth having some specific testing for that).

We have tests like that in test/test_introspection.py . In fact, they are what tipped me off to the need tor our own type mapping in the first place.

Computed column reflection (Computed("a + b")). The old code emitted computed={sqltext, persisted}; so we could verify that happens still

Done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

get_columns() does not reflect Identity columns SAWarning "Could not parse type name 'USER-DEFINED'" when reflecting CockroachDB enum

2 participants