-
Notifications
You must be signed in to change notification settings - Fork 39
Description
Hi,
I'm migrating a product codebase that uses SQLAlchemy mssql+pyodbc - on Azure Functions; primarily due to SIGSEGV crashes, traced back to likely runtime/driver issues.
So far so good - however the refactor has been significant due to UUID handling behaviours (which are extensively used in the ORM) - pyodbc returns string by default (v2 flags via native_uuid).
Is your feature request related to a problem? Please describe.
mssql-python always returns uuid.UUID objects from UNIQUEIDENTIFIER columns. For codebases migrating from pyodbc (which returns strings by default), this causes widespread breakage across every codepath that touches a GUID:
- AttributeErroe: existing code calls .strip(), .upper(), .replace() on values that are now uuid.UUID, not str
- TypeError in JSON serialization: uuid.UUID is not JSON-serializable by default; every API response, queue message, and telemetry payload that includes a GUID fails
- Silent equality mismatches: "ABC..." == UUID("ABC...") is False, causing lookups, dictionary keys, and set membership checks to silently break
- SQLAlchemy result processor crash: SQLAlchemy's _python_UUID helper calls value.replace("-", "") assuming a string input, raising AttributeError when it receives a uuid.UUID object (this is the same class of bug as pymssql might support native UUID, set attribute sqlalchemy/sqlalchemy#9414 which was filed for pymssql)
This is not a niche concern, pyodbc is the dominant Python MSSQL driver, so most codebases migrating to mssql-python will hit this. In my production codebase (~700 UNIQUEIDENTIFIER columns, ~90 files touching GUIDs), adopting mssql-python required a global monkey-patch on SQLAlchemy internals and an extensive audit of every string operation on GUID values.
Describe the solution you'd like
A native_uuid parameter on connect() that controls whether UNIQUEIDENTIFIER columns are returned as uuid.UUID objects or str:
Current behavior preserved (default): returns uuid.UUID objects
conn = mssql_python.connect(conn_str, native_uuid=True)Migration-friendly: returns str (hyphenated, e.g. "110e2700-9d34-44e9-ba0e-bd74401a54a4")
conn = mssql_python.connect(conn_str, native_uuid=False)Both modes should continue to accept uuid.UUID objects as bind parameters (input handling is already correct).
The implementation would be minimal: in the result-row processing where UUID(bytes_le=raw_bytes) is already constructed:
val = uuid.UUID(bytes_le=raw_bytes)
return val if self._native_uuid else str(val)This should apply consistently across fetchone(), fetchmany(), and fetchall() (the inconsistency from #241 was already fixed).
Describe alternatives you've considered
- Monkey-patching SQLAlchemy internals: This is what we currently do. We patch sqlalchemy.sql.sqltypes._python_UUID to short-circuit when it receives a uuid.UUID object instead of a string. It works but is fragile (depends on an internal API that could be renamed/removed) and doesn't help non-SQLAlchemy consumers.
My current workaround in production
original = sqltypes._python_UUID
def _python_uuid(value):
if isinstance(value, UUID):
return value # short-circuit: already a UUID
return original(value)
sqltypes._python_UUID = _python_uuid- Application-level str() coercion everywhere: Wrapping every DB result in str() at the repository layer. This is error-prone at scale (easy to miss a codepath) and defeats the purpose of having typed UUID support.
- SQLAlchemy TypeDecorator: A custom column type that overrides process_result_value to stringify. This works for ORM queries but doesn't cover text() queries or raw cursor usage.
- Waiting for the official SQLAlchemy dialect: The dialect being developed (referenced in coercion of UNIQUEIDENTIFIER / uuid values from binary to string only works with fetchone(), not fetchall() or fetchmany() #241) will presumably set supports_native_uuid = True, which solves the SQLAlchemy layer. But it doesn't help raw DB-API consumers, Pydantic serialization, or non-SQLAlchemy frameworks (Django, SQLModel, etc.).
Additional context
Precedent: pyodbc addressed this exact problem with its native_uuid flag (mkleehammer/pyodbc#177, added in v4.0.9). The default is False (strings), with True opting into uuid.UUID objects. mssql-python could use the inverse default (True) since it's a newer driver, while still offering the escape hatch.
Ecosystem pattern:
| Driver | UNIQUEIDENTIFIER return type | Configurable? |
|---|---|---|
| pyodbc | str (default) / uuid.UUID (opt-in) | Yes — pyodbc.native_uuid |
| pymssql 2.x | uuid.UUID (always) | No |
| mssql-python | uuid.UUID (always) | No (this request) |
Migration context: mssql-python positions itself as a modern, Microsoft-supported alternative to pyodbc. The primary adoption path is pyodbc → mssql-python. A compatibility flag would significantly lower the migration
barrier, teams can switch drivers first, then migrate UUID handling incrementally, rather than requiring a big-bang refactor of every GUID touchpoint.
Related issues:
- coercion of UNIQUEIDENTIFIER / uuid values from binary to string only works with fetchone(), not fetchall() or fetchmany() #241: fetchone() vs fetchall() UUID coercion inconsistency (fixed in v0.12.0/v0.13.0)
- pymssql might support native UUID, set attribute sqlalchemy/sqlalchemy#9414 - pymssql native UUID causing the same AttributeError pattern