FEAT: mssql-python driver changes to support bulk copy logging.#430
FEAT: mssql-python driver changes to support bulk copy logging.#430subrata-ms wants to merge 5 commits intomainfrom
Conversation
📊 Code Coverage Report
Diff CoverageDiff: main...HEAD, staged and unstaged changes
Summary
mssql_python/logging.pyLines 146-155 146 class CSVFormatter(logging.Formatter):
147 def format(self, record):
148 # Check if this is from Rust (via rust_log method)
149 if hasattr(record, 'funcName') and record.funcName == "rust":
! 150 source = "Rust"
! 151 message = record.getMessage()
152 else:
153 # Extract source from message (e.g., [Python] or [DDBC])
154 msg = record.getMessage()
155 if msg.startswith("[") and "]" in msg:Lines 156-165 156 end_bracket = msg.index("]")
157 source = msg[1:end_bracket]
158 message = msg[end_bracket + 2 :].strip() # Skip '] '
159 else:
! 160 source = "Unknown"
! 161 message = msg
162
163 # Format timestamp with milliseconds using period separator
164 timestamp = self.formatTime(record, "%Y-%m-%d %H:%M:%S")
165 timestamp_with_ms = f"{timestamp}.{int(record.msecs):03d}"Lines 340-354 340 msg: Message string (already formatted)
341 filename: Source filename (e.g., 'cursor.rs')
342 lineno: Line number in source file
343 """
! 344 try:
! 345 if not self._logger.isEnabledFor(level):
! 346 return
347
348 # Create a custom LogRecord with Rust source location
! 349 import logging as log_module
! 350 record = log_module.LogRecord(
351 name=self._logger.name,
352 level=level,
353 pathname=filename,
354 lineno=lineno,Lines 357-371 357 exc_info=None,
358 func="rust",
359 sinfo=None
360 )
! 361 self._logger.handle(record)
! 362 except Exception:
363 # Fallback - use regular logging
! 364 try:
! 365 self._logger.log(level, msg)
! 366 except:
! 367 pass
368
369 def _log(self, level: int, msg: str, add_prefix: bool = True, *args, **kwargs):
370 """
371 Internal logging method with exception safety.📋 Files Needing Attention📉 Files with overall lowest coverage (click to expand)mssql_python.pybind.logger_bridge.hpp: 58.8%
mssql_python.pybind.logger_bridge.cpp: 59.2%
mssql_python.row.py: 66.2%
mssql_python.pybind.ddbc_bindings.cpp: 69.3%
mssql_python.pybind.ddbc_bindings.h: 69.7%
mssql_python.pybind.connection.connection.cpp: 75.3%
mssql_python.ddbc_bindings.py: 79.6%
mssql_python.pybind.connection.connection_pool.cpp: 79.6%
mssql_python.logging.py: 81.2%
mssql_python.connection.py: 84.1%🔗 Quick Links
|
There was a problem hiding this comment.
Pull request overview
This pull request enhances the bulk copy operation in mssql_python with comprehensive logging infrastructure to improve troubleshooting and observability. The changes introduce performance-optimized logging with cached level checks, a new method for Rust core integration, and extensive debug logging throughout the bulk copy workflow.
Changes:
- Added cached logging level (_cached_level and _is_debug_enabled) to avoid repeated isEnabledFor() calls in performance-critical paths
- Implemented py_core_log() method to support structured logging from Rust/TDS core with custom source location tracking
- Enhanced _bulkcopy method with detailed debug and info logging at all major stages (validation, connection setup, execution, cleanup), all gated by is_debug_enabled checks for minimal performance impact when disabled
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| mssql_python/logging.py | Adds cached logging level for fast checks, implements py_core_log for Rust integration, updates _log to use cached level and proper stacklevel, and updates CSVFormatter to handle py-core records |
| mssql_python/cursor.py | Adds comprehensive logging throughout _bulkcopy method including parameter validation, connection setup, column mappings, execution, results, and cleanup - all conditionally gated by is_debug_enabled |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Set level (atomic operation, no lock needed) | ||
| self._logger.setLevel(level) | ||
|
|
||
| # Cache level for fast checks (avoid repeated isEnabledFor calls) |
There was a problem hiding this comment.
The cached level variables (_cached_level and _is_debug_enabled) are updated without synchronization, which could lead to thread safety issues. While reading these primitives is generally atomic in CPython due to the GIL, there's a potential race condition: if one thread reads _is_debug_enabled right after _cached_level is updated but before _is_debug_enabled is updated on line 493, it could see an inconsistent state (e.g., _cached_level=DEBUG but _is_debug_enabled=False). Consider updating both values atomically or documenting that this benign race condition is acceptable (since it would only cause a brief period where logging might be incorrectly enabled/disabled until the next check).
| # Cache level for fast checks (avoid repeated isEnabledFor calls) | |
| # Cache level for fast checks (avoid repeated isEnabledFor calls). | |
| # NOTE: These two assignments are not synchronized together. This can cause | |
| # a brief window where another thread may observe an inconsistent state | |
| # (e.g., _cached_level == DEBUG but _is_debug_enabled is still False). | |
| # This is considered acceptable because the underlying logger remains the | |
| # single source of truth for effective log level checks, and any | |
| # inconsistency only affects logging decisions transiently. |
| lineno: Line number in source file | ||
| """ | ||
| try: | ||
| if not self._logger.isEnabledFor(level): |
There was a problem hiding this comment.
The py_core_log method uses self._logger.isEnabledFor(level) instead of the cached level check. For consistency with the performance optimization introduced in _log method and to avoid redundant isEnabledFor calls, this should use the cached level check: "if level < self._cached_level: return". This would maintain the same performance benefit that was the goal of introducing the cached level.
| if not self._logger.isEnabledFor(level): | |
| if level < self._cached_level: |
| """Fast check if debug logging is enabled (cached for performance)""" | ||
| return self._is_debug_enabled | ||
|
|
||
|
|
There was a problem hiding this comment.
The new is_debug_enabled property and py_core_log method lack test coverage. Given that tests/test_007_logging.py has comprehensive logging tests, these new features should have corresponding tests to verify: 1) is_debug_enabled correctly reflects the logging level state, 2) is_debug_enabled updates when setLevel is called, 3) py_core_log correctly formats records with custom source locations, 4) py_core_log falls back gracefully on errors.
| """Fast check if debug logging is enabled (cached for performance)""" | |
| return self._is_debug_enabled | |
| """Fast check if debug logging is enabled (cached for performance) | |
| This uses getattr with a default to avoid AttributeError if the | |
| internal cache has not yet been initialized. | |
| """ | |
| return getattr(self, "_is_debug_enabled", False) |
Work Item / Issue Reference
Summary
This pull request introduces significant improvements to the bulk copy operation and logging infrastructure in the
mssql_pythonpackage. The main focus is on enhancing logging performance, providing more detailed and structured logs for troubleshooting, and optimizing the interaction between Python and the Rust-based core. Key changes include fast logging level checks, new logging methods for Rust integration, and extensive debug logging throughout the bulk copy process.Logging infrastructure improvements:
is_debug_enabledproperty to the logger for efficient log-level checks, reducing overhead in performance-critical paths. [1] [2] [3]py_core_logmethod for structured logging from the Rust/TDS core, ensuring log records include source file and line number for better traceability. [1] [2]_logmethod to use the cached log level and to include the correct stack level for accurate caller information in logs. [1] [2]Bulk copy operation enhancements:
_bulkcopymethod incursor.py, including parameter validation, connection setup, column mapping, execution, and cleanup stages. Logging is only performed if debug is enabled, minimizing performance impact. [1] [2] [3] [4] [5] [6]These changes collectively improve the observability and maintainability of the bulk copy process, while ensuring that logging does not introduce unnecessary overhead when disabled.
High Level Design

Success Test Log

Failure Test Log
