Skip to content

FEAT: mssql-python driver changes to support bulk copy logging.#430

Open
subrata-ms wants to merge 5 commits intomainfrom
subrata-ms/bulkcopy_py
Open

FEAT: mssql-python driver changes to support bulk copy logging.#430
subrata-ms wants to merge 5 commits intomainfrom
subrata-ms/bulkcopy_py

Conversation

@subrata-ms
Copy link
Contributor

@subrata-ms subrata-ms commented Feb 12, 2026

Work Item / Issue Reference

AB#41882


Summary

This pull request introduces significant improvements to the bulk copy operation and logging infrastructure in the mssql_python package. The main focus is on enhancing logging performance, providing more detailed and structured logs for troubleshooting, and optimizing the interaction between Python and the Rust-based core. Key changes include fast logging level checks, new logging methods for Rust integration, and extensive debug logging throughout the bulk copy process.

Logging infrastructure improvements:

  • Added a cached logging level and a fast is_debug_enabled property to the logger for efficient log-level checks, reducing overhead in performance-critical paths. [1] [2] [3]
  • Implemented a py_core_log method for structured logging from the Rust/TDS core, ensuring log records include source file and line number for better traceability. [1] [2]
  • Updated the internal _log method to use the cached log level and to include the correct stack level for accurate caller information in logs. [1] [2]

Bulk copy operation enhancements:

  • Added comprehensive debug and info logging throughout the _bulkcopy method in cursor.py, including parameter validation, connection setup, column mapping, execution, and cleanup stages. Logging is only performed if debug is enabled, minimizing performance impact. [1] [2] [3] [4] [5] [6]

These changes collectively improve the observability and maintainability of the bulk copy process, while ensuring that logging does not introduce unnecessary overhead when disabled.

High Level Design
image

Success Test Log
image

Failure Test Log
image

@github-actions
Copy link

📊 Code Coverage Report

🔥 Diff Coverage

31%


🎯 Overall Coverage

76%


📈 Total Lines Covered: 5474 out of 7152
📁 Project: mssql-python


Diff Coverage

Diff: main...HEAD, staged and unstaged changes

  • mssql_python/logging.py (31.8%): Missing lines 150-151,160-161,344-346,349-350,361-362,364-367

Summary

  • Total: 22 lines
  • Missing: 15 lines
  • Coverage: 31%

mssql_python/logging.py

Lines 146-155

  146         class CSVFormatter(logging.Formatter):
  147             def format(self, record):
  148                 # Check if this is from Rust (via rust_log method)
  149                 if hasattr(record, 'funcName') and record.funcName == "rust":
! 150                     source = "Rust"
! 151                     message = record.getMessage()
  152                 else:
  153                     # Extract source from message (e.g., [Python] or [DDBC])
  154                     msg = record.getMessage()
  155                     if msg.startswith("[") and "]" in msg:

Lines 156-165

  156                         end_bracket = msg.index("]")
  157                         source = msg[1:end_bracket]
  158                         message = msg[end_bracket + 2 :].strip()  # Skip '] '
  159                     else:
! 160                         source = "Unknown"
! 161                         message = msg
  162 
  163                 # Format timestamp with milliseconds using period separator
  164                 timestamp = self.formatTime(record, "%Y-%m-%d %H:%M:%S")
  165                 timestamp_with_ms = f"{timestamp}.{int(record.msecs):03d}"

Lines 340-354

  340             msg: Message string (already formatted)
  341             filename: Source filename (e.g., 'cursor.rs')
  342             lineno: Line number in source file
  343         """
! 344         try:
! 345             if not self._logger.isEnabledFor(level):
! 346                 return
  347             
  348             # Create a custom LogRecord with Rust source location
! 349             import logging as log_module
! 350             record = log_module.LogRecord(
  351                 name=self._logger.name,
  352                 level=level,
  353                 pathname=filename,
  354                 lineno=lineno,

Lines 357-371

  357                 exc_info=None,
  358                 func="rust",
  359                 sinfo=None
  360             )
! 361             self._logger.handle(record)
! 362         except Exception:
  363             # Fallback - use regular logging
! 364             try:
! 365                 self._logger.log(level, msg)
! 366             except:
! 367                 pass
  368 
  369     def _log(self, level: int, msg: str, add_prefix: bool = True, *args, **kwargs):
  370         """
  371         Internal logging method with exception safety.


📋 Files Needing Attention

📉 Files with overall lowest coverage (click to expand)
mssql_python.pybind.logger_bridge.hpp: 58.8%
mssql_python.pybind.logger_bridge.cpp: 59.2%
mssql_python.row.py: 66.2%
mssql_python.pybind.ddbc_bindings.cpp: 69.3%
mssql_python.pybind.ddbc_bindings.h: 69.7%
mssql_python.pybind.connection.connection.cpp: 75.3%
mssql_python.ddbc_bindings.py: 79.6%
mssql_python.pybind.connection.connection_pool.cpp: 79.6%
mssql_python.logging.py: 81.2%
mssql_python.connection.py: 84.1%

🔗 Quick Links

⚙️ Build Summary 📋 Coverage Details

View Azure DevOps Build

Browse Full Coverage Report

@subrata-ms subrata-ms changed the title bulk copy changes for logging FEAT: mssql-python driver changes to support bulk copy logging. Feb 16, 2026
@github-actions github-actions bot added the pr-size: medium Moderate update size label Feb 16, 2026
@subrata-ms subrata-ms marked this pull request as ready for review February 16, 2026 14:23
Copilot AI review requested due to automatic review settings February 16, 2026 14:23
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request enhances the bulk copy operation in mssql_python with comprehensive logging infrastructure to improve troubleshooting and observability. The changes introduce performance-optimized logging with cached level checks, a new method for Rust core integration, and extensive debug logging throughout the bulk copy workflow.

Changes:

  • Added cached logging level (_cached_level and _is_debug_enabled) to avoid repeated isEnabledFor() calls in performance-critical paths
  • Implemented py_core_log() method to support structured logging from Rust/TDS core with custom source location tracking
  • Enhanced _bulkcopy method with detailed debug and info logging at all major stages (validation, connection setup, execution, cleanup), all gated by is_debug_enabled checks for minimal performance impact when disabled

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
mssql_python/logging.py Adds cached logging level for fast checks, implements py_core_log for Rust integration, updates _log to use cached level and proper stacklevel, and updates CSVFormatter to handle py-core records
mssql_python/cursor.py Adds comprehensive logging throughout _bulkcopy method including parameter validation, connection setup, column mappings, execution, results, and cleanup - all conditionally gated by is_debug_enabled

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

# Set level (atomic operation, no lock needed)
self._logger.setLevel(level)

# Cache level for fast checks (avoid repeated isEnabledFor calls)
Copy link

Copilot AI Feb 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cached level variables (_cached_level and _is_debug_enabled) are updated without synchronization, which could lead to thread safety issues. While reading these primitives is generally atomic in CPython due to the GIL, there's a potential race condition: if one thread reads _is_debug_enabled right after _cached_level is updated but before _is_debug_enabled is updated on line 493, it could see an inconsistent state (e.g., _cached_level=DEBUG but _is_debug_enabled=False). Consider updating both values atomically or documenting that this benign race condition is acceptable (since it would only cause a brief period where logging might be incorrectly enabled/disabled until the next check).

Suggested change
# Cache level for fast checks (avoid repeated isEnabledFor calls)
# Cache level for fast checks (avoid repeated isEnabledFor calls).
# NOTE: These two assignments are not synchronized together. This can cause
# a brief window where another thread may observe an inconsistent state
# (e.g., _cached_level == DEBUG but _is_debug_enabled is still False).
# This is considered acceptable because the underlying logger remains the
# single source of truth for effective log level checks, and any
# inconsistency only affects logging decisions transiently.

Copilot uses AI. Check for mistakes.
lineno: Line number in source file
"""
try:
if not self._logger.isEnabledFor(level):
Copy link

Copilot AI Feb 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The py_core_log method uses self._logger.isEnabledFor(level) instead of the cached level check. For consistency with the performance optimization introduced in _log method and to avoid redundant isEnabledFor calls, this should use the cached level check: "if level < self._cached_level: return". This would maintain the same performance benefit that was the goal of introducing the cached level.

Suggested change
if not self._logger.isEnabledFor(level):
if level < self._cached_level:

Copilot uses AI. Check for mistakes.
Comment on lines +602 to 605
"""Fast check if debug logging is enabled (cached for performance)"""
return self._is_debug_enabled


Copy link

Copilot AI Feb 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new is_debug_enabled property and py_core_log method lack test coverage. Given that tests/test_007_logging.py has comprehensive logging tests, these new features should have corresponding tests to verify: 1) is_debug_enabled correctly reflects the logging level state, 2) is_debug_enabled updates when setLevel is called, 3) py_core_log correctly formats records with custom source locations, 4) py_core_log falls back gracefully on errors.

Suggested change
"""Fast check if debug logging is enabled (cached for performance)"""
return self._is_debug_enabled
"""Fast check if debug logging is enabled (cached for performance)
This uses getattr with a default to avoid AttributeError if the
internal cache has not yet been initialized.
"""
return getattr(self, "_is_debug_enabled", False)

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-size: medium Moderate update size

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant