-
-
Notifications
You must be signed in to change notification settings - Fork 34.6k
intern_common should use _Py_LOCK_DONT_DETACH #149162
Copy link
Copy link
Open
Labels
3.14bugs and security fixesbugs and security fixes3.15pre-release feature fixes, bugs and security fixespre-release feature fixes, bugs and security fixesinterpreter-core(Objects, Python, Grammar, and Parser dirs)(Objects, Python, Grammar, and Parser dirs)topic-free-threadingtype-bugAn unexpected behavior, bug, or errorAn unexpected behavior, bug, or error
Metadata
Metadata
Assignees
Labels
3.14bugs and security fixesbugs and security fixes3.15pre-release feature fixes, bugs and security fixespre-release feature fixes, bugs and security fixesinterpreter-core(Objects, Python, Grammar, and Parser dirs)(Objects, Python, Grammar, and Parser dirs)topic-free-threadingtype-bugAn unexpected behavior, bug, or errorAn unexpected behavior, bug, or error
Bug report
PyUnicode_InternFromStringand various other interning functions callintern_common, which locks an interpreter-wide lock in the free threading build. We should use_Py_LOCK_DONT_DETACHwhen locking that lock to avoid potential deadlocks in certain extensions.PyTorch, TensorFlow, and possibly other extensions use C++11 one time initialization via function local statics to intern string. For example:
https://github.com/pytorch/pytorch/blob/8804b122355d1aa1cc6bfffd1516f3df39f96c5f/torch/csrc/autograd/python_function.cpp#L369-L370
C++ static locals implement thread-safe one time initialization, similar to
std::call_onceorpthread_once. That means that a (hidden) C++ lock is held during the call toPyUnicode_InternFromString. If we detach, duringPyUnicode_InternFromString, we risk a lock ordering deadlock.Here's a complicated scenario that could deadlock:
static PyObject* method_name = PyUnicode_InternFromString(...). Blocks on interned mutex and detaches for stop the world pausestatic PyObject* method_name = PyUnicode_InternFromString(...)and blocks on C++ one time initialization lock without detaching.So there's a deadlock where Thread 4 - (waiting on) -> Thread 3 - (waiting on) -> Thread 1 - (waiting on) -> Thread 4.
This isn't a problem in GIL-enabled builds because thread 3 won't block during
PyUnicode_InternFromString._Py_LOCK_DONT_DETACHprevents this problem becausePyUnicode_InternFromString()will no longer block on a stop the world if there's contention on the interned mutex.Note that if callers used something like pybind11's
gil_safe_call_once_and_storethat would also avoid the deadlock.Linked PRs