Addressing Incorrect Handling of Python GIL that Leads to a SEGFAULT #361
Addressing Incorrect Handling of Python GIL that Leads to a SEGFAULT #361Scusemua wants to merge 2 commits intogo-python:masterfrom
Conversation
|
Hi @Scusemua ! Thanks for your patch! You saved me several hours of life! :)) PyGILState locking is needed for sure, but Go is insidious too :) even between PyGILState_Ensure/PyGILState_Release calls! And sure, PyGILState_Release will panic that current thread has no python thread info and SEGFAULT etc. My simple two clients-loaded Web server was crashing on this even with your patch. To fix - we need to lock current goroutine to OS thread during the PyGILState_Ensure/PyGILState_Release section. Simply add runtime.LockOSThread() call before PyGILState_Ensure (1 place in bind/symbols.go) and runtime.UnlockOSThread() after every PyGILState_Release call (3 places in bind/symbols.go) And now my test server is running fine, calling Python callback in the web handler :) Feel free to add this to your branch and be brave to propose your PR for merge! Have nice days guys! |
|
Hi @nickeygit, Yeah, I also found that I needed to pin the goroutines to OS threads, or else I would get really strange errors -- just like you're describing. I just think I forgot to add an update about this to my original post. 😅 With these two fixes, everything has been working fine on my end with no more issues for months! :) |
|
I've encountered a new issue in which the application encounters a segmentation fault when calling the Sometimes, the segmentation fault occurs on this line: Other times, it doesn't occur until The issue is with the At first, I thought it was perhaps because the callback was being executed before the original thread state was restored by But I tried adding a Has anybody else encountered this? Have you ever experienced this issue, @nickeygit? |
I found what appears to be a bug in
bind/symbols.go.Consider the following Go function that was generated by gopy:
This is generated for a Go method
ExampleFunctionof aGoObjectstruct. The intent is for this method to be called from Python.The bug is that we access the C/Python API function
C.PyCallable_Checkbefore acquiring the GIL, leading to aSEGFAULT. The problem lies in the following excerpt of the above function:Notice that we only acquire the GIL via
PyGILState_Ensureafter the call toC.PyCallable_Check(_fun_arg), which leads to aSEGFAULT.The changes in this PR will result in the above Go function being generated differently as:
Notice that we now call
C.PyGILState_Ensure()before the call toC.PyCallable_Check(_fun_arg). Likewise, we callC.PyGILState_Release(_gstate)before returning within the body of the associated if-statement to ensure that there is a matching call (to match the_gstate := C.PyGILState_Ensure()).In terms of testing, I've just been using the modified
gopyin an application I'm developing. I'm in the process of doing more testing in the meantime; however, I am confident that this is a bug, as you must hold the Python GIL when interfacing/invoking any of the C/Python API.From the "Thread State and the Global Interpreter Lock" subsection of the "Initialization, Finalization, and Threads" Python C-API documentation:
On a separate (but related) note -- since we defer
C.PyEval_RestoreThread(_saved_thread)at the beginning of the method's execution, it's a little silly to release the GIL viaC.PyGILState_Release(_gstate)immediately before returning (even in the case where we don't call into Go code); however, we need to have matching calls betweenC.PyEval_SaveThread()andC.PyEval_RestoreThread()as well asPyGILState_Ensure()andPyGILState_Release(), so I think it ultimately makes sense to do things this way for now. In theory, we could provide more complex logic to handle the GIL more efficiently in the future.