[refine](exec) replace std::shared_mutex/std::shared_lock with annotated wrappers for thread safety analysis#63109
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
/review |
There was a problem hiding this comment.
Automated code review summary:\n\nNo blocking issues found in the actual PR diff. The change is focused on adding annotated shared-mutex/shared-lock wrappers and applying them to VDataStreamMgr and RuntimeFilterMergeControllerEntity without changing the effective locking order or heavy-work placement.\n\nCritical checkpoints:\n- Goal/test: The PR goal appears to be enabling Clang thread-safety analysis for shared-mutex users; the code accomplishes this for the touched classes. No test changes are included, but this is a static-analysis annotation/refactor with no intended runtime behavior change.\n- Scope: The modification is small and focused.\n- Concurrency: Existing shared/exclusive lock protection is preserved. VDataStreamMgr keeps receiver-map/set operations under the same mutex, and cancel still copies receivers before doing heavier cancellation. RuntimeFilterMergeControllerEntity preserves the map mutex plus per-filter mutex pattern.\n- Lifecycle/static initialization: No new lifecycle-sensitive globals or static initialization dependencies were introduced.\n- Configuration/compatibility/persistence/data writes: Not applicable.\n- Parallel paths: The two touched shared-mutex users are updated consistently in their declarations and call sites.\n- Tests/results: No runtime test result changes. A compile/static-analysis run in CI should validate the annotation intent.\n- Observability/performance: No new logs or metrics needed; runtime lock behavior remains equivalent to std::shared_mutex/std::shared_lock.\n- User focus: No additional user-provided review focus was specified.
|
run buildall |
TPC-H: Total hot run time: 29710 ms |
TPC-DS: Total hot run time: 170673 ms |
| // Wraps std::shared_mutex and provides both exclusive and shared capability | ||
| // operations so GUARDED_BY / REQUIRES_SHARED / etc. can reference it. | ||
| class CAPABILITY("mutex") AnnotatedSharedMutex { | ||
| public: |
There was a problem hiding this comment.
我们使用这个封装的API,跟直接使用lock的各种方法是什么区别?
|
run buildall |
TPC-H: Total hot run time: 29583 ms |
TPC-DS: Total hot run time: 169648 ms |
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
|
/review |
There was a problem hiding this comment.
I found one blocking issue in the new shared-lock wrapper. The PR goal is focused and the VDataStreamMgr/runtime-filter migrations otherwise preserve the existing lock scope and call patterns, but the new SharedLockGuard releases the wrong capability kind in its annotation, so Clang's shared-capability tracking will not model the RAII guard correctly.
Critical checkpoints:
- Goal/test: The goal is to make shared-mutex usage visible to Clang thread-safety analysis. The current code partially accomplishes this, but SharedLockGuard's destructor annotation must be fixed for the new wrapper to be correct. Existing BE tests are adjusted, but I did not see a targeted compile/thread-safety check result in the PR body.
- Scope: The implementation is small and focused.
- Concurrency: The migrated maps remain protected by the same shared/exclusive lock, and the VDataStreamMgr helper avoids the previous conditional-lock path. The blocking issue is annotation correctness, not runtime lock behavior.
- Lifecycle/static initialization: No new static lifecycle concern found.
- Config/incompatibility/protocol/storage/transaction: Not applicable.
- Parallel paths: The two migrated shared-mutex users are covered; no similar new issue found beyond SharedLockGuard itself.
- Tests: Existing mock signature is updated. A compile/thread-safety analysis check should cover the annotation fix.
- Observability/performance: No additional logging or metric need found.
- User focus: No additional user-provided review focus was specified.
| #ifdef BE_TEST | ||
| doris::mock_random_sleep(); | ||
| #endif | ||
| _mu.lock_shared(); |
There was a problem hiding this comment.
SharedLockGuard acquires a shared capability via ACQUIRE_SHARED(mu), but the destructor is annotated with RELEASE(), which is the exclusive-capability release annotation. That means Clang's thread-safety model will not correctly pair the shared acquisition/release for this RAII guard, defeating the purpose of using it for REQUIRES_SHARED/GUARDED_BY checking. Please release the shared capability here.
| _mu.lock_shared(); | |
| ~SharedLockGuard() RELEASE_SHARED() { |
There was a problem hiding this comment.
不对,~SharedLockGuard() RELEASE() 是可以的,而且在这里更应该保持 RELEASE(),不要改成 RELEASE_SHARED()。
|
/review |
This reverts commit 5500e28.
|
/review |
There was a problem hiding this comment.
I found one blocking issue in the new thread-safety annotation wrapper. The PR goal is to make shared mutex usage visible to Clang thread-safety analysis, but the current SharedLockGuard destructor annotation itself triggers -Wthread-safety warnings in a minimal compile check using this header.
Critical checkpoint conclusions:
- Goal/test: The goal is clear, but the current implementation does not fully accomplish it because the new guard is not warning-clean under Clang thread-safety analysis. I did not find an added test or compile-check coverage for this wrapper.
- Scope: The change is small and focused.
- Concurrency: The runtime lock behavior remains equivalent to the previous
std::shared_mutexusage, but the static annotations need correction. No new lock-order issue was found. - Lifecycle/static initialization: No special lifecycle or cross-TU static initialization issue found.
- Config/compatibility/protocol/storage: Not applicable.
- Parallel paths: The PR updates the visible
VDataStreamMgrandRuntimeFilterMergeControllerEntitypaths consistently. - Special checks: No problematic new defensive checks found.
- Test coverage: Existing BE test mock was adjusted, but no coverage verifies the new annotation wrapper with
-Wthread-safety. - Observability/persistence/data writes: Not applicable.
- Performance: No runtime performance regression found beyond replacing equivalent lock guards.
User focus: No additional user-provided review focus was specified.
| #endif | ||
| _mu.lock_shared(); | ||
| } | ||
| ~SharedLockGuard() RELEASE() { |
There was a problem hiding this comment.
This currently makes the new guard fail the same Clang thread-safety analysis it is meant to enable. With the current header, a minimal use such as SharedLockGuard g(mu); auto it = m.find(1); under -Wthread-safety reports warning: releasing mutex 'g' using shared access, expected exclusive access at scope exit. For a SCOPED_CAPABILITY RAII object, the constructor records that it acquired mu in shared mode, but the no-argument destructor releases the scoped lock object itself, so it should use the normal release annotation while AnnotatedSharedMutex::unlock_shared() keeps RELEASE_SHARED().
| ~SharedLockGuard() RELEASE() { | |
| ~SharedLockGuard() RELEASE() { |
|
run buildall |
There was a problem hiding this comment.
Automated review summary: no new blocking issues found in this pass.
Critical checkpoint conclusions:
- Goal and tests: The PR focuses on replacing raw shared mutex/lock usage with annotated wrappers so Clang thread-safety analysis can reason about shared access. The changed code accomplishes that refactor for the touched managers. The existing BE unit test mock signature was updated, but I did not run the full BE test suite in this review environment.
- Scope: The change is small and focused: one annotation helper addition, two manager migrations, and one test mock signature update.
- Concurrency: The modified code is concurrency-related. Existing shared/exclusive locking intent is preserved: map inserts/erases use
LockGuard, read-only lookups/iteration useSharedLockGuard, and long receiver cancellation remains outside the manager lock. I did not find a new lock-order or heavy-under-lock regression introduced by this PR. - Lifecycle/static initialization: No new static or cross-translation-unit initialization dependency was introduced.
- Configuration: No configuration items were added.
- Compatibility: No storage format, RPC protocol, or function symbol compatibility issue was introduced. The
find_recvrC++ API changed internally and visible callers/tests in the repository were updated. - Parallel paths: The touched shared-mutex paths in
VDataStreamMgrandRuntimeFilterMergeControllerEntitywere migrated consistently in the PR scope. - Conditional checks: No new special conditional behavior was introduced.
- Test coverage/results: No new test behavior was added beyond updating an existing test mock signature; this is acceptable for a mechanical refactor, but CI/BE UT remains the meaningful validation.
- Observability: No new observability need was introduced.
- Transaction/persistence/data writes: Not applicable.
- FE/BE variable passing: Not applicable.
- Performance: The wrappers preserve the same underlying
std::shared_mutexbehavior and do not add runtime overhead outside BE_TEST random sleeps already used by the existing lock guard pattern. - Additional review focus: The focus file says there are no additional user-provided focus points; no extra issue was found there.
Existing review context: I treated the current SharedLockGuard destructor annotation thread as already-known and did not duplicate it.
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
TPC-H: Total hot run time: 30700 ms |
TPC-DS: Total hot run time: 169015 ms |
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
run cloud_p0 |
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
What problem does this PR solve?
Issue Number: N/A
Problem Summary:
After #63070 replaced
std::mutex/std::lock_guardwith annotated wrappers (AnnotatedMutex/LockGuard), shared mutex usage still relied on rawstd::shared_mutexandstd::shared_lock/std::unique_lock, which are invisible to Clang's thread safety analysis. This leaves shared-lock sites unverified —GUARDED_BY,REQUIRES_SHARED, and other annotations cannot be enforced.AnnotatedSharedMutex(wrappingstd::shared_mutexwithCAPABILITY/ACQUIRE/RELEASE/ACQUIRE_SHARED/RELEASE_SHAREDannotations) andSharedLockGuard(RAIISCOPED_CAPABILITYwithACQUIRE_SHARED/RELEASE) inthread_safety_annotations.h.VDataStreamMgrandRuntimeFilterMergeControllerEntityfromstd::shared_mutextoAnnotatedSharedMutex, and fromstd::unique_lock/std::shared_locktoLockGuard/SharedLockGuard.GUARDED_BYannotations to the protected maps._find_recvras a private helper annotated withREQUIRES_SHARED(_lock), eliminating thebool acquire_lockparameter that previously bypassed lock tracking.Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)