[Bugfix] fix builtin algo lookup in plugin via unifed register#317
[Bugfix] fix builtin algo lookup in plugin via unifed register#317haochengxia wants to merge 1 commit into
Conversation
|
Looking for one thing? Review this PR in Change Stack to search files, summaries, diffs, and code without losing your place. 📝 WalkthroughWalkthroughThis PR introduces a built-in eviction algorithm registry system that enables case-insensitive algorithm name lookup and simplifies the plugin loading pathway. The registry maps algorithm name strings to initialization functions, consolidating algorithm discovery into a single lookup mechanism used by both the CLI and plugin systems. ChangesBuilt-in Algorithm Registry System
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 Infer (1.2.0)libCacheSim/cache/evictionAlgoRegister.clibCacheSim/cache/evictionAlgoRegister.c:3:10: fatal error: 'libCacheSim/evictionAlgo.h' file not found ... [truncated 789 characters] ... .2.0/lib/infer/facebook-clang-plugins/clang/install/lib/clang/18/include" libCacheSim/cache/plugin.clibCacheSim/cache/plugin.c:5:10: fatal error: 'libCacheSim/plugin.h' file not found ... [truncated 721 characters] ... er-linux-x86_64-v1.2.0/lib/infer/facebook-clang-plugins/clang/install/lib/clang/18/include" test/test_evictionAlgo.cIn file included from test/test_evictionAlgo.c:5: ... [truncated 729 characters] ... nfer-linux-x86_64-v1.2.0/lib/infer/facebook-clang-plugins/clang/install/lib/clang/18/include" Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request refactors the eviction algorithm registration by introducing a centralized registry (evictionAlgoRegister.c) and a helper function get_builtin_cache_init to look up built-in cache initialization functions. This replaces the inline registry in cache_init.h and allows unified lookup for both cachesim and plugin-based cache creation, enabling case-insensitive algorithm resolution. The feedback suggests adding defensive NULL checks for algorithm names, handling potential NULL values from dlopen and dlerror in plugin.c, and registering "tinyLFU" as an alias for "wtinyLFU" in the new registry.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| cache_t *create_cache_internal(const char *const cache_alg_name, | ||
| common_cache_params_t cc_params, | ||
| void *cache_specific_params) { | ||
| cache_t *(*cache_init)(common_cache_params_t, void *) = NULL; | ||
| cache_init_func_ptr cache_init = get_builtin_cache_init(cache_alg_name); |
There was a problem hiding this comment.
Add a defensive check to ensure cache_alg_name is not NULL before proceeding. If cache_alg_name is NULL, passing it to get_builtin_cache_init or sprintf later in the function will cause a segmentation fault.
cache_t *create_cache_internal(const char *const cache_alg_name,
common_cache_params_t cc_params,
void *cache_specific_params) {
if (cache_alg_name == NULL) {
return NULL;
}
cache_init_func_ptr cache_init = get_builtin_cache_init(cache_alg_name);| cache_init_func_ptr init_func; | ||
| } eviction_algo_entry_t; | ||
|
|
||
| cache_init_func_ptr get_builtin_cache_init(const char *const cache_alg_name) { |
There was a problem hiding this comment.
Add a defensive check to ensure cache_alg_name is not NULL before calling strcasecmp. If cache_alg_name is NULL, strcasecmp will dereference a null pointer and cause a segmentation fault.
cache_init_func_ptr get_builtin_cache_init(const char *const cache_alg_name) {
if (cache_alg_name == NULL) {
return NULL;
}| void *handle = dlopen(NULL, RTLD_LAZY | RTLD_GLOBAL); | ||
| /* should not check err here, otherwise ubuntu will report err even though | ||
| * everything is OK */ |
There was a problem hiding this comment.
Add a defensive check to ensure handle is not NULL after calling dlopen. If dlopen fails and returns NULL, passing it to dlsym can cause undefined behavior or a crash on some platforms.
void *handle = dlopen(NULL, RTLD_LAZY | RTLD_GLOBAL);
if (handle == NULL) {
return NULL;
}
/* should not check err here, otherwise ubuntu will report err even though
* everything is OK */| if (cache_init == NULL) { | ||
| WARN("cannot load internal cache %s: error %s\n", cache_alg_name, err); | ||
| abort(); | ||
| return NULL; | ||
| } |
There was a problem hiding this comment.
Handle the case where err is NULL gracefully. If dlerror() returns NULL, passing it directly to %s in WARN can cause undefined behavior or a crash on some platforms.
if (cache_init == NULL) {
WARN("cannot load internal cache %s: error %s\n", cache_alg_name, err ? err : "symbol not found");
return NULL;
}| {"twoq", TwoQ_init}, | ||
| {"wtinyLFU", WTinyLFU_init}, |
There was a problem hiding this comment.
Add "tinyLFU" as an alias for WTinyLFU_init in the registry. Currently, only "wtinyLFU" is registered, but "tinyLFU" is a very common name used by users (e.g., in cachesim command line and potentially in mrcProfiler). Registering it as an alias ensures it can be resolved correctly through the unified registry.
| {"twoq", TwoQ_init}, | |
| {"wtinyLFU", WTinyLFU_init}, | |
| {"twoq", TwoQ_init}, | |
| {"tinyLFU", WTinyLFU_init}, | |
| {"wtinyLFU", WTinyLFU_init}, |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@libCacheSim/cache/evictionAlgoRegister.c`:
- Line 32: The mapping for the algorithm name "fifo-reinsertion" incorrectly
points to Clock_init; change the initializer to FIFO_Reinsertion_init in the
registration table so the string "fifo-reinsertion" is associated with the
correct initializer function (ensure FIFO_Reinsertion_init is declared/visible
where the table is defined).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e7c7aa52-41e7-4ba2-8a1c-2ac2550641a6
📒 Files selected for processing (9)
doc/advanced_lib_extend.mddoc/quickstart_mrcProfiler.mdlibCacheSim/bin/cachesim/cache_init.hlibCacheSim/cache/CMakeLists.txtlibCacheSim/cache/evictionAlgoRegister.clibCacheSim/cache/plugin.clibCacheSim/include/libCacheSim/evictionAlgo.hscripts/README.mdtest/test_evictionAlgo.c
| {"cr_lfu", CR_LFU_init}, | ||
| {"fifo", FIFO_init}, | ||
| {"fifo-merge", FIFO_Merge_init}, | ||
| {"fifo-reinsertion", Clock_init}, |
There was a problem hiding this comment.
Map fifo-reinsertion to the correct initializer
Line 32 maps "fifo-reinsertion" to Clock_init, which will initialize the wrong policy for that algorithm name. This should point to FIFO_Reinsertion_init.
Suggested fix
- {"fifo-reinsertion", Clock_init},
+ {"fifo-reinsertion", FIFO_Reinsertion_init},📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| {"fifo-reinsertion", Clock_init}, | |
| {"fifo-reinsertion", FIFO_Reinsertion_init}, |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@libCacheSim/cache/evictionAlgoRegister.c` at line 32, The mapping for the
algorithm name "fifo-reinsertion" incorrectly points to Clock_init; change the
initializer to FIFO_Reinsertion_init in the registration table so the string
"fifo-reinsertion" is associated with the correct initializer function (ensure
FIFO_Reinsertion_init is declared/visible where the table is defined).
For issue #304
In the original plugin system, we would find the symbol always via
This is safe when we impl plugin init function and name it correctly, but it would affect the user experience when they need to use builtin init functions. To achieve unified experiment like cachesim, you can refer to the init function via
lru, 's3fifo', etc., we can share the lookup table via a register layer and move it into the core lib.After the modifiction, command like the following one would be safe
MPLBACKEND=Agg python3 scripts/profile_mrc.py \ --tracepath data/twitter_cluster52.csv \ --trace-format csv \ --trace-format-params 'time-col=1,obj-id-col=2,obj-size-col=3,delimiter=,,obj-id-is-num=1' \ --profiler=MINISIM \ --algos=lru,fifo \ --profiler-params=FIX_RATE,0.1,10 \ --sizes=0.01,0.02 \ --name /tmp/libcachesim_profile_mrc_testSummary by CodeRabbit
New Features
LRU,lru, andLruare equivalent).Documentation
Tests