Conversation
| lock_path = Path(cache_dir) / "build.lock" | ||
| with FileLock(str(lock_path), timeout=7200): | ||
| # Re-check inside the lock — another process may have built it | ||
| # while we were waiting. |
There was a problem hiding this comment.
Need help looking into file read checks like this. Mainly, for when the agent spins up multiple caching jobs at once, which leads to downstream issues like overwriting files, etc.
EricSchrock
left a comment
There was a problem hiding this comment.
I ran out of time and couldn't get to a full review but I reviewed the added docs, examples, and tests and skimmed the model and dataset changes. I found a few issues with the examples and tests.
@Logiquo might have better feedback on the models themselves and the changes to base_dataset.py.
examples/drug_recommendation/drug_recommendation_mimic4_adacare.py
Outdated
Show resolved
Hide resolved
examples/mortality_prediction/mortality_mimic4_multimodal_adacare.py
Outdated
Show resolved
Hide resolved
|
The dataset part LGTM, other than we don't need to check bin files and we do not need to invalidate caches (or if we do want to invalidate caches, the best way should be using StreamingDataset to read it to see if it throws exception). Checking *.bin file won't help in this case. |
|
Thanks for the catch @EricSchrock . I've deleted the redundant test cases and edited some things so they don't break. I've asked @Logiquo for a re-review/approval. I think the main thing is just making it more robust to user error. One of the problems is that there can be race conditions if the user is not careful and wants to run multiple jobs at the same time with the same cache_dir. I've only discovered this myself after running the agent and cleansing an existing corrupted cache where he submits multiple parallel jobs, and I think it's not impossible for users to follow that same pattern randomly in their development cycle. |
EricSchrock
left a comment
There was a problem hiding this comment.
Flagging a couple more cuda:# instances to fix. Submitting as "Comment" instead of "Approve" to clear "Changes Requested" but leave the final approval to @Logiquo.
| # --------------------------------------------------------------------------- | ||
| # STEP 4: Define Optuna objective | ||
| # --------------------------------------------------------------------------- | ||
| DEVICE = "cuda:3" # or "cpu" |
There was a problem hiding this comment.
| DEVICE = "cuda:3" # or "cpu" | |
| DEVICE = "cuda:0" # or "cpu" |
| # STEP 5: Train the model | ||
| trainer = Trainer( | ||
| model=model, | ||
| device="cuda:4", # or "cpu" |
There was a problem hiding this comment.
| device="cuda:4", # or "cpu" | |
| device="cuda:0", # or "cpu" |
| # --------------------------------------------------------------------------- | ||
| # STEP 4: Define Optuna objective | ||
| # --------------------------------------------------------------------------- | ||
| DEVICE = "cuda:2" # or "cpu" |
There was a problem hiding this comment.
| DEVICE = "cuda:2" # or "cpu" | |
| DEVICE = "cuda:0" # or "cpu" |
EricSchrock
left a comment
There was a problem hiding this comment.
Actually, looks like I have to approve to clear the "changes requested". Approving to unblock.
I agree, and i think |
This pull request introduces new example scripts and documentation updates to support multimodal and hyperparameter-tuned drug recommendation models on the MIMIC-IV dataset, specifically using AdaCare and RETAIN model variants. It also improves the usability and clarity of existing example scripts.
Major additions and improvements:
1. New example scripts for drug recommendation:
drug_recommendation_mimic4_adacare.pydemonstrating how to use the AdaCare model for drug recommendation on MIMIC-IV, including data loading, task setup, training, evaluation, and prediction inspection.drug_recommendation_mimic4_adacare_optuna.py, providing an end-to-end example of hyperparameter tuning for AdaCare using Optuna, covering parameter search, training, and evaluation.drug_recommendation_mimic4_multimodal_retain.py, showcasing the use of the newMultimodalRETAINmodel for handling both sequential and non-sequential EHR features, with detailed steps and comparison to vanilla RETAIN.2. Documentation enhancements:
MultimodalAdaCareandMultimodalRETAINclasses, improving discoverability and API reference completeness. [1] [2]3. Usability and bug fixes in existing examples: