
Add/mm retain adacare #885

Open
jhnwu3 wants to merge 12 commits into master from add/mm_retain_adacare

Conversation

@jhnwu3 (Collaborator) commented Mar 10, 2026

This pull request introduces new example scripts and documentation updates to support multimodal and hyperparameter-tuned drug recommendation models on the MIMIC-IV dataset, specifically using AdaCare and RETAIN model variants. It also improves the usability and clarity of existing example scripts.

Major additions and improvements:

1. New example scripts for drug recommendation:

  • Added drug_recommendation_mimic4_adacare.py demonstrating how to use the AdaCare model for drug recommendation on MIMIC-IV, including data loading, task setup, training, evaluation, and prediction inspection.
  • Added drug_recommendation_mimic4_adacare_optuna.py, providing an end-to-end example of hyperparameter tuning for AdaCare using Optuna, covering parameter search, training, and evaluation.
  • Added drug_recommendation_mimic4_multimodal_retain.py, showcasing the use of the new MultimodalRETAIN model for handling both sequential and non-sequential EHR features, with detailed steps and comparison to vanilla RETAIN.

2. Documentation enhancements:

  • Updated AdaCare and RETAIN model documentation to include the new MultimodalAdaCare and MultimodalRETAIN classes, improving discoverability and API reference completeness. [1] [2]

3. Usability and bug fixes in existing examples:

  • Ensured all example scripts use a consistent dataset cache directory for reproducibility and efficiency.
  • Fixed sample access in the RETAIN example to use the correct dataset indexing method, improving code reliability.

from pathlib import Path
from filelock import FileLock  # third-party `filelock` package

lock_path = Path(cache_dir) / "build.lock"
with FileLock(str(lock_path), timeout=7200):
    # Re-check inside the lock: another process may have built it
    # while we were waiting.
    ...
Collaborator Author:
I need help looking into file-read checks like this one. Mainly for the case where the agent spins up multiple caching jobs at once, which leads to downstream issues like overwritten files.
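For reference, the double-checked pattern in the snippet above can be sketched with only the standard library (the PR itself uses the third-party `filelock` package; `build_fn` and the `build.done` sentinel below are hypothetical, not part of the PR):

```python
import os
import time
from pathlib import Path

def build_cache_once(cache_dir, build_fn, poll_s=0.05):
    """Build a cache exactly once, even if several jobs start together."""
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    done = cache / "build.done"
    if done.exists():                      # fast path: already built
        return cache
    lock_path = cache / "build.lock"
    while True:
        try:
            # O_CREAT | O_EXCL creates the lock file atomically, so only
            # one process can acquire the lock at a time.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            time.sleep(poll_s)             # another job holds the lock
    try:
        # Re-check inside the lock: the previous holder may have finished
        # the build while we were waiting.
        if not done.exists():
            build_fn(cache)
            done.touch()
    finally:
        os.close(fd)
        lock_path.unlink()
    return cache
```

Note that a bare `O_EXCL` lock file goes stale if the holding process crashes; `filelock`'s timeout handling is one reason to prefer it in the real code.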

@jhnwu3 jhnwu3 requested review from EricSchrock and Logiquo March 10, 2026 21:02
@EricSchrock (Collaborator) left a comment:

I ran out of time and couldn't get to a full review but I reviewed the added docs, examples, and tests and skimmed the model and dataset changes. I found a few issues with the examples and tests.

@Logiquo might have better feedback on the models themselves and the changes to base_dataset.py.

@Logiquo (Collaborator) commented Mar 16, 2026

The dataset part LGTM, except that we don't need to check the bin files, and we don't need to invalidate caches (or, if we do want to invalidate caches, the best approach is to read the cache with StreamingDataset and see whether it throws an exception).

Checking *.bin files won't help in this case.
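The read-to-validate idea could look something like this generic sketch, where `reader` stands in for constructing and iterating a StreamingDataset over the cache directory (a hypothetical helper, not code from the PR):

```python
def cache_is_valid(cache_dir, reader):
    """Validate a cache by actually reading it rather than checking files.

    `reader` is any callable that raises on a corrupt cache; for this PR
    it would build a StreamingDataset over `cache_dir` and iterate a few
    samples.
    """
    try:
        reader(cache_dir)
        return True
    except Exception:
        # Any read failure is treated as corruption; the caller can then
        # rebuild the cache from scratch.
        return False
```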

@jhnwu3 jhnwu3 requested a review from Logiquo March 16, 2026 17:47
@jhnwu3 (Collaborator, Author) commented Mar 16, 2026

Thanks for the catch @EricSchrock. I've deleted the redundant test cases and edited a few things so they don't break.

I've asked @Logiquo for a re-review/approval. I think the main thing is just making it more robust to user error.

One problem is that race conditions can occur if a user isn't careful and runs multiple jobs at the same time with the same cache_dir. I only discovered this myself after the agent submitted multiple parallel jobs and I had to clean up the corrupted cache it left behind, and I think it's entirely possible for users to stumble into the same pattern in their own development cycle.
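One standard way to avoid the overwritten or half-written cache files mentioned above is to write to a temporary file and atomically rename it into place. A generic stdlib sketch, not code from this PR:

```python
import os
import tempfile

def write_atomically(path, data: bytes):
    # Write to a temp file in the destination directory, then rename it
    # into place. os.replace is atomic on the same filesystem, so readers
    # never observe a partially written file even under concurrent jobs.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())   # make sure the bytes hit disk
        os.replace(tmp, path)      # atomic rename over any existing file
    except BaseException:
        os.unlink(tmp)             # clean up the temp file on failure
        raise
```

This complements the lock: the lock serializes builders, and atomic replacement protects any reader that races with a writer.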

@EricSchrock (Collaborator) left a comment:

Flagging a couple more cuda:# instances to fix. Submitting as "Comment" instead of "Approve" to clear the "Changes requested" status but leave the final approval to @Logiquo.

# ---------------------------------------------------------------------------
# STEP 4: Define Optuna objective
# ---------------------------------------------------------------------------
DEVICE = "cuda:3" # or "cpu"

Suggested change
DEVICE = "cuda:3" # or "cpu"
DEVICE = "cuda:0" # or "cpu"

# STEP 5: Train the model
trainer = Trainer(
model=model,
device="cuda:4", # or "cpu"

Suggested change
device="cuda:4", # or "cpu"
device="cuda:0", # or "cpu"

# ---------------------------------------------------------------------------
# STEP 4: Define Optuna objective
# ---------------------------------------------------------------------------
DEVICE = "cuda:2" # or "cpu"

Suggested change
DEVICE = "cuda:2" # or "cpu"
DEVICE = "cuda:0" # or "cpu"
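Rather than hardcoding any GPU index, the examples could derive the device at runtime. A small sketch (`pick_device` is a hypothetical helper; probing with `torch.cuda.is_available()` is the usual PyTorch idiom):

```python
def pick_device(cuda_available: bool, index: int = 0) -> str:
    # Map GPU availability to a device string the trainer accepts,
    # falling back to CPU on machines without a GPU.
    return f"cuda:{index}" if cuda_available else "cpu"

# In an example script one would typically write:
#   import torch
#   DEVICE = pick_device(torch.cuda.is_available())
```

This keeps the examples runnable on single-GPU and CPU-only machines without edits.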

@EricSchrock (Collaborator) left a comment:

Actually, it looks like I have to approve to clear the "Changes requested" status. Approving to unblock.

@Logiquo (Collaborator) left a comment:

LGTM, thanks!

@Logiquo (Collaborator) commented Mar 17, 2026

> [quoting @jhnwu3's comment above about race conditions when multiple jobs share the same cache_dir]

I agree, and I think FileLock should be sufficient to solve this issue.
