Description
The mortality_prediction, drug_recommendation, length_of_stay_prediction, and readmission_prediction tasks all exclude visits that are missing any of "conditions", "procedures", or "drugs". The first three exclude them with the following pattern.
    if len(conditions) * len(procedures) * len(drugs) == 0:
        continue

However, the readmission_prediction task short-circuits: it skips generating the procedures and drugs lists if there are no conditions, and skips generating the drugs list if there are no procedures.
    conditions = ...
    if len(conditions) == 0:
        continue
    procedures = ...
    if len(procedures) == 0:
        continue
    drugs = ...
    if len(drugs) == 0:
        continue

The two approaches are logically equivalent: both keep exactly the same visits. The question is whether the latter is more performant, since it avoids building lists for visits that will be discarded anyway. It's hard to say how much that saves in practice without measuring.
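To make the difference in work concrete, here is a minimal sketch that counts how many lookups each pattern performs. The `VISITS` data and the `get_codes` helper are hypothetical stand-ins for the real per-visit queries, not code from the tasks themselves.

```python
from collections import Counter

# Hypothetical visits: each maps an event table name to its list of codes.
VISITS = [
    {"conditions": ["c1"], "procedures": ["p1"], "drugs": ["d1"]},
    {"conditions": [], "procedures": ["p2"], "drugs": ["d2"]},
    {"conditions": ["c3"], "procedures": [], "drugs": ["d3"]},
    {"conditions": ["c4"], "procedures": ["p4"], "drugs": []},
]


def get_codes(visit, table, calls):
    """Stand-in for the real per-visit lookup; records each call."""
    calls[table] += 1
    return visit[table]


def filter_product(visits):
    """Build all three lists first, then check them together."""
    calls, kept = Counter(), []
    for i, visit in enumerate(visits):
        conditions = get_codes(visit, "conditions", calls)
        procedures = get_codes(visit, "procedures", calls)
        drugs = get_codes(visit, "drugs", calls)
        if len(conditions) * len(procedures) * len(drugs) == 0:
            continue
        kept.append(i)
    return kept, calls


def filter_short_circuit(visits):
    """Check each list as soon as it is built and skip the rest if empty."""
    calls, kept = Counter(), []
    for i, visit in enumerate(visits):
        conditions = get_codes(visit, "conditions", calls)
        if len(conditions) == 0:
            continue
        procedures = get_codes(visit, "procedures", calls)
        if len(procedures) == 0:
            continue
        drugs = get_codes(visit, "drugs", calls)
        if len(drugs) == 0:
            continue
        kept.append(i)
    return kept, calls


kept_a, calls_a = filter_product(VISITS)
kept_b, calls_b = filter_short_circuit(VISITS)
print(kept_a == kept_b)                              # same visits survive
print(sum(calls_a.values()), sum(calls_b.values()))  # lookups performed
```

On this toy data both patterns keep the same visit, but the short-circuit version performs fewer lookups. Whether that translates into a measurable speedup depends on how expensive the real lookups are and how many visits are filtered out.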
I did a quick test on one of the small demo/synthetic datasets (I don't remember which one) and there was not a noticeable difference in sample generation time. However, it's very possible we would notice a difference between the two approaches if we ran sample generation on the full MIMIC3 and/or MIMIC4 datasets.
Proposed next step: run sample generation on the full datasets and, if the performance is noticeably different, update all the tasks to use the short-circuit ifs.
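A timing harness for that comparison could look something like the sketch below. `best_time` and `dummy_sample_generation` are hypothetical names; the dummy workload is only a placeholder and would need to be swapped for a call that runs the real task on the full dataset.

```python
import time


def best_time(fn, repeats=3):
    """Return the fastest wall-clock time in seconds over `repeats` calls."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def dummy_sample_generation():
    # Placeholder workload (assumption); replace with the real sample
    # generation call for each variant of the task being compared.
    return sum(i * i for i in range(10_000))


elapsed = best_time(dummy_sample_generation)
print(f"best of 3: {elapsed:.6f} s")
```

Taking the best of several runs reduces noise from caching and other processes, which matters when the expected difference between the two patterns is small.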