Investigate whether short circuit if statements speed up task sample processing

The `mortality_prediction`, `drug_recommendation`, `length_of_stay_prediction`, and `readmission_prediction` tasks all exclude any visit that is missing "conditions", "procedures", or "drugs", so a visit is kept only if it has all three. The first three tasks exclude such visits with the following pattern.

```py
if len(conditions) * len(procedures) * len(drugs) == 0:
    continue
```

However, the `readmission_prediction` task short-circuits instead: it skips generating the procedures and drugs lists if there are no conditions, and skips generating the drugs list if there are no procedures.

```py
conditions = ...
if len(conditions) == 0:
    continue

procedures = ...
if len(procedures) == 0:
    continue

drugs = ...
if len(drugs) == 0:
    continue
```

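The two approaches are logically equivalent. The question is whether the latter is more performant. The first pattern always builds all three lists before checking them, and Python will not skip that work on its own, whereas the second stops as soon as one list is empty. Any gain therefore depends on how expensive list generation is and on how many visits end up being excluded.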
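As a rough illustration of the difference, here is a minimal sketch that runs both patterns over synthetic visits and counts how many lists each one builds. `build_list`, `product_check`, `early_exit`, and the synthetic `visits` data are all hypothetical stand-ins and not the actual task code, so the timings say nothing about real datasets.

```python
import timeit

# Counts how many times a list is built, so the saved work is visible.
CALLS = {"count": 0}


def build_list(raw):
    """Hypothetical stand-in for whatever builds a code list for a visit."""
    CALLS["count"] += 1
    return [code for code in raw if code]


def product_check(visits):
    """Pattern 1: build all three lists, then check the product of lengths."""
    samples = []
    for raw_conditions, raw_procedures, raw_drugs in visits:
        conditions = build_list(raw_conditions)
        procedures = build_list(raw_procedures)
        drugs = build_list(raw_drugs)
        if len(conditions) * len(procedures) * len(drugs) == 0:
            continue
        samples.append((conditions, procedures, drugs))
    return samples


def early_exit(visits):
    """Pattern 2: check each list right after building it."""
    samples = []
    for raw_conditions, raw_procedures, raw_drugs in visits:
        conditions = build_list(raw_conditions)
        if len(conditions) == 0:
            continue
        procedures = build_list(raw_procedures)
        if len(procedures) == 0:
            continue
        drugs = build_list(raw_drugs)
        if len(drugs) == 0:
            continue
        samples.append((conditions, procedures, drugs))
    return samples


# Synthetic data (assumption): half of the visits have no conditions.
visits = [(["c1"], ["p1"], ["d1"]), ([], ["p1"], ["d1"])] * 5000

if __name__ == "__main__":
    for fn in (product_check, early_exit):
        CALLS["count"] = 0
        seconds = timeit.timeit(lambda: fn(visits), number=20)
        print(f"{fn.__name__}: {seconds:.3f}s, {CALLS['count']} list builds")
```

Both functions return the same samples. The early-exit version simply builds fewer lists when visits are excluded early, so the real-world benefit hinges on the cost of each list build in the actual tasks.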

I did a quick test on one of the small demo/synthetic datasets (I don't remember which one) and saw no noticeable difference in sample generation time. However, a difference between the two approaches may well show up when sample generation runs on the full MIMIC3 and/or MIMIC4 datasets.

Run the test(s) on the full dataset(s), and if performance is noticeably different, update all the tasks to use the short-circuit `if` statements.
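One possible way to structure that comparison is a small timing harness like the sketch below. It assumes each variant of sample generation can be wrapped in a zero-argument callable. `time_runs`, `compare`, and the callables passed to them are hypothetical names, not part of the existing codebase.

```python
import statistics
import time


def time_runs(generate_samples, repeats=3):
    """Return the wall-clock seconds taken by each call to generate_samples()."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        generate_samples()
        durations.append(time.perf_counter() - start)
    return durations


def compare(baseline, candidate, repeats=3):
    """Compare median run times of two zero-argument callables."""
    base = statistics.median(time_runs(baseline, repeats))
    cand = statistics.median(time_runs(candidate, repeats))
    return {
        "baseline_s": base,
        "candidate_s": cand,
        # Values above 1.0 mean the candidate was faster than the baseline.
        "speedup": base / cand if cand > 0 else float("inf"),
    }
```

Here `baseline` would wrap sample generation with the product check and `candidate` the short-circuit version. Taking the median over a few repeats helps smooth out disk caching and other noise on large datasets.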

Investigate whether short circuit if statements speed up task sample processing #843
