
OOM for rf3 #198

@Admire7494


The command is:

rf3 fold inputs='/data2/wmq/foundry-production/P2_6ACA_rf.json' ckpt_path='/data2/wmq/.foundry/checkpoints/rf3_foundry_01_24_latest_remapped.ckpt' out_dir=logs/inference_outs/demo/P2_6ACA diffusion_batch_size=1

The JSON file is:

{
  "name": "P2_6ACA",
  "components": [
    {
      "seq": "<a protein sequence of length 1000 aa>",
      "msa_path": "/data2/wmq/foundry-production/P2.a3m",
      "chain_id": "A"
    },
    {
      "ccd_code": "AMP",
      "chain_id": "B"
    },
    {
      "ccd_code": "NDP",
      "chain_id": "C"
    },
    {
      "smiles": "C(CCC(=O)O)CCN",
      "chain_id": "E"
    }
  ]
}

The error is:
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.35 GiB. GPU 0 has a total capacity of 23.65 GiB of which 2.64 GiB is free. Including non-PyTorch memory, this process has 21.00 GiB memory in use. Of the allocated memory 17.12 GiB is allocated by PyTorch, and 2.60 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
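
For reference, a quick check of the figures in that message, written as a minimal Python sketch. The helper name `gib` and the shortened `ERROR` string are only for illustration. It assumes the "reserved but unallocated" memory could in the best case be reused, which the allocator option named in the message is meant to help with but does not guarantee. The last line sets that option for a Python process. For the `rf3` command line, the same variable would need to be set in the shell environment before launching.

```python
import os
import re

# Figures copied from the error message above (text shortened for illustration).
ERROR = (
    "Tried to allocate 4.35 GiB. GPU 0 has a total capacity of 23.65 GiB "
    "of which 2.64 GiB is free. Of the allocated memory 17.12 GiB is allocated "
    "by PyTorch, and 2.60 GiB is reserved by PyTorch but unallocated."
)


def gib(pattern: str, text: str) -> float:
    """Return the GiB figure captured by the first group of `pattern`."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f"pattern not found: {pattern}")
    return float(match.group(1))


requested = gib(r"Tried to allocate ([\d.]+) GiB", ERROR)
free = gib(r"of which ([\d.]+) GiB is free", ERROR)
reserved_unallocated = gib(r"and ([\d.]+) GiB is reserved", ERROR)

# How far the failed allocation is from fitting into free GPU memory.
shortfall = requested - free
# Best case if all reserved-but-unallocated memory were reusable (an assumption).
best_case = free + reserved_unallocated

print(f"shortfall vs. free memory: {shortfall:.2f} GiB")
print(f"free + reserved-but-unallocated: {best_case:.2f} GiB")

# Allocator option suggested by the error message. It has to be set before
# PyTorch initializes CUDA; setdefault keeps any value that is already set.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "expandable_segments:True")
```

So the allocation misses the free memory by about 1.7 GiB, while free plus reserved-but-unallocated memory adds up to slightly more than the 4.35 GiB requested. Whether reducing fragmentation is enough for this input is not something these numbers can confirm.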

Any suggestions for running a sequence this large?
Thanks a lot.

Labels: question