@qwfy qwfy commented Jan 16, 2026

Fixes #{issue number}
None.

Description:

Allow passing extra arguments to the distributed backend. This enables the following usage pattern:

torchrun ... main.py ...

# Note that nproc_per_node is not specified.
with idist.Parallel(backend=foo, timeout=bar) as par:
    ....

This was not previously possible.
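To make the forwarding concrete, here is a minimal sketch of the pattern using stand-ins rather than ignite itself. `FakeParallel` and `fake_initialize` are hypothetical names invented for illustration; the real `idist.Parallel` is more involved and may differ in its details. The assumption is that when a backend is given but `nproc_per_node` is not, the workers already exist (e.g. started by torchrun), so the extra keyword arguments should go to the initializer.

```python
from datetime import timedelta


def fake_initialize(backend, **kwargs):
    """Hypothetical stand-in for a process-group initializer."""
    return {"backend": backend, **kwargs}


class FakeParallel:
    """Hypothetical stand-in for a Parallel-style context manager."""

    def __init__(self, backend=None, nproc_per_node=None, **kwargs):
        self.backend = backend
        self.nproc_per_node = nproc_per_node
        self.kwargs = kwargs
        self.init_config = None

    def __enter__(self):
        # Workers were already started by an external launcher, so only
        # the process group needs initializing; forward the extra kwargs.
        if self.backend is not None and self.nproc_per_node is None:
            self.init_config = fake_initialize(self.backend, **self.kwargs)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        return False  # do not suppress exceptions


if __name__ == "__main__":
    with FakeParallel(backend="gloo", timeout=timedelta(seconds=30)) as par:
        print(par.init_config)
```

In this sketch, passing `nproc_per_node` skips the initializer branch entirely, which mirrors the distinction between launching workers and joining an existing group.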

Note that kwargs in _NativeDistModel._create_from_backend is unused, so arguments passed via spawn_kwargs may still be silently ignored. Explicitly listed arguments such as timeout do work.
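The caveat above can be illustrated with a small sketch. `create_from_backend` here is a hypothetical stand-in, not ignite's actual method: it names `timeout` explicitly and accepts `**kwargs` without reading it. The helper `unnamed_kwargs` (also hypothetical) uses the standard `inspect` module to report which passed arguments would fall into the catch-all.

```python
import inspect
from datetime import timedelta


def create_from_backend(backend, timeout=None, **kwargs):
    """Hypothetical stand-in: `timeout` is honored, `kwargs` is never read."""
    config = {"backend": backend}
    if timeout is not None:
        config["timeout"] = timeout
    return config


def unnamed_kwargs(func, passed):
    """Return the keys in `passed` that are not explicitly named parameters of `func`."""
    named = {
        name
        for name, param in inspect.signature(func).parameters.items()
        if param.kind
        in (inspect.Parameter.POSITIONAL_OR_KEYWORD, inspect.Parameter.KEYWORD_ONLY)
    }
    return sorted(key for key in passed if key not in named)


if __name__ == "__main__":
    args = {"timeout": timedelta(seconds=5), "some_option": 1}
    print(create_from_backend("gloo", **args))
    print(unnamed_kwargs(create_from_backend, args))
```

A check like this could be used to warn callers when an argument would be dropped, though whether that is desirable here is a design decision for the maintainers.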

Check list:

  • New tests are added (if a new feature is added)
  • New doc strings: description and/or example code are in RST format
  • Documentation is updated (if required)

@github-actions github-actions bot added the module: distributed Distributed module label Jan 16, 2026
Collaborator

vfdev-5 commented Jan 19, 2026

Thanks for the PR @qwfy
It makes sense, but reusing spawn_kwargs as kwargs for the initialization method can be misleading.

Let's rename spawn_kwargs to kwargs and update the docstring:

- spawn_kwargs: kwargs to ``idist.spawn`` function.
+ kwargs: kwargs to the function responsible for initializing a process group, e.g. ``idist.spawn`` or ``idist.initialize``.
