The OpenThoughts-Agent project has been running RL training with SkyRL-train and Harbor for a while.
The Harbor + SkyRL-train integration lets users run RL training on terminal-use style tasks while focusing only on the data.
See the initial release: https://www.openthoughts.ai/blog/agent
For that project, all the code resided in a fork (so that we could make project-specific hot fixes): https://github.com/mlfoundations/SkyRL
This issue tracks upstreaming those changes to the main branch of SkyRL; the upstreamed versions will be much more robust than what is currently on main.
- [train] Make RayPPOTrainer.train to be async, unifying event loop #868 (see the async trainer sketch after this list)
- [train][OpenAI] Make engine_init_kwargs.chat_template config apply to vllm /chat/completions #890
- [train] Enable custom chat template for get_response_ids_and_loss_mask_from_messages #981 (see the chat-template loss-mask sketch after this list)
- 3/N: step-wise training
- 4/N: async RL + Harbor
- 5/N: Integrate with Harbor's QueueOrchestrator to let Harbor handle retries and concurrency limits: Queue orchestrator laude-institute/harbor#527 (see the orchestration sketch after this list)
- Integration tests and Harbor configs (ensure all the defaults still work)
- Other small ones
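
For reference, here is a minimal sketch of what driving the whole train loop from a single asyncio event loop can look like. It only illustrates the pattern behind the async-train item above; `AsyncTrainer`, `generate_rollouts`, and `update_policy` are hypothetical names, not SkyRL's actual classes or methods.

```python
# Minimal sketch of a PPO-style train loop driven from one asyncio event loop.
# All names here (AsyncTrainer, generate_rollouts, update_policy) are
# hypothetical illustrations, not SkyRL's actual API.
import asyncio


class AsyncTrainer:
    def __init__(self, num_steps: int):
        self.num_steps = num_steps

    async def generate_rollouts(self, step: int) -> list[dict]:
        # Stand-in for awaiting async inference (e.g. HTTP calls to an
        # OpenAI-compatible server) without blocking the event loop.
        await asyncio.sleep(0)
        return [{"step": step, "reward": 0.0}]

    async def update_policy(self, rollouts: list[dict]) -> None:
        # Stand-in for the (typically synchronous) optimizer step; pushing it
        # to a thread keeps the event loop responsive.
        await asyncio.to_thread(lambda: None)

    async def train(self) -> None:
        # One async entry point: rollout generation and policy updates share
        # the same loop instead of spinning up nested event loops.
        for step in range(self.num_steps):
            rollouts = await self.generate_rollouts(step)
            await self.update_policy(rollouts)


if __name__ == "__main__":
    asyncio.run(AsyncTrainer(num_steps=2).train())
```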
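
Similarly, a rough sketch of how a custom chat template can feed both tokenization and the assistant-token loss mask, using Hugging Face's `apply_chat_template`. The helper name and the prefix-masking shortcut are assumptions for illustration, not the actual `get_response_ids_and_loss_mask_from_messages` implementation.

```python
# Sketch of deriving response token ids and an assistant-only loss mask from a
# messages list with an (optional) custom chat template. The assumption that
# the prompt tokenization is a strict prefix of the full tokenization is an
# illustrative simplification, not SkyRL's actual implementation.
from transformers import AutoTokenizer


def response_ids_and_loss_mask(messages, tokenizer, chat_template=None):
    # Ids up to and including the generation prompt (everything before the
    # final assistant message).
    prompt_ids = tokenizer.apply_chat_template(
        messages[:-1],
        chat_template=chat_template,
        add_generation_prompt=True,
        tokenize=True,
    )
    # Ids for the full conversation, including the assistant response.
    full_ids = tokenizer.apply_chat_template(
        messages,
        chat_template=chat_template,
        add_generation_prompt=False,
        tokenize=True,
    )
    # Compute loss only on tokens the policy generated; this assumes the
    # template renders the prompt identically in both calls.
    loss_mask = [0] * len(prompt_ids) + [1] * (len(full_ids) - len(prompt_ids))
    return full_ids, loss_mask


if __name__ == "__main__":
    tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
    msgs = [
        {"role": "user", "content": "List the files in the current directory."},
        {"role": "assistant", "content": "ls -la"},
    ]
    ids, mask = response_ids_and_loss_mask(msgs, tok)
    print(len(ids), sum(mask))
```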
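
Finally, a generic sketch of the queue-plus-retries pattern the QueueOrchestrator item is about: a bounded pool of workers pulls tasks from a queue and retries failures with backoff. Harbor's real API is defined in laude-institute/harbor#527; everything below is hypothetical.

```python
# Generic sketch of queue-based orchestration with a concurrency limit and
# per-task retries. Names and structure are illustrative only; Harbor's actual
# QueueOrchestrator is tracked in laude-institute/harbor#527.
import asyncio


async def run_with_retries(task_id: int, max_retries: int = 3) -> str:
    for attempt in range(1, max_retries + 1):
        try:
            await asyncio.sleep(0)  # stand-in for running a terminal-use task
            return f"task {task_id}: ok"
        except Exception:
            if attempt == max_retries:
                raise
            await asyncio.sleep(2 ** attempt)  # exponential backoff before retrying


async def orchestrate(task_ids, max_concurrency: int = 4):
    queue = asyncio.Queue()
    for tid in task_ids:
        queue.put_nowait(tid)
    results = []

    async def worker():
        while True:
            try:
                tid = queue.get_nowait()
            except asyncio.QueueEmpty:
                return
            results.append(await run_with_retries(tid))

    # The number of workers bounds how many tasks run concurrently.
    await asyncio.gather(*(worker() for _ in range(max_concurrency)))
    return results


if __name__ == "__main__":
    print(asyncio.run(orchestrate(range(8))))
```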