Training: make resuming training from a checkpoint simple

Resuming training from a checkpoint is currently quite hard: you need to hardcode a specific checkpoint path.

It is also buggy:
* Resumed training restarts the iteration count at 0, so new checkpoints overwrite the existing ones unless you use a different path
* wandb also overwrites old snapshots with new ones
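
One possible shape for the fix, as a rough sketch rather than a proposal for the actual implementation: discover the latest checkpoint in the output directory and continue counting from its iteration, so no path is hardcoded and nothing is overwritten. The `checkpoint_<iteration>.pt` naming scheme and the helper names `find_latest_checkpoint`, `resume_state`, and `checkpoint_path` are assumptions made for illustration; they are not taken from the codebase.

```python
import re
from pathlib import Path
from typing import Optional, Tuple

# Hypothetical naming scheme (an assumption): checkpoint_<iteration>.pt
CHECKPOINT_PATTERN = re.compile(r"^checkpoint_(\d+)\.pt$")


def find_latest_checkpoint(checkpoint_dir) -> Optional[Tuple[Path, int]]:
    """Return (path, iteration) of the highest-numbered checkpoint, or None."""
    directory = Path(checkpoint_dir)
    if not directory.is_dir():
        return None
    best = None
    for path in directory.iterdir():
        match = CHECKPOINT_PATTERN.match(path.name)
        if match is None:
            continue  # ignore files that do not follow the naming scheme
        iteration = int(match.group(1))  # compare numerically, not as strings
        if best is None or iteration > best[1]:
            best = (path, iteration)
    return best


def resume_state(checkpoint_dir) -> Tuple[Optional[Path], int]:
    """Return (checkpoint to load or None, iteration to start from)."""
    latest = find_latest_checkpoint(checkpoint_dir)
    if latest is None:
        return None, 0  # fresh run: nothing to load, start at iteration 0
    path, iteration = latest
    return path, iteration + 1  # continue counting after the saved iteration


def checkpoint_path(checkpoint_dir, iteration: int) -> Path:
    """Build the path for a new checkpoint at the given iteration."""
    return Path(checkpoint_dir) / f"checkpoint_{iteration}.pt"
```

With this, a training script would call `resume_state(output_dir)` at startup, load the returned checkpoint if there is one, and save new checkpoints via `checkpoint_path(output_dir, iteration)`, so later iterations never collide with earlier files. On the wandb side, `wandb.init` accepts `id` and `resume` arguments for continuing an existing run; whether that alone prevents the snapshot overwrite described above has not been verified here.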