Philipp Moritz, Tyler Griggs, and the SkyRL Team
🗓️ Posted: October 14, 2025
<aside>
We are happy to announce SkyRL tx 0.0.2 🥳
SkyRL tx is an open-source library that implements a backend for the Tinker API and allows people to set up their own Tinker-like service on their own hardware (see the connection sketch below the aside).
With v0.0.2, SkyRL tx should actually be useful for training models now that it supports the most important LoRA layers.
See the code at SkyRL tx on GitHub
</aside>
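To make the idea concrete, here is a minimal sketch of what pointing the Tinker Python client at a self-hosted SkyRL tx backend could look like. The `base_url` value and the port are assumptions for illustration and depend on how you launch the server.

```python
import tinker

# Assumption: a SkyRL tx server that speaks the Tinker API is already
# running locally; adjust base_url to wherever you started it.
service_client = tinker.ServiceClient(base_url="http://localhost:8000")

# Create a LoRA training client for a base model served by the backend.
training_client = service_client.create_lora_training_client(
    base_model="Qwen/Qwen3-4B",
    rank=32,  # LoRA rank for this adapter
)
```

From here, the usual Tinker workflow of forward/backward passes and optimizer steps applies; the full Qwen/Qwen3-4B example is at the end of this post.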
Updates
Since the initial release, we have made a number of enhancements:
- We now support MultiLoRA for the MoE model (#432) – thanks to Sarah for the insightful comments!
- We now support MultiLoRA for the attention layers (#435); see the first sketch after this list for what MultiLoRA looks like
- We now support different LoRA ranks for each model (#405)
- We now support downloading checkpoints (#433) – thanks Lucas for the contribution!
- We now support JIT compilation, which speeds up the forward and backward passes (#452)
- We now support both gradient accumulation and microbatches, which allows training models in memory-constrained settings (#460); see the second sketch after this list
- We also did a number of internal cleanups and future-proofing that will make it easier to modify and maintain the project going forward (#416, #415, #464) – thanks to Kranthi for contributing!
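To give a flavor of what MultiLoRA means in practice, the first sketch below shows a LoRA linear layer in which every sequence in a batch selects its own adapter. The shapes, names, and scaling convention are illustrative assumptions, not the exact SkyRL tx implementation, and a single shared rank across adapters is assumed for simplicity.

```python
import jax.numpy as jnp

def multilora_linear(x, W, A, B, adapter_ids, alpha=16.0):
    """LoRA forward pass where each sequence uses its own adapter.

    x:           (batch, seq, d_in)      activations
    W:           (d_in, d_out)           shared base weight
    A:           (n_adapters, d_in, r)   per-adapter LoRA A matrices
    B:           (n_adapters, r, d_out)  per-adapter LoRA B matrices
    adapter_ids: (batch,)                adapter index for each sequence
    """
    r = A.shape[-1]
    base = x @ W                          # shared base projection
    A_sel = A[adapter_ids]                # (batch, d_in, r)
    B_sel = B[adapter_ids]                # (batch, r, d_out)
    delta = jnp.einsum("bsd,bdr,bro->bso", x, A_sel, B_sel)
    return base + (alpha / r) * delta
```

Supporting different LoRA ranks, as mentioned above, typically means either padding the stacked A/B tensors to a maximum rank or keeping adapters in separate buffers; the sketch glosses over that.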
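The second sketch shows, under our own simplifying assumptions about the model and loss, how JIT compilation and gradient accumulation over microbatches fit together in JAX. It is not the SkyRL tx training loop; it only illustrates the pattern of accumulating gradients over slices of a batch that each fit in memory.

```python
import jax
import jax.numpy as jnp

@jax.jit  # compile the whole accumulation step once, reuse it every batch
def accumulate_grads(params, microbatches):
    """Average gradients over a stack of microbatches.

    microbatches: dict of arrays with a leading axis of size n_microbatches,
    so each slice fits in memory even when the full batch would not.
    """
    def loss_fn(p, batch):
        # Placeholder loss; the real model and loss live in SkyRL tx.
        pred = batch["x"] @ p["w"]
        return jnp.mean((pred - batch["y"]) ** 2)

    def one_step(grads, batch):
        g = jax.grad(loss_fn)(params, batch)
        grads = jax.tree_util.tree_map(jnp.add, grads, g)
        return grads, None

    zero = jax.tree_util.tree_map(jnp.zeros_like, params)
    grads, _ = jax.lax.scan(one_step, zero, microbatches)
    n = next(iter(microbatches.values())).shape[0]
    return jax.tree_util.tree_map(lambda g: g / n, grads)
```

A caller would split one large batch into equally sized microbatches stacked on a new leading axis, run the accumulation once per optimizer step, and then apply the averaged gradients.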
There are also some in-flight PRs (#461, #466, #470), and we hope to cut a new release that includes them soon. Thanks to Lucas and Colin for the contributions!
There are still several missing features and remaining limitations, and we hope more people will use and contribute to the project as we continue to iterate.
To make it easier to get started, we show how you can train Qwen/Qwen3-4B at the end of this blog post.
Roadmap
We are continuing to iterate on SkyRL tx and want to give you a sense of our plan:
- Performance improvements: There’s a lot to do on this front, from working out the different sharding mechanisms (e.g. sequence parallel, pipeline parallel, expert parallel) to optimizing the current JAX code and improving memory utilization (see the sketch after this list). Also, if needed, we can integrate more optimized kernels for the different platforms we support (like GPUs or TPUs).
- Testing and validation: More accuracy testing to further validate the correctness of the implementation.
- LoRA coverage: Add support for remaining LoRA layers (e.g., embedding layer) and implement configuration parameters to select which layers to target with LoRA.
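As a rough illustration of the sharding work in the first roadmap item, the sketch below builds a JAX device mesh with a named tensor-parallel axis and shards a weight matrix across it. The axis names and the partitioning choice are assumptions for illustration, not the layout SkyRL tx will necessarily adopt.

```python
import numpy as np
import jax
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Arrange the available devices into a 1D mesh with a named "tensor" axis.
# Real deployments would typically use a 2D/3D mesh (data x tensor x expert).
devices = np.array(jax.devices())
mesh = Mesh(devices, axis_names=("tensor",))

# Shard a weight matrix column-wise across the "tensor" axis; rows stay replicated.
weight_sharding = NamedSharding(mesh, P(None, "tensor"))
w = jax.device_put(np.zeros((4096, 11008), dtype=np.float32), weight_sharding)
print(w.sharding)  # shows how the matrix is laid out across devices
```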