
Nebius Token Factory Data Lab
From production logs and existing data to better models.
Data Lab is the Token Factory workspace where production logs and existing data become reusable datasets for model improvement. Explore inference data, filter real-world signal, connect files from S3-compatible storage, and move curated datasets into post-training without stitching tools together.
Stop treating model improvement like a side project. Run it like a loop.
What you get
Faster iteration from production signal
Your production logs already contain the best clues about what the model gets right, where it breaks, and what needs to improve next. Nebius Token Factory Data Lab helps you turn that signal into reusable datasets faster, without a manual export-clean-upload cycle.
Work with data where it lives
Connect files from S3-compatible object storage directly to Token Factory Data Lab. For connected files, Data Lab works from the metadata the workflow needs rather than creating an extra raw-data copy, so the underlying data stays in the storage system your team already controls.
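To make the idea of working from metadata concrete, here is a minimal Python sketch of a manifest that points at objects instead of copying them. It is an illustration only and does not describe how Data Lab is implemented: the `ObjectRef` type, the `build_manifest` helper, the bucket name, and the shape of the `listing` dictionaries are all hypothetical, loosely modeled on the key, size, and ETag fields that an S3-compatible list call typically returns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectRef:
    """Hypothetical pointer to a stored object: metadata only, no file bytes."""
    uri: str
    size_bytes: int
    etag: str


def build_manifest(bucket, listing, suffix=".jsonl"):
    """Build references for matching objects without reading their contents."""
    return [
        ObjectRef(
            uri=f"s3://{bucket}/{obj['key']}",
            size_bytes=obj["size"],
            etag=obj["etag"],
        )
        for obj in listing
        if obj["key"].endswith(suffix)
    ]


# Made-up listing entries standing in for the result of a bucket listing.
listing = [
    {"key": "logs/2024-01.jsonl", "size": 1200, "etag": "a1"},
    {"key": "logs/2024-02.jsonl", "size": 3400, "etag": "b2"},
    {"key": "logs/readme.txt", "size": 50, "etag": "c3"},
]

manifest = build_manifest("team-bucket", listing)
total_bytes = sum(ref.size_bytes for ref in manifest)
```

The manifest holds only locations, sizes, and checksums, so the files themselves never leave the bucket in this sketch.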
One path from data to model improvement
Explore data, curate a dataset, hand it into post-training, and bring the result back into production inside one Token Factory workflow. The value is not one isolated feature. It is the full loop from production signal to better models.
How the loop works
Step 1: Capture signal
Start from Token Factory inference data or bring in existing files from S3-compatible storage.
Step 2: Explore
Inspect, query, and filter the data to understand failure modes, edge cases, and high-value slices.
Step 3: Curate
Turn the useful subset into a reusable dataset for the next model-improvement cycle.
Step 4: Improve and redeploy
Send that dataset into post-training, deploy the improved model, and use the next round of production signal to keep iterating.
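The explore and curate steps above can be sketched in a few lines of plain Python. This is a hedged illustration, not the Data Lab API: the `InferenceRecord` fields, the rating scale, the model names, and the `curate` and `to_jsonl` helpers are assumptions made for the example, and JSON Lines is used only as one common format for training datasets.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class InferenceRecord:
    """Hypothetical shape of one logged inference call."""
    prompt: str
    completion: str
    model: str
    latency_ms: float
    user_rating: Optional[int]  # assumed scale: 1 (bad) to 5 (good), None if unrated


def curate(records, model, max_rating=2):
    """Keep rated, low-scoring responses from one model (a 'failure slice')."""
    return [
        r for r in records
        if r.model == model
        and r.user_rating is not None
        and r.user_rating <= max_rating
    ]


def to_jsonl(records):
    """Serialize the curated subset as JSON Lines, one example per line."""
    return "\n".join(json.dumps(asdict(r)) for r in records)


# Made-up log entries for illustration only.
logs = [
    InferenceRecord("Summarize the report", "(empty)", "model-a", 420.0, 1),
    InferenceRecord("Translate to French", "Bonjour", "model-a", 380.0, 5),
    InferenceRecord("Extract the dates", "none found", "model-b", 510.0, 2),
    InferenceRecord("Classify the ticket", "billing", "model-a", 300.0, None),
]

failures = curate(logs, model="model-a")
dataset = to_jsonl(failures)
```

Filtering on an explicit rating keeps unrated traffic out of the dataset; in practice you would pick whichever signal in your logs best marks a failure.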
Build better models from real data

Explore inference logs and datasets
Use Token Factory Data Lab to inspect production inference data and existing datasets in one place. Filter by the dimensions that matter, isolate failure cases, and understand model behavior without switching tools.

Create reusable datasets
Turn filtered data into reusable datasets for the next iteration. Upload files directly or connect files from S3-compatible storage, and keep the handoff from exploration to curation simple.

Move directly into post-training
Use the datasets you prepared in Token Factory Data Lab as the starting point for post-training workflows in Token Factory. Reduce friction between finding the right data and using it to improve the next model version.
Documentation and guides
Read the docs, walkthroughs, and examples that show how to go from logs and existing data to reusable datasets and model improvement.
From logs to better models
See how to turn production inference data into reusable datasets for the next training cycle.
Working with S3-connected data
Learn how Nebius Token Factory Data Lab handles connected files, metadata, and enterprise-friendly workflows around S3-compatible storage.
Post-training workflows
See how curated datasets move from Data Lab into Token Factory post-training.
One workspace for model iteration
Token Factory Data Lab gives ML teams a single place to work with inference logs, existing datasets, and the path into post-training. It is built for teams that want better data quality, faster iteration, and less workflow friction.

Built for governed model-improvement workflows
Nebius Token Factory Data Lab is designed for teams that want a tighter iteration loop without introducing another fragile data pipeline. Keep data where it already lives, move faster on curation, and keep the path back to model improvement inside Token Factory.

Questions and answers
For S3-connected files, Data Lab works from the metadata needed for the workflow rather than creating an extra raw-data copy in a separate Nebius-managed store.