Part of Red River West

Macrodata Labs

Every strong model starts with great data

stage
Pre-seed
founders
Guilherme Penedo, Hynek Kydlíček
location
Paris

# about

Macrodata Labs builds training-data infrastructure for physical AI. Its open-source framework, Refiner, turns raw multimodal robotics data (trajectories, camera feeds, audio, language) into high-quality training datasets. It runs locally for development and scales to an elastic serverless cloud with a single command. Built by the team behind FineWeb, one of the largest open LLM pre-training datasets.

# project

FIRST COMMIT: Jan 2026