Here's a stack for building custom agent eval pipelines.
For AI researchers and engineers who care about the quality of AI outputs, this is a practical tutorial with building blocks to scale your evals.
Offload batch processing to Tracto.
https://t.co/M6hBvYdcqg
While you wait, run serverless GPU workloads on us. We don't send them to AWS; we process them in our own data center. Dynamic scaling, pay as you go.
https://t.co/0zBxJUbz9S - 30% cheaper than Modal
Playground cluster with samples at https://t.co/lfFFGuucsE
Yep. Expect a lot of internet services to break soon unless things get better. Can’t pull images from ECR for instance.
The Modal team is randomly at an offsite in Italy so we’re all hands on deck!
AI is pushing traditional data stacks to their breaking point. ⚠️
In his new blog, Maxim Akhmedov, Head of TractoAI at Nebius, explains why unified, scalable infra is essential for multimodal data + inference at AI scale and why DIY point solutions are too costly to manage.
This is a must-read for platform teams.
👉 See the blog: https://t.co/xo25WbS8gi
Excited to see @pleiasfr featured in @sciencedirect.
Our AI infra allowed the team to run multi node LLM training with maximum customizations and control.
IN NEWS: Nebius lands a $17.4B partnership with Microsoft to provide high-performance AI infrastructure.
The CRO @marcboroditsky told us: “We believe the real opportunity is delivering the AWS equivalent for AI.”
“The $100B+ business opportunity is servicing the hyperscaler-level requirements of this new class of AI customers, while also supporting the single engineer building from scratch as they scale.”
One of the key components for training code #agents is access to verifiable real-world tasks, which are not easy to collect. We automated and scaled this process using @TractoAI as our main platform for data processing and storage, and decided to #opensource it. 3/4
If you're itching to experiment with gpt-oss-120b using vLLM, here's a working notebook on our GPU sandbox.
Run, experiment, share for free.
https://t.co/ehHLY56XBD
Our customer SieveStack is building the world’s largest dataset of molecular simulations to train a multi-layered stack of foundational models and advance dynamics-driven drug discovery. Read the story: https://t.co/bitTp032df
🔹 Goal:
To unlock treatment pathways for hard-to-treat diseases with AI-powered, physics-based modeling and biochemistry.
🔹 Solution:
To generate high-precision molecular dynamics simulations and optimize model training with a mixed-precision approach — FP32 for accuracy and BF16 for performance.
🔹 Result:
SieveStack worked with Nebius and @TractoAI support teams to prototype, debug and scale foundational models with GPU utilization of more than 90%, revealing drug-target interactions beyond the reach of lab experiments.
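To illustrate the FP32 vs BF16 trade-off mentioned above, here is a minimal sketch in plain Python. It is not SieveStack's code: the helper names are hypothetical, and BF16 is emulated by truncating a float32 to its top 16 bits, whereas real hardware typically rounds to nearest.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_bfloat16(x: float) -> float:
    """Emulate BF16 by keeping only the top 16 bits of the float32 encoding.

    Assumption: plain truncation, used here for simplicity.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


value = 1.001
fp32_value = to_float32(value)   # keeps 23 fraction bits
bf16_value = to_bfloat16(value)  # keeps 7 fraction bits
```

Near 1.0, BF16 values are spaced 2^-7 apart, so 1.001 collapses to exactly 1.0, while the FP32 round trip stays within about 1e-7 of the input. That gap is why accuracy-sensitive steps are often kept in FP32 while throughput-bound steps run in BF16.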
Anyone at ICML wanna see the future of world models? We're walking around with a laptop running our world model at 500 FPS+ (fully local). Would love to demo/chat with anyone interested
What if models could learn which problems _deserve_ deep thinking?
No labels. Just let the model discover difficulty through its own performance during training.
Instead of burning compute 🔥💸 on trivial problems, it allocates 5x more on problems that actually need it ↓
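A minimal sketch of the allocation idea, not the method from this thread: it assumes difficulty can be approximated by how often a problem was failed in earlier attempts, and splits a fixed sampling budget in proportion. The name `allocate_budget`, the `(successes, attempts)` input format, and the per-problem minimum are all hypothetical.

```python
def allocate_budget(stats, total_budget, min_per_problem=1):
    """Split a sampling budget across problems in proportion to past failures.

    stats: dict mapping problem id -> (successes, attempts) from earlier rollouts.
    Returns a dict mapping problem id -> number of samples to spend next.
    """
    spare = total_budget - min_per_problem * len(stats)
    if spare < 0:
        raise ValueError("budget too small for the per-problem minimum")

    # Assumption: failure count is a usable proxy for difficulty.
    failures = {pid: attempts - ok for pid, (ok, attempts) in stats.items()}
    total_failures = sum(failures.values())

    budget = {}
    for pid, failed in failures.items():
        if total_failures > 0:
            extra = spare * failed // total_failures
        else:
            # Nothing failed yet, so spread the spare budget evenly.
            extra = spare // len(stats)
        budget[pid] = min_per_problem + extra
    return budget


plan = allocate_budget({"easy": (9, 10), "hard": (5, 10)}, total_budget=62)
```

With one easy problem (9 of 10 solved), one hard problem (5 of 10 solved), and a budget of 62 samples, this gives the hard problem 51 samples and the easy one 11.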
Free H100 for AI experiments, no strings attached. We just dropped a dozen notebooks with production-grade code for LLM tuning, inference and even data prep. Head on over to https://t.co/lfFFGutED6 and spin up your job in less than 30 seconds.
AI workloads of the near future will be data intensive. We are building for it. Excited to be the data management infra for the @nebiusai AI R&D team. Onwards!
Our AI R&D team presents SWE-rebench, a new benchmark for evaluating agentic LLMs on a continuously updated and decontaminated set of real-world software engineering tasks mined from real GitHub repos.
Explore the leaderboard and methodology behind it: https://t.co/WGOcul8eo9
Announcing the greatest bundle for AI & data teams!
TractoAI now supports https://t.co/p1a9bh1LT0 - with built-in storage, job scheduler, workflow manager, and observability dashboard.
Why manage 5-10 different tools and vendors? Get one integrated stack and save $$$
Read how @synth_labs, a startup developing AI solutions tailored for logical reasoning, is advancing AI post-training with @TractoAI: https://t.co/WvwvcYshkL
🔹 Goal:
Develop an ML system that empowers reasoning models to surpass pattern matching and implement sophisticated search and exploration strategies.
🔹 Solution:
Build scalable training infrastructure. Reasoning models require large datasets and distributed computing, making multi-node training and high-performance GPUs essential for effective results.
🔹 Results:
Using TractoAI, a serverless platform on Nebius AI Cloud, SynthLabs trains AI reasoning models, laying the foundation for next-gen reasoning systems and enterprise use.
Our approach to AI compute is different. Learn about the serverless GPU model and when it's a better choice than dedicated compute.
We are speaking at the @nebiusai event this Thursday in SF.
https://t.co/ipN9JXwKQH