Tenstorrent just dropped serious benchmarks on DeepSeek 671B, and they are worth watching.
On decode, they hit 350 tokens per second. That is more than double Fireworks and Google Vertex at 144 tokens per second, and over 12x faster than Novita at 28. On prefill (100k sequence length), they clock 4.0 seconds, which puts them in the top tier: behind only Google Vertex at 1.4 seconds, and ahead of everyone else at 6.0 to 8.5 seconds.
The cost story is even bigger. At high throughput, Tenstorrent delivers at $6 per million tokens while NVIDIA GPU setups jump to $30 and keep climbing. They compare Galaxy Blackhole with the GB300 NVL72 rack from NVIDIA, quoting SemiAnalysis benchmarks.
The advantage comes from Tenstorrent’s purpose-built inference architecture. It is engineered from the ground up to keep cost per token low even as throughput scales, unlike general-purpose GPUs that were originally designed for training workloads. For DeepSeek 671B specifically, this translates into dramatically better efficiency on the metrics that matter most to AI companies: speed + real dollar cost at high volume.
This is the structural edge they are betting the entire inference market on. The real question is whether Tenstorrent can serve inference at scale, assuming it gains traction within the infrastructure industry.
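For anyone who wants to sanity-check the multiples quoted in this post, here is a rough sketch that recomputes them from the stated figures. The variable names are made up for illustration, and it assumes "100k sequence length" means 100,000 prompt tokens and that the $30 NVIDIA figure is the point being compared against $6.

```python
# Figures as quoted in the post; the names here are illustrative only.
decode_tokens_per_s = {
    "Tenstorrent": 350,
    "Fireworks": 144,
    "Google Vertex": 144,
    "Novita": 28,
}
cost_per_million_usd = {"Tenstorrent": 6, "NVIDIA GPU": 30}

# Assumption: "100k sequence length" means 100,000 prompt tokens.
prefill_tokens = 100_000
prefill_seconds = 4.0


def decode_speedup(baseline: str) -> float:
    """Ratio of Tenstorrent's quoted decode rate to the named provider's."""
    return decode_tokens_per_s["Tenstorrent"] / decode_tokens_per_s[baseline]


speedup_vs_fireworks = decode_speedup("Fireworks")  # same figure as Google Vertex
speedup_vs_novita = decode_speedup("Novita")
cost_ratio = cost_per_million_usd["NVIDIA GPU"] / cost_per_million_usd["Tenstorrent"]
prefill_tokens_per_s = prefill_tokens / prefill_seconds

print(f"Decode vs Fireworks / Google Vertex: {speedup_vs_fireworks:.2f}x")
print(f"Decode vs Novita: {speedup_vs_novita:.2f}x")
print(f"Cost per million tokens, NVIDIA vs Tenstorrent: {cost_ratio:.1f}x")
print(f"Implied prefill rate: {prefill_tokens_per_s:,.0f} tokens/s")
```

That works out to roughly 2.4x and 12.5x on decode, 5x on cost at the $30 point, and an implied prefill rate of about 25,000 tokens per second, so the "more than double" and "over 12x" claims are consistent with the numbers given.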
SuperCluster 36 up and running. 4 Galaxy all to all in a torus. 9 Quads all to all connected. Looks like one computer to software.
More silicon, faster computer
MolmoBot, our open robotic manipulation suite trained entirely in simulation, now has code, training data, a data generation pipeline, & evals all available.
This puts our robotics models within reach of any research lab—no extensive real-world data collection required. 🧵
Programming microcontrollers with Swift has never been easier. @kubamracek introduces a new repository of example projects to help get you started. https://t.co/yf4fNpoEtl
We’re dedicated to sharing our work @browsercompany - so today we’re publishing our first post on building rich native experiences on Windows with Swift & open sourcing our swift-firebase repo
First up, interoperability! Windows APIs, COM, C++ and how they integrate with Swift🧵
We’ve updated SwiftSyntax to use the new parser written in Swift!
Performance is up to 15% better, release binaries are half their previous size, and binaries are more portable on Linux.
Kudos to the SwiftSyntax folks who fixed several bugs we reported very quickly.
We have internships available in the programming languages, compilers, debuggers, and development infrastructure teams at Apple! You'll learn firsthand about these thrilling topics from some really awesome folks; no prior experience required! See here: https://t.co/6XqWGuMXw0