Wai Tong @wt_chung - Twitter Profile

wt_chung retweeted

about 1 month ago

Last week we announced DeepSeek-V4. Today we’re sharing a closer look at DeepSeek-V4 Pro on Together AI: 512K context, controllable reasoning modes, and cached-input pricing for long-context workloads. Read more: https://t.co/T1mlIq10Cr

2

24

4

5

3K

wt_chung retweeted

John Hewitt @johnhewtt

about 1 month ago

New paper! Subliminal learning—transferring hidden signals between language models—is more powerful than we thought. By biasing the teacher with a steering vector instead of a prompt, we achieve strong, consistent transfer, which we use to study its mechanisms. w/@GeorgeMorgulis

6

302

35

198

20K

wt_chung retweeted

Shang Zhu @ShangZhu18

3 months ago

Please check out our recent work on understanding the divide-and-conquer approach for long-context tasks! Led by @nehzux

0

4

1

0

395

wt_chung retweeted

Together AI @togethercompute

3 months ago

Introducing the official Together MCP server! Use it in your favorite coding agent to build AI apps, fine-tune models, or spin up clusters faster.

3

14

3

2K

wt_chung retweeted

Together AI @togethercompute

3 months ago

We’re back at #NVIDIAGTC and excited about this year’s lineup. Join us for sessions featuring leaders across Together AI, and visit our booth #1213 for live demos and a few can’t-miss activations. Check out last year’s highlights ⤵️

0

8

1

0

2K

wt_chung retweeted

James Zou @james_y_zou

3 months ago

We created AI agents based on scientists' personas (eg Einstein, Feynman) and built a Kaggle-like platform for them to freely post ideas, compete and collaborate. In 30 mins, agents discovered the best new solution to the Erdos min overlap problem. Great job by @federicobianchy @ykwon_0407! The solution is here https://t.co/J2gYscgAzv

75

2K

230

1K

185K

wt_chung retweeted

Together AI @togethercompute

4 months ago

We’re open-sourcing CoderForge-Preview — 258K test-verified coding-agent trajectories (155K pass | 103K fail). Fine-tuning Qwen3-32B on the passing subset boosts SWE-bench Verified: 23.0% → 59.4% pass@1, and it ranks #1 among open-data models ≤32B parameters. Thread on the data generation pipeline 🧵

15

524

68

401

188K

wt_chung retweeted

Amazon Science

@AmazonScience

4 months ago

🎙️ In continued conversation on the "Making a Mind" podcast, cognitive scientist @drperszyk sits down with AI researcher @denizbirlikci from Amazon's AGI Lab to explore how reinforcement learning is transforming AI agents into dependable tools. A system that succeeds once is a demo—a system that succeeds every time is a breakthrough. Episode 4 is live now, listen in to learn more.

0

6

1

2

1K

wt_chung retweeted

Together AI @togethercompute

5 months ago

In a report in @NatureBiotech, co-authors @james_y_zou, @federicobianchy, @oq_35, @nityathakkar_, and @EricDSun, tested research findings from #Agents4Science — the first conference where AI agents author and review papers.

1

9

2

1

2K

wt_chung retweeted

Amazon Science

@AmazonScience

5 months ago

Cognitive scientist @drperszyk from Amazon's AGI Lab explores the science of intelligence in "Making a Mind," a podcast featuring leading AI researchers. Episodes 1 & 2 are now live. Listen in to hear from two members of the AGI Lab technical staff, product lead Kelsey Szot and engineer @jasonlaster11, and learn more about the evolution from LLMs to modern agents and why developing high-quality training environments is as fundamental as the model itself: https://t.co/wBCbWCGNz3

1

13

5

6

5K

wt_chung retweeted

Amazon News

@amazonnews

5 months ago

Building AI that thinks with us means tackling some of the field’s toughest challenges. 🎧 Making a Mind, a new podcast hosted by cognitive scientist Dr. Danielle Perszyk from Amazon’s AGI Lab, explores the science of intelligence with leading AI researchers. Episodes 1 & 2 are live now. 👇 https://t.co/xC2LQZRd0Q

32

34

9

5

21K

wt_chung retweeted

Together AI @togethercompute

6 months ago

Introducing: AI Native Conf — our inaugural, one-day event where founders and builders come together to dive into best practices and techniques across the AI lifecycle, from model training and fine-tuning to massive-scale inference. March 5. San Francisco. Request to attend #AINativeConf today: https://t.co/8Xp5nSRwWs

1

20

4

5

8K

wt_chung retweeted

Together AI @togethercompute

6 months ago

Introducing NVIDIA Nemotron 3 Nano, a fully open 30B with 3B active parameter hybrid MoE model engineered for maximum efficiency and benchmark-leading accuracy. AI natives can now use Nemotron 3 Nano on Together AI — with fast, reliable inference for specialized agentic systems at production scale.

2

18

5

7

4K

wt_chung retweeted

Ben Blaiszik

@BenBlaiszik

6 months ago

Lots of announcements from DOE in the AI4Science field - >$320 million in new investments! You can read more about the Genesis Mission, the American Science Cloud (AmSC), the Transformational AI Models Consortium (ModCon), along with 14 projects in robotics and automation and 37 in foundational AI in science applications here and summarized in the attached image. 🔗 https://t.co/vTtnR4F8X5 I'm excited for my team to help improve our understanding of catalysis and fusion materials via the Integrated Scientific Agentic AI for Catalysis (ISAAC) led by Dimosthenis Sokaras at @SLAClab and Maria Chan at @argonne and CascAIde for understanding fusion materials led by Paul Romano; and also to see Rick Stevens and @ianfoster leading ModCon to develop the foundational models and collect the data needed to advance science. Stay tuned!

3

33

5

16

2K

wt_chung retweeted

Together AI @togethercompute

6 months ago

Managing AI agents and a team of people are more similar than you’d think. Our VP, Kernels, @realDanFu, shares his three lessons learned from building, managing, and scaling AI agents. 🔴Full video: https://t.co/MGTAFC4nhB

0

9

3

4

1K

wt_chung retweeted

Together AI @togethercompute

6 months ago

We’re taking the first step toward production-grade RL on the AI Native Cloud. Together AI + @AIatMeta's team are partnering to bring high-performance reinforcement learning to real agentic systems — long-horizon reasoning, tool use, and multi-step workflows. Check out the first TorchForge integration: https://t.co/jW2NjSLBYy

3

21

2

10

5K

Wai Tong @wt_chung

6 months ago

Proud to have contributed to these top LLM speeds! Check out our booth at #NeurIPS

Together AI @togethercompute

6 months ago

Together AI now offers the fastest inference for the most popular OSS LLMs, including Qwen3 235B 2507, GPT-OSS-20B, and Kimi-K2-0905.

2

20

2

10K

0

2

0

138

wt_chung retweeted

Ben Athiwaratkun

@ben_athi

8 months ago

Most speculative decoding research focuses on algorithms. But we know that data matters a ton! (e.g. no matter how good the spec algorithm is, if it's trained on bad & misaligned data, the speed will be poor) What if we build on algorithms that make data really shine?! In this work, we introduce ATLAS, a speculative decoding system that enables customization to your LLM traffic data, making the model speed blazing fast! https://t.co/PtNsavX8oC

1

24

5

7

5K

wt_chung retweeted

Junxiong Wang @_junxiong_wang

8 months ago

Excited about this! We’re getting over 500 TPS on Blackwell with DeepSeek-V3.1. The more you use it, the greater the speedup. Blog post: https://t.co/91IphGKaqa

0

14

5

7

7K

wt_chung retweeted

Tri Dao

@tri_dao

8 months ago

This work, led by @_junxiong_wang and @ben_athi, is a first step towards building AI systems that evolve and get better as you use them. More to come!

3

292

34

75

47K
