Yonas Tesfaye @ymt1234 - Twitter Profile

ymt1234 retweeted

10 months ago

GPT-5 shows remarkable robustness for production instruction-following. On IFScale—our benchmark testing 100s of simultaneous constraints—it maintains >90% accuracy* through 500 instructions. Huge leap over previous bests o3 & gemini-2.5-pro (~69%@ 500). *run on 1 seed, 5 ongoing

Photo from @djarosai's tweet: https://t.co/UnybJD4Cph

2

51

15

5

6K

ymt1234 retweeted

Daniel J

@djarosai

11 months ago

How many instructions can your LLM follow at once? Production LLM systems juggle 10-100s of instructions: policies, style, safety rules, tool use--but when do they overload? We introduce IFScale, a new benchmark measuring how instruction following degrades as instructions scale🧵

2

23

5

13

9K

Yonas Tesfaye @ymt1234

almost 4 years ago

@data_beth @alex_gude Also, I love that the more adorable the small town the more trouble for the riders. Cobbles, narrow roads, turns….

0

1

0

Yonas Tesfaye @ymt1234

almost 4 years ago

@data_beth @alex_gude I’m not really following closely this year but I find the extended highlights on YouTube to be the right amount (30-45 mins) of action/context

0

Yonas Tesfaye @ymt1234

about 5 years ago

@data_pat Trying to come up with a silly response to this has led me down a sad Google path that started from “Jupyter notebooks in production” and has mostly left me shaken

1

0

Yonas Tesfaye @ymt1234

almost 6 years ago

@alex_gude Are you enjoying this year?

0

Yonas Tesfaye @ymt1234

almost 6 years ago

@alex_gude @_lab41 Time for a transformers-powered update?

0

ymt1234 retweeted

Corey Quinn

@QuinnyPig

almost 7 years ago

For every retweet this gets, I will add an Uncomfortable @awscloud Truth to the thread.

17

2K

324

0

ymt1234 retweeted

arXiv CS-CL @arxiv_cscl

almost 7 years ago

Overton: A Data System for Monitoring and Improving Machine-Learned Products https://t.co/fgAOVbGEXH

0

12

6

3

0

ymt1234 retweeted

Jeremy Howard

@jeremyphoward

almost 7 years ago

"New State of the Art AI Optimizer: Rectified Adam (RAdam). Improve your AI accuracy instantly versus Adam, & why it works" It's been a long time since we've seen a new optimizer reliably beat the old favorites; this looks like a very encouraging approach! https://t.co/1MZmTbmFjn

18

2K

798

354

0

ymt1234 retweeted

Tim Kietzmann @TimKietzmann

about 7 years ago

On the choice of deep learning hyperparameters

38

6K

2K

157

0

ymt1234 retweeted

Sebastian Ruder

@seb_ruder

over 7 years ago

This is a super cool resource: Papers With Code now includes 950+ ML tasks, 500+ evaluation tables (including SOTA results) and 8500+ papers with code. Probably the largest collection of NLP tasks I've seen including 140+ tasks and 100 datasets. https://t.co/lTAGE7LGZY

Photo: https://t.co/wfSyTplBR3

38

2K

1K

326

0

Yonas Tesfaye @ymt1234

over 7 years ago

@data_beth It was quite the rollercoaster indeed!

0

ymt1234 retweeted

PyTorch

@PyTorch

over 8 years ago

Monaural Sound Separation (input: song with vocals and instruments, output: only vocals) using MaD TwinNet architecture, from Drossos et al. Online demo, PyTorch models and arXiv paper available: https://t.co/0pvxrHlJc1

Photo: https://t.co/7EtrwFhtKg

4

192

71

1

0

ymt1234 retweeted

John McMurtrie @McMurtrieSF

over 8 years ago

When even the dictionary burns you.

252

34K

9K

0

ymt1234 retweeted

Soumith Chintala

@soumithchintala

over 8 years ago

a birds-eye view into Facebook's datacenter infra https://t.co/i8MaG0BTji

2

386

137

1

0

ymt1234 retweeted

Jeff Dean

@JeffDean

over 8 years ago

Slides from my talk in yesterday's ML Systems workshop are now up at https://t.co/cbCLDMqF3T #NIPS2017

18

989

353

1

0

ymt1234 retweeted

Awni Hannun

@awnihannun

over 8 years ago

Last 4 years in MT: more parallel data, more recurrence! Last 8 days in MT: no parallel data, no recurrence! Impressive work from Facebook on unsupervised MT (https://t.co/65GNUl6Ldo) and Salesforce on non-autoregressive MT (https://t.co/0bE9B6eJLE).

0

112

53

0

ymt1234 retweeted

hardmaru

@hardmaru

over 8 years ago

@dennybritz

4

97

19

0
