Yuting D.

Verified account

@maybeTuring

Building @SantoriLabs. Ex-GM@scale_AI, ex-@Google Sustainable founding, Human-ai-interaction

San Francisco, CA

Joined April 2013

166 Following

170 Followers

163 Posts

about 21 hours ago

"The untrainable" is the sexy name; the boring names are tacit knowledge, subjectivity, higher-order thinking (seems like even taste is out the door now). We love to make it sound so easy to eval and verify. This might be true in coding, but the reality is never just that.
1. Think about those AI outreach emails you've been getting. Grammatically correct, and a lot of them use this "fake lowercase style so it doesn't look like ai". All of them are bad. I know one when I see one, but I cannot articulate clear criteria for why. How good are the top models at generating them? Same goes for AI comments, because style is the hardest thing to eval or even articulate.
2. A lot of AI consulting work is really helping define or transfer that judgement on good, or good enough.
3. I recently talked to an AI software agency. Their entire pitch is about being functional and matching spec. Nothing about whether it'll be built well.
I think we are still a year away from actually passing the benchmarks. It's unknown what that looks like, but since it's all about the unarticulatable, it'll be a lot less visible.

1 day ago

https://t.co/Hw02laH9yp

3 days ago

@tyfeng1997 @flubtitle Crazy how many people are stuck in the past. A so-called engineer influencer too

4 days ago

1. I spent a lot of time at Scale labeling data myself and never thought it was beneath me. Instead, it's how we developed quality criteria and instructions, and how we partnered with our customers. My co-founder @flubtitle and I built out new labeling products for LLMs in 2023 (well, it's old now) because we did labeling and ran queues (meaning we were running real projects and needed to deliver data), not because we got a PRD from anyone.
2. One of the first things we built at Santori Labs is our voice-first eval/label flow and a roleplay system. I spent hours every week going through data, thinking about what is good vs not.
3. Imagine an engineer who thinks they are too good to do that, and instead is just here to execute a PRD that is given to them.
4. I don't think data is all you should do, but it's still one of the most important things you can do. Labeling is one form; another is looking at agent traces. If you don't see why that's important, you are stuck in the past.
5. It's painful looking at data. You think you just look at it and you just know if this is good. It's never that. It's always the messy middle of "meh". That's why the design principle for our own data flow is this: data is a focused act, and the product needs to encourage focus.

5 days ago

Just learned: Software engineers used to do manual data labeling at Scale AI while Alex Wang was CEO. After he left, new leadership joined, and were HORRIFIED to learn this. Stopped it ASAP. Now at Meta, software engineers are assigned manual data labeling... see the pattern?

3 days ago

@craig_certo @flubtitle Looking is thinking! Yea I think so many people are still stuck in the dog vs hotdog era, despite everything

4 days ago

@0xRasm @flubtitle real good ai summary

5 days ago

@abhinavsharma @GergelyOrosz Not just empathizing with the user; it truly is how we developed those quality criteria for LLMs too ^

5 days ago

@fatihkurtoglu @GergelyOrosz ^ ex-scalien hello 👋!

6 days ago

@Dr_TUC throughouting => why need the maxing part at all!!!

8 days ago

https://t.co/UgQhoD31AD

6 days ago

@alexanderbenz Yea I thought a framework migration would be that but nope 🙈

6 days ago

@chongz @aarnogau @mercury Should we try rho

7 days ago

@JoshConstine #steelmanmaxxing lfg

7 days ago

@alexanderbenz 100%! Knowing when to stop is hard though. Don't ask how I know 🤨

7 days ago

@aiechrl @pangram I used it to push back on my thinking, challenge from different perspectives. I found that quite good. I know I slapped that question over but I actually agree. It was just something to think through. "I'm not against AI, I'm against the easy"

7 days ago

@aiechrl @pangram https://t.co/CnfdYD4HXM wdyt

7 days ago

@alexanderbenz Ya, I had a friend who set up an agent to wake up every half an hour to make sure the job didn't crash or error. That's legit but very few of them were actually like that

7 days ago

@aiechrl @pangram What's the score for, say, one of David Foster Wallace's essays? Or if I decide to manually insert em-dashes? I used to work on LLM data; I know how human preferences work.

8 days ago

@aiechrl @pangram what's the verdict

8 days ago

@flubtitle Turns out: claude code === costco. Going for the fruits, back with the booze

8 days ago

@slothi_mb And that is either the em-dash or "i'm written by human cause i'm all small caps"

8 days ago

@RyanMWexler The word "slop" makes my head spin. Also it's sad that some of the styles are turning into slop not because they are, but because they have become the AI standard :rip
