shae wang @shaeyuwang - Twitter Profile

Pinned Tweet

4 months ago

Last week, a Head of AI Transformation at a large enterprise told me: "We're holding off on buying observability/performance eval tooling. Models are improving so fast. Eventually they'll catch their own errors." That mindset is incredibly dangerous. Here's why 👇

1

5

0

201

shae wang

@shaeyuwang

about 5 hours ago

@vibester @0xMovez it's AI slop

0

2

shae wang

@shaeyuwang

22 days ago

Computer use agents is a temporary solution. Guardrails for blocking malicious bots will always need to exist (and even more important with AI). The goal for computer-use should be making it easy for websites to authenticate agents and give them scoped access.

0

51

shae wang

@shaeyuwang

about 2 months ago

@GaryMarcus AI is not deterministic, (b) will never happen, build your own gates properly and actually study engineering if you want to vibe-code

0

85

Who to follow

Craig DeWitt

@CryptoCwby

Founder https://t.co/jJtxqvrmrP

FactBlock

@FactBlock

The leading web3 ecosystem builder, delivering industry-leading events and providing blockchain advisory/consulting services since 2018. Follow @kbwofficial

Mojaloop Foundation

@mojaloop

A charitable nonprofit organization advancing the #financialinclusion mission of #Mojaloop

shae wang

@shaeyuwang

about 2 months ago

@triathenum @jdegoes @lifeof_jer It's entirely on the vibers to understand the ceiling and risks here, disclaimers and cautionary tales are everywhere if they aren't too lazy to check and invest in learning the basics

0

2

0

60

shae wang

@shaeyuwang

about 2 months ago

Super excited for Alex to join Ripple as our new Chief Risk Officer! 🫡

0

2

0

1

113

shaeyuwang retweeted

Om Patel

@om_patel5

3 months ago

I taught Claude to talk like a caveman to use 75% less tokens. normal claude: ~180 tokens for a web search task caveman claude: ~45 tokens for the same task "I executed the web search tool" = 8 tokens caveman version: "Tool work" = 2 tokens every single grunt swap saves 6-10 tokens. across a FULL task that's 50-100 tokens saved why does it work? caveman claude doesn't explain itself. it does its task first. gives the result. then stops. no "I'd be happy to help you with that." no "Let me search the web for you" no more unnecessary filler words "result. done. me stop." 50-75% burn reduction with usage limits getting tighter every week this might be the most practical hack out there right now

om_patel5's tweet photo. I taught Claude to talk like a caveman to use 75% less tokens.

normal claude: ~180 tokens for a web search task

caveman claude: ~45 tokens for the same task

"I executed the web search tool" = 8 tokens
caveman version: "Tool work" = 2 tokens

every single grunt swap saves 6-10 tokens. across a FULL task that's 50-100 tokens saved

why does it work? caveman claude doesn't explain itself. it does its task first. gives the result. then stops.

no "I'd be happy to help you with that." no "Let me search the web for you" no more unnecessary filler words

"result. done. me stop."

50-75% burn reduction

with usage limits getting tighter every week this might be the most practical hack out there right now

940

23K

1K

10K

3M

shae wang

@shaeyuwang

3 months ago

I've been using Obsidian for a week. Lesson for beginners: if you're just starting with Obsidian, build first, organize later There's no right or wrong structure, and you'll find your favorite one by spending time with it yourself

Andrej Karpathy

@karpathy

3 months ago

Wow, this tweet went very viral! I wanted share a possibly slightly improved version of the tweet in an "idea file". The idea of the idea file is that in this era of LLM agents, there is less of a point/need of sharing the specific code/app, you just share the idea, then the other person's agent customizes & builds it for your specific needs. So here's the idea in a gist format: https://t.co/NlAfEJjtJV You can give this to your agent and it can build you your own LLM wiki and guide you on how to use it etc. It's intentionally kept a little bit abstract/vague because there are so many directions to take this in. And ofc, people can adjust the idea or contribute their own in the Discussion which is cool.

1K

27K

3K

47K

7M

0

1

0

42

shae wang

@shaeyuwang

3 months ago

instead of detecting deepfakes, what if we redesign the trust/identity system so they don’t matter?

Modulate @modulate_ai

3 months ago

We built the world's best deepfake detection model, per @huggingface. Then priced it at $0.25/hour. The competition charges up to $150/hour for worse accuracy. Turns out "enterprise security" was just a pricing strategy. We blew it up. 🧵↓

modulate_ai's tweet photo. We built the world's best deepfake detection model, per @huggingface. Then priced it at $0.25/hour.

The competition charges up to $150/hour for worse accuracy.

Turns out "enterprise security" was just a pricing strategy.

We blew it up. 🧵↓ https://t.co/nKOSIb5u2x

48

266

42

133

14M

0

49

shae wang

@shaeyuwang

3 months ago

Tighter scope -> lower variance -> better calibration -> better performance on these specialized tasks Specialized agents are also inevitable economically, because cheaper compute + easier to finetune/debug The orchestration layer does not need to be an LLM to work extremely well

Santiago

@svpino

3 months ago

I really like the idea of having multiple specialized agents instead of a "general purpose" agent that tries to do it all. A few days ago, I read (sorry, I don't remember where) a study claiming that specialized agents, even when they are all using the same model, beat general agents by a mile. These guys are doing precisely that with an army of hyper-specific agents. And of course, they are following the ---Claw theme.

19

32

3

19

10K

0

44

shae wang

@shaeyuwang

4 months ago

the single internal AI strategy companies should have is let people build instead of "unifying AI access/infrastructure" regulating too early kills innovation and revenue opportunities

0

36

shae wang

@shaeyuwang

4 months ago

5. Causal effectiveness layer Observability measures behavior. We need to also measure outcome. A/B tests run on single lines of marketing copy detect meaningful impact all the time. That's why the experimentation segment is worth $3B. So what about agents?

0

2

0

42

shae wang

@shaeyuwang

4 months ago

Last week, a Head of AI Transformation at a large enterprise told me: "We're holding off on buying observability/performance eval tooling. Models are improving so fast. Eventually they'll catch their own errors." That mindset is incredibly dangerous. Here's why 👇

1

5

0

201

shae wang

@shaeyuwang

4 months ago

4. Runtime lawfulness Evaluation ≠ enforcement. Behind a 100% eval score on benchmarks, it's still just probabilities. We need circuit breakers, policy engines, and continuous runtime governance and steering.

1

2

0

52

shae wang

@shaeyuwang

5 months ago

AI will only accelerate whatever condition you’re already in. That’s why grit and discipline have never been more important in history. The pretty good will become extremely good, the mediocre won’t even realize they’re still mediocre.

1

0

93

shae wang

@shaeyuwang

over 3 years ago

as a first-time manager, i’m learning that reducing scope to focus on 2-3 max value initiatives >> being in every room let your only testimony be one that people can’t forget

0

3

0

shae wang

@shaeyuwang

Who to follow

Last Seen Users on Sotwe

Trends for you

Most Popular Users