Trevor Gurgick @TGurgick - Twitter Profile

Pinned Tweet

almost 5 years ago

@joulee
1. Which parts of the experiences are creating customer friction/pain? (for prioritization, impact)
2. Which actions/inputs are driving target outcomes? (particularly which are causal... )
3. What does a healthy customer lifecycle look like? What is an anomaly?

0

49

1

7

0

TGurgick retweeted

Thariq

@trq212

3 months ago

We just released Claude Code channels, which allows you to control your Claude Code session through select MCPs, starting with Telegram and Discord. Use this to message Claude Code directly from your phone.

2K

26K

2K

18K

8M

TGurgick retweeted

Andrew Curran

@AndrewCurran_

3 months ago

Striking image from the new Anthropic labor market impact report.

551

13K

2K

9K

7M

Trevor Gurgick @TGurgick

4 months ago

I’ve interviewed more LLMs than humans thus far in 2026. Maybe AI really is taking all the jobs?

0

1

0

22

Trevor Gurgick @TGurgick

6 months ago

@realmadhuguru Feel this, but at the same time instead of worrying about being behind we should focus on what problems we can solve with this newfound power

0

13

TGurgick retweeted

Madhu Guru

@realmadhuguru

6 months ago

Same feeling as a product person. For years, my constraint was execution capacity - lots of fun ideas, but building each required assembling eng and design teams. Hard and unscalable. With AI, I can now 'hire' a full team instantly. Massive scale unlocked...in theory. Because now there’s a new constraint: mastering how to orchestrate AI teams…the tools, workflows and product craft in this new world. Feels like there is no bigger lever than learning these skills. Channeling all my spare time and energy here.

35

695

40

430

84K

TGurgick retweeted

Andrew Ng

@AndrewYNg

7 months ago

Really proud of the DeepLearningAI team. When Cloudflare went down, our engineers used AI coding to quickly implement a clone of basic Cloudflare capabilities to run our site on. So we came back up long before even major websites!

278

7K

361

807

1M

Trevor Gurgick @TGurgick

11 months ago

Showing has always been more powerful than telling. Data and goals are still important (“why?”) but condensing the cycle down with a rapid prototype is way better than weeks of word edits

Madhu Guru

@realmadhuguru

11 months ago

At @Google, we are moving from a writing‑first culture to a building‑first one. Writing was a proxy for clear thinking, optimized for scarce eng resources and long dev cycles - you had to get it right before you built. Now, when time to vibe-code prototype ≈ time to write PRD, PMs can SHOW not tell. Role profiles are blurring, creativity and building are happening in parallel.

204

5K

419

2K

642K

0

1

0

40

Trevor Gurgick @TGurgick

11 months ago

Irony of working in AI… my natural use of hyphens and dashes is no longer ‘human enough’

0

1

0

35

Trevor Gurgick @TGurgick

over 1 year ago

@venturetwins @a16z Love this. Having a newborn + tracking data manually for her, there seems to be an open space to better streamline how we integrate growth and development details in order to get more out of AI during the journey. Currently using 6 different apps to do that.

1

6

0

256

Trevor Gurgick @TGurgick

over 1 year ago

It’s been quite a month for @AnthropicAI

Alex Albert

@alexalbert__

over 1 year ago

The real shiptober (plus one day) was at Anthropic:
• 11/1 - Token counting API
• 11/1 - Multimodal PDF support across claude and the API
• 10/31 - Voice dictation in Claude mobile apps
• 10/31 - Claude desktop app
• 10/29 - Claude in Github Copilot
• 10/24 - Analysis tool
• 10/22 - New Claude 3.5 Sonnet
• 10/22 - Computer use API
• 10/18 - Financial analyst quickstart
• 10/17 - Mobile app design overhaul
• 10/9 - Remove message order restrictions in API
• 10/8 - Message Batches API
• 10/4 - Artifacts errors auto-fix
Btw we are able to ship this much because we use Claude all the time

126

2K

120

430

195K

0

1

0

69

TGurgick retweeted

Alex Albert

@alexalbert__

over 1 year ago

Computer use is the first step toward a completely new form of human-computer interaction. In just a few years, the way we interface with computers will be completely different from today. Let me explain: https://t.co/rqVtAI0DV8

83

3K

277

2K

619K

Trevor Gurgick @TGurgick

over 1 year ago

This is fun. Claude can access your computer now! The shift towards Agents continues. 🤖

Anthropic

@AnthropicAI

over 1 year ago

We've built an API that allows Claude to perceive and interact with computer interfaces. This API enables Claude to translate prompts into computer commands. Developers can use it to automate repetitive tasks, conduct testing and QA, and perform open-ended research.

79

3K

427

815

604K

0

48

Trevor Gurgick @TGurgick

over 1 year ago

Why is every company shifting towards ‘Agentic’ AI products? TAM

Scott Brinker

@chiefmartec

over 1 year ago

Excellent article by Sonya Huang and Pat Grady of @Sequoia, "The Agentic Reasoning Era Begins", and the $10 trillion opportunity with service-as-a-software: https://t.co/wkI9mnLwn1 "Thanks to agentic reasoning, the AI transition is service-as-a-software. Software companies turn labor into software. That means the addressable market is not the software market, but the services market measured in the trillions of dollars."

23

2K

257

2K

221K

0

2

0

99

Trevor Gurgick @TGurgick

over 1 year ago

Lots of motivated buyers, network effects and the ability to generalize well between consumer and enterprise needs help

Allie K. Miller

@alliekmiller

over 1 year ago

Stripe data shows that top AI startups in 2024 (ex: OpenAI, Anthropic, Mistral, Midjourney) are making money faster than equivalent SaaS companies in 2018. AI startups that hit at least $2.5M/mo rev achieved the milestone in 20 months — 5x faster than past SaaS startups. Do we think that’s because VC money is more concentrated? TikTok? Actual higher interest? OAI skewing everything? More here from FT: https://t.co/mw5KTrq0wS

16

101

14

50

19K

0

59

TGurgick retweeted

Allie K. Miller

@alliekmiller

over 1 year ago

ChatGPT isn’t slowing down. They just released a new feature called “Canvas” so that more work doesn’t just get assisted by ChatGPT, it gets done. Check it out.

13

287

20

97

33K

Trevor Gurgick @TGurgick

over 1 year ago

@thiagocaserta @deedydas Definitely better when you are starting with a blank slate or you’ll need to give a lot of context. That said, optimistic this is just growing pains and the meta reasoning / architecture learning will come soon (or with tuning)

1

0

95

Trevor Gurgick @TGurgick

over 1 year ago

One of the best TTS products created. Whole startups spent years on this technology for lesser results… definitely worth playing with.

Andrej Karpathy

@karpathy

over 1 year ago

NotebookLM is quite powerful and worth playing with https://t.co/EMHIjc15iU

It is a bit of a re-imagination of the UIUX of working with LLMs organized around a collection of sources you upload and then refer to with queries, seeing results alongside and with citations.

But the current most new/impressive feature (that is surprisingly hidden almost as an afterthought) is the ability to generate a 2-person podcast episode based on any content you upload. For example someone took my "bitcoin from scratch" post from a long time ago: https://t.co/7ajZNZ0BGi and converted it to podcast, quite impressive: https://t.co/ZZn0LJgsnu

You can podcastify *anything*. I give it train_gpt2.c (C code that trains GPT-2): https://t.co/gDrAqix4Iv and made a podcast about that: https://t.co/bgcwmQr5d7 I don't know if I'd exactly agree with the framing of the conversation and the emphasis or the descriptions of layernorm and matmul etc but there's hints of greatness here and in any case it's highly entertaining.

Imo LLM capability (IQ, but also memory (context length), multimodal, etc.) is getting way ahead of the UIUX of packaging it into products. Think Code Interpreter, Claude Artifacts, Cursor/Replit, NotebookLM, etc. I expect (and look forward to) a lot more and different paradigms of interaction than just chat.

That's what I think is ultimately so compelling about the 2-person podcast format as a UIUX exploration. It lifts two major "barriers to enjoyment" of LLMs.

1. Chat is hard. You don't know what to say or ask. In the 2-person podcast format, the question asking is also delegated to an AI so you get a lot more chill experience instead of being a synchronous constraint in the generating process.
2. Reading is hard and it's much easier to just lean back and listen.

243

8K

1K

6K

829K

0

90

Trevor Gurgick @TGurgick

over 1 year ago

@SupermavenAI 👏👏👏

0

1

0

418

Trevor Gurgick @TGurgick

almost 2 years ago

@gdb 👏👏👏

0

45

Trevor Gurgick @TGurgick

almost 2 years ago

Great overview of o1 / Strawberry 👇

Jim Fan

@DrJimFan

almost 2 years ago

OpenAI Strawberry (o1) is out! We are finally seeing the paradigm of inference-time scaling popularized and deployed in production. As Sutton said in the Bitter Lesson, there're only 2 techniques that scale indefinitely with compute: learning & search. It's time to shift focus to the latter.

1. You don't need a huge model to perform reasoning. Lots of parameters are dedicated to memorizing facts, in order to perform well in benchmarks like trivia QA. It is possible to factor out reasoning from knowledge, i.e. a small "reasoning core" that knows how to call tools like browser and code verifier. Pre-training compute may be decreased.

2. A huge amount of compute is shifted to serving inference instead of pre/post-training. LLMs are text-based simulators. By rolling out many possible strategies and scenarios in the simulator, the model will eventually converge to good solutions. The process is a well-studied problem like AlphaGo's monte carlo tree search (MCTS).

3. OpenAI must have figured out the inference scaling law a long time ago, which academia is just recently discovering. Two papers came out on Arxiv a week apart last month:

- Large Language Monkeys: Scaling Inference Compute with Repeated Sampling. Brown et al. finds that DeepSeek-Coder increases from 15.9% with one sample to 56% with 250 samples on SWE-Bench, beating Sonnet-3.5.
- Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters. Snell et al. finds that PaLM 2-S beats a 14x larger model on MATH with test-time search.

4. Productionizing o1 is much harder than nailing the academic benchmarks. For reasoning problems in the wild, how to decide when to stop searching? What's the reward function? Success criterion? When to call tools like code interpreter in the loop? How to factor in the compute cost of those CPU processes? Their research post didn't share much.

5. Strawberry easily becomes a data flywheel. If the answer is correct, the entire search trace becomes a mini dataset of training examples, which contain both positive and negative rewards. This in turn improves the reasoning core for future versions of GPT, similar to how AlphaGo’s value network — used to evaluate quality of each board position — improves as MCTS generates more and more refined training data.

135

6K

1K

3K

800K

0

54
