mindmodel @mindmodel - Twitter Profile

Pinned Tweet

almost 4 years ago

Your podcast listeners value your content and appreciate your work. But over time, some may wander off. What about new listeners? How do you attract new listeners to your podcast without spending additional time and effort? https://t.co/lDdBlLmSTG #podcasting #backcatalog #AI

1

13

0

mindmodel retweeted

Oded Rechavi

@OdedRechavi

about 16 hours ago

Most experiments fail, and negative results rarely get published. This means LLMs are unaware of the outcomes of most experiments.

75

4K

427

360

171K

mindmodel retweeted

Craft Pops @Chikapu_0311

2 days ago

Excelが嫌いすぎて作った曲

24

6K

936

2K

269K

mindmodel retweeted

OpenAI

@OpenAI

1 day ago

Building apps has never been easier. With Sites, Codex can turn your work, ideas, and plans into an interactive website or app your team can explore, use, and share with a URL. Rolling out to Business and Enterprise plans, before expanding more broadly.

869

19K

2K

10K

8M

Who to follow

Sobalaan 🧉

@Sobalaan

سوبلان - عبدٌ شحيحُ العينين، قصيرُ الباعين، ممتحنٌ بين الأبالسة و الضدّين.

Robert Miller (🦄,🦄)

@RobertMiller

Data Gerd (Geek-Nerd), Cloud Architect, MS SQL DBA/Architect/BI juggler, Big Data (Streaming), Full-Stack Developer, Chesapeake Bay Retrievers, and electronics.

spongebobdevpants

@spongebobdevpan

I like dev memes and tech tips

mindmodel retweeted

Tibo

@thsottiaux

4 days ago

The Codex usage limits have been reset for all paid ChatGPT subscriptions. You should be back to 100% weekly and 100% hourly limits. Let the tokens do incredible things today and have fun.

770

6K

331

252

988K

mindmodel retweeted

𐌁𐌉Ᏽ 𐌕𐌉𐌌𐌉

@OrevaZSN

4 days ago

App idea: Shazam but for the sound my car is making.

264

107K

8K

3K

2M

mindmodel retweeted

Maran

@TheMaran

4 days ago

how to claim $1200 worth chatgpt pro for 6 months chatgpt is giving free pro subscription to open source contributors > go to: https://t.co/hPJnYwhSFN - enter all necessary information - submit the form for review > only fill the form - if you have an active github profile - if you are actively building anything - if you got any repos with good number of stars - if you got a really useful project on github - if you are vibe coding > one of a japanese bro said that they are literally approving anyone with an active github > i too filled the form, let's take a chance

38

606

45

960

77K

mindmodel retweeted

Liberty Pill Memes

@LibertyPillMeme

5 days ago

When you finally decide to start going to a gym

37

3K

299

398

195K

mindmodel retweeted

Tibo

@thsottiaux

4 days ago

Five million users would agree. Resetting the limits tomorrow morning to celebrate. Time to go /fast

733

8K

381

598

2M

mindmodel retweeted

Paul Solt

@PaulSolt

5 days ago

Love this new ChatGPT feature to switch models! 🎉 Long press on the send button and change the thinking level. 4 taps 👉 0 additional taps. Just drag your finger and let go! - Pro - Extended (Pro) - Thinking - Instant Thank the ChatGPT team for a great new shortcut! @JustinBleuel

11

157

9

43

38K

mindmodel retweeted

λux

@novasarc01

6 days ago

i’m increasingly convinced that the best agent evals will come from mining real agent failure traces. my view is that every failed trace contains a potential eval but not in its raw form. raw traces are messy, long and too specific. the research problem is to distill them into clean reproducible tests. the pipeline i’m interested in is (which i'm currently working on): failure trace → failure attribution → earliest divergence point → minimal reproducible state → targeted eval → regression suite this turns trace data from passive observability into an active improvement loop. like can we extract the exact decision point where the agent should have behaved differently? and can we convert that into an eval that catches the same failure class in the future? i guess this matters because most agent failures are trajectory-level failures and not just output-level failures. personally i think this is much more realistic than relying only on hand-written benchmarks (imo they should look more like failure memory systems). hand-written evals encode what we think agents will fail on. traces encode what agents actually failed on. also once you have the mechanism, you can mutate the trace into variants. that is basically fuzzing for agents.

24

299

23

370

55K

mindmodel retweeted

ChatGPT

@ChatGPTapp

6 days ago

For every ChatGPT conversation that started as “one quick thing” and became a full on saga: table of contents is here. Available now for chats with 5+ responses.

193

3K

179

472

299K

mindmodel retweeted

🚨 AI News | TestingCatalog

@testingcatalog

6 days ago

OPENAI 👀: Codex in ChatGPT now supports Side Conversions, allowing users to ask side questions without disrupting the main thread. /Side testing 🤖

testingcatalog's tweet photo. OPENAI 👀: Codex in ChatGPT now supports Side Conversions, allowing users to ask side questions without disrupting the main thread.

/Side testing 🤖 https://t.co/5YuMqnGzsB

27

733

39

92

70K

mindmodel retweeted

cat

@_catwu

7 days ago

Excited to share our most powerful new Claude Code feature: dynamic workflows! Mention "workflow" in a prompt and Claude will dynamically create an orchestration plan that it strictly follows, allowing you to confidently trust that every stage happens in the right order even across 100s of agents.

_catwu's tweet photo. Excited to share our most powerful new Claude Code feature: dynamic workflows!

Mention "workflow" in a prompt and Claude will dynamically create an orchestration plan that it strictly follows, allowing you to confidently trust that every stage happens in the right order even across 100s of agents.

348

8K

822

6K

2M

mindmodel retweeted

Claude

@claudeai

7 days ago

Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors. Available today at the same price.

claudeai's tweet photo. Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors.

Available today at the same price. https://t.co/EufxL7T1kb

4K

67K

9K

8K

15M

mindmodel retweeted

a16z @a16z

8 days ago

OpenAI and Anthropic are effectively telling the market they can't solve every problem with a generic AI coworker. You don't pour billions into massive forward-deployed joint ventures if you think the next model release is going to take care of it. In the cloud supercycle, semis led and software followed (and you didn't need Qualcomm or ARM to tell you the value was migrating up the stack). In AI, the infra layer itself is telling us the application layer is a separate, massive opportunity they can't fully capture. a16z's @joeschmidtiv on why the app layer isn't dead: https://t.co/84QN5Mj9T3

a16z's tweet photo. OpenAI and Anthropic are effectively telling the market they can't solve every problem with a generic AI coworker.

You don't pour billions into massive forward-deployed joint ventures if you think the next model release is going to take care of it.

In the cloud supercycle, semis led and software followed (and you didn't need Qualcomm or ARM to tell you the value was migrating up the stack).

In AI, the infra layer itself is telling us the application layer is a separate, massive opportunity they can't fully capture.

a16z's @joeschmidtiv on why the app layer isn't dead: https://t.co/84QN5Mj9T3

182

3K

420

3K

1M

mindmodel retweeted

Justin

@JustinBleuel

8 days ago

@TrungTPhan Google Doc Martens https://t.co/xd7JcupXO6

1

189

4

5

28K

mindmodel retweeted

Serena Ge (Datacurve)

@serenaa_ge

9 days ago

Today we’re releasing DeepSWE, a new standard for agentic coding benchmarks. On public leaderboards, top models often look relatively close in capability. DeepSWE shows where they actually diverge, reflecting the realistic experience of developers in their day-to-day work.

serenaa_ge's tweet photo. Today we’re releasing DeepSWE, a new standard for agentic coding benchmarks.

On public leaderboards, top models often look relatively close in capability. DeepSWE shows where they actually diverge, reflecting the realistic experience of developers in their day-to-day work. https://t.co/HCDcjNuTFK

511

6K

754

3K

2M

mindmodel retweeted

BuccoCapital Bloke

@buccocapital

11 days ago

Levie’s Law of AI Psychosis: The farther away you are from the actual work the more confident you are that humans are no longer needed I like it

75

8K

878

1K

441K

mindmodel retweeted

Spitfire

@RealSpitfire

11 days ago

She has embraced the memes. Well played Julie!! Good sport!😂

113

8K

579

154

178K

mindmodel retweeted

Siqi Chen

@blader

12 days ago

protip: adding a adversarial subagent review gate to my plans has been a HUGE unlock to make /goal runs higher quality, and longer running. prompt: "update this plan: before marking a task as done, validate the task with an adversarial subagent review"

57

1K

49

2K

90K

mindmodel

@mindmodel

Who to follow

Last Seen Users on Sotwe

Trends for you

Most Popular Users