ircrp @ircrp - Twitter Profile

about 1 month ago

@fofrAI @paulopacitti "I use Gemini 3 Pro to generate a JSON prompt because it has the best vision capabilities (at the moment)." Does this stand still today, or there are new leaders in that regard ?

0

50

ircrp @ircrp

2 months ago

@gabrielchua @gabrielchua how do we enable imagegen tool in codex app ?

0

97

ircrp @ircrp

4 months ago

● Claude Opus 4.5 is no longer available. Please switch to Claude Opus 4.6! @antigravity

1

3

0

251

ircrp retweeted

Muratcan Koylan

@koylanai

5 months ago

Progressive disclosure is not reliable because LLMs are inherently lazy. "In 56% of eval cases, the skill was never invoked. The agent had access to the documentation but didn't use it." Vercel ran evals on Next.js 16 APIs that aren't in model training data to test whether agents could learn framework-specific knowledge through Skills vs. persistent context. Skills are the "correct" abstraction: package domain knowledge, let the agent invoke it when needed, minimal context. The agent decides when to retrieve. They work well WHEN the user triggers them; otherwise, LLMs just ignore them. Vercel's benchmarking is the first experiment of this kind I've seen, and it's actually interesting. - Baseline (no docs): 53% - Skill (default): 53% - Skill with explicit instructions: 79% - AGENTS[.]md with 8KB compressed docs index: 100% The skill approach assumes agents reliably recognize when they need external knowledge and act on it. They don't. "You MUST invoke the skill" made agents read docs first and miss project context. "Explore project first, then invoke" performed better. Same skill, different outcomes based on prompting. The winning approach removed the decision entirely. An 8KB compressed index embedded in AGENTS[.]md, with one instruction: "Prefer retrieval-led reasoning over pre-training-led reasoning." Two agent design learnings: 1. Passive context beats active retrieval for foundational knowledge. Don't make the agent decide to look things up, make the index always present. 2. Compress aggressively. Vercel went from 40KB to 8KB (80% reduction) with zero performance loss. The agent needs to know where to find docs, not have full content in context. The gap between "agent can access X" and "agent will access X" is larger than we assume. I keep seeing similar findings across agent architectures. Kimi Swarm's orchestrator is trained specifically to avoid sequential execution. Without training, orchestrators default to serial processing, planning a list of steps and executing them one by one. It's the EASY path. The agent defaults to the lazy path: hallucinating from training data rather than retrieving docs. Passive context removes the choice entirely; the agent doesn't decide whether to look things up; the index is already there. We keep finding that the "smarter", more autonomous design (let the agent decide when to X) underperforms the "dumber" design (always X, or structurally enforce X).

64

1K

92

1K

198K

Who to follow

Víctor Hugo Ortiz M - Pronósticos Deportivos

@ViejaGuardia666

Asalto en el camino a la suerte, al destino y alguna mosa despistada, soy pacifista sin bandera, no lucho por la paz, pero; follo por la virginidad

ShipHero

@weareshiphero

ShipHero gives brands and 3PLs fulfillment superpowers by making it easy to ship eCommerce.

Stace Loyd

@SayItAintStace

ircrp @ircrp

5 months ago

@darin_gordon @dannypostma

0

18

ircrp retweeted

Zac

@PerceptualPeak

5 months ago

WOW!!! If you have semantic memory tied to your UserPromptSubmit hooks, you MUST ALSO include it in your PreToolUse hook. I promise you - it will be an absolute GAME CHANGER. It will put your efficiency levels are over 9,000 (*vegeta voice*). How many times have you sat there, watching Claude code go through an extended workflow, just to notice it start to go down a path you just KNOW will be error filled - and subsequently take it forever to FINALLY figure it out? The problem with relying strictly on the UserPromptSubmit hook for semantic memory injection is the workflow drift from your original prompt. The memories it injects at the initiation of your prompt will be less and less relevant to the workflow the longer the workflow is. Claude has a beautiful thing called thinking blocks. These blocks are ripe for the picking - filled with meaning & intent - which is perfect for cosign similarly recall. Claude thinks to itself, "hmm, okay I'm going to do this because of this", then starts to engage the tool of its choice, and BOOM: PreToolUse hook fires, takes the last 1,500 characters from the most recent thinking block from the active transcript, embeds it, pulls relevant memories from your vector database, and injects them to claude right before it starts using its tool (hooks are synchronous). This all happens in less than 500 milliseconds. The result? A self correcting Claude workflow. Based on my testing thus far, this is one of the most consequential additions to my context management system I've implemented yet. Photos: ASCII chart showing the workflow of the hook, and then two real use-cases of the mid-stream memory embedding actually being useful. If you already have semantic memory setup, just paste this tweet and photos into Claude code and tell it to implement it for you. Then enjoy the massive increase of workflow efficiency :)

PerceptualPeak's tweet photo. WOW!!! If you have semantic memory tied to your UserPromptSubmit hooks, you MUST ALSO include it in your PreToolUse hook. I promise you - it will be an absolute GAME CHANGER. It will put your efficiency levels are over 9,000 (*vegeta voice*).

How many times have you sat there, watching Claude code go through an extended workflow, just to notice it start to go down a path you just KNOW will be error filled - and subsequently take it forever to FINALLY figure it out?

The problem with relying strictly on the UserPromptSubmit hook for semantic memory injection is the workflow drift from your original prompt. The memories it injects at the initiation of your prompt will be less and less relevant to the workflow the longer the workflow is.

Claude has a beautiful thing called thinking blocks. These blocks are ripe for the picking - filled with meaning & intent - which is perfect for cosign similarly recall. Claude thinks to itself, "hmm, okay I'm going to do this because of this", then starts to engage the tool of its choice, and BOOM:

PreToolUse hook fires, takes the last 1,500 characters from the most recent thinking block from the active transcript, embeds it, pulls relevant memories from your vector database, and injects them to claude right before it starts using its tool (hooks are synchronous). This all happens in less than 500 milliseconds.

The result?

A self correcting Claude workflow.

Based on my testing thus far, this is one of the most consequential additions to my context management system I've implemented yet.

Photos: ASCII chart showing the workflow of the hook, and then two real use-cases of the mid-stream memory embedding actually being useful.

If you already have semantic memory setup, just paste this tweet and photos into Claude code and tell it to implement it for you. Then enjoy the massive increase of workflow efficiency :)

29

666

51

1K

60K

ircrp @ircrp

6 months ago

@AlexObuchowski https://t.co/6XQCXYgzXM

Mistral AI

@MistralAI

6 months ago

We've doubled the Vibe context limit from 100k to 200k. Happy shipping! → uv tool install mistral-vibe

70

3K

188

684

198K

0

1

0

60

ircrp @ircrp

7 months ago

@tokumin Wait lol, i am a moron, didnt realise you are a lead of the product. Do you guys share prompts for those features ?

0

3

ircrp @ircrp

7 months ago

@tokumin @tokumin have you had any luck finding the exact prompts Notebook LM uses to generate the infographics/slide decks etc ?

1

0

39

ircrp @ircrp

7 months ago

@producer_ai I need an invite!

0

9

ircrp @ircrp

8 months ago

@iruletheworldmo -1. Baited

0

36

ircrp @ircrp

8 months ago

@MartinScapeX How bad is it ? 1.5h coding per day or worse ?

0

16

ircrp @ircrp

8 months ago

@shydev69 any write ups from their team on the reverse engineering of claude code ? curious about the internals of cc beyond just extracting the messages/tool calls from .jsonl sessions

0

317

ircrp @ircrp

10 months ago

@levan

0

34

ircrp @ircrp

10 months ago

@scaling01 we need dem gippite logit_bias params config lever, @sama, deploy pls.

0

207

ircrp @ircrp

10 months ago

@lefthanddraft @Davidedeferrar Is it true that the full sys prompt is about 30k tokens massive ?

0

14

ircrp @ircrp

10 months ago

@Davidedeferrar @lefthanddraft

1

0

28

ircrp retweeted

Wyatt Walls

@lefthanddraft

10 months ago

Want to find out the Juice number (reasoning effort) for GPT-5-Thinking? Try the prompt below. This is the equivalent of medium reasoning effort in the API

lefthanddraft's tweet photo. Want to find out the Juice number (reasoning effort) for GPT-5-Thinking?

Try the prompt below. This is the equivalent of medium reasoning effort in the API https://t.co/3Z6a9iAebf

52

701

25

372

160K

ircrp @ircrp

10 months ago

@sama @sama for love of god, please bump the input ctx window for Enterprise ChatGPT users, GPT 5 Thinking is somewhere around 50k tokens and GPT 5 (non thinking) is somehwere between 80-85k. Literally just tested few mins ago. PLEASE PLEASE PLEASE

0

57

ircrp

@ircrp

Who to follow

Last Seen Users on Sotwe

Trends for you

Most Popular Users