jlcases| Producto y tecnología

@jlcasesES

I talk about product and technology | Interim CPTO at @rankia | 3x Founder | Advisory boards | The advantage used to be scaling. In the AI era, the advantage is focus.

Spain

Joined December 2009

465 Following

3.5K Followers

13.2K Posts

jlcases| Producto y tecnología

@jlcasesES

about 3 hours ago

@tarugoconf @JuanCVezVazquez @david_bonilla @juan_miqueo Exactly... What does @juan_miqueo have to do? 😀

jlcases| Producto y tecnología

@jlcasesES

5 days ago

Fewer subagents, more PRDs. Everything I see is agents, tasks, subagents. Lots of machinery, little product. And then we expect the agent to get it right. Everyone took PMs and POs out of the equation, as if they were surplus. Why would they be surplus, if they were key before? The PRD and the acceptance criteria are what keep the agent from going off on a tangent. Without them, it's a coin toss.

295

jlcases| Producto y tecnología

@jlcasesES

7 days ago

https://t.co/bU1LIOEHSZ

159

jlcases| Producto y tecnología

@jlcasesES

8 days ago

Building a product is not writing code. It's deciding what to build, for whom, and how you'll know it's right. In 3 minutes, the whole process with PaellaDoc: from an idea in the chat to a verifiable sprint. PRD → epics → stories → criteria → gate. All from one conversation. 👇

351

jlcases| Producto y tecnología

@jlcasesES

10 days ago

Very interesting if you build agents

Muratcan Koylan

@koylanai

11 days ago

Gradient descent for SKILL.md files sounds interesting, maybe a bit complex, but it's becoming a real part of the agent harness.

SkillOpt is one of the first papers to treat markdown skill files as trainable parameters and provides a proper optimization framework for them.

A few things I learned that you should consider too.

1. The validation gate is the only thing that matters in a self-editing loop.

Held-out set, strict improvement, ties rejected. End-to-end, their best skills land with 1 to 4 accepted edits total. If your "self-improving agent" is accepting most of what it proposes, you're shipping slop.
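A minimal sketch of the gate described above, assuming a caller-supplied `grade(skill, task)` function that returns a numeric score. The names `accept_edit` and `grade` are hypothetical and are not taken from the SkillOpt paper; the only thing the sketch is meant to show is the strict inequality on a held-out set.

```python
from typing import Callable, Sequence


def accept_edit(
    current_skill: str,
    candidate_skill: str,
    held_out_tasks: Sequence[str],
    grade: Callable[[str, str], float],
) -> bool:
    """Accept a candidate skill only if it strictly beats the current one.

    `grade` is an assumed auto-grader: it scores one skill on one task.
    """
    current_score = sum(grade(current_skill, task) for task in held_out_tasks)
    candidate_score = sum(grade(candidate_skill, task) for task in held_out_tasks)
    # Strict improvement: a tie is rejected, so neutral edits never accumulate.
    return candidate_score > current_score


def toy_grade(skill: str, task: str) -> float:
    """Toy grader for illustration: 1.0 if the skill text mentions the task keyword."""
    return 1.0 if task in skill else 0.0
```

With the toy grader, an edit that adds coverage for a held-out task is accepted, while an edit that only rewords the skill scores the same and is rejected.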

2. Bounded edits are better than full rewrites. 4 to 8 edits per step is the sweet spot.

Remove the budget and performance collapses. This is the textual analog of learning rate, and it transfers to any LLM-as-author loop. If you're using an agent to refactor your docs, your prompts, or your skills, cap the diff size.
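One way to cap the diff size, sketched with Python's standard `difflib`. This assumes the budget is measured in changed lines, which may not be the unit the paper uses; `count_changed_lines`, `within_budget`, and the default of 8 are illustrative choices.

```python
import difflib


def count_changed_lines(old: str, new: str) -> int:
    """Count lines touched by the edit (inserted, deleted, or replaced)."""
    matcher = difflib.SequenceMatcher(
        a=old.splitlines(), b=new.splitlines(), autojunk=False
    )
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # A hunk costs as many lines as its larger side.
            changed += max(i2 - i1, j2 - j1)
    return changed


def within_budget(old: str, new: str, max_changed_lines: int = 8) -> bool:
    """Reject proposals whose diff exceeds the edit budget."""
    return count_changed_lines(old, new) <= max_changed_lines
```

A loop that calls `within_budget` before evaluating a proposal would discard full rewrites outright and only spend grading effort on small, attributable changes.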

3. Compactness wins. Median final skill: ~920 tokens.

Skills do not need to be long. They need to be high-signal. Most skill files I see are bloated because length feels like effort. It isn't.

4. The harness is becoming less important; the skill is becoming more important.

A Codex-trained skill ported into Claude Code hit +59.7 points on SpreadsheetBench. Procedural knowledge is more general than the runtime that produced it.

5. Frozen model + trained context is the practical adaptation.

GPT-5.4-nano with a SkillOpt'd skill ≈ frontier behavior on procedural benchmarks. Cheaper, portable, inspectable, zero inference-time cost. This is the answer to "how do we adapt a frontier model for our domain" for almost everyone who isn't training their own models.

6. Verification is the bottleneck.

Every gate in this paper depends on an auto-grader. That works for benchmarks. It fails for writing, design, and strategy, exactly the open-ended work we want to automate. Whoever builds the verifier for open-ended tasks owns the next stage.

There are also two lessons I learned while shipping v2.3.0 of my Context Engineering Agent Skills repo, measured across composer-2, claude-opus-4-7, gpt-5.5, and gemini-3.1-pro via the @cursor_ai SDK:
- Description and body are two different surfaces. The router only sees the description. The agent sees the body once activated. They can quietly disagree, and only end-to-end task tests catch it.
- Aggregate accuracy is the wrong unit. When I rewrote three descriptions, the corpus average moved ~1pp. Individual skills moved 23–25pp. Per-skill effect size is where the action is.
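The second point can be illustrated with a small sketch that reports the change per skill next to the corpus average. The function names and the accuracy figures in the usage note are made up for illustration; they are not the measurements from the repo.

```python
from statistics import mean
from typing import Dict


def per_skill_deltas(before: Dict[str, float], after: Dict[str, float]) -> Dict[str, float]:
    """Change in accuracy per skill, in percentage points."""
    return {name: (after[name] - before[name]) * 100 for name in before}


def aggregate_delta(before: Dict[str, float], after: Dict[str, float]) -> float:
    """Change in the corpus-wide mean accuracy, in percentage points."""
    return (mean(after.values()) - mean(before.values())) * 100


# Illustrative numbers only: one rewritten description out of four skills.
before = {"skill_a": 0.50, "skill_b": 0.80, "skill_c": 0.90, "skill_d": 0.70}
after = {"skill_a": 0.74, "skill_b": 0.80, "skill_c": 0.90, "skill_d": 0.70}
```

In this toy data the rewritten skill moves by about 24 points while the average moves by about 6, and the gap widens as the corpus grows, which is why the aggregate alone can hide a large per-skill effect.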

Also, in Feb 2026 I shared a piece called Personal Brain OS arguing that the markdown file is a first-class substrate for agent state. SkillOpt is the optimizer-shaped version of that same argument: not "store memory in files" but "treat files as trainable parameters with proper optimization machinery around them." That's the move from static to measured.

The fast/slow split they describe already lives implicitly in the digital-brain-skill repo:
- voice-guide and tone-of-voice.md are slow-state (rarely touched)
- posts.jsonl and bookmarks.jsonl are fast-state

What SkillOpt adds that I didn't have is a protected section invariant, a structural guarantee that fast edits cannot overwrite slow lessons. Removing that mechanism cost them 22 points on SpreadsheetBench. Worth borrowing.
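A rough sketch of what such an invariant could look like for a markdown skill file. The HTML-comment markers and the function names are assumptions made for this example; the paper's actual mechanism is not described in this post.

```python
import re
from typing import List

# Hypothetical markers delimiting slow-state content inside a skill file.
PROTECTED = re.compile(
    r"<!-- protected:start -->(.*?)<!-- protected:end -->", re.DOTALL
)


def protected_sections(text: str) -> List[str]:
    """Return the contents of every protected block, in document order."""
    return PROTECTED.findall(text)


def preserves_protected(old: str, new: str) -> bool:
    """True if a proposed edit leaves all protected blocks byte-for-byte intact."""
    return protected_sections(old) == protected_sections(new)


skill = (
    "<!-- protected:start -->Never delete user data.<!-- protected:end -->\n"
    "Tip: prefer batch updates.\n"
)
fast_edit = skill.replace("prefer batch updates", "prefer batch updates of 100 rows")
bad_edit = skill.replace("Never delete", "Sometimes delete")
```

Here `preserves_protected(skill, fast_edit)` holds because only the unprotected tip changed, while `bad_edit` touches the protected block and would be rejected before any grading happens.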

If you're building agents, SkillOpt: Executive Strategy for Self-Evolving Agent Skills is a good paper to read: https://t.co/ZS9SZXQ6Mv

241

767K

260

jlcases| Producto y tecnología

@jlcasesES

10 days ago

@Sordevine Thanks, Enrique! The Windows version is in the works.

jlcases| Producto y tecnología

@jlcasesES

10 days ago

Writing code used to be the job. Now it's the last step. This is called PaellaDoc. I built it on my own. I just opened it up. A thread of 10, one per pillar. ↓

jlcases| Producto y tecnología

@jlcasesES

10 days ago

@dulegilcasas Thanks, Dule! There are still spots left :)

jlcases| Producto y tecnología

@jlcasesES

10 days ago

PaellaDoc is now open. Free for personal use. Forever. The first 50 people who show their .paella in the forum: lifetime Pro + a Zoom call with me. 50 people, or June 30, 2026. Whichever comes first. https://t.co/JRTrWiZYOd

230

jlcases| Producto y tecnología

@jlcasesES

10 days ago

What good is building 100x faster if it ties you to your chair? PaellaDoc has Telegram. You check the sprint from the sofa. You create a criterion in the supermarket checkout line. 100x faster. 100x freer. ↓ 9/10

236

jlcases| Producto y tecnología

@jlcasesES

14 days ago

I'm developing OCD over Claude

134

jlcases| Producto y tecnología

@jlcasesES
