Martin Szummer @mszummer - Twitter Profile

Martin Szummer

@MSzummer

4 months ago

Maestro (our agent) is making the most of Sonnet 4.6 already - showing great results!

iGent AI @iGent_AI

4 months ago

We’ve been testing Sonnet 4.6 and it has been potent in our agent, Maestro. Our primary eval is to implement a long list of features across a diverse set of use cases, iteratively across codebases, building on prior work. The result: it completed features faster, cheaper, and with a higher benchmark pass rate.

MSzummer retweeted

Claude

@claudeai

4 months ago

This is Claude Sonnet 4.6: our most capable Sonnet model yet. It’s a full upgrade across coding, computer use, long-context reasoning, agent planning, knowledge work, and design. It also features a 1M token context window in beta.

MSzummer retweeted

Claude

@claudeai

6 months ago

Introducing Claude Opus 4.5: the best model in the world for coding, agents, and computer use. Opus 4.5 is a step forward in what AI systems can do, and a preview of larger changes to how work gets done. https://t.co/mid2Z1qzIf

Martin Szummer

@MSzummer

9 months ago

This is a historic moment for us. Our software engineering agent, Maestro, generated solutions for all 12 ICPC World Finals problems — one of the hardest team programming competitions on Earth! We're opening its solutions for the community to validate. Go break them.

iGent AI @iGent_AI

9 months ago

We're excited to share that our agent, Maestro, drafted solutions to all 12 problems from ICPC 2025 World Finals in ~2 hours - using current models, no human involvement, no internet access. We deeply respect the human teams' extraordinary dedication. Note: no official validation

Martin Szummer

@MSzummer

10 months ago

Anthropic just made *the* LLM release we have been waiting for - two massive context Claude Sonnet models, handling up to 1M input tokens. These are the models that we used with our Maestro system @iGent_AI to build large, complex software, like a Redis-compatible database written in Rust, written entirely by AI https://t.co/1BpAZeevkl

Claude

@claudeai

10 months ago

Claude Sonnet 4 now supports 1 million tokens of context on the Anthropic API—a 5x increase. Process over 75,000 lines of code or hundreds of documents in a single request. https://t.co/4hzltn9IoD

Martin Szummer

@MSzummer

10 months ago

Our agentic software engineering system, Maestro, can build large, complex software: it just finished building a Redis database from first principles in Rust, improving on its safety and performance!

iGent AI @iGent_AI

10 months ago

Tired of toy AI demos that fizzle in production? iGentAI built Ferrous: A Rust Redis-compatible server outperforming Valkey. 35KLOC, 100% test passing, beats benchmarks. Zero human code. Built in 70 hours of part-time direction. Toys vs. tools—here's the proof. https://t.co/rbFeg5uCsa

MSzummer retweeted

iGent AI @iGent_AI

about 1 year ago

You can also find out the full details on Sonnet 4.0 VibeCodeBench performance at https://t.co/bYszAhZWh0

MSzummer retweeted

iGent AI @iGent_AI

about 1 year ago

We've integrated Claude Sonnet 4 into Maestro, and the results are transformative. As our evaluations show, it maintains higher code quality even as project complexity grows. Combined with its new extended thinking capabilities, Maestro delivers an unmatched AI engineering experience. Signup at https://t.co/ut8NN2M13t

MSzummer retweeted

iGent AI @iGent_AI

about 1 year ago

@Anthropic reports Claude 4 models are 65% less likely to use shortcuts on agentic tasks. Our evaluations confirm this—Claude Sonnet 4 consistently understates feature completeness rather than overstate success. This translates to more reliable AI assistance through Maestro. https://t.co/8ChwgPhIrM

MSzummer retweeted

iGent AI @iGent_AI

about 1 year ago

Our VibeCodeBench evaluations affirm what @Anthropic just announced: Claude Sonnet 4 excels at autonomous multi-feature development. We've seen codebase navigation errors drop from 20% to near zero and strategic refactoring that saves ~500k tokens on multi-stage, complex tasks. Proud to power Maestro with this breakthrough.

MSzummer retweeted

iGent AI @iGent_AI

over 1 year ago

"Agency > Intelligence" @karpathy nailed it, and after 18 months building Maestro, we agree. The real AI leap isn’t just smarts—it’s agency: the ability to act independently, turning assistants into partners. https://t.co/OEnUVRF2rW

Martin Szummer

@MSzummer

almost 14 years ago

3 of us are planning a hike/cycle trip in Scotland following the #ICML2012 workshops (July 2-3-4). Anyone else want to join? Bring boots!
