Tim Messerschmidt

@SeraAndroid

DevRel Ecosystems Lead EMEA at Google. Proud dad, happy husband, and feminist. O'Reilly author. I ♥️ home automation. Opinions stated here are my own.

Berlin, Germany • he/him

Joined January 2010

1.4K Following

6.3K Followers

25.7K Posts

Tim Messerschmidt

@SeraAndroid

about 1 hour ago

Another day, another notable release by the @googlegemma team - the QAT checkpoints mean you benefit from compressed models which basically suffer no accuracy loss 🤯

Google Gemma

@googlegemma

about 4 hours ago

We just dropped Gemma 4 Quantization-Aware Training (QAT) checkpoints on Hugging Face! All Gemma 4 model sizes and their drafters are now optimized with QAT to cut memory requirements and maximize on-device performance!

119

419

129K

Tim Messerschmidt

@SeraAndroid

about 4 hours ago

What a cool model to run locally. I just found my favorite new coding companion and will have a lot of fun with the Collider app: https://t.co/l3vIF9Ix2C

Google Gemma

@googlegemma

1 day ago

Introducing Magenta RealTime 2, a new open model musicians can play as an instrument! Run low-latency, live music synthesis natively on your MacBook using MIDI, text, and audio. 🎶 We love seeing Google’s open model ecosystem grow!

280

166K

Tim Messerschmidt

@SeraAndroid

about 17 hours ago

I look forward to seeing what people will build with these skills!

Patrick Loeber

@patloeber

1 day ago

new skills repo from deepmind to speed up agentic scientific workflows https://t.co/BwpWjX3jN1

Tim Messerschmidt

@SeraAndroid

2 days ago

Gemma 4 12B is a great addition to the Gemma 4 family, especially if you want to run multimodal agents locally. What makes it stand out is its encoder-free architecture. Instead of separate vision and audio encoders adding latency, raw signals project directly into the LLM backbone. This means native, low-latency multimodal reasoning on an everyday 16GB laptop. Learn more: https://t.co/xtVlh9Xi97

Google

@Google

2 days ago

Today we’re introducing Gemma 4 12B — our latest open model that brings advanced agentic reasoning, vision and audio directly to your laptop. It delivers performance nearing our larger Gemma models with a much smaller total memory footprint, while being small enough to run locally with just 16GB of VRAM. It’s open and accessible for everyone to use under a permissive Apache 2.0 license. This is all made possible by our new, unified architecture that removes separate multimodal encoders. Here’s how we did it 🧵

Google's tweet photo. Today we’re introducing Gemma 4 12B — our latest open model that brings advanced agentic reasoning, vision and audio directly to your laptop.

It delivers performance nearing our larger Gemma models with a much smaller total memory footprint, while being small enough to run locally with just 16GB of VRAM. It’s open and accessible for everyone to use under a permissive Apache 2.0 license.

This is all made possible by our new, unified architecture that removes separate multimodal encoders. Here’s how we did it 🧵

245

830K

Who to follow

Ben Weiss

@keyboardsurfer

@AndroidDev Engineer @Google - Android MCP 👨‍💻 Your app, everywhere 🦋 https://t.co/RdInGcWTMW (he/him)

Uttam Tripathi

@tripathiuttam

VP, DevRel @ Qualcomm | Ex-Google, Ex-Amazon | Developer, Engineer, Startups, AI/ML, Loves Travel, Cooking and Cricket; Views are mine alone

Florina Muntenescu

@FMuntenescu

Android Developer Relations Engineer @ Google 🇷🇴 - 🇩🇪 - 🇬🇧

SeraAndroid retweeted

Varun Mohan

@_mohansolo

3 days ago

We’ve rolled out a new version of Gemini 3.5 Flash in Antigravity that boasts much less and has higher endurance on harder tasks. Thanks for all of the feedback on the model. Keep it coming, we will act quickly across the stack to make the experience even better. We’ve also gone ahead and reset Gemini rate limits for all users so you can start running this new model immediately.

168

139

108K

Tim Messerschmidt

@SeraAndroid

4 days ago

I enjoy seeing new benchmarks like DeepSWE and ProgramBench hit the spotlight as models become more capable. Previous benchmarks are getting saturated and it's harder to meaningfully compare what these systems can actually do. What I especially appreciate: cost is now an axis. That matches a builder's reality far more closely than most evaluations out there. https://t.co/Xm7Rh4dzZB https://t.co/bIe0Xhipjr

Tim Messerschmidt

@SeraAndroid

4 days ago

The team at @cursor_ai released their Developer Habits Report showing the massive shift in how software is built. According to the report, AI isn't leveling the playing field - it's widening it. Here are my 3 takeaways for engineering leaders: 1. P99 power users are producing 46x more lines of code and merging 15x more PRs than the median. AI productivity is highly concentrated. Rollout is easy; scaling the habits, workflows, and prompt patterns of your top 1% is where the actual value lies. 2. Context is the new compiler. The input-to-output token ratio is spiking, and cache-reads now account for ~90% of token activity. Clean codebase architecture and robust workspace indexing are now direct drivers of model output quality. Spaghetti code = bad AI results. 3. Trust is shifting to automation. Over 36% of agent-generated changes are now accepted and committed without manual review. The bottleneck has officially moved from writing code to validating it. Without automated testing and security guardrails, agentic throughput will stall. > We are moving from "copilots" helping individuals to agents acting as development infrastructure. The challenge now isn’t the quality of the raw model—it’s the quality of the system you build around it. https://t.co/sQQI8YoJjk

Tim Messerschmidt

@SeraAndroid

7 days ago

🔬 It's always fun to take new open models for a spin — StepFun's Step 3.7 Flash (MoE) dropped today, so I ran the NVFP4 variant on my 2x DGX Spark setup. First impressions: → Prefill throughput is solid (~2.7-3K t/s) → Decode is on the slower side (~21-42 t/s depending on concurrency) → The NVFP4 variant doesn't ship MTP-layer weights — that's a miss → KV cache is hungry Tool-calling quality scored a perfect 100/100 on tool-eval-bench — all 15 scenarios passed. But responsiveness landed at 30/100 with a 5.3s median turn time. The pattern is interesting: high quality output, but the latency cost is real. Usable for daily experiments, but this quant doesn't quite compete with faster options for interactive use. Release: https://t.co/URnYBV9i0F NVFP4 variant: https://t.co/xMq35Qh99n

SeraAndroid's tweet photo. 🔬 It's always fun to take new open models for a spin — StepFun's Step 3.7 Flash (MoE) dropped today, so I ran the NVFP4 variant on my 2x DGX Spark setup.

First impressions:
→ Prefill throughput is solid (~2.7-3K t/s)
→ Decode is on the slower side (~21-42 t/s depending on concurrency)
→ The NVFP4 variant doesn't ship MTP-layer weights — that's a miss
→ KV cache is hungry

Tool-calling quality scored a perfect 100/100 on tool-eval-bench — all 15 scenarios passed. But responsiveness landed at 30/100 with a 5.3s median turn time.

The pattern is interesting: high quality output, but the latency cost is real. Usable for daily experiments, but this quant doesn't quite compete with faster options for interactive use.

Release: https://t.co/URnYBV9i0F
NVFP4 variant: https://t.co/xMq35Qh99n

727

Tim Messerschmidt

@SeraAndroid

8 days ago

Typically vision-language models decode bounding boxes the same way they decode text — one coordinate token at a time. x1, then y1, then x2, then y2. Sequentially. It works, but it's slow and the coordinates have no awareness of each other during generation. NVIDIA's LocateAnything-3B takes a different approach: Parallel Box Decoding. Each bounding box is predicted atomically in a single forward pass. The result is significantly faster decoding throughput and better localization accuracy — because the coordinates are geometrically coherent by design, not by luck. What makes it interesting for you? It's a single 3B-parameter model (built on Qwen2.5-3B) that handles document understanding, GUI grounding, dense object detection, and OCR localization under one unified architecture. Small enough to run locally, capable enough to be useful. There's a live demo on HuggingFace if you want to try it before reading the paper. 🤗 https://t.co/6UErIWqwIH 📄 https://t.co/V41HqhK9W2 #AI #ComputerVision #ObjectDetection

Tim Messerschmidt

@SeraAndroid

8 days ago

Most tool-calling benchmarks test models in ideal conditions — clean context, well-formed payloads, single-turn. That's not how agents work in production. I built tool-eval-bench to find out what actually breaks. 74 deterministic scenarios testing multi-turn chains, safety boundaries, structured output, and error recovery — against any OpenAI-compatible endpoint (vLLM, llama.cpp, LiteLLM). Mocks inject realistic noise (extra metadata, timestamps, nested objects) because real APIs are messy. The feature I keep coming back to: --context-pressure. It pre-fills your context window before each scenario to simulate real agentic load. In my testing, most models hold up fine through 50% pressure. Past 75%, tool selection degrades, parameters get hallucinated from earlier context, and multi-turn chains collapse. The breaking point depends as much on your KV cache config as on the model itself. Also includes --spec-live for a live terminal view of speculative decoding acceptance rates, and integrates with llama-benchy for prefill/decode throughput sweeps. Heavily inspired by @stevibe's BenchLocal — I wanted to extend that foundation with multi-turn edge cases, structured output schemas, and pressure testing under load. https://t.co/dgZN8uqygi #AgenticAI #LLMs #vLLM

114

SeraAndroid retweeted

Shengzhe

@shengzheyao

9 days ago

Antigravity CLI 1.0.3 is just out! Now you can use Google AI credits when quota runs out. - ⁠/config -> UseF1Credits⁠ to turn it on. /credits⁠ to check balance. - Enhanced logo on Apple Terminal and more informational color scheme preview panels. - Improved ⁠/diff⁠ experience and various critical fixes. Getting started: https://t.co/EfaMSLXLX1

455

102

32K

Tim Messerschmidt

@SeraAndroid

9 days ago

@0xSero Glad it worked!

133

Tim Messerschmidt

@SeraAndroid

9 days ago

💡 The most underrated AI coding technique isn't writing code faster. It's writing better code more slowly. 👉 The insight from Nolan Lawson's approach: run multiple models on every PR, cross-validate their findings, fix what's real. Near-zero false positives. 💪 That's not vibe coding. That's engineering discipline. https://t.co/xXfJolmstp

106

SeraAndroid retweeted

Jack Wotherspoon

@JackWoth98

9 days ago

Theo hits a lot of points that ring true in my own workflow. → Learn how to best interact with the diff models → Get a remote coding setup (no half open laptop😅) → Play around with style of your AGENTS.md file → Create a gold standard spec for reference Take the time to teach your agent how you like to work with proper context and references, you'll notice the difference over time💡

SeraAndroid retweeted

Shengzhe

@shengzheyao

15 days ago

Antigravity CLI 1.0.1 is out. Key updates: - Fixed OAuth not persisting in some environments. - Enhanced the visual experience on Windows. - Added the new "proceed in sandbox" permission control. Restart agy to auto update or run “agy update". See the full changelog for details: https://t.co/zJNgkJck3Z

473

116

49K

SeraAndroid retweeted

Logan Kilpatrick

@OfficialLoganK

16 days ago

We just 3xed the rate limits across all tiers in Antigravity so that you can put 3.5 Flash through its paces even more, enjoy, and keep the feedback coming! :)

252

152

204

297K

Tim Messerschmidt

@SeraAndroid

Who to follow

Last Seen Users on Sotwe

Trends for you

Most Popular Users