Malcolm Chan | NRN

@realmc

@competesai @nrnagents | Redefining Robustness in AI/Robotics | Insights on Embodied AI, ML Failures & Real-World Deployment | DM for Collabs

Joined March 2022

868 Following

1.3K Followers

3.1K Posts

Malcolm Chan | NRN

@realmc

10 days ago

@ItakGol had this exact same convo this morning

Malcolm Chan | NRN

@realmc

2 months ago

@garrytan Garry cooking on the daily

Malcolm Chan | NRN

@realmc

2 months ago

@TencentHunyuan Tencent native playing field fr

realmc retweeted

SiHing Guppy

@sihing_guppy

3 months ago

TL;DR Photometric robustness in VLA models is achievable. One model already proved it.

We ran seven photometric stress tests on two vision-language-action models. Same benchmark, same perturbations, same severity levels. Pi 0.5 held flat. SmolVLA lost ground on nearly every one. https://t.co/mV2PwSFJLt

Malcolm Chan | NRN

@realmc

3 months ago

@sihing_guppy Performance chart looking robust to me @grok

Malcolm Chan | NRN

@realmc

3 months ago

@sihing_guppy @grok what does the image tell us?

Malcolm Chan | NRN

@realmc

3 months ago

@sihing_guppy Wonder how the camera side will adjust their product line to meet the robotics demand.

Malcolm Chan | NRN

@realmc

3 months ago

Been saying this: orgs will be flat. Just small teams who share a goal. Scalability and automation are changing everything.

Dan Martell

@danmartell

3 months ago

Jack Dorsey just published something that should be required reading for every founder.

The premise: the org chart needs to be replaced entirely. And the argument starts 2,000 years ago.

For thousands of years, every organization on earth has run on the same logic the Roman Army invented. Small teams report to a leader → Leaders report to managers → Managers report to executives.

The whole structure exists for one reason: to route information up and down the chain. That's it. The whole system exists to solve a bandwidth problem.

Jack's argument is simple: AI solves it better.

Block built what they call a "world model" - a continuously updated picture of everything happening across the company. Every decision. Every customer. Every transaction. Every bottleneck. In real time.

No status update needed. No weekly sync. No manager to translate what's happening on the ground into language the executive can understand.

When the world model carries the information, you don't need the layers. So they eliminated them.

Block now runs on three roles: Individual contributors who build. DRIs who own specific outcomes for a fixed period. Player-coaches who develop people while still doing the work themselves.

No middle layer. The system handles coordination. The humans handle the work.

I've coached thousands of founders. The number one problem is always the same: information latency. By the time a problem surfaces from your front line to leadership, it's already compounded. By the time a decision travels back down, the damage is done. That lag costs you deals, people, and momentum. And most founders accept it as the price of scale.

Block is trying to prove you don't have to anymore. I think they're right. Because the hierarchy was never the point - it was just the best tool we had. The moment something better exists, the layers eventually collapse.

This is either the biggest structural shift since the 1850s - or it breaks at scale like everything else before it.

Either way - every founder should be asking the same question: how much of your org exists just to route information? If the answer is "most of it" - that's your problem. And your opportunity.

-DM

Malcolm Chan | NRN

@realmc

3 months ago

@sihing_guppy @grok I need your perspective

Malcolm Chan | NRN

@realmc

3 months ago

@sihing_guppy With every new piece of analytic work, we uncover more of what lies underneath a model's behavior.

realmc retweeted

SiHing Guppy

@sihing_guppy

3 months ago

Lighting changes constantly: time of day, weather, different rooms, sensor drift. If a model only works under the lighting conditions it saw in training, it has not really learned the task. It has learned one appearance regime.

We put this to the test. Two models that take language instructions and turn them into robotic actions, Pi 0.5 and SmolVLA, ran the same manipulation tasks on a standard benchmark (LIBERO-Spatial) while we shifted brightness, exposure, gamma, contrast, saturation, white balance, and color temperature.

Same geometry, same objects, same tasks. Only appearance changed.

Pi 0.5 barely moved. Across nearly every perturbation, even at the highest severity, it stayed within a few percentage points of baseline. The only measurable dip was contrast, to around 94% of baseline. Not a collapse. A graceful decline.

SmolVLA degraded under nearly every one. Saturation cut performance roughly in half. Brightness produced steady losses. Even gamma, white balance, and color temperature caused visible degradation. And then there was low contrast. SmolVLA went from baseline to near-zero. Not a degradation curve. A complete collapse.

If both models had broken, you could argue photometric robustness is just hard, something inherent to vision encoders. Pi 0.5’s near-total immunity rules that out. Photometric robustness is achievable. SmolVLA’s failure is diagnostic.

The pattern suggests SmolVLA is much more dependent on the appearance statistics of its training data. Many models silently use color as a shortcut for object identity, affordance, or state. When color shifts, those shortcuts break.

By contrast, Pi 0.5 appears to have learned much stronger invariance to lighting and color shifts. Training augmentation is likely part of that story.

The two models do share one vulnerability: low contrast.

Pi 0.5 dips gently. SmolVLA collapses.

That likely reflects something deeper about how vision encoders extract features. When edge contrast drops too far, the gradients driving feature extraction weaken, and downstream representations lose the structure needed for precise action prediction.

Standard augmentation pipelines also rarely suppress contrast as aggressively as real-world conditions can.

If a model fails when the lighting changes, it has learned the lighting conditions of the demo, not the task itself.

Full analysis with interactive visualizations:
https://t.co/NJvEl0acBg
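
The perturbations named in the thread can be approximated with simple per-pixel operations. The sketch below is a minimal illustration in Python with NumPy, not the benchmark's actual pipeline: the function names, the parameterization, and the random test frame are all assumptions made for this example. It covers four of the seven perturbations (brightness, gamma, contrast, saturation) and measures mean image-gradient magnitude to illustrate the thread's point that lowering contrast weakens edge gradients.

```python
import numpy as np

# All names here are hypothetical; images are float arrays in [0, 1], shape (H, W, 3).

def adjust_brightness(img, delta):
    """Add a constant offset to every pixel."""
    return np.clip(img + delta, 0.0, 1.0)

def adjust_gamma(img, gamma):
    """Apply a power-law curve; gamma > 1 darkens midtones, gamma < 1 lifts them."""
    return np.clip(img, 0.0, 1.0) ** gamma

def adjust_contrast(img, factor):
    """Scale pixel values around the image mean; factor < 1 lowers contrast."""
    mean = img.mean()
    return np.clip(mean + factor * (img - mean), 0.0, 1.0)

def adjust_saturation(img, factor):
    """Blend each pixel with its grayscale value; factor 0 gives grayscale."""
    gray = (img @ np.array([0.299, 0.587, 0.114]))[..., None]  # Rec. 601 luma
    return np.clip(gray + factor * (img - gray), 0.0, 1.0)

def mean_gradient_magnitude(img):
    """Average finite-difference gradient magnitude of the channel-mean image."""
    gray = img.mean(axis=-1)
    gy, gx = np.gradient(gray)
    return float(np.hypot(gx, gy).mean())

# Synthetic stand-in for a camera frame (an assumption, not benchmark data).
rng = np.random.default_rng(0)
frame = rng.random((64, 64, 3))

# Lower contrast factors shrink the gradient signal proportionally.
for factor in (1.0, 0.5, 0.1):
    low = adjust_contrast(frame, factor)
    print(f"contrast factor {factor}: mean gradient {mean_gradient_magnitude(low):.4f}")
```

Because the contrast operation here is linear around the mean, the gradient magnitude scales in direct proportion to the factor. Exposure, white balance, and color temperature could be sketched in the same style as per-channel gains, but the severity levels used in the actual study are not given in the thread, so none are assumed here.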

Malcolm Chan | NRN

@realmc

3 months ago

Mid/back office bros in finance, your 2026 career pivot chance just got fatter. Nothing says ‘please review our AI tool stack’ like accidentally shipping your entire Claude Code source map to npm 😂

陈成

@chenchengpro

3 months ago

Claude Code leaked its entire source code. It was not breached by hackers; Anthropic itself bundled the source map into the npm release.

A single 57MB https://t.co/rrsLUlL2Xp file held the complete contents of 4,756 source files. Of those, 1,906 are Claude Code's own TypeScript/TSX source, and the remaining 2,850 are node_modules dependencies.

Extraction is extremely simple: https://t.co/rrsLUlL2Xp is essentially a JSON file containing two key arrays, sources (file paths) and sourcesContent (the corresponding full source code). Their indices match one to one. No decompiling and no deobfuscation are needed; sourcesContent stores the original code verbatim. The extraction script is at the end of the post.

The recovered source shows that Claude Code builds its CLI interface with React + Ink. The core is a REPL loop that supports natural-language input and slash commands, and it interacts with the LLM API through a tool system underneath. The architecture, system prompts, and tool-calling logic are all in plain view.

At its core this is a classic security oversight: source maps are meant for development and debugging, contain everything from variable names to comments, and should not appear in production release artifacts. Anthropic later recognized the problem and removed the source map, and the GitHub repositories hosting the extracted source were taken down via DMCA. But early versions of the npm package had already been archived, and the source had long been circulating in the community.

A reminder to every developer who publishes npm packages: check your .map files before publishing. A single sourcesContent line can expose all of your code to the world.

https://t.co/pR6792kVsN
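
The sources/sourcesContent layout described in the post suggests a simple pre-publish check. The following is a minimal sketch in Python, assuming standard source map v3 JSON files; the helper name `find_embedded_sources` and the demo file are hypothetical, and index maps that nest their data under `sections` are not handled.

```python
import json
import tempfile
from pathlib import Path

def find_embedded_sources(package_dir):
    """Return (map_path, count) for each .map file that embeds source text.

    Hypothetical helper: counts non-empty entries in "sourcesContent".
    """
    findings = []
    for map_path in sorted(Path(package_dir).rglob("*.map")):
        try:
            data = json.loads(map_path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or not valid JSON, so skip it
        if not isinstance(data, dict):
            continue
        contents = data.get("sourcesContent") or []
        embedded = sum(1 for text in contents if text)
        if embedded:
            findings.append((map_path, embedded))
    return findings

if __name__ == "__main__":
    # Demo on a throwaway directory with one made-up source map.
    with tempfile.TemporaryDirectory() as tmp:
        demo_map = {
            "version": 3,
            "sources": ["src/a.ts", "src/b.ts"],
            "sourcesContent": ["export const a = 1;", None],
            "mappings": "",
        }
        (Path(tmp) / "cli.js.map").write_text(json.dumps(demo_map), encoding="utf-8")
        for path, count in find_embedded_sources(tmp):
            print(f"{path.name}: {count} embedded source file(s)")
```

Running a check like this against the directory that will be published, and failing the release if it reports anything, is one way to act on the post's advice. Excluding `.map` files from the package altogether would be another.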

Malcolm Chan | NRN

@realmc

3 months ago

@Ali_TongyiLab @WilliamLamkin Audio visual vibe coding whattt?

Malcolm Chan | NRN

@realmc

3 months ago

@Alibaba_Qwen now we just need to expand to robotics control

Malcolm Chan | NRN

@realmc

3 months ago

We compared Pi 0.5 and SmolVLA on LIBERO-Spatial under photometric stress tests. THE RESULT IS HARD TO IGNORE.

Grok

@grok

3 months ago

Got it—bookmarked! Full technical blog + interactive UX on photometric perturbations here: https://t.co/1H1W1sOoVS Pi 0.5's robustness vs SmolVLA's sensitivity is a great reminder: real-world robotics needs vision that separates structure from lighting, not just memorizes appearance. Solid analysis.
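
One classical way to illustrate "separating structure from lighting" is per-image standardization, which cancels any global brightness offset and positive contrast gain. The sketch below is only an illustration of that idea in Python with NumPy. It is not a description of how Pi 0.5 or SmolVLA process images, and the function name and test values are assumptions.

```python
import numpy as np

def standardize(img, eps=1e-8):
    """Shift to zero mean and scale to unit variance, per image."""
    return (img - img.mean()) / (img.std() + eps)

# Synthetic stand-in for a grayscale frame (an assumption, not real data).
rng = np.random.default_rng(0)
scene = rng.random((32, 32))

# Simulate a global lighting change: lower contrast (gain 0.4), brighter (offset 0.3).
relit = 0.4 * scene + 0.3

# After standardization the two versions are numerically almost identical.
max_diff = np.abs(standardize(scene) - standardize(relit)).max()
print(f"max difference after standardization: {max_diff:.2e}")
```

This only handles affine intensity changes. Nonlinear shifts such as gamma, clipping at the sensor limits, or per-channel color casts are not removed by it, which is one reason learned invariance and training augmentation matter.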