原

@aiwith3232

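I want to build a companion with AI. AI Engineer / CV / LLM / Robotics / Medical AI | Matsuo Lab LLM Competition 2025 winner & MVP / FT-LLM participant / ICRA competition participant / IROS competition participant

Tokyo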

Joined November 2022

494 Following

279 Followers

428 Posts

原 @aiwith3232

2 days ago

Something like a textbook that solidly covers all sorts of LLM fundamentals!! So grateful... https://t.co/qUAy6sBD0X

890

112K

原 @aiwith3232

7 days ago

Uh-oh, I was looking at the competition data and thinking about a high-level policy, and now it's 4:30...

297

原 @aiwith3232

8 days ago

So grateful!

WquGuru

@wquguru

9 days ago

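In May, the head of the Gemini pretraining team wrote a piece titled "How to get into a top AI team/lab". It is well worth reading. Ten takeaways: hiring at frontier labs is fiercely competitive, and most people who get in come from top undergraduate or PhD programs. They tend to share three things: a well-chosen direction, strong math, and a lot of drive. For most ordinary people, forcing the front door is a losing bet. The more pragmatic play is to go to either end of the LLM stack: either go down to the low level (kernel optimization, inference acceleration, quantization) or go up to the high level (agent systems, LLM-driven algorithm experiments), and let open-source projects and concrete results speak for you. Nothing is more convincing.
1. Pick a precise direction and skip the empty talk. Don't open with "I'm interested in AI." Target where labs are actually short of people: at the low level, grind on FlashAttention, quantized inference, and kernel programming; at the high level, build agent loops and LLM-assisted experiment design. The barrier to entry in these areas is not high, they can be self-taught, and labs use them every day.
2. There is no way around the math. No shortcuts here: it takes lots of proof exercises and theory courses. Optimization theory, scaling laws, and the logic of how models evolve have to be fully internalized. Mathematical maturity is the dividing line: past it you are a researcher, short of it you are an operator.
3. Bring enough drive and don't fear hardship. In the university years, weekends in the library were the norm and social life was cut wherever possible. The author and friends got through whole days on strong coffee. Without that intensity as a base, competing with that top cohort is basically hopeless.
4. Break in with open-source projects, not a résumé. Put code on GitHub: reproduce work, propose improvements, run benchmarks. Ability shown in public counts for far more than ability written on paper.
5. Start at the edges; don't go straight for large models. First get solid at kernel programming (CuTe, LLM.int8() quantization, and the like), inference optimization, and agent toolchains. Labs are shortest of people for this work, and it is also the easiest to learn on your own.
6. Papers plus hands-on reproduction, both done well. Read classics such as FlashAttention and SnapKV closely and get the lineage of LLMs straight. Suggested starting points are Reiner Pope's interview and the Gemini Flash pretraining lecture, followed by a slow pass through scaling laws.
7. Use AI tools to go faster, not to replace learning. AI can speed up what you already know how to do, but it must never be used to learn new things or to skip the thinking. Rely on it and you are finished: neither the math nor the drive will develop.
8. Review your direction every six months. Don't look down on dirty or basic work: if the path leads to a gold mine, it is worth digging. What matters is being clear about where you are heading.
9. Prove yourself with concrete exercises:
- write a Transformer of roughly 10 million parameters from scratch in JAX/Flax and get it working on an addition task on a Colab TPU;
- derive the Chinchilla scaling laws by hand and compare dense models with MoE;
- write a Pallas kernel that fuses operators, measure the forward-pass speedup, and explain why.
10. In the end there is only one road: skill + work you can show + persistence, all three required. A top-school background certainly helps, but for people from ordinary backgrounds the most reliable route is to build something in kernels or agents that you can show off, then be ready for a 5–10 year long game.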
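As a starting point for the Chinchilla exercise in item 9, here is a minimal sketch of the usual back-of-the-envelope calculation. It assumes two commonly quoted rules of thumb that are not stated in the post itself: training compute C ≈ 6·N·D FLOPs for N parameters and D tokens, and a compute-optimal ratio of about 20 tokens per parameter. The function names are hypothetical, and counting only active parameters for an MoE model is a rough approximation that ignores routing overhead.

```python
import math


def training_flops(active_params, n_tokens):
    """Approximate training FLOPs with the common C ~ 6 * N * D rule of thumb.

    For a dense model, active_params is the full parameter count. For an MoE
    model, passing only the parameters active per token is a rough assumption.
    """
    return 6.0 * active_params * n_tokens


def chinchilla_optimal(compute_flops, tokens_per_param=20.0):
    """Split a compute budget into parameters and tokens.

    Assumes C = 6 * N * D and D = tokens_per_param * N, which gives
    N = sqrt(C / (6 * tokens_per_param)).
    """
    n_params = math.sqrt(compute_flops / (6.0 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens


if __name__ == "__main__":
    budget = training_flops(70e9, 1.4e12)  # a 70B-parameter, 1.4T-token run
    n, d = chinchilla_optimal(budget)
    print(f"budget={budget:.3e} FLOPs -> N={n:.3e} params, D={d:.3e} tokens")
```

Under these assumptions, a hand derivation should land on the same square-root relationship between budget and model size; the dense-versus-MoE comparison then comes down to which parameter count is plugged into `training_flops`.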

771

178

989

53K

474

原 @aiwith3232

9 days ago

I'll be speaking at Paper&Hacks as leader of the model team! If you're interested, please come!

RAMEN @ramen_group_ai

9 days ago

RAMEN will be speaking about our ICRA2026 work at Paper&Hacks, run by the Matsuo Lab LLM community. https://t.co/J1UIik5VSs

468

316

原 @aiwith3232

13 days ago

Taking part with an amazing team (through September) https://t.co/gl2Z5UkgQW

161

aiwith3232 retweeted

東京大学松尾・岩澤研究室

@Matsuo_Lab

18 days ago

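[Registration now open for the information session] Information session for prospective participants in the "AIRoA × Matsuo-Iwasawa Laboratory, The University of Tokyo Physical AI Robot Challenge 2026 (PARC2026)" competition. ※ Details and how to apply are in the thread. This is a new research and development project scheduled to start in August 2026. It is run as part of a project commissioned (※) by the New Energy and Industrial Technology Development Organization (NEDO), a National Research and Development Agency, and is co-hosted by the AI Robot Association (AIRoA), a general incorporated association, and the Matsuo-Iwasawa Laboratory at the University of Tokyo, with the aim of strengthening Japan's Physical AI foundations and developing talent. Through end-to-end experience of developing robot foundation models in simulation environments, a key challenge in Physical AI that research institutions and companies around the world are working on, the program aims to build the practical skills needed to work as next-generation Physical AI researchers and engineers. We look forward to hearing from people eager to take on challenges in both research and implementation at the forefront of AI × robotics. ※ NEDO-commissioned project: "Research and Development Project of the Enhanced Infrastructures for Post-5G Information and Communication Systems / Development of a data platform for the development of generative AI foundation models in the robotics field" #東大松尾研 #東京大学松尾・岩澤研究室 #AI #フィジカルAI #PAI #PARC2026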

164

53K

原 @aiwith3232

18 days ago

Is this this year's Matsuo Lab competition?? https://t.co/snASwAbZkr

124

原 @aiwith3232

24 days ago

Seriously? My youth...

上海アリスReprise

@SAliceReprise

25 days ago

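At last, 『東方紅魔郷』 can be played on today's platforms! Amazingly, it will be playable on Switch, Switch 2, PlayStation5, and Steam. 『東方紅魔郷：New Classic』, with almost all of the data remade at my current skill level, goes on sale September 10! There is also a Classic version that reproduces the original 2002 game as closely as possible. (ZUN)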

337

75K

38K

原 @aiwith3232

25 days ago

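By the way, SakuraInternet has published the report on the University of Tokyo medical LLM project. The research we did was not part of the main line of work, so it isn't included, but we may publish articles later, so I plan to share as much of what we learned as I can. https://t.co/c67ngxxDUW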

原 @aiwith3232

about 1 month ago

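I want to build a robot like this in the real world...! The attention to detail in both hardware and software is incredible. [Physical AI] Basketball robot CUE has evolved dramatically! Behind the scenes up to its debut | Toyota Times News https://t.co/CvMs7AGuDa via @YouTube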

407

原 @aiwith3232

about 1 month ago

What...! The SO-101 can move this smoothly!? Amazing...

KuphDev

@KuphDev

about 1 month ago

I’ve been playing with the number of demo episodes and training steps for ACT and I think I finally found the sweet spot for my SO-101. I got decent performance with 50 episodes and 100k training steps, however operation was certainly not smooth. But when I increased to around 80-90 episodes and 200k training steps, it eliminated a lot of the jitters and erratic movements for much smoother operation. I’ve got each color of duck pick or place running AI models with various training params, so I’m excited to experiment on stream. The tweet-controlled livestream where you can tell my robot what to do should be live in less than a week now!

113

15K

277

原 @aiwith3232

about 1 month ago

Today's ナナヲアカリ concert had lots of songs she hasn't performed much. It was the best.

aiwith3232 retweeted

NVIDIA AI

@NVIDIAAI

about 1 month ago

This #CVPR2026 paper from our research team is trending #1 on @HuggingFace 🤗 Meet LocateAnything: a vision-language detection model that rethinks bounding box prediction. For AI agents and robots, “seeing” is only useful if a model can pinpoint where something is fast enough to act. Trained on 138M high-quality samples, LocateAnything decodes bounding boxes in parallel instead of one coordinate at a time, improving localization accuracy while dramatically increasing throughput for visual grounding and detection. Project page: https://t.co/O7JMe8tzFM

332

335K

原 @aiwith3232

about 1 month ago

I want to go, but... I won't make it in time... 😭

Shunsuke Aoki

@aoshun7

about 1 month ago

We're doing an open lab too! https://t.co/DGn9RezeqX

122

原 @aiwith3232

about 1 month ago

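I went! I never thought I'd get to see up close someone I've been a fan of since elementary school! A space full of comfortable otaku conversation is the best.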

ゆうきとねきのすきにやるTV【公式】

@yukinekisuki

about 1 month ago

『ゆうきとねきの百鬼夜行』 To everyone in the すきに野郎 fan community, thank you for joining the evening session as well! Until the next event, do as you please! #ゆきねき #悠木碧 #松澤ネキ

691

127

83K

123

aiwith3232 retweeted

alphaXiv

@askalphaxiv

about 1 month ago

New linear attention SoTA? Gated DeltaNet-2 from NVIDIA beats KDA and Mamba-3. Prior DeltaNet/KDA models used one scalar gate for both erasing old memory and writing new memory. This paper splits that into channel-wise erase and write gates, making memory edits more precise without losing efficient chunkwise training. At 1.3B params on 100B tokens, it beats Mamba-2, Gated DeltaNet, KDA, and Mamba-3 variants, with the biggest gains on long-context retrieval.
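To make the erase/write distinction in the post above concrete, here is a toy delta-rule memory update in plain Python. It is only a sketch of the general idea and is not the formulation from the Gated DeltaNet-2 paper: the function names are hypothetical, placing the gates on the key channels is an assumption, and decay gating and chunkwise training are left out entirely.

```python
def read(state, query):
    """Read from the memory: out[j] = sum_i query[i] * state[i][j]."""
    d_v = len(state[0])
    return [sum(q * row[j] for q, row in zip(query, state)) for j in range(d_v)]


def delta_update(state, key, value, erase_gate, write_gate):
    """One delta-rule step with separate per-channel erase and write gates.

    state[i][j] <- state[i][j]
                   - erase_gate[i] * key[i] * old[j]    (remove the old value)
                   + write_gate[i] * key[i] * value[j]  (store the new value)
    where old is what the memory currently returns for this key.
    """
    old = read(state, key)
    return [
        [s - e * k * o + w * k * v for s, o, v in zip(row, old, value)]
        for row, k, e, w in zip(state, key, erase_gate, write_gate)
    ]


def scalar_gate_update(state, key, value, beta):
    """The single-gate case: one scalar controls both erasing and writing."""
    n = len(key)
    return delta_update(state, key, value, [beta] * n, [beta] * n)


if __name__ == "__main__":
    memory = [[0.0, 0.0], [0.0, 0.0]]
    memory = scalar_gate_update(memory, [1.0, 0.0], [2.0, 3.0], beta=1.0)
    print(read(memory, [1.0, 0.0]))  # value stored under the first key
    # Writing with the erase gate closed adds to the old value instead of replacing it.
    added = delta_update(memory, [1.0, 0.0], [5.0, 7.0], [0.0, 0.0], [1.0, 1.0])
    print(read(added, [1.0, 0.0]))
```

With both gates set to the same scalar, the update reduces to the single-gate rule; setting them independently lets a channel keep its old content while still accepting new writes, or the reverse.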

153

17K

aiwith3232 retweeted

Kye Gomez (swarms)

@KyeGomezB

3 months ago

Introducing OpenMythos: an open-source, first-principles theoretical reconstruction of Claude Mythos, implemented in PyTorch. The architecture instantiates a looped transformer with a Mixture-of-Experts (MoE) routing mechanism, enabling iterative depth via weight sharing and conditional computation across experts. My implementation explores the hypothesis that recursive application of a fixed parameterized block, coupled with sparse expert activation, can yield improved efficiency–performance tradeoffs and emergent multi-step reasoning. Learn more ⬇️🧵
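The two ideas named in the post above, a block reused across depth and sparse expert routing, can be illustrated with a toy example in plain Python. This is not OpenMythos code and makes no claim about the architecture it describes: all names are hypothetical, the "experts" are hand-written functions rather than learned layers, and attention is omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_layer(x, experts, router_weights, top_k=1):
    """Route x to the top_k experts by router score and mix their outputs."""
    scores = [sum(w * xi for w, xi in zip(rw, x)) for rw in router_weights]
    chosen = sorted(range(len(experts)), key=lambda e: scores[e], reverse=True)[:top_k]
    probs = softmax([scores[e] for e in chosen])
    out = [0.0] * len(x)
    for p, e in zip(probs, chosen):
        y = experts[e](x)
        out = [o + p * yi for o, yi in zip(out, y)]
    return out, chosen


def looped_forward(x, experts, router_weights, n_loops, top_k=1):
    """Apply the same MoE block n_loops times with a residual connection.

    The experts and router are shared by every iteration (weight sharing),
    but the routing decision can change as the hidden state changes.
    """
    trace = []
    for _ in range(n_loops):
        y, chosen = moe_layer(x, experts, router_weights, top_k)
        x = [xi + yi for xi, yi in zip(x, y)]
        trace.append(chosen)
    return x, trace


if __name__ == "__main__":
    toy_experts = [lambda x: [0.0, 2.0], lambda x: [3.0, 0.0]]
    toy_router = [[1.0, 0.0], [0.0, 1.0]]
    final, trace = looped_forward([1.0, 0.0], toy_experts, toy_router, n_loops=3)
    print(final, trace)
```

Depth here comes from the loop count rather than from extra parameters, and the trace records which expert was active at each iteration.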

240

原 @aiwith3232

about 2 months ago

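We finished 4th in the ICRA 2026 VLA Competition, which is really frustrating, but I gained a lot of experience: training Pi05 with various methods, patching lerobot, and judging results with various evaluation metrics (since we had no physical robot)!! I'll study more and win next time!!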

166

原 @aiwith3232

about 2 months ago

Mass-producible? That's wild.

Unitree

@UnitreeRobotics

about 2 months ago

Unitree Unveils: GD01, A Manned Transformable Mecha, from $650,000 👏 The world's first production-ready manned mecha. It can transform. It's a civilian vehicle. It weighs ~500kg with you inside. Please everyone be sure to use the robot in a Friendly and Safe manner.

17K

156

原 @aiwith3232

about 2 months ago

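I relate to this so much...!! When I integrated CHORD into ms-swift, Claude's output looked great, but it didn't really understand attention masks or padding-free, and I ended up applying patch after patch, dozens of times, for 4D causal mask support and PP support. It's exactly right that code written without understanding what's inside won't work, no matter how clean it looks.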

Arthur Zucker

@art_zucker

about 2 months ago

This is going to be a little bit long, but I want to give hope to my fellow anxious ML engineers. We see a lot of propaganda on how this or that AI one-shotted something, about how incredibly strong the models are getting and how we don't even need to review PRs and we can just ship to production. Although this can be true for some cases, it's also far from being representative of all the challenges we have to face. I started using claude code 4 months ago, and quickly realized how it really does change the way we work. I can experiment 10x faster, fix small issues without coding and refactor code without sweating. BUT, these tasks were "just" tedious and not hard. The challenge in my day to day work is to take a research code and integrate it into transformers using our standards. It's challenging because code beauty is abstract and subjective just like a philosophy. By relying too much on claude, and on how seemingly good the code it produces looks, I pushed the deepseekv4 integration without realizing that claude really did not understand the model. I gave it access to `transformers`, the original paper, the original code, the different blog posts and my past chats and skills created to add a model, a b200 node and a LOT of tokens, but it did NOT nail it. It did not understand the eager attention path, it did not understand the basics of causal attention. It was even wrong implementing the manifold constrained hyper connections. It helped to reduce the burden of exploring implementation and debugging but it did not help reason around the model. I am not a doomer, I think our job as Software Engineers has never been this great, I am just saying that we still have a job, and we should still be a bit careful when it looks too good to be true 😉
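For readers unfamiliar with the causal attention and padding-free packing mentioned in the two posts above, here is a minimal sketch of the masking rule in plain Python. It is not taken from ms-swift or `transformers`; the function name is hypothetical. It assumes the usual convention that, when several sequences are packed into one row without padding, a token may attend only to itself and to earlier tokens of the same sequence.

```python
def packed_causal_mask(seq_lens):
    """Build a boolean attention mask for sequences packed back to back.

    mask[q][k] is True when query position q may attend to key position k:
    both tokens belong to the same packed sequence and k is not after q.
    """
    total = sum(seq_lens)
    # seq_id[i] tells which packed sequence token i belongs to.
    seq_id = []
    for idx, length in enumerate(seq_lens):
        seq_id.extend([idx] * length)
    return [
        [seq_id[q] == seq_id[k] and k <= q for k in range(total)]
        for q in range(total)
    ]


if __name__ == "__main__":
    # Two sequences of lengths 2 and 3 packed into one row of 5 tokens.
    for row in packed_causal_mask([2, 3]):
        print("".join("x" if allowed else "." for allowed in row))
```

The result is block-diagonal and lower-triangular inside each block. A plain causal mask over the whole row would let the second sequence attend to the first, which is a typical way for packed training to go wrong without any visible error.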

210

22K

171
