atom @ctxzz - Twitter Profile

1 day ago

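"SIMFOUNDRY," a real-to-sim pipeline that automatically generates interactive simulation environments for robots from a single image or video. It brings real-to-sim evaluation and sim-to-real training into one framework. SIMFOUNDRY: Modular and Automated Scene Generation for Policy Learning and Evaluation https://t.co/zvp4vq33ZO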

0

21

ctxzz retweeted

Ideogram @ideogram_ai

3 days ago

Introducing Ideogram 4.0: the best open image model in the world. Think it. Make it. Own it. Download the weights, fine-tune on your own data, and run it on your hardware. Live on every Ideogram plan and the API today.

398

8K

859

7K

2M

ctxzz retweeted

elie

@eliebakouch

3 days ago

microsoft MAI tech report is a gold mine, one of the most transparent for a model at this scale.

this model uses zero synthetic data or distillation from previous models. this means reasoning, agentic behavior, tool use are all learned fully during post-training with no cold start. bold choice that makes it harder and requires more iterations to reach sota, but you get FULL control over your model series and it proves they are serious about being a frontier lab.

the tech report is insanely detailed and precise about numbers. to give an example, they give the exact MFU across all the iterations of the model, with the exact changes etc. they also share the full scaling ladder recipe, to my knowledge this is the first time i've seen this in a tech report at this scale

let's look at all of this in this likely very long thread 🧵

41

2K

263

2K

269K

atom @ctxzz

3 days ago

"LookAHead," a hands-free PC control interface that needs only a single camera, aimed at people who have difficulty using their hands and feet. LookAHead: Hybrid Gaze (Look) And Head Refinement Approach for Hands-Free Computer Interaction | Proceedings of the ACM on Human-Computer Interaction https://t.co/qawtPAkxZn

0

21

atom @ctxzz

4 days ago

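"VideoSketcher," which treats sketch generation as a drawing process rather than a static image and generates sketch videos from text with a human-like drawing order. It also enables interactive human-AI interaction. VideoSketcher: Video Models Prior Enable Versatile Sequential Sketch Generation https://t.co/xKmgfh7bMk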

0

30

atom @ctxzz

4 days ago

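"CoPE-VideoLM," a tokenization scheme for VideoLMs that directly uses the video codec's motion vectors and residuals to compress frames into a small number of Δ tokens. It cuts the token count by up to 93% while matching or exceeding accuracy. CoPE-VideoLM: Leveraging Codec Primitives For Efficient Video Language Modeling https://t.co/ZXbxQWzwdl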

0

1

0

15

ctxzz retweeted

Geospatial Information Authority of Japan (国土地理院) @GSI_chiriin

5 days ago

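[Release] Today we opened a trial version of the "3D Map Visualization Site"! Making the shape of the nation easier to understand. You can view data from the "3D Digital Japan Basic Map" in your web browser, with buildings, roads, and railways rendered in three dimensions. ▼ The "3D Map Visualization Site" is here: https://t.co/U13bpYibub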

10

4K

1K

2K

725K

atom @ctxzz

7 days ago

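Theory and experiments on using an LLM as an affordance-based partial world model rather than as a full world model. The approach greatly reduces the search branching factor while achieving high reward and efficient planning. Affordances Enable Partial World Modeling with LLMs https://t.co/6mOcJBCD2A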

0

16

atom @ctxzz

7 days ago

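Prompt injection through audio. Adversarial audio that sounds like ordinary speech to people but is interpreted as commands by AI makes it possible to remotely control AI agents such as those from Azure and Mistral AI. AI voice bots hijacked by ‘hidden’ sounds in podcasts, MP3 files and YouTube clips https://t.co/KVayXKYJ26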

0

1

0

45

atom @ctxzz

8 days ago

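A simulation framework that builds persona-driven synthetic PC environments, has agents carry out long-horizon work in them, and collects experience signals. The resulting lessons and failure patterns are organized into occupation-specific skills. Synthetic Computers at Scale for Long-Horizon Productivity Simulation https://t.co/B7iIX0PuCP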

0

44

ctxzz retweeted

Jaya Gupta

@JayaGup10

9 days ago

https://t.co/eUp5waUIwl

58

444

56

809

347K

atom @ctxzz

9 days ago

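Defines the ratio between evaluation with images and evaluation without images as the "Mirage Score," and points out a structural problem: many representative medical benchmarks can be solved with high accuracy from text alone. Contributing factors include the text and question structure, leakage from public data, and distributional statistics. MIRAGE: The Illusion of Visual Understanding https://t.co/6UbQNCcSSk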

0

34

atom @ctxzz

10 days ago

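"ATOKEN," a tokenizer that maps images, video, and 3D into a single 4D latent space. It achieves reconstruction and semantic understanding at the same time, with strong results such as 82.2% zero-shot accuracy. AToken: A Unified Tokenizer for Vision https://t.co/pGeofmN73x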

0

46

ctxzz retweeted

Google Flow

@FlowbyGoogle

11 days ago

Celebrating one week of Gemini Omni in Google Flow with 7 examples of what it can do 🧵

29

1K

105

1K

149K

atom @ctxzz

11 days ago

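"MedMSA (Medical Model Synthesis Architecture)," which targets diagnosis under uncertainty in clinical settings and makes the reasoning process verifiable and open to intervention at the code level. An LLM converts symptom descriptions into constructs such as condition statements in the probabilistic programming language WebPPL, which then performs the inference. Medical Model Synthesis Architectures: A Case Study https://t.co/zulIRsycgF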

0

37

atom @ctxzz

11 days ago

Efficient Video Intelligence in 2026 https://t.co/NadJ3gbrWV

0

7

atom @ctxzz

12 days ago

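"Theory of Space," which evaluates whether LLMs/VLMs can autonomously decide where to look under partial observability and build, revise, and use an internal "spatial belief (cognitive map)." #ICLR2026 Theory of Space: Can Foundation Models Construct Spatial Beliefs Through Active Exploration? https://t.co/5UXtbLEPM9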

1

0

32

atom @ctxzz

13 days ago

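To test whether local LLMs are really cheaper than the cloud, this post estimates inference cost on an Apple Silicon M5 Max, including electricity and hardware lifetime. The result was $0.40 to $4.79 per 1M tokens, which is roughly the same as or more expensive than Gemma4 31B on OpenRouter, and the cloud was also several times faster. https://t.co/0NSdbQQv56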

0

126

atom @ctxzz

14 days ago

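Proposes an "AI-enabled pointer" in which Gemini interprets the on-screen content around the pointer, so that users can operate it with natural instructions such as "summarize this" or "move that." Shaping the future of AI interaction by reimagining the mouse pointer https://t.co/md5LnZlWCp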

0

21

atom @ctxzz

17 days ago

A study of whether human-agent interaction can be improved by giving the AI a "second-order theory of mind," that is, a model of what the person believes the AI agent knows. What Do You Think I Think? Accounting for Human Beliefs Using Second-Order Theory of Mind https://t.co/ck48q1wVwa

0

24
