So Kuroki

@sharp_computer

Researcher @SakanaAILabs. LLM and user interface.

Joined August 2016

674 Following

638 Followers

102 Posts

So Kuroki @sharp_computer

26 days ago

Several people asked about the compute required for this research. Training takes about half a day on 2-3 GPUs. Ofc we used more compute during research and experimentation, but each iteration itself is lightweight. If you can SFT a 7B model (like Moshi), you can try it too.

0

6

1

1

388

So Kuroki @sharp_computer

28 days ago

KAME🐢 will be presented at tomorrow’s final session! Please stop by before you leave. 5/8 2:00–4:00 PM Poster Area 30.4 #ICASSP

about 1 month ago

We’re excited to introduce KAME: Tandem Architecture for Enhancing Knowledge in Real-Time Speech-to-Speech Conversational AI, accepted at #ICASSP2026! 🐢 Blog https://t.co/arVz1TGpJJ Paper https://t.co/0EwpyRXeCs Can a speech AI think deeply without pausing to process? In real conversation, we don’t wait until we’ve fully worked out what we want to say—we start talking, and our thoughts catch up as the sentence unfolds. Fast speech-to-speech models achieve this, but their reasoning tends to stay shallow. Cascaded pipelines that route through a knowledgeable LLM are smarter, but the added latency breaks the flow—they fall back to "think, then speak." In our new paper, we propose a way to break this trade-off. We call it KAME (Turtle in Japanese). A speech-to-speech model handles the fast response loop and starts replying immediately. In parallel, a backend LLM runs asynchronously, generating response candidates that are continuously injected as "oracle" signals in real time. This shifts the AI paradigm from "think, then speak" to "speak while thinking." The backend LLM is completely swappable. You can plug in GPT-4.1, Claude Opus, or Gemini 2.5 Flash depending on the task without changing the frontend. In our experiments, Claude tended to score higher on reasoning, while GPT did better on humanities questions. Try the model yourself here: https://t.co/uDA0nvvjhS

13

738

144

439

290K

2

41

4

6

6K

sharp_computer retweeted

Yotaro @yotarokubo

28 days ago

#ICASSP 我々のKAME🐢論文 (SLP-P56.4)は、本日の現地時間14:00からPoster Area 30で発表されます。学習データの構成法からモデル学習の実際まで、なんでも聞いてください。私も発表者の近くにいるハズなので、よろしくお願いします。一応ヘッドホン持って行きますが、スペース十分にあるかな？

0

25

4

2

2K

sharp_computer retweeted

about 1 month ago

For the past few years, humans have been doing “prompt engineering” to coax the best performance out of different LLMs. In this work, we explored what happens if we train an AI to do that job instead. By training a Conductor model with RL, we found that it naturally learns to write highly effective, custom instructions for a whole pool of other models. It essentially learns to ‘manage’ them in natural language. What surprised me most was how it dynamically adapts. For simple factual questions, it just queries one model. But for hard coding problems, it autonomously spins up a whole pipeline of planners, coders, and verifiers. Really excited to see where this paradigm of “AI managing AI” goes next, especially as we start moving from single-agent chain-of-thought to multi-agent “chain-of-command”. Link to our #ICLR2026 paper: https://t.co/EwbjjRPLUb Along with our TRINITY paper which we announced earlier, this work also powers our new multi-agent system: Sakana Fugu (https://t.co/2m8VRdOYqG) 🐡

40

1K

175

1K

183K

Who to follow

@Job_hun_newbie

おさかなチンパンジー

@OERuVBqbNkbk2GD

神保町在住 25歳

sharp_computer retweeted

takkyu @takkyuO2

about 1 month ago

#ICML2026 に論文がacceptされました！今度は拡散言語モデルのtest-time scalingの研究です。複数の拡散言語モデルに協力させるとコーディングや数学の能力が大きく上げられるという研究で、性能も良いしアルゴリズム自体も面白くてお気に入りの研究です。韓国でお会いしましょう🇰🇷

takkyuO2's tweet photo. #ICML2026 に論文がacceptされました！今度は拡散言語モデルのtest-time scalingの研究です。

複数の拡散言語モデルに協力させるとコーディングや数学の能力が大きく上げられるという研究で、性能も良いしアルゴリズム自体も面白くてお気に入りの研究です。韓国でお会いしましょう🇰🇷 https://t.co/7UmJiDJEz9

1

43

3

7

3K

sharp_computer retweeted

about 1 month ago

We’re excited to introduce KAME: Tandem Architecture for Enhancing Knowledge in Real-Time Speech-to-Speech Conversational AI, accepted at #ICASSP2026! 🐢 Blog https://t.co/arVz1TGpJJ Paper https://t.co/0EwpyRXeCs Can a speech AI think deeply without pausing to process? In real conversation, we don’t wait until we’ve fully worked out what we want to say—we start talking, and our thoughts catch up as the sentence unfolds. Fast speech-to-speech models achieve this, but their reasoning tends to stay shallow. Cascaded pipelines that route through a knowledgeable LLM are smarter, but the added latency breaks the flow—they fall back to "think, then speak." In our new paper, we propose a way to break this trade-off. We call it KAME (Turtle in Japanese). A speech-to-speech model handles the fast response loop and starts replying immediately. In parallel, a backend LLM runs asynchronously, generating response candidates that are continuously injected as "oracle" signals in real time. This shifts the AI paradigm from "think, then speak" to "speak while thinking." The backend LLM is completely swappable. You can plug in GPT-4.1, Claude Opus, or Gemini 2.5 Flash depending on the task without changing the frontend. In our experiments, Claude tended to score higher on reasoning, while GPT did better on humanities questions. Try the model yourself here: https://t.co/uDA0nvvjhS

13

738

144

439

290K

sharp_computer retweeted

about 1 month ago

kameのweightとブログ記事公開だ！！ｻｶｰﾅいつもありがとう Cascade な full-duplex 大好きマンだけどﾒﾁｬやりたくなる https://t.co/YYmWyrDgFS

0

39

4

17

7K

sharp_computer retweeted

about 1 month ago

音声AIの素早さと賢さを両立できるか？私たち人間は会話の中で、言いたいことを全部まとめてから話し始めるのではなく、話しながら考えを整理していきます。応答の速い Speech-to-Speech モデルは、この「話しながら考える」を実現しましたが、そのぶん思考が浅くなりがちです。かといって知識豊富な LLM を挟むカスケード型では、遅延が生じるため「話しながら」が成立しません。そこで Sakana AI は、このトレードオフを克服するKAMEモデルを開発しました。Speech-to-Speech モデルが高速な応答ループを担当し、即座に話し始めます。その裏でバックエンドの LLM が非同期に推論を進めて応答候補を生成し、それをオラクル信号としてリアルタイムに注入します。これにより「考えてから話す」ではなく「話しながら考える」ことが可能になります。バックエンドの LLM は差し替えが可能で、タスクに応じてGPT-4.1、Claude Opus、Gemini 2.5 Flashなどを使い分けられます。フロントエンド側の変更は必要ありません。私たちの実験では、Claudeは推論系のタスクで、GPTは人文系のタスクで、それぞれ高いスコアを出す傾向が見られました。本研究は #ICASSP2026 で発表されます。ぜひ、お試しください。ブログ: https://t.co/arVz1TGpJJ 論文: https://t.co/0EwpyRXeCs モデル: https://t.co/uDA0nvvjhS

6

523

106

266

80K

sharp_computer retweeted

Yotaro @yotarokubo

about 1 month ago

🐢の紹介です。重みとコードが公開されます。是非お試ししてみてください！

1

49

9

9

14K

So Kuroki @sharp_computer

about 1 month ago

音声対話モデルの研究をしていました！既存のspeech-to-speech (Moshi)と話した時に、その応答の自然さに感動する一方で、もう少しだけ賢くしたいと思ったのが研究のきっかけです。 5月のICASSPで発表します。モデル、コードも公開しているのでぜひ試してみてください！

about 1 month ago

We’re excited to introduce KAME: Tandem Architecture for Enhancing Knowledge in Real-Time Speech-to-Speech Conversational AI, accepted at #ICASSP2026! 🐢 Blog https://t.co/arVz1TGpJJ Paper https://t.co/0EwpyRXeCs Can a speech AI think deeply without pausing to process? In real conversation, we don’t wait until we’ve fully worked out what we want to say—we start talking, and our thoughts catch up as the sentence unfolds. Fast speech-to-speech models achieve this, but their reasoning tends to stay shallow. Cascaded pipelines that route through a knowledgeable LLM are smarter, but the added latency breaks the flow—they fall back to "think, then speak." In our new paper, we propose a way to break this trade-off. We call it KAME (Turtle in Japanese). A speech-to-speech model handles the fast response loop and starts replying immediately. In parallel, a backend LLM runs asynchronously, generating response candidates that are continuously injected as "oracle" signals in real time. This shifts the AI paradigm from "think, then speak" to "speak while thinking." The backend LLM is completely swappable. You can plug in GPT-4.1, Claude Opus, or Gemini 2.5 Flash depending on the task without changing the frontend. In our experiments, Claude tended to score higher on reasoning, while GPT did better on humanities questions. Try the model yourself here: https://t.co/uDA0nvvjhS

13

738

144

439

290K

0

117

16

37

15K

sharp_computer retweeted

about 1 month ago

We’re launching the beta for our new commercial AI product: Sakana Fugu 🐡, a multi-agent orchestration system! Blog: https://t.co/36Ud311KCP Fugu hits SOTA on SWE-Pro, GPQA-D, and ALE-Bench, and has been our internal secret weapon. It dynamically coordinates frontier models, autonomously selecting the optimal agent combinations and roles for each task. Available as an OpenAI-compatible API, you can seamlessly integrate Fugu into your existing workflows with minimal changes. 🐟 Fugu Mini: High-speed orchestration optimized for latency 🐡 Fugu Ultra: Full model pool utilization for deep, complex reasoning Apply for the beta test here: https://t.co/1fjuAha7ci

SakanaAILabs's tweet photo. We’re launching the beta for our new commercial AI product: Sakana Fugu 🐡, a multi-agent orchestration system!

Blog: https://t.co/36Ud311KCP

Fugu hits SOTA on SWE-Pro, GPQA-D, and ALE-Bench, and has been our internal secret weapon. It dynamically coordinates frontier models, autonomously selecting the optimal agent combinations and roles for each task.

Available as an OpenAI-compatible API, you can seamlessly integrate Fugu into your existing workflows with minimal changes.

🐟 Fugu Mini: High-speed orchestration optimized for latency
🐡 Fugu Ultra: Full model pool utilization for deep, complex reasoning

Apply for the beta test here: https://t.co/1fjuAha7ci

28

705

160

334

366K

sharp_computer retweeted

Yotaro @yotarokubo

about 1 month ago

音声系のインターンを募集しています。音声インターフェースを使ったアプリケーションを作ることに興味のある人、誰か一緒に働きませんか？DMください。

0

32

17

2

4K

sharp_computer retweeted

小泉進次郎

about 2 months ago

サカナAI @SakanaAILabs の伊藤社長と意見交換。防衛大臣直轄の吉田AIチーム長ら職員も参加して非常に有意義な時間になりました。ありがとうございました！

shinjirokoiz's tweet photo. サカナAI @SakanaAILabs の伊藤社長と意見交換。防衛大臣直轄の吉田AIチーム長ら職員も参加して非常に有意義な時間になりました。ありがとうございました！ https://t.co/GwQHeU35tO

0

1K

127

34

87K

sharp_computer retweeted

Hikaru Asano @hikaru_asan0

2 months ago

Joined Sakana AI as a Research Intern 🐟 Super excited 🔥

8

145

8

12

17K

sharp_computer retweeted

2 months ago

🐟Ultra Deep Researchアシスタント「Sakana Marlin」、βテスター募集🐟 Sakana AIは、当社初の商用プロダクトとして、独自のエージェント技術によるビジネス向けAIリサーチアシスタント「Sakana Marlin」を開発しました。 https://t.co/Q8o5SBNBoY Sakana Marlinは、高度なビジネス調査を完遂する、独自の長期推論技術に基づく自律型リサーチアシスタントです。主な特徴・テーマを与えると、8時間近くにわたり自律的にリサーチ・詳細な調査ドキュメントとまとめスライドを自動生成・複数人のチームが数週間かけるプロフェッショナルな戦略調査を想定複雑な社会情勢の中で良質な判断を下すため、AIのポテンシャルを最大限生かすソリューションとして構想しました。本技術は、先日Nature誌にも掲載された科学的発見の自動化「AIサイエンティスト」の知見と、戦略的探索を可能にする「AB-MCTS」を融合。長く考えた分だけアウトプットの質が向上する「効率的な推論スケーリング」を実現しています。クローズドβテストを実施します金融機関・事業会社の経営戦略/事業企画部門、コンサルファーム、シンクタンクなど、日常的に高度なリサーチに取り組む方が対象です（期間中無料）。皆様からのフィードバックをもとに改善を重ねていきます。 ▼ クローズドβテスター応募はこちら https://t.co/fkaCwJceHb

SakanaAILabs's tweet photo. 🐟Ultra Deep Researchアシスタント「Sakana Marlin」、βテスター募集🐟

Sakana AIは、当社初の商用プロダクトとして、独自のエージェント技術によるビジネス向けAIリサーチアシスタント「Sakana Marlin」を開発しました。

https://t.co/Q8o5SBNBoY

Sakana Marlinは、高度なビジネス調査を完遂する、独自の長期推論技術に基づく自律型リサーチアシスタントです。

主な特徴
・テーマを与えると、8時間近くにわたり自律的にリサーチ
・詳細な調査ドキュメントとまとめスライドを自動生成
・複数人のチームが数週間かけるプロフェッショナルな戦略調査を想定

複雑な社会情勢の中で良質な判断を下すため、AIのポテンシャルを最大限生かすソリューションとして構想しました。

本技術は、先日Nature誌にも掲載された科学的発見の自動化「AIサイエンティスト」の知見と、戦略的探索を可能にする「AB-MCTS」を融合。長く考えた分だけアウトプットの質が向上する「効率的な推論スケーリング」を実現しています。

クローズドβテストを実施します
金融機関・事業会社の経営戦略/事業企画部門、コンサルファーム、シンクタンクなど、日常的に高度なリサーチに取り組む方が対象です（期間中無料）。皆様からのフィードバックをもとに改善を重ねていきます。

▼ クローズドβテスター応募はこちら
https://t.co/fkaCwJceHb

13

446

106

200

279K

sharp_computer retweeted

2 months ago

Sakana Chatの公開です！今回開発した「Namazu」モデルは、DeepSeek-V3.1等のオープンLLMに事後学習を適用したものです。優れた性能を維持しながら、日本での利用に適した振る舞いをします。Web検索機能についてもよく作り込んでいるので、日常用途には十分実用的だと思います。是非お試し下さい。

16

879

218

253

302K

So Kuroki @sharp_computer

2 months ago

Sakana Chat、公開されました！皆さんぜひお試しください！

2 months ago

🐟 Sakana Chat 公開 🐟 Sakana AIは、Sakana Chatを無料公開しました。 https://t.co/lyIrGJY4fW Web検索機能と高速レスポンスを備えたAIチャットです。日本国内から、どなたでもお使いいただけます。ぜひ、お試しください。

111

6K

2K

3K

2M

0

28

5

1

6K

sharp_computer retweeted

2 months ago

Sakana AI 初の一般向けサービス Sakana Chat を公開しました🐟 強力なWeb検索エージェントを備え、高速で信頼性の高い情報を引き出せます。世界の高性能なオープンモデルには、開発元のバイアスが不可避的に内在しています。我々は独自の事後学習により、①これらのバイアスの除去、②日本の価値観の反映、③安全かつ文脈に即した適応を実現する技術を開発しました。今回のリリースは、その技術実証の第一弾。国内で誰もが安心して使えるAIの選択肢の一つとして、ぜひお試しください！

19

862

186

377

292K

sharp_computer retweeted

2 months ago

🐟 Sakana Chat 公開 🐟 Sakana AIは、Sakana Chatを無料公開しました。 https://t.co/lyIrGJY4fW Web検索機能と高速レスポンスを備えたAIチャットです。日本国内から、どなたでもお使いいただけます。ぜひ、お試しください。

111

6K

2K

3K

2M

sharp_computer retweeted

3 months ago

Sakana AIは、防衛装備庁防衛イノベーション科学技術研究所より「複数AI技術の組み合わせによる観測・報告・情報統合・資源配分高速化の研究」を受託しました。 https://t.co/3ZeIEqrK62 本研究では、当社の強みである「小規模視覚言語モデル(SVLM)」や自律型AIエージェント技術を活用し、ドローン等のエッジデバイスから得られる膨大なデータの分析・統合、そして最適な意思決定に至るプロセスを一気通貫で高速化するシステムの構築を目指します。安全保障領域における「情報力」の重要性が高まる中、日本発のAI企業として技術的自律性を確保し、最先端の研究成果を日本の安全保障の基盤強化へと実装してまいります。

SakanaAILabs's tweet photo. Sakana AIは、防衛装備庁防衛イノベーション科学技術研究所より「複数AI技術の組み合わせによる観測・報告・情報統合・資源配分高速化の研究」を受託しました。

https://t.co/3ZeIEqrK62

本研究では、当社の強みである「小規模視覚言語モデル(SVLM)」や自律型AIエージェント技術を活用し、ドローン等のエッジデバイスから得られる膨大なデータの分析・統合、そして最適な意思決定に至るプロセスを一気通貫で高速化するシステムの構築を目指します。

安全保障領域における「情報力」の重要性が高まる中、日本発のAI企業として技術的自律性を確保し、最先端の研究成果を日本の安全保障の基盤強化へと実装してまいります。

20

827

235

270

434K

Last Seen Users on Sotwe

Trends for you

Most Popular Users