ashipong

@ashipong

I do machine learning and electronics tinkering as a hobby. Lately, retro PCs too.

Joined January 2016

42 Following

72 Followers

498 Posts

ashipong retweeted

Nikolaus West

@NikolausWest

about 1 month ago

If you’re serious about robot learning you (unfortunately) need to know about video compression. Camera streams dominate data volumes for most datasets at 90+% even when compressed. Video is more complicated to deal with but the size wins are too big to give up.

The unit of compression is a Group Of Pictures (GOP). In the simplest case (what you should use in robotics), a GOP starts with a keyframe (I-frame) that is followed by several delta frames (P-frames). Delta frames only need to encode the difference from the previous frame, which is where the compression win comes from.

That means that to decode frame 15 of a 30-frame GOP you need to feed all the preceding frames in the GOP to the decoder to get out that one frame. The GOP size controls the tradeoff between random access and compression.

Why does this matter for robot learning? Because during training, dataloader performance is dominated by fetching and decoding video. A streaming dataloader (you need one for large datasets) has to take GOPs into consideration when fetching data for a time step. Building a dataloader that doesn’t starve your GPUs is hard enough that most teams forgo flexibility. That means researchers at most of the best-funded robotics efforts currently wait around for large export jobs before training can start after each change to the dataset mix or each wrong hyperparameter.

This situation obviously won’t last, since they all know that experiment cycle time is a key lever for fast progress and the competitive pressure is enormous. If you want to compete in this space you need both flexibility and performance.
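The GOP arithmetic in the tweet above can be sketched in a few lines. This is only an illustration under stated assumptions: frames are indexed from 0, every GOP has the same fixed length, and each GOP is one I-frame followed only by P-frames. The names `DecodePlan`, `decode_plan`, and `mean_frames_decoded` are hypothetical and do not come from any video library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecodePlan:
    keyframe: int          # index of the I-frame that starts the GOP
    frames_to_feed: range  # frames the decoder must consume, in order


def decode_plan(target: int, gop_size: int) -> DecodePlan:
    """Frames needed to decode `target`, assuming fixed-size I+P GOPs."""
    if target < 0 or gop_size <= 0:
        raise ValueError("target must be >= 0 and gop_size must be > 0")
    # The GOP containing `target` starts at the nearest keyframe at or before it.
    keyframe = (target // gop_size) * gop_size
    # Every frame from the keyframe up to the target has to be decoded.
    return DecodePlan(keyframe, range(keyframe, target + 1))


def mean_frames_decoded(gop_size: int) -> float:
    """Average decode cost for one uniformly random frame access."""
    # Position k in the GOP costs k + 1 decodes; the mean over k = 0..g-1 is (g + 1) / 2.
    return (gop_size + 1) / 2


if __name__ == "__main__":
    plan = decode_plan(target=15, gop_size=30)
    print(plan.keyframe, len(plan.frames_to_feed))  # 0 16
    print(mean_frames_decoded(30))                  # 15.5
```

Under these assumptions a longer GOP compresses better but makes each random access more expensive, which is why a dataloader that samples individual time steps has to fetch and decode whole GOP prefixes rather than single frames.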

525

614

47K

ashipong retweeted

Shu

@NGO275

3 months ago

In the end, it comes down to having wheeled-base robots, not bipedal ones, do the real work, doesn't it.

202

458

685K

ashipong retweeted

片山良平@paiza創業者

@rk611

10 months ago

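It's fine when a senior (experienced) person in the field, who could score 100 points doing the work themselves, does this. But when a junior who can only score 20 points leaves it to generative AI and turns out 100 things worth 16 points each, the seniors are crushed by review work and overall productivity drops. ...This problem is already happening in generative-AI-driven development.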

894

638K

ashipong retweeted

ロボ太 @kaityo256

12 months ago

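Many people are reacting to this with "Isn't that just reinventing programming languages?", but that's slightly off. The claim is that, whereas existing programming languages have (mostly) described the How, an artificial language mainly for describing the What, meant for instructing LLMs, will emerge (or at least I hope it will).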

248

341

219K

ashipong retweeted

まつにぃ

@yugen_matuni

about 1 year ago

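Cursor has started offering a free one-year Pro plan to students, including those in Japan. It is available with student verification. There's no downside to signing up.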

654

946K

ashipong retweeted

ガッシー｜Repsona

@GussieTech

about 1 year ago

The world of Gantt charts where you can move multiple tasks with dependencies, or parent-child tasks, all at once.

185

578

216K

ashipong retweeted

AIDB @ai_database

over 1 year ago

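Researchers at DeepMind, believing that "we need to create new words specifically for LLMs," actually coined words that had never existed before and ran experiments with them. The aim is to eliminate the "conceptual gap" between humans and LLMs over existing words. https://t.co/5krmxy6C4b As a result, they showed cases where LLMs became able to do things they normally struggle with, such as "producing output with a specified number of characters," "giving varied responses," and "giving answers the model itself judges preferable." They argue that introducing new words (called neologisms) enables far more flexible and precise control and understanding than humans simply instructing and analyzing with existing words.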

648

116

329

56K

ashipong retweeted

May Yamaura | AI画像10年目

@myapdx

over 1 year ago

Apparently, if Cursor can't fix a problem properly, this prompt is worth trying: "Reflect on 5-7 different possible sources of the problem, distill those down to 1-2 most likely sources, and then add logs to validate your assumptions before we move onto implementing the actual code fix"

889

136K

ashipong retweeted

とみたかずき │ サポットさんの人

@Jessicazu418

over 1 year ago

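I gave it a cocky title, but I wrote a technical article for the first time in several years. It introduces the concrete techniques I use to cut implementation time to 1/5 by making full use of Cursor. https://t.co/NFipOGcZ9U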

862

907

66K

ashipong retweeted

piqcy

@icoxfog417

over 1 year ago

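A case of "Did you know that the results of only 4% of the Jupyter Notebooks on GitHub can be reproduced?" An introduction to marimo, a new interactive environment that ensures reproducibility by tracking the dependencies between cells and properly re-running exactly the cells that need it. It already has 9,000 stars. https://t.co/CzOrhXQG25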
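The idea described in the tweet above, tracking dependencies between cells and re-running only the cells that need it, can be sketched roughly as follows. This is a toy illustration, not the tool's actual implementation: the `CELLS` notebook, its `defines`/`reads` fields, and the helper names are all hypothetical, and only the standard-library `graphlib.TopologicalSorter` is a real API.

```python
from graphlib import TopologicalSorter

# Hypothetical notebook: each cell lists the names it defines and the names it reads.
CELLS = {
    "load": {"defines": {"df"}, "reads": set()},
    "clean": {"defines": {"clean_df"}, "reads": {"df"}},
    "plot": {"defines": set(), "reads": {"clean_df"}},
    "notes": {"defines": {"title"}, "reads": set()},
}


def build_graph(cells):
    """Map each cell to the set of cells whose definitions it reads."""
    definer = {name: cid for cid, cell in cells.items() for name in cell["defines"]}
    return {
        cid: {definer[name] for name in cell["reads"] if name in definer}
        for cid, cell in cells.items()
    }


def cells_to_rerun(cells, changed):
    """Return the changed cell and everything downstream of it, in run order."""
    deps = build_graph(cells)
    # static_order() yields every cell after all of the cells it depends on.
    order = list(TopologicalSorter(deps).static_order())
    stale = {changed}
    for cid in order:
        if deps[cid] & stale:  # reads something produced by a stale cell
            stale.add(cid)
    return [cid for cid in order if cid in stale]


if __name__ == "__main__":
    print(cells_to_rerun(CELLS, "load"))
    print(cells_to_rerun(CELLS, "notes"))
```

Editing the hypothetical `load` cell marks `clean` and `plot` as stale, while `notes` is left alone because it reads nothing that changed.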

208

101

23K

ashipong retweeted

ｇａｒｙ @keijiro_im

over 1 year ago

This is too amazing...

260

ashipong retweeted

bioshok

@bioshok3

over 1 year ago

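The US government has finally classified its semiconductor export controls into Tiers 1 through 3. That much had been talked about for a few days, but this time it has gone as far as prohibiting the hosting of AI systems themselves in Tier 3. Considering that Deepseek appears to be using synthetic data, this matters for national security. Even Tier 2 comes with restrictions such as security requirements. The era has finally moved from hardware export controls to software. Open source is still outside the scope of the controls, but I think it will come under them before long. From here on, I feel this will become an AI version of the East-West Cold War in national security terms. In addition, Tier 2 is capped at 50,000 units for 2025-27, and US companies must deploy half of their computing resources within the United States. This is now clearly "national security," and the sense that it is not a business trend is probably growing. https://t.co/3mca6HVOOT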

999

772K

ashipong retweeted

Kazunori Sato

@kazunori_279

over 1 year ago

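On XML and JSON: XML was originally a structured-document format created by people from the SGML and document-management world (such as Murata-san of Xerox). It ended up being used as a data container because of the trends of the time, but its culture and way of thinking were different from those of software people to begin with. What we wanted was a simple data format.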

ashipong retweeted

Koki Saitoh 斎藤康毅 @SaitohKoki

over 1 year ago

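A story in which everything begins with NAND: NAND becomes an adder, performs computation, and draws pictures on the screen. And then Tetris starts running. A book that portrays the essence of the computer this beautifully. On sale today. https://t.co/Ts6Jb1s9XB #ad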
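As a small illustration of the first step the tweet above mentions, NAND becoming an adder, here is a sketch that builds a ripple-carry adder using nothing but a NAND primitive. It is not taken from the book; the function names and the 8-bit default width are assumptions made for this example.

```python
def nand(a: int, b: int) -> int:
    """The only primitive gate: 0 when both inputs are 1, otherwise 1."""
    return 1 - (a & b)


def and_(a: int, b: int) -> int:
    n = nand(a, b)
    return nand(n, n)


def or_(a: int, b: int) -> int:
    return nand(nand(a, a), nand(b, b))


def xor(a: int, b: int) -> int:
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Add three bits; return (sum_bit, carry_out)."""
    partial = xor(a, b)
    sum_bit = xor(partial, carry_in)
    carry_out = or_(and_(a, b), and_(partial, carry_in))
    return sum_bit, carry_out


def add(x: int, y: int, bits: int = 8) -> int:
    """Ripple-carry addition of two unsigned integers, wrapping modulo 2**bits."""
    result, carry = 0, 0
    for i in range(bits):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result


if __name__ == "__main__":
    print(add(3, 5))      # 8
    print(add(200, 100))  # 44, because 300 wraps modulo 256
```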

726

165

271

78K

ashipong retweeted

はたのあいかフリー声優/声真似

@H_aika_45

over 1 year ago

I sang "少しずつ寒くなれや" ("Get colder, little by little").

275

165K

28K

23K

17M

ashipong retweeted

T.Yamazaki @ZappyZappy7

over 1 year ago

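A general-purpose humanoid robot that walks smoothly with a human-like gait https://t.co/CikjzhkWZe It achieves natural walking through a neural network model that combines reinforcement learning and imitation learning #bipedal #GeneralPurpose #humanoid #ReinforcementLearning #ImitationLearning #EngineAI_Robot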

376

168

78K

ashipong retweeted

タランチュラ | データサイエンティスト

@tarantula_ds_

over 1 year ago

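[Computational complexity of machine learning] Once you start doing machine learning on large-scale data, you need to take each model's computational cost into account. So I put together the training-time and inference-time complexity orders of representative machine learning models!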

913

652

91K

ashipong retweeted

開発室Graph @stu3dio_graph

over 1 year ago

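Being a Heisei-era person, I said, "To connect to GitHub from a GPU compute environment, you run ssh-keygen and paste the key into GitHub, right?", and a Reiwa-era youngster told me, "No, gh auth login does it in one shot."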

330

460K

ashipong retweeted

ritar

@rtr_dnd

over 1 year ago

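[Beta release announcement] I'm building a Jupyter + Figma-like tool for data analysis with a friend! ・Run Python in the browser like a flowchart ・Write it together with AI, so no need to memorize syntax ・The analysis process is preserved visually. Researchers and developers worn out by Google Colab/RStudio, gather round!! (continued)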

353

308K

ashipong retweeted

cvpaper.challenge | AI/CV研究コミュニティ @CVpaperChalleng

over 1 year ago

We are publishing our 2024 meta-survey roundup. -- ・LLM x CV: https://t.co/YfHiRJRnof ・Neural Fields: https://t.co/6WnJr3ad2D ・Remote Sensing x Computer Vision: https://t.co/aMEQBUuFK3 ・Image super-resolution models: https://t.co/JoKBKjMq2N ・Video Recognition: https://t.co/BAbOAWJ43I

265

181

49K
