wolfshow @wolfshowme - Twitter Profile

wolfshow @wolfshowme

about 1 month ago

Adding border overlay to GBA on #AnaloguePocket @analogue

0

64

wolfshow @wolfshowme

3 months ago

MisterClaw: Run PicoClaw AI on Your MiSTer FPGA https://t.co/9qBUrOCyfz via @YouTube @MiSTerFPGA @SipeedIO #MisterClaw #PicoClaw #MisterFPGA #OpenClaw #Retrogaming

0

1

0

1

302

wolfshow @wolfshowme

3 months ago

https://t.co/Ge1NvNbIej MisterClaw meme designed by Nano Banana @MiSTerFPGA @SipeedIO

0

1

0

69

wolfshow @wolfshowme

3 months ago

MisterClaw: World's first @SipeedIO PicoClaw AI assistant on MiSTer FPGA (DE10-Nano) #MisterFPGA #OpenClaw #PicoClaw #RetroGaming #MisterClaw https://t.co/Ge1NvNbIej

0

14

4

13

4K

wolfshow @wolfshowme

3 months ago

Built a tiny AI assistant powered by @SipeedIO PicoClaw 🦞 on my MiSTer FPGA. 10MB RAM, 600+ arcade games, all managed via Telegram. Retro gaming meets AI. 🎮🤖 #MiSTerFPGA #OpenClaw #PicoClaw #RetroGaming https://t.co/ljzvqnLrfa

0

1

0

222

wolfshow @wolfshowme

5 months ago

Just got the N64 x7, should I solder it myself? @krikzz

1

3

1

0

964

wolfshowme retweeted

krikzz @krikzz

7 months ago

Black Friday sales begin! 20% off on all products at https://t.co/eIkI80XsCQ. RT this message and you will have a chance to get an EverDrive or FXPAK for free! The winner will be chosen on 02.12.2025 #GIVEAWAY

94

1K

98

109K

wolfshow @wolfshowme

7 months ago

#AesCoder on Design Arena

Grace Li

@grx_xce

7 months ago

Congrats to @MSFTResearch on AesCoder-4B, a tiny model that introduces GRPO-AR to jointly optimize functionality and code aesthetics, holding its own against models 100x its size. Our team is proud to see SOTA cite Design Arena as baseline. Official Design Arena results coming soon.

2

31

4

15

4K

0

1

0

153

wolfshowme retweeted

FW @thegenerality

8 months ago

BitDistill finetunes any full-precision LLM into 1.58-bit for specific tasks with the same performance

0

5

3

2

462

wolfshow @wolfshowme

8 months ago

Thrilled to introduce #PART, a new method to protect LLM reasoning from unauthorized distillation while keeping it transparent for users. By removing self-talk and reordering conclusions, we disrupt illicit training and preserve valuable information. https://t.co/HE1vzFOfeZ

0

2

1

0

224

wolfshow @wolfshowme

8 months ago

Beyond just text quality! We're introducing #DocReward, a model that evaluates and improves the visual structure and style of documents. In our tests, DocReward achieved a 60.8% win rate in generating human-preferred documents, compared to GPT-5's 37.7%. https://t.co/ChY9z0X05f

1

7

4

1

1K

wolfshow @wolfshowme

10 months ago

#Kosmos-2.5 now in Hugging Face

Niels Rogge @NielsRogge

10 months ago

KOSMOS 2.5 by @Microsoft has finally been integrated into @huggingface Transformers 🙌🔥 End-to-end document AI model similar to Donut/Pix2Struct, pre-trained on 357.4 million documents. Handles image-to-markdown, OCR with spatial coordinates and chatting with documents! https://t.co/kBNTSYqp1M

9

253

41

167

17K

1

2

0

247

wolfshowme retweeted

机器之心 JIQIZHIXIN

@jiqizhixin

10 months ago

How do you stop an LLM from getting “thrown off” by a few wild tokens? UCAS, CUHK, HKUST, and Microsoft Research researchers think they’ve cracked it with Geometric-Mean Policy Optimization (GMPO), a twist on GRPO that tames outliers by optimizing the geometric mean of token-level rewards. Result: more stable training, +4.1% on math benchmarks, +1.4% on multimodal reasoning.

5

162

33

90

12K

wolfshowme retweeted

DAIR.AI

@dair_ai

10 months ago

Top AI Papers of The Week (July 28 - August 3):
- GEPA
- Graph-R1
- AlphaEarth
- Self-Evolving Agents
- Hierarchical Reasoning Model
- Efficient Attention Mechanisms
- Geometric-Mean Policy Optimization

Read on for more:

13

1K

136

913

145K

wolfshowme retweeted

DailyPapers

@HuggingPapers

10 months ago

Microsoft Research introduces Geometric-Mean Policy Optimization (GMPO)! A new RL method that stabilizes LLM reasoning by maximizing the geometric mean of token-level rewards. No more unstable updates! https://t.co/pbd78Qjzcs

10

987

109

579

62K

wolfshowme retweeted

DailyPapers

@HuggingPapers

10 months ago

GMPO outperforms GRPO by 4.1% on math & 1.4% on multimodal reasoning benchmarks. It achieves better stability and performance, moving us closer to reliable AI. Learn more & get the code:
Paper: https://t.co/Mh8kJjLeV0
Code: https://t.co/eDEGGiCCPF

1

43

5

18

2K

wolfshowme retweeted

Remek Kinas

@KinasRemek

11 months ago

RL(LLM) - I recently wrote about GSPO. And today, publications on -> GMPO - Geometric-Mean Policy Optimization, ARPO - Agentic Reinforced Policy Optimization, IRL - Inverse RL … Probably the most flourishing area of LLM training. Our Bielik-v3 is also already being trained with RL (GRPO, DR-GRPO, DAPO, GSPO - prepared) … we are waiting for the new base. Yesterday I finished working on the largest Polish mathematical RL training dataset - nearly 500k unique and verifiable Polish sentences. It's going to be powerful 😁 The team - Krzysiek Ociepa @ChrisOciepa , Łukasz Flis, Adrian Gwoździej, Krzysiek Wróbel and my support - is now working at full speed. Dream team🤩 The work does itself. New ideas get implemented in a few minutes, there's no need to say much - delivery matters most. It's great to work in a team like this. We make progress every day! Huge support from @Cyfronet ❤️🔥

6

85

8

12

5K

wolfshowme retweeted

AI Native Foundation

@AINativeF

11 months ago

8. Geometric-Mean Policy Optimization

🔑 Keywords: Geometric-Mean Policy Optimization, Policy Updates, Token-Level Rewards, Multimodal Reasoning, AI Native

💡 Category: Natural Language Processing

🌟 Research Objective:
- The research aims to stabilize policy updates in large language models through Geometric-Mean Policy Optimization (GMPO), enhancing the performance on mathematical and multimodal reasoning benchmarks.

🛠️ Research Methods:
- GMPO introduces the use of geometric mean for token-level rewards to provide a less sensitive approach to outliers and maintain stable importance sampling ratios. Comprehensive theoretical and experimental analyses are conducted to validate GMPO's design and stability benefits.

💬 Research Conclusions:
- GMPO demonstrates improved stability and a performance increase, surpassing GRPO by 4.1% on mathematical benchmarks and 1.4% on multimodal reasoning benchmarks like AIME24, AMC, MATH500, OlympiadBench, Minerva, and Geometry3K.

👉 Paper link: https://t.co/rGre3jOZJI
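
The outlier argument in the GMPO summary above can be illustrated with a small numeric sketch. This is not the paper's implementation: the function names and the `token_values` list are hypothetical, and the numbers are made up. It only compares how an arithmetic mean and a geometric mean react when one positive token-level value is much larger than the rest.

```python
import math


def arithmetic_mean(values):
    """Plain average: a single large value pulls the result up strongly."""
    return sum(values) / len(values)


def geometric_mean(values):
    """Geometric mean computed in log space; defined only for positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical token-level values: three ordinary tokens and one outlier.
token_values = [1.0, 1.0, 1.0, 16.0]

print("arithmetic mean:", arithmetic_mean(token_values))
print("geometric mean: ", geometric_mean(token_values))
```

With these made-up values the outlier moves the arithmetic mean much further from 1 than it moves the geometric mean, which is the sensitivity difference the summary refers to.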