Shuai Kyle Zheng @bittnt - Twitter Profile

about 2 months ago

Even when you outsource computation, thinking, and recall, you still need to understand it. Maybe we humans never read; we just look at pictures. Sharing some cool algorithms in pictures. https://t.co/SG5FNiY3vE

Shuai Kyle Zheng

@bittnt

5 months ago

From next-word prediction to next-world prediction

Shuai Kyle Zheng

@bittnt

5 months ago

One of my fun parenting wins with OpenClaw so far: on-demand coloring pages! 🖨️🎨

Kid asks for a Robot car or a race car? Voice in → search image → GPT + Banana APIs → instant black-and-white coloring version → print.

Downside: we’re burning through API tokens AND printer paper at an alarming rate 💸📄😅

Parents, is this genius or expensive chaos? #AIParenting #OpenClaw

Shuai Kyle Zheng

@bittnt

5 months ago

After working with AI agents (CC, Cursor, and Clawdbot), I’ve become too impatient with any delay from humans. Why do we argue over different ideas without implementations? Why can't you finish reviewing a few already-chunked PRs in two hours?

Shuai Kyle Zheng

@bittnt

5 months ago

I've been using Graphite for a few months now and I've never felt better. I have more energy to review code and to ask for code review. My skin is cleaner. My eyesight has improved.

Shuai Kyle Zheng

@bittnt

7 months ago

@m_wulfmeier @Waymo Go take one in Mountain View or SF

bittnt retweeted

Anirudh Chakravarthy @anirudhchak

7 months ago

Excited to present our paper, "PROFIT: A Specialized Optimizer for Deep Fine Tuning" at #NeurIPS2025!

We introduce an optimizer specifically designed for finetuning pre-trained models, drawing inspiration from multi-task learning.

(1/2) https://t.co/aHInCXlOAt

Shuai Kyle Zheng

@bittnt

7 months ago

Looking forward to being at NeurIPS 2025 this upcoming week to present our paper, "PROFIT: A Specialized Optimizer for Deep Fine Tuning"!

PROFIT is the first optimizer that is specially designed for fine-tuning; and because it’s an optimizer, it’s easy to drop in and drop out of any deep learning training system. Intuitively, PROFIT intelligently keeps your system from straying too far from its current state by making sure subsequent gradient updates don’t conflict.

We take ideas around gradient conflict commonly found in the multitask learning literature applied across tasks and apply them to the temporal axis instead!

We show PROFIT works across a wide range of modalities, from computer vision, VLMs to motion prediction.

Please also stop by the General Motors team booth. Learn more about GM’s presence at NeurIPS 2025. https://t.co/Zg6ohHsJpF

Catch us at the poster session:
When: Wednesday, Dec 3rd (11am - 2pm PT)
Where: San Diego Convention Center, Exhibit Hall C/D/E
Poster: #905
Paper: https://t.co/IlAq1PWNVW
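
The notion of keeping successive gradient updates from conflicting can be illustrated with a projection step of the kind used in the multitask-learning literature (PCGrad-style gradient surgery), applied here to two consecutive steps instead of two tasks. This is only a rough sketch of that general idea, not PROFIT's actual update rule, which the tweet does not spell out. The names `dot` and `resolve_conflict` are hypothetical and gradients are plain Python lists.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def resolve_conflict(grad, reference):
    """Remove the part of `grad` that points against `reference`.

    Two gradients "conflict" when their inner product is negative.
    In that case, project `grad` onto the plane orthogonal to
    `reference`; otherwise return it unchanged.
    (Illustrative assumption, not the method from the paper.)
    """
    overlap = dot(grad, reference)
    ref_norm_sq = dot(reference, reference)
    if overlap >= 0 or ref_norm_sq == 0:
        return list(grad)
    scale = overlap / ref_norm_sq
    return [g - scale * r for g, r in zip(grad, reference)]


# Gradient from the previous step and a new one that partly opposes it.
previous_grad = [1.0, 0.0]
current_grad = [-1.0, 1.0]

adjusted = resolve_conflict(current_grad, previous_grad)
print(adjusted)
```

After the projection, the adjusted gradient no longer has a negative component along the previous one, so applying it does not undo the earlier step.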

Shuai Kyle Zheng

@bittnt

8 months ago

I noticed that Gemini Pro 2.5 (even in Gemini studio) gave much better results last week than others like ChatGPT when helping to understand library documentation and to debug. Is this due to a change in Google's search results rules?

Shuai Kyle Zheng

@bittnt

9 months ago

Given that Sora 2 can make better photoreal videos at will, the next real safety frontier is digital watermarking, not just human-readable marks.

Shuai Kyle Zheng

@bittnt

9 months ago

@ber24 @hivergeai Way to go, Bernardino! Congratulations to the team!

bittnt retweeted

Bernardino Romera-Paredes @ber24

9 months ago

We took on the challenge and we’ve put our system to work on the nanoGPT benchmark. @hivergeai tech discovered new algorithmic improvements beyond the existing optimizations. Check out the results in the PR https://t.co/CIS5phXK04 and read our blogpost https://t.co/92Bi764Ofv!

Shuai Kyle Zheng

@bittnt

9 months ago

@jcjohnss Awesome work! Is music also generated by the same model?

Shuai Kyle Zheng

@bittnt

11 months ago

I’m tired of seeing SOTA results on curated benchmarks. Amazing, but where are the failure cases? Very few academic papers mention failures. Worse, this silence is spilling into closed-source models like GPT-5. What matters most is how models fail on out-of-distribution data.

Shuai Kyle Zheng

@bittnt

12 months ago

@LigengZhu It seems that Grok4 might have been trained on the public AIME. The private benchmark suggests there is room for improvement. https://t.co/8riV2uBgtV

Shuai Kyle Zheng

@bittnt

about 1 year ago

@ducha_aiki @PascalMettes @giffmana @TaiNguyen34 Well, it’s only fair, remember, China has PRCV, UK has BMVC, Germany has GCPR… and the US? We got CVPR, Costco, and unlimited refills. It’s just how things work.

Shuai Kyle Zheng

@bittnt

about 1 year ago

@xunhuang1995 lol. Fortunately u r not working at NVDA, otherwise u might have to change the paper.

Shuai Kyle Zheng

@bittnt

about 1 year ago

OpenAI’s deep research and o3-mini are great. I feel like a wizard. Then I tried Gemini’s: it wrote a report and exported it to Google Docs. Asked it for code, boom, straight to Colab. This is giving a little “bundling IE with Windows while Netscape cries in a corner” vibes.

Shuai Kyle Zheng

@bittnt

over 1 year ago

@doomie @willccbb Another possibly important question: which 75% of the training samples are selected? Are they randomly selected?

Shuai Kyle Zheng

@bittnt

over 1 year ago

That's actually not true. It uses a JSON representation for all the data:

```
messages=[
    {"role": "user", "content": [
        {"type": "text", "text": prompt},
        {"type": "image_url", "image_url": {
            "url": f"data:image/png;base64,{base64_image}"}
        }
    ]}
],
```

It requires the user to encode the raw image as base64 and then decode the resulting base64 bytes into a UTF-8 string, e.g.:

```
import base64

def encode_image(image_path):
    with open(image_path, "rb") as image_file:  # Read the image in binary mode
        return base64.b64encode(image_file.read()).decode("utf-8")
```

I think this can later be read using OpenCV.
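
To make the round trip concrete, here is a minimal standard-library sketch that builds the same kind of data URL and then recovers the original bytes from it. The helper names `to_data_url`, `from_data_url`, and `build_messages` are hypothetical, and the 8-byte PNG signature stands in for a real image file (an assumption made so the example needs no file on disk).

```python
import base64

# Stand-in for real image bytes: the 8-byte PNG file signature.
# (Assumption for illustration; a real call would read a full PNG.)
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def to_data_url(image_bytes, mime="image/png"):
    """Base64-encode raw bytes and wrap the text in a data URL."""
    b64_text = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{mime};base64,{b64_text}"


def from_data_url(data_url):
    """Recover the raw bytes from a base64 data URL."""
    header, b64_text = data_url.split(",", 1)
    if not header.endswith(";base64"):
        raise ValueError("expected a base64 data URL")
    return base64.b64decode(b64_text)


def build_messages(prompt, image_bytes):
    """Assemble a message list shaped like the one in the tweet."""
    return [
        {"role": "user", "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url",
             "image_url": {"url": to_data_url(image_bytes)}},
        ]}
    ]


messages = build_messages("Describe this image.", PNG_SIGNATURE)
url = messages[0]["content"][1]["image_url"]["url"]
recovered = from_data_url(url)
print(recovered == PNG_SIGNATURE)
```

The recovered bytes are the original file contents, so with a real image they could be handed to an image decoder; with OpenCV that would typically mean wrapping them in a NumPy buffer and calling `cv2.imdecode`.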
