Jacob Si @jacobyhsi88 - Twitter Profile

about 14 hours ago

Excited to share flow control, a method that allows users to steer VLAs in real time with keyboard arrow presses. This method doesn't require any fine-tuning of a VLA and can run out-of-the-box.

2

47

11

40

5K

jacobyhsi88 retweeted

Ian Shi @ianshi3

about 1 month ago

Excited to announce our seed round!

0

10

2

1

1K

jacobyhsi88 retweeted

yingzhen @liyzhen2

about 1 month ago

Our own answer: structured coupling https://t.co/wCsuYCpLsM

- flow matching with VAE-based coupling
- VAE encoder & flow sharing networks
- VAE decoder init. + flow refinement for sampling

flow matching 🤝 VAEs -> good representation & sample quality🚀 https://t.co/968jWlFJCr

3

241

45

184

23K

jacobyhsi88 retweeted

Lisa Alazraki @LisaAlazraki

about 2 months ago

Scaling Small Agents Through Strategy Auctions accepted at #ICML2026 🇰🇷 🎉 Grateful to have worked with @akhilmathurs @williamfshen @yorambac

2

49

3

5

2K

jacobyhsi88 retweeted

Jerry Ji @jerryji2019

about 2 months ago

Clinical ML has a generalization problem. The standard playbook: train a model, watch it fail at the next hospital, and retrain it with new tricks. We invert this! Don't change the model—change the input so any model will generalize. Introducing Record2Vec at #ICLR2026! 🔄🚀 1/5

1

9

2

4

3K

jacobyhsi88 retweeted

Chen-Hao (Lance) Chao @chenhao_chao

3 months ago

(1/7) We introduce MDM-Prime-v2 which scales 21.8× better than autoregressive models (ARMs) in compute-optimal comparisons. 📎 Paper: https://t.co/VhBVo75abe 🌟 Blog: https://t.co/miWdTmcGtL ⌨️ Github: https://t.co/ac1eDV8O8Q Here’s how we did it👇:

8

327

50

317

49K

jacobyhsi88 retweeted

yingzhen @liyzhen2

4 months ago

A few months back I pushed @jacobyhsi88 to test how well GPT-5.2 did on this task. We found that open-source models can do even better when they are instructed to use the right tools. 😉 (I hope the Qwen team will continue leading the open-source LLM contributions 🫡)

1

12

2

4

2K

Jacob Si @jacobyhsi88

4 months ago

🧵10/10 Thank you for your interest in our work 🙏! It has been a pleasure working with @mikezqu, @michelle_xioax, & my supervisors, @MarekRei and @liyzhen2 🙌. This work was done at @imperialcollege and @Columbia🎓. https://t.co/KGHyGiLEZO

0

2

0

72

Jacob Si @jacobyhsi88

4 months ago

Ever queried RAG pipelines about tabular documents 𝄜 and found that answers are often incorrect ❌🤔?

🚀 Introducing TabRAG, an end-to-end parsing-based RAG framework designed to improve tabular document question answering via structured representations 👍!

📄 Paper: https://t.co/I90RvDPn9h
💻 GitHub: https://t.co/Pv93LkB945

2

19

4

5

3K

Jacob Si @jacobyhsi88

4 months ago

🧵9/10 We analyze the effect of the number of self-generated ICL demonstrations on our generation performance. The performance improves sharply when moving from zero to a small number of demonstrations, with three ICL examples consistently yielding stable gains across most datasets.