Gregory Lau

@greglau

AI/ML PhD student @NUSingapore. Former policymaker @ Singapore Government, entrepreneur. BSc in Physics and in Economics, MFin @MIT

Singapore

Joined March 2009

80 Following

63 Followers

33 Posts

greglau retweeted

Bryan Kian Hsiang Low

about 1 month ago

[1/5] 🚨 How do you prove that an #LLM company actually deleted your private or copyrighted data upon your request? 🕵️‍♂️ Spoiler: right now, you really can't! 😱 Existing LLM unlearning metrics are not applicable in the real world. 😬 Thrilled to present WaterDrum (#ICLR2026)! We use text #watermarks to finally measure what an LLM has truly "forgotten."💧🥁 This work was co-led by @lululu0082 @xinyuan3142 @greglau in collaboration with @nhungbui1299 @RachaelSim2 John Russell Himawan, Fanyu Wen, Chuan-Sheng Foo, See-Kiong Ng, @bryanklow. 📄 Paper: https://t.co/jo08kPyMQT Catch our poster @iclr_conf on 23 Apr 10:30AM Pavilion 4 P4-#4210! 🇧🇷 See the thread below 🧵👇

3

12

3

0

477

greglau retweeted

Bryan Kian Hsiang Low

about 1 month ago

[1/4] You want to fine-tune an #LLM. 1️⃣ What's the best data mixture? 2️⃣ What LoRA config. to use? The answers to the above depend on each other. 🐔🥚 Our team (with @ZhiliangChen94 @alfredleongwl @ShaoYongOng @apivich_h @greglau et al.) calls this the Chicken-and-Egg Dilemma of LLM fine-tuning — and we just solved it. 📄Paper: https://t.co/jMqXvVMbYc 📍Poster: @iclr_conf #ICLR2026 DATA-FM Workshop, 26 Apr, Riocentro Room 203 A+B

bryanklow's tweet photo. [1/4] You want to fine-tune an #LLM.

1️⃣ What's the best data mixture?
2️⃣ What LoRA config. to use?

The answers to the above depend on each other. 🐔🥚

Our team (with @ZhiliangChen94 @alfredleongwl @ShaoYongOng @apivich_h @greglau et al.) calls this the Chicken-and-Egg Dilemma of LLM fine-tuning — and we just solved it.

📄Paper: https://t.co/jMqXvVMbYc
📍Poster: @iclr_conf #ICLR2026 DATA-FM Workshop, 26 Apr, Riocentro Room 203 A+B

2

18

4

2

523

greglau retweeted

Bryan Kian Hsiang Low

about 1 month ago

[1/3] 🤔An interesting and practical question: How can we find the optimal #LLM training data mixture that maximizes a free-form downstream task metric? For instance, what fine-tuning data mixture should we use to maximize same-demographic user ratings ⭐ across our chatbots? Our #ICLR2026 work (with @ZhiliangChen94 @greglau Chuan-Sheng Foo) called DUET interleaves #BayesianOptimization and #DataSelection to automatically discover the best data mixture that maximizes any free-form downstream feedback, without manually searching through countless combinations. 📄Paper: https://t.co/2fS7elnb0W 📅Catch us at @iclr_conf 🇧🇷Poster Session 3 Fri Apr 24 10:30AM Pavilion 3 P3-#305. More below👇.

1

14

3

1

554

greglau retweeted

Bryan Kian Hsiang Low

4 months ago

When a company claims that your personal data has been removed from their model, have you ever wondered whether they've indeed done so? 🤔 If a new paper on arXiv claims that its proposed #MachineUnlearning algorithm can unlearn your personal data from an #LLM, how do we know if it can indeed do so? 🛢️WaterDrum is the first data-centric LLM unlearning metric based on watermarking that is calibrated, requires no retraining, works for blackbox models and when forget/retain sets have similar data. Let the (Water)Drums roll at Rio! @iclr_conf #ICLR2026

0

6

1

0

461

Who to follow

BB BLACKDOG have a unique sound, with just two bass and drums (no guitars). Original Rock in a Steampunk Style. https://t.co/sQgFIZunoS

Verified account

SWE | @fastdotai alumni, independent researcher, tester | ex-entrepreneur @AntlerGlobal | GitHub: cedrickchee | building a new computer

EarningsWatcher

Verified account

@earnings_watch

📊 Trade earnings with data, not luck · Research-backed options strategies · Built by ex-Palantir Data Scientist

greglau retweeted

Bryan Kian Hsiang Low

7 months ago

Given a single model, how do we improve an #LLM’s reasoning performance with limited resources 💻 and inference time ⌛️? Can a smaller 1.5B model outperform a 7B model without incurring long inference time from sequential queries? In the work of @_Hu_Wenyang @greglau et al., we introduce the framework called Dipper to create #LLMs ensembles from an optimized set of diverse reasoning prompts to improve performance. Dipper runs queries in parallel with a prompt optimization method inspired by Determinantal Point Processes (DPP), making it super fast ⏩ and effective. Furthermore, Dipper can work with LLM APIs without model access 📦! With Dipper, we demonstrated how a small ensemble of just three 1.5B models can outperform a 7B model on a range of math and non-math reasoning tasks, while taking almost the same inference time and just < 3x compute for a normal query thanks to accelerated batch inference methods 😱 ! Find out more at @emnlpmeeting #EMNLP2025 (Poster 4492) at Hall C, Session 11, Nov 6 at 16:30! Paper: https://t.co/AMPwoCCjFj

0

19

1

0

538

greglau retweeted

Bryan Kian Hsiang Low

10 months ago

Inspired by the success of inverse problems in uncovering fundamental scientific laws, our position paper, which is accepted to @emnlpmeeting #EMNLP2025 (findings), argues that inverse problems can be used to efficiently uncover underlying scaling laws for #LLMs that help the building of LLMs to achieve a desirable performance with significantly better cost-effectiveness. Joint work with @arun_v3rma @WuZhaoxuan @BobbyZhouZijian @xiaoqiang_98 @ZhiliangChen94 @RachaelSim2 @ray_qiaorui @JingtanW @nhungbui1299 @greglau @_Hu_Wenyang @xinyuan3142 Zi-Yu @ZhaoZitong00 @michael_xinyi @apivich_h See-Kiong.

bryanklow's tweet photo. Inspired by the success of inverse problems in uncovering fundamental scientific laws, our position paper, which is accepted to @emnlpmeeting #EMNLP2025 (findings), argues that inverse problems can be used to efficiently uncover underlying scaling laws for #LLMs that help the building of LLMs to achieve a desirable performance with significantly better cost-effectiveness.

Joint work with @arun_v3rma @WuZhaoxuan @BobbyZhouZijian @xiaoqiang_98 @ZhiliangChen94 @RachaelSim2 @ray_qiaorui @JingtanW @nhungbui1299 @greglau @_Hu_Wenyang @xinyuan3142 Zi-Yu @ZhaoZitong00 @michael_xinyi @apivich_h See-Kiong.

1

32

7

3

1K

greglau retweeted

Bryan Kian Hsiang Low

10 months ago

Joint work with @_Hu_Wenyang @greglau et al. on Dipper (Diversity in Prompts for Producing #LLM Ensembles in Reasoning tasks) is accepted to @emnlpmeeting #EMNLP2025 (main)!

0

15

2

0

637

greglau retweeted

Bryan Kian Hsiang Low

11 months ago

When a company claims that your personal data has been removed from their model, have you ever wondered whether they've indeed done so? 🤔 If a new paper on arXiv claims that its proposed #MachineUnlearning algorithm can unlearn your personal data from an #LLM, how do we know if it can indeed do so? 🛢️WaterDrum is the first data-centric LLM unlearning metric based on watermarking that is calibrated, requires no retraining, works for blackbox models and when forget/retain sets have similar data. Find out more about Waterdrum from @greglau at our poster today (18 July) in the #ICML2025 Machine Unlearning for #GenerativeAI (MUGen @icmlconf) workshop during 3-3.45pm in West Meeting Room 202-204! Joint work with @lululu0082 xinyuan @greglau nhung @RachaelSim2 fanyu chuansheng seekiong.

0

4

1

0

336

greglau retweeted

Bryan Kian Hsiang Low

11 months ago

🤔How can we reliably use #Multimodal Large Language Models (#MLLMs) in practical settings? With multiple modalities come more challenges in managing uncertainty and mitigating errors where responses may seem plausible but are incorrect 😱. Introducing ⚖️UMPIRE, an #UncertaintyQuantification framework for MLLMs that estimates task instance uncertainties at inference-time without external tools or additional training. Inspired by #DeterminantalPointProcess (#DPP), the UMPIRE metric is based on a global measure of the semantic volume formed by sampled responses, adjusted by local measures of each sample’s coherence with the multimodal query. Find out more about UMPIRE from @greglau at poster tomorrow (19 July) in the @icmlconf #ICML2025 #R2-FM Reliable and Responsible Foundation Models workshop during 12-1pm and 4-5pm in West Ballroom C! Joint work with @greglau and hieu. Paper: https://t.co/XTHL55CYA0

0

1

1

0

172

greglau retweeted

Bryan Kian Hsiang Low

11 months ago

Obtaining 🔎 interpretable symbolic equations is critical for scientific understanding and decision making in high-stakes environment. In fact, many such real-world applications (e.g., interactive/adaptive experiments, real-time decision support) demand rapid results within ⏳seconds. Hence, we ask❓: how can we rapidly discover accurate and parsimonious symbolic equations from data? In 📖README: Rapid #EquationDiscovery using #MultimodalEncoders, we develop an efficient #Multimodal foundation model for rapid equation discovery with 3 key insights: 1.⁠ ⁠Use image plots rather than raw numerical data to efficiently extract key trends; 2.⁠ ⁠Make use of pre-trained image and text encoders to efficiently train foundation model for equation discovery; 3.⁠ ⁠Develop an query-efficient optimization approach for efficient inference (Grey Wolf Algorithm x #BayesianOptimization). Find out more about README from @greglau at our poster today (18 July) in the @icmlconf #ICML2025 AI4MATH workshop during 10:50am-12.20pm in West Ballroom C! Joint work with @greglau yueran zi-yu @apivich_h @ruthchewing.

1

1

1

0

270

greglau retweeted

Bryan Kian Hsiang Low

11 months ago

https://t.co/Dkw3c6k0e4 (https://t.co/XBWReWZ8nA) will be @icmlconf #ICML2025: @ray_qiaorui @JingtanW @greglau @_Hu_Wenyang @ruthchewing! More details on each work later. Interested to apply for a faculty position or Ph.D. in AI at @NUSComputing , ping me on Whova to meet up. See you soon! #DataSelection #DataCentricAI #LLMs #LLM #FederatedLearning #BayesianOptimization #MachineUnlearning #RLHF #SymbolicRegression

bryanklow's tweet photo. https://t.co/Dkw3c6k0e4 (https://t.co/XBWReWZ8nA) will be @icmlconf #ICML2025: @ray_qiaorui @JingtanW @greglau @_Hu_Wenyang @ruthchewing!

More details on each work later.

Interested to apply for a faculty position or Ph.D. in AI at @NUSComputing , ping me on Whova to meet up.

See you soon!

#DataSelection #DataCentricAI #LLMs #LLM #FederatedLearning #BayesianOptimization #MachineUnlearning #RLHF #SymbolicRegression

5

19

3

1

1K

Gregory Lau @greglau

12 months ago

Excited to be visiting @UniofOxford at @OxfordStats @OxCSML over the summer, hosted by @desirivanova! Just arrived, and already having many interesting discussions. Looking forward to meeting people, exchanging ideas and engaging in research collaborations!

0

15

1

0

417

greglau retweeted

Bryan Kian Hsiang Low

about 1 year ago

Introducing WaterDrum🛢️, the first data-centric #LLM #unlearning metric that leverages robust text #watermarking💧to provide an effective, practical, and resilient way to evaluate LLM unlearning performance😉! (1/n) #MachineUnlearning #LLMs

bryanklow's tweet photo. Introducing WaterDrum🛢️, the first data-centric #LLM #unlearning metric that leverages robust text #watermarking💧to provide an effective, practical, and resilient way to evaluate LLM unlearning performance😉! (1/n)

#MachineUnlearning #LLMs https://t.co/ZY4ZYpAqQn

1

20

2

0

1K

greglau retweeted

Bryan Kian Hsiang Low

about 1 year ago

@iclr_conf @apivich_h @greglau @greglau and @apivich_h are getting PIED @iclr_conf #ICLR2025 🦶🥧🤣

bryanklow's tweet photo. @iclr_conf @apivich_h @greglau @greglau and @apivich_h are getting PIED @iclr_conf #ICLR2025 🦶🥧🤣 https://t.co/OwTBaQmk1e

0

3

2

0

315

greglau retweeted

Bryan Kian Hsiang Low

about 1 year ago

🧪 A key challenge to solving #InverseProblems, especially in science and engineering settings, is limited data and compute for simulations. 🥧 The @iclr_conf work of @apivich_h @greglau , PIED, is an #ExperimentalDesign framework for #PDE inverse problems that efficiently optimizes for the best observation inputs. 📌 PIED utilizes #PINNs to perform forward simulation and to solve the inverse problems, which allows for higher efficiency over numerical simulators and allow for meshless and differentiable simulations. 🤖 Using PINNs along with differentiable observation selection criteria, PIED selects the optimal design parameters for one-shot deployment, while exploitation parallel computation and gradient-based optimization methods. 📈 These benefits allow PIED to outperform existing ED benchmarks in IPs for both finite-dimensional and function-values inverse parameters, as demonstrated in various settings including on real experimental data. Check out our #ICLR2025 Poster #28 on 26 Apr in Poster Session 6 Sat 3pm Hall 3 + Hall 2B. Paper: https://t.co/9IMhSEpnp9 GitHub: https://t.co/Qa4ivyY4i0

1

7

2

0

372

greglau retweeted

Bryan Kian Hsiang Low

over 1 year ago

@NeurIPSConf We showed how a small ensemble of 3 1.5B models can outperform a 7B model on MATH with almost the same inference time & <3x compute for a normal query due to accelerated batch inference methods 😱 ! C u @NeurIPSConf #NeurIPS2024 Workshop on Foundation Model Interventions (n/n)

0

4

2

0

228

greglau retweeted

Bryan Kian Hsiang Low

over 1 year ago

@NeurIPSConf Furthermore, Dipper can work with LLM APIs without model access 📦! (3/n) @NeurIPSConf #NeurIPS2024

1

3

1

0

225

greglau retweeted

Bryan Kian Hsiang Low

over 1 year ago

@NeurIPSConf Introducing Dipper: a framework to create #LLMs ensembles from an optimized set of diverse reasoning prompts to improve perf. Unlike sequential inference time methods, Dipper runs queries in parallel, making it super fast ⏩ and effective. (2/n) @NeurIPSConf #NeurIPS2024

1

3

1

0

196

greglau retweeted

Bryan Kian Hsiang Low

over 1 year ago

Given a single model, how do we improve an #LLM’s reasoning performance with limited resources 💻 and inference time⌛️? Can a smaller 1.5B model outperform a 7B model without incurring long inference time from sequential queries? (1/n) @NeurIPSConf #NeurIPS2024 #LLMs

1

6

1

0

786

greglau retweeted

Bryan Kian Hsiang Low

over 1 year ago

https://t.co/G2ltYL9Pb4 (https://t.co/N3ho5ApQlH) will be @NeurIPSConf #NeurIPS2024: @_Hu_Wenyang @WuZhaoxuan @BobbyZhouZijian @ZhuanghuaL @ray_qiaorui @greglau. Interested to apply for faculty or PhD in AI @NUSComputing, ping me on Whova to meet up. See you soon!

bryanklow's tweet photo. https://t.co/G2ltYL9Pb4 (https://t.co/N3ho5ApQlH) will be @NeurIPSConf #NeurIPS2024: @_Hu_Wenyang @WuZhaoxuan @BobbyZhouZijian @ZhuanghuaL @ray_qiaorui @greglau.

Interested to apply for faculty or PhD in AI @NUSComputing, ping me on Whova to meet up.

See you soon! https://t.co/YdTbsZiIgq

2

28

2

0

1K

Last Seen Users on Sotwe

Trends for you

Most Popular Users