Tsinghua KEG (THUDM) @thukeg - Twitter Profile

Pinned Tweet

almost 2 years ago

#VisualAgentBench: 4o, 4o-mini, 3.5-sonnet currently have an edge as visual foundation agents, but open models InternVL & GLM-4V are catching up fast, a similar story to LLMs as agents as revealed in #AgentBench back in Aug 2023. https://t.co/1LoiPkjHAx https://t.co/ddbELDos0T


2

25

6

9K

thukeg retweeted

Tsinghua University

@Tsinghua_Uni

about 1 year ago

Prof. Liu's team built an #AI doctor for everyday #healthcare! In a #virtual hospital, it treated 10K+ virtual patients with 93% accuracy. They covered 300+ diseases across 21 departments & released BioMedGPT, PathOrchestra, and more for a full #medical AI pipeline. #THUAndBeyond


3

28

8

11

4K

thukeg retweeted

Tsinghua CS @thudcst

about 1 year ago

🏆Congrats to the Storage Research Group from #Tsinghua DCST for winning the #ASPLOS2025/#EuroSys2025 Large-Scale Model Inference Optimization Contest in Rotterdam! They outperformed global competitors, boosting inference performance by 1.1x using AWS NKI framework optimizations.


0

7

5

3

3K

thukeg retweeted

Stanford AI Lab

@StanfordAILab

over 1 year ago

Check out our latest blog post about MiniVLA, a smaller open-source vision-language-action model! https://t.co/6TvEMWGBSF

5

84

11

52

13K


thukeg retweeted

Paul Vicol @PaulVicol

over 1 year ago

Ruslan Salakhutdinov at the Adaptive Foundation Models Workshop!

0

30

4

8

8K

thukeg retweeted

Richard Socher

@RichardSocher

over 1 year ago

AI has a "last-mile problem" similar to self-driving cars. With self-driving cars, early demos impressed, but real-world deployment took years. It's easy to hack up a prototype, but making it work reliably at scale is hard. If each step of an AI agent is only 95% accurate, none of the 30-step workflows will work reliably. Going from 95% to 99.9% accuracy is the real challenge.

4

146

24

77

15K
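
The compounding-error point in the tweet above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming every step succeeds or fails independently (an assumption the tweet does not state); the function name `workflow_success_rate` is hypothetical.

```python
def workflow_success_rate(step_accuracy: float, num_steps: int) -> float:
    """Probability that all steps succeed, assuming independent steps."""
    return step_accuracy ** num_steps


# Compare the two per-step accuracies mentioned in the tweet over 30 steps.
for accuracy in (0.95, 0.999):
    rate = workflow_success_rate(accuracy, 30)
    print(f"{accuracy:.1%} per step over 30 steps -> {rate:.1%} end to end")
```

Under that independence assumption, a 30-step workflow at 95% per step completes roughly one time in five, while 99.9% per step completes about 97% of the time.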

thukeg retweeted

Z.ai @Zai_org

over 1 year ago

🌈AndroidLab: a comprehensive platform for developing and evaluating Android agents. By integrating a controlled environment and standardized benchmarks, and leveraging the Android Instruct dataset, we significantly boost open-source model performance. https://t.co/qLac9Rjvbq

1

42

12

16

6K

thukeg retweeted

Yuxiao Dong

@ericdongyx

over 1 year ago

#AutoGLM: Autonomous Foundation Agents for GUIs by @ShawLiu12 and team at @thukeg & @ChatGLM! Here are some AutoGLM for phone use demos --- beta testing since Oct 25 --- and its tech report https://t.co/ONkwT5rllu

1

15

5

3

1K

thukeg retweeted

Z.ai @Zai_org

over 1 year ago

Thank you to the passionate developers for your continued support and patience. CogVideoX-5B-I2V is released!😀 GitHub: https://t.co/VNpl283CPS CogVideoX-5B-I2V model: https://t.co/85AiDO6YcD Gradio space: https://t.co/f0dR1IqrCT

4

193

51

87

18K

thukeg retweeted

Tiezhen WANG

@Xianbao_QIAN

over 1 year ago

What has just happened? @thukeg has just released the CogVideoX image-to-video generation model. Amazing result. Combined demo of T2V/I2V and V2V: https://t.co/HgmFRUc1QM Please duplicate the space with a L4s to avoid the long waiting queue. Model: https://t.co/3gxwGRicsu

3

43

7

14

6K

thukeg retweeted

Gradio

@Gradio

almost 2 years ago

LongWriter-glm4-9b from @thukeg is capable of generating 10,000+ words at once!🚀 Paper identifies a problem with current long context LLMs -- they can process inputs up to 100,000 tokens, yet struggle to generate outputs exceeding lengths of 2,000 words. Paper proposes that an LLM's effective generation length is inherently bounded by the sample it has seen during supervised fine-tuning😮 Demonstrates that existing long context LLMs already possess the potential for a larger output window--all you need is data with extended output during model alignment to unlock this capability. Code & models are released under Apache License 2.0🧡

4

142

37

93

18K

thukeg retweeted

AK

@_akhaliq

almost 2 years ago

New from @thukeg

LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs

author @realYushiBai is active in discussion section to answer your questions: https://t.co/UeebckjJjf

1

52

5

16

13K

thukeg retweeted

Yushi Bai

@realYushiBai

almost 2 years ago

Thanks @_akhaliq! We find that your long context LLM is secretly a LongWriter💡All you need is data with extended output during model alignment to unlock this capability. Our code, data, and models: https://t.co/9KN9fKWFLC

1

33

11

19

14K

thukeg retweeted

AK

@_akhaliq

almost 2 years ago

LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs

discuss: https://t.co/CHHl12U7RF

Current long context large language models (LLMs) can process inputs up to 100,000 tokens, yet struggle to generate outputs exceeding even a modest length of 2,000 words. Through controlled experiments, we find that the model's effective generation length is inherently bounded by the sample it has seen during supervised fine-tuning (SFT). In other words, their output limitation is due to the scarcity of long-output examples in existing SFT datasets.

To address this, we introduce AgentWrite, an agent-based pipeline that decomposes ultra-long generation tasks into subtasks, enabling off-the-shelf LLMs to generate coherent outputs exceeding 20,000 words. Leveraging AgentWrite, we construct LongWriter-6k, a dataset containing 6,000 SFT data with output lengths ranging from 2k to 32k words. By incorporating this dataset into model training, we successfully scale the output length of existing models to over 10,000 words while maintaining output quality.

We also develop LongBench-Write, a comprehensive benchmark for evaluating ultra-long generation capabilities. Our 9B parameter model, further improved through DPO, achieves state-of-the-art performance on this benchmark, surpassing even much larger proprietary models. In general, our work demonstrates that existing long context LLM already possesses the potential for a larger output window--all you need is data with extended output during model alignment to unlock this capability.

3

208

46

140

51K
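
The decompose-then-write idea described in the abstract above can be sketched as follows. This is not the authors' AgentWrite implementation; it is an illustration under stated assumptions. The function names `plan_then_write` and `fake_generate`, the prompt wording, and the `generate` callable (a stand-in for any LLM call) are all hypothetical.

```python
from typing import Callable, List


def plan_then_write(task: str, generate: Callable[[str], str]) -> str:
    """Split a long writing task into subtasks, then write them in order."""
    # Step 1 (assumed prompt wording): ask for an outline, one subtask per line.
    outline = generate(f"Write a paragraph-by-paragraph outline for: {task}")
    subtasks: List[str] = [ln.strip() for ln in outline.splitlines() if ln.strip()]

    # Step 2: write each part in order, showing the model the text so far.
    parts: List[str] = []
    for subtask in subtasks:
        written = "\n\n".join(parts)
        prompt = (
            f"Task: {task}\nWritten so far:\n{written}\n"
            f"Now write only this part: {subtask}"
        )
        parts.append(generate(prompt))
    return "\n\n".join(parts)


def fake_generate(prompt: str) -> str:
    """Toy stand-in for an LLM so the sketch runs without any model."""
    if prompt.startswith("Write a paragraph-by-paragraph outline"):
        return "Intro\nBody\nConclusion"
    # Echo the requested part name in brackets.
    return "[" + prompt.rsplit(": ", 1)[-1] + "]"


print(plan_then_write("a history of the GLM models", fake_generate))
```

With a real model behind `generate`, each call only has to produce one part, which is how a pipeline of this shape could exceed the output length a single call tends to produce.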

Tsinghua KEG (THUDM)

@thukeg

almost 2 years ago

#VisualAgentBench: proprietary models (4o, 4o-mini, 3.5-sonnet) currently have an edge as visual foundation agents, but open models InternVL & GLM-4V are catching up fast, a similar story to LLMs as agents as revealed in #AgentBench back in Aug 2023. https://t.co/1LoiPkjHAx https://t.co/ddbELDos0T

Xiao Liu (Shaw)

@ShawLiu12

almost 2 years ago

🚨Thrilled to present VisualAgentBench (VAB) with @yugu_nlp and Tianjie, where we enable both TRAINING & TESTING of visual foundation agents across 5 different environments! In all, 17 large multimodal models (LMMs) are tested. Find our paper, data, and more insights below 👇 Paper: https://t.co/EtURrhGZe3 Code & Data: https://t.co/XrsG9cJwkp Thanks @_akhaliq for sharing on today’s arxiv on HF!

1

50

16

25

23K

0

11

1

0

1K

thukeg retweeted

Z.ai @Zai_org

almost 2 years ago

We are not just doing “demo only” for video generation. Ying, we are bringing a video generation AI that everyone can use. Create a 6-second video in just 30 seconds. Try our new product now. YING: https://t.co/wH5pQusd7s https://t.co/Rt3eXXR8qB

7

103

30

57

36K

thukeg retweeted

Tsinghua CS @thudcst

almost 2 years ago

🏆Proud moment for us! Our paper on 'Explicit factor models for explainable recommendation' (https://t.co/uL7CVxkqZk) has won the Test of Time Award at #SIGIR2024, leading the way in 'explainable recommendation' since 2014. Congrats to the outstanding THUIR group from #DCST, #Tsinghua


0

19

4

2

5K

thukeg retweeted

Z.ai @Zai_org

almost 2 years ago

🚀 We published a tech report about GLM's Family! ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools. https://t.co/cgCKNtE93r


1

39

9

10

3K

thukeg retweeted

Aran Komatsuzaki

@arankomatsuzaki

almost 2 years ago

ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

GLM-4:
- closely rivals GPT-4 on MMLU, MATH, GPQA, etc
- gets close to GPT-4 in instruction following and long context tasks

hf: https://t.co/ZQayrOSdn4
repo: https://t.co/ZtxFCfaWcx
abs: https://t.co/Pem3LX2i0P

2

100

26

34

13K

thukeg retweeted

AK

@_akhaliq

almost 2 years ago

ChatGLM

A Family of Large Language Models from GLM-130B to GLM-4 All Tools

We introduce ChatGLM, an evolving family of large language models that we have been developing over time. This report primarily focuses on the GLM-4 language series, which includes GLM-4, GLM-4-Air,

1

80

31

30

14K
