Eniton @eniton - Twitter Profile

I'm a very visual person. when I was first getting into ML, I'd try to draw out every concept on pen and paper. back then I couldn't vibe-code a visualization. but now you can! here are my favorite ML visualizations I've been saving for a while. take them as inspo for the next complex topic you want to visualize 🧵

33

1K

166

2K

84K

Eniton @eniton

16 days ago

I have a feeling MiniMax will announce that the 50% promotional price is permanent after the first week. https://t.co/i7qSe2I1oK

0

12

eniton retweeted

Nathan Lambert

@natolambert

about 1 month ago

Visiting most of the leading Chinese AI labs, I'm struck by a culture that's extremely well suited to building LLMs with fewer resources, but one happening in a very different ecosystem, more companies at play, almost no data industry, etc. Full report: https://t.co/ibmtMWnfTc

47

2K

227

2K

679K

Eniton @eniton

4 months ago

It's art. https://t.co/onO3VGw0Jb

0

32

eniton retweeted

Sebastian Raschka

@rasbt

6 months ago

https://t.co/GV9DTkesul

39

2K

387

3K

293K

eniton retweeted

Paul Klein IV

@pk_iv

6 months ago

I spent all of Christmas reverse engineering Claude Chrome so it would work with remote browsers. Here's how Anthropic taught Claude how to browse the web (1/7)

87

2K

183

3K

395K

eniton retweeted

Chao Huang

@huang_chao4969

6 months ago

🚀 Paper2Slides is now open source! Transform research papers & technical reports into professional presentations with ONE click!

We've generated stunning presentation slides from the latest DeepSeek V3.2 paper in diverse styles - check them out and share your feedback!

🔥 Core Features:
- 📄 Multi-format support - PDF, Word, Excel, PowerPoint & more
- 🎯 Smart content understanding - Captures key insights, figures, formulas, tables & data points.
- 🎨 Custom styling - Professional themes with full personalization.
- ⚡ Lightning fast - High-quality PPT generation in minutes.

GitHub: https://t.co/zNxlFifDU3

Never build slides from scratch again! ✨ Come play with it ⭐!

#Paper2Slides #AIPPT

38

2K

261

2K

467K

eniton retweeted

Nathan Lambert

@natolambert

6 months ago

Slides: https://t.co/LzAdf2pnDi

2

47

3

49

5K

eniton retweeted

MBZUAI

@mbzuai

7 months ago

Today, we are releasing a new version of K2 (K2-V2), a 360-open LLM built from scratch as a superior base for reasoning adaptation, while still excelling at core LLM capabilities like conversation, knowledge retrieval, and long-context understanding.

K2 fills a major gap: highly capable models with no transparency. Instead of releasing only weights, we’re sharing the full training story — dataset recipes, mid-training checkpoints, logs, code, and evaluation tools. That’s 360-open.

What’s inside:
• 70B dense transformer engineered as a reasoning-enhanced base model
• Native 512K context (extendable via RoPE scaling)
• Mid-training reasoning phase
• Strong tool-use scaffolding

What we’re open-sourcing:
• 250M+ reasoning traces (math, planning, multi-step logic)
• Full pre- & mid-training data compositions
• All mid-training checkpoints
• Training logs, code, Eval360

Performance:
• GPQA-Diamond: 55.1% mid-training → 69.3% after SFT (strongest fully open 70B model)
• KK-8 Logic Puzzles: 83% — competitive with DeepSeek-R1 & OpenAI o3-mini-high
• ArenaHard V2: 62.1% — close to Qwen3 235B
• Outperforms Qwen2.5-72B and approaches Qwen3-235B despite being smaller and fully transparent.

🔗 The Model: https://t.co/gsjRUwfnvN
🔗 Technical Report: https://t.co/oFZQuLQaNg
🔗 Blog: https://t.co/zQdpmLgEUt

3

111

35

48

55K

Eniton @eniton

9 months ago

@immeivise It's a great table. Could you share the link to it?

0

6

Eniton @eniton

10 months ago

@openrouter How does the pricing model for image generation work? It says "$1.238/K input imgs, $0.03/K output imgs" — what does “/K” mean here?

0

16

eniton retweeted

Suny Shtedritski @shtedritski

about 1 year ago

Introducing SynCity 🌆 SynCity generates entire 3D worlds from a text prompt with no training or optimisation. It leverages pretrained 2D and 3D generators and generates scenes on a grid, tile by tile. The generated 3D environments are diverse, fully coherent, and navigable. 🧵👇

75

3K

354

3K

280K

eniton retweeted

Stevie Mac

@StevieMac03

over 1 year ago

Monk and his giant warsteed traversing a river. Kling 1.6 image to video. 🔊🔊

60

2K

219

446

208K

Eniton @eniton

over 1 year ago

what?

0

68

Eniton @eniton

over 1 year ago

@kevin_logan @tunguz exactly

0

1

0

86

eniton retweeted

Philipp Schmid

@_philschmid

over 1 year ago

How can we evaluate LLMs across 1000+ languages? 🌎 The first step towards FineWeb Multilingual was creating FineTasks, a data-driven evaluation framework that helps select reliable evaluation tasks for any language. The @huggingface Team validated it across 9 different languages and evaluated 35 open and closed LLMs. 👀

TL;DR:
🎯 Created FineTasks - a framework for selecting reliable multilingual evaluation tasks
📊 Tasks based on 4 key metrics: monotonicity, low noise, non-random performance, and model ordering consistency
🔍 Tested 185 tasks across 9 diverse languages (Chinese, French, Arabic, Russian, Thai, Hindi, Turkish, Swahili, Telugu)
📋 Selected 96 final tasks covering reading comprehension, general knowledge, language understanding, and reasoning
🧪 Found task formulation matters: Cloze Format better for early training, Multiple Choice Format for later evaluation
📈 Metrics recommendation: Use length normalization for most tasks, PMI for complex reasoning
🔥 Open models are narrowing the gap with closed-source models in multilingual performance.
🏆 Evaluated 35 open and closed-source LLMs; Qwen 2 models dominated high/mid-resource languages; Gemma-2 excelled in low-resource languages
🌐 Framework supports 550+ tasks across various languages

1

31

4

7

3K

eniton retweeted

Maziyar PANAHI

@MaziyarPanahi

over 1 year ago

I can't believe I need to say this, but run the code below in your local Jupyter notebook and save 138,830 arXiv papers in multi-markdown format now before they're gone! 😅 Available on @huggingface Datasets: https://t.co/5sOPYBLZBs