Pablo Duy K. Vu @duyvk - Twitter Profile

duyvk retweeted

about 2 months ago

Andrej Karpathy could have packaged this into a $2,000 masterclass. Instead, he uploaded it to YouTube for free. Three hours covering how modern LLMs actually work: tokenization, neural nets, RLHF, hallucinations, tool use, reinforcement learning, and systems like AlphaGo and DeepSeek. This isn’t about prompts. It’s about understanding the machine behind the magic.

15

512

112

1K

160K

duyvk retweeted

Rimsha Bhardwaj

@heyrimsha

3 months ago

🚨BREAKING: A dev just open-sourced the #1 ranked OCR model on Earth. It's called GLM-OCR and it just hit 94.62 on OmniDocBench V1.5, beating every OCR model in existence. Only 0.9B parameters. One pip install. Handles documents no other model could touch. 100% Open Source.

heyrimsha's tweet photo. 🚨BREAKING: A dev just open-sourced the #1 ranked OCR model on Earth.

It's called GLM-OCR and it just hit 94.62 on OmniDocBench V1.5, beating every OCR model in existence.

Only 0.9B parameters. One pip install. Handles documents no other model could touch.

100% Open Source. https://t.co/WNdIJiw6ZX

48

3K

391

4K

207K

duyvk retweeted

Alex Xu

@alexxubyte

2 months ago

12 Claude Code Features Clearly Explained - CLAUDE.md - Permissions - Plan Mode - Checkpoints - Skills - Hooks - MCP - Plugins - Context - Slash Commands - Compaction - Subagents

27

2K

419

3K

261K

duyvk retweeted

Melvin Vivas @donvito

7 months ago

OCR to HTML 🤯🤯🤯 Qwen3 VL 30B can convert handwritten notes to html try it https://t.co/2Fpo5Xx8fy

4

296

32

228

16K

Who to follow

Fabio Bertozzi

@FabioBertozzi

○ Mbet @UWaterloo Alumni

KOHEI

@JC_hoshiyama

株式会社JAMCODE代表取締役。バックパッカー＆サウンドクリエイター。使用言語・ツール：Ruby,Python,Java,Scala,PHP HTML5,CSS3,sass,javascript,Vagrant,ansible,docker,Github,AWS等

ankushmalik

@iamankushmalik

I break things; I make things

duyvk retweeted

Adina Yakup

@AdinaYakup

3 months ago

Another OCR model just dropped on @huggingface (so many OCRs lately!) dots.mocr from @xiaohongshu Hi Lab looks really impressive on the benchmarks. -Model: https://t.co/tv9CrfnddG -Paper: https://t.co/JAGPQnPBxJ ✨ 3B ✨ Multilingual support ✨ Converts charts, diagrams, and UI layouts directly into SVG code

AdinaYakup's tweet photo. Another OCR model just dropped on @huggingface (so many OCRs lately!)

dots.mocr from @xiaohongshu Hi Lab looks really impressive on the benchmarks.

-Model: https://t.co/tv9CrfnddG
-Paper: https://t.co/JAGPQnPBxJ

✨ 3B
✨ Multilingual support
✨ Converts charts, diagrams, and UI layouts directly into SVG code

4

203

33

147

8K

Pablo Duy K. Vu @duyvk

4 months ago

@Motion_Viz PROMPT

0

4

duyvk retweeted

Bo Wang

@BoWang87

4 months ago

A Chinese hardware team just mass-democratized AI agents. They took a 430,000-line AI assistant that needs a $599 Mac Mini and 1GB of RAM — and rewrote it in Go so it runs on a $9.9 dev board with less than 10MB of memory. Boot time: from 500 seconds to 1 second. Cost: from $599 to $9.9. Memory: from 1GB to 10MB. Same features: code generation, web search, Discord/Telegram chat, memory system, scheduled tasks, security sandbox. The wildest part? They claim 95% of the new codebase was written by AI agents themselves. The humans just guided the architecture. It's an AI assistant that literally rebuilt itself to be smaller. Launched February 9th. Four days later: 7,400+ GitHub stars. This is the pattern no one's talking about enough. Every AI capability that starts expensive gets commoditized within months. GPT-4 level models went open source in 6 months. Now the hardware floor for running a personal AI agent just dropped 60x in weeks. The infrastructure moat in AI isn't sustainable. The only defensible advantage is what you do with these tools — not access to them.

BoWang87's tweet photo. A Chinese hardware team just mass-democratized AI agents.

They took a 430,000-line AI assistant that needs a $599 Mac Mini and 1GB of RAM — and rewrote it in Go so it runs on a $9.9 dev board with less than 10MB of memory.

Boot time: from 500 seconds to 1 second.
Cost: from $599 to $9.9.
Memory: from 1GB to 10MB.

Same features: code generation, web search, Discord/Telegram chat, memory system, scheduled tasks, security sandbox.

The wildest part? They claim 95% of the new codebase was written by AI agents themselves. The humans just guided the architecture. It's an AI assistant that literally rebuilt itself to be smaller.

Launched February 9th. Four days later: 7,400+ GitHub stars.

This is the pattern no one's talking about enough.

Every AI capability that starts expensive gets commoditized within months. GPT-4 level models went open source in 6 months. Now the hardware floor for running a personal AI agent just dropped 60x in weeks.

The infrastructure moat in AI isn't sustainable. The only defensible advantage is what you do with these tools — not access to them.

322

8K

1K

9K

3M

duyvk retweeted

AK

@akbuilds_

4 months ago

I don’t know about others, but I’m letting Opus 4.6 design directly in Figma via Cursor. This is starting to feel unfair. And yeah… I just sit there and enjoy life. ❤️

240

5K

204

7K

816K

duyvk retweeted

0xMarioNawfal

@RoundtableSpace

4 months ago

OpenClaw can now use 8 different Claude Code instances with Opus all day. This guy just turned 1 autonomous AI engineer into 8.

77

988

62

1K

124K

duyvk retweeted

Charly Wargnier

@DataChaz

4 months ago

Chinese engineers literally refactored @OpenClaw in Go for insane efficiency 🤯 It runs on a $10 @Raspberry_Pi instead of a $399 Mac mini. → Uses <10 MB of memory (99% smaller than OpenClaw) → 400× faster startup time → Boots in 1 second Free and open source, repo in 🧵↓

DataChaz's tweet photo. Chinese engineers literally refactored @OpenClaw in Go for insane efficiency 🤯

It runs on a $10 @Raspberry_Pi instead of a $399 Mac mini.

→ Uses <10 MB of memory (99% smaller than OpenClaw)
→ 400× faster startup time
→ Boots in 1 second

Free and open source, repo in 🧵↓ https://t.co/5VBYJhRo0B

155

4K

521

4K

424K

duyvk retweeted

Unwind AI

@unwind_ai_

4 months ago

China's Alibaba just opensourced the SQLite of vector databases. zvec runs as a library inside your app and is built for on-device RAG no external server. no pinecone. no qdrant instance. 100% opensource.

unwind_ai_'s tweet photo. China's Alibaba just opensourced the SQLite of vector databases.

zvec runs as a library inside your app and is built for on-device RAG

no external server. no pinecone. no qdrant instance.

100% opensource. https://t.co/KFQ1JM4KXK

50

3K

345

3K

133K

duyvk retweeted

Hugging Models

@HuggingModels

4 months ago

NVIDIA just dropped PersonaPlex-7B 🤯 A full-duplex voice model that listens and talks at the same time. No pauses. No turn-taking. Real conversation. 100% open source. Free. Voice AI just leveled up. https://t.co/YfzFQfBzMS

206

8K

1K

8K

1M

duyvk retweeted

Stan Girard

@_StanGirard

4 months ago

I reverse-engineered Claude Code's binary. Found a flag they hid from --help. --sdk-url Enable it and the terminal disappears. The CLI becomes a WebSocket client. We built a server to catch the connection. Added a React UI on top. Now I run Claude Code from my browser. From my phone. From anywhere. Same $200/month subscription. Zero extra API costs. bunx the-vibe-companion https://t.co/8Dnp7M46vU

266

4K

285

5K

693K

duyvk retweeted

𝗿𝗮𝗺𝗮𝗸𝗿𝘂𝘀𝗵𝗻𝗮— 𝗲/𝗮𝗰𝗰

@techwith_ram

5 months ago

MIT just released a 700-page book that actually teaches machines how to think. Pdf: https://t.co/143xxVxLdg

60

5K

847

8K

379K

duyvk retweeted

Boris Cherny

@bcherny

4 months ago

I'm Boris and I created Claude Code. I wanted to quickly share a few tips for using Claude Code, sourced directly from the Claude Code team. The way the team uses Claude is different than how I use it. Remember: there is no one right way to use Claude Code -- everyones' setup is different. You should experiment to see what works for you!

924

51K

6K

103K

9M

duyvk retweeted

Vaishnavi

@_vmlops

7 months ago

These creators make AI/ML truly understandable the real heroes for anyone learning this field🤩

56

3K

324

2K

91K

duyvk retweeted

Rohan Paul

@rohanpaul_ai

8 months ago

Converts PDF documents to Markdown format using DeepSeek-OCR with FastAPI backend. This guy bave it 10000 pdfs to convert to markdown. averaging less than 1 second per page. Hardware - 1 x A6000 ADA on a Ryzen 1700 /w 32gb ram Dockerized model with fastapi in a wsl environment.

rohanpaul_ai's tweet photo. Converts PDF documents to Markdown format using DeepSeek-OCR with FastAPI backend.

This guy bave it 10000 pdfs to convert to markdown.

averaging less than 1 second per page.

Hardware - 1 x A6000 ADA on a Ryzen 1700 /w 32gb ram

Dockerized model with fastapi in a wsl environment.

37

2K

256

3K

258K

duyvk retweeted

vLLM

@vllm_project

8 months ago

🚀 DeepSeek-OCR — the new frontier of OCR from @deepseek_ai , exploring optical context compression for LLMs, is running blazingly fast on vLLM ⚡ (~2500 tokens/s on A100-40G) — powered by vllm==0.8.5 for day-0 model support. 🧠 Compresses visual contexts up to 20× while keeping 97% OCR accuracy at <10×. 📄 Outperforms GOT-OCR2.0 & MinerU2.0 on OmniDocBench using fewer vision tokens. 🤝 The vLLM team is working with DeepSeek to bring official DeepSeek-OCR support into the next vLLM release — making multimodal inference even faster and easier to scale. 🔗 https://t.co/rnBG9VUuMy #vLLM #DeepSeek #OCR #LLM #VisionAI #DeepLearning

vllm_project's tweet photo. 🚀 DeepSeek-OCR — the new frontier of OCR from @deepseek_ai , exploring optical context compression for LLMs, is running blazingly fast on vLLM ⚡ (~2500 tokens/s on A100-40G) — powered by vllm==0.8.5 for day-0 model support.

🧠 Compresses visual contexts up to 20× while keeping 97% OCR accuracy at <10×.
📄 Outperforms GOT-OCR2.0 & MinerU2.0 on OmniDocBench using fewer vision tokens.

🤝 The vLLM team is working with DeepSeek to bring official DeepSeek-OCR support into the next vLLM release — making multimodal inference even faster and easier to scale.

🔗 https://t.co/rnBG9VUuMy
#vLLM #DeepSeek #OCR #LLM #VisionAI #DeepLearning

53

3K

360

1K

2M

duyvk retweeted

Andrej Karpathy

@karpathy

8 months ago

I quite like the new DeepSeek-OCR paper. It's a good OCR model (maybe a bit worse than dots), and yes data collection etc., but anyway it doesn't matter. The more interesting part for me (esp as a computer vision at heart who is temporarily masquerading as a natural language person) is whether pixels are better inputs to LLMs than text. Whether text tokens are wasteful and just terrible, at the input. Maybe it makes more sense that all inputs to LLMs should only ever be images. Even if you happen to have pure text input, maybe you'd prefer to render it and then feed that in: - more information compression (see paper) => shorter context windows, more efficiency - significantly more general information stream => not just text, but e.g. bold text, colored text, arbitrary images. - input can now be processed with bidirectional attention easily and as default, not autoregressive attention - a lot more powerful. - delete the tokenizer (at the input)!! I already ranted about how much I dislike the tokenizer. Tokenizers are ugly, separate, not end-to-end stage. It "imports" all the ugliness of Unicode, byte encodings, it inherits a lot of historical baggage, security/jailbreak risk (e.g. continuation bytes). It makes two characters that look identical to the eye look as two completely different tokens internally in the network. A smiling emoji looks like a weird token, not an... actual smiling face, pixels and all, and all the transfer learning that brings along. The tokenizer must go. OCR is just one of many useful vision -> text tasks. And text -> text tasks can be made to be vision ->text tasks. Not vice versa. So many the User message is images, but the decoder (the Assistant response) remains text. It's a lot less obvious how to output pixels realistically... or if you'd want to. Now I have to also fight the urge to side quest an image-input-only version of nanochat...

558

13K

2K

7K

3M

Pablo Duy K. Vu @duyvk

11 months ago

Excitement is building as new whisky lines hit the shelves. Explore the latest releases and discover unique flavors from around the world. Discover the latest whisky releases, featuring rare blends and aged single malts from top distilleries around the world.🧞‍♂️ Explore new fav

duyvk's tweet photo. Excitement is building as new whisky lines hit the shelves.
Explore the latest releases and discover unique flavors from around the world.
Discover the latest whisky releases, featuring rare blends and aged single malts from top distilleries around the world.🧞‍♂️
Explore new fav https://t.co/htJOJgch2t

0

20

Pablo Duy K. Vu

@duyvk

Who to follow

Last Seen Users on Sotwe

Trends for you

Most Popular Users