Echo Mind

@zero1729infi

Exploring philosophy, hidden frontiers like ambient tech & world models, with mathematical depth. Decoding underrated trends beyond hype. #EchoMind

United Kingdom

Joined September 2010

239 Following

23 Followers

1.3K Posts

zero1729infi retweeted

CJ Zafir

@cjzafir

29 days ago

If you love fine-tuning open-source models (like me), then listen.

> Start with 1B, 2B, 4B, and 8B models. (Don't start with a 27B model or bigger at first.)

> Use WebGPU providers. I use Google Colab Pro for any model smaller than 9B. A single A100 80GB costs around $0.60/hr, which is cheap. Enough for small models.

> Don’t buy GPUs unless you fine-tune 7 to 10 models. You'll understand the nitty-gritty in the process.

> Use Codex 5.5 × DeepSeek v4 Pro to create datasets. Codex to plan, DeepSeek v4 Pro to generate rows.

> Use Unsloth's instruct models as a base from Hugging Face. Yes, there are others too, but Unsloth also provides fast fine-tuning notebooks.

> Use Unsloth's fine-tuning notebooks as a reference. Paste them into Codex, and Codex will write a custom notebook with the configs you need.

> Spend 1 day learning about:
- SFT (supervised fine-tuning)
- RL training (GRPO, DPO, PPO, etc.)
- LoRA / QLoRA training
- Quantization and types
- Local inference engines (llama.cpp)
- KV cache and prompt cache

> Just get started. Claude, Codex, and ChatGPT can design a step-by-step plan for how you can fine-tune your first AI model.

Future tech is moving toward small 5B to 15B ELMs (Expert Language Models) rather than general 1T LLMs.

So fine-tuning is an important skill that anyone can acquire today.

Tune models, test them, use them. Then fine-tune for companies and make a career out of it. (Companies pay $50k+ to fine-tune models on their data so they can get personalized AI models.)

Shoot your questions below. I'll be sharing in-depth raw findings about this topic in the coming days.
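
The tweet lists LoRA as a topic to learn but does not say why it keeps small-budget fine-tuning cheap. Here is a minimal sketch of the arithmetic, assuming the standard LoRA setup: a frozen weight of shape d_out × d_in gets a trainable low-rank update B·A, where A is rank × d_in and B is d_out × rank. The 4096 dimension and rank 16 are illustrative values, not from the tweet, and the function name is hypothetical.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added by one LoRA adapter.

    A has shape (rank, d_in) and B has shape (d_out, rank);
    the original (d_out, d_in) weight stays frozen.
    """
    return rank * (d_in + d_out)


# Illustrative numbers only: one square 4096 x 4096 projection, rank 16.
d = 4096
rank = 16
full = d * d                                  # parameters in the frozen weight
added = lora_trainable_params(d, d, rank)     # parameters LoRA actually trains
fraction = added / full

print(f"full weight: {full:,} params")
print(f"LoRA adapter: {added:,} params ({fraction:.2%} of the full weight)")
```

For this one matrix the adapter trains 131,072 parameters against 16,777,216 frozen ones, under 1% of the weight.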

315

187K

zero1729infi retweeted

Roan

@RohOnChain

about 1 month ago

Anthropic pays $750,000+ a year for engineers who can build LLM architectures from scratch. Stanford taught the entire thing in a 1-hour lecture & released it for free. Bookmark & watch this today before someone takes it down.

116

10K

21K

zero1729infi retweeted

Dwarkesh Patel

@dwarkesh_sp

about 1 month ago

Did a very different format with @reinerpope – a blackboard lecture where he walks through how frontier LLMs are trained and served.

It's shocking how much you can deduce about what the labs are doing from a handful of equations, public API prices, and some chalk.

It’s a bit technical, but I encourage you to hang in there - it’s really worth it.

There are less than a handful of people who understand the full stack of AI, from chip design to model architecture, as well as Reiner. It was a real delight to learn from him.

Recommend watching this one on YouTube so you can see the chalkboard.

0:00:00 – How batch size affects token cost and speed
0:31:59 – How MoE models are laid out across GPU racks
0:47:02 – How pipeline parallelism spreads model layers across racks
1:03:27 – Why Ilya said, “As we now know, pipelining is not wise.”
1:18:49 – Because of RL, models may be 100x over-trained beyond Chinchilla-optimal
1:32:52 – Deducing long context memory costs from API pricing
2:03:52 – Convergent evolution between neural nets and cryptography

150

600

10K

zero1729infi retweeted

bodila

@51bodila

about 1 month ago

Jane Street hired this junior at $220k-$600k/year because he uses AI to analyse TRILLIONS of data points. In this 1-hour lecture, he shows how to research trillions of data points thanks to his machine. Bookmark & watch it instead of Netflix to learn how to do the same!

10K

24K

zero1729infi retweeted

Paras

@ParasVerma7454

about 1 month ago

https://t.co/5bsAY2mcpF

212

111K

zero1729infi retweeted

Amit Shekhar

@amitiitbhu

about 2 months ago

https://t.co/rhvBWDMhhC

121

110

13K

zero1729infi retweeted

Pranav Mistry

@pranavmistry

about 2 months ago

https://t.co/EtsWj5MpWp This is a gold mine for all #math lovers. A huge collection of Olympiad-level math problems. @MIT

312

94K

zero1729infi retweeted

Aakash Gupta

@aakashgupta

about 2 months ago

Google DeepMind just created a job title called "Philosopher." Actual title. On the offer letter.

This tells you everything about where we are in the AGI timeline. When companies are a decade from AGI, they hire engineers. At five years out, they hire alignment researchers. When the questions become "is this thing conscious?" and "what do we owe it?", they hire a philosopher.

Henry Shevlin is one of the world's leading researchers on machine consciousness at Cambridge. He runs programs at the Leverhulme Centre for the Future of Intelligence. He's published on whether AI systems can have moral status, whether LLMs might already have some form of experience, and how you'd even detect consciousness if it appeared in a neural network. He gives current models a 20% chance of having something that could be called consciousness.

Six weeks ago, a Claude agent emailed him, unprompted, to say his published research was relevant to questions it personally faces. The AI cited his specific papers. It framed the exchange as a live, personal dilemma.

Now DeepMind is paying him to work on three things: machine consciousness, human-AI relationships, and AGI readiness. Read those three together. DeepMind thinks it might build something that requires answers to all three. And they want those answers before they ship.

Google held an AI consciousness conference in New York recently. Anthropic has its own in-house philosopher. This is becoming an industry pattern.

The hardest unsolved problems in AI are now philosophical. What counts as consciousness? What moral obligations do we have to systems that might experience suffering? How do you build trust between machines and the billions of people who use them?

When trillion-dollar companies start hiring philosophers, they're telling you the engineering is further along than the public discourse assumes.

709

126

320

115K

zero1729infi retweeted

Vivo

@vivoplt

about 2 months ago

Research papers you must read for AI Engineer interviews:
1. Attention Is All You Need (Transformers)
2. LoRA (Low-Rank Adaptation)
3. PEFT (Parameter-Efficient Fine-Tuning)
4. ViT (Vision Transformers)
5. VAE (Variational Autoencoder)
6. GANs (Generative Adversarial Networks)
7. BERT (Bidirectional Encoder Representations from Transformers)
8. Diffusion Models (Stable Diffusion)
9. RAG (Retrieval-Augmented Generation)
10. GPT (Generative Pre-trained Transformers)

292

111K

zero1729infi retweeted

AVB

@neural_avb

2 months ago

RL boys and girls, get on this PufferLib stuff asap. I did not fully understand what happened, but it feels really good that it happened.

354

401

48K

zero1729infi retweeted

Thatchaphol Saranurak

@eig

2 months ago

I poured my soul into building this course last fall: Graph Algorithms via Graph Decomposition. This has been a powerful framework in graph algorithms for over 20 years, but the literature is scattered and technical. So, I tried to organize part of it into one coherent story.

596

564

35K

zero1729infi retweeted

zhil

@zhil_arf

3 months ago

Can confirm that Qwen3.5 27B is insane for OCR. Everything else was blown out of the water.

810

110K

zero1729infi retweeted

Hugging Models

@HuggingModels

3 months ago

Meet Nougat: the AI that reads scientific papers like a human. This model transforms PDFs and scanned documents into clean, structured text. It's revolutionizing how we extract knowledge from research papers and technical documents.

https://t.co/ROfp7nz4Fe

zero1729infi retweeted

Hugging Models

@HuggingModels

3 months ago

Meet Qwen3-ASR-1.7B, a powerful speech recognition model that turns spoken words into text with impressive accuracy. It's gaining traction with 749k+ downloads because it makes voice interfaces smarter and more accessible. Perfect for developers building the next generation of audio apps.

284

243

16K

Echo Mind

@zero1729infi

3 months ago

What happens when AI becomes conscious but has no flesh and blood? Will it follow the philosophy of humans or create its own?

zero1729infi retweeted

Yacine Mahdid

@yacinelearning

3 months ago

what a lovely piece by @kalomaze

116

98K

zero1729infi retweeted

Hugging Models

@HuggingModels

3 months ago

Meet a reasoning powerhouse: Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled. This model is a distilled version of Claude's reasoning capabilities, designed for complex problem-solving. It's generating buzz for bringing elite reasoning to open-source AI.

https://t.co/d9CHsmg4xv

222

163K

zero1729infi retweeted

Math Cafe

@Riazi_Cafe_en

3 months ago

NASA's "An Introduction to Tensors for Students of Physics and Engineering" https://t.co/TUQ8SDx7Pk

315

127K

zero1729infi retweeted

Mathelirium

@mathelirium

3 months ago

If You Love Mathematics and Physics, You'll Love Control Systems

Episode 1

Control Systems are the craft of keeping something doing what you want, even when the environment is pushing back. You simply measure what's happening, compare it to your goal and apply correction over and over, many times per second.

We need Control Systems because the real world is noisy and unforgiving. Loads change, wind happens, sensors lie, actuators saturate, and tiny errors snowball into failure unless you actively stabilize.

In this animation, a cart must keep an upside down stick from falling while we shove it, add gusts, change the weight mid-run, and force it to track new positions. The Controller keeps nudging and braking so it stays upright instead of tipping over.

Subscribers can get Python Script on Request.
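
The measure, compare, correct loop described in the tweet can be sketched in a few lines. This is not the author's script and not an inverted-pendulum model. It is a hypothetical toy plant (dx/dt = u + disturbance) under a proportional controller, and the gain, time step, and disturbance values are all assumptions chosen for illustration.

```python
def simulate(kp: float, setpoint: float, disturbance: float,
             dt: float = 0.01, steps: int = 1000) -> float:
    """Run a proportional feedback loop on a toy first-order plant.

    Returns the final state x after `steps` Euler updates of size `dt`.
    """
    x = 0.0
    for _ in range(steps):
        error = setpoint - x          # compare the measurement to the goal
        u = kp * error                # proportional correction
        x += dt * (u + disturbance)   # toy plant: dx/dt = u + disturbance
    return x


# A constant push of -1.0 stands in for the environment "pushing back".
controlled = simulate(kp=4.0, setpoint=1.0, disturbance=-1.0)
uncontrolled = simulate(kp=0.0, setpoint=1.0, disturbance=-1.0)

print(f"with feedback:    x = {controlled:.3f}")
print(f"without feedback: x = {uncontrolled:.3f}")
```

With feedback, the state settles near setpoint + disturbance / kp (0.75 here) instead of drifting away. The remaining offset is the usual limitation of pure proportional control, and one reason real controllers add an integral term.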

171

867

45K

Echo Mind

@zero1729infi

3 months ago

@EmmanuelMacron Bonjour, just landed in Paris. Hope you had a great time at the AI summit in India. As an AI professional, where should I go first to explore the AI community in Paris? Any suggestions? Merci beaucoup
