TheTechGuy

@code2deploy

AI Engineer | Vibe Coder | GenAI | Building in Public

Joined Twitter in April 2025

81 Following

10 Followers

181 Posts

code2deploy retweeted

CJ Zafir

@cjzafir

28 days ago

Qwen 3.5 has the best SLMs to fine-tune!

Its 4B model is really smart if you train it on a well-structured dataset.

I fine-tuned the model on a 135M dataset generated by Codex 5.5 + DeepSeek v4 Pro.

I achieved 96%+ accurate results with Qwen 3.5 4B.

And 95% on Qwen 3.5 2B (that only requires 3.5GB RAM).

For context, on the same pipeline:
> Sonnet 4.6 achieved 89%
> GPT 5.4 Mini achieved 85%
> Haiku 4.5 achieved 72%

I don't trust evals, so I ran a 7000+ row hard-boundary test, and the results of Qwen 3.5 were consistent.

A 4B fine-tuned model beating a 20x bigger model in accuracy and latency is no joke.

It cost me $173 in total to generate the dataset and cover the cloud GPU cost to fine-tune both models.

I said this before, and I'll say it again: not everything requires a 1T-parameter LLM. We need ELMs (Expert Language Models) that are specialized for one domain only.

ELMs > LLMs.

I'll be writing more about how SLM fine-tuning works. So stay tuned.

695

553

28K

code2deploy retweeted

CJ Zafir

@cjzafir

29 days ago

If you love fine-tuning open-source models (like me), then listen.

> Start with 1B, 2B, 4B, and 8B models. (Don't start with a 27B model or bigger at first.)

> Use cloud GPU providers. I use Google Colab Pro for any model smaller than 9B. A single A100 80GB costs around $0.60/hr, which is cheap. Enough for small models.

> Don’t buy GPUs unless you fine-tune 7 to 10 models. You'll understand the nitty-gritty in the process.

> Use Codex 5.5 × DeepSeek v4 Pro to create datasets. Codex to plan, DeepSeek v4 Pro to generate rows.

> Use Unsloth's instruct models as a base from Hugging Face. Yes, there are others too, but Unsloth also provides fast fine-tuning notebooks.

> Use Unsloth's fine-tuning notebooks as a reference. Paste them into Codex, and Codex will write a custom notebook with the configs you need.

> Spend 1 day learning about:
- SFT (supervised fine-tuning)
- RL training (GRPO, DPO, PPO, etc.)
- LoRA / QLoRA training
- Quantization and types
- Local inference engines (llama.cpp)
- KV cache and prompt cache

> Just get started. Claude, Codex, and ChatGPT can design a step-by-step plan for how you can fine-tune your first AI model.

Future tech is moving toward small 5B to 15B ELMs (Expert Language Models) rather than general 1T LLMs.

So fine-tuning is an important skill that anyone can acquire today.

Tune models, test them, use them. Then fine-tune for companies and make a career out of it. (Companies pay $50k+ to fine-tune models on their data so they can get personalized AI models.)

Shoot your questions below. I'll be sharing in-depth raw findings about this topic in the coming days.
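
To make the LoRA / QLoRA item above concrete, here is a minimal sketch of the parameter arithmetic, not the author's training setup. LoRA freezes a weight matrix of shape d_out × d_in and trains two low-rank factors instead, which adds rank × (d_in + d_out) parameters per adapted matrix. The hidden size and rank below are illustrative assumptions, not values from the thread or from any particular Qwen checkpoint.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    # Factor A has shape (rank, d_in); factor B has shape (d_out, rank).
    return rank * d_in + d_out * rank


def full_params(d_in: int, d_out: int) -> int:
    """Parameters in the frozen d_out x d_in matrix itself."""
    return d_in * d_out


# Hypothetical layer: a square projection with an assumed hidden size and rank.
HIDDEN = 2560
RANK = 16

added = lora_trainable_params(HIDDEN, HIDDEN, RANK)
frozen = full_params(HIDDEN, HIDDEN)
print(f"LoRA adds {added:,} trainable params next to {frozen:,} frozen ones")
print(f"Trainable fraction for this matrix: {added / frozen:.2%}")
```

With these assumed sizes the adapter trains about 1.25% of the parameters of the matrix it wraps, which is one reason small models fit on a single rented GPU.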

315

187K

code2deploy retweeted

Jahir Sheikh

@jahirsheikh8

about 1 month ago

As an AI Infrastructure Engineer. Please learn:
- GPU/VRAM fundamentals, quantization & batching
- vLLM / TensorRT-LLM / inference optimization
- KV caching, speculative decoding & token throughput
- Distributed training basics (DDP/FSDP/DeepSpeed)
- Model serving & autoscaling
- Vector DB retrieval pipelines
- Prompt caching & cost optimization
- Observability for LLM apps

This is what production AI teams actually care about.
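
The KV caching and VRAM items can be tied together with a back-of-the-envelope estimate. This is a rough sketch assuming a standard transformer decoder that caches one key and one value vector per layer, per KV head, per token; the model shape below is hypothetical and not taken from the tweet.

```python
def kv_cache_bytes(
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    seq_len: int,
    batch_size: int = 1,
    bytes_per_value: int = 2,  # 2 bytes assumes fp16/bf16 cache entries
) -> int:
    """Approximate KV cache size for a decoder-only transformer."""
    # The leading 2 accounts for storing both keys and values.
    return (
        2 * n_layers * n_kv_heads * head_dim
        * seq_len * batch_size * bytes_per_value
    )


# Hypothetical model shape, chosen only for illustration.
size = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, seq_len=4096)
print(f"KV cache: {size / 2**20:.0f} MiB per sequence")
```

For the assumed shape, a single 4096-token sequence needs 512 MiB of cache in fp16, and the figure grows linearly with batch size and context length, which is why batching and cache management show up on the list.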

410

239K

code2deploy retweeted

dope-a-meme at WWDC

@aannuujX

about 2 months ago

Introducing Swiggy Builders Club
We’re opening @Swiggy commerce infrastructure to developers and enterprises: build AI agents, apps, and integrations on top of Swiggy’s Food, Instamart, and Dineout ecosystems, with real APIs, real data, and real users.

What you get:
3 MCP Servers (Food, Instamart, Dineout)
18+ API tools covering the full convenience stack
Production data access from day one
Direct engineering support

Who it’s for:
Individual developers with bold ideas
Startups building AI-native commerce products
Enterprises looking to integrate Swiggy into their platforms

Smart grocery restock bots. AI ordering assistants. Dining recommendation agents. Group ordering tools. Health-first products.

If it makes commerce better for users, we want to see it.
Ship something great and we’ll feature it. Ship something exceptional and our recruiting team might reach out.

201

134

191K

code2deploy retweeted

ollama

@ollama

about 2 months ago

Kimi K2.6 raises the bar for open-source models. 🦙 available on Ollama's cloud!

Try it with OpenClaw:
ollama launch openclaw --model kimi-k2.6:cloud

Try it with Hermes Agent:
ollama launch hermes --model kimi-k2.6:cloud

Try it with Claude Code:
ollama launch claude --model kimi-k2.6:cloud

more integrations 🧵

175

728

207K

code2deploy retweeted

Nav Toor

@heynavtoor

about 2 months ago

If I had to become an AI engineer in 90 days, I would not start with courses. I would build projects from these 10 GitHub repos.

1. LangChain
The LLM application framework on almost every AI engineer JD. If you want to build production LLM apps, start here.
repo → https://t.co/alIh6rDDIu

2. LangGraph
Stateful agents as graphs. The repo JDs mean when they say "agentic workflows."
repo → https://t.co/bzVBn9uecV

3. LlamaIndex
The go-to framework for RAG and document agents. Every "retrieval pipeline" JD points here.
repo → https://t.co/m4oJ9FiCrX

4. CrewAI
Multi-agent teams with roles and tasks. Used in production by enterprises across the Fortune 500.
repo → https://t.co/0xohE065sD

5. Qdrant
A production vector database written in Rust. JDs name it alongside Pinecone, Chroma, and FAISS.
repo → https://t.co/ziSSXW2dzZ

6. Ragas
The standard framework for evaluating RAG pipelines. Hallucination, faithfulness, relevancy, all measurable.
repo → https://t.co/vgOInvREU5

7. Ollama
Run open-source LLMs locally in one command. JDs ask for local inference for cost and privacy reasons.
repo → https://t.co/gyZhUdzsnZ

8. Awesome MCP Servers
Model Context Protocol is the newest skill on JDs. This repo indexes every production MCP server out there.
repo → https://t.co/ejVOgkRJDX

9. Awesome LLM Apps
100+ end-to-end templates for RAG, agents, multi-agent teams, voice agents, and MCP. Real working code.
repo → https://t.co/oXrD5A8K6a

10. AI Agents for Beginners
Microsoft's free 12-lesson curriculum covering the full AI agent stack. No paywall, no signup.
repo → https://t.co/7dNsDw6bTj

AI engineer job descriptions in 2026 keep asking for the same things: RAG, agents, vector databases, evals, MCP.

These 10 repos teach all of it. Pick one. Build one project. Push it to GitHub. That's how you start.

100% free. 100% open source.

465

219K

code2deploy retweeted

Ihtesham Ali

@ihtesham2005

about 2 months ago

If you want to become an AI engineer in 2026 and don't know where to start.

Here's the complete list of free resources you actually need.

1. Stanford CS229 - Machine Learning

The course Andrew Ng built before he became Andrew Ng. Full lectures on YouTube. Problem sets on the Stanford site. Audit free on Coursera.
Link: https://t.co/K2kFE3dHQ8

2. Harvard CS50 AI - Intro to AI with Python

The cleanest beginner AI course on the internet. Harvard puts it out for free. No excuses.
Link: https://t.co/LMf6IJeOZp

3. MIT 6.S191 - Deep Learning

MIT's official intro to deep learning. Updated every year. Guest speakers from Google, Microsoft, MIT labs.
Link: https://t.co/N7WsOA7UDZ

4. UC Berkeley - LLM Agents MOOC

Guest lectures from Anthropic, OpenAI, Google DeepMind, and NVIDIA on how agents actually work. The most industry-relevant free course on agents right now.
Link: https://t.co/hLGAcVNd5u

5. Hugging Face LLM Course

Covers transformers, fine-tuning, deployment, and advanced LLM techniques. Built by the team that ships the tools you'll actually use on the job.
Link: https://t.co/apLq1Y652G

6. Hugging Face AI Agents Course

Goes from theory to production. You build with smolagents, LangGraph, and LlamaIndex. Free certificate included.
Link: https://t.co/8QYPr5l6gM

7. Anthropic Academy – Developer Deep Dive

13 free courses from the team that built Claude. Covers the API, MCP, prompt engineering, and production integration patterns. Launched March 2026. No paywall.
Link: https://t.co/JLRiDIzbPe

8. Anthropic Prompt Engineering Course

9 chapters, hands-on exercises, runs directly in Claude. The only prompt engineering course where you learn from the people who built the model.
Link: https://t.co/ZIiTGzEdo8

9. DeepLearning.AI Short Courses

88 free courses. Each one runs 1–2 hours. Built with OpenAI, LangChain, Anthropic, and Mistral. Start with "ChatGPT Prompt Engineering for Developers."
Link: https://t.co/btuHWx2zEo

10. fast.ai - Practical Deep Learning for Coders

Jeremy Howard's course. Code-first. No math gatekeeping. The fastest way to go from zero to actually building neural networks.
Link: https://t.co/yWbkhFTbuu

All of it is free.

None of it requires a paid subscription.

If you could only pick ONE of these to start with in 2026, which would it be?

421

686

19K

code2deploy retweeted

Aadi Kulshrestha

@MankyDankyBanky

about 2 months ago

I trained a 12M parameter LLM on my own ML framework using a Rust backend and CUDA kernels for flash attention, AdamW, and more. Wrote the full transformer architecture and BPE tokenizer from scratch.

The framework features:
- Custom CUDA kernels (Flash Attention, fused LayerNorm, fused GELU) for 3x increased throughput
- Automatic WebGPU fallback for non-NVIDIA devices
- TypeScript API with Rust compute backend
- One npm install to get started, prebuilt binaries for every platform

Try out the model for yourself: https://t.co/TB2itlmCVT

Built with @_reesechong. Check out the repos and blog if you want to learn more.

Shoutout to @modal for the compute credits allowing me to train on 2 A100 GPUs without going broke

cc @sundeep @GavinSherry

131

258

808K

TheTechGuy @code2deploy

about 2 months ago

@ClaudeDevs @alexalbert__ Need a special access tier just for devs 😄 Make it happen, @ClaudeDevs

TheTechGuy @code2deploy

about 2 months ago

@JulianGoldieSEO Would like to give this a try! Are there any quantized models that I can run on my PC?

code2deploy retweeted

Qwen

@Alibaba_Qwen

about 2 months ago

⚡ Meet Qwen3.6-35B-A3B：Now Open-Source！🚀🚀

A sparse MoE model, 35B total params, 3B active. Apache 2.0 license.

🔥 Agentic coding on par with models 10x its active size
📷 Strong multimodal perception and reasoning ability
🧠 Multimodal thinking + non-thinking modes

Efficient. Powerful. Versatile. Try it now👇

Blog：https://t.co/EXx5y466su
Qwen Studio：https://t.co/bg4tAU1p74
HuggingFace：https://t.co/w4pDX14DZS
ModelScope：https://t.co/SuRyLzdQiO
API（‘Qwen3.6-Flash’ on Model Studio）：Coming soon～ Stay tuned

445

12K

TheTechGuy @code2deploy

about 2 months ago

@UnslothAI @Alibaba_Qwen Would love to try it for sure!!

115

code2deploy retweeted

Unsloth AI

@UnslothAI

about 2 months ago

@Alibaba_Qwen Thank you for supporting open-source! We just made GGUFs so you can run the model locally on 23GB RAM / Mac for 4-bit. 💜🥰 Qwen3.6-35B-A3B: https://t.co/VlyW8UwDjw
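
A quick way to sanity-check the 23GB figure is to estimate the memory of the weights alone. The sketch below assumes a flat number of bits per parameter; real GGUF quantization formats also store per-block scales, and the runtime needs room for the KV cache and buffers, so treat the result as a lower bound rather than Unsloth's own calculation.

```python
def weight_bytes(n_params: int, bits_per_param: int) -> int:
    """Lower-bound memory for model weights at a flat bit width."""
    return n_params * bits_per_param // 8


# 35B total parameters, as stated in the Qwen announcement.
N_PARAMS = 35_000_000_000

for bits in (16, 8, 4):
    gb = weight_bytes(N_PARAMS, bits) / 1e9
    print(f"{bits:>2}-bit weights: about {gb:.1f} GB")
```

At 4 bits the weights alone come to about 17.5 GB, which leaves the quoted 23GB with a few gigabytes of headroom for quantization metadata and runtime state.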

760

268

40K

code2deploy retweeted

Karan🧋

@kmeanskaran

about 2 months ago

These are actual high-demand next-level AI skills, not RAG:
- Data prep for instruction fine-tuning
- @UnslothAI ecosystem for fine-tuning, reasoning models, quantization, etc.
- Fine-tuning embedding models
- Backend design using FastAPI, Redis caching, queue workers, rate limiting
- Building an LLM inference layer using @vllm_project
- @DeepSpeedAI for multi-GPU training
- Learn NVIDIA Triton for running any vision and object detection models along with LLMs
- Learn about Docker and CUDA setup
- Use of @huggingface ecosystem
- Distributed systems using @anyscalecompute
- Kubernetes and Terraform setup for serving your model
- Learn about @modal for low-config setup deployment
- Deploying LLMs on Ollama for easier calls
- Context engineering
- Agent memory harness
- Multi-agent orchestration
- LLM as a judge

These skills are more focused on inference engineering to ensure that your model runs smoothly in production.

Memory, inference, and orchestration matter a lot more than frontier models.

I am learning inference engineering and agent harnesses nowadays. Totally worth exploring!

403

515

13K

TheTechGuy @code2deploy

about 2 months ago

@amasad @Razorpay That's huge 🥳

code2deploy retweeted

Alvaro Cintas

@dr_cintas

2 months ago

This peanut-sized Chinese model just dethroned Gemini at reading documents.

It’s called glm-ocr. It’s a tiny 0.9B-parameter vision-language model that is about to replace every expensive OCR API you use.

→ Handles text, tables, formulas, handwriting
→ Scored 94.62 on OmniDocBench V1.5
→ 8 languages
→ vLLM, SGLang, Ollama

And it’s 100% open-source.

423

564

37K

code2deploy retweeted

Suni

@suni_code

2 months ago

Don’t overcomplicate it.

• Build a File Search Tool to learn recursion and indexing
• Build a Markdown to HTML Converter to practice parsing
• Build a Rate Limiter to understand queues and timing
• Build a JSON Database to learn serialization and storage
• Build a Chat Server (CLI) to understand sockets and concurrency
• Build a Static Site Generator to learn templates and file systems
• Build a Git-like Version Tracker to understand diffs and commits
• Build a Task Scheduler to learn cron-like logic and timing
• Build a Cache System to understand TTL and eviction strategies
• Build a Load Balancer (simple) to understand routing logic
• Build a Port Scanner to learn networking basics
• Build a Config Manager to learn env files and parsing
• Build a Search Autocomplete using prefix trees (Trie)
• Build a Notes App with tagging to learn indexing
• Build a Simple Queue System to learn producers & consumers
• Build a Backup Tool to learn file copying and versioning
• Build a Feature Flag System to learn toggles and configs
• Build a Simple API Client (like Postman CLI)
• Build a Diff Tool to compare files line-by-line
• Build a Mini Template Engine to learn string processing

Build projects, not just tutorials.
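
As one worked example from the list above, the search autocomplete project can be sketched with a prefix tree. This is a minimal illustration, not a reference design: the class names, the `limit` parameter, and the sample words are all made up for the example.

```python
class TrieNode:
    def __init__(self) -> None:
        self.children: dict[str, "TrieNode"] = {}
        self.is_word = False


class Trie:
    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        """Add a word, creating one node per missing character."""
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def complete(self, prefix: str, limit: int = 5) -> list[str]:
        """Return up to `limit` stored words starting with `prefix`."""
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return []
            node = node.children[ch]

        results: list[str] = []
        stack = [(node, prefix)]
        while stack and len(results) < limit:
            current, text = stack.pop()
            if current.is_word:
                results.append(text)
            # Push children in reverse order so the smallest letter is
            # popped first, which yields alphabetical results.
            for ch in sorted(current.children, reverse=True):
                stack.append((current.children[ch], text + ch))
        return results


trie = Trie()
for word in ["cache", "cat", "cron", "config", "queue"]:
    trie.insert(word)

print(trie.complete("ca"))
print(trie.complete("c", limit=3))
```

Lookup cost depends on the prefix length and the number of matches returned, not on the total number of stored words, which is the property that makes a trie a natural fit for autocomplete.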

138

46K

code2deploy retweeted

Amit Shekhar

@amitiitbhu

2 months ago

My recent 4 articles on X:
- KV Cache in LLMs
- Paged Attention in LLMs
- Causal Masking in Attention
- Byte Pair Encoding in LLMs

X is a knowledge sharing platform.

145

151

12K

TheTechGuy @code2deploy

3 months ago

@arpit_bhayani That's the actual content I visit X for!!

TheTechGuy @code2deploy

3 months ago

Working on a RAG system that works locally: a code migrator.

LLMs: Qwen Coder, gpt-oss
Embedding model: qwen-embedding (since code is involved)
VectorDB: FAISS

Any suggestions on how this can be done better? Better tools or workflows?

#community #llms #rag #qwen #gpt
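
One low-cost way to debug a pipeline like this is to test the retrieval step on its own. The sketch below is a stand-in, not the setup described in the tweet: it uses hand-written 3-dimensional vectors in place of a real embedding model and a brute-force cosine search in place of a FAISS index, and the file names are hypothetical. An exact inner-product index such as FAISS's `IndexFlatIP` over normalized vectors computes the same ranking at scale.

```python
import math


def normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def top_k(query: list[float], corpus: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the ids of the k corpus vectors most similar to the query."""
    q = normalize(query)
    scored = []
    for doc_id, vec in corpus.items():
        d = normalize(vec)
        # Dot product of unit vectors equals cosine similarity.
        score = sum(a * b for a, b in zip(q, d))
        scored.append((score, doc_id))
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Toy "embeddings" for hypothetical source files; a real pipeline would
# get these from the embedding model instead.
CORPUS = {
    "legacy_db.py": [0.9, 0.1, 0.0],
    "http_client.py": [0.1, 0.9, 0.1],
    "db_migrations.py": [0.8, 0.2, 0.1],
}

print(top_k([1.0, 0.0, 0.0], CORPUS, k=2))
```

Checking that a database-flavored query ranks the two database files above the HTTP client is the kind of small retrieval test that can be run before any LLM is involved.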
