Be Curious @Cuurious_guy - Twitter Profile

Be Curious @Cuurious_guy

14 days ago

@dahi_kachorie Chihuahua

Be Curious @Cuurious_guy

about 1 month ago

@Jennnyyyyyy 18KG

Cuurious_guy retweeted

Tech Fusionist

@techyoutbe

8 months ago

MLOps Roadmap (Core topics to focus on)

Programming Fundamentals
- Python
- Go
- Bash

Version Control Systems
- Git
- GitHub

Cloud Computing
- AWS / Azure / GCP
- Cloud-native ML Services

Containerization
- Docker
- Kubernetes

Machine Learning Fundamentals

Data Engineering Fundamentals
- Data Pipelines
- Airflow
- Data Lakes & Warehouses
- Data Ingestion Architecture
- Spark
- Kafka
- Flink

MLOps Principles

MLOps Components
- Version Control
- CI/CD
- Orchestration
- Experiment Tracking & Model Registry
- Data Lineage & Feature Stores
- Model Training & Serving
- Monitoring & Observability

Infrastructure as Code

Cuurious_guy retweeted

Ahmad

@TheAhmadOsman

8 months ago

step-by-step LLM Engineering Projects

each project = one concept learned the hard (i.e. real) way

Tokenization & Embeddings
> build byte-pair encoder + train your own subword vocab
> write a “token visualizer” to map words/chunks to IDs
> one-hot vs learned-embedding: plot cosine distances

Positional Embeddings
> classic sinusoidal vs learned vs RoPE vs ALiBi: demo all four
> animate a toy sequence being “position-encoded” in 3D
> ablate positions, watch attention collapse

Self-Attention & Multihead Attention
> hand-wire dot-product attention for one token
> scale to multi-head, plot per-head weight heatmaps
> mask out future tokens, verify causal property

transformers, QKV, & stacking
> stack the Attention implementations with LayerNorm and residuals → single-block transformer
> generalize: n-block “mini-former” on toy data
> dissect Q, K, V: swap them, break them, see what explodes

Sampling Parameters: temp/top-k/top-p
> code a sampler dashboard: interactively tune temp/k/p and sample outputs
> plot entropy vs output diversity as you sweep params
> nuke temp=0 (argmax): watch repetition

KV Cache (Fast Inference)
> record & reuse KV states; measure speedup vs no-cache
> build a “cache hit/miss” visualizer for token streams
> profile cache memory cost for long vs short sequences

Long-Context Tricks: Infini-Attention / Sliding Window
> implement sliding window attention; measure loss on long docs
> benchmark “memory-efficient” (recompute, flash) variants
> plot perplexity vs context length; find context collapse point

Mixture of Experts (MoE)
> code a 2-expert router layer; route tokens dynamically
> plot expert utilization histograms over dataset
> simulate sparse/dense swaps; measure FLOP savings

Grouped Query Attention
> convert your mini-former to grouped query layout
> measure speed vs vanilla multi-head on large batch
> ablate number of groups, plot latency

Normalization & Activations
> hand-implement LayerNorm, RMSNorm, SwiGLU, GELU
> ablate each: what happens to train/test loss?
> plot activation distributions layerwise

Pretraining Objectives
> train masked LM vs causal LM vs prefix LM on toy text
> plot loss curves; compare which learns “English” faster
> generate samples from each; note quirks

Finetuning vs Instruction Tuning vs RLHF
> fine-tune on a small custom dataset
> instruction-tune by prepending tasks (“Summarize: ...”)
> RLHF: hack a reward model, use PPO for 10 steps, plot reward

Scaling Laws & Model Capacity
> train tiny, small, medium models; plot loss vs size
> benchmark wall-clock time, VRAM, throughput
> extrapolate scaling curve: how “dumb” can you go?

Quantization
> code PTQ & QAT; export to GGUF/AWQ; plot accuracy drop

Inference/Training Stacks:
> port a model from HuggingFace to Deepspeed, vLLM, ExLlama
> profile throughput, VRAM, latency across all three

Synthetic Data
> generate toy data, add noise, dedupe, create eval splits
> visualize model learning curves on real vs synth

each project = one core insight. build. plot. break. repeat.
> don’t get stuck too long in theory
> code, debug, ablate, even meme your graphs lol
> finish each and post what you learned

your future self will thank you later
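
As an illustration of the "Sampling Parameters: temp/top-k/top-p" project in the list above, here is a minimal sketch in plain Python. It is not from the original tweet: the function names `filter_probs` and `sample`, the convention that `top_k=0` disables top-k, and the choice to treat `temperature <= 0` as argmax are all assumptions made for this example. Real inference stacks may order or combine the filters differently.

```python
import math
import random


def filter_probs(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return a renormalized distribution after temperature, top-k and top-p.

    Hypothetical helper for illustration; the names and defaults are assumptions.
    """
    if temperature <= 0:
        # Assumption: temperature 0 means greedy decoding (argmax).
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]

    # Temperature-scaled softmax, shifted by the max for numerical stability.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token ids from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # top-k: keep only the k most likely tokens (0 disables the filter).
    if top_k > 0:
        order = order[:top_k]

    # top-p (nucleus): keep the smallest prefix whose mass reaches top_p.
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if top_p < 1.0 and mass >= top_p:
            break

    # Renormalize over the surviving tokens; everything else gets zero.
    kept_mass = sum(probs[i] for i in kept)
    out = [0.0] * len(probs)
    for i in kept:
        out[i] = probs[i] / kept_mass
    return out


def sample(logits, rng=random, **kwargs):
    """Draw one token id from the filtered distribution."""
    probs = filter_probs(logits, **kwargs)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Sweeping `temperature`, `top_k`, and `top_p` over a fixed list of logits and plotting the entropy of `filter_probs(...)` is one way to start on the "entropy vs output diversity" item.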

Be Curious @Cuurious_guy

8 months ago

@tomcrawshaw01 Guide

Be Curious @Cuurious_guy

11 months ago

@GitaShlokas_ Samay

Cuurious_guy retweeted

ISRO

@isro

over 1 year ago

SpaDeX Docking Update:

🌟Docking Success

Spacecraft docking successfully completed! A historic moment.

Let’s walk through the SpaDeX docking process:

Manoeuvre from 15m to 3m hold point completed.
Docking initiated with precision, leading to successful spacecraft capture.
Retraction completed smoothly, followed by rigidisation for stability.
Docking successfully completed.

India became the 4th country to achieve successful Space Docking.

Congratulations to the entire team! Congratulations to India!

🛰️🛰️ #SPADEX #ISRO
