Raja

@genupula

Proud Indian, System Administrator

Hyderabad/Bangalore

Joined September 2011

264 Following

107 Followers

1.7K Posts

Pinned Tweet

Raja @genupula

over 3 years ago

Hello everyone, I wrote a shell script to enable/disable CPU cores to save battery while running a laptop on battery power. Repository: https://t.co/Igw48aDuOp I tested this on Debian 10. Hope it helps #bash #Linux #opensource

340

genupula retweeted

Vivek Galatage

@vivekgalatage

3 days ago

How to Write Shared Libraries - Ulrich Drepper https://t.co/W8GtDzcvrN

386

364

27K

genupula retweeted

Dmitrii Kovanikov

@ChShersh

3 days ago

@genupula You won’t believe it

764

genupula retweeted

Vivek Galatage

@vivekgalatage

3 days ago

Introduction to Computer Graphics by David J. Eck https://t.co/MgLCes43aI

360

327

10K

genupula retweeted

3 days ago

Give me some coding / tech related book or content recommendations. Drop your comments 👇🏻

10K

genupula retweeted

Antonio Lupetti

@antoniolupetti

3 days ago

"Transformers" by Daniel Jurafsky and James H. Martin is one of the clearest and most mathematically grounded introductions to the Transformer architecture I have ever read. Chapter 8 introduces the Transformer as the standard architecture behind modern large language models. What makes this chapter particularly interesting is its step-by-step presentation of the underlying mechanisms: contextual embeddings, self-attention, query, key and value vectors, scaled dot-product attention, multi-head attention, residual streams, feedforward layers, layer normalization, masking, and the parallel matrix formulation of attention. In particular, the treatment of attention as a weighted sum of contextual representations is especially valuable. The chapter first develops an intuitive, simplified view of attention and then gradually derives the full formulation using the Q, K, and V matrices. This approach makes it easier to understand what is actually happening inside the architecture from an algebraic and matrix-based perspective, rather than simply viewing the usual block diagrams. I think it is an excellent resource for anyone interested in understanding how Transformers work from linguistic, mathematical, and computational perspectives. https://t.co/3fitdPy6Fv

350

223K
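The "attention as a weighted sum" idea mentioned in the tweet above can be illustrated with a minimal pure-Python sketch of scaled dot-product attention, softmax(QK^T / sqrt(d_k))V. This is not taken from the book chapter; the function names (`softmax`, `attention`) and the toy inputs are assumptions for illustration, and the sketch leaves out masking, multiple heads, and the learned Q/K/V projection matrices.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention for lists of vectors.

    Each output row is a weighted sum of the value vectors, with weights
    given by the softmax of the scaled query-key dot products.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output dimension at a time.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


if __name__ == "__main__":
    # Hypothetical toy input: one query, two keys, two values.
    q = [[1.0, 0.0]]
    k = [[1.0, 0.0], [0.0, 1.0]]
    v = [[1.0, 2.0], [3.0, 4.0]]
    print(attention(q, k, v))
```

When all keys are identical, the weights become uniform and the output is simply the mean of the value vectors, which is a quick way to sanity-check the weighted-sum reading.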

Raja @genupula

3 days ago

@ChShersh If I want to learn C++ now, where should I start? I want to learn by building projects; I already know the syntax.

704

genupula retweeted

Swati Gupta

@hrswatigupta

6 days ago

Anthropic pays $750,000+ a year for engineers who can build LLM architectures from scratch. Stanford taught the entire thing in 1 hour lecture & released it for free. Bookmark & watch this today before someone takes it down and read this article below

859

10K

940K

genupula retweeted

Jaydeep

@_jaydeepkarale

7 days ago

Day 4 in Observability Zero to Hero

We look at How Observability Reduces MTTR (Mean Time to Resolution)

I explain in this one:
1. Production Incident Investigation Explained
2. Observability Maturity Models

244

219

18K

genupula retweeted

Avi Chawla

@_avichawla

6 days ago

Researchers made KMeans 200x faster.

And the new technique also beats approaches like cuML and FAISS.

Flash-KMeans is an IO-aware implementation of exact KMeans that redesigns the algorithm around modern GPU bottlenecks.

By attacking the memory bottlenecks directly, Flash-KMeans achieves:
- 33x speedup over cuML
- 200x speedup over FAISS

This speedup comes from how it moves through GPU memory.

Standard KMeans runs in two steps, and both are bottlenecked by reads and writes to GPU memory:

1) The first step matches every point to its nearest centroid. Standard KMeans computes the full point-to-centroid distance matrix, writes it out to GPU memory, then reads it back to find each nearest centroid. That write-then-read round trip is the bottleneck. Flash-KMeans combines the distance calculation with the nearest-centroid step, so the result is computed on-chip and the full matrix is never written out.

2) The second step recomputes each centroid by averaging the points assigned to it. Standard KMeans has thousands of threads writing into the same centroid slots at once, so they stall waiting for their turn. Flash-KMeans sorts points by cluster first, turning scattered writes into sequential reductions that read and write memory in one efficient pass.

Using these two optimizations at the million-scale, Flash-KMeans completes a standard KMeans iteration in a few milliseconds.

The video below depicts this in action.

Several reasons why this is important:

KMeans has always been an offline primitive. Something you run once to preprocess data and move on.

These speedups make the approach viable in several runtime-critical systems.

↳ Vector indices like FAISS use KMeans to build search indices. Faster KMeans means you can re-index dynamically as data changes.
↳ LLM quantization methods need KMeans to find optimal weight codebooks, per layer, repeatedly. What takes hours could now take minutes.
↳ MoE models need fast token routing at inference time. Flash-KMeans makes it viable to run this inside the inference loop, not just in preprocessing.

I have shared the paper in the replies.

That said, memory is the real constraint Flash-KMeans solves, and the problem is not just limited to clustering. The vectors a RAG system stores after indexing create similar bottlenecks.

I wrote a detailed walkthrough recently on cutting this vector memory by 32x with binary quantization, querying 36M+ vectors in a few milliseconds. Read it below.

668

749

88K
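As a rough illustration of the two steps the tweet above attributes to standard KMeans (assign each point to its nearest centroid, then average the points in each cluster), here is a minimal pure-Python sketch. It is ordinary Lloyd-style KMeans on the CPU, not Flash-KMeans and not the paper's GPU code. The function names (`assign`, `update`, `kmeans`) and the sample data are assumptions made for illustration. The only nod to the first optimization described in the tweet is that `assign` keeps a running minimum per point rather than storing the full point-to-centroid distance matrix.

```python
def assign(points, centroids):
    """Step 1: index of the nearest centroid for every point.

    A running minimum is tracked per point, so the full distance
    matrix is never materialized.
    """
    labels = []
    for p in points:
        best_idx, best_dist = 0, float("inf")
        for idx, c in enumerate(centroids):
            dist = sum((a - b) ** 2 for a, b in zip(p, c))
            if dist < best_dist:
                best_idx, best_dist = idx, dist
        labels.append(best_idx)
    return labels


def update(points, labels, centroids):
    """Step 2: recompute each centroid as the mean of its assigned points."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for p, label in zip(points, labels):
        counts[label] += 1
        for j in range(dim):
            sums[label][j] += p[j]
    new_centroids = []
    for i, old in enumerate(centroids):
        if counts[i]:
            new_centroids.append([s / counts[i] for s in sums[i]])
        else:
            # Empty cluster: keep the previous centroid unchanged.
            new_centroids.append(list(old))
    return new_centroids


def kmeans(points, centroids, iterations=10):
    """Alternate the two steps for a fixed number of iterations."""
    centroids = [list(c) for c in centroids]
    for _ in range(iterations):
        labels = assign(points, centroids)
        centroids = update(points, labels, centroids)
    return centroids, assign(points, centroids)


if __name__ == "__main__":
    # Hypothetical toy data: two well-separated groups in 2D.
    pts = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
    print(kmeans(pts, [[1.0, 1.0], [9.0, 1.0]]))
```

On a GPU the interesting part is how these two loops touch memory, which a Python sketch cannot show; it only fixes the vocabulary of "assignment step" and "update step" used in the tweet.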

Raja @genupula

6 days ago

Should watch #go

Vivek Galatage

@vivekgalatage

6 days ago

What Every Programmer Should Know about How CPUs Work by @mattgodbolt https://t.co/V1usGa2akA

238

208

genupula retweeted

Ihtesham Ali

@ihtesham2005

13 days ago

Dennis Ritchie invented C in 1972, co-built Unix in 1969, and his code is running inside every device you are reading this on right now and the colleague who announced his death had to do it through a Google+ post because no journalist thought to check.

He worked at Bell Labs in New Jersey for 44 years. He never gave a keynote. He never ran a company. He never appeared on a magazine cover. He just wrote code that became the invisible foundation everything else is built on.

Here is what he actually built, and why it matters more than almost anything that happened in tech.

In 1969, Bell Labs had just walked away from one of the most ambitious computing projects in history. The Multics project, a joint effort between MIT, Bell Labs, and General Electric, had collapsed under its own weight. Too complex. Too expensive. Too slow. Bell Labs pulled out.

Ken Thompson and Dennis Ritchie refused to let the ideas die.

Working in a small office in Murray Hill, New Jersey, Thompson wrote the first version of Unix in three weeks during the summer of 1969. One week for the file system. One week for the process management. One week for the command shell. Ritchie was working alongside him, and when the system needed a language that could express what they were building, he built one.

In 1972 he completed C.

C was not just another programming language. It was a different philosophy about what a programming language should be. Before C, most systems code was written in assembly, which meant every program was tied to the specific hardware it ran on. You could not move code between machines. You rewrote it from scratch every time.

C changed that. It sat close enough to the hardware to be fast, but abstract enough to run on anything. When Thompson rewrote the Unix kernel in C in 1973, it became the first operating system that could be picked up and moved to a completely different machine without starting over. Portability was a new idea. Ritchie made it real.

The branching that followed is almost impossible to overstate.

Unix spread from Bell Labs to universities. At Berkeley, it became BSD. BSD became the foundation of macOS and iOS. Unix influenced Linus Torvalds, who built Linux in 1991. Linux now runs every Android phone, every major web server, every supercomputer on the Top500 list, and the overwhelming majority of cloud infrastructure at AWS, Google, and Microsoft.

C became the parent language of C++, Java, JavaScript, Python, and Objective-C. Rob Pike, who worked across the hall from Ritchie at Bell Labs for 20 years, said it plainly: "The browsers are written in C. The Unix kernel that the entire internet runs on is written in C. Web servers are written in C, and if they're not, they're written in Java or C++, which are C derivatives, or Python or Ruby, which are implemented in C."

Ritchie won the Turing Award in 1983. He won the National Medal of Technology in 1998, presented by President Clinton. He was head of System Software Research at Bell Labs for decades.

He answered emails from strangers with technical questions until the end of his life. His home address stayed listed in the phone book. His colleague Brian Kernighan, who co-authored the definitive C textbook with him, said Ritchie was a private person who did no self-salesmanship. That was not false modesty. It was just who he was.

He died on October 12, 2011, at his home in Berkeley Heights, New Jersey. He was 70. He had been ill for some time. The world did not notice until Rob Pike posted a quiet announcement on Google+, and the news spread through the programming community in hushed tones.

No front pages. No tributes from heads of state. No candlelight vigils outside corporate campuses.

The device you are reading this on runs code that traces directly back to what he built. So does the server that delivered it to you. So does the browser or app you opened to get here.

Most people will never know his name.

The ones who built everything you use every day do.

846

770

86K

genupula retweeted

Matt Dancho (Business Science)

@mdancho84

13 days ago

🚨BREAKING: Google just DROPPED a masterclass on GPUs Get it here 100% free:

647

104

813

25K

genupula retweeted

Vivek Galatage

@vivekgalatage

13 days ago

Dive Into Systems - free online book diving into systems engineering. The chapter on code optimization talks about various compiler flags and the respective work done. https://t.co/0nuYiNiC1O

288

329

genupula retweeted

Vivek Galatage

@vivekgalatage

19 days ago

I was revisiting my archive of articles and found this awesome series about code reviews by Arne Mertz - great read. https://t.co/gKUpQQbrTh

genupula retweeted

Ihtesham Ali

@ihtesham2005

18 days ago

A Google engineer named Lee Boonstra wrote down everything she knew about prompting in one 68-page document, and Google gave it away for free instead of selling it. Link is in the comments. Download it

123

130

genupula retweeted

Dan Kornas

@DanKornas

18 days ago

For deeper Machine Learning Foundations study, this YouTube playlist gives you the sequence in one place.

Good save when you want the path, not a one-off video: Ep #1 - What is ML? → Machine Learning Foundations.

𝗙𝗼𝘂𝗻𝗱𝗮𝘁𝗶𝗼𝗻𝘀:
↳ Machine Learning Foundations: Ep #1 - What is ML?
↳ Computer vision by building a neural network with TensorFlow | Machine Learning Foundations
↳ Machine Learning Foundations: Ep #3 - Convolutions and pooling
↳ Machine Learning Foundations: Ep #4 - Coding with Convolutional Neural Networks
↳ Real-world image classification using convolutional neural networks | Machine Learning Foundations

Best use: treat it as a map of the field. Watch once for the arc, then revisit the parts where you need implementation depth.

Link is in the first comment 👇

♻️ Share this with your network if you found it useful or insightful.

175

168

genupula retweeted

Rohit Ghumare

@ghumare64

19 days ago

Learn everything at one place: https://t.co/DV12HanJ9a

323

403

21K

genupula retweeted

Ahmad

@TheAhmadOsman

18 days ago

Step-By-Step LLM Engineering Projects Roadmap

- Build a tokenizer
- Learn embeddings
- Implement RoPE / ALiBi
- Hand-wire attention
- Build MHA
- Build a Transformer block
- Train a mini-former
- Compare objectives
- Build sampling
- Speculative decoding
- KV cache
- MQA / GQA / MLA
- Long context
- FlashAttention
- Hardware budgets
- Toy MoE
- Sparse model trade-offs
- State-space / linear attention
- Diffusion language models
- Data pipelines
- Synthetic data
- Scaling laws
- SFT / DPO / RLHF / GRPO
- Quantization
- Serving stacks
- Eval harnesses
- RAG
- Tool use / agents
- Vision-language adapters
- Interpretability
- Red-team suite
- Full capstone model system

One request: Choose an Opensource AI lab when you make it

Opensource is where humanity gets to keep the tools

DM me when you've made it ;)

262

122K

genupula retweeted

Rohit Kumar Tiwari

@_rohit_tiwari_

19 days ago

AI Engineering from Scratch.

503 lessons. 20 phases. 320 hours.

https://t.co/UuX9N62VCU

Phase 00: Setup & Tooling (12 lessons)
Phase 01: Math Foundations (22 lessons)
Phase 02: ML Fundamentals (18 lessons)
Phase 03: Deep Learning Core (13 lessons)
Phase 04: Computer Vision (28 lessons)
Phase 05: NLP (29 lessons)
Phase 06: Speech & Audio (17 lessons)
Phase 07: Transformers Deep Dive (14 lessons)
Phase 08: Generative AI (14 lessons)
Phase 09: Reinforcement Learning (12 lessons)
Phase 10: LLMs from Scratch (22 lessons)
Phase 11: LLM Engineering (15 lessons)
Phase 12: Multimodal AI (25 lessons)
Phase 13: Tools & Protocols (23 lessons)
Phase 14: Agent Engineering (42 lessons)
Phase 15: Autonomous Systems (22 lessons)
Phase 16: Multi-Agent & Swarms (25 lessons)
Phase 17: Infrastructure & Production (28 lessons)
Phase 18: Ethics, Safety & Alignment (30 lessons)
Phase 19: Capstone Projects (85 lessons)

246

49K

genupula retweeted

KC Sivaramakrishnan @kc_srk

19 days ago

I'm offering "Functional Programming with OCaml" on the NPTEL platform in July 2026 sem. Enrollment is open now. The first 8 modules of the interactive book should be fairly stable. The rest is still in development. Sharing early in the spirit of building in the open.

202

114

15K
