Alien Fumeta @fumetalien - Twitter Profile

Alien Fumeta @fumetalien

7 days ago

120 CI 0 Interviews passed Over 9000 retarded shit said

0

2

fumetalien retweeted

aditya

@adxtyahq

20 days ago

“design a RAG pipeline for 10M docs with zero hallucination”

apparently this was asked in a Google L5 interview round. came across it somewhere on the internet and honestly it’s a way more interesting system design problem than most classic distributed systems questions

1. ingest + normalize docs
- remove duplicates, standardize formats, extract metadata, maintain version history

2. hybrid retrieval (BM25 + embeddings)
- BM25 handles exact keyword matching while embeddings capture semantic meaning
- semantic search alone usually struggles with precision at massive scale

3. ANN retrieval + reranking
- ANN (approximate nearest neighbor) quickly pulls top candidate chunks from millions of docs
- then a reranker rescoring step improves relevance by deeply comparing query vs retrieved chunks

4. source confidence scoring
- every retrieved chunk gets scored based on freshness, trust level, overlap and retrieval consistency
- low-confidence context should never heavily influence generation

5. constrained generation
- the model is only allowed to answer using retrieved context (nothing new to be invented outside of the retrieved context)

6. citation-backed responses
- every major claim links back to exact chunks, documents or timestamps

7. hallucination fallback layer
- if retrieval confidence drops below a threshold: “insufficient evidence found”

8. continuous evals
- run adversarial queries, retrieval recall benchmarks and hallucination tests continuously

9. caching + memory layer
- cache high-frequency enterprise queries and retrieval paths (improves latency and output)

10. observability everywhere
- trace retrieval paths, chunk rankings, token attribution and failure points

Also at 10M docs, retrieval quality matters more than the frontier model itself.
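
A minimal sketch of steps 2 and 7 above, assuming the BM25 and embedding retrievers each return a ranked list of chunk IDs. Reciprocal rank fusion is one common way to merge such lists; the tweet does not say which fusion method is meant. The function names, the chunk IDs, the constant `k=60`, and the use of the fused score as a stand-in for "retrieval confidence" are all illustrative assumptions.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs into one scored ranking."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every chunk it returned.
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by chunk ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

def answer_or_fallback(fused, min_score, top_n=3):
    """Return the top chunk IDs, or the fallback message if evidence is weak."""
    if not fused or fused[0][1] < min_score:
        return "insufficient evidence found"
    return [chunk_id for chunk_id, _ in fused[:top_n]]

# Hypothetical retriever outputs (best match first).
bm25_ranking = ["doc7#2", "doc3#1", "doc9#4"]
embedding_ranking = ["doc3#1", "doc5#0", "doc7#2"]

fused = reciprocal_rank_fusion([bm25_ranking, embedding_ranking])
context = answer_or_fallback(fused, min_score=0.03)
```

Chunks that both retrievers return rise to the top, and a query whose best fused score stays under the threshold gets the fallback message instead of a generated answer. A real pipeline would put a reranker between these two steps and calibrate the threshold on evaluation data.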

87

3K

326

5K

195K

fumetalien retweeted

witcheer

@witcheer

20 days ago

when I started running local models a few weeks ago, I knew absolutely nothing.

what's GGUF. what's quantisation. why does my GPU run out of memory. what the hell is a MoE expert.

now I know maybe 1%. I am still massively ignorant about most of this. but that 1% feels like I learned a ridiculous amount.

I wrote this guide because it's exactly what I needed and couldn't find when I started. just "here's what each thing actually means and here's the command that works."

if you've been curious about local LLMs but felt intimidated by the setup, have a good read.

15

252

31

321

25K

fumetalien retweeted

Tech with Mak

@techNmak

21 days ago

These are literally the kind of LLM interview questions most candidates wish they had seen earlier.

A curated list of 50 LLM interview questions - shared by Hao Hoang.

What's covered:

Fundamentals:
→ Tokenization and why it matters
→ Attention mechanisms in transformers
→ Context windows and their tradeoffs
→ Embeddings and initialization
→ Positional encodings

Fine-tuning & Efficiency:
→ LoRA vs QLoRA
→ PEFT to prevent catastrophic forgetting
→ Model distillation
→ Adaptive Softmax for large vocabularies

Generation & Decoding:
→ Beam search vs greedy decoding
→ Temperature, top-k, top-p sampling
→ Autoregressive vs masked models

Advanced Concepts:
→ RAG (Retrieval-Augmented Generation)
→ Chain-of-Thought prompting
→ Mixture of Experts (MoE)
→ Knowledge graph integration
→ Zero-shot and few-shot learning

Math & Theory:
→ Softmax in attention
→ Cross-entropy loss
→ KL divergence
→ Gradient computation for embeddings
→ Vanishing gradient solutions in transformers

You don't need to follow me (@techNmak) and comment "LLM". I will put the link in the comments.
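
For the "Temperature, top-k, top-p sampling" item in the list above, here is a small pure-Python sketch of how the three knobs are commonly combined when decoding. It is an illustration under assumptions, not the linked question list's answer: the function names and the order of operations (temperature, then top-k, then top-p) are choices made here, and real inference libraries may differ.

```python
import math
import random

def filter_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token_index: probability} after temperature, top-k and top-p."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Temperature rescales logits: below 1 sharpens, above 1 flattens.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # numerically stable softmax
    total = sum(exps)
    probs = sorted(((p / total, i) for i, p in enumerate(exps)), reverse=True)
    # Top-k keeps only the k most likely tokens (0 disables the filter).
    if top_k > 0:
        probs = probs[:top_k]
    # Top-p keeps the smallest prefix whose cumulative mass reaches top_p.
    kept, cumulative = [], 0.0
    for p, i in probs:
        kept.append((p, i))
        cumulative += p
        if cumulative >= top_p:
            break
    norm = sum(p for p, _ in kept)
    return {i: p / norm for p, i in kept}

def sample(logits, rng=random, **kwargs):
    """Draw one token index from the filtered distribution."""
    dist = filter_distribution(logits, **kwargs)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]

logits = [2.0, 1.0, 0.0, -1.0]  # hypothetical scores for a 4-token vocabulary
```

With `top_k=1` the sampler reduces to greedy decoding, which ties this item to the "Beam search vs greedy decoding" entry in the same list.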

18

369

75

580

20K

fumetalien retweeted

Ahmad

@TheAhmadOsman

20 days ago

DROP EVERYTHING

The bible for running LLMs locally is now available online to read for free

Covers what to use on
- Laptop / edge / odd hardware
- Mac-first workflows
- Single RTX GPUs
- 2-4+ NVIDIA / CUDA GPUs
- General production serving
- Long-context / MoE / routing
- NVIDIA max performance
- Cluster orchestration

Software
- llama.cpp
- MLX / MLX-LM
- ExLlamaV2
- ExLlamaV3
- vLLM
- SGLang
- TensorRT-LLM
- NVIDIA Dynamo

You should read this, and if you cannot now then you most definitely wanna bookmark it for later

Local AI FTW

44

2K

240

4K

239K

fumetalien retweeted

CyrilXBT

@cyrilXBT

26 days ago

GITHUB JUST CREATED AN OFFICIAL CERTIFICATION FOR THE MOST IN-DEMAND DEVELOPER ROLE OF 2026.

It is called Agentic AI Developer.

GH-600.

And it is the first formal signal that running AI agent teams is now a recognized engineering discipline with a credential behind it.

Not a prompt engineer.

Not a vibe coder.

An Agentic AI Developer.

The person who operates, supervises, and integrates AI agents across the entire software development lifecycle.

The person who knows where agents fail in production.

The person who understands how to build autonomous workflows that do not introduce catastrophic failure modes into CI/CD pipelines.

The person every engineering team is going to need and almost none of them have right now.

GitHub certifying this role changes the hiring conversation permanently.

Before GH-600: "Do you work with AI agents?" is an interview question with no standard answer.

After GH-600: the credential tells the hiring manager exactly what you know and what you can do before the interview starts.

The engineers who get certified in the first wave of GH-600 will have a credential for a role that has more demand than supply for the next 3 to 5 years.

The engineers who wait until it is mainstream will be competing with everyone who moved first.

If you are already working with GitHub Copilot or building agent-driven workflows you are already doing this job.

GH-600 is how you prove it.

Bookmark this.

Follow @cyrilXBT for every AI certification worth your time the moment it drops.

132

7K

743

9K

1M

fumetalien retweeted

Xiuyu Li

@sheriyuo

26 days ago

The Hands-on Modern RL tutorial everyone has been waiting for is finally available in English🥳🥳🥳 PDF download link: https://t.co/LgeX8gXBqT

9

817

119

1K

57K

fumetalien retweeted

Akshay Shinde

@ConsciousRide

26 days ago

90% of LangGraph & Agent Framework interviews in 2026 are just these 10 concepts repeated:

14

986

95

2K

95K

fumetalien retweeted

Sebastian Raschka

@rasbt

27 days ago

A little talk on what we can learn from implementing LLM architectures from scratch in Python and PyTorch. And how I approach new open-weight models, compare them against reference implementations etc: https://t.co/crKd2l9xGg

23

995

167

820

70K

Alien Fumeta @fumetalien

26 days ago

@rucaraw You should just cum in front of the camera for all of us

0

59

fumetalien retweeted

Puneet Patwari

@system_monarch

about 1 month ago

As an AI Engineer. Please learn:

- Prompt caching & semantic caching tradeoffs
- KV cache management at scale
- Speculative decoding vs quantization
- RAG evaluation (RAGAS + human evals)
- Cost monitoring & hidden token leaks
- Agent guardrails & infinite loop detection
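
One way to picture the "agent guardrails & infinite loop detection" item is a small guard that an agent loop consults before each tool call. This is only a sketch under assumptions: the `LoopGuard` class, its parameters, and the stop-reason strings are hypothetical and not taken from any agent framework.

```python
from collections import Counter, deque

class LoopGuard:
    """Stops an agent that exceeds a step budget or keeps repeating an action."""

    def __init__(self, max_steps=20, window=6, max_repeats=3):
        self.max_steps = max_steps
        self.max_repeats = max_repeats
        self.recent = deque(maxlen=window)  # sliding window of recent actions
        self.steps = 0

    def check(self, tool_name, arguments):
        """Record one agent action; return a stop reason, or None to continue."""
        self.steps += 1
        if self.steps > self.max_steps:
            return "step budget exceeded"
        # Normalise arguments so the same call always yields the same key.
        action = (tool_name, repr(sorted(arguments.items())))
        self.recent.append(action)
        if Counter(self.recent)[action] >= self.max_repeats:
            return "repeated action detected"
        return None

guard = LoopGuard(max_steps=10, window=4, max_repeats=3)
reasons = [guard.check("search", {"q": "kv cache"}) for _ in range(3)]
```

The step budget also caps token spend, which connects this guard to the cost-monitoring item in the same list.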

28

2K

202

3K

93K

Alien Fumeta @fumetalien

about 1 month ago

@sanidadgob Definitivamente esto estaba calculado por los judíos. La venganza ha llegado rápido.

0

4

Alien Fumeta @fumetalien

about 1 month ago

I see the jews already took vengeance from Spain as they promised. The new #hantavirus events is definitely their doing.

0

3

0

85

fumetalien retweeted

Sebastian Raschka

@rasbt

about 1 month ago

Here is a 2nd batch of April architecture drops. What a month!
- Ant Ling 2.6 1T
- Minimax M2.7
- Xiaomi MiMo V2.5
- Poolside Laguna XS.2
- Tencent Hy3-preview
- IBM Granite 4.1 https://t.co/ILjjyZYeC5

20

841

117

358

41K

fumetalien retweeted

Julien V

@julesvcode

about 1 month ago

read this article if you're building an iOS app and it looks like AI slop

2

34

5

28

4K

Alien Fumeta @fumetalien

about 1 month ago

@athcanft How do you upload / schedule in TikTok and IG? Youtube has a nice API but those don't afaik.

0

80

Alien Fumeta @fumetalien

about 1 month ago

@CogniCarbon How do you check what "claims" are from those?

0

67

fumetalien retweeted

Yann LeCun

@ylecun

about 1 month ago

https://t.co/cfwFhKL5Aw

28

747

100

437

128K

fumetalien retweeted

shirish

@shiri_shh

about 1 month ago

we are so cooked 😭

these guys let Claude run wild on Wall St.

Look at this insider trades scanner it built in 4 mins that:

> reads every SEC filing where execs buy their own stock
> flags clusters where multiple execs buy at once
> emails me the top 3 trades every morning before the open
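
The "flags clusters where multiple execs buy at once" step could look roughly like the sketch below. The tweet does not show the actual scanner, so everything here is an assumption: the record layout `(ticker, exec_name, date)`, the 7-day window, the three-executive minimum, and the sample data are invented for illustration, and parsing real SEC filings is out of scope.

```python
from datetime import date, timedelta

def find_buy_clusters(purchases, window_days=7, min_execs=3):
    """Map each ticker to the executives who bought within one short window."""
    by_ticker = {}
    for ticker, name, day in purchases:
        by_ticker.setdefault(ticker, []).append((day, name))
    window = timedelta(days=window_days)
    clusters = {}
    for ticker, rows in by_ticker.items():
        rows.sort()
        start = 0
        for end in range(len(rows)):
            # Shrink the window until it spans at most window_days.
            while rows[end][0] - rows[start][0] > window:
                start += 1
            names = {name for _, name in rows[start:end + 1]}
            if len(names) >= min_execs:
                clusters.setdefault(ticker, set()).update(names)
    return {ticker: sorted(names) for ticker, names in clusters.items()}

# Hypothetical purchase records.
purchases = [
    ("ACME", "A", date(2025, 1, 2)),
    ("ACME", "B", date(2025, 1, 3)),
    ("ACME", "C", date(2025, 1, 6)),
    ("ACME", "D", date(2025, 3, 1)),
    ("BETA", "X", date(2025, 1, 2)),
    ("BETA", "X", date(2025, 1, 3)),
    ("BETA", "Y", date(2025, 2, 20)),
]
clusters = find_buy_clusters(purchases)
```

Counting distinct names rather than transactions keeps one executive making several purchases from being flagged as a cluster.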

115

3K

275

5K

617K
