Ioannis Papapanagiotou

@ipapapa

Engineering @Google | ex-industry: @SnowflakeDB @Netflix @IBM | ex-academia: @Purdue @NCState

California, USA

Joined March 2010

650 Following

610 Followers

1.5K Posts

ipapapa retweeted

Ruslan Meshenberg @rusmeshenberg

9 days ago

An overview of how we're integrating AI in the reliability tech and practices - https://t.co/lmKpcENaje credit: @ipapapa

Ioannis Papapanagiotou @ipapapa

5 months ago

Benchmarked: H100 vs Gemini Flash. Result: H100 choked on prefill. Gemini flew. Without complex disaggregation, a single H100 is just an expensive heater. https://t.co/74t8Lwh4ex

Ioannis Papapanagiotou @ipapapa

5 months ago

@karanjagtiani04 Holding up well! After the first query warms the cache, we're seeing ~98% cost reduction. The catch: your prefix (system prompt + context) needs to stay consistent. Change it, cache resets. Quality improved too - full context catches cross-references that chunked RAG misses.
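The "consistent prefix" point can be sketched as follows: keep the system prompt and documents in a fixed order at the front of the prompt and put the changing question last. This is only an illustration. `build_prompt` and `prefix_fingerprint` are hypothetical helpers, and the thread does not describe how the provider matches cached prefixes internally.

```python
import hashlib


def build_prompt(system_prompt: str, documents: list[str], question: str) -> tuple[str, str]:
    """Return (prefix, full_prompt) with the stable part first and the question last."""
    # Sort the documents so the prefix does not depend on load order (an assumption
    # made here for determinism, not something the thread prescribes).
    prefix = system_prompt + "\n\n" + "\n\n".join(sorted(documents))
    return prefix, prefix + "\n\nQuestion: " + question


def prefix_fingerprint(prefix: str) -> str:
    """Hash the prefix so accidental changes to it are easy to spot in logs."""
    return hashlib.sha256(prefix.encode("utf-8")).hexdigest()


system = "You are a support assistant for technical manuals."
prefix_1, prompt_1 = build_prompt(system, ["manual B text", "manual A text"], "How do I reset the unit?")
prefix_2, prompt_2 = build_prompt(system, ["manual A text", "manual B text"], "What is the torque spec?")
print(prefix_fingerprint(prefix_1) == prefix_fingerprint(prefix_2))
```

Any edit to the system prompt or to a document changes the fingerprint, which is the situation the post describes as a cache reset.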

Ioannis Papapanagiotou @ipapapa

5 months ago

I've been recommending RAG to save costs. Time to reconsider.
Testing Gemini Flash 3 with 22 technical manuals + implicit caching (75% off repeated prefixes):
Full documents as system prompt became:
→ Cost-competitive with RAG
→ Higher quality responses

Ioannis Papapanagiotou @ipapapa

5 months ago

📑 Markdown/Header: Splits by structure
→ Respects semantic boundaries
→ Uneven chunk sizes - which affects retrieval consistency and costs.
All splitters have flaws - you're just choosing which flaw to accept.
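A minimal sketch of header-based splitting, assuming Markdown input with ATX-style `#` headers. `split_by_headers` is a hypothetical helper written for illustration, not the API of any particular splitter library, and it would also split on `#` lines inside fenced code blocks.

```python
import re


def split_by_headers(markdown: str) -> list[str]:
    """Start a new chunk at every line that begins with 1-6 '#' and a space."""
    # Zero-width split just before each header line; the header stays with its section.
    parts = re.split(r"(?m)^(?=#{1,6} )", markdown)
    return [p.strip() for p in parts if p.strip()]


doc = (
    "# Install\n"
    "Run the installer.\n"
    "\n"
    "## Torque specs\n"
    "M6: 10 Nm\n"
    "M8: 25 Nm\n"
    "M10: 49 Nm\n"
)
sections = split_by_headers(doc)
for s in sections:
    print(len(s), repr(s.splitlines()[0]))
```

Each section keeps its header and body together, and the printed lengths differ from section to section, which is the unevenness the post refers to.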

Ioannis Papapanagiotou @ipapapa

5 months ago

Choosing a text splitter for RAG?
📏 Character: Splits by count (e.g. 2000 chars)
→ Simple, predictable
→ Cuts mid-sentence, mid-table
🔢 Token (TikToken): Splits by LLM tokens
→ Better for context planning
→ Still arbitrary cuts
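A minimal sketch of the character splitter described above. `split_by_chars` is a hypothetical helper, and the `overlap` parameter is an addition for illustration that the post does not mention. A token splitter would follow the same pattern over a list of token IDs instead of characters.

```python
def split_by_chars(text: str, chunk_size: int = 2000, overlap: int = 0) -> list[str]:
    """Split text into fixed-size character chunks, optionally overlapping."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


sample = "Torque the bolt to 12 Nm. Then check the seal for leaks."
chunks = split_by_chars(sample, chunk_size=20)
for c in chunks:
    print(repr(c))
```

With a 20-character window the first chunk ends in the middle of the torque value, a small-scale version of the mid-sentence and mid-table cuts the post warns about.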

Ioannis Papapanagiotou @ipapapa

5 months ago

For technical support chatbots with stable system prompts? Full Documents + caching might be the better architecture. Caching changes the ROI calculation completely.

Ioannis Papapanagiotou @ipapapa

5 months ago

The trade-off isn't cost vs quality anymore. It's:
• Cold start cost (Full Docs higher)
• Query volume (Full Docs wins at scale)
• Context stability (Full Docs needs consistent prefix)
RAG retrieves fragments. Full Docs preserves relationships.
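The cold-start and query-volume factors above can be put into a rough input-token cost model. This is a sketch under stated assumptions: the function names, token counts and price are hypothetical placeholders, the 75% discount is the figure quoted in the thread, every query after the first is assumed to hit the cache, and output tokens are ignored.

```python
def full_docs_cost(n_queries: int, prefix_tokens: int, query_tokens: int,
                   price_per_token: float, cache_discount: float = 0.75) -> float:
    """Input cost when the full documents are sent as a cached prefix."""
    if n_queries <= 0:
        return 0.0
    # First query pays full price for the whole prefix (cold start).
    cold = (prefix_tokens + query_tokens) * price_per_token
    # Later queries pay the discounted rate on the prefix, full rate on the question.
    warm = (prefix_tokens * (1 - cache_discount) + query_tokens) * price_per_token
    return cold + (n_queries - 1) * warm


def rag_cost(n_queries: int, retrieved_tokens: int, query_tokens: int,
             price_per_token: float) -> float:
    """Input cost when only retrieved chunks are sent with each query."""
    return n_queries * (retrieved_tokens + query_tokens) * price_per_token


PRICE = 1e-6  # hypothetical price per input token, not a real rate
for n in (1, 10, 1000):
    print(n, full_docs_cost(n, 200_000, 200, PRICE), rag_cost(n, 8_000, 200, PRICE))
```

In this simplified model the per-query gap depends on how the discounted prefix compares with the retrieved context. It leaves out embedding, reranking and infrastructure costs on the RAG side, as well as any difference in answer quality.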

Ioannis Papapanagiotou @ipapapa

6 months ago

Retrieval: If you are dealing with non-English technical RAG, don't struggle with generic local models. The specialized multilingual APIs are worth every millisecond of latency saved.⚡️ #AI #RAG

Ioannis Papapanagiotou @ipapapa

6 months ago

Running a high-performance multilingual reranker on a 1 vCPU instance is a recipe for latency. The local ms-marco model was "good enough," but I needed "perfect" handling of foreign technical terminology without crashing the server.

Ioannis Papapanagiotou @ipapapa

6 months ago

The new reranker is so effective it retrieved too many similar manuals. I countered this with a System Prompt "Phase Filter" that explicitly instructs the LLM to verify requirements against the retrieved specs.

Ioannis Papapanagiotou @ipapapa

6 months ago

Model: Gemini 3.0 Pro (Preview)
Embeddings: gemini-embedding-001
Reranker: ms-marco-MiniLM (Local)
The accuracy difference is night and day. If you aren't reranking, you aren't really searching. 🚀 #AI #RAG #Engineering #Gemini #LLMs
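The reranking step in this stack can be sketched as a second pass that rescores the retrieved candidates against the query. The scorer below is a toy word-overlap function standing in for a cross-encoder such as the ms-marco model named above. `rerank` and `overlap_score` are hypothetical names, not the pipeline's actual code.

```python
from typing import Callable


def rerank(query: str, candidates: list[str],
           score: Callable[[str, str], float], top_k: int = 3) -> list[str]:
    """Reorder first-stage candidates by a query-document relevance score."""
    ranked = sorted(candidates, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:top_k]


def overlap_score(query: str, doc: str) -> float:
    """Stand-in scorer: fraction of query words that appear in the document."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0


docs = [
    "general pump definition",
    "pump model X200 torque settings",
    "warranty terms",
]
print(rerank("X200 torque settings", docs, overlap_score, top_k=2))
```

Swapping in a real model only means passing a different `score` callable. The retrieval stage that produces `candidates` is left untouched.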

Ioannis Papapanagiotou @ipapapa

6 months ago

Just spent the weekend debugging a high-precision RAG pipeline for technical manuals. 🛠️

Ioannis Papapanagiotou @ipapapa

6 months ago

The LLM kept "lazy-loading" generic definitions from general category files instead of reading the specific manuals. I injected a "Critical Instruction" into the System Prompt. Result: It now ignores the easy answer and digs for the specific brands.
