Paritosh

@paritosh_91

An Opensource enthusiast | Data Engineer | Spark/Scala/ML

Joined April 2017

366 Following

46 Followers

506 Posts

Pinned Tweet

Paritosh @paritosh_91

7 months ago

AGI in Q4 2026.

Paritosh @paritosh_91

about 1 month ago

@ItsJustPandey Work on isolation exercises. Arms (bies, tries, forarms), back and shoulders (traps,upper and mis,lats,lower) and most importantly core

287

Paritosh @paritosh_91

3 months ago

Give clown a crown and watch the palace turn into a circus.

paritosh_91 retweeted

Narendra Modi

@narendramodi

3 months ago

Received a call from President Trump and had a useful exchange of views on the situation in West Asia. India supports de-escalation and restoration of peace at the earliest. Ensuring that the Strait of Hormuz remains open, secure and accessible is essential for the whole world. We agreed to stay in touch regarding efforts towards peace and stability. @realDonaldTrump @POTUS

158K

17K

52M

Who to follow

Rallapalli Goutham Deekshith

De profesión “DFIR Analyst” e Ingeniera de Teleco porque decir “cotilla profesional” queda feo

wtfmugen?🇪🇸🇦🇷🇳🇴

@wtfwasif

Paritosh @paritosh_91

3 months ago

@FlyingHammer33 @avarakai For once I agree

133

Paritosh @paritosh_91

3 months ago

Looking at the market crash that happened today, it’s going to be a rough few weeks for the global economy. Once the emotions settle in we may again be at the beginning of the start of another Bull run. Let’s watch.

paritosh_91 retweeted

ollama

@ollama

4 months ago

❤️ We are partnering with @MiniMax_AI to give Ollama users free usage of MiniMax M2.5 for the next couple of days! ollama run minimax-m2.5:cloud Use MiniMax M2.5 with OpenCode, Claude Code, Codex, OpenClaw via ollama launch! OpenCode: ollama launch opencode --model minimax-m2.5:cloud Claude: ollama launch claude --model minimax-m2.5:cloud

ollama's tweet photo. ❤️ We are partnering with @MiniMax_AI to give Ollama users free usage of MiniMax M2.5 for the next couple of days!

ollama run minimax-m2.5:cloud

Use MiniMax M2.5 with OpenCode, Claude Code, Codex, OpenClaw via ollama launch!

OpenCode:
ollama launch opencode --model minimax-m2.5:cloud

Claude:
ollama launch claude --model minimax-m2.5:cloud

152

346

440K

paritosh_91 retweeted

Neo Kim

@systemdesignone

4 months ago

If you want to become good at system design, learn these 19 case studies (save this): 1 How Stock Exchange Works: ↳ https://t.co/iFNSX9TM9O 2 How Payment System Works: ↳ https://t.co/ARiLxGR43G 3 How YouTube Works: ↳ https://t.co/kHk3g6jz6t 4 How Google Docs Works: ↳ https://t.co/W57IkAjXpT 5 How Kafka Works: ↳ https://t.co/8rOy9KgCMo 6 How URL Shortener Works: ↳ https://t.co/tGndgdhH0V 7 How WhatsApp Works: ↳ https://t.co/VScq8QwHMr 8 How Airbnb Works: ↳ https://t.co/Bi5SAjfv5S 9 How Spotify Works: ↳ https://t.co/BxrH3oHIFS 10 How Slack Works: ↳ https://t.co/eIo29uOQOJ 11 How Reddit Works: ↳ https://t.co/o6Pw2hhj3T 12 How Bluesky Works: ↳ https://t.co/2rLYlRlky0 13 How Tinder Works: ↳ https://t.co/4E1zfgfvlw 14 How Twitter Timeline Works: ↳ https://t.co/pF2RYmPaIG 15 How Uber Finds Nearby Drivers: ↳ https://t.co/kJ2t8dtmch 16 How Amazon Lambda Works: ↳ https://t.co/lx0BjeSRZt 17 How Amazon S3 Works: ↳ https://t.co/iReWAEHwmj 18 How Do Apple AirTags Work: ↳ https://t.co/upWcgsXwKh 19 How LLMs Like ChatGPT Actually Work: ↳ https://t.co/5lCKxq2g4N What else should make this list? —— 👋 PS - Want my System Design Playbook for FREE? Join my newsletter with 200K+ software engineers right now: → https://t.co/ByOFTtOihX ——— 💾 Save this for later & RT to help other software engineers ace system design. 👤 Follow @systemdesignone + turn on notifications.

335

108K

paritosh_91 retweeted

Andrej Karpathy

@karpathy

4 months ago

New art project. Train and inference GPT in 243 lines of pure, dependency-free Python. This is the *full* algorithmic content of what is needed. Everything else is just for efficiency. I cannot simplify this any further. https://t.co/HmiRrQugnP

647

25K

29K

paritosh_91 retweeted

Tech with Mak

@techNmak

4 months ago

For 38 years, computer scientists believed Dijkstra's algorithm was optimal for sparse graphs. The logic seemed airtight: Dijkstra sorts vertices by distance. Sorting has a lower bound of O(n log n). Therefore shortest paths can't be faster. 5 researchers proved the assumption wrong. The trick: combine Dijkstra's priority queue with Bellman-Ford's dynamic programming. Divide and conquer on vertex sets. Shrink the frontier. Result: O(m log^(2/3) n) First improvement for directed graphs since Fibonacci heap in 1987. Tsinghua. Stanford. Max Planck. 17 pages.

techNmak's tweet photo. For 38 years, computer scientists believed Dijkstra's algorithm was optimal for sparse graphs.

The logic seemed airtight:

Dijkstra sorts vertices by distance.
Sorting has a lower bound of O(n log n).
Therefore shortest paths can't be faster.

5 researchers proved the assumption wrong.

The trick: combine Dijkstra's priority queue with Bellman-Ford's dynamic programming. Divide and conquer on vertex sets. Shrink the frontier.

Result: O(m log^(2/3) n)

First improvement for directed graphs since Fibonacci heap in 1987.

Tsinghua. Stanford. Max Planck. 17 pages.

263

236K

paritosh_91 retweeted

Md Ismail Šojal 🕷️

@0x0SojalSec

4 months ago

WTF DeepMind. 🤯 They built an AI that doesn't need RAG, and it has perfect memory of everything it's ever read. It's called Recursive Language Models, and it might mark the death of traditional context windows forever. - Everyone's been obsessed with context windows like it's a dick-measuring contest. We have 2M tokens! No, WE have 10M tokens! Cool. Your model still forgets everything past 100K. They call it "context rot" and every frontier model suffers from it. - RAG was supposed to save us. Just retrieve the relevant chunks, stuff them in the prompt, problem solved. Except RAG is fundamentally broken for anything that actually matters. It can't handle tasks where you need to look at multiple parts of a document simultaneously. - Recursive Language Models flip the entire script. Instead of shoving 10M tokens directly into the model, you load the prompt as a variable in a Python REPL. The model writes code to search, slice, and recursively call itself on relevant snippets. It's so obvious in hindsight. - RLMs let AI do exactly that. The prompt isn't processed linearly it's an environment the model navigates programmatically. - The results are absolutely disgusting: GPT-5 base model on multi-document research: 0% (literally couldn't fit it) GPT-5 with RLM: 91% On information-dense reasoning: Base: 0.04% RLM: 58% That's not incremental improvement. That's a different capability emerging. - What's wild is the models figured out their own strategies without being trained for this. They started using regex to filter context without reading it all. Breaking tasks into recursive sub-calls. Verifying answers by querying themselves again. Zero special training. Just emergent behavior. - The cost situation is actually better than you'd think. Yeah, some trajectories get expensive because the model keeps recursing. But the median RLM run is cheaper than the base model. Why? Because it only reads what it needs instead of ingesting 10M tokens upfront. - This completely changes what's possible: - Analyzing entire legal codebases - Understanding million-line repositories - Synthesizing hundreds of research papers - Processing years of medical records All of these were theoretically possible but practically broken. Not anymore. - The researchers tested on GPT-5 and Qwen3-Coder with zero modification. No fine-tuning. No special training. Just give them a REPL environment and recursive access. That means you can use this approach right now with existing models. - Here's the kicker: current models are terrible at this compared to what's possible. They make dumb decisions. Repeat work. Sometimes compute the right answer and then... ignore it. Imagine explicitly training models to think recursively. We're still at the starting line. - Everyone's been focused on the wrong problem. The question isn't "how do we cram more tokens into context windows" It's "how do we let AI intelligently navigate unbounded information." RLMs just proved you don't need bigger windows. You need smarter navigation. paper : https://t.co/8mcm19V2E1

0x0SojalSec's tweet photo. WTF DeepMind. 🤯

They built an AI that doesn't need RAG, and it has perfect memory of everything it's ever read.

It's called Recursive Language Models, and it might mark the death of traditional context windows forever.

- Everyone's been obsessed with context windows like it's a dick-measuring contest.
We have 2M tokens! No, WE have 10M tokens!
Cool. Your model still forgets everything past 100K. They call it "context rot" and every frontier model suffers from it.

- RAG was supposed to save us.
Just retrieve the relevant chunks, stuff them in the prompt, problem solved.

Except RAG is fundamentally broken for anything that actually matters.

It can't handle tasks where you need to look at multiple parts of a document simultaneously.

- Recursive Language Models flip the entire script.

Instead of shoving 10M tokens directly into the model, you load the prompt as a variable in a Python REPL.
The model writes code to search, slice, and recursively call itself on relevant snippets.

It's so obvious in hindsight.

- RLMs let AI do exactly that.

The prompt isn't processed linearly it's an environment the model navigates programmatically.

- The results are absolutely disgusting:
GPT-5 base model on multi-document research: 0% (literally couldn't fit it)
GPT-5 with RLM: 91%
On information-dense reasoning:
Base: 0.04%
RLM: 58%

That's not incremental improvement.

That's a different capability emerging.
- What's wild is the models figured out their own strategies without being trained for this.

They started using regex to filter context without reading it all. Breaking tasks into recursive sub-calls. Verifying answers by querying themselves again.

Zero special training.

Just emergent behavior.
- The cost situation is actually better than you'd think.
Yeah, some trajectories get expensive because the model keeps recursing.

But the median RLM run is cheaper than the base model.
Why? Because it only reads what it needs instead of ingesting 10M tokens upfront.

- This completely changes what's possible:
- Analyzing entire legal codebases
- Understanding million-line repositories
- Synthesizing hundreds of research papers
- Processing years of medical records
All of these were theoretically possible but practically broken. Not anymore.
- The researchers tested on GPT-5 and Qwen3-Coder with zero modification.

No fine-tuning.
No special training.
Just give them a REPL environment and recursive access.
That means you can use this approach right now with existing models.

- Here's the kicker: current models are terrible at this compared to what's possible.
They make dumb decisions. Repeat work. Sometimes compute the right answer and then... ignore it.

Imagine explicitly training models to think recursively. We're still at the starting line.

- Everyone's been focused on the wrong problem.
The question isn't "how do we cram more tokens into context windows"
It's "how do we let AI intelligently navigate unbounded information."
RLMs just proved you don't need bigger windows.

You need smarter navigation.

paper : https://t.co/8mcm19V2E1

244

195K

paritosh_91 retweeted

Tom Dörr

@tom_doerr

5 months ago

Python library to retrieve data from TradingView screener https://t.co/C3dJ7yqBRx

392

271K

paritosh_91 retweeted

Avi Chawla

@_avichawla

5 months ago

Vector search is not always the answer. A 30-year-old algorithm with zero training, zero embeddings, and zero fine-tuning still powers Elasticsearch, OpenSearch, and most production search systems today. It's called BM25, and it's worth understanding why it refuses to die. Let's say you're searching for "transformer attention mechanism" in a library of ML papers. BM25 scores documents using three core ideas: 1) Word rarity matters more than word frequency Every paper contains "the" and "is" so those words carry no signal. But "transformer" is specific and informative, so BM25 gives it a much higher weight. In the formula, this is captured by IDF(qᵢ). 2) Repetition helps, but with diminishing returns If "attention" appears 10 times in a paper, that's a strong relevance signal. But the jump from 10 to 100 occurrences barely moves the score. BM25 applies a saturation curve controlled by f(qᵢ, D) and the parameter k₁, preventing keyword stuffing from gaming the results. 3) Document length gets normalized A 50-page paper will naturally contain more keyword hits than a 5-page paper. BM25 adjusts for this using |D|/avgdl, controlled by parameter b, so longer documents don't dominate the rankings just because they have more text. Three ideas. No neural networks. No training data. Just elegant math that has stood the test of time. Here's the part most people overlook: BM25 excels at exact keyword matching, which is something embeddings genuinely struggle with. When a user searches for "error code 5012" vector search might return semantically similar error codes. BM25 will surface the exact match every time. This is exactly why hybrid search has become the default in top RAG systems. Combining BM25 with vector search gives you semantic understanding AND precise keyword matching in a single pipeline. So before you throw GPUs at every search problem, consider that BM25 might already solve it, or at the very least, make your semantic search significantly better when the two are combined.

_avichawla's tweet photo. Vector search is not always the answer.

A 30-year-old algorithm with zero training, zero embeddings, and zero fine-tuning still powers Elasticsearch, OpenSearch, and most production search systems today.

It's called BM25, and it's worth understanding why it refuses to die.

Let's say you're searching for "transformer attention mechanism" in a library of ML papers.

BM25 scores documents using three core ideas:

1) Word rarity matters more than word frequency

Every paper contains "the" and "is" so those words carry no signal.

But "transformer" is specific and informative, so BM25 gives it a much higher weight. In the formula, this is captured by IDF(qᵢ).

2) Repetition helps, but with diminishing returns

If "attention" appears 10 times in a paper, that's a strong relevance signal. But the jump from 10 to 100 occurrences barely moves the score.

BM25 applies a saturation curve controlled by f(qᵢ, D) and the parameter k₁, preventing keyword stuffing from gaming the results.

3) Document length gets normalized

A 50-page paper will naturally contain more keyword hits than a 5-page paper.

BM25 adjusts for this using |D|/avgdl, controlled by parameter b, so longer documents don't dominate the rankings just because they have more text.

Three ideas. No neural networks. No training data. Just elegant math that has stood the test of time.

Here's the part most people overlook: BM25 excels at exact keyword matching, which is something embeddings genuinely struggle with.

When a user searches for "error code 5012" vector search might return semantically similar error codes. BM25 will surface the exact match every time.

This is exactly why hybrid search has become the default in top RAG systems.

Combining BM25 with vector search gives you semantic understanding AND precise keyword matching in a single pipeline.

So before you throw GPUs at every search problem, consider that BM25 might already solve it, or at the very least, make your semantic search significantly better when the two are combined.

167

87K

paritosh_91 retweeted

ollama

@ollama

5 months ago

ollama pull glm-ocr All local. You own your data. GLM-OCR delivers state-of-the-art performance for document understanding. Use it for recognizing text, tables, and figures, or output to a specific JSON format. Drag and drop images into the terminal, script it or access via Ollama's API.

ollama's tweet photo. ollama pull glm-ocr

All local. You own your data.

GLM-OCR delivers state-of-the-art performance for document understanding.

Use it for recognizing text, tables, and figures, or output to a specific JSON format.

Drag and drop images into the terminal, script it or access via Ollama's API.

177

162K

paritosh_91 retweeted

Z.ai @Zai_org

5 months ago

Introducing GLM-OCR: SOTA performance, optimized for complex document understanding. With only 0.9B parameters, GLM-OCR delivers state-of-the-art results across major document understanding benchmarks, including formula recognition, table recognition, and information extraction. Weights: https://t.co/vqIBgBCXYi Try it: https://t.co/Ld7H8Pasls API: https://t.co/xVLNG0XSfP

Zai_org's tweet photo. Introducing GLM-OCR: SOTA performance, optimized for complex document understanding.

With only 0.9B parameters, GLM-OCR delivers state-of-the-art results across major document understanding benchmarks, including formula recognition, table recognition, and information extraction.

Weights: https://t.co/vqIBgBCXYi
Try it: https://t.co/Ld7H8Pasls
API: https://t.co/xVLNG0XSfP

110

419

553K

paritosh_91 retweeted

atul

@atullchaurasia

5 months ago

Research papers you must read for AI Engineer interviews - 1. Attention is all you need (Transformers) 2. LoRA (Low rank adaption) 3. PEFT ( Parameter Efficient Fine Tuning) 4. VIT (Vision Transformers) 5. VAE (Variational Auto Encoder) 6. GANs ( Generative Adversarial Networks) 7. BERT ( Bidirectional Encoder Representation from Transformers) 8. Diffusion Models (Stable Diffusion) 9. RAG (Retrieval Augment Generation) 10. GPT (Generative Pre-trained Transformers)

418

143K

paritosh_91 retweeted

Abhishek Singh

@0xlelouch_

5 months ago

As a Backend Engineer in 2026 aiming for Staff, please learn: 1. One language deeply (Go/Rust/Java) Not “I can write APIs”, but runtime model, memory, concurrency, profiling, GC behavior (if any), and how to read stack traces like a native. 2. Data modeling and storage fundamentals Relational modeling, constraints, isolation levels, indexes, query plans, locks, deadlocks, migrations, backup/restore, partitioning. Most “scaling” problems are schema + query shape problems. 3. Distributed systems basics that actually show up in prod Consistency vs availability, timeouts, retries, idempotency, backpressure, message ordering, leader election, clock skew, eventual consistency, and what happens during partial failures. 4. API design and contracts Versioning, pagination, filtering, error models, idempotency keys, rate limits, backwards compatibility, and how to avoid breaking mobile clients for months. 5. Performance and capacity engineering Latency budgets (p50/p95/p99), tail latency causes, load testing, queueing theory intuition, connection pools, CPU vs IO bound, and capacity planning with real numbers. 6. Reliability engineering SLOs/SLIs, incident response, postmortems, alerting that does not spam, error budgets, graceful degradation, feature flags, circuit breakers, bulkheads. 7. Observability like a pro Structured logs, metrics, tracing, correlation ids, RED/USE metrics, sampling strategies, and how to debug “it is slow sometimes” without just guessing. 8. Security fundamentals AuthN/AuthZ, least privilege, secrets management, token expiry, OWASP basics, SSRF, injection, secure defaults, audit logs, threat modeling for your own services. 9. Messaging and async systems Kafka/Rabbit/SQS semantics, at-least-once vs exactly-once (and why “exactly once” is mostly a marketing term), consumer groups, retries, DLQs, replay, dedupe. 10. Caching with correctness Cache invalidation strategies, TTLs, stampede protection, read-through/write-through, negative caching, and when caching makes bugs harder than latency. 11. Infrastructure literacy Linux basics, networking (DNS, TCP, TLS), containers, k8s concepts, autoscaling, deployment strategies (blue/green, canary), and what your cloud bill is really paying for. 12. System design, but with tradeoffs Designing is picking pain. Learn to write down constraints, failure modes, data growth, and operational cost. Staff is judged on tradeoffs, not diagrams. 13. Codebase leadership Design docs, RFCs, review quality, mentorship, aligning teams, reducing complexity, owning a subsystem end-to-end, making boring systems that do not wake people at 2am. 14. Pick ONE domain to go deep Payments, search, streaming, identity, infra, data platform, etc. Staff engineers are “the person for a hard area”, not generic API writers. Stop hopping stacks every month. Pick a lane, build proof of reliability, and become the person people call when prod is on fire. That is Staff.

506

548K

Paritosh @paritosh_91

6 months ago

@geekyranjit How is the charging network and availability in Hyderabad? Did you have any issues with the battery life or performance?

Paritosh @paritosh_91

6 months ago

@thetylerdaniel Pretty sure the grim reaper is laughing his ass off reading this and watch Bryan go through all the circus!

274

Paritosh @paritosh_91

6 months ago

Let them think you are loosing , avoid that tempation of showing people what God is doing in your life

Paritosh @paritosh_91

6 months ago

The future belongs to the engineer who treats LLMs as an extension of their own mind not a replacement for it.

Paritosh

@paritosh_91

Who to follow

Last Seen Users on Sotwe

Trends for you

Most Popular Users