nicococo @nicococo23 - Twitter Profile

3 months ago

A recent study looked at how open-source developers use MLOps frameworks. Surprising: teams rarely use them end-to-end. Instead, they assemble their own custom workflows. Interesting signal about how ML systems are really built in the wild 🦥 https://t.co/EBJfdlyha4

0

1

0

26

nicococo23 retweeted

Sebastian Raschka

@rasbt

3 months ago

https://t.co/7JkJxiUPv6

14

1K

192

1K

95K

nicococo @nicococo23

3 months ago

Claude now scans codebases for vulnerabilities and suggests fixes (currently a limited research preview). We've applied to participate with MLOX🛡️ Now the question is whether AI is really starting to challenge specialist security vendors. https://t.co/fbGGYz1pYc

0

29

nicococo23 retweeted

Akshay 🚀

@akshay_pachaar

4 months ago

Researchers built a new RAG approach that:
- does not need a vector DB.
- does not embed data.
- involves no chunking.
- performs no similarity search.

And it hit 98.7% accuracy on a financial benchmark (SOTA).

Here's the core problem with RAG that this new approach solves:

Traditional RAG chunks documents, embeds them into vectors, and retrieves based on semantic similarity. But similarity ≠ relevance.

When you ask "What were the debt trends in 2023?", a vector search returns chunks that look similar. But the actual answer might be buried in some Appendix, referenced on some page, in a section that shares zero semantic overlap with your query. Traditional RAG would likely never find it.

PageIndex (open-source) solves this. Instead of chunking and embedding, PageIndex builds a hierarchical tree structure from your documents, like an intelligent table of contents. Then it uses reasoning to traverse that tree.

For instance, the model doesn't ask: "What text looks similar to this query?" Instead, it asks: "Based on this document's structure, where would a human expert look for this answer?"

That's a fundamentally different approach with:
- No arbitrary chunking that breaks context.
- No vector DB infrastructure to maintain.
- Traceable retrieval to see exactly why it chose a specific section.
- The ability to see in-document references ("see Table 5.3") the way a human would.

But here's the deeper issue that it solves. Vector search treats every query as independent. But documents have structure and logic, like sections that reference other sections and context that builds across pages. PageIndex respects that structure instead of flattening it into embeddings.

Do note that this approach may not make sense in every use case since traditional vector search is still fast, simple, and works well for many applications. But for professional documents that require domain expertise and multi-step reasoning, this tree-based, reasoning-first approach shines.

For instance, PageIndex achieved 98.7% accuracy on FinanceBench, significantly outperforming traditional vector-based RAG systems on complex financial document analysis.

Everything is fully open-source, so you can see the full implementation in GitHub and try it yourself. I have shared the GitHub repo in the replies!

92

3K

323

5K

235K
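The tree-traversal idea in the tweet above can be illustrated with a minimal sketch. This is not PageIndex's API: `Section`, `retrieve`, and `keyword_overlap` are hypothetical names, and the keyword scorer is only a toy stand-in for the step where a real system would presumably ask a language model which section to open next.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Section:
    """One node of a table-of-contents-like tree (hypothetical structure)."""
    title: str
    summary: str
    text: str = ""
    children: List["Section"] = field(default_factory=list)


def keyword_overlap(query: str, section: Section) -> int:
    # Toy stand-in for model reasoning: count query words that appear
    # in the section's title or summary.
    query_words = set(query.lower().split())
    section_words = set((section.title + " " + section.summary).lower().split())
    return len(query_words & section_words)


def retrieve(
    root: Section,
    query: str,
    choose: Callable[[str, Section], int] = keyword_overlap,
) -> Tuple[Section, List[str]]:
    """Walk from the root to a leaf, recording the path so the choice is traceable."""
    node, path = root, [root.title]
    while node.children:
        node = max(node.children, key=lambda child: choose(query, child))
        path.append(node.title)
    return node, path


report = Section("Annual Report 2023", "full report", children=[
    Section("Business Overview", "products markets strategy",
            text="We sell widgets."),
    Section("Financial Statements", "balance sheet income debt", children=[
        Section("Income Statement", "revenue expenses profit",
                text="Revenue grew."),
        Section("Appendix C: Debt Schedule", "debt maturities 2023 borrowings",
                text="Long-term debt fell in 2023."),
    ]),
])

leaf, path = retrieve(report, "debt trends 2023")
print(" > ".join(path))
print(leaf.text)
```

The returned `path` is what makes the retrieval traceable: it records which section was chosen at each level, rather than returning an unexplained nearest-neighbour hit.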

nicococo @nicococo23

4 months ago

Our (rather technical) ICML paper just hit its 100th citation 🥳 https://t.co/WEdti5GpnP

0

35

nicococo23 retweeted

Tech with Mak

@techNmak

4 months ago

Don't pick a RAG architecture. Pick a use case.

General-purpose QA:
→ Standard RAG

Personal assistants, research aids:
→ Agentic RAG

Expert systems (medicine, law, engineering):
→ Graph RAG

Large projects with frequent updates:
→ Modular RAG

Chatbots with long-term context:
→ Memory-Augmented RAG

Image captioning, video summarization:
→ Multi-Modal RAG

Healthcare, sensitive data, cross-org platforms:
→ Federated RAG

Live reporting, financial tickers:
→ Streaming RAG

Search engines, virtual assistants:
→ ODQA RAG

Customer support chatbots:
→ Contextual Retrieval RAG

Legal, medical, educational tools:
→ Knowledge-Enhanced RAG + Domain-Specific RAG

Complex QA with lexical + semantic matching:
→ Hybrid RAG

Content creation requiring high accuracy:
→ Self-RAG

Research assistance in niche fields:
→ HyDE RAG

Analytical problem-solving, multi-turn dialogue:
→ Recursive / Multi-Step RAG

16 architectures. 16 sweet spots. Match accordingly.

27

759

167

739

33K

nicococo23 retweeted

Unsloth AI

@UnslothAI

4 months ago

You can now train LLMs in VS Code for free via Colab & Unsloth. We made a guide showing you how to connect any fine-tuning notebook in VS Code to a Colab runtime. Train locally or on a free Google Colab GPU. Guide: https://t.co/ZHJCb0oHQT GitHub: https://t.co/2kXqhhvLsb

32

1K

178

890

59K

nicococo23 retweeted

Z.ai @Zai_org

4 months ago

Presenting the GLM-5 Technical Report!
https://t.co/CGjxEISvFK

After the launch of GLM-5, we’re pulling back the curtain on how it was built. Key innovations include:
- DSA Adoption: Significantly reduces training and inference costs while preserving long-context fidelity
- Asynchronous RL Infrastructure: Drastically improves post-training efficiency by decoupling generation from training
- Agent RL Algorithms: Enables the model to learn from complex, long-horizon interactions more effectively
Through these innovations, GLM-5 achieves SOTA performance among open-source models, with particularly strong results in real-world software engineering tasks.

87

2K

341

1K

447K

nicococo23 retweeted

DAIR.AI

@dair_ai

4 months ago

A paper worth paying close attention to.

It presents Lossless Context Management (LCM), which reframes how agents handle long contexts.

It outperforms Claude Code on long-context tasks.

Recursive Language Models give the model full autonomy to write its own memory scripts. LCM takes that power back, handing it to a deterministic engine that compresses old messages into a hierarchical DAG while keeping lossless pointers to every original. Less expressive in theory, far more reliable in practice.

The results:

Their agent (Volt, on Opus 4.6) beats Claude Code at *every* context length from 32K to 1M tokens on the OOLONG benchmark. +29.2 points average improvement versus Claude Code's +24.7. The gap widens at longer contexts.

The implication is one we keep relearning from software engineering history: how you manage what the model sees may matter more than giving the model tools to manage it itself. Every agent framework shipping with "let the model figure it out" memory strategies may be building on the wrong abstraction entirely.

Paper: https://t.co/LtqS7pzmP4
Learn to build effective AI agents in our academy: https://t.co/LRnpZN7L4c

24

531

85

647

67K
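The idea in the tweet above, a fixed rule that compresses old messages while keeping pointers back to every original, can be sketched roughly as follows. This is not the paper's implementation: `ContextStore`, `keep_recent`, and `group_size` are hypothetical names, the summarizer is plain truncation where a real system would presumably use a model, and the structure built here is a simple tree, a special case of a DAG.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Node:
    node_id: int
    text: str                       # an original message or a summary
    sources: Tuple[int, ...] = ()   # nodes this summary compresses; empty for originals


class ContextStore:
    """Toy store: old entries are folded into summaries, originals are never deleted."""

    def __init__(self, keep_recent: int = 2, group_size: int = 2) -> None:
        if group_size < 2:
            raise ValueError("group_size must be at least 2 so that folding shrinks the view")
        self.nodes: Dict[int, Node] = {}
        self.active: List[int] = []   # ids the model would currently see
        self.keep_recent = keep_recent
        self.group_size = group_size

    def _add(self, text: str, sources: Tuple[int, ...] = ()) -> int:
        node = Node(len(self.nodes), text, sources)
        self.nodes[node.node_id] = node
        return node.node_id

    def append(self, text: str) -> None:
        self.active.append(self._add(text))
        # Fixed rule, no model involvement: fold the oldest entries
        # whenever too many are active.
        while len(self.active) - self.keep_recent >= self.group_size:
            old = tuple(self.active[: self.group_size])
            summary = "summary(" + "; ".join(self.nodes[i].text[:12] for i in old) + ")"
            self.active = [self._add(summary, old)] + self.active[self.group_size:]

    def view(self) -> List[str]:
        """The compressed context: summaries for old material, full text for recent."""
        return [self.nodes[i].text for i in self.active]

    def expand(self, node_id: int) -> List[str]:
        """Follow source pointers back to the original messages, in order."""
        node = self.nodes[node_id]
        if not node.sources:
            return [node.text]
        return [text for source in node.sources for text in self.expand(source)]


messages = [f"message {i}: details" for i in range(5)]
store = ContextStore()
for message in messages:
    store.append(message)

recovered = [text for i in store.active for text in store.expand(i)]
print(store.view())
print(recovered == messages)  # True: every original is still reachable
```

The point of the sketch is the split of responsibilities: the compaction rule is ordinary deterministic code, and nothing is lost because summaries only point at earlier nodes instead of replacing them.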

nicococo23 retweeted

Andrej Karpathy

@karpathy

4 months ago

I spent more test time compute and realized that my micrograd can be dramatically simplified even further. You just return local gradients for each op and get backward() to do the multiply (chaining) with global gradient from loss. So each op just expresses the bare fundamentals of what it needs to: the forward computation and the backward gradients for it.

Huge savings from 243 lines of code to just 200 (~18%).

Also, the code now fits even more beautifully to 3 columns and happens to break just right:

Column 1: Dataset, Tokenizer, Autograd
Column 2: GPT model
Column 3: Training, Inference

Ok now surely we are done.

90

3K

175

1K

267K
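The simplification described in the tweet above (each op returns only its local gradients, and `backward()` does the chain-rule multiplication) can be sketched as below. This is an illustrative reconstruction, not Karpathy's actual code; the attribute names `_children` and `_local_grads` are assumptions.

```python
import math


class Value:
    """Scalar autograd node storing its inputs and the local gradient w.r.t. each."""

    def __init__(self, data, children=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._children = children
        self._local_grads = local_grads

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        # d(a + b)/da = 1, d(a + b)/db = 1
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        # d(a * b)/da = b, d(a * b)/db = a
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def tanh(self):
        t = math.tanh(self.data)
        # d tanh(a)/da = 1 - tanh(a)^2
        return Value(t, (self,), (1.0 - t * t,))

    def backward(self):
        # Order the graph so every node comes after all of its inputs.
        topo, seen = [], set()

        def build(node):
            if node not in seen:
                seen.add(node)
                for child in node._children:
                    build(child)
                topo.append(node)

        build(self)
        self.grad = 1.0
        # The only place the chain rule appears: local gradient times
        # the gradient already accumulated at the output node.
        for node in reversed(topo):
            for child, local in zip(node._children, node._local_grads):
                child.grad += local * node.grad


x, y = Value(2.0), Value(3.0)
z = x * y + x
z.backward()
print(x.grad, y.grad)  # 4.0 2.0
```

Because no op defines its own backward closure, adding a new operation only requires stating its forward value and its local derivatives.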

nicococo23 retweeted

DAIR.AI

@dair_ai

4 months ago

https://t.co/XGHJE8uIIf

3

379

58

460

112K

nicococo23 retweeted

Unwind AI

@unwind_ai_

4 months ago

China's Alibaba just opensourced the SQLite of vector databases.

zvec runs as a library inside your app and is built for on-device RAG

no external server. no pinecone. no qdrant instance.

100% opensource. https://t.co/KFQ1JM4KXK

50

3K

346

3K

133K

nicococo23 retweeted

Peter Steinberger 🦞

@steipete

4 months ago

I'm joining @OpenAI to bring agents to everyone. @OpenClaw is becoming a foundation: open, independent, and just getting started.🦞 https://t.co/XOc7X4jOxq

4K

42K

4K

7K

6M

nicococo23 retweeted

Hugging Models

@HuggingModels

4 months ago

Qwen3.5 is here 🚀 397B params, just 17B active. Native multimodal agents for coding, reasoning, GUI + video. 200+ languages. Open weights. Real scale. The next frontier is open. 🔗 https://t.co/pQQOEDAXBi

20

1K

138

654

108K

nicococo @nicococo23

4 months ago

Monday update 🦥 MLOX now has a plugin system. Core stays lean. Extensions can be added modularly. Fewer forks, more flexibility. Still early but this feels like the right direction. #OpenSource #MLOps

0

19

nicococo23 retweeted

Carl Hendy @carlhendy

12 months ago

If you’re in media, this is worth a watch. Cloudflare handles ≈20% of global traffic, so when CEO Matthew Prince warns at Cannes that AI bots are reshaping the web, publishers need to adapt or risk being left behind.

83

2K

456

2K

1M

nicococo23 retweeted

Hugging Models

@HuggingModels

4 months ago

NVIDIA just dropped PersonaPlex-7B 🤯 A full-duplex voice model that listens and talks at the same time. No pauses. No turn-taking. Real conversation. 100% open source. Free. Voice AI just leveled up. https://t.co/YfzFQfBzMS

206

8K

1K

8K

1M

nicococo23 retweeted

Andi Marafioti

@andimarafioti

4 months ago

30x real-time speech-to-text in your browser. No installs. No servers. Just open the website.

39

1K

121

2K

105K

nicococo23 retweeted

Nav Toor

@heynavtoor

4 months ago

🚨BREAKING: Microsoft just solved the "Agent Loop" problem.

Agent Lightning is an open-source framework that lets agents learn from their own mistakes using Reinforcement Learning.

Your agent fails a task → Agent Lightning analyzes why → Updates the prompt automatically → Next run succeeds.

100% Opensource.

44

1K

188

2K

114K
