Olivier Balais (overnetcity.bsky.social) @overnetcity - Twitter Profile

Olivier Balais (overnetcity.bsky.social) @overnetcity

over 1 year ago

Tested #Gemini 2.0 Flash to convert recordings+audio into specs for Windsurf — mind-blowing. Now hoping for native support :)

0

2

0

145

Olivier Balais (overnetcity.bsky.social) @overnetcity

over 1 year ago

A few weeks ago, I was blown away by @windsurf_ai IDE! Next step: a multimodal approach (video+text+audio) so we can "show" code while explaining logic—like a screen record in a tech issue.

1

2

0

149

Olivier Balais (overnetcity.bsky.social) @overnetcity

over 1 year ago

@dunglas Amazing! Thanks for the pointer, @dunglas! I checked the page you shared but couldn’t find any mention of the BoltDB you referenced. Is there any other documentation or guidance you could point me to?

1

0

33

Olivier Balais (overnetcity.bsky.social) @overnetcity

over 1 year ago

Hey @dunglas, hope all’s well! I’m exploring https://t.co/SjdW8AQTpm for handling SSE auto-retries on network failure. Does Mercure have a built-in feature to store/buffer events for client auto-retry, or is it something we typically manage on the backend?

1

0

218

Olivier Balais (overnetcity.bsky.social) @overnetcity

over 1 year ago

o3 just dropped and it’s a monumental leap in AI capabilities. From blazing code gen to near-human performance on the ARC AGI test (yes, that test), it changes what we thought AI could do. Exciting? Absolutely. Terrifying? A bit. Hardware is now the real bottleneck. Buckle up!

2

3

1

0

177

overnetcity retweeted

Pinecone

@pinecone

over 1 year ago

Our new models, fully integrated alongside our database, bring best-in-class retrieval to your applications:

✔️ Our new sparse embedding model (pinecone-sparse-english-v0) boosts performance for keyword-based queries, delivering up to 44% and on average 23% better NDCG@10 than BM25 on TREC.

✔️ Our new reranking model (pinecone-rerank-v0) improves search accuracy by up to 60% and on average 9% over industry-leading models on the BEIR benchmark.

✔️ @cohere's latest model (cohere-rerank-v3.5) balances performance and latency for a wide range of enterprise search applications.

Learn more by visiting our Model Gallery: https://t.co/AaUJqbB3WH

1

2

1

0

529

overnetcity retweeted

Pinecone

@pinecone

over 1 year ago

First-of-its-kind Pinecone Knowledge Platform Powers Best-in-class Retrieval for Customers 💠 Industry-leading vector database capabilities combined with proprietary AI models help developers build up to 48% more accurate AI applications: faster & easier https://t.co/Pz9GkmUUe8

0

8

3

1

1K

Olivier Balais (overnetcity.bsky.social) @overnetcity

over 1 year ago

Bye bye Twitter. Let’s meet on Bsky: https://t.co/6ciohU3wz2.

0

1

0

1

54

Olivier Balais (overnetcity.bsky.social) @overnetcity

over 1 year ago

I’ve been working and building products with generative AI for over 4 years now. It’s hard to impress me at this point. But damn, @windsurf_ai, your IDE is absolutely next-level! 🔥

0

1

0

1

278

overnetcity retweeted

Visual Studio Code

@code

over 1 year ago

New in Copilot Chat... enhanced links for any workspace symbols that Copilot mentions 🔗 These links appear in responses as little pills, letting you jump directly to definitions for better understanding.

13

579

51

131

78K

overnetcity retweeted

goosewin

@Goosewin

over 1 year ago

build failed

15

12K

672

457

419K

overnetcity retweeted

Jerry Liu

@jerryjliu0

over 1 year ago

Pretty excited about this new RAG technique I cooked up 🧑‍🍳

A top issue with RAG chunking is it splits the document into fragmented pieces, causing top-k retrieval to return partial context. Also most documents have multiple hierarchies of sections: top-level sections, sub-sections, etc.

This is also why lots of people are interested in exploring the idea of knowledge graphs - pulling in "links" to related pages to expand retrieved context.

This notebook lets you retrieve contiguous chunks without having to spend a lot of time tuning the chunking algorithm, thanks to GraphRAG-esque metadata tagging + retrieval. Tag chunks with sections, and use the section ID to expand the retrieved set.

Check it out https://t.co/mIolxuMT12

13

630

112

934

118K
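
The "tag chunks with sections, then expand by section ID" idea from the tweet above can be sketched in a few lines. This is a minimal illustration and not the code from the linked notebook: the `Chunk` class, the `section_id` field, and the `expand_by_section` helper are all hypothetical names, and the retrieved top-k list is assumed to come from whatever retriever is already in place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: int     # position of the chunk in the document
    section_id: str   # hypothetical section tag attached at indexing time
    text: str


def expand_by_section(retrieved, all_chunks):
    """Return every chunk that shares a section with a retrieved chunk, in document order."""
    wanted = {c.section_id for c in retrieved}
    return sorted(
        (c for c in all_chunks if c.section_id in wanted),
        key=lambda c: c.chunk_id,
    )


# Toy document: five chunks spread over three sections (made-up data).
chunks = [
    Chunk(0, "1", "Intro part A"),
    Chunk(1, "1", "Intro part B"),
    Chunk(2, "2.1", "Method part A"),
    Chunk(3, "2.1", "Method part B"),
    Chunk(4, "3", "Results"),
]

top_k = [chunks[3]]  # pretend the retriever returned only one fragment
expanded = expand_by_section(top_k, chunks)
print([c.chunk_id for c in expanded])
```

Here a single retrieved fragment from section "2.1" pulls in its sibling chunk, so the downstream prompt sees the contiguous section rather than a partial piece.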

overnetcity retweeted

Rohan Paul

@rohanpaul_ai

over 1 year ago

Text chunking now matches human reading patterns by detecting natural breaks in information flow.

Meta-chunking, proposed in this paper, uses probability patterns to find natural segment boundaries in documents, just like humans do

Original Problem 🎯:

Text chunking in Retrieval-Augmented Generation (RAG) systems often fails to maintain logical coherence between segments, leading to incomplete or fragmented information retrieval. Current methods rely on fixed-length splits or basic semantic similarity, missing crucial logical connections between sentences.

-----

Solution in this Paper ⚡:

• Meta-Chunking: A novel segmentation technique operating between sentence and paragraph levels

• Two key strategies:
- Margin Sampling: Uses LLMs for binary classification to determine segment boundaries based on probability differences
- Perplexity (PPL) Chunking: Analyzes perplexity distribution to identify natural text boundaries

• Dynamic combination approach to balance fine and coarse-grained segmentation

• KV caching mechanism for handling longer texts efficiently

-----

Key Insights 💡:

• Smaller models (1.5B parameters) can effectively perform chunking tasks

• PPL distribution characteristics guide optimal threshold selection

• Dynamic chunk sizing preserves logical integrity better than fixed-length approaches

• Re-ranking performance improves significantly with Meta-Chunking

-----

Results 📊:

• Outperforms similarity chunking by 1.32 on 2WikiMultihopQA while using only 45.8% processing time

• PPL Chunking with Qwen2-1.5B achieves 0.3760 BLEU-1 score on single-hop queries

• Maintains consistent performance across both Chinese and English datasets

• Shows 3.59% improvement in Hits@8 metric when combined with PPLRerank

5

410

65

356

24K
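
To make the perplexity-chunking idea in the tweet above more concrete, here is a simplified sketch. It is not the paper's implementation: it assumes per-sentence perplexity scores have already been computed by some language model, and it assumes a boundary is placed after any sentence whose perplexity is a local minimum that dips below both neighbours by more than a threshold. The function names, the threshold, and the sample values are all made up for illustration.

```python
def ppl_boundaries(ppl, threshold):
    """Indices i where ppl[i] is lower than both neighbours by more than threshold."""
    return [
        i
        for i in range(1, len(ppl) - 1)
        if min(ppl[i - 1], ppl[i + 1]) - ppl[i] > threshold
    ]


def chunk_sentences(sentences, ppl, threshold):
    """Group sentences into chunks, closing a chunk after each boundary sentence."""
    cuts = set(ppl_boundaries(ppl, threshold))
    chunks, current = [], []
    for i, sentence in enumerate(sentences):
        current.append(sentence)
        if i in cuts:
            chunks.append(current)
            current = []
    if current:  # keep whatever is left after the last boundary
        chunks.append(current)
    return chunks


sentences = ["s0", "s1", "s2", "s3", "s4", "s5"]
ppl = [9.0, 7.5, 3.0, 8.0, 8.5, 4.0]  # made-up perplexity values, one per sentence
print(chunk_sentences(sentences, ppl, threshold=2.0))
```

A larger threshold yields fewer, longer chunks; a smaller one splits more aggressively, which is the knob the tweet refers to when it says the PPL distribution guides threshold selection.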

Olivier Balais (overnetcity.bsky.social) @overnetcity

over 1 year ago

Fear on the bike lanes: fat tires and ringing bells, the threat of the "SUV-ization of cycling", via @lemondefr https://t.co/jZKXMoKylC

0

1

0

81

overnetcity retweeted

Marko Denic

@denicmarko

over 1 year ago

It's true.

109

34K

2K

3K

1M

overnetcity retweeted

Yann LeCun

@ylecun

over 1 year ago

I don't want to say "I told you so", but I told you so. Repeatedly. I was heavily criticized for it. Mocked, even.

132

2K

172

335

339K

overnetcity retweeted

Rohan Paul

@rohanpaul_ai

over 1 year ago

This project tries to implement a real-time replication of OpenAI’s groundbreaking O1 model.

Exploring advanced reasoning capabilities and a specific "journey learning" mechanism for AI.

They propose a new approach: “journey learning”. This paradigm goes beyond the traditional focus on specific tasks and emphasizes continuous progress through learning, reflection, and adaptation.

5

168

28

127

19K

Olivier Balais (overnetcity.bsky.social) @overnetcity

over 1 year ago

🤯

Brad Costanzo

@BradCostanzo

over 1 year ago

Wow! @HeyGen_Official just released today the ability to have an AI avatar join a Zoom meeting and interact. I invited one of their AI avatars into a Zoom room and recorded this clip. Time to build my own now

65

1K

183

1K

239K

0

1

0

93

