Alejandro Cantarero @a_drdro - Twitter Profile

Alejandro Cantarero @a_drdro

16 days ago

6/ More details on all of the above here 👇 https://t.co/fkYXFoiyzh

Alejandro Cantarero @a_drdro

16 days ago

I recently spent a few weeks using #ClaudeDesign and #ClaudeCode to redesign a mobile app I'd built in my spare time. Some of it worked well, some did not. Details in the thread.

Alejandro Cantarero @a_drdro

16 days ago

5/ #ClaudeCode and @ChatGPTapp are both pretty bad at taking a PNG of an icon and turning it into an SVG or SwiftUI vector format. I built a skill to handle this; it still isn't perfect, but the results are good after a few iterations. 👉 https://t.co/jOlVn8tzWt

a_drdro retweeted

The New Stack

@thenewstack

about 1 year ago

DataStax AI Platform Drives Enterprise Agents on NVIDIA Blackwell GPUs via @DataStax, by @a_drdro https://t.co/RThfDPudyf

Alejandro Cantarero @a_drdro

about 1 year ago

GAME-CHANGING news from @DataStax at #GTC25! 🚀 Introducing Astra DB Hybrid Search 💥 With @NVIDIA NeMo Retriever Text Reranking, Astra DB Hybrid Search boosts relevance by 45%, delivering smarter, more accurate AI responses. Learn more: https://t.co/hAYmDzWqWi

Alejandro Cantarero @a_drdro

over 1 year ago

Really cool development from @spyced to address concerns we've seen around latency in ColBERT implementations. Worth a look!

Jonathan Ellis

@spyced

over 1 year ago

ColBERT Live! is a library that makes your existing vector database smarter with ColBERT multi-vector search. Get the relevance of ColBERT while respecting your ACLs and integrating with your metadata filtering. And introducing query embedding pooling! Article in thread.

a_drdro retweeted

LangChain

@LangChain

about 2 years ago

⭐️How Physics Wallah Uses LangChain, DataStax Astra DB, Vector Search and RAG to Revolutionize Education Physics Wallah built an AI Guru - a personalized AI tutor application, to assist students with their academic and support queries. https://t.co/xu3uOAU91j

Alejandro Cantarero @a_drdro

about 2 years ago

Building GenAI apps just got 100x easier and more fun. ⛓🎉🤝 Langflow joins DataStax! Learn how Langflow’s visual framework for building RAG apps with LangChain helps developers get their wild new GenAI ideas into production — quick. https://t.co/kF6iwMwtcI

a_drdro retweeted

LangChain

@LangChain

over 2 years ago

@AstraDB Cassandra-based Vector DB 🌌 Explore the langchain-astradb integration package! Harness the serverless, vector-capable database built on Apache Cassandra for scalable NoSQL solutions enhanced with the power of vector similarity search. https://t.co/0NlMqatiqB

a_drdro retweeted

LlamaIndex 🦙

@llama_index

over 2 years ago

We’re excited to feature the AI Chatbot Starter 🤖🧰 - a web server powered by AstraDB (@DataStax) and @llama_index that you can easily spin up to chat over any web documentation 🌐📑 Easily use this as a standalone service or integrate it with your full-stack application. Set it up by simply setting a few credentials and ingesting your documents. A huge shoutout to the @Datastax team for creating this: check it out below 👇 Repo: https://t.co/yENP00DorQ

Alejandro Cantarero @a_drdro

over 2 years ago

@ThawabHAlsubaie @skalskip92 @ClementDelangue @llama_index $6 per month is insanely expensive. Compare to Astra DB (https://t.co/BaXLo9Rx4A), $0.25 per month for 1 GB. 24x more expensive.

Alejandro Cantarero @a_drdro

over 2 years ago

@ThawabHAlsubaie @skalskip92 @ClementDelangue OpenAI is pretty limited here. Only 20 documents. No control over how they are searched. Insanely expensive storage (about 20x most DBs). Good enough for some experiments but can’t see running a production app on this.

a_drdro retweeted

Greg Kamradt

@GregKamradt

over 2 years ago

Pressure Testing GPT-4-128K With Long Context Recall

128K tokens of context is awesome - but what's performance like?

I wanted to find out so I did a “needle in a haystack” analysis

Some expected (and unexpected) results

Here's what I found:

Findings:
* GPT-4’s recall performance started to degrade above 73K tokens
* Low recall performance was correlated with the fact to be recalled being placed at 7%-50% document depth
* If the fact was at the beginning of the document, it was recalled regardless of context length

So what:
* No Guarantees - Your facts are not guaranteed to be retrieved. Don’t bake the assumption that they will be into your applications
* Less context = more accuracy - This is well known, but when possible reduce the amount of context you send to GPT-4 to increase its ability to recall
* Position matters - Also well known, but facts placed at the very beginning and in the 2nd half of the document seem to be recalled better

Overview of the process:
* Use Paul Graham essays as ‘background’ tokens. With 218 essays it’s easy to get up to 128K tokens
* Place a random statement within the document at various depths. Fact used: “The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.”
* Ask GPT-4 to answer this question only using the context provided
* Evaluate GPT-4's answer with another model (gpt-4 again) using @langchain evals
* Rinse and repeat for 15x document depths between 0% (top of document) and 100% (bottom of document) and 15x context lengths (1K tokens to 128K tokens)

Next Steps To Take This Further:
* Iterations of this analysis were evenly distributed; it’s been suggested that doing a sigmoid distribution would be better (it would tease out more nuance at the start and end of the document)
* For rigor, one should do a key:value retrieval step. However, for relatability I did a San Francisco line within PG's essays.

Notes:
* While I think this will be directionally correct, more testing is needed to get a firmer grip on GPT-4's abilities
* Switching up the prompt will vary results
* 2x tests were run at large context lengths to tease out more performance
* This test cost ~$200 for API calls (a single call at 128K input tokens costs $1.28)
* Thank you to @charles_irl for being a sounding board and providing great next steps
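
The process described in the tweet above can be sketched roughly as follows. This is a minimal illustration, not the author's actual code: it counts whitespace-separated words as a stand-in for tokens, leaves out the model and evaluation calls entirely, and the function names (`linspace`, `build_context`, `evaluation_grid`, `estimated_input_cost`) are hypothetical. The per-1K price is derived only from the "$1.28 for 128K input tokens" figure quoted in the tweet.

```python
# Sketch of a "needle in a haystack" test grid.
# Assumption: words stand in for tokens; a real run would use a tokenizer.

NEEDLE = ("The best thing to do in San Francisco is eat a sandwich "
          "and sit in Dolores Park on a sunny day.")


def linspace(start, stop, n):
    """Return n evenly spaced values from start to stop, inclusive."""
    return [start + (stop - start) * i / (n - 1) for i in range(n)]


def build_context(background_words, needle, context_len, depth_pct):
    """Trim the background to fit context_len words and insert the needle.

    depth_pct = 0 places the needle at the top of the document,
    depth_pct = 100 places it at the bottom.
    """
    needle_words = needle.split()
    room = max(context_len - len(needle_words), 0)
    trimmed = background_words[:room]
    cut = round(len(trimmed) * depth_pct / 100)
    return " ".join(trimmed[:cut] + needle_words + trimmed[cut:])


def evaluation_grid(n_depths=15, n_lengths=15, min_len=1_000, max_len=128_000):
    """Every (context length, depth %) pair, evenly spaced on both axes."""
    depths = linspace(0, 100, n_depths)
    lengths = [round(x) for x in linspace(min_len, max_len, n_lengths)]
    return [(length, depth) for length in lengths for depth in depths]


def estimated_input_cost(grid, usd_per_1k_tokens=1.28 / 128):
    """Input-token cost of one pass over the grid (no output or eval calls)."""
    return sum(length / 1000 * usd_per_1k_tokens for length, _ in grid)


if __name__ == "__main__":
    background = ["essay"] * 128_000  # placeholder for the essay text
    grid = evaluation_grid()
    print(len(grid), "test cells")
    length, depth = grid[7]
    context = build_context(background, NEEDLE, length, depth)
    print(length, round(depth, 1), len(context.split()))
    print(f"input cost for one pass: ${estimated_input_cost(grid):.2f}")
```

Under these assumptions one pass over the 15 x 15 grid comes to about $145 of input tokens, which is in the same range as the ~$200 reported once the repeated runs at large context lengths are included.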

a_drdro retweeted

alex peysakhovich

@alex_peys

over 2 years ago

ran gpt4 128k context on the "1 useful document + K distractors" from our "attention sorting" paper, seems like the very long (more than 32k) doesn't work that well. 32k is still extremely impressive though! also claude2 clearly has some nice trick behind the scenes

Alejandro Cantarero @a_drdro

over 2 years ago

@jibinmathew_69 ➡️Limited to 20 documents ➡️Expensive storage ($.20/GB/day vs $0.25/GB/month in Astra DB and other similar vector databases), about 25x more expensive ➡️ Black box on chunking methodology. Good for getting started, but not enough for production RAG.
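
The price comparison in the tweet above reduces to a unit conversion. The snippet below is a small sketch of that arithmetic using only the two rates quoted in the tweet; the function names and the choice of month length are assumptions for illustration.

```python
# Compare a per-day storage rate with a per-month storage rate (USD per GB).

def monthly_cost_per_gb(daily_rate_usd, days_in_month=30):
    """Convert a per-GB daily rate into a per-GB monthly rate."""
    return daily_rate_usd * days_in_month


def cost_ratio(daily_rate_usd, monthly_rate_usd, days_in_month=30):
    """How many times more expensive the daily-billed storage is."""
    return monthly_cost_per_gb(daily_rate_usd, days_in_month) / monthly_rate_usd


if __name__ == "__main__":
    for days in (30, 31):
        ratio = cost_ratio(0.20, 0.25, days)
        print(f"{days}-day month: {ratio:.1f}x")
```

With a 30-day month the daily rate works out to $6 per GB per month, or 24x the $0.25 figure; with a 31-day month it is 24.8x, which matches the "about 25x" rounding.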

Alejandro Cantarero @a_drdro

over 2 years ago

Seems good for spinning up some quick tests and getting good results on small sets of documents for Q&A, but probably isn't going to be good enough for production-scale RAG applications.

Alejandro Cantarero @a_drdro

over 2 years ago

Did OpenAI kill RAG with GPTs and the assistants API? Seeing lots of discussion on this. Great stuff in OpenAI's announcements but: #GenAI #LLMs

Alejandro Cantarero @a_drdro

over 2 years ago

➡ Limited to 20 documents ➡ Expensive storage ($.20/GB/day vs $0.25/GB/month in Astra DB and other similar vector databases), about 25x more expensive ➡ Black box on chunking strategies
