Caching will make your LLM application cheaper and faster to run.
But caching is hard. As the famous saying goes, "There are 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors."
Here is how caching works at a very high level:
1. A new request comes in with a prompt.
2. The application checks whether an identical or similar prompt already exists in the cache.
3. If found, the application returns the cached response.
4. If not found, the application generates a new response for the prompt and caches it.
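The four steps above can be sketched in a few lines of Python. This is a minimal, exact-match-only illustration, not a production cache. The `PromptCache` class and the `generate` callable are hypothetical names introduced here, and `generate` stands in for whatever call your application makes to the LLM.

```python
import hashlib
from typing import Callable, Dict


class PromptCache:
    """Exact-match cache keyed by a hash of the normalized prompt (illustrative only)."""

    def __init__(self, generate: Callable[[str], str]) -> None:
        # `generate` is a hypothetical stand-in for the real LLM call.
        self._generate = generate
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(prompt: str) -> str:
        # Assumption: case and extra whitespace do not change the meaning of a prompt.
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get(self, prompt: str) -> str:
        key = self._key(prompt)          # Step 1: a new request comes in.
        if key in self._store:           # Step 2: check the cache.
            self.hits += 1
            return self._store[key]      # Step 3: return the cached response.
        self.misses += 1
        response = self._generate(prompt)
        self._store[key] = response      # Step 4: generate and cache.
        return response
```

Wrap your model call once, for example `cache = PromptCache(call_llm)` where `call_llm` is your own function, then call `cache.get(prompt)` everywhere. There is no eviction or expiry here, which is exactly the invalidation problem that makes real caches hard.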
If you implement this right, you'll get two main benefits:
1. Your application will be much faster. Returning a response from the cache has much lower latency than generating it with an LLM.
2. Your application will be much cheaper. You will save a ton of money on tokens.
However, implementing a robust caching system is a ton of work.
Here is an idea:
If you are using OpenAI’s models, Llama 3, Mixtral, or Gemma, take a look at CogCache. They are sponsoring this post:
https://t.co/Gq0EaUl5qc
CogCache is an out-of-the-box caching solution with intelligent caching: It will automatically cache and serve responses for semantically similar queries.
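To give a rough idea of what serving "semantically similar" queries can mean, here is a toy sketch. It is not CogCache's implementation, which this post does not describe. The names `toy_embed`, `SimilarityCache`, and the `0.8` threshold are all assumptions, and the word-count vector is only a stand-in for a real embedding model.

```python
import math
import re
from collections import Counter
from typing import Callable, List, Optional, Tuple


def toy_embed(text: str) -> Counter:
    # Stand-in for a real embedding model: a bag of lowercase word counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class SimilarityCache:
    """Serves a cached response when a prior prompt is similar enough (illustrative only)."""

    def __init__(self, generate: Callable[[str], str], threshold: float = 0.8) -> None:
        self._generate = generate      # hypothetical stand-in for the LLM call
        self._threshold = threshold    # assumed cutoff; tune for your own data
        self._entries: List[Tuple[Counter, str]] = []

    def get(self, prompt: str) -> str:
        vec = toy_embed(prompt)
        best_score, best_response = 0.0, None  # type: float, Optional[str]
        for cached_vec, response in self._entries:
            score = cosine(vec, cached_vec)
            if score > best_score:
                best_score, best_response = score, response
        if best_response is not None and best_score >= self._threshold:
            return best_response       # similar enough: serve from the cache
        response = self._generate(prompt)
        self._entries.append((vec, response))
        return response
```

A linear scan is fine for a demo. A real system would use an actual embedding model and a vector index, and would need to guard against returning a cached answer for a prompt that only looks similar.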
Some of the metrics:
• You'll get up to 100x faster response times.
• You'll save up to 50% in costs.
• They integrate with Groq for super fast response times.
• Lowest token price in the market thanks to their partnership with Microsoft.
They have a pay-as-you-go model, which is great for all sorts of businesses. And if you're an Azure customer, you can use your annual Azure commitment to cover your inference costs.
The attached image shows a Python example. Your code doesn't change at all: you use the same OpenAI Completions API, but now with caching enabled.
That's pretty sweet!
@dvassallo No need to extract or aggregate. Simply create a project on Claude, upload all the PDFs to it, and start querying it. Then ask it to create an artifact to chart how specific variables have changed over time.
@girdley @pozzoron @girdley We've been to CDMX (and SMDA, Oaxaca, Puebla) many times without knowing a word of Spanish. Hitlist is here: https://t.co/wFqy9PEEme
This is a photo of the city of Tel Aviv in 1944.
Tel Aviv was founded in 1909 by 66 Jewish families.
The neighboring city of Jaffa had a Jewish community and Jewish history.
Palestine was the name of the region, but it was never a country.
Khalissee and many others are trying to hint that Jews didn't live here before the current state of Israel existed and to rewrite history, when it's well documented.
#ThePalestinianLie
There's no International Women’s Day without the freedom of ALL women.
Share and call for the immediate release of all hostages unconditionally.
#bringthemhomenow #bringthemhome
Millions of people watched this tonight at @SuperBowl!
@StandUp2JewHate teamed up with Dr. Clarence Jones, activist and author of the best speech ever written, “I Have a Dream”. Robert Kraft graciously spent 7 mil for… 30 sec of pure emotion! #StanduptoJewishhate #SuperBowl
This Super Bowl ad that will be aired today, sponsored by FCAS, is a poignant reminder for every American that while it may not be popular, standing up against anti-Jewish hate is the only team we must be on.