Just dropped a new episode of the #dp100 #azureml course on the @daertml academy YouTube channel... We are training our first model in the cloud! Enjoy! 🚀
https://t.co/n8815snFTX
The #Daert #Academy is LIVE 🔴 on @YouTube!
Check our first video on how to train a #LLaMA2 model with a single 3090 GPU!
Any suggestion/improvement/comment/question/request is much appreciated! Like, Sub and hit the Bell button!!
https://t.co/C1dqZy9g53
⚡ txtai 6.1 is out ⚡
This release adds metadata support for client-server databases (Postgres, MariaDB/MySQL, MSSQL) and custom scoring implementations.
https://t.co/t6KZHx3y8G
I've generated 70M tokens of extremely high-quality synthetic textbooks (https://t.co/HvyojqTpUz) using retrieval and gpt-3.5.
Seriously, the quality is 💯.
I'm generating 1B tokens, but will use llama for $$ reasons. Please DM if you can sponsor compute or credits.
@LucSGeorges I see there is no HTTP support, and from reproducing the request that VSCode makes, I guess you cannot use self-signed certificates? Is there any way to accept self-signed certificates without modifying the source code of llm-vscode? Thanks in advance!! (getting closer...)
@LucSGeorges New behavior since I closed and reopened VSCode: a "Client is not running" error appears in the bottom-right corner of VSCode each time I type a character in the document. Yet the API server is running.
Also, found this log:
@LucSGeorges AFAIK, llm-vscode sends a POST to the custom URL that we set in the config (ctrl+,). This is the Flask server we are testing with; it wraps a local call to llama.cpp (which is also listening as a REST server):
(Yep, 172.17.0.2 is the Docker container running llama; calling it manually works)
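The server from the tweet is not reproduced here. As a rough, hypothetical sketch of what such a wrapper has to do, the snippet below translates an editor request into a llama.cpp request and back, using only the standard library. Everything about the wire format is an assumption: that the extension posts JSON with `inputs` and `parameters` and expects a list with `generated_text`, that the llama.cpp server exposes `/completion` taking `prompt` and `n_predict` and returning `content`, and the port `8080`. The function names are made up for illustration.

```python
import json
import urllib.request

# Assumption: llama.cpp's HTTP server is reachable here; the port is hypothetical.
LLAMA_URL = "http://172.17.0.2:8080/completion"


def to_llama_payload(editor_request):
    """Map the (assumed) editor request shape to the (assumed) llama.cpp shape."""
    params = editor_request.get("parameters", {})
    return {
        "prompt": editor_request["inputs"],
        "n_predict": params.get("max_new_tokens", 60),
        "temperature": params.get("temperature", 0.2),
    }


def to_editor_response(llama_response):
    """Map the (assumed) llama.cpp reply back to the shape the editor expects."""
    return [{"generated_text": llama_response.get("content", "")}]


def complete(editor_request, url=LLAMA_URL, timeout=30):
    """Forward one completion request to llama.cpp and translate the reply."""
    body = json.dumps(to_llama_payload(editor_request)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return to_editor_response(json.load(response))
```

A Flask route would then only need to parse the incoming JSON, call `complete`, and return the result as JSON. Keeping the two mapping functions pure makes it easy to check the translation without a running model.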
0.1.0 is out - huggingface-vscode is dead, long live llm-vscode! 👑
llm-vscode is an open source extension integrating LLMs inside your favorite code editor with the goal of improving your efficiency and productivity.
🦀 It is now powered by llm-ls, our Rust language server.
Gartner has just published one of their famous Hype Cycle curves for 2023, specialized to Generative AI!
The curve is encouragingly optimistic on Vector Databases, predicting we still have 5-10 years to hit the infamous Peak of Inflated Expectations!
Here are 5 reasons why I also think Vector Databases are still an upcoming technology and nowhere near the peak: (1) RAG + "GPT-5", (2) RAG + Easy Fine-Tuning, (3) Easy Data Ingestion, (4) Generative Feedback Loops, and (5) Self-Driving DBs
1. RAG + "GPT-5"
As a TLDR, the next-generation "GPT-5" will likely be a long context LLM. There is a huge opportunity to pre-train these kinds of LLMs with more naturally long context data such as podcast transcriptions or code. It is also quite likely that these models have some retrieval-aware tuning as well to prevent hallucination with respect to the retrieved context.
Many people currently come to Vector DBs with a classic kind of "I have a 20-30 page PDF that I can't fit into ChatGPT" problem. This is missing the point of the Zero-Shot LLM + RAG in my opinion: don't just give the LLM your document -- give it the background knowledge as well! This is why I am a huge fan of the work in LlamaIndex and LangChain to pioneer query engines across multiple search indexes.
2. RAG + Easy Fine-Tuning
The tooling for fine-tuning is getting really strong; quick hat tip to HuggingFace, MosaicML, and Weights & Biases.
Imagine you are a lawyer. In addition to having the relevant laws you need to solve a case, you also need to have the skill of making the case. Making the case could entail surface-level "style" (the current most common argument for this) or more complex compositional generalization that may only be possible to represent in high-dimensional data structures with non-linear interaction effects.
RAG is a fundamental modeling architecture that is perfectly amenable to fine-tuning. RAG generally adds (1) interpretability (you can see the docs that influenced the prediction, not 100% linked ofc), (2) parameter efficiency (by decomposing retriever-reader you get away with cheaper readers e.g. ATLAS), (3) continual updating (keeping the data in a parametric-only LLM as fresh as a Kafka stream or what have you is unlikely - the unlearning stuff is cool though).
RAG + Fine-Tuning also has a massive opportunity to create better search by training the search models end-to-end with the gradients from the reader. RLHF back to the embedding models (maybe rankers can be trained like this as well).
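To make the retriever-reader decomposition in this section a bit more concrete, here is a minimal sketch of the retrieval half plus prompt assembly. It is only an illustration under loud assumptions: bag-of-words cosine similarity stands in for a learned embedding model, a Python list stands in for the vector database, and all function names are hypothetical. The reader LLM call is left out entirely.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=2):
    """Retriever: return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]


def build_prompt(query, docs, k=2):
    """Assemble a prompt for the reader with numbered, inspectable context."""
    context = retrieve(query, docs, k)
    lines = [f"[{i + 1}] {doc}" for i, doc in enumerate(context)]
    return "Context:\n" + "\n".join(lines) + f"\n\nQuestion: {query}\nAnswer:"
```

The numbered context lines are what gives the interpretability benefit mentioned above: you can see which documents were handed to the reader, even if they are not 100% linked to the final answer.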
3. Easy Data Ingestion
Parsing PDFs into both unstructured text and structured layout information will ofc dramatically expand how many people can use Vector DBs. The tooling here is also getting incredibly strong thanks to Unstructured, LlamaIndex, and LangChain. Connecting this with your Twitter APIs, web scrapers, etc. through scheduled Cron Jobs will be amazing.
4. Generative Feedback Loops
RAG innovates on the output from DBs; Generative Feedback Loops (where we save generated or transformed data from an LLM back into the database) will innovate on the fundamental question: what's in the database?
This will really get us to the peak in my opinion because it will also evangelize everyone having 100M+ vectors on their laptops (if managing their own personal DB) or say this kind of thing in a knowledge management platform like Notion / Confluence / GitHub / HuggingFace. Scale will really unlock the value of Vector DBs, not that I really agree with the "NumPy is all you need" argument anyway, which overlooks the CRUD compatibility, cloud scaling, symbolic properties, search features like hybrid / filtering, etc. TLDR - AI will take the documents you like, pictures, movies, songs --- and create more of it! You will then need databases to navigate this explosion of content!
For this reason I think it is also important to think of Vector DBs as traditional DBs + Search Engines + Recommendation Systems -- because Recommendation Systems have a bit of nuance vs. Search only with user representation and more use of symbolic re-rankers like XGBoost -- also potentially an explore-exploit RL component to the recommender (kudos to whoever builds that).
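The generative feedback loop described in this section (read from the database, generate, write the result back) can be sketched in a few lines. This is a toy under stated assumptions: `TinyStore` is a hypothetical in-memory stand-in for a vector database with keyword matching instead of vector search, and `fake_llm` is a placeholder for a real model call.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    text: str
    source: str  # "original" or "generated"


@dataclass
class TinyStore:
    """Hypothetical in-memory stand-in for a vector database."""
    records: list = field(default_factory=list)

    def add(self, text, source="original"):
        self.records.append(Record(text, source))

    def search(self, keyword):
        # Keyword match as a placeholder for real vector or hybrid search.
        return [r for r in self.records if keyword.lower() in r.text.lower()]


def fake_llm(prompt):
    """Placeholder for a real LLM call; returns a trivial transformation."""
    return "Summary: " + prompt[:40]


def feedback_loop(store, keyword, generate=fake_llm):
    """Read matching original records, generate new text, write it back."""
    hits = [r for r in store.search(keyword) if r.source == "original"]
    for record in hits:
        store.add(generate(record.text), source="generated")
    return len(hits)
```

Tagging each record with its `source` is a design choice in this sketch, not something from the thread: it keeps the loop from feeding on its own output and makes generated content easy to filter later.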
5. Self-Driving DBs
Gorilla is an exciting research project that translates natural language commands to API syntax. Text-to-SQL is making a ton of progress! I think this will generalize not only to Text-to-SQL, but also to the structuring of data with e.g. properties, tables, key joins -- learned by monitoring and maintaining the system. I also think it's possible to use LLMs to optimize lower level physical storage configurations.
Between RAG + Next-Gen Zero-Shot LLMs, RAG + Easy Fine-Tuning, Easy Data Ingestion, Generative Feedback Loops, and Self-Driving DBs -- I think Gartner is right and we still have a long way to go to the peak of Vector Databases!
Thanks for reading! Check out Weaviate! 😎👍
We are glad to announce the addition of models that can implement #qlearning from a pseudocode description to solve #gym environments from @OpenAI . We provide code to solve FrozenLake:
https://t.co/xUrbI97UCH
@ThomasSimonini 👀
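For readers who have not seen tabular Q-learning, here is a minimal sketch of the technique on a FrozenLake-style grid. It is not the code behind the link above and not what the announced models generate. To stay self-contained it does not import gym: the 4x4 map, the deterministic (non-slippery) `step` function, and the hyperparameters are all assumptions chosen for illustration.

```python
import random

# 4x4 map in the style of FrozenLake: S start, F frozen, H hole, G goal.
GRID = ["SFFF", "FHFH", "FFFH", "HFFG"]
SIZE = 4
ACTIONS = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # left, down, right, up as (d_row, d_col)


def step(state, action):
    """Deterministic transition: returns (next_state, reward, done)."""
    row, col = divmod(state, SIZE)
    d_row, d_col = ACTIONS[action]
    row = min(max(row + d_row, 0), SIZE - 1)  # bumping into a wall keeps you in place
    col = min(max(col + d_col, 0), SIZE - 1)
    cell = GRID[row][col]
    return row * SIZE + col, float(cell == "G"), cell in "HG"


def q_update(q, state, action, reward, next_state, done, alpha=0.5, gamma=0.95):
    """One Q-learning update: Q(s, a) += alpha * (target - Q(s, a))."""
    target = reward if done else reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy choice with random tie-breaking among the best actions."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    best = max(q[state])
    return rng.choice([a for a, v in enumerate(q[state]) if v == best])


def train(episodes=2000, max_steps=100, epsilon=0.2, seed=0):
    """Run epsilon-greedy episodes from the start state and return the Q-table."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(SIZE * SIZE)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            q_update(q, state, action, reward, next_state, done)
            state = next_state
            if done:
                break
    return q
```

Calling `train()` returns a 16x4 table of action values. Random tie-breaking matters here: with an all-zero table, a plain argmax would always pick the first action and never explore toward the goal.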
We are glad to announce the development of a new #agent 🕵️ that solves the Mountain Car 🏔️🚙 @OpenAI gym environment by using an #LLM that generates code to follow a strategy:
https://t.co/j0vV5hjFXK
At @daertml, we just quantized 🧮 the new models phi-1 and phi-1.5 from @Microsoft; these are possibly the tiniest 🔬 and best-performing #LLM models out there! And we made them even tinier using the #bitsandbytes lib! Check them out:
- https://t.co/IVtPsq8V2J
- https://t.co/9PBNFqyZKe
Evaluating the Ripple Effects of Knowledge Editing in Language Models
paper page: https://t.co/dhPG2ERHMi
Modern language models capture a large body of factual knowledge. However, some facts can be incorrectly induced or become obsolete over time, resulting in factually incorrect generations. This has led to the development of various editing methods that allow updating facts encoded by the model. Evaluation of these methods has primarily focused on testing whether an individual fact has been successfully injected, and if similar predictions for other subjects have not changed. Here we argue that such evaluation is limited, since injecting one fact (e.g. "Jack Depp is the son of Johnny Depp") introduces a "ripple effect" in the form of additional facts that the model needs to update (e.g. "Jack Depp is the sibling of Lily-Rose Depp"). To address this issue, we propose a novel set of evaluation criteria that consider the implications of an edit on related facts. Using these criteria, we then construct a diagnostic benchmark of 5K factual edits, capturing a variety of types of ripple effects. We evaluate prominent editing methods on this benchmark, showing that current methods fail to introduce consistent changes in the model's knowledge. In addition, we find that a simple in-context editing baseline obtains the best scores on our benchmark, suggesting a promising research direction for model editing.