prettyprompt is the beginning of a collection of tools for cleaning, sanitising and converting inputs to prompts:
https://t.co/YWGGXlC4hs
These may be useful alongside @langchain apps like Microllama. Issue #1 is prompt injection detection.
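To give a flavour of what prompt injection detection could look like, here's a minimal sketch. It is not prettyprompt's actual code: the pattern list and the `looks_like_injection` name are made up for illustration, and a keyword heuristic like this is easy to evade.

```python
import re

# Hypothetical phrase list for illustration only; not taken from prettyprompt.
SUSPICIOUS_PATTERNS = [
    r"ignore (all |any )?(the )?(previous|prior|above) (instructions|prompts?)",
    r"disregard (all |any )?(the )?(previous|prior|above)",
    r"reveal (your|the) (system )?prompt",
]


def looks_like_injection(user_input: str) -> bool:
    """Return True if the input matches any of the suspicious patterns."""
    text = user_input.lower()
    return any(re.search(pattern, text) for pattern in SUSPICIOUS_PATTERNS)
```

You'd run this on user input before it is interpolated into a prompt template, and reject or flag anything that matches.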
Microllama now supports streaming responses. Working this out took me as long as the original @langchain / @FastAPI proof of concept! I've moved the sources to the top to 1. highlight the authority of the response (vs standard ChatGPT) and 2. give you something to look at while the answer loads.
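The sources-first idea can be sketched as a plain generator. This is an assumption about the shape, not Microllama's actual implementation: the `stream_sources_then_answer` name and the JSON-line format for sources are hypothetical. A generator like this could be handed to FastAPI's `StreamingResponse`.

```python
import json
from typing import Iterable, Iterator, List


def stream_sources_then_answer(
    sources: List[str], tokens: Iterable[str]
) -> Iterator[str]:
    """Yield the sources as one JSON line, then the answer token by token."""
    # Sources go out first, so the client has something to show
    # while the model is still generating the answer.
    yield json.dumps({"sources": sources}) + "\n"
    # Tokens are forwarded as they arrive from the (hypothetical) model stream.
    for token in tokens:
        yield token
```

The client reads the first line as the source list and treats everything after it as answer text.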
Here's Microllama, which is intended to be the easiest possible way of deploying a talk-to-your-content API:
https://t.co/q7PlrSSLyX
It leans heavily on @langchain. IMO the most interesting bit is baking the vector index into the container at build time.
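Roughly, the build-time idea looks like this. It's a toy sketch, not Microllama's code: word counts stand in for real embeddings, a JSON file stands in for a FAISS index, and every function name here is hypothetical. The point is the split: `build_index` would run once during the container build, and the app only calls `load_index` at startup.

```python
import json
import math
from collections import Counter
from pathlib import Path


def embed(text):
    # Toy stand-in for a real embedding model: bag-of-words counts.
    return dict(Counter(text.lower().split()))


def build_index(docs, path):
    # Build step: run once while building the container image,
    # so the index file ships inside the image.
    entries = [{"text": doc, "vector": embed(doc)} for doc in docs]
    Path(path).write_text(json.dumps(entries))


def load_index(path):
    # Runtime step: read the prebuilt index, no embedding calls needed.
    return json.loads(Path(path).read_text())


def cosine(a, b):
    dot = sum(value * b.get(key, 0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(index, query):
    # Return the text of the entry most similar to the query.
    query_vector = embed(query)
    return max(index, key=lambda e: cosine(query_vector, e["vector"]))["text"]
```

Because the index is already on disk when the container starts, a cold start only pays for reading a file, not for re-embedding the content.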
I'm hosting my test Microllamas on Google Cloud Run, which feels like the perfect platform for LangChain-style apps (OpenAI wrappers + FAISS indexes). Hosting is basically free until you're famous, and then OpenAI costs are your problem.