🚀🚀 Super excited to share the latest benchmark results for our quantized BGE models.
A few weeks ago, these models were introduced to improve the performance and efficiency of embedding generation. We have now compared PyTorch SentenceTransformers against our DeepSparse-optimized models on both a 10-core laptop and a 16-core AWS instance.
The benchmarks show significant improvements in processing speed. For example, the quantized bge-small model achieves up to a 3X speedup on the 10-core laptop. Even better, on the 16-core AWS instance these models achieved up to a 5X speedup.
🤗 Updated model cards:
bge-small-quant: https://t.co/UnoYEwHFNv 6K+ downloads
bge-base-quant: https://t.co/VILmxvA1pA 2K+ downloads
bge-large-quant: https://t.co/hcPrbgq05D 2K+ downloads (#1 model for STS datasets on the MTEB leaderboard)
Don't forget to check out the DeepSparse repo https://t.co/eXEZMDsBIF for more information on benchmarking and running these models on the MTEB leaderboard. 💥
cc @neuralmagic
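For anyone who wants to sanity-check speedup numbers like the ones above, here is a minimal sketch of how a throughput comparison could be set up. It is not the benchmark script used for these results: `throughput`, `speedup`, and the two stand-in encoders are hypothetical names, and in a real run you would pass in the `encode` method of each model instead.

```python
import time

def throughput(encode, sentences, repeats=3):
    """Return the best sentences-per-second rate over `repeats` timed runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        encode(sentences)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading from a very fast call.
    return len(sentences) / max(best, 1e-9)

def speedup(baseline_rate, optimized_rate):
    """Ratio of optimized to baseline throughput (3.0 means 3X faster)."""
    return optimized_rate / baseline_rate

# Hypothetical stand-ins for the two engines being compared.
def baseline_encode(sentences):
    return [[float(len(s))] * 8 for s in sentences for _ in range(3)]

def optimized_encode(sentences):
    return [[float(len(s))] * 8 for s in sentences]

sentences = ["benchmarking embedding models on CPU"] * 2000
baseline_rate = throughput(baseline_encode, sentences)
optimized_rate = throughput(optimized_encode, sentences)
print(f"speedup: {speedup(baseline_rate, optimized_rate):.2f}X")
```

Taking the best of several runs reduces noise from warm-up and background load; real results will still depend on batch size, sequence length, and core count.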
I love the #ChatGPT Cheat Sheet by Ricky Costa (@Quantum_Stat)
which includes
🔹NLP Tasks
🔹Code
🔹Structured Output Styles
🔹Unstructured Output Styles
🔹Media Types
🔹Meta ChatGPT
🔹Expert Prompting
Get your hands on this amazing resource at: https://t.co/Bg1roxcMFO
⚡IT HAPPENED!⚡
There's a new state-of-the-art sentence embeddings model for the semantic textual similarity task on Hugging Face's MTEB leaderboard 🤗!
Bge-large-en-v1.5-quant is the model I quantized in less than an hour with a single CLI command using Neural Magic's open-source library Sparsify! Not only is it ONNX and INT8 quantized (faster and lighter), but it can also run on CPUs using DeepSparse! 💥
cc @neuralmagic
Model: https://t.co/gi4exnSp0C
@mrm8488 @huggingface Hey @mrm8488, if you are ever interested in training a model on financial sentiment from tweets, check out my dataset; it's currently the #1 Twitter finance dataset: https://t.co/MF5zs40dDL
⚡Getting to Know the NPZ file format to Compress BGE Embedding Models ⚡
For One-Shot Quantization (INT8), Sparsify relies on the .npz format for data storage, a file format rooted in the mighty NumPy library.
Check the image below for an example of what I'm discussing 👇 We will soon release a notebook with an end-to-end example so anyone can replicate the compressed bge models, which achieve great accuracy results on the MTEB Leaderboard.
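As a rough illustration of the .npz format mentioned in the Sparsify post above, here is a minimal NumPy sketch that writes a few named arrays into one archive and reads them back. The array names (`input_ids`, `attention_mask`), the shapes, and the file name are assumptions for illustration only; the posts do not say what layout Sparsify expects.

```python
import os
import tempfile

import numpy as np

def save_samples(path, input_ids, attention_mask):
    """Write named arrays into a single compressed .npz archive."""
    np.savez_compressed(path, input_ids=input_ids, attention_mask=attention_mask)

def load_samples(path):
    """Read every array in the archive back into a plain dict."""
    with np.load(path) as archive:
        return {name: archive[name] for name in archive.files}

# Hypothetical sample batch: 4 sequences of 16 token ids each.
rng = np.random.default_rng(0)
input_ids = rng.integers(0, 30522, size=(4, 16), dtype=np.int64)
attention_mask = np.ones((4, 16), dtype=np.int64)

npz_path = os.path.join(tempfile.mkdtemp(), "samples.npz")
save_samples(npz_path, input_ids, attention_mask)

restored = load_samples(npz_path)
for name, array in restored.items():
    print(name, array.shape, array.dtype)
```

An .npz file is simply a zip archive of .npy arrays keyed by name, which is why each array can be looked up by the keyword it was saved under.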
Exciting News! 🚀 DeepSparse is now integrated with @langchain, opening up a world of possibilities in Generative AI on CPUs. Langchain, known for its innovative design paradigms for large language model (LLM) applications, has often been constrained by expensive APIs or cumbersome GPUs.
But with Neural Magic's DeepSparse integration, developers can now accelerate their models on CPU hardware, making it a breeze to create powerful Langchain applications.
Langchain Doc link: https://t.co/JVzAGoxsXs
DeepSparse Langchain Blog: https://t.co/6VR0bIuSfz
cc @hwchase17 @neuralmagic
🌟 First, I want to thank everyone for pushing this model past 1,000 downloads in only a few days!! Additionally, I added the bge-base models to MTEB.
Most importantly, the model cards now include code snippets for running inference, so everyone can try the models out!
https://t.co/NZO7DPGubb