Miguel Jimenez

over 2 years ago

Need to take this into account when weighing the cost of hosting a model against letting OpenAI process our data. If the CPU and memory metrics do not consume as many tokens as the logs do, then it is faster to use OpenAI to vectorize them.
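A back-of-the-envelope way to compare the token volume of metrics and logs before deciding, as a minimal sketch. The price constant, the rule of roughly 4 characters per token, and the sample rows are all assumptions for illustration, not real quotes or real data; a proper tokenizer and the provider's current pricing should replace them.

```python
import math

# ASSUMPTION: placeholder price, not a real quote. Replace it with the
# current figure from the provider's pricing page.
ASSUMED_USD_PER_1K_TOKENS = 0.0001


def estimate_tokens(text, chars_per_token=4.0):
    """Rough token count, assuming about 4 characters per token."""
    return math.ceil(len(text) / chars_per_token)


def embedding_cost_usd(texts, usd_per_1k_tokens=ASSUMED_USD_PER_1K_TOKENS):
    """Return (total estimated tokens, estimated cost in USD) for a batch."""
    total_tokens = sum(estimate_tokens(t) for t in texts)
    return total_tokens, total_tokens / 1000 * usd_per_1k_tokens


# Made-up samples: one short metric sample vs. one longer log line.
metric_rows = ['cpu_usage{host="web-1"} 0.42 1706745600'] * 1000
log_lines = [
    "2024-02-01T00:00:00Z ERROR request failed: upstream timed out "
    "after 30s while calling the billing service"
] * 1000

if __name__ == "__main__":
    for name, batch in (("metrics", metric_rows), ("logs", log_lines)):
        tokens, cost = embedding_cost_usd(batch)
        print(f"{name}: ~{tokens} tokens, ~${cost:.4f}")
```

The same function can be pointed at a real dump to see whether the metrics really are cheap enough to send to a hosted embedding API.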

Virat Singh

@virattt

over 2 years ago

Exploring LLM Pricing

With so many new LLMs, how do API costs compare?

I delved into cost comparisons of models that I would use in production.

Main takeaways:
• cohere leads with cost-effective model
• gpt-3.5 remains excellent value
• mistral cost higher than anticipated
• gemini 1.0 pro is pleasant surprise
• gpt-4 is very expensive

Models are ranked by input cost, asc.

over 2 years ago

This is interesting, because some of the data I dumped from the metrics is in tabular format. I also want to analyze other data, like the list of resources from AWS.

elvis

@omarsar0

over 2 years ago

LLMs on Tabular Data

An overview of LLMs for tabular data tasks including key techniques, metrics, datasets, models, and optimization approaches.

Also covers limitations and unexplored ideas with insights for future research directions.

mjimeneztrilogy retweeted

elvis

@omarsar0

over 2 years ago

Datasets for LLMs

A comprehensive overview (180+ pages) and analysis of LLM datasets.

over 2 years ago

Do you know of any other ARM hardware, like the Raspberry Pi, to train or run models? I do not want to spend too much money on an Apple silicon machine.

mjimeneztrilogy retweeted

over 2 years ago

I highly recommend @_nerdai_'s notebooks and diagrams for any RAG developer, whether you're a beginner or expert. This notebook is a comprehensive introduction to RAG for those just getting into the space 👇

over 2 years ago

Need to check that, because I am only dumping the data into a Qdrant vector store.

LlamaIndex 🦙

@llama_index

over 2 years ago

In naive RAG, the vector db is primarily used for unstructured data. Yet most use cases involve both unstructured and structured data - you want a RAG architecture that can query both data and store it in the same db!

Excited to feature this new blog post by @ClickHouseDB which outlines a RAG architecture over structured/unstructured data.
✅ It uses @llama_index query capabilities combining both structured (StackOverflow) and unstructured (HN) data (SQLAutoVectorQueryEngine)
✅ It uses @ClickHouseDB as a single storage layer for both structured tables and unstructured text chunks.

Check it out!

Blog: https://t.co/GoMVZlsCoB

mjimeneztrilogy retweeted

over 2 years ago

Different types of queries require different retrieval parameters. For instance, in hybrid search, the optimal alpha value for a user entering keywords would be close to 0, while the optimal alpha value for a user asking natural language questions would be close to 1. A cool idea here is you can create a dynamic query interface by having the LLM pick the optimal alpha parameter for hybrid search depending on the query! Check out our Koda Retriever pack for more details: https://t.co/qP2fdbd34h
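To make the alpha convention concrete, here is a minimal sketch of a linear hybrid score in which alpha = 0 means keyword-only and alpha = 1 means vector-only, matching the description above. The function names and the `pick_alpha` heuristic are hypothetical stand-ins (the tweet has an LLM choose alpha; this toy rule only looks for question-like queries), and it is not the Koda Retriever implementation. Scores are assumed to be normalized to the range 0 to 1.

```python
QUESTION_WORDS = {"how", "what", "why", "when", "where", "which", "who",
                  "can", "does", "is"}


def hybrid_score(keyword_score, vector_score, alpha):
    """Blend two normalized scores: alpha=0 is keyword-only, alpha=1 is vector-only."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return (1 - alpha) * keyword_score + alpha * vector_score


def pick_alpha(query):
    """Toy stand-in for the LLM call: question-like queries lean on vector search."""
    words = query.strip().split()
    starts_like_question = bool(words) and words[0].lower() in QUESTION_WORDS
    if query.strip().endswith("?") or starts_like_question:
        return 0.9
    return 0.1


def rank(doc_scores, alpha):
    """Sort doc ids by hybrid score; doc_scores maps id -> (keyword, vector)."""
    return sorted(
        doc_scores,
        key=lambda doc_id: hybrid_score(*doc_scores[doc_id], alpha),
        reverse=True,
    )


if __name__ == "__main__":
    # Made-up scores: "a" matches keywords well, "b" matches semantically.
    docs = {"a": (0.9, 0.1), "b": (0.1, 0.9)}
    for query in ("qdrant hnsw config", "How do I tune HNSW in Qdrant?"):
        alpha = pick_alpha(query)
        print(query, "-> alpha", alpha, "-> ranking", rank(docs, alpha))
```

Swapping `pick_alpha` for a call to an LLM that returns a number between 0 and 1 would give the dynamic query interface the tweet describes.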

mjimeneztrilogy retweeted

Xingyu Lin @Xingyu2017

over 2 years ago

What state representation should robots have? 🤖 I’m thrilled to present an Any-point Trajectory Model (ATM), which models physical motions from videos without additional assumptions and shows significant positive transfer from cross-embodiment human and robot videos! 🧵👇

mjimeneztrilogy retweeted

Alexander Wan @alexwan55

over 2 years ago

What happens when RAG models are provided with documents that have conflicting information? In our new paper, we study how LLMs answer subjective, contentious, and conflicting queries in real-world retrieval-augmented situations.

mjimeneztrilogy retweeted

Sumit @_reachsumit

over 2 years ago

Health-LLM: Personalized Retrieval-Augmented Disease Prediction Model

Proposes a framework integrating LLMs and medical expertise to enhance exploitation of health reports for disease prediction and preventative care.

📝https://t.co/FeFimU3rxz
👨🏽‍💻https://t.co/lQDElymqmd

mjimeneztrilogy retweeted

elvis

@omarsar0

over 2 years ago

Accelerating the Science of Language Models

This is huge!

@allen_ai just released its first Open Language Model (OLMo), a 7B parameter model.

It includes open training code, open data, full model weights, evaluation code, and fine-tuning code.

It shows strong performance on many generative tasks.

There is also a smaller version of it, OLMo 1B.

I will be writing a complete prompting guide for this. Stay tuned!

This is brilliant work and will ignite even more research opportunities for the AI community.

mjimeneztrilogy retweeted

over 2 years ago

Adding Agentic Layers to RAG

Here’s an in-depth presentation showing you how to add agentic layers to your RAG pipeline - better understand/intuit how using increasingly more tokens leads to increasingly higher QA performance! 🪜

Helps you better plan for your use case vs. just plopping in a pre-built agent implementation.

We gave this talk at the AI User Conference yesterday, and we’re making it available with full @llama_index guides in each section. Check it out 👇

https://t.co/O6Q1f5LAyd

over 2 years ago

Need to test this approach, and also look into Ollama.

over 2 years ago

Check out this awesome architecture by @Otmane404 on how to spin up a fully local RAG API - serving all your edge device/data privacy needs 🔥

Stack: @llama_index, @qdrant_engine, @ollama, @FastAPI

Blog: https://t.co/htVmO7DLZB

Repo: https://t.co/Yy2mMbwT4e

over 2 years ago

Because we need to load the server metrics from Prometheus, I decided to create a Prometheus reader for LlamaHub so the data can be loaded with LlamaIndex. I have already sent my PR: https://t.co/nRc4TXiYH3 Hoping it gets approved soon.
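For a rough idea of what such a reader has to do, here is a sketch that turns a Prometheus range-query response (the `matrix` result type returned by the HTTP API) into plain-text documents ready for indexing. It is not the code from the PR: the function name `matrix_to_documents`, the text layout, and the sample payload are hypothetical, and the HTTP request itself is left out so the example runs offline.

```python
from datetime import datetime, timezone
from typing import List


def matrix_to_documents(response: dict) -> List[str]:
    """Convert a Prometheus range-query response into one text document per series."""
    if response.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    data = response["data"]
    if data["resultType"] != "matrix":
        raise ValueError("expected a matrix result (range query)")

    documents = []
    for series in data["result"]:
        labels = series["metric"]
        name = labels.get("__name__", "unknown_metric")
        label_text = ", ".join(
            f"{key}={value}"
            for key, value in sorted(labels.items())
            if key != "__name__"
        )
        lines = [f"metric: {name}", f"labels: {label_text}"]
        # Each sample is a [unix_timestamp, "value-as-string"] pair.
        for timestamp, value in series["values"]:
            when = datetime.fromtimestamp(float(timestamp), tz=timezone.utc)
            lines.append(f"{when.isoformat()} {value}")
        documents.append("\n".join(lines))
    return documents


# Made-up payload in the shape of a range-query response.
SAMPLE_RESPONSE = {
    "status": "success",
    "data": {
        "resultType": "matrix",
        "result": [
            {
                "metric": {
                    "__name__": "node_cpu_seconds_total",
                    "instance": "web-1:9100",
                    "mode": "user",
                },
                "values": [[1706745600, "0.42"], [1706745660, "0.47"]],
            }
        ],
    },
}

if __name__ == "__main__":
    for document in matrix_to_documents(SAMPLE_RESPONSE):
        print(document)
```

Each returned string could then be wrapped in whatever document type the indexing library expects before it is sent to the vector store.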

mjimeneztrilogy retweeted

Xavier (Xavi) Amatriain

@xamat

over 2 years ago

As many of you know, over the past few months I have been sharing Prompt Engineering resources in different forms. I have now compiled them all into a cohesive publication and uploaded to arxiv: https://t.co/7TZgGF67lj

mjimeneztrilogy retweeted

Rishabh Srivastava

@rishdotblog

over 2 years ago

Some nerd stuff! 🤓
- If you want a very simple way to play with it, check out our Github repo: https://t.co/zgMT0C2g4q
- You'll need a TON of VRAM to run this fast. We've found doing AWQ quantizations a really good way to keep accuracy high while keeping latency and VRAM low. Would recommend!
- Larger models have so much more "intelligence" than smaller ones. This had a very similar dataset as our 34B model, but performs significantly better. It does have some known flaws and weirdly trips up on certain words (like "domain", strangely), but we'll fix that soon

mjimeneztrilogy retweeted

Rishabh Srivastava

@rishdotblog

over 2 years ago

We just open-sourced SQLCoder-70B! It outperforms all publicly accessible LLMs for Postgres text-to-SQL generation by a very wide margin.

SQLCoder is finetuned on @AIatMeta's CodeLlama-70B model, which was released yesterday, using fewer than 20,000 hand-curated prompt completion pairs.

You can find it on @huggingface at https://t.co/abHsKzWYGW. This follows our 15B, 7B, and 34B models, and is the most capable of them all.

The model has a cc-by-sa-4 license, which means that you are free to use it as is for any use (including commercial) as long as you also open-source any changes you make to it (i.e., if you fine-tune it further).

mjimeneztrilogy retweeted

Zhongyu Li

@ZhongyuLi4

over 2 years ago

Interested in making your bipedal robots to be athletes? We summarized our RL work to create robust & adaptive controllers for general bipedal skills. 400m-dash, running over terrains/against perturbations, targeted jumping, compliant walking, not a problem for bipeds now.🧵👇

mjimeneztrilogy retweeted