Frank Doyle

@lacronicus

I work for PlantVillage (out of Penn State University), using AI to help low-income/subsistence farmers deal with climate change, diseases, and pests.

San Antonio

Joined November 2012

119 Following

88 Followers

156 Posts

Frank Doyle @lacronicus

almost 2 years ago

@farzyness The co-sponsor of this bill, my congressional representative, has openly stated he wants to "ethnically cleanse" me from his district. Why should I believe he's not going to find some way to use this to deny me my vote? Why should I support *anything* he wants to do? https://t.co/uwZ4QbMhcN

Frank Doyle @lacronicus

almost 2 years ago

@EdvinMilos @wholemars I am astounded by the number of people saying "crosswalks don't count if you're going too fast to see the people."

Frank Doyle @lacronicus

about 2 years ago

@JHochderffer @ylecun seems a bit self-important to assume no being in the universe could ever have a more general intelligence than what humans have right now. And if "human level intelligence" isn't the limit, why should we treat it like it is?

Frank Doyle @lacronicus

over 2 years ago

@anuswaram @MAbbas003 Ok, but that's worse. You do see how that's worse, right?

lacronicus retweeted

Isaac King 🔎 @IsaacKing314

over 2 years ago

Elaborate jokes that will only be understood by like 5 people on the planet are the actual best

lacronicus retweeted

Cameron R. Wolfe, Ph.D.

@cwolferesearch

over 2 years ago

What’s the easiest way to specialize an LLM over your own data? Recent research has studied this problem in depth, and RAG is way more effective (and easier to implement) compared to extended pretraining or finetuning…

Knowledge from pretraining. A lot of factual information is inherently present within an LLM’s pretrained weights, but the knowledge possessed by these models is highly dependent upon the characteristics of their pretraining data. Unfortunately, this means that, at least in the current paradigm of LLMs, the knowledge base of these models is static (e.g., ChatGPT has a knowledge cutoff date) and may lack detailed information.

Knowledge injection. Given a pretrained LLM, there are two postprocessing techniques that we can use for injecting new data into the LLM’s knowledge base:

- Finetuning: continuing the model’s pretraining process over a smaller, domain-specialized corpus of new information.
- Retrieval Augmented Generation (RAG): modifying the LLM’s input query by retrieving relevant information that can be leveraged by the model via in-context learning to generate a more grounded/factual output.

The variant of finetuning referenced above is a continued pretraining style of finetuning, where a next token prediction objective is used to further train a pretrained model over a specialized corpus of text. In contrast, SFT and RLHF emphasize the quality of model responses rather than improving the LLM’s breadth of knowledge.

“Given some knowledge base in the form of a text corpus, what is the best way to teach a pre-trained model this knowledge?” - from [1]

Recent research. In [1], authors compare RAG and finetuning to determine the superior knowledge injection approach. The RAG setup uses vector search to retrieve relevant document chunks to include in the model’s prompt. Given a corpus of information, we can:

1. Divide this corpus into chunks of text.
2. Use an embedding model (e.g., bge-large-en) to generate a dense vector for each chunk of text.
3. Search for relevant chunks by embedding the model’s input and performing a vector search.
4. Add relevant chunks into the model’s prompt.

What do we learn? While finetuning does improve model performance, RAG consistently outperforms finetuning for the injection of both new and previously encountered knowledge. Put simply, LLMs struggle to learn new information through finetuning. Though finetuning does yield a benefit in performance relative to the base model, RAG has a significant advantage over finetuning. Combining RAG with finetuning, though effective in some cases, does not consistently benefit performance.

Finetuning with paraphrases. We can improve the performance of finetuning for knowledge injection by training the model over several different paraphrases of the same information. In order to teach an LLM new information via finetuning, we must repeat this information in numerous ways.

[1] Ovadia, Oded, et al. "Fine-tuning or retrieval? Comparing knowledge injection in LLMs." arXiv preprint arXiv:2312.05934 (2023).
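The four retrieval steps listed in the thread above (chunk, embed, search, add to prompt) can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the function names (`chunk_text`, `embed`, `retrieve`, `build_prompt`) and the sample sentences are hypothetical, and the "embedding" is a toy bag-of-words count vector standing in for a real dense embedding model such as the bge-large-en mentioned in the thread. It is not the setup used in [1].

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 12) -> list[str]:
    """Step 1: split the corpus into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Step 2 (toy stand-in, an assumption): word counts instead of a learned dense vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Step 3: embed the query and return the k most similar chunks."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, retrieved: list[str]) -> str:
    """Step 4: place the retrieved chunks in the prompt ahead of the question."""
    context = "\n".join(f"- {c}" for c in retrieved)
    return f"Use the context to answer.\n\nContext:\n{context}\n\nQuestion: {query}"


# Hypothetical sample corpus, invented for illustration only.
sample_chunks = [
    "Cassava mosaic disease is spread by whiteflies.",
    "Maize grows best in warm weather.",
    "Fall armyworm attacks maize leaves.",
]
sample_query = "What spreads cassava mosaic disease?"
print(build_prompt(sample_query, retrieve(sample_query, sample_chunks, k=1)))
```

In a real pipeline, `embed` would call an embedding model and the sorted scan in `retrieve` would be replaced by a vector index; the prompt built in step 4 is then passed to the LLM unchanged.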

lacronicus retweeted

Carnegie Mellon University Africa

@cmu_africa

almost 3 years ago

APPLICATIONS ARE OPEN! Are you passionate about solving problems using technology? Apply for a Master of Science in Information Technology, Electrical and Computer Engineering, and Engineering Artificial Intelligence today. Apply here: https://t.co/UltEhJzSWG #ApplytoCMUAfrica https://t.co/76LhqbQRm6

Frank Doyle @lacronicus

almost 3 years ago

@andrewchen Websites that don't use standard scrolling behavior.

Frank Doyle @lacronicus

almost 3 years ago

@ninetwofiveone @KevinNaughtonJr Are you really asking? Cause https://t.co/yEO81zNXEU https://t.co/ualB9eBcNm

Frank Doyle @lacronicus

almost 3 years ago

@berkeleygfx If this guy designed apple maps:

Frank Doyle @lacronicus

almost 3 years ago

@garrytan @DH_PlantVillage @IFPRI collaborated with @plantvillage to develop an ai-powered app to study nutrition in adolescents. The goal was to build an app that could compete with dedicated nutritionists. The results are impressive https://t.co/GgqeV0EMu2

Frank Doyle @lacronicus

almost 3 years ago

@garrytan @DH_PlantVillage ^^

104

Frank Doyle @lacronicus

almost 3 years ago

@TheJackForge I don't think it's *that* spicy. Disney basically made a whole movie agreeing with you.

Frank Doyle @lacronicus

almost 3 years ago

@Noahpinion @BigJohn2310 Then why are you using homeownership as an indicator of how millennials are doing in the housing market relative to past generations? It tells us nothing. Anybody can be well off if debt doesn't count.

Frank Doyle @lacronicus

almost 3 years ago

@TheGameLooters @SuperSaf corporations don't care about fixing problems, they care about *being seen* fixing problems. the former is expensive, so if they can get away with only doing the latter, they will.

170

Frank Doyle @lacronicus

almost 3 years ago

@tunguz When you're dealing with real world problems, simply knowing the solution isn't actually the same as solving it.

Frank Doyle @lacronicus

almost 3 years ago

@liron @HannahFrankman If you do the math, 40k over three years comes to about 35 a day. And I kinda don't believe someone sat and counted forty thousand questions for multiple kids, so I'm guessing, if this number isn't just made up, someone probably just took an average and expanded it to 3 years.

144

Frank Doyle @lacronicus

almost 3 years ago

@drgurner @HannahFrankman @baconsheikh Did you come across any other studies on this? That's the only one I could find. Everything else was just "studies show" fluff articles.

136

Frank Doyle @lacronicus

almost 3 years ago

@CarolinAramburo @manxbenji @HannahFrankman possibly relevant: 40k questions over 3 years comes to ~35 questions a day.

Frank Doyle @lacronicus

almost 3 years ago

@CarolinAramburo @manxbenji @HannahFrankman Those are books, not studies. It's impossible to have a discussion about data I can't actually look at. I can only find one study, and, skimming through it, it seems incredibly irresponsible to interpret it as the original tweet has.
