for domain specific agentic flows, SLM >> LLM
- reduces cost and latency
- agents need only a narrow subset of an LLM's capabilities, with a strict output format
https://t.co/Ms35QB8xPK
https://t.co/QhJqzDvLCL
I’ll add a few other random thoughts
- specialized re-rankers are going away
- more interest in MaxSim functions for candidate retrieval
- more interest in learned sparse reps
- bm25 is a baseline
- models that can output multiple reps in a single inference (eg m3)
- omnimodels
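A rough sketch of what a max sim (late-interaction) score looks like, for the candidate-retrieval point above. It assumes each query and document is already encoded as a list of per-token vectors; the function names and the toy vectors are made up for illustration and do not come from any particular library.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def max_sim(query_vecs, doc_vecs):
    """For each query token vector, keep its best match among the
    document token vectors, then sum those per-token maxima."""
    return sum(max(cosine(q, d) for d in doc_vecs) for q in query_vecs)

# Toy data (hypothetical): 2 query token vectors, 2 candidate documents.
query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "doc_a": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    "doc_b": [[1.0, 1.0]],
}

scores = {doc_id: max_sim(query, vecs) for doc_id, vecs in docs.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

doc_a ranks first because every query token has an exact match in it, while doc_b only offers one averaged vector.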
finetuning and LoRA can cause catastrophic forgetting. context-based tuning methods like prefix tuning cannot learn new attention patterns. this might be a feature, since the model won't lose its pretrained knowledge.
https://t.co/YmDiuNtaP2
in software development, there are two types of skills:
non-transferable skills:
- programming languages
- coding tools, IDEs, deployment tools
- frameworks, build systems, etc.
transferable skills:
- system design
- first-principles thinking
- project management
- unblocking yourself
- relentless focus on impact
- documentation and collaboration
- managing failures
- doing the right thing
beginners focus on the first, experienced engineers focus on the latter
Stocks don’t hit major peaks and bottoms every month, quarter, or year; it happens more like once in a decade. For example (data from 2014-2024):
1- Tata Motors-
2015- 587
2020- 65
2024- 1066 (so far)
2- Kotak Mahindra Bank
2014- 336
2021- 2171
2024- 1544 (so far)
3- Motilal
2014- 22
2018- 394
2020- 120
2024- 690 (so far)
4- ACE
2014- 12
2018- 200
2020- 32
2024- 1695 (so far)
Stocks give you enough opportunities and time to accumulate as well as to sell. If you understand valuation, cyclicality of sectors, quality of management, growth prospects, rerating/derating concepts, you can comfortably earn much more than the benchmarks.
@jobergum Agreed. But this issue will not occur with nDCG because, by definition, it requires judgements for all the retrieved documents, for each source. The problem will occur with metrics like categorical accuracy. Am I missing something here?
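A minimal sketch of nDCG to make the point above concrete. It uses the linear-gain variant (one common formulation; others use 2^gain - 1). The function names, the `strict` flag, and the toy judgements are assumptions for illustration, not taken from any evaluation library. With `strict=True` an unjudged retrieved document raises an error; with `strict=False` it is counted as gain 0.

```python
import math

def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))

def ndcg_at_k(ranked_ids, judgements, k, strict=True):
    """nDCG@k for one query.

    ranked_ids: retrieved document ids, best first.
    judgements: dict of doc id -> graded relevance (0 = not relevant).
    strict:     if True, refuse to score documents that have no judgement.
    """
    top = ranked_ids[:k]
    missing = [doc_id for doc_id in top if doc_id not in judgements]
    if strict and missing:
        raise ValueError(f"no judgement for retrieved docs: {missing}")
    gains = [judgements.get(doc_id, 0) for doc_id in top]
    ideal_dcg = dcg(sorted(judgements.values(), reverse=True)[:k])
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0

# Toy judgements (hypothetical) for a single query.
judgements = {"d1": 3, "d2": 2, "d3": 0}
perfect = ndcg_at_k(["d1", "d2", "d3"], judgements, k=3)
swapped = ndcg_at_k(["d2", "d1", "d3"], judgements, k=3)
lenient = ndcg_at_k(["d9", "d1"], judgements, k=2, strict=False)
print(perfect, swapped, lenient)
```

The strict mode is what makes missing judgements visible instead of silently scoring them as non-relevant.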
Super simple implementation of a vectordb. Looking for a few folks to build this out further as a side project.
Code: https://t.co/FlBDO14oy7
Blog: https://t.co/nOjcrsw2So
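For a sense of how small a "super simple" vector store can be, here is a brute-force sketch. It is not the code from the linked repo; the class name `TinyVectorDB`, its methods, and the toy vectors are all hypothetical, and it assumes exact cosine-similarity search over an in-memory dict.

```python
import math

class TinyVectorDB:
    """In-memory store with exact (brute-force) cosine search."""

    def __init__(self):
        self._items = {}  # item id -> (vector, payload)

    def add(self, item_id, vector, payload=None):
        self._items[item_id] = (list(vector), payload)

    def search(self, query, top_k=3):
        """Return up to top_k (item_id, score) pairs, best first."""
        scored = [
            (self._cosine(query, vec), item_id)
            for item_id, (vec, _payload) in self._items.items()
        ]
        scored.sort(reverse=True)
        return [(item_id, score) for score, item_id in scored[:top_k]]

    @staticmethod
    def _cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm if norm else 0.0  # zero vectors match nothing

db = TinyVectorDB()
db.add("a", [1.0, 0.0], payload="first doc")
db.add("b", [0.0, 1.0], payload="second doc")
db.add("c", [1.0, 1.0], payload="third doc")

hits = db.search([1.0, 0.1], top_k=2)
print(hits)
```

Every query scans all stored vectors, which is fine for a side project and is where an approximate index would slot in later.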
re: finetuning LLMs
0/ you don't always need it. try everything else first.
1/ lots of data + limited compute -> go for PEFT
2/ limited data + lots of compute -> full fine-tuning
3/ limited data + limited compute -> `import openai` ;)
Finetuning + RAG
- retrieve relevant context and fine-tune on that context to generate the final output
- best of both worlds
- needs smart data generation methods
*Fine Tuning*
- good for teaching complex instructions. not good for teaching base model knowledge (I don't agree with this)
- improves model efficiency and reduces tokens
- knowledge transfer from large model to smaller models