I have worn many hats. I’m a herder of no cats. I'm a #healthcare#DataScientist who not-so-secretly also loves rocks and volcanoes. Opinions are just mine.
Now I know why this A.M. there was a door out of order in a brand new train. I've worked on starting up large scale eng projects, it's not that different from sending SW to production, 💩 happens even if you think you've thoroughly tested your system. The crew will sort it out
Hey @Caltrain I know we're about to travel on the new EMUs, but the old trains need some "love", in particular their RESTROOMS, they stink big time!! Riding on any car forces us to be disgusted for a good 45 minutes. Can the crew at least throw some disinfectant in those toilets?
Trump is selling bibles, and Fox is blaming the bridge collapse on our “broken border.”
There’s a big sector of our society that has absolutely lost its collective marbles.
Useful Claude 3 trick to help you visualize code better.
Paste some code in, and ask it to make a flowchart.
Then, paste the flowchart code into a Mermaid viewer, and you'll get a nice, understandable visualization of your code!
RAPTOR - a new tree-structured advanced RAG technique 🔥
A big issue with naive top-k RAG is that it retrieves low-level details best suited at answering questions over specific facts in the document.
But it struggles with any questions over higher-level context.
RAPTOR introduces a new tree-structured technique, which hierarchically clusters/summarizes chunks into a tree structure containing both high-level and low-level pieces. This lets you dynamically surface high-level / low-level context depending on the question.
For those of you who remember this is basically the grown-up version of gpt index v0, our beloved tree index 🌲
Thanks to MarouaneMaatouk and @LoganMarkewich, you can now access this as a LlamaPack in @llama_index 👇
Pack: https://t.co/BRWfIykPew
Notebook: https://t.co/98pxhT8AxR
Source paper: https://t.co/4AjPiTp47s
A lot of newer RAG techniques involve some form of query analysis - taking the raw user query and converting it into a more optimized version
We've added a bunch of new docs on this, including implementations of a bunch of techniques as well as some how-to guides
💼 Retrieve SEC Filings with LangChain
RAG infra for financial data is hard to get right. We need to parse tables, normalize entities, hybrid search, and experiment a lot.
With @Kaydotai and @CybersynInc, you can search SEC embeddings with an API ⚡️
https://t.co/pG56Yf8YjA
Radiology-Llama2: Best-in-Class LLM for Radiology
Presents an LLM based on Llama 2 tailored for radiology. It's tuned on a large dataset of radiology reports to generate coherent and clinically useful impressions from radiology findings.
This an interesting use case of how to leverage instruction tuning for what I consider a hard and knowledge-intensive domain.
A few models are assessed and compared including the proposed model, Llama 2, Dolly, ChatGPT, and Alpaca. It seems that Radiology-Llama2 produces the top performance on a couple of benchmarks.
Great insights included!
https://t.co/6GUcRJfTho
This talk is the most complete version of my thinking around Large Language Models to date, including thoughts on personal AI ethics, practical applications and the enormous impact we are already feeling from Llama 2
Video, slides, transcript & links: https://t.co/hluBrcNcYt
Matt Mochary is a Silicon Valley legend.
He's coached the founders of OpenAI, Notion, Rippling, Robinhood, Coinbase, Reddit, @naval, and many others.
His entire course is open-sourced, even the templates. Here's a link 👇
https://t.co/deeeu5WmCj
It’s still important to ship fast to learn.
But you won’t learn anything if your product is too buggy.
The key is to cut product scope so much that you actually can ship something bug-free.
Cut every non-essential feature.
Ship fast, with small scope, high quality.
"Automated systems are often being used to extend bureaucracy, adding additional places to deflect responsibility." A great talk and thread on AI ethics and accountability: https://t.co/5IeNRXZpcB
“Race is not a biologic proxy,” said Wright. “Race is a social construct and has no place being embedded in a clinical guideline like this.” 👏👏👏 https://t.co/BldNORwQEs via @statnews
LLM Agent for Alzheimer’s Disease Infodemiology
-Data collection, processing, analysis in autonomous manner
-Trend analysis, topic mapping, etc related to Alzheimer’s Disease across new sources for public health research
-Integrates dynamic visualization
https://t.co/w203c4vKWg
💰Pre-seed and seed investors: there are underrepresented founders looking to add us to their cap tables but don't know who we are. Leave your deets for them to reach out.
💁🏽♀️Investor type:
🔮Thesis:
🌱Stage:
💸Check-size:
🙌🏽Leads or follows:
🌎Geo:
✉️Contact: