🚀 Checkout our ICLR 2024 Spotlight Poster #165 today (Session 3)
🩺 "MOTOR: A Time-to-Event Foundation Model For Structured Medical Records"
✨ Highlights:
- The first TTE foundation model for structured EHRs
- Improves SOTA TTE by 4.6% & boosts label efficiency up to 95%
- TTE pretraining scales to 16k tasks & reduces GPU memory usage by ~35%
- Model weights available for research use!
🔍 Tutorial: https://t.co/1zKF05JS6A
🤖 Model: https://t.co/Ka9YUHNJNG
#ICLR2024 #AI #Healthcare
Korl is available for public beta!
We’ve built a platform that auto-generates consumable product presentations in seconds, each optimized for a common use case and audience.
Start with a 2-week free trial from our site: https://t.co/zQNRwa3Spr
Fun stuff - "In-Context Learning Creates Task Vectors" (via @_akhaliq) isolates something (more or less) resembling the "program key" described below as an internal attention state, demonstrating with state substitution/patching experiments https://t.co/cWBc33ql7v
My interpretation of prompt engineering is this:
1. A LLM is a repository of many (millions) of vector programs mined from human-generated data, learned implicitly as a by-product of language compression. A "vector program" is just a very non-linear function that maps part of the latent space unto itself.
2. When you're prompting, you're fetching one of these programs and running it on an input -- part of your prompt serves as a kind of "program key" (as in database key) and part serves as program argument(s). Like, in "write this paragraph in the style of Shakespeare: {my paragraph}", the part "write this paragraph in the stye of X: Y" is a program key, with arguments X=Shakespeare and Y={my paragraph}.
3. The program fetched by your key may or may not work well for the task at hand. There's no reason why it should be optimal. There are lots of related programs to choose from.
4. Prompt engineering represents a search over many keys in order a find a program that is empirically more accurate for what you're trying to do. It's no different than trying different keywords when searching for a Python library.
5. Everything else is unnecessary anthropomorphism on the part of the prompter. You're not talking to a human who understands language the way you do. Stop pretending you are.
Nice @NetflixEng blog on Causal ML for title card art (people love a single expressive face) https://t.co/aCQLhZevQT
The science is finally catching up to @joeveix "Your Pretty Face is Going to Sell" https://t.co/44ZUAFdfs3
Can we get rigorous guarantees on predictions w/o assumptions on the model OR the data distribution? Incredibly, yes - excellent intro to Conformal Prediction by @ml_angelopoulos & @stats_stephen: https://t.co/sh5xeFjjhA
Paper itself is an absolute treat: great diagrams, abundant code samples, and as a nice bonus some historical context around the origins and development of the research ideas.
Credit: saw it in @deliprao's "AI Research & Strategy" newsletter https://t.co/uz2PU219nb
...furthermore: outperforms XGBoost, does Lasso in one-pass, seems not to rely on nearest-neighbor.
Future work: "...an intriguing possibility where we might be able to reverse engineer the Transformer to obtain better learning algorithms." (!)
Easy to get "breakthrough fatigue" in ML recently, but in this work the NN learns *how to learn* linear regression, decision trees, 2-layer ReLU nets 😲
"...we train Transformer models to discover algorithms for different learning problems."
https://t.co/xuY1SecOrP
Day 1 of #Illuminate22 starts tomorrow! Check out some of the talks that will kick off at 1:00 am PT/4:00 am ET including @SumoLogic's general session on #reliability management and #security in one platform as well as speakers from @eSentire and @Zyston. https://t.co/d0j1fBzNPY
ML on the M1 has magically transported me back to my youth: a hellscape of Python toolchain chaos with insane fixes like “find a random .py file in your /lib and swap the order of two import statements” https://t.co/vZ4VlYSfDF
@hkarthik@Carnage4Life is there an innocuous technical / data reason why (anecdotally) users overwhelmingly experience asymmetric “slippage” (ie, longer waits) in the predictions?
Pondering the TBs of data crunched in an ultramodern cloud ML platform in order to power cutting-edge causal models targeting tightly business-aligned KPIs in one of the world’s most esteemed tech firms, all leading up to this email.
I’m hiring for an ML Platform Engineer role in the SF Bay Area - we’ve got some interesting problems, please let me know if you’d want to learn more! https://t.co/e5hF1pk7X8