Nima Dokoohaki

Embeddings power every modern LLM. But what do they actually learn?

This Berkeley (BAIR) paper is one of the clearest reads on how AI systems learn and why embeddings really work.

https://t.co/qj10TMZjnp https://t.co/TuvAeDJz3b

887

151

899

46K

Nima Dokoohaki @nimdoc

about 1 month ago

@catherine_6907 While in Cannes

848

nimdoc retweeted

DAIR.AI

@dair_ai

about 1 month ago

// Survey on Multi-Agent Systems //

The paper traces the landscape from classical paradigms (consensus, distributed control, swarm intelligence, cooperative learning) to foundation-model-enabled MAS (LLM-based planning, role specialization, task decomposition, multi-modal coordination).

It highlights the hard open problems that neither camp has solved on its own: scalability in heterogeneous systems, alignment across agent collectives, efficient knowledge transfer, and real-time adaptation.

Paper: https://t.co/FhDcRjmBUO

Learn to build effective AI agents in our academy: https://t.co/LRnpZN7deE

162

131

12K

nimdoc retweeted

Andrej Karpathy

@karpathy

2 months ago

LLM Knowledge Bases

Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating knowledge (stored as markdown and images). The latest LLMs are quite good at it. So:

Data ingest: I index source documents (articles, papers, repos, datasets, images, etc.) into a raw/ directory, then I use an LLM to incrementally "compile" a wiki, which is just a collection of .md files in a directory structure. The wiki includes summaries of all the data in raw/, backlinks, and then it categorizes data into concepts, writes articles for them, and links them all. To convert web articles into .md files I like to use the Obsidian Web Clipper extension, and then I also use a hotkey to download all the related images to local so that my LLM can easily reference them.

IDE: I use Obsidian as the IDE "frontend" where I can view the raw data, the compiled wiki, and the derived visualizations. Important to note that the LLM writes and maintains all of the data of the wiki, I rarely touch it directly. I've played with a few Obsidian plugins to render and view data in other ways (e.g. Marp for slides).

Q&A: Where things get interesting is that once your wiki is big enough (e.g. mine on some recent research is ~100 articles and ~400K words), you can ask your LLM agent all kinds of complex questions against the wiki, and it will go off, research the answers, etc. I thought I had to reach for fancy RAG, but the LLM has been pretty good about auto-maintaining index files and brief summaries of all the documents and it reads all the important related data fairly easily at this ~small scale.

Output: Instead of getting answers in text/terminal, I like to have it render markdown files for me, or slide shows (Marp format), or matplotlib images, all of which I then view again in Obsidian. You can imagine many other visual output formats depending on the query. Often, I end up "filing" the outputs back into the wiki to enhance it for further queries. So my own explorations and queries always "add up" in the knowledge base.

Linting: I've run some LLM "health checks" over the wiki to e.g. find inconsistent data, impute missing data (with web searches), find interesting connections for new article candidates, etc., to incrementally clean up the wiki and enhance its overall data integrity. The LLMs are quite good at suggesting further questions to ask and look into.

Extra tools: I find myself developing additional tools to process the data, e.g. I vibe coded a small and naive search engine over the wiki, which I both use directly (in a web ui), but more often I want to hand it off to an LLM via CLI as a tool for larger queries.

Further explorations: As the repo grows, the natural desire is to also think about synthetic data generation + finetuning to have your LLM "know" the data in its weights instead of just context windows.

TLDR: raw data from a given number of sources is collected, then compiled by an LLM into a .md wiki, then operated on by various CLIs by the LLM to do Q&A and to incrementally enhance the wiki, and all of it viewable in Obsidian. You rarely ever write or edit the wiki manually, it's the domain of the LLM. I think there is room here for an incredible new product instead of a hacky collection of scripts.
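For readers wondering what a "small and naive search engine over the wiki" might look like, here is a minimal sketch. It is not the tool from the tweet, which is not shown; the function names (`tokenize`, `build_index`, `search`) and the scoring rule (summed term counts) are assumptions made purely for illustration.

```python
import re
from collections import Counter
from pathlib import Path


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(wiki_dir: Path) -> dict[str, Counter]:
    """Map each .md file (path relative to wiki_dir) to its term counts."""
    index = {}
    for path in sorted(wiki_dir.rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        index[path.relative_to(wiki_dir).as_posix()] = Counter(tokenize(text))
    return index


def search(index: dict[str, Counter], query: str, top_k: int = 5) -> list[tuple[str, int]]:
    """Rank files by the summed count of query terms; files scoring 0 are dropped."""
    terms = tokenize(query)
    scores = {name: sum(counts[t] for t in terms) for name, counts in index.items()}
    # Highest score first; ties are broken alphabetically by file name.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(name, score) for name, score in ranked if score > 0][:top_k]
```

Calling `search(build_index(Path("wiki")), "embeddings")` would return the best-matching file names with their scores, which is the sort of compact result an LLM agent could consume as a tool output. The directory name `wiki` is hypothetical.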

59K

107K

21M

nimdoc retweeted

Nicholas Fabiano, MD

@NTFabiano

3 months ago

Addiction to short-form videos reduces brain activity in the frontal lobe weakening the ability to focus.

746

44K

10K

19K

10M

nimdoc retweeted

Tom Dörr

@tom_doerr

4 months ago

Visualizes transformer operations in the browser https://t.co/BxUuXBt29N

193

67K

nimdoc retweeted

Andrej Karpathy

@karpathy

6 months ago

Don't think of LLMs as entities but as simulators. For example, when exploring a topic, don't ask: "What do you think about xyz"? There is no "you". Next time try: "What would be a good group of people to explore xyz? What would they say?" The LLM can channel/simulate many perspectives but it hasn't "thought about" xyz for a while and over time and formed its own opinions in the way we're used to. If you force it via the use of "you", it will give you something by adopting a personality embedding vector implied by the statistics of its finetuning data and then simulate that. It's fine to do, but there is a lot less mystique to it than I find people naively attribute to "asking an AI".

28K

18K

nimdoc retweeted

Vincent Yiu

@vysecurity

6 months ago

Add a colon to your password, ":", because all the stealer logs have colons, so it'll end up splitting your password incorrectly.
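A small sketch of the splitting problem the tweet describes. The `host:user:password` line layout and both parser functions are assumptions for illustration, not the documented format or code of any particular tool. The sketch also shows the limit of the trick: a parser that caps the number of splits keeps the password intact.

```python
def parse_naive(line: str) -> tuple[str, str, str]:
    """Split on every colon and assume exactly host, user, password."""
    fields = line.split(":")
    # Anything after the third field is silently dropped.
    return fields[0], fields[1], fields[2]


def parse_maxsplit(line: str) -> tuple[str, str, str]:
    """Split on the first two colons only, so the password keeps its colons."""
    host, user, password = line.split(":", 2)
    return host, user, password


# Made-up record in the assumed format; the real password is "pa:ss".
line = "example.com:alice:pa:ss"
naive_result = parse_naive(line)        # password field is truncated to "pa"
maxsplit_result = parse_maxsplit(line)  # password field is the full "pa:ss"
```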

249

50K

17K

nimdoc retweeted

Matthias Niessner

@MattNiessner

7 months ago

The hot topic at #ICCV2025 was World Models. They come in different flavors (interactive video models, neural simulators, reconstruction models, etc.), but the overarching goal is clear: Generative AI that predicts and simulates how the real world works.

456

147

70K

nimdoc retweeted

Tom Yeh

@ProfTomYeh

8 months ago

Evolution of Deep Learning by Hand ✍️ As my tribute to Geoff Hinton's Nobel Prize, I drew this animation to illustrate the key idea behind Hinton's major contributions to deep learning over the years, with artistic liberty. ---- 100% original, made by hand ✍️ Join 40k readers of my newsletter: https://t.co/fFt8roc8D9

357

111K

nimdoc retweeted

Chester

@chesterzelaya

8 months ago

< Choosing a Vision Backbone >

your model’s backbone is its perspective

pick ResNet, and it sees in edges
pick a ViT, and it sees in patches

the backbone decides how your model thinks

here are some of the most practical backbones and when you should choose them, from the paper "Battle of the Backbones" (2023):

> ResNet - good for fast prototyping, small models, and edge devices
> ConvNeXt - great all-purpose backbone; strong for detection & segmentation
> Swin Transformer (V2) - best for large-scale detection, segmentation, and high-res inputs
> ViT (Vision Transformer) - good when you have huge datasets; less bias, more global context
> CLIP - best for vision-language, zero-shot, and retrieval tasks
> DINO / MoCo / MAE (SSL) - great when you have little or no labeled data
> MiDaS - surprisingly strong if you care about depth, geometry, or robotics perception
> Stable Diffusion Encoder - useful for creative or aesthetic tasks; not for accuracy-critical CV
> EfficientNet / RegNet / ResNet-18 - good lightweight options for edge or mobile deployment

981

111

859

59K

nimdoc retweeted

Deena Mousa

@deenamousa

8 months ago

In 2016 Geoffrey Hinton said “we should stop training radiologists now” since AI would soon be better at their jobs.

He was right: models have outperformed radiologists on benchmarks for ~a decade.

Yet radiology jobs are at record highs, with an average salary of $520k.

Why? https://t.co/yEbzdMyDmK

199

442

nimdoc retweeted

Yann LeCun

@ylecun

8 months ago

Code World Model: producing code by imagining the effect of executing instructions and planning instructions that produce the desired effect.

166

568

241K

nimdoc retweeted

Jean-Rémi King @JeanRemiKing

9 months ago

Can AI help understand how the brain learns to see the world? Our latest study, led by @JRaugel from FAIR at @AIatMeta and @ENS_ULM, is now out! 📄 https://t.co/y2Y3GP3bI5 🧵 A thread:

318

244K

nimdoc retweeted

@_akhaliq

11 months ago

Energy-Based Transformers are Scalable Learners and Thinkers

877

125

683

150K

nimdoc retweeted

Rohan Paul

@rohanpaul_ai

11 months ago

beautiful explanations of self-attention 💡 wild that one little math hack made all these possible.

486

560

67K

nimdoc retweeted

Guido van Rossum

@gvanrossum

about 1 year ago

The official trailer for the official Python documentary is out!!! ❤️to Ida.

524

947

177K

nimdoc retweeted

Gergely Orosz

@GergelyOrosz

about 1 year ago

“AI Engineering is just software engineering with AI models thrown into the stack”

- @abacaj in the book AI Engineering by @chipro

Totally agree - and it’s why learning how to integrate LLMs is such a big win for devs! https://t.co/4AZaGcQIih

516

264

112K

nimdoc retweeted

Alfredo Canziani

@alfcnz

about 1 year ago

Training of a 2 → 100 → 2 → 5 fully connected ReLU neural net via cross-entropy minimisation.
• it starts outputting small embeddings
• around epoch 300 learns an identity function
• takes 1700 epochs more to unwind the data manifold
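To make the 2 → 100 → 2 → 5 shape concrete, here is a rough pure-Python sketch of one forward pass and the cross-entropy loss for a single point. It does no training and is not the code behind the tweet. Placing the ReLU only after the 100-unit layer, the initialisation scale, the sample input, and the label are all assumptions for illustration.

```python
import math
import random


def linear(x, weights, bias):
    """Affine map: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in x]


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Small uniform weights and zero biases (an arbitrary choice for this sketch)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x, params):
    """Return (2-D embedding, 5 class probabilities) for one 2-D input point."""
    (w1, b1), (w2, b2), (w3, b3) = params
    hidden = relu(linear(x, w1, b1))            # 2 -> 100, then ReLU (assumed placement)
    embedding = linear(hidden, w2, b2)          # 100 -> 2, the embedding being visualised
    probs = softmax(linear(embedding, w3, b3))  # 2 -> 5 class probabilities
    return embedding, probs


def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[label])


rng = random.Random(0)
params = [init_layer(2, 100, rng), init_layer(100, 2, rng), init_layer(2, 5, rng)]
embedding, probs = forward([0.5, -0.25], params)  # arbitrary sample point
loss = cross_entropy(probs, label=3)              # arbitrary sample label
```

The 2-D `embedding` is the quantity whose evolution over epochs the tweet describes; a training loop would repeatedly lower `loss` by gradient descent on `params`.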

357

170

35K
