Ana Trisovic

@atrisovic

Research Scientist at MIT | previously with @HarvardBiostats & @IQSS, @UChicago, @Cambridge_Uni and @LHCbExperiment @CERN

Cambridge, MA

Joined May 2009

483 Following

522 Followers

429 Posts

atrisovic retweeted

Neil Thompson @ProfNeilT

6 months ago

Our new paper breaks down algorithm progress in AI (which is faster than GPU progress). Surprisingly, much of the gains have only gone to the largest models, which is bad news for AI efficiency improvement. @MITFutureTech https://t.co/vf0Bw4Zyrt

atrisovic retweeted

fly51fly @fly51fly

6 months ago

[LG] On the Origin of Algorithmic Progress in AI H Gundlach, A Fogelson, J Lynch, A Trisovic... [MIT FutureTech, CSAIL] (2025) https://t.co/BkfMRRuZ6m

atrisovic retweeted

BSC-CNS

@BSC_CNS

11 months ago

📢#BSCSeminar: The #AI Frontier: Transformative role of foundation models across scientific disciplines 🗣@atrisovic, @MIT 📆 16 July ➡ https://t.co/uO1QaBS67S @SOMM_alliance

Ana Trisovic @atrisovic

about 1 year ago

Excited to be at Dzmitry Bahdanau’s #ICLR2025 talk on the origin story of the attention mechanism, a foundational idea (and Test of Time runner-up) that shaped how we build and understand large language models.

atrisovic retweeted

arXiv.org @arxiv

about 1 year ago

Audio summaries are coming to arXiv! 👂📑 We've partnered with @Science_Cast to pilot 60 second #AI generated audio summaries, starting with astro-ph.HE. More info on this partnership on the blog: https://t.co/KV47dzYXni

atrisovic retweeted

CERNpress @CERNpress

about 1 year ago

[Press Release] The LHC experiment collaborations at CERN receive Breakthrough Prize The Breakthrough Prize in Fundamental Physics was awarded to the @ALICEexperiment , @ATLASexperiment, @CMSExperiment and @LHCbExperiment. Find out more: https://t.co/BOnBjV96PH

atrisovic retweeted

Breakthrough

@brkthroughprize

about 1 year ago

CERN's experiments are global efforts. The 2025 Breakthrough Prize in Fundamental Physics honors over 13,000 researchers whose labors have led to the precise description of the Higgs mechanism, the discovery of dozens of new particles, analysis of rare processes and matter-antimatter asymmetry and exploration of nature at the shortest distances and most extreme conditions. https://t.co/OSDzo6jMHF @CERN

atrisovic retweeted

MIT Open Learning

@mitopenlearning

about 1 year ago

From student in Serbia to researcher at @MIT! Get to know @atrisovic: https://t.co/RgkOjyrxyt

atrisovic retweeted

MIT OpenCourseWare

@MITOCW

about 1 year ago

"Being persistent and investing in yourself is the best thing a young person can do." @atrisovic's educational journey has been nothing short of transformative! She began learning Python using MIT OpenCourseWare in 2012 from her home in Serbia, eventually leading her to her current position at the FutureTech Lab at @MIT_CSAIL. Learn more about Ana Trišović's story ➡️ https://t.co/Ke1SeRXx0X (Photo courtesy of Ana Trišović.) @mitopenlearning

atrisovic retweeted

Massachusetts Institute of Technology (MIT)

@MIT

about 1 year ago

Computer scientist Ana Trišović was a student in Serbia when she discovered a free online course from MIT. “I instantly fell in love with Python the moment I took that course. I have such a soft spot for OpenCourseWare — it shaped my career,” she says. https://t.co/fmtUGZ4QsM

Ana Trisovic @atrisovic

about 1 year ago

Very excited to share my story about MIT OpenCourseWare - right here at MIT! Thank you to Lauren Thacker for a great interview and thoughtful article. https://t.co/upXPFKtWeV

atrisovic retweeted

Jeff Dean

@JeffDean

about 1 year ago

Want to check out the source for the "AlexNet" paper? Google has made the code from Alex Krizhevsky, @ilyasut, and @geoffreyhinton's seminal "ImageNet Classification with Deep Convolutional Neural Networks" paper public, in partnership with the Computer History Museum. As I said in the press release, "Google is delighted to contribute the source code for the groundbreaking AlexNet work to the Computer History Museum". https://t.co/62Ilp7jaeT

atrisovic retweeted

Insajder TV

@Insajder_net

over 1 year ago

Rally in support of students in Boston (VIDEO)

atrisovic retweeted

Dr. Peter Slattery @PeterSlattery1

almost 2 years ago

🧵 📢 What are the risks from Artificial Intelligence?

We present the first-ever AI Risk Repository: a comprehensive living database of 700+ risks extracted, with quotes and page numbers, from 43(!) existing frameworks.
Read and explore here: https://t.co/RJXlNdFPBv

To categorize the identified risks, we adapt two existing frameworks into taxonomies.

Our Causal Taxonomy categorizes risks based on three factors: the Entity involved, the Intent behind the risk, and the Timing of its occurrence.

Our Domain Taxonomy categorizes AI risks into 7 broad domains, and 23 more specific subdomains. For example, 'Misinformation' is one of the domains, while 'False or misleading information' is one of its subdomains.

💡 Four insights from our analysis:

1️⃣ 51% of the risks extracted were attributed to AI systems, while 34% were attributed to humans. Slightly more risks were presented as being unintentional (37%) than intentional (35%). Six times more risks were presented as occurring after (65%) than before deployment (10%).

2️⃣ Existing risk frameworks vary widely in their scope. On average, each framework addresses only 34% of the risk subdomains we identified. The most comprehensive framework covers 70% of these subdomains. However, nearly a quarter of the frameworks cover less than 20% of the subdomains.

3️⃣ Several subdomains, such as *Unfair discrimination and misrepresentation* (mentioned in 63% of documents); *Compromise of privacy* (61%); and *Cyberattacks, weapon development or use, and mass harm* (54%) are frequently discussed.

4️⃣ Others such as *AI welfare and rights* (2%), *Competitive dynamics* (12%), and *Pollution of information ecosystem and loss of consensus reality* (12%) were rarely discussed.

🔗 How can you engage?

Website: https://t.co/RlLcF8cRLw
Preprint: https://t.co/0xuWG5ilPh
Database: https://t.co/rXT8qpn3p6
Feedback: https://t.co/SaJbNBcwkJ

🙏 Please help us spread the word by sharing with anyone relevant!

Thanks to everyone involved: @aksaeri, @StephenLCasper @mnoetel @ProfNeilT @RistoUuk @soroushjp @jmsdao

atrisovic retweeted

Andrej Karpathy

@karpathy

almost 2 years ago

📽️ New 4 hour (lol) video lecture on YouTube:
"Let’s reproduce GPT-2 (124M)"
https://t.co/QTUdu8b0qh

The video ended up so long because it is... comprehensive: we start with empty file and end up with a GPT-2 (124M) model:
- first we build the GPT-2 network
- then we optimize it to train very fast
- then we set up the training run optimization and hyperparameters by referencing GPT-2 and GPT-3 papers
- then we bring up model evaluation, and
- then cross our fingers and go to sleep.
In the morning we look through the results and enjoy amusing model generations. Our "overnight" run even gets very close to the GPT-3 (124M) model. This video builds on the Zero To Hero series and at times references previous videos. You could also see this video as building my nanoGPT repo, which by the end is about 90% similar.

Github. The associated GitHub repo contains the full commit history so you can step through all of the code changes in the video, step by step.
https://t.co/BOzkxQ8at2

Chapters.
On a high level Section 1 is building up the network, a lot of this might be review. Section 2 is making the training fast. Section 3 is setting up the run. Section 4 is the results. In more detail:
00:00:00 intro: Let’s reproduce GPT-2 (124M)
00:03:39 exploring the GPT-2 (124M) OpenAI checkpoint
00:13:47 SECTION 1: implementing the GPT-2 nn.Module
00:28:08 loading the huggingface/GPT-2 parameters
00:31:00 implementing the forward pass to get logits
00:33:31 sampling init, prefix tokens, tokenization
00:37:02 sampling loop
00:41:47 sample, auto-detect the device
00:45:50 let’s train: data batches (B,T) → logits (B,T,C)
00:52:53 cross entropy loss
00:56:42 optimization loop: overfit a single batch
01:02:00 data loader lite
01:06:14 parameter sharing wte and lm_head
01:13:47 model initialization: std 0.02, residual init
01:22:18 SECTION 2: Let’s make it fast. GPUs, mixed precision, 1000ms
01:28:14 Tensor Cores, timing the code, TF32 precision, 333ms
01:39:38 float16, gradient scalers, bfloat16, 300ms
01:48:15 torch.compile, Python overhead, kernel fusion, 130ms
02:00:18 flash attention, 96ms
02:06:54 nice/ugly numbers. vocab size 50257 → 50304, 93ms
02:14:55 SECTION 3: hyperparameters, AdamW, gradient clipping
02:21:06 learning rate scheduler: warmup + cosine decay
02:26:21 batch size schedule, weight decay, FusedAdamW, 90ms
02:34:09 gradient accumulation
02:46:52 distributed data parallel (DDP)
03:10:21 datasets used in GPT-2, GPT-3, FineWeb (EDU)
03:23:10 validation data split, validation loss, sampling revive
03:28:23 evaluation: HellaSwag, starting the run
03:43:05 SECTION 4: results in the morning! GPT-2, GPT-3 repro
03:56:21 shoutout to llm.c, equivalent but faster code in raw C/CUDA
03:59:39 summary, phew, build-nanogpt github repo

atrisovic retweeted

Yann LeCun

@ylecun

about 2 years ago

To qualify as Science a piece of research must be correct and reproducible. To be correct and reproducible, it must be described in sufficient detail in a publication. To be 'published' (to receive a seal of approval) the publication must be checked for correctness by reviewers. To be reproduced, the publication must be widely available to the community and sufficiently interesting. If you do research and don't publish, it's not Science. Without peer review and reproducibility, chances are your methodology was flawed and you fooled yourself into thinking you did something great. No one will ever hear about your work. No one will pick it up and build on top of it. No one will build new technology and products with it. Your work will have been in vain. You'll die bitter and forgotten. If you never published your research but somehow developed it into a product, you might die rich. But you'll still be a bit bitter and largely forgotten.

atrisovic retweeted

JOSE @JOSE_TheOJ

about 2 years ago

Just published in JOSE: 'A practical guide to climate econometrics: Navigating key decision points in weather and climate data analysis' https://t.co/0h39zDzCGq

atrisovic retweeted

MIT CSAIL

@MIT_CSAIL

about 2 years ago

Today is a big day in open-source history: 31 years ago @CERN released the source code for the World Wide Web for anyone to use. v/@BTGroup

atrisovic retweeted

CERN

@CERN

about 2 years ago

#OnThisDay in 1993, CERN put the World Wide Web in the public domain, later made available with an open licence, allowing the web to flourish. Here we see Sir Tim Berners-Lee's proposal for the World Wide Web, revised back in May 1990. The World Wide Web will be one of the topics of the next #CERN70 public event, “CERN – An extraordinary human endeavour”, where we will also talk about inclusiveness, collaboration, #OpenScience, #AI, machine learning and more. 📍CERN Science Gateway 📆 19 May, 17:00 – 19:00 CEST Stay tuned to find out more about the event and our special guests. 😉

atrisovic retweeted

Mauricio Tec @ ICLR 2026 🇧🇷 @mauriciogtec

about 2 years ago

I am so excited to be involved in this workshop! The call for contributions is out!
