Neil Mallinar @nmallinar - Twitter Profile

Pinned Tweet

almost 2 years ago

Grokking modular arithmetic is widely studied for the seemingly unique emergent abilities of neural networks. Instead, we find that iteratively solving a kernel machine and estimating the Average Gradient Outer Product (AGOP) recovers this phenomenon identically:

Photo: https://t.co/Jib6GjN3TI

Stat.ML Papers @StatMLPapers

almost 2 years ago

Emergence in non-neural models: grokking modular arithmetic via average gradient outer product https://t.co/bavEySi9UW

0

44

10

23

15K

2

85

15

51

18K

Neil Mallinar @nmallinar

2 months ago

@learning_mech @ashleydzhang @iclr_conf on-the-fly poster with a KidPix emulator??

0

1

0

34

nmallinar retweeted

Jamie Simon @learning_mech

2 months ago

@iclr_conf shoutout to @eaboix and @nmallinar for actually doing the work that made this paper happen, and Misha Belkin for advising. paper here: https://t.co/FKsDjCRv84

0

9

3

8

2K

nmallinar retweeted

Jamie Simon @learning_mech

2 months ago

coauthors + I couldn't get a poster printed in Rio in time, so here's me ad-libbing at @iclr_conf

11

313

14

33

26K

Neil Mallinar @nmallinar

2 months ago

@learning_mech @iclr_conf 🙏🏾🙏🏾🙏🏾 thank you Jamie & many apologies!! Looks like a fun time though!

0

1

0

545

nmallinar retweeted

Jamie Simon @learning_mech

2 months ago

1/ Deep learning is going to have a scientific theory. We can see the pieces starting to come together, and it's looking a lot like physics! We're releasing a paper pulling together these emerging threads and giving them a name: learning mechanics. 🔨 https://t.co/92nSIHameW 🔧

Photo: https://t.co/3cshMD33bl

54

2K

291

2K

306K

Neil Mallinar @nmallinar

12 months ago

Shout-out to my amazing coauthors @dbeagleholeCS @BusyZhu @PartheP Adit and Misha!! Check out all of their other papers they are brilliant researchers

0

2

0

147

Neil Mallinar @nmallinar

12 months ago

Super excited to share that we have an Oral presentation for this paper next week at ICML! It will be on Tuesday at 10am (Oral 1E) in West Ballroom D, I'll be presenting 4th at 10:45am :) Our poster will be on Wednesday at 11am and I encourage you to stop by and chat!

Photo: https://t.co/4DYA3PM8xj

1

18

3

2

1K

Neil Mallinar @nmallinar

about 1 year ago

@emanouks 🙏🏾❤️

0

30

Neil Mallinar @nmallinar

about 1 year ago

Happy to share that we got a spotlight at ICML for this work, see y'all there!!

Neil Mallinar @nmallinar

almost 2 years ago

Grokking modular arithmetic is widely studied for the seemingly unique emergent abilities of neural networks. Instead, we find that iteratively solving a kernel machine and estimating the Average Gradient Outer Product (AGOP) recovers this phenomenon identically:

2

85

15

51

18K

0

14

0

1

467

Neil Mallinar @nmallinar

over 1 year ago

@matthistory Maybe they are going to do a heist together, or sing karaoke! I cannot wait to find out

0

1

0

59

Neil Mallinar @nmallinar

over 1 year ago

@matthistory I'd like to order one duck riding on top of a horse please

1

2

0

79

Neil Mallinar @nmallinar

over 1 year ago

@emanouks @tacobellhoarder Banana

0

2

0

33

nmallinar retweeted

amirhesam abedsoltan @Amirhesam_A

over 1 year ago

Two generalization regimes in ICL: (1) context-scaling, where performance improves with more in-context examples, and (2) task-scaling, where performance improves with more pre-training tasks. While MLPs show task-scaling but not context-scaling, https://t.co/fWBMHhrqs7

1

3

2

0

343

Neil Mallinar @nmallinar

almost 2 years ago

Consider my beautiful day uninterrupted 🥲 Alas the research work calls me back

0

3

0

215

Neil Mallinar @nmallinar

almost 2 years ago

@thdbui @pfau Anyway I enjoyed your paper and would love to get a chance to discuss these topics further sometime and hear more about your observations!

0

2

0

107

Neil Mallinar @nmallinar

almost 2 years ago

@thdbui @pfau Another difference we see compared to grokking in low-rank settings like k-parity is that the circulant features we learn for modular arithmetic (MA) are full rank! It wasn't obvious to us that you could do MA with kernels as the MA experiments we see all use neural nets still

1

0

109

Neil Mallinar @nmallinar

almost 2 years ago

@avishvj In a hole in the ground there lived a kernel...

0

83

nmallinar retweeted

Daniel Beaglehole @dbeagleholeCS

almost 2 years ago

Iterating kernel ridgeless regression with AGOP computation groks modular arithmetic… and this grokking is remarkably similar to the phenomenon in neural networks. I found these results very surprising!

1

31

3

6

3K

Neil Mallinar @nmallinar

almost 2 years ago

Please check out our paper here: https://t.co/A4Vz2d1X2S This was an amazing collaboration with Daniel Beaglehole (@dbeagleholeCS), Libin Zhu (@BusyZhu), Adit Radhakrishnan, Parthe Pandit (@PartheP), and Misha Belkin.

0

5

0

352

Neil Mallinar @nmallinar

almost 2 years ago

In our setting, grokking appears to occur solely due to feature learning. We decouple from neural architectures and gradient-descent based optimization by using kernels equipped with feature learning through AGOP and find many of the same phenomena as observed in neural networks.

1

3

0

235
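The thread above describes alternating between solving a kernel machine and estimating the Average Gradient Outer Product (AGOP). A minimal sketch of such a loop follows. It is not the authors' code: the Gaussian kernel with a learned metric M, the one-hot encoding of the inputs, the small ridge term, the max-normalization of M, and every function name here are assumptions made for illustration, and the sketch does not claim to reproduce the grokking results.

```python
import numpy as np


def modular_addition_data(p):
    """All pairs (a, b): inputs are [one_hot(a), one_hot(b)], labels one_hot((a + b) mod p)."""
    a, b = np.meshgrid(np.arange(p), np.arange(p), indexing="ij")
    a, b = a.ravel(), b.ravel()
    eye = np.eye(p)
    X = np.hstack([eye[a], eye[b]])  # shape (p*p, 2p)
    Y = eye[(a + b) % p]             # shape (p*p, p)
    return X, Y


def gaussian_kernel(X, Z, M, bandwidth=1.0):
    """K[i, j] = exp(-(x_i - z_j)^T M (x_i - z_j) / (2 * bandwidth^2)). Kernel choice is an assumption."""
    diffs = X[:, None, :] - Z[None, :, :]                # (n, m, d)
    sq = np.einsum("nmd,de,nme->nm", diffs, M, diffs)    # squared M-distances
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))


def agop(X, alpha, M, bandwidth=1.0):
    """Average over training points of J(x)^T J(x), where J is the Jacobian of the predictor."""
    K = gaussian_kernel(X, X, M, bandwidth)              # (n, n)
    diffs = X[:, None, :] - X[None, :, :]                # diffs[j, i] = x_j - x_i
    # S[j, c, :] = sum_i K[j, i] * alpha[i, c] * (x_j - x_i)
    S = np.einsum("ji,ic,jid->jcd", K, alpha, diffs)
    # For symmetric M, grad_x K(x, z) = -K(x, z) * M (x - z) / bandwidth^2.
    J = -(S @ M) / bandwidth**2                          # (n, classes, d)
    return np.einsum("jcd,jce->de", J, J) / X.shape[0]


def rfm_sketch(X, Y, n_iters=3, ridge=1e-6, bandwidth=1.0):
    """Alternate a (near-)ridgeless kernel solve with an AGOP update of the metric M."""
    n, d = X.shape
    M = np.eye(d)
    for _ in range(n_iters):
        K = gaussian_kernel(X, X, M, bandwidth)
        # Tiny ridge for numerical stability only (an assumption of this sketch).
        alpha = np.linalg.solve(K + ridge * np.eye(n), Y)
        M = agop(X, alpha, M, bandwidth)
        M = 0.5 * (M + M.T)            # remove floating-point asymmetry
        M = M / np.max(np.abs(M))      # rescale; a stabilizing choice, not from the thread
    return M


X, Y = modular_addition_data(5)
M = rfm_sketch(X, Y)
print(M.shape)
```

The learned matrix M plays the role of the features here: each round re-solves the kernel problem in the metric produced by the previous round's AGOP. To look for grokking-like behavior one would hold out part of the data and track test accuracy across iterations, which this sketch leaves out.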
