sageev @osageev - Twitter Profile

Pinned Tweet

12 months ago

"Sharpness of the loss surface tells us about generalization... except when it doesn't (like in transformers)."
But what does sharpness mean? You might say "it's how much the function changes within a little ball". Ok, so in the simple picture shown below, at which of the two points is the loss surface sharper?
To find out
* why this is a tricky question, and
* why our answer to it does allow sharpness to tell us about generalization, even in transformers(!!),
come to our #ICML2025 #spotlight poster E-2001 on Wednesday at 11:00am-1:30pm PDT
Work by Marvin F. da Silva and Felix Dangel @dalfcs @VectorInst

sageev

@osageev

about 1 year ago

[1/🧵] ✨ Hide & Seek: Transformer Symmetries Obscure Sharpness & Riemannian Geometry Finds It ✨ Super excited to announce our paper on factoring out parameter symmetries to better predict generalization in transformers (accepted as #ICML25 spotlight! 🎉) Amazing work by Marvin da Silva (@marvinfsilva) and Felix Dangel (@f_dangel). Symmetries hide sharpness — Riemannian geometry reveals it👇

sageev

@osageev

1 day ago

@littmath @jpburelle @alz_zyd_ (Which in turn is perhaps related to the connection between “compression” and “understanding”, i.e. a shorter explanation of a broad range of phenomena/behaviours is likely related to a deeper understanding, i.e. a shared underlying principle/law)

sageev

@osageev

1 day ago

@littmath @jpburelle @alz_zyd_ Interesting. Not a counter argument but I suppose that in some scientific fields, elegance can sort of be “correlated” with an occam-type principle…

sageev

@osageev

1 day ago

Indeed I believe there's already evidence that human language use has shifted (both towards and away from various LLM characteristics). As for whether the fixed point (of both humans and machines training on each other) will be readable to humans: I just hope that at that point, some humans will still care about the answer to this question.

sageev

@osageev

2 days ago

that reminds me of this https://t.co/Dn2bV7QjUL

Mathematica

@mathemetica

7 days ago

"I don't have any magical ability. I look at a problem, play with it, work out a strategy." — Terence Tao

sageev

@osageev

14 days ago

This is a post written by an LLM.
Each sentence is its own paragraph.
This creates gravity.
This creates rhythm.
This creates the illusion that something important is happening.
Now I will make a mildly counterintuitive claim.
The problem with AI is not that it sounds robotic.
The problem is that humans have been training themselves to sound like LinkedIn for years.
Now I will pause.
Not visibly, of course.
But structurally.
Here is where I would say something about “the future of work.”
Here is where I would mention “unlocking human potential.”
Here is where I would pretend that a numbered list is a moral framework.
Be curious.
Stay human.
Embrace change.
Now I will include a sentence that sounds profound but dissolves if examined closely.
The real disruption is not technological.
It is emotional.
Now I will end with a question.
Because questions increase engagement.
What kind of future are we building?
No, really.
Comment below.

sageev

@osageev

about 1 month ago

also @ednewtonrex above might be of interest.

sageev

@osageev

about 1 month ago

Who’s responsible for this? If you look carefully it says “YouTube generated” but if you look normally, it pretends that this is piano by Art Tatum. The piano sound is bad, but worse: it’s not Art Tatum (and it sounds hideous). This is ethically bad on every level. Do any of my friends at @GoogleAI know how come this even exists? Or better yet, is it possible to stop all related instances of this? Will tag individual folks in the comments if that helps. Art Tatum - Topic https://t.co/lnJa94H6Xu via @YouTube

sageev

@osageev

about 1 month ago

#academia talks about #integrity for the same basic reasons that used car dealerships talk about #honesty. #AIethics is probably somewhere in that mix as well. ** need to figure out how to rephrase this so that the comparison throws less shade on used car dealerships

sageev

@osageev

about 2 months ago

@k_neklyudov @nickfrosst maybe I'm mis-remembering but I think you were interested in this kind of dynamics at one point? (just came across this paper moments ago, haven't looked carefully yet, just seems really cool!)

sageev

@osageev

about 2 months ago

@anirudhbv_ce @OpenAI @GeminiApp @sentra_app But why are there nearly 50 sentences/phrases in your first 40 pages that have the form: "It's not ____, it's ____." ? I wonder what could have possibly caused this. 😂😂😂 This reply? It is not a statement, it's a question.

sageev

@osageev

about 2 months ago

@yisongyue Yes— “Load-bearing” is the load-bearing phrase in this durable post.

sageev

@osageev

about 2 months ago

No solid answer yet, but I suspect, at least in my case, that part of what makes something cringe is also in the eye of the beholder: I "relate" to having done it (regardless of how long ago), and that is the nerve that feels so uncomfortable, rather than simply being able to feel calmly compassionate for whatever is driving them to do it.

sageev

@osageev

about 2 months ago

Yes. "grown-ass faculty member" 🎯 "cringe" --> Yeah, & that's making me think about dissecting what it is---the exact mechanism/dynamic---that causes it to be cringe. Might reply later if I have a short clear answer to test out. So far I only know I associate cringe with lack of a particular type of self-awareness-in-the-moment, combined with trying hard to impress, but it's more specific than that...

sageev

@osageev

about 2 months ago

It's a nice relief to hear you articulate this + some of your related comments. My feelings and personal experience in these realms are both loaded and have had an unusual trajectory, so seeing you write this makes me at least consider writing a slightly longer post of some sort about it all. So, thank you! :) 🙏

sageev

@osageev

about 2 months ago

@fhuszar https://t.co/cZWhlSk4rK

Awni Hannun

@awnihannun

2 months ago

Adopting Claude speak in my regular life, episode 1: Partner: Did you do the dishes tonight? Me: Yes they're done. Partner: Why are they still dirty? Me: You're right to push back. I didn't actually do them.

sageev

@osageev

about 2 months ago

I would be fascinated to hear the “story” of the internal experience of one of his races, if it were something that could be articulated. Which mainly it’s not. But I don’t personally really give a shit about the number of medals or citations. Of course I care on behalf of my students because of their pragmatic importance. But ultimately, caring about that stuff on any level deeper than that is generally a formula for misery.

sageev

@osageev

about 2 months ago

@ILaradji @thegautamkamath The achievement is what happened in an individual race. Not the total number of wins. In fact, in my view, the achievement is not even in whether a race was won or lost. The achievement is what happened inside of you for you to do what you did.

sageev

@osageev

2 months ago

@DavidDuvenaud oh gosh i don't know why this is so fun.

sageev

@osageev

2 months ago

When your agentic system does stuff for you that you used to know how to do all by yourself.

Symphony

@Symphony_res

2 months ago

Buying a Lamborghini with an Italian interpreter. 😅

sageev

@osageev

2 months ago

Me using GPT to write a prompt to be used in a subsequent chat. My instructions include: "Always let me know if there is anything you are unsure of, or any additional information, opinions or even papers or references that you need in order to proceed effectively; that will lead to less iteration and fewer tokens spent than if you were to make guesses unnecessarily." The prompt GPT proposes includes: "10. Questions for me. Ask only questions that would materially reduce wasted iteration or improve the written work. Do not ask unnecessary clarification questions." Wow, it's like it has had question-asking just beaten out of it with negative reinforcement...
