Gregor Joeris

@iCube

Addressing challenges in Content Services, Content Analytics, Content Federation, ECM, BPM, and Case Management from a CTO perspective. Invented Doxis4 CSB.

Germany, Bonn

Joined May 2008

137 Following

82 Followers

230 Posts

Pinned Tweet

Gregor Joeris @iCube

about 11 years ago

Billions of stars and some black holes - that's what the internet is becoming more and more

Gregor Joeris @iCube

12 days ago

@durov Use Threema when you need privacy. Nothing else is comparable. It's also the only app you pay for with money. All the others you pay for with data

Gregor Joeris @iCube

21 days ago

Agree 100%. I would recommend it to everyone. Otherwise you have already replaced yourself with AI

Gergely Orosz

@GergelyOrosz

21 days ago

I find myself doing a lot better work, being more satisfied, and also learning a lot more and faster when I do *the hard work* and don’t outsource it to AI. As in, I’ll use AI as a *tool* for subtasks and additional research, but I don’t turn off my brain or kick back, assuming it can do the work for me. Every time I “hand over” the hard-work part to AI and mentally turn off, I either regret it or find myself eventually needing to go back and spend more time on it. I also see slop work coming out from people who assume the AI does better work than they would.

107

216

64K

Gregor Joeris @iCube

6 months ago

@GergelyOrosz Eventual consistency is now a design philosophy at X

iCube retweeted

Pavel Durov

@durov

11 months ago

📚 If you’re a student choosing what to focus on, pick MATH. It will teach you to relentlessly rely on your own brain, think logically, break down problems, and solve them step by step in the right order. That’s the core skill you’ll need to build companies and manage projects.

975

35K

iCube retweeted

Gergely Orosz

@GergelyOrosz

11 months ago

I generally like Anthropic: but the more they paint a dystopian future where AI “manages” people (“AI middle-managers”) the more I am starting to think they are losing their marbles. LLMs are a tool humans should use. The tail should not wag the dog; Anthropic should know better

105

167

388

242K

Gregor Joeris @iCube

12 months ago

@elonmusk @grok Such simple self-reflection will not work considering Gödel - what do you think?

iCube retweeted

Andrej Karpathy

@karpathy

over 1 year ago

The (true) story of development and inspiration behind the "attention" operator, the one in "Attention is All You Need" that introduced the Transformer. From personal email correspondence with the author @DBahdanau ~2 years ago, published here and now (with permission) following some fake news about how it was developed that circulated here over the last few days. Attention is a brilliant (data-dependent) weighted average operation. It is a form of global pooling, a reduction, communication. It is a way to aggregate relevant information from multiple nodes (tokens, image patches, etc.). It is expressive, powerful, has plenty of parallelism, and is efficiently optimizable. Even the Multilayer Perceptron (MLP) can actually be almost re-written as Attention over data-independent weights (1st layer weights are the queries, 2nd layer weights are the values, the keys are just input, and softmax becomes elementwise, deleting the normalization). TLDR Attention is awesome and a *major* unlock in neural network architecture design. It's always been a little surprising to me that the paper "Attention is All You Need" gets ~100X more err ... attention... than the paper that actually introduced Attention ~3 years earlier, by Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio: "Neural Machine Translation by Jointly Learning to Align and Translate". As the name suggests, the core contribution of the Attention is All You Need paper that introduced the Transformer neural net is deleting everything *except* Attention, and basically just stacking it in a ResNet with MLPs (which can also be seen as ~attention per the above). But I do think the Transformer paper stands on its own because it adds many additional amazing ideas bundled up all together at once - positional encodings, scaled attention, multi-headed attention, the isotropic simple design, etc.
And the Transformer has imo stuck around basically in its 2017 form to this day ~7 years later, with relatively few and minor modifications, maybe with the exception of better positional encoding schemes (RoPE and friends). Anyway, pasting the full email below, which also hints at why this operation is called "attention" in the first place - it comes from attending to words of a source sentence while emitting the words of the translation in a sequential manner, and was introduced as a term late in the process by Yoshua Bengio in place of RNNSearch (thank god? :D). It's also interesting that the design was inspired by a human cognitive process/strategy, of attending back and forth over some data sequentially. Lastly the story is quite interesting from the perspective of nature of progress, with similar ideas and formulations "in the air", with particular mention of the work of Alex Graves (NMT) and Jason Weston (Memory Networks) around that time. Thank you for the story @DBahdanau !

133

985

862K
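
The description of attention in the tweet above, a data-dependent weighted average over a set of nodes, can be made concrete with a small sketch. This is an illustration only, not code from the tweet or from either paper: it uses the scaled dot-product scoring of the Transformer paper (the "scaled attention" the tweet mentions), whereas the Bahdanau et al. paper scored query/key pairs with a small additive network. The function names `softmax` and `attention` and the toy vectors are assumptions chosen for readability.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Single-query scaled dot-product attention (illustrative sketch).

    The weights depend on the data (query and keys); the output is
    the weighted average of the value vectors under those weights.
    """
    scale = math.sqrt(len(query))
    scores = [
        sum(q * k for q, k in zip(query, key)) / scale
        for key in keys
    ]
    weights = softmax(scores)
    dim = len(values[0])
    output = [
        sum(w * value[i] for w, value in zip(weights, values))
        for i in range(dim)
    ]
    return output, weights


# Toy example: three "nodes", each with a key and a value vector.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
output, weights = attention([1.0, 0.0], keys, values)
```

In this toy setup the first and third keys match the query equally well, so they receive equal weight and the second node contributes least. If all keys were identical, the weights would be uniform and the operation would reduce to a plain average, which is the sense in which attention is a form of pooling.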

Gregor Joeris @iCube

over 1 year ago

Great achievement @elonmusk - your AI does not hallucinate: https://t.co/o2JwTrFLr6

Gregor Joeris @iCube

over 1 year ago

https://t.co/izYb7ospZ0

Gregor Joeris @iCube

over 1 year ago

@DB_Bahn I am still on the road. Apparently it was not necessary to stop in Bonn Beuel. The platform is long enough for an ICE. Why does Deutsche Bahn care only about the delays of its trains and not about the delays of its customers? There is a big difference

Gregor Joeris @iCube

over 1 year ago

@DB_Bahn @DB_Info @DB_Presse Trains can be cancelled, that happens. But when the replacement train ICE 2922, after several trains have been cancelled, does not stop in Siegburg/Bonn on the high-speed line to Köln, that is simply a wilful and deliberate failure to provide the service.

166

Gregor Joeris @iCube

over 1 year ago

@DB_Bahn Asked the train crew for a stop in Bonn Beuel. A city with more than 300,000 inhabitants. Nothing runs from Köln to Bonn or to Siegburg/Bonn any more if Deutsche Bahn lets me off in Deutz. Answer: we don't do that

Gregor Joeris @iCube

over 1 year ago

@DB_Bahn OK, the train is being diverted across the Rhine. Then say so somewhere beforehand. How about a stop in Bonn Beuel, then?

Gregor Joeris @iCube

over 1 year ago

@DB_Bahn The train ICE 620 is now actually running. So the line is clear. All that is missing now is the stop in Siegburg/Bonn. That is your job: getting customers to their destination, not driving past it. This is not Wolfsburg, after all

Gregor Joeris @iCube

over 1 year ago

@DB_Bahn And sorry, communicating about this via Twitter is silly, but the staff on site have no influence over train dispatching.

Gregor Joeris @iCube

over 1 year ago

@DB_Bahn That is not the reason why the train cannot stop in Siegburg. It runs via the line to Köln Deutz anyway. It is now announced as ICE 629 again. Hundreds of stranded passengers are standing here and you plan to simply pass through. That is wilful

iCube retweeted

SER Group @sergroup

over 2 years ago

🎉Registration is open for the #SERSUMMIT 2024 on June 11 and 12 in Berlin!!! 🎉 Don't miss this key event for digital transformation leaders and sign up here: https://t.co/lBnT23u29d #ECM #Event #AI #Leader

173

Gregor Joeris @iCube

over 2 years ago

@ilyasut Predict the next tweet

Gregor Joeris @iCube

almost 3 years ago

@then_there_was @OpenAI GPTs reverse impersonation at work

Gregor Joeris @iCube

almost 3 years ago

"Saving open-air concerts and opposing the misuse of applications by individuals" - Sign now! https://t.co/1sXAyjbJPn via @ChangeGER
