Addressing challenges in Content Services, Content Analytics, Content Federation, ECM, BPM, and Case Management from a CTO perspective. Invented Doxis4 CSB.
I find myself doing a lot better work, being more satisfied, and learning a lot more, and faster, when I do *the hard work* and don’t outsource it to AI.
As in, I’ll use AI as a *tool* for subtasks and additional research, but I don’t turn off my brain or kick back, assuming it can do the work for me.
Every time I “hand over” the hard-work part to AI and mentally turn off, I either regret it or find myself eventually needing to go back and spend more time on it.
I also see slop work coming out from people who assume the AI does better work than they would.
📚 If you’re a student choosing what to focus on, pick MATH. It will teach you to relentlessly rely on your own brain, think logically, break down problems, and solve them step by step in the right order. That’s the core skill you’ll need to build companies and manage projects.
I generally like Anthropic, but the more they paint a dystopian future where AI “manages” people (“AI middle-managers”), the more I start to think they are losing their marbles.
LLMs are a tool humans should use. The tail should not wag the dog; Anthropic should know better.
The (true) story of the development and inspiration behind the "attention" operator, the one in "Attention is All You Need" that introduced the Transformer. From personal email correspondence with the author @DBahdanau ~2 years ago, published here and now (with permission) following some fake news about how it was developed that circulated here over the last few days.
Attention is a brilliant (data-dependent) weighted average operation. It is a form of global pooling, a reduction, communication. It is a way to aggregate relevant information from multiple nodes (tokens, image patches, etc.). It is expressive, powerful, has plenty of parallelism, and is efficiently optimizable. Even the Multilayer Perceptron (MLP) can actually be almost re-written as Attention over data-independent weights (1st layer weights are the queries, 2nd layer weights are the values, the keys are just input, and softmax becomes elementwise, deleting the normalization). TLDR Attention is awesome and a *major* unlock in neural network architecture design.
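To make the "data-dependent weighted average" point concrete, here is a minimal pure-Python sketch. It is an illustration under assumptions, not the code from either paper: the function names are made up for this example, the scoring is the scaled dot product from the Transformer paper (the original Bahdanau et al. formulation scores with a small additive network instead), and the ReLU in the MLP variant is an assumed choice of elementwise nonlinearity.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention(query, keys, values):
    """Single-query attention: a weighted average of the value vectors.

    The weights depend on the data (query and keys), which is what makes
    this a data-dependent pooling operation.
    """
    scale = math.sqrt(len(query))  # assumption: scaled dot-product scoring
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return out, weights


def mlp_as_attention(x, first_layer, second_layer, act=lambda s: max(0.0, s)):
    """Bias-free two-layer MLP written in attention form.

    Rows of first_layer play the role of queries, rows of second_layer are
    the values, and the input x is the single key. The softmax is replaced
    by an elementwise nonlinearity (ReLU here, an assumption), so the
    weights are no longer normalized.
    """
    weights = [act(dot(q, x)) for q in first_layer]
    dim = len(second_layer[0])
    return [sum(w * v[j] for w, v in zip(weights, second_layer)) for j in range(dim)]
```

With identical keys every score is the same, so `attention` reduces to a plain mean of the values; once the keys differ, the average tilts toward the values whose keys match the query best.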
It's always been a little surprising to me that the paper "Attention is All You Need" gets ~100X more err ... attention... than the paper that actually introduced Attention ~3 years earlier, by Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio: "Neural Machine Translation by Jointly Learning to Align and Translate". As the name suggests, the core contribution of the Attention is All You Need paper that introduced the Transformer neural net is deleting everything *except* Attention, and basically just stacking it in a ResNet with MLPs (which can also be seen as ~attention per the above). But I do think the Transformer paper stands on its own because it adds many additional amazing ideas bundled up all together at once - positional encodings, scaled attention, multi-headed attention, the isotropic simple design, etc. And the Transformer has imo stuck around basically in its 2017 form to this day ~7 years later, with relatively few and minor modifications, maybe with the exception of better positional encoding schemes (RoPE and friends).
Anyway, pasting the full email below, which also hints at why this operation is called "attention" in the first place - it comes from attending to words of a source sentence while emitting the words of the translation in a sequential manner, and was introduced as a term late in the process by Yoshua Bengio in place of RNNSearch (thank god? :D). It's also interesting that the design was inspired by a human cognitive process/strategy, of attending back and forth over some data sequentially. Lastly the story is quite interesting from the perspective of the nature of progress, with similar ideas and formulations "in the air", with particular mention of the work of Alex Graves (NMT) and Jason Weston (Memory Networks) around that time.
Thank you for the story @DBahdanau !
@DB_Bahn I am still in transit. Apparently there was no need to stop in Bonn Beuel. The platform is long enough for an ICE. Why does Deutsche Bahn care only about the delays of its trains and not about the delays of its customers? There is a big difference.
@DB_Bahn @DB_Info @DB_Presse Trains get cancelled, that can happen. But when the replacement train ICE 2922, after several trains on the high-speed line to Köln have been cancelled, does not stop in Siegburg/Bonn, that is simply a willful and deliberate failure to provide the service.
@DB_Bahn Asked the train crew about a stop in Bonn Beuel. A city with more than 300,000 inhabitants. Nothing runs from Köln to Bonn or to Siegburg/Bonn anymore if Deutsche Bahn lets me off in Deutz.
Answer: we don't do that
@DB_Bahn The train ICE 620 is now actually running. So the line is clear. Now all that is missing is the stop in Siegburg/Bonn. That is your job: getting customers to their destination, not driving past it. This isn't Wolfsburg, after all.
@DB_Bahn That is not the reason why the train cannot stop in Siegburg. It runs via the line to Köln Deutz, after all. It is now announced again as ICE 629. Hundreds of stranded passengers are standing here and you plan to simply pass through. That is willful.
🎉Registration is open for the #SERSUMMIT 2024 on June 11 and 12 in Berlin!!! 🎉
Don't miss this key event for digital transformation leaders and sign up here: https://t.co/lBnT23u29d
#ECM #Event #AI #Leader
"Save open-air concerts and stop the abuse of applications by individuals" - Sign now! https://t.co/1sXAyjbJPn via @ChangeGER