Dario is wrong.
He knows absolutely nothing about the effects of technological revolutions on the labor market.
Don't listen to him, Sam, Yoshua, Geoff, or me on this topic.
Listen to economists who have spent their careers studying this, like @Ph_Aghion, @erikbryn, @DAcemogluMIT, @amcafee, @davidautor.
Google DeepMind researcher argues that LLMs can never be conscious, not in 10 years or 100 years.
"Expecting an algorithmic description to instantiate the quality it maps is like expecting the mathematical formula of gravity to physically exert weight."
“Anthropic's Claude Mythos isn't a sentient super-hacker, it's a sales pitch — claims of 'thousands' of severe zero-days rely on just 198 manual reviews”
Get used to the “Effective Altruists” fear and self-hating multi-level marketing system, it is just starting.
After seeing that Claude Mythos marketing turned out to be, as expected, a scam, I wanted to make a master list of tricks being used to market LLMs.
The master list includes statements made directly by company leadership or through the "organic marketing" of people on social media, along with an explanation of how each scam works. This is my first attempt, so it is likely incomplete.
The LLM Marketing Scams Master List v1:
"Two more weeks" - the models will be good enough someday soon to do what we claim.
"They're already good enough" - the models are already good enough to replace workers, but it hasn't happened yet because of x y z reasons.
"We just built God in the backroom, and no, you can't see it" - the models they built in private are actually capable of doing the things we have been waiting for, but they can't let us see them yet for x y z reasons.
"Actually they already have replaced jobs" - the layoffs that tech companies have been doing, citing AI as the reason, have already been replaced with current LLM tech, ignoring market conditions and past data on layoffs during such conditions.
"You just don't know how to use then as well as me" - the models are good enough, but esoteric prompt engineering is required to get these results, and no, I won't teach you.
"I built an app making big money with LLMs" - they claim they already have made startup companies, almost always SaaS companies, that are making them tons of money, but when you ask to see them, they won't show you.
"You aren't using the right model" - claims that you must be using the wrong model and need to use Open Claude 420b-parameter Gemini Plus Pro 6.9 with 4RealThisTime HomerSimpson agent mode enabled. Note that this will be used to attack every study on the effectiveness of LLMs, since studies take time to complete and publish, with new models releasing more frequently than it's possible to complete and publish a study
"You're falling behind" - claims that you need to use the bots now, even though they aren't good enough to fully automate any jobs, because otherwise, when the bots are good enough, you will lose your natural English skills required to prompt effectively.
"All these companies are using LLMs, so do you think you know better than they do?" - pointing to claims of large companies deeply invested in LLMs being a success saying that LLMs are being used effectively, with no viewable results in the speed and/or quality of their company's output.
"The benchmark score went up" - claiming improvements on the benchmarking tests given to their latest model, despite the training being specifically tuned to improve on these tests, and then conflating better benchmark scores with actually being more able to automate jobs or drastically improve worker productivity.
"It can now count the letters in Strawberry/can now do things it famously couldn't do previously" - saying that it can now count the letters in Strawberry or instruct you on how to use a cup without a bottom, etc. is often done to suggest increased reasoning for the LLM, but often involves just hard coding an answer into the service.
"It has escaped our control" - saying that they cannot control the LLM, implying it is conscious or living to some degree when really it just said words that it wasn't supposed to or an agent used an app that wasn't intended by the user's prompt when next-token predicting
"It's feeling sad/scared/happy/angry, suggesting it is conscious" - they ask the LLM how it is feeling, and it next-token predicts a response that includes an emotion felt by humans, since training data is from human conversations online.
"Costs are going down/the LLM service is profitable" - ignores training costs and capex for hardware, usually just referring to inference being profitable, which isn't even true in many cases. Training and capex is 95%+ of the total costs to serve the models.
Did I miss any?
@freddier Reading AI posts on X is like walking into a smokescreen. Every week there is a different AGI, a different guru. Give these models time to see whether they really are what people say they are.
@jony_loscal @S0N_IA Thanks for saying it. Besides, "the one who trained the models we use every day"??? Having the vision is one thing, but crediting him with every model and the work of every research engineer? Come on, at least don't make the hype so obvious.
Terence Tao put it plainly: there is no evidence that LLMs exhibit genuine creativity.
Yes, they have solved some Erdős problems. But these are low-hanging fruit, questions that attracted little attention and that yield once the right existing techniques are applied. That is not creativity. That is search plus recombination.
Yes, LLM outputs can look impressive. But look at who is impressed: typically non-experts. Experts know very well that LLM performance gets terrible when you approach the frontier of human knowledge.
And this is not a temporary gap. It reflects a structural limitation.
We do not fully understand human creativity. But we do know a key property:
Conceptual leaps: the ability to generate new representations, not just recombine existing ones.
LLMs do not do this. They interpolate in representation space. They operate within existing conceptual frameworks; they do not create new ones.
This is why we haven’t “yet seen them take the next step”.
@galislab It's a bubble that is going to burst. I have a PhD in the field, and at least I'm still working on my thesis topic, interpretability. It's better to be in the branches where there is room for growth outside the bubble, and which also have more interesting and necessary applications.
@sebkrier Interpretability still requires a lot of work on standardization, more complex models, and a broader general consensus. It also needs a focus on more than just ground truth, namely on actual predictions, for accountability and real-world applications.
@sebkrier I'm actually one of the authors of a paper cited therein. I am more in the camp of looking at interpretability through two lenses: human interpretability (understanding what humans see in explanations) and quantified, machine-specific interpretability, hence metrics. (cont).
@sebkrier We found that this optimal masking approach often takes a lot of the background into account, be it context or bias. Thus, despite the interpretable recognition metrics being optimal, localization failed completely. (Cont)
@sebkrier Despite that, research is needed to distinguish those two paradigms. One interesting case is that of Saliency Maps. During my thesis we proposed a CAM variant that was optimized to maximize confidence in the prediction when used to mask input images. (Cont)
@sebkrier Submissions are often asked to evaluate properties for both localization and recognition. However, localization is often closer to human-aligned interpretability than recognition is. Of course there can be biases and learning of the background (water birds, cont)