Announcing ARC-AGI-3
The only unsaturated agentic intelligence benchmark in the world
Humans score 100%, AI <1%
This human-AI gap demonstrates we do not yet have AGI
Most benchmarks test what models already know, ARC-AGI-3 tests how they learn
I'm excited to share my latest Medium article where I walk you through Visualizing decision boundaries at every stage of a deep neural network to see exactly how each neuron classify the data.
👉 Read the full article on Medium: https://t.co/qf2kf2Hcdh
For those struggling to understand how Artificial Neural Networks build non-linear decision boundaries to classify data, I’ve written an article to help you develop a clearer intuition on the topic.
👉 https://t.co/bopsKCqFnd
Huge breakthrough
New AI unveils strange chip designs, while discovering new functionalities, it's also slashing the time and cost of designing new wireless chips
"In a study published in Nature Communications researchers at Princeton Engineering and the Indian Institute of Technology describe their methodology, in which an AI creates complicated electromagnetic structures and associated circuits in microchips based on the design parameters. What used to take weeks of highly skilled work can now be accomplished in hours."
"Moreover, the AI behind the new system has produced strange new designs featuring unusual patterns of circuitry. Kaushik Sengupta, the lead researcher, said the designs were unintuitive and unlikely to be developed by a human mind. But they frequently offer marked improvements over even the best standard chips."
"We are coming up with structures that are complex and look randomly shaped, and when connected with circuits, they create previously unachievable performance. Humans cannot really understand them, but they can work better," said Sengupta, a professor of electrical and computer engineering and co-director of NextG.
High-rise construction robots are keeping workers safer
Raise Robotics is leveraging Universal Robots' UR20 arms to automate challenging tasks like installing glass façade fasteners on skyscrapers.
Early adopter Harmon, the largest glazing company in the US, reports significant benefits, including faster ROI, enhanced worker safety, and improved installation precision.
With the robots delivering a 3X boost in labor efficiency, their cost is recouped within a single 13-floor project.
A step forward in construction technology.
This is how AI games will be made…
No world map, real time rendering of what you need to see.
It may also be how our simulation is rendered… the world may not exist except when you observe it.
I'm German.
16 years ago, the EU and US economies were neck and neck.
Today, the US economy is 50% larger than the entire EU combined.
Here's the devastating truth behind Europe's ongoing economic suicide 🧵:
1/n Why Think Step by Step?
The human capacity for reasoning is a remarkable phenomenon. We can arrive at conclusions that would be impossible through direct observation simply by working through a series of intermediate steps in our minds. This ability is mirrored in recent advances with large language models, where "chain-of-thought" prompting – encouraging models to generate intermediate steps before answering – has led to significant performance gains on complex tasks. But a fundamental question remains: why does reasoning work at all? If reasoning doesn't introduce any new information from the world, what makes it so effective?
The paper "Why think step by step? Reasoning emerges from the locality of experience" tackles this question head-on. It proposes a compelling hypothesis: the effectiveness of reasoning stems from the local structure of experience and training data. In both human experience and typical text data, related concepts tend to cluster together. We experience the world from a first-person perspective, encountering related aspects of our environment in close temporal and spatial proximity. Similarly, language models are trained on documents that typically focus on a few interconnected topics. This local structure allows for strong associations between nearby concepts, but direct connections between distant concepts are sparse.
The authors argue that reasoning allows us to bridge these gaps by chaining together a series of local inferences. Imagine trying to determine the climate of France's capital. A language model might have learned that France's capital is Paris and that Paris has an oceanic climate, but it might not have directly encountered the phrase "France's capital has an oceanic climate." By generating the intermediate step "France's capital is Paris," the model can leverage the local associations it has learned to arrive at the correct answer.
This hypothesis is not just intuitive; it's supported by both theoretical and empirical evidence. The authors prove mathematically that in a simplified chain-structured probabilistic model, reasoning through intermediate variables reduces bias compared to direct prediction. They then conduct experiments with transformer language models trained on synthetic data generated from Bayesian networks. By manipulating the structure of the training data, they demonstrate that a "reasoning gap" – where reasoning improves performance – emerges only when the data exhibits local structure. Furthermore, they show that models trained with local data and using free generation (generating their own reasoning steps) achieve comparable performance to models trained on fully observed data, but with significantly less data.
The implications of this work are far-reaching. It provides a concrete mechanism to explain why reasoning is effective, shedding light on a fundamental aspect of human cognition and offering insights into the workings of large language models. The findings suggest that the power of reasoning lies not in accessing new information, but in effectively leveraging the local structure of existing knowledge to make connections that would otherwise remain hidden. This understanding opens up exciting avenues for future research, including exploring the role of local structure in more complex models and real-world data, and developing new techniques to enhance reasoning abilities in artificial intelligence systems.
"Y'a pas d'argent à l'Education Nationale ! On doit faire des économies !"
Regardez les prix catalogues des sociétés par lesquelles je suis OBLIGÉ de passer
80€ la lampe de chevet QUATRE-VINGT EUROS
48,6€ l'ampoule
QUARANTE-HUIT PUTAINS D'EUROS POUR UNE AMPOULE
@Spotify cannot properly check for duplicates before adding a song to a playlist. The exact same songs from a different album are considered different songs by Spotify. It's insane!
The effort to protect innovation and open source continues. I believe we’re all better off if anyone can carry out basic AI research and share their innovations. Right now, I’m deeply concerned about California's proposed law SB-1047. It’s a long, complex bill with many parts that require safety assessments, shutdown capability for models, and so on.
There are many things wrong with this bill, but I’d like to focus here on just one: It defines an unreasonable “hazardous capability” designation that may make builders of large AI models liable if someone uses their models to do something that exceeds the bill’s definition of harm (such as causing $500 million in damage). That is practically impossible for any AI builder to ensure. If the bill is passed in its present form, it will stifle AI model builders, especially open source developers.
Some AI applications, for example in healthcare, are risky. But as I wrote previously, regulators should regulate applications rather than technology.
- Technology refers to tools that can be applied in many ways to solve various problems.
- Applications are specific implementations of technologies designed to meet particular customer needs.
For example, an electric motor is a technology. When we put it in a blender, an electric vehicle, dialysis machine, or guided bomb, it becomes an application. Imagine if we passed laws saying, if anyone uses a motor in a harmful way, the motor manufacturer is liable. Motor makers would either shut down or make motors so tiny as to be useless for most applications. If we pass such a law, sure, we might stop people from building guided bombs, but we’d also lose blenders, electric vehicles, and dialysis machines. In contrast, if we look at specific applications, like blenders, we can more rationally assess risks and figure out how to make sure they’re safe, and even ban classes of applications, like certain types of munitions.
Safety is a property of the application, not a property of the technology (or model), as @random_walker and @sayashk have pointed out. Whether a blender is a safe one can’t be determined by examining the electric motor. A similar argument holds for AI.
SB-1047 doesn’t account for this distinction. It ignores the reality that the number of beneficial uses of AI models is, like electric motors, vastly greater than the number of harmful ones. But, just as no one knows how to build a motor that can’t be used to cause harm, no one has figured out how to make sure an AI model can’t be adapted to harmful uses. In the case of open source models, there’s no known defense to fine-tuning to remove RLHF alignment. And jailbreaking work has shown that even closed-source, proprietary models that have been properly aligned can be attacked in ways that make them give harmful responses. Indeed, the sharp-witted @elder_plinius regularly tweets about jailbreaks for closed models. Kudos also to Anthropic’s @cem__anil and collaborators for publishing their work on many-shot jailbreaking, an attack that can get leading large language models to give inappropriate responses and is hard to defend against.
California has been home to a lot of innovation in AI. I’m worried that this anti-competitive, anti-innovation proposal has gotten so much traction in the legislature. Worse, other jurisdictions often follow California, and it would be awful if they were to do so in this instance.
SB-1047 passed in a key vote in the State Senate in May, but it still has additional steps before it becomes law. I hope you will speak out against it if you get a chance to do so.
[Original text (with links): https://t.co/MOQqFF6cID ]
Remarque intéressante de @jasonfurman hier lors de la Markus Academy @MarkusEconomist : le bon ratio PIB sur dette aux Etats-Unis serait 0,5% et non 99%. Explication: le PIB est un flux et la dette est un stock. La bonne comparaison doit porter sur deux stocks. Minithread 1/3