DeepMind Canada Engineering Lead.
Interested in philosophy, cocktails, ramen, machine learning, manga and computer science, not necessarily in this order.
The PhD thesis of my _first_ PhD student, Doina Precup, is at long last available in digital form.
Title: Temporal Abstraction in Reinforcement Learning
Url: https://t.co/d1JJNtv407
Abstract:
Decision making usually involves choosing among different courses of action over a broad range of time scales. For instance, a person planning a trip to a distant location makes high-level decisions regarding what means of transportation to use, but also chooses low-level actions, such as the movements for getting into a car. The problem of picking an appropriate time scale for reasoning and learning has been explored in artificial intelligence, control theory and robotics. In this dissertation we develop a framework that allows novel solutions to this problem, in the context of Markov Decision Processes (MDPs) and reinforcement learning.
In this dissertation, we present a general framework for prediction, control and learning at multiple temporal scales. In this framework, temporally extended actions are represented by a way of behaving (a policy) together with a termination condition. An action represented in this way is called an _option_. Options can be easily incorporated in MDPs, allowing an agent to use existing controllers, heuristics for picking actions, or learned courses of action.
The effects of behaving according to an option can be predicted using multi-time models, learned by interacting with the environment. In this dissertation we develop multi-time models, and we illustrate the way in which they can be used to produce plans of behavior very quickly, using classical dynamic programming or reinforcement learning techniques.
The most interesting feature of our framework is that it allows an agent to work simultaneously with high-level and low-level temporal representations. The interplay of these levels can be exploited in order to learn and plan more efficiently and more accurately. We develop new algorithms that take advantage of this structure to improve the quality of plans, and to learn in parallel about the effects of many different options.
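As a rough illustration of the definition in the abstract (an option is a policy together with a termination condition), here is a minimal Python sketch. It is not code from the thesis: the names `Option`, `chain_step` and `run_option`, the six-state corridor, and the discount factor of 0.9 are all assumptions made for the example.

```python
import random
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

State = int
Action = int


@dataclass(frozen=True)
class Option:
    """A temporally extended action: a policy plus a termination condition."""
    policy: Callable[[State], Action]      # primitive action to take in a state
    termination: Callable[[State], float]  # probability of stopping in a state


def chain_step(state: State, action: Action) -> Tuple[State, float]:
    """Hypothetical toy MDP: a corridor of states 0..5, reward 1.0 on entering state 5."""
    next_state = max(0, min(5, state + action))
    reward = 1.0 if next_state == 5 and state != 5 else 0.0
    return next_state, reward


def run_option(
    option: Option,
    state: State,
    step: Callable[[State, Action], Tuple[State, float]],
    gamma: float = 0.9,
    rng: Optional[random.Random] = None,
    max_steps: int = 1000,
) -> Tuple[State, float, int]:
    """Follow the option's policy until its termination condition fires.

    Returns the final state, the discounted reward collected along the way,
    and the number of primitive steps taken.
    """
    rng = rng or random.Random(0)
    total, discount, steps = 0.0, 1.0, 0
    while steps < max_steps:
        action = option.policy(state)
        state, reward = step(state, action)
        total += discount * reward
        discount *= gamma
        steps += 1
        # Sample the termination condition in the state just reached.
        if rng.random() < option.termination(state):
            break
    return state, total, steps


# An option that walks right and terminates only at the end of the corridor.
go_right = Option(policy=lambda s: 1, termination=lambda s: 1.0 if s == 5 else 0.0)
final_state, discounted_return, duration = run_option(go_right, 0, chain_step)
```

From the agent's point of view, `go_right` behaves like a single action that lasts several time steps. Averaging the values returned by `run_option` over many runs would give a crude empirical estimate of the option's effects, which is the kind of quantity the abstract's multi-time models are meant to predict.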
Where now: Doina is a professor of computer science at McGill University and head of the Montreal office of Google DeepMind.
@iam_elias1 Imagine an economy composed of Alice and Bob.
Post-AI layoffs, zero employment.
They wake up and both want cereal and milk.
Option 1: it is readily available. Who needs a job!!!
Option 2: it is not available. Alice will find a way to produce cereal, and Bob milk.
Full employment.
The industry is not "obsessed" with making LLMs bigger; it is obsessed with delivering value to users so that they choose its products.
If you doubt this, please try using your favorite LLM from two generations ago.
📁 Yann LeCun says that scaling models will not get us to human intelligence.
He explains that the industry remains obsessed with making LLMs bigger, but that this path is fundamentally broken. It does not matter how many parameters we add or how many clusters we build, because the models only imitate language patterns.
Human intelligence does not emerge from size, it emerges from understanding the world.
@ViralOrTrying @fchollet Kolmogorov complexity isn't an absolute number; it depends on the universal Turing machine (UTM) used.
The difference between k_utm1 and k_utm2 is bounded by a constant, but that constant could be large.
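For reference, this is the standard invariance theorem. Writing $K_U$ and $K_V$ for k_utm1 and k_utm2 (notation assumed here, not taken from the tweet), it can be stated as:

$$
\exists\, c_{U,V} \ \text{such that}\ \forall x:\quad \lvert K_U(x) - K_V(x) \rvert \le c_{U,V}
$$

The constant $c_{U,V}$ depends only on the two machines, not on the string $x$. The theorem puts no limit on how large it is.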
@fchollet If it has merely cached the data, then what explains large models generalizing from small datasets?
This indicates that large models + regularization are doing some sort of soft compression.
https://t.co/fJMOkkG69y
Best comment about LLMs from Scott Aaronson:
"The more I play with things like the O1 model the more grateful I am I have tenure."
@fantfant5 @qaisailan Science and philosophy are radically different activities.
Philosophy investigates the meaning and nature of things, while science provides models that can be tested and put to practical use.
Philosophy offers peace of mind and a sense of meaning in life; science offers electricity, airplanes, mobile phones ...
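@fantfant5 @qaisailan It is not the same theory. Einstein provided equations that allow the positions of the planets to be determined with a precision exceeding what the instruments of his time could measure. Ibn Taymiyyah repeated the views of the Greeks and the Mu'tazila on the creation of time and its relativity.
This is not unfair to him, because the same can be said of Plato, Augustine and Ibn Rushd, who are among the most important thinkers in history.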