@asoare159 @neurosp1ke But I think we actually agree on the technical side, and it is a matter of semantics? Does the data collection strategy merit a separate "adjective" or not? I think it does, but I get why others would say that this adds too much weight to it, as there is no RL-algo difference.
@asoare159 @neurosp1ke So when you say offline, you say: I have another means of gathering all my data. With online you say: I need to use the policy for (additional) data collection? And then off-policy algorithms allow you to combine that data with any other data; on-policy algos don't.
@asoare159 @neurosp1ke On-policy implies online afaik. But off-policy does not necessarily imply offline, right? You can use off-policy in online and offline settings I think, and both can make sense?
@asoare159 @neurosp1ke Not an expert in offline RL, but my intuition is that w/ offline RL there is less exploration involved? In online settings the algo needs to collect its own data to cover the relevant parts of the state space. The dataset in offline RL supposedly already covers the state space?
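A minimal sketch of the distinction discussed in this thread, assuming a toy 5-state chain environment and tabular Q-learning (neither comes from the thread; all names are hypothetical). The same off-policy update is used twice: once on a fixed dataset that already covers the state space (offline), and once on data the current policy collects itself through exploration (online).

```python
import random

N_STATES = 5          # toy chain 0..4; state 4 is the terminal goal (assumption)
ACTIONS = (0, 1)      # 0 = left, 1 = right


def step(state, action):
    """Deterministic chain dynamics: reward 1.0 only on reaching the goal."""
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def q_update(q, transition, alpha=0.5, gamma=0.9):
    """Off-policy Q-learning update: usable on a transition from any source."""
    s, a, r, s2, done = transition
    target = r if done else r + gamma * max(q[s2])
    q[s][a] += alpha * (target - q[s][a])


def train_offline(dataset, sweeps=50):
    """Offline: learn only from a fixed dataset, never touching the environment."""
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(sweeps):
        for transition in dataset:
            q_update(q, transition)
    return q


def train_online(episodes=200, epsilon=0.2, seed=0):
    """Online: the current epsilon-greedy policy gathers the data it learns from."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # step cap so early random episodes always end
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.choice(ACTIONS)          # explore (or break ties)
            else:
                a = 0 if q[s][0] > q[s][1] else 1  # exploit
            s2, r, done = step(s, a)
            q_update(q, (s, a, r, s2, done))
            s = s2
            if done:
                break
    return q


def greedy(q):
    """Greedy action for every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


# A fixed dataset that covers every (state, action) pair, gathered "by other means".
dataset = []
for s in range(N_STATES - 1):
    for a in ACTIONS:
        s2, r, done = step(s, a)
        dataset.append((s, a, r, s2, done))

offline_policy = greedy(train_offline(dataset))
online_policy = greedy(train_online())
```

In this sketch both runs should end up preferring "right" in every state. The offline run only works because the dataset covers all state-action pairs; the online run has to get that coverage through its own exploration. An on-policy method could not reuse `dataset` this way, since those transitions were not generated by its current policy.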
Because AI is an engineering discipline and not a scientific field, it's never possible to fully separate the properties of a given approach from those of its specific implementations. The artifact is the method.
TRI's latest Large Behavior Model (LBM) paper landed on arXiv last night! Check out our project website: https://t.co/AV2cmfeX40
One of our main goals for this paper was to put out a very careful and thorough study on the topic to help people understand the state of the technology, and to share a lot of details for how we're achieving it.
https://t.co/EVFLJAY6Zu
@DJiafei @JitendraMalikCV Agreed! Why don't we have it? Seems like most ingredients are available?
And still, whenever you want to do something in sim, you are limited to the same ~50 tasks with little to no diversity.
@phillip_isola I really, really like this book. Only read a couple of chapters so far, but it seems to combine intuition with rigour in a way I haven't found before. Thanks for open-sourcing it!
@thlarsen @GaryMarcus @MechanizeWork @thlarsen When I look at publications by economists on the expected macro-economic impact of AI, they seem very (!) skeptical. E.g., Acemoglu predicts a yearly GDP growth of 2% for the next 10 years.
What are your thoughts on this?
https://t.co/oHp3Xf1jHb
@helper2424 @VilleKuosmanen @mimicrobotics This is where 'common sense' comes in, which helps in dealing with failures and can hopefully be obtained with other (less expensive) data sources such as sim, third-person demonstrations, non-action data, ...
@helper2424 @VilleKuosmanen @mimicrobotics I think many researchers are actively adding 'envisioned failures' explicitly to the train set during data collection. Seems to work to some extent, but I have the same feeling that covering all potential failures is rather hard.