@CsabaSzepesvari @AvivTamar1 @shiemannor ... and you need to optimize over all four classes of policies, rather than assuming that all solutions start with Bellman's equation. Even Sutton and Barto's RL book has examples from each of the four classes.
@CsabaSzepesvari @AvivTamar1 @shiemannor In any software implementation, transitions are captured through transition *functions*, not transition matrices. You need to model the exogenous information process (show me a random variable in Sutton and Barto's RL book)...
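To make the point about transition functions and exogenous information concrete, here is a minimal sketch in Python of a toy energy storage problem. Everything in it is an assumption made for illustration and is not taken from the book or its library: the names `State`, `sample_exogenous`, `transition`, `buy_low_sell_high` and `simulate`, the capacity of 10 units, the Gaussian price change, and the threshold values. It only shows the shape of the idea: a state *variable*, a random exogenous input W, a transition function S' = f(S, x, W), and a policy that is evaluated by simulation.

```python
import random
from dataclasses import dataclass

CAPACITY = 10.0  # assumed storage capacity (illustrative)


@dataclass(frozen=True)
class State:
    """State variable: the information needed to make a decision."""
    energy: float  # energy currently in storage
    price: float   # current price of energy


def sample_exogenous(rng: random.Random) -> float:
    """Exogenous information W: a random price change (assumed Gaussian)."""
    return rng.gauss(0.0, 1.0)


def transition(state: State, decision: float, w: float) -> State:
    """Transition function: next state from state, decision and exogenous info."""
    energy = min(max(state.energy + decision, 0.0), CAPACITY)
    price = max(state.price + w, 0.0)
    return State(energy, price)


def buy_low_sell_high(state: State, buy_below: float = 20.0,
                      sell_above: float = 30.0) -> float:
    """A simple rule-based policy with two tunable thresholds (hypothetical)."""
    if state.price < buy_below:
        return min(1.0, CAPACITY - state.energy)  # charge up to one unit
    if state.price > sell_above:
        return -min(1.0, state.energy)            # discharge up to one unit
    return 0.0


def simulate(policy, horizon: int = 50, seed: int = 0):
    """Simulate a policy forward in time and return (total contribution, final state)."""
    rng = random.Random(seed)
    state = State(energy=0.0, price=25.0)
    total = 0.0
    for _ in range(horizon):
        x = policy(state)
        total += -state.price * x       # pay when charging, earn when discharging
        w = sample_exogenous(rng)       # new information arrives after the decision
        state = transition(state, x, w)
    return total, state


if __name__ == "__main__":
    total, final_state = simulate(buy_low_sell_high)
    print(total, final_state)
```

Comparing policies then amounts to calling `simulate` with different policy functions or different threshold values and comparing the totals, with no transition matrix anywhere in the code.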
@marcgbellemare @AvivTamar1 @shiemannor ... it does not model transition functions or the exogenous information process, and it does not recognize that you optimize over policies. See my comparison of modeling frameworks in section 2.1, and then go to chapter 9 in https://t.co/bqEUgWN5Xp
@marcgbellemare @AvivTamar1 @shiemannor The RL community uses the modeling framework of Markov decision processes, developed in the 1950s and widely used by theoreticians. However, it does not translate to code, and ignores some fundamentally important modeling elements...
@marcgbellemare @AvivTamar1 @shiemannor Yes… vs how the RL community models an RL problem using the classical framework of Markov decision processes. I contrast over a dozen modeling frameworks in section 2.1.
@CsabaSzepesvari @AvivTamar1 @shiemannor In section 9.9, I illustrate energy storage problems with increasingly complex state variables. Show me how to describe these state variables by just describing the state space. State spaces are OK for theory, but they don't describe the problem.
@CsabaSzepesvari @AvivTamar1 @shiemannor Modeling means a mathematical representation of a problem. In my view, the best models can be translated directly to code. Regarding state variables - pls see section 9.9 of my book (you can download chapter 9 from https://t.co/bqEUgWMy7R)
@CsabaSzepesvari @AvivTamar1 @shiemannor The theory does not require accurate modeling. You cannot model a real problem with "state spaces", "action spaces" and, most of all, a "one-step transition matrix". You need state *variables* and you need to know what a state variable is. See https://t.co/q3Bq5H0WSV
@AvivTamar1 @shiemannor Great article, but it misses the real problem with RL. Chapter 9 of https://t.co/bqEUgWMy7R describes how to model any sequential decision problem. Then see the four classes of policies in chapter 11. Both chapters can be downloaded from the webpage.
@heyitsmehugo @Sergei_Imaging @rasbt I post regularly on LinkedIn… for more on “sequential decision analytics” see my resources page at https://t.co/XOtbNegF3V
@Sergei_Imaging @rasbt See the companion book Sequential Decision Analytics and Modeling. It is accompanied by a Python library at https://t.co/ibzhmlBmYW
@heyitsmehugo @Sergei_Imaging @rasbt My latest book (Sequential Decision Analytics and Modeling) has a Python library on GitHub. Right away people were asking if I had a version in Julia. I started my career in Fortran, and I lived through C, C++, Java and then Python. Mathematical modeling is much more stable.
I have been reviewing the top textbooks on supply chains… supply chain management, supply chain theory, supply chain engineering, and production/operations management… not one presents a mathematical model of an inventory problem involving uncertainty…
@OR_4_SA Yes it is, but I am only interested in what is being taught in books. Please send me a book title and page number where I can find any of: 1) a formal statement of any stochastic inventory problem, 2) a model and solution to a problem with nonstationary demands, and …