Introducing a new #ReinforcementLearning algorithm for teaching agents how to solve tasks by providing only examples of success, avoiding the need for hand-crafted reward functions while outperforming prior imitation learning approaches. Read more →https://t.co/gK3nJV8JIf
Open-domain long-form question answering can provide a testbed to measure the factuality of generative text models in #NLP. Check out a new system that leverages both sparse attention and retrieval-based models to achieve state-of-the-art results. https://t.co/yUlAD18dkt