
" What the field needed, he argued, was what he called inverse reinforcement learning. Rather than asking, as regular reinforcement learning does, “Given a reward signal, what behavior will optimize it?,” inverse reinforcement learning (or “IRL”) asks the reverse: “Given the observed behaviour, what reward signal, if any, is being optimized?”15 This is, of course, in more informal terms, one of the foundational questions of human life. What exactly do they think they’re doing? We spend a good fraction of our life’s brainpower answering questions like this. We watch the behavior of others around us—friend and foe, superior and subordinate, collaborator and competitor—and try to read through their visible actions to their invisible intentions and goals. It is in some ways the cornerstone of human cognition. It also turns out to be one of the seminal and critical projects in twenty-first-century AI. "

Brian Christian, The Alignment Problem: Machine Learning and Human Values
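
A minimal sketch of the contrast the quote describes, assuming a tiny tabular MDP and a brute-force search over candidate reward functions (a crude stand-in for real IRL algorithms, not code from the book; all names and numbers here are illustrative):

    import numpy as np

    # Forward RL: given a reward over states, compute the greedy policy
    # by value iteration on a tabular MDP (transitions: states x actions x states).
    def optimal_policy(reward, transitions, gamma=0.9, iters=200):
        n_states, n_actions, _ = transitions.shape
        values = np.zeros(n_states)
        for _ in range(iters):
            q = reward[:, None] + gamma * transitions @ values  # shape (states, actions)
            values = q.max(axis=1)
        return q.argmax(axis=1)

    # Inverse RL, in its crudest form: given the observed behavior (a policy),
    # keep every candidate reward whose optimal policy reproduces that behavior.
    def consistent_rewards(observed_policy, transitions, candidate_rewards):
        return [r for r in candidate_rewards
                if np.array_equal(optimal_policy(r, transitions), observed_policy)]

    # Toy usage: two states; action 1 always leads to state 1, action 0 to state 0.
    # The observed "expert" always picks action 1, and the search recovers the
    # candidate reward under which that behavior is optimal.
    P = np.array([[[1.0, 0.0], [0.0, 1.0]],
                  [[1.0, 0.0], [0.0, 1.0]]])
    observed = np.array([1, 1])
    candidates = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.5, 0.5])]
    print(consistent_rewards(observed, P, candidates))  # -> [array([0., 1.])]

As the quote's "what reward signal, if any" hints, more than one reward (or none of the candidates) can be consistent with the same observed behavior; practical IRL methods are built around exactly that ambiguity.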

