Google discusses a reinforcement learning evaluation method in new “off-policy classification” paper
A team of AI researchers at Google recently published a paper titled “Off-Policy Evaluation via Off-Policy Classification” on the company's blog. The paper describes “off-policy classification” (OPC), as the researchers call it, a method that assesses the performance of AI-driven agents by treating evaluation as a classification problem.
The team says that their approach, which involves a variant of reinforcement learning that uses rewards to drive software policies toward goals,...