Google discusses new reinforcement learning evaluation method in “off-policy classification” paper


A team of AI researchers at Google has recently published a paper titled “Off-Policy Evaluation via Off-Policy Classification” and outlined it on its blog. The paper describes “off-policy classification”, or OPC as the researchers call it, which assesses the performance of AI-driven agents by treating evaluation as a classification problem.

The team says that its approach, which builds on reinforcement learning (a technique that uses rewards to drive software policies toward goals), works with image inputs and scales to tasks such as vision-based robotic grasping.

Alex Irpan, software engineer at Google, said: “Fully off-policy reinforcement learning is a variant in which an agent learns entirely from older data, which is appealing because it enables model iteration without requiring a physical robot. With fully off-policy RL, one can train several models on the same fixed dataset collected by previous agents, then select the best one.”

According to Google's blog post, OPC depends on two assumptions. The first is that the final task has deterministic dynamics, meaning there is no randomness in how states change. The second is that the agent either succeeds or fails at the end of every trial. Under these assumptions, the paper shows that an agent's performance can be measured by how frequently its chosen action is an effective action, which in turn depends on how well the Q-function classifies actions as effective versus catastrophic.
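
To make the idea of evaluation as classification concrete, here is a minimal Python sketch. It is not the method from the paper: it assumes every (state, action) pair already carries a ground-truth label saying whether the action was effective, which is a simplification, and the function name, the 0.5 threshold, and the Q-values are all hypothetical. It only illustrates how a Q-function's scores can be graded like a classifier's, and how several models can be compared on the same fixed dataset.

```python
from typing import Sequence


def classification_accuracy(
    q_values: Sequence[float],
    effective: Sequence[bool],
    threshold: float = 0.5,  # hypothetical cut-off, chosen for illustration
) -> float:
    """Fraction of (state, action) pairs the Q-function classifies correctly.

    A Q-value above the threshold is read as "effective", anything else
    as "catastrophic". Labels are assumed to be known (a simplification).
    """
    if len(q_values) != len(effective):
        raise ValueError("q_values and effective must have the same length")
    if not q_values:
        raise ValueError("at least one labelled action is required")
    correct = sum(
        (q > threshold) == label for q, label in zip(q_values, effective)
    )
    return correct / len(q_values)


# Made-up labels for four logged actions: True means effective.
labels = [True, True, False, False]

# Made-up Q-values from two candidate models scored on the same fixed dataset.
model_a = [0.9, 0.8, 0.2, 0.1]
model_b = [0.9, 0.4, 0.6, 0.1]

scores = {
    "model_a": classification_accuracy(model_a, labels),
    "model_b": classification_accuracy(model_b, labels),
}

# Pick the model whose Q-function separates the two classes best.
best = max(scores, key=scores.get)
print(scores, best)
```

In this toy data the first model classifies all four actions correctly and the second only two, so the first would be selected without either model ever running on a robot.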

At its I/O 2019 keynote last month, Google announced that it had managed to condense 100GB of AI models to just 0.5GB for a drastically faster Assistant. According to Scott Huffman, vice president of engineering at Google, the so-called “next generation” Assistant is so fast that it operates in real time.

