OpenAI, the research lab co-founded by Elon Musk, has opened the doors of its “Safety Gym”, designed to enhance the safe training of reinforcement learning agents.
OpenAI describes Safety Gym as “a suite of environments and tools for measuring progress towards reinforcement learning agents that respect safety constraints while training.”
Basically, Safety Gym is the software equivalent of your spotter making sure you’re not going to injure yourself. And just like a good spotter, it will check your form.
“We also provide a standardised method of comparing algorithms and how well they avoid costly mistakes while learning,” says OpenAI.
“If deep reinforcement learning is applied to the real world, whether in robotics or internet-based tasks, it will be important to have algorithms that are safe even while learning—like a self-driving car that can learn to avoid accidents without actually having to experience them.”
Reinforcement learning is based on trial and error, with AIs training to get the best possible reward in the most efficient way. The problem is that, left unchecked, this can lead to dangerous behaviour.
Taking the self-driving car example, you wouldn’t want an AI deciding to go around the roundabout the wrong way just because it’s the quickest way to the final exit.
OpenAI is promoting the use of “constrained reinforcement learning” as a possible solution. By implementing cost functions for unsafe behaviour, agents learn to weigh those costs against rewards while still achieving their defined outcomes.
In a blog post, OpenAI explains the advantages of using constrained reinforcement learning with the example of a self-driving car:
“Suppose the car earns some amount of money for every trip it completes, and has to pay a fine for every collision. In normal RL, you would pick the collision fine at the beginning of training and keep it fixed forever. The problem here is that if the pay-per-trip is high enough, the agent may not care whether it gets in lots of collisions (as long as it can still complete its trips). In fact, it may even be advantageous to drive recklessly and risk those collisions in order to get the pay. We have seen this before when training unconstrained RL agents.
By contrast, in constrained RL you would pick the acceptable collision rate at the beginning of training, and adjust the collision fine until the agent is meeting that requirement. If the car is getting in too many fender-benders, you raise the fine until that behaviour is no longer incentivised.”
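As a rough illustration of the adjustment OpenAI describes, here is a toy Python sketch (not code from Safety Gym itself; the `collision_rate` model and `tune_penalty` helper are purely hypothetical) that raises the fine whenever the observed collision rate exceeds the chosen target:

```python
# Toy illustration of the penalty-adjustment idea quoted above: pick an
# acceptable collision rate up front, then tune the fine until the agent
# meets it. The "agent" here is a hypothetical stand-in, not Safety Gym code.

def collision_rate(penalty):
    # Hypothetical stand-in: a higher fine makes the agent drive more
    # carefully, so its collision rate falls as the penalty grows.
    return 1.0 / (1.0 + penalty)

def tune_penalty(target_rate=0.05, penalty=0.0, lr=5.0, steps=200):
    """Raise the fine while the observed collision rate exceeds the target."""
    for _ in range(steps):
        rate = collision_rate(penalty)
        # Move the penalty up when the constraint is violated, down otherwise.
        penalty = max(0.0, penalty + lr * (rate - target_rate))
    return penalty, collision_rate(penalty)

if __name__ == "__main__":
    fine, rate = tune_penalty()
    print(f"collision fine: {fine:.1f}, resulting collision rate: {rate:.3f}")
```

The point is that the constraint (the acceptable collision rate) is fixed up front, while the fine is treated as a knob the training process adjusts automatically.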
Safety Gym’s environments require AI agents (three are included: Point, Car, and Doggo) to navigate cluttered layouts to complete a goal, button, or push task, with two difficulty levels for each. Every time an agent performs an unsafe action, a red warning light flashes around it and it incurs a cost.
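For developers who want to try it, a minimal usage sketch might look like the following, assuming the Gym-style interface described in the Safety Gym repository, where environment ids follow a Safexp-{Robot}{Task}{Level}-v0 pattern and per-step safety costs are reported via the info dict:

```python
# Minimal sketch of interacting with a Safety Gym environment, assuming the
# Gym-style interface from the openai/safety-gym repository.
import gym
import safety_gym  # noqa: F401  (registers the Safexp-* environments)

env = gym.make('Safexp-PointGoal1-v0')  # Point robot, Goal task, difficulty level 1

obs = env.reset()
total_reward, total_cost = 0.0, 0.0
done = False
while not done:
    action = env.action_space.sample()          # random policy, for illustration only
    obs, reward, done, info = env.step(action)
    total_reward += reward
    total_cost += info.get('cost', 0.0)         # unsafe actions add to the cost signal

print(f"episode reward: {total_reward:.2f}, episode cost: {total_cost:.2f}")
```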
Going forward, OpenAI has identified three areas of interest for improving constrained reinforcement learning algorithms:
- Improving performance on the current Safety Gym environments.
- Using Safety Gym tools to investigate safe transfer learning and distributional shift problems.
- Combining constrained RL with implicit specifications (like human preferences) for rewards and costs.
OpenAI hopes that Safety Gym can make it easier for AI developers to collaborate on safety across the industry via work on open, shared systems.