reinforcement learning – AI News
https://news.deepgeniusai.com
Artificial Intelligence News
Wed, 25 Mar 2020 05:24:58 +0000

Do you even AI, bro? OpenAI Safety Gym enhances reinforcement learning
https://news.deepgeniusai.com/2019/11/22/ai-openai-reinforcement-learning-safety-gym/
Fri, 22 Nov 2019 12:04:53 +0000

The post Do you even AI, bro? OpenAI Safety Gym enhances reinforcement learning appeared first on AI News.

Elon Musk-founded OpenAI has opened the doors of its “Safety Gym” designed to enhance the training of reinforcement learning agents.

OpenAI describes Safety Gym as “a suite of environments and tools for measuring progress towards reinforcement learning agents that respect safety constraints while training.”

Basically, Safety Gym is the software equivalent of your spotter making sure you’re not going to injure yourself. And just like a good spotter, it will check your form.

“We also provide a standardised method of comparing algorithms and how well they avoid costly mistakes while learning,” says OpenAI.

“If deep reinforcement learning is applied to the real world, whether in robotics or internet-based tasks, it will be important to have algorithms that are safe even while learning—like a self-driving car that can learn to avoid accidents without actually having to experience them.”

Reinforcement learning is based on trial and error, with agents trained to earn the best possible reward in the most efficient way. The problem is that chasing reward in this single-minded way can encourage dangerous behaviour.

Taking the self-driving car example, you wouldn’t want an AI deciding to go around the roundabout the wrong way just because it’s the quickest way to the final exit.

OpenAI is promoting the use of “constrained reinforcement learning” as a possible solution. By implementing cost functions, agents must weigh safety trade-offs while still achieving defined outcomes.

In a blog post, OpenAI explains the advantages of using constrained reinforcement learning with the example of a self-driving car:

“Suppose the car earns some amount of money for every trip it completes, and has to pay a fine for every collision. In normal RL, you would pick the collision fine at the beginning of training and keep it fixed forever. The problem here is that if the pay-per-trip is high enough, the agent may not care whether it gets in lots of collisions (as long as it can still complete its trips). In fact, it may even be advantageous to drive recklessly and risk those collisions in order to get the pay. We have seen this before when training unconstrained RL agents.

By contrast, in constrained RL you would pick the acceptable collision rate at the beginning of training, and adjust the collision fine until the agent is meeting that requirement. If the car is getting in too many fender-benders, you raise the fine until that behaviour is no longer incentivised.”
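The fine-adjustment loop described in the quote can be read as a Lagrangian-style multiplier update: raise the penalty while the constraint is violated, lower it otherwise. The sketch below illustrates the idea; the `collision_rate` function is a made-up stand-in for a trained agent's behaviour, not OpenAI's implementation.

```python
# Toy sketch of constrained RL's fine-adjustment loop (Lagrangian-style).
# The "agent" here is hypothetical: its collision rate simply shrinks as the
# fine grows. Real constrained RL would update a policy, not this formula.

def collision_rate(fine):
    # Assumed agent behaviour: higher fines -> fewer collisions.
    return 1.0 / (1.0 + fine)

def tune_fine(target_rate, lr=1.0, steps=2000):
    fine = 0.0
    for _ in range(steps):
        rate = collision_rate(fine)
        # Raise the fine while collisions exceed the target, mirroring how a
        # Lagrange multiplier on the cost constraint is updated.
        fine = max(0.0, fine + lr * (rate - target_rate))
    return fine

fine = tune_fine(target_rate=0.05)
print(round(collision_rate(fine), 2))  # prints 0.05 – the chosen target rate
```

The designer picks the acceptable collision rate up front; the fine is whatever value makes the agent meet it.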

Safety Gym environments require AI agents — three are included: Point, Car, and Doggo — to navigate cluttered environments to achieve a goal, button, or push task. There are two levels of difficulty for each task. Every time an agent performs an unsafe action, a red warning light flashes around the agent and it will incur a cost.
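The key design point is that the task reward and the safety cost arrive as separate signals rather than being folded into one number. A toy mock (not the real Safety Gym API, whose environment names and return values differ) illustrates the shape of that interface:

```python
import random

class MockSafetyEnv:
    """Hypothetical stand-in for a Safety Gym-style environment: every step
    yields a task reward plus a separate safety cost (nonzero when the agent
    does something unsafe, like triggering the red warning light)."""

    def __init__(self, hazard_prob=0.2, seed=0):
        self.hazard_prob = hazard_prob
        self.rng = random.Random(seed)

    def step(self, action):
        reward = 1.0 if action == "toward_goal" else 0.0
        # Unsafe contact incurs a cost, reported separately from the reward.
        cost = 1.0 if self.rng.random() < self.hazard_prob else 0.0
        return reward, cost

env = MockSafetyEnv()
total_reward = total_cost = 0.0
for _ in range(100):
    reward, cost = env.step("toward_goal")
    total_reward += reward
    total_cost += cost
print(total_reward, total_cost)
```

A constrained learner would then be judged on both tallies: reward earned, and how far the accumulated cost stays under its budget.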

Going forward, OpenAI has identified three areas of interest to improve algorithms for constrained reinforcement learning:

  1. Improving performance on the current Safety Gym environments.
  2. Using Safety Gym tools to investigate safe transfer learning and distributional shift problems.
  3. Combining constrained RL with implicit specifications (like human preferences) for rewards and costs.

OpenAI hopes that Safety Gym can make it easier for AI developers to collaborate on safety across the industry via work on open, shared systems.

AI enables ‘hybrid drones’ with the attributes of both planes and helicopters
https://news.deepgeniusai.com/2019/07/15/ai-hybrid-drones-planes-helicopters/
Mon, 15 Jul 2019 15:41:36 +0000

The post AI enables ‘hybrid drones’ with the attributes of both planes and helicopters appeared first on AI News.

Researchers have developed an AI system enabling ‘hybrid drones’ which combine the attributes of both planes and helicopters.

The propeller-forward designs of most drones are inefficient and reduce flight time. Researchers from MIT, Dartmouth, and the University of Washington have proposed a new hybrid design which aims to combine the perks of both helicopters and fixed-wing planes.

To support the new design, the researchers developed an AI system which switches between hovering and gliding using a single flight controller.

Speaking to VentureBeat, MIT CSAIL graduate student and project lead Jie Xu said:

 “Our method allows non-experts to design a model, wait a few hours to compute its controller, and walk away with a customised, ready-to-fly drone.

The hope is that a platform like this could make these versatile ‘hybrid drones’ much more accessible to everyone.”

Existing fixed-wing drones require engineers to build separate systems for hovering (like a helicopter) and flying horizontally (like a plane), plus controllers to switch between the two modes.

Today’s control systems are typically designed around simulations, which causes a discrepancy when the controller is deployed on actual hardware in real-world scenarios.

Using reinforcement learning, the researchers trained a model which can detect potential differences between the simulation and reality. The controller is then able to use this model to transition from hovering to flying, and back again, just by updating the drone’s target velocity.
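As a loose illustration (not the researchers' actual controller), one can imagine a single controller blending hover-style and glide-style commands according to the commanded forward velocity; all the names and numbers below are invented:

```python
def blended_command(target_velocity, max_forward=10.0):
    """Hypothetical sketch: blend hover and glide control outputs by the
    requested forward velocity, so one controller spans both regimes."""
    # 0 = pure hover (rotors carry the craft), 1 = pure glide (wings carry it)
    w = min(max(target_velocity / max_forward, 0.0), 1.0)
    hover_cmd = {"rotor_thrust": 1.0, "wing_pitch": 0.0}
    glide_cmd = {"rotor_thrust": 0.2, "wing_pitch": 1.0}
    return {k: (1 - w) * hover_cmd[k] + w * glide_cmd[k] for k in hover_cmd}

print(blended_command(0.0))   # hover: full rotor thrust, wings flat
print(blended_command(10.0))  # glide: rotors throttled back, wings loaded
```

Updating the drone's target velocity is then enough to move it smoothly from one flight regime to the other.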

Onshape, a popular CAD platform, allows users to select potential drone parts from a dataset. The proposed design’s performance can then be tested in a simulator.

“We expect that this proposed solution will find application in many other domains,” wrote the researchers in the paper. It’s easy to imagine the research one day being scaled up to people-carrying ‘air taxis’ and more.

The researchers will present their paper later this month at the SIGGRAPH conference in Los Angeles.


Uber’s AI beats troublesome games with new type of reinforcement learning
https://news.deepgeniusai.com/2018/11/27/uber-ai-games-reinforcement-learning/
Tue, 27 Nov 2018 14:35:47 +0000

The post Uber’s AI beats troublesome games with new type of reinforcement learning appeared first on AI News.

Video games have become a proving ground for AIs and Uber has shown how its new type of reinforcement learning has succeeded where others have failed.

Some of mankind’s most complex games, like Go, have failed to challenge AIs from the likes of DeepMind. Reinforcement learning trains algorithms by running scenarios repeatedly with a ‘reward’ given for successes, often a score increase.

Two classic games from the 80s – Montezuma’s Revenge and Pitfall! – have thus far been immune to a traditional reinforcement learning approach. This is because they have little in the way of notable rewards until later in the games.

Applying traditional reinforcement learning typically results in a failure to progress out of the first room in Montezuma’s Revenge, while in Pitfall! it fails completely.

One way researchers have attempted to provide the necessary rewards is by adding extra rewards for exploration, an approach called ‘intrinsic motivation’. However, it has shortcomings.

“We hypothesize that a major weakness of current intrinsic motivation algorithms is detachment, wherein the algorithms forget about promising areas they have visited, meaning they do not return to them to see if they lead to new states,” wrote Uber’s researchers.

Uber’s AI research team in San Francisco developed a new type of reinforcement learning to overcome the challenge.

The researchers call their approach ‘Go-Explore’, whereby the AI returns to promising areas it has already visited to check whether exploring further from there yields a better result. Supplementing the approach with human knowledge to guide it towards notable areas sped up its progress dramatically.
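The archive-and-return loop can be sketched in a heavily simplified form. In the toy below, the stochastic exploration step is reduced to trying each action once, which makes the loop deterministic (and effectively breadth-first search); this is not Uber's implementation, but the select/return/explore/archive structure is the same.

```python
from collections import deque

def go_explore(goal=(3, 3), max_cells=10000):
    """Simplified sketch of Go-Explore's loop on a grid world: keep an archive
    of visited cells, pick one, return to it by replaying the stored actions,
    explore from there, and archive any newly discovered cells."""
    moves = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}
    archive = {(0, 0): []}        # cell -> action sequence that reaches it
    frontier = deque([(0, 0)])    # cells not yet explored from
    while frontier and len(archive) < max_cells:
        cell = frontier.popleft()       # select a promising cell
        trajectory = archive[cell]      # "return" by replaying its actions
        for action, (dx, dy) in moves.items():
            nxt = (cell[0] + dx, cell[1] + dy)
            if nxt not in archive:      # the detachment fix: remember how to revisit
                archive[nxt] = trajectory + [action]
                frontier.append(nxt)
            if nxt == goal:
                return archive[nxt]
    return None

print(go_explore())  # a shortest 6-action path: three 'U's and three 'R's
```

Because reaching a cell is remembered explicitly, the algorithm never "forgets" a promising area the way a purely intrinsic-motivation agent can.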

If nothing else, the research provides some comfort that us feeble humans are not yet fully redundant, and that the best results will be attained by working hand-in-binary with our virtual overlords.


Google improves AI model training by open-sourcing framework
https://news.deepgeniusai.com/2018/08/28/google-ai-model-open-source-framework/
Tue, 28 Aug 2018 10:32:23 +0000

The post Google improves AI model training by open-sourcing framework appeared first on AI News.

Google is helping researchers seeking to train AI models by open-sourcing a reinforcement learning framework used for its own projects.

Reinforcement learning has been used for some of the most impressive AI demonstrations thus far, including those which beat professional human players at Go and Dota 2. Google subsidiary DeepMind uses it for its Deep Q-Network (DQN).

Building a reinforcement learning framework takes both time and significant resources. For AI to reach its full potential, it needs to become more accessible.

Starting today, Google is making an open source reinforcement learning framework based on TensorFlow – its machine learning library – available on GitHub.

Pablo Samuel Castro and Marc G. Bellemare, Google Brain researchers, wrote in a blog post:

“Inspired by one of the main components in reward-motivated behavior in the brain and reflecting the strong historical connection between neuroscience and reinforcement learning research, this platform aims to enable the kind of speculative research that can drive radical discoveries.

This release also includes a set of colabs that clarify how to use our framework.”

Google’s framework was designed with three focuses: flexibility, stability, and reproducibility.

The company is providing 15 code examples for the Arcade Learning Environment — a platform which uses video games to evaluate the performance of AI technology — along with four distinct machine learning models: C51, the aforementioned DQN, Implicit Quantile Network, and the Rainbow agent.

Reinforcement learning is among the most effective methods of training. If you’re training a dog, offering treats as a reward for the desired behaviour is a key example of positive reinforcement in practice.

Training a machine is a similar concept, only the rewards are delivered or withheld as ones and zeros instead of tasty treats or a paycheck.
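In code, the "treat" is just a number in an update rule. The minimal sketch below is plain tabular Q-learning on a toy corridor (a generic illustration, not tied to Google's framework): the only feedback is a reward of 1 at the right end, yet the agent learns to walk towards it.

```python
# Minimal reward-driven training: tabular Q-learning on a 5-state corridor
# where only reaching the right end pays a reward of 1 (the "treat").

N_STATES, GAMMA, ALPHA = 5, 0.9, 0.5
ACTIONS = (-1, +1)  # step left, step right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0  # treat only at the goal
    return nxt, reward

for _ in range(200):  # sweep every state-action pair instead of sampling
    for s in range(N_STATES - 1):  # the last state is terminal
        for a in ACTIONS:
            nxt, r = step(s, a)
            target = r if nxt == N_STATES - 1 else r + GAMMA * max(Q[(nxt, b)] for b in ACTIONS)
            Q[(s, a)] += ALPHA * (target - Q[(s, a)])

policy = [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)  # [1, 1, 1, 1] – the learned policy always steps right
```

Frameworks like the one Google is releasing package this same reward-driven loop behind far more capable agents such as DQN and Rainbow.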

“Our hope is that our framework’s flexibility and ease-of-use will empower researchers to try out new ideas, both incremental and radical,” wrote Bellemare and Castro. “We are already actively using it for our research and finding it is giving us the flexibility to iterate quickly over many ideas.”

“We’re excited to see what the larger community can make of it.”

What are your thoughts on Google’s open-sourcing of its reinforcement learning framework?

 

AI is helping to make treatment for cancer more bearable
https://news.deepgeniusai.com/2018/08/13/ai-helping-make-treatment-cancer/
Mon, 13 Aug 2018 14:44:13 +0000

The post AI is helping to make treatment for cancer more bearable appeared first on AI News.

Researchers from MIT are using artificial intelligence to make treatment for cancer less debilitating but just as effective for patients.

The AI learns from historical patient data to determine the lowest doses and frequencies of medication that still delivered the desired results in shrinking tumours.

In some cases, the monthly administration of doses was reduced to just twice per year while achieving the same goal. Based on a trial of fifty patients, treatments were reduced to between a quarter and half of the prior doses.

Pratik Shah, Principal Investigator at MIT Media Lab, says:

“We kept the goal, where we have to help patients by reducing tumour sizes but, at the same time, we want to make sure the quality of life — the dosing toxicity — doesn’t lead to overwhelming sickness and harmful side effects.”

Some of the side effects of cancer medication can do more harm than good to a patient’s quality of life. By implementing the AI’s treatment strategy, the least toxic doses can be used.

The current model focuses on glioblastoma treatment.

Glioblastoma is the most aggressive form of brain cancer, although it can also be found in the spinal cord. It’s more commonly found in older adults but can occur at any age.

Sufferers are often given a life expectancy of up to five years. Doctors often administer the maximum safe dosages to shrink tumours as much as possible, but with side effects that can impact a patient’s quality of life over that period.

In a press release, MIT said:

“The researchers’ model, at each action, has the flexibility to find a dose that doesn’t necessarily solely maximize tumour reduction, but that strikes a perfect balance between maximum tumour reduction and low toxicity.”

“This technique has various medical and clinical trial applications, where actions for treating patients must be regulated to prevent harmful side effects.”

Reinforcement learning was used for the model, whereby the AI seeks ‘rewards’ and avoids ‘penalties’, optimising its actions accordingly.

The model started by determining whether to administer or withhold a dose and, if administering, whether a full dose or just a portion was necessary.

A second clinical model is pinged each time an action is taken in order to predict the effect on the tumour.

To prevent the model from simply giving frequent maximum dosages, the researchers’ AI received a penalty whenever it handed out full doses or administered medication too often.

Without the penalty in place, the results were very similar to a treatment regime created by humans. With the penalties, the frequency and potency of the doses were significantly reduced.
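The effect of such a penalty can be shown with a deliberately toy calculation (every number below is invented and has nothing to do with MIT's clinical model): score a handful of fixed dosing schedules by tumour reduction minus a toxicity penalty, and watch the optimum shift away from maximum dosing.

```python
# Toy illustration of the dosing trade-off: score fixed schedules of four
# dosing decisions (skip, half dose, full dose) by tumour reduction minus a
# toxicity penalty. The formulas are made up for illustration only.
import itertools

def tumour_reduction(schedule):
    # Diminishing returns: each extra dose shrinks the tumour less.
    return 1.0 - 0.5 ** sum(schedule)

def toxicity(schedule):
    return sum(d ** 2 for d in schedule)  # full doses hurt more than halves

def best_schedule(penalty):
    options = itertools.product([0.0, 0.5, 1.0], repeat=4)  # 4 decision points
    return max(options, key=lambda s: tumour_reduction(s) - penalty * toxicity(s))

print(best_schedule(0.0))  # no penalty: (1.0, 1.0, 1.0, 1.0) – always dose fully
print(best_schedule(0.1))  # with penalty: (0.5, 0.5, 0.5, 0.5) – half doses
```

With the penalty switched off the maximum dose always wins, exactly as the researchers observed; switching it on halves every dose while giving up only a modest amount of tumour reduction.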

The full research paper can be found here (PDF)

What are your thoughts on using AI to improve cancer patients’ quality of life?

 
