Anyone who has tried teaching a dog new tricks knows the basics of reinforcement learning. We can modify the dog's behavior by repeatedly offering rewards for obedience and punishments for misbehavior. In reinforcement learning (RL), the dog would be an agent, exploring its environment and receiving rewards or penalties based on its actions. This very simple concept has been formalized mathematically and extended to advance the fields of self-driving cars and autonomous labs.
As a New Yorker who finds herself riddled with anxiety while driving, the benefits of having a stoic robot chauffeur are obvious to me. The benefits of an autonomous lab only became apparent when I considered the immense power of the new wave of generative AI biology tools. We can generate a huge volume of high-quality hypotheses and are now bottlenecked by experimental validation.
If we can use reinforcement learning (RL) to teach a car to drive itself, can we also use it to churn through experimental validations of AI-generated ideas? This article continues our series, Understanding AI Applications in Bio for ML Engineers, by examining how reinforcement learning is applied in self-driving cars and autonomous labs (for example, AlphaFlow).
The most general way to think about RL is that it's learning by doing. The agent interacts with its environment, learns what actions produce the highest rewards, and avoids penalties through trial and error. If learning through trial and error while going 65mph in a 2-ton metal box sounds a bit terrifying, and like something a regulator would not approve of, you'd be correct. Most RL driving has been done in simulation environments, and current self-driving technology still focuses on supervised learning techniques. But Alex Kendall proved that a car could teach itself to drive with a couple of cheap cameras, a massive neural network, and twenty minutes. So how did he do it?
More mainstream self-driving approaches use specialized modules for each subproblem: vehicle management, perception, mapping, decision making, etc. But Kendall's team used deep reinforcement learning, which is an end-to-end approach. This means that instead of breaking the problem into many subproblems and training an algorithm for each one, a single algorithm makes all the decisions based on the input (input -> output). This is proposed as an improvement over supervised approaches because knitting together many different algorithms results in complex interdependencies.
Reinforcement learning is a class of algorithms intended to solve a Markov Decision Process (MDP): a decision-making problem where the outcomes are partly random and partly controllable. Kendall's team's goal was to frame driving as an MDP, specifically with the simplified goal of lane-following. Here is a breakdown of how the reinforcement learning components map to the self-driving problem:
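As a rough sketch of that framing (the class, numbers, and reward here are illustrative toys, not Kendall's actual setup), lane-following as an MDP might look like:

```python
import random

# Hypothetical sketch: driving-as-MDP for lane following.
# State: a camera-derived feature (here, lateral offset from lane centre in metres).
# Action: a continuous steering adjustment.
# Reward: distance travelled without leaving the lane.

class LaneFollowingMDP:
    """Toy MDP where the agent must keep its lane offset near zero."""

    def __init__(self, lane_half_width=1.0):
        self.lane_half_width = lane_half_width
        self.offset = 0.0  # state: lateral offset from lane centre

    def reset(self):
        self.offset = 0.0
        return self.offset

    def step(self, steering):
        # The transition is partly random (road/noise) and partly
        # controllable (steering) -- exactly the MDP property.
        self.offset += steering + random.gauss(0.0, 0.05)
        done = abs(self.offset) > self.lane_half_width  # left the lane
        reward = 0.0 if done else 1.0  # stand-in for distance travelled
        return self.offset, reward, done

env = LaneFollowingMDP()
state = env.reset()
# A naive proportional "policy": steer back toward the centre.
for _ in range(100):
    action = -0.5 * state
    state, reward, done = env.step(action)
    if done:
        break
```

In the real problem the state is a camera image rather than a single number, but the agent/environment/reward structure is the same.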
These pieces come together through an iterative learning process. The agent uses its policy to take actions in the environment, observes the resulting state and reward, and updates both the policy (via the actor) and the value function (via the critic). Here's how it works step-by-step:
1. Observation: The agent observes the current state of the environment (in Kendall's case, a forward-facing camera image along with vehicle speed and steering angle).
2. Action Selection: The actor network maps the observed state to an action (steering and speed commands).
3. Exploration: Noise is added to the chosen action so the agent tries variations it would not otherwise attempt.
4. Execution: The agent executes the action, and the environment transitions to a new state.
5. Reward: The agent receives a reward signal; in Kendall's work, this was based on the distance traveled before the safety driver had to intervene.
6. Replay Buffer: Experiences (state, action, reward, next state) are stored in a replay buffer. During training, the agent samples from this buffer to update its networks, ensuring efficient use of data and stability in training.
7. Iteration: The process repeats over and over. The agent refines its policy and value function through trial and error, gradually improving its driving ability.
8. Evaluation: The agent’s policy is tested without exploration noise to evaluate its performance. In Kendall’s work, this meant assessing the car’s ability to stay in the lane and maximize the distance traveled autonomously.
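The replay-buffer and iteration steps above can be sketched in code. This is a minimal, hypothetical illustration: a lookup-table "critic" and a toy one-dimensional environment stand in for the deep actor-critic networks that DDPG actually uses, so the data flow stays visible.

```python
import random
from collections import deque

class ReplayBuffer:
    """Step 6: store (state, action, reward, next_state) experiences."""

    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Sampling past experience at random breaks temporal correlation
        # between consecutive steps, which stabilises training.
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))

buffer = ReplayBuffer()
value = {}   # stand-in critic: estimated value per (discrete) state
gamma = 0.9  # discount factor
alpha = 0.1  # learning rate

for episode in range(50):  # step 7: iterate over and over
    state = 0
    for t in range(20):
        action = random.choice([-1, 1])           # exploratory action
        next_state = max(state + action, 0)
        reward = 1.0 if next_state == 0 else 0.0  # toy reward signal
        buffer.add(state, action, reward, next_state)
        state = next_state
    # Learn from a batch of replayed experience, not just the latest step.
    for s, a, r, ns in buffer.sample(32):
        target = r + gamma * value.get(ns, 0.0)
        value[s] = value.get(s, 0.0) + alpha * (target - value.get(s, 0.0))

# Step 8 (evaluation) would then run the learned policy with no
# exploration noise and measure performance.
```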
Getting in a car and driving with randomly initialized weights seems a bit daunting! Luckily, Kendall's team realized that hyper-parameters could be tuned in 3D simulations before being transferred to the real world. They built a simulation engine in Unreal Engine 4 and ran a generative model that varied country roads, weather conditions, and road textures to create training simulations. This tuned vital reinforcement learning parameters such as learning rates and the number of gradient steps. It also confirmed that a continuous action space was preferable to a discrete one and that DDPG was an appropriate algorithm for the problem.
One of the most interesting aspects of this work is how generalized it is compared to the mainstream approach. The algorithms and sensors employed are much less specialized than those required by the approaches of companies like Cruise and Waymo. It doesn't require advanced mapping data or LIDAR, which could make it scalable to new roads and unmapped rural areas.
On the other hand, some downsides of this approach are:
- Trial-and-error exploration on real roads raises safety concerns, so most learning must happen in simulation.
- End-to-end models are harder to interpret and debug than modular pipelines.
- Deep RL is sample-inefficient, and lane-following is far simpler than full urban driving.
That being said, Kendall's team's achievement is an encouraging step towards autonomous driving. Their goal of lane-following was intentionally simplified, and it illustrates the ease with which RL could be incorporated into solving the self-driving problem. Now let's turn to how it can be applied in labs.
The creators of AlphaFlow argue that, much like Kendall's assessment of driving, the development of lab protocols is a Markov Decision Process. While Kendall constrained the problem to lane-following, the AlphaFlow team constrained their self-driving lab (SDL) problem to the optimization of multi-step chemical processes for shell growth of core-shell semiconductor nanoparticles. Semiconductor nanoparticles have a wide range of applications in solar energy, biomedical devices, fuel cells, environmental remediation, batteries, and more. Methods for discovering these materials are typically time-, labor-, and resource-intensive, and subject to the curse of dimensionality: the exponential growth of a parameter space as the dimensionality of a problem increases.
Their RL-based approach, AlphaFlow, successfully identified and optimized a novel multi-step reaction route, with up to 40 parameters, that outperformed conventional sequences. This demonstrates how closed-loop, RL-based approaches can accelerate the discovery of fundamental knowledge.
Colloidal atomic layer deposition (cALD) is a technique used to create core-shell nanoparticles. The material is grown layer by layer on colloidal particles or quantum dots. The process involves alternating reactant-addition steps, where a single atomic or molecular layer is deposited in each step, followed by washing to remove excess reagents. The outcomes of these steps can vary due to hidden states or intermediate conditions. This variability reinforces framing the problem as a Markov Decision Process.
Additionally, the layer-by-layer nature of the technique makes it well suited to an RL approach, where we need clear definitions of the state, available actions, and rewards. Furthermore, the reactions are designed to naturally stop after forming a single, complete atomic or molecular layer, which makes the experiment highly controllable and suitable for tools like micro-droplet flow reactors.
Here is how the components of reinforcement learning map to the self-driving lab problem:
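As a toy illustration of that mapping (the class, action names, and reward are invented for this sketch, not AlphaFlow's actual interface), the cALD cycle could be wrapped as an RL environment:

```python
# Hypothetical sketch mapping cALD shell growth onto RL components.
# State: how many layers are deposited and what was done last.
# Actions: which reagent-injection or wash step to perform next.
# Reward: a stand-in quality score (in practice, an in-line measurement).

ACTIONS = ["add_precursor_A", "add_precursor_B", "wash"]

class CALDEnv:
    """Toy cALD environment: a layer completes only after the
    alternating precursor-A -> precursor-B -> wash cycle."""

    def __init__(self, target_layers=3):
        self.target_layers = target_layers
        self.reset()

    def reset(self):
        self.layers = 0
        self.last_action = None
        return (self.layers, self.last_action)

    def step(self, action):
        # Reactions are self-limiting: repeating a reagent adds nothing,
        # and washing only finishes a layer after precursor B was added.
        if action == "wash" and self.last_action == "add_precursor_B":
            self.layers += 1
        self.last_action = action
        done = self.layers >= self.target_layers
        reward = 1.0 if done else 0.0  # stand-in for a measured quality score
        return (self.layers, self.last_action), reward, done

env = CALDEnv(target_layers=2)
state = env.reset()
for a in ["add_precursor_A", "add_precursor_B", "wash"] * 2:
    state, reward, done = env.step(a)
```

The agent's job is to discover good multi-step sequences (and their parameters) through this interface, rather than being told the cycle up front.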
Similar to Kendall's team's use of the Unreal Engine, the AlphaFlow team used a digital twin structure to pre-train hyper-parameters before conducting physical experiments. This allowed the model to learn through simulated computational experiments and explore in a more cost-efficient manner.
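A minimal sketch of the digital-twin idea, with invented function names and a toy objective standing in for the real chemistry: explore widely against a cheap simulated surrogate, then spend scarce physical experiments only on the most promising candidates.

```python
import random

def simulated_yield(params):
    # Digital twin: a cheap surrogate for the reactor
    # (a toy quadratic peaked at 0.6 for every parameter).
    return -sum((p - 0.6) ** 2 for p in params)

def physical_yield(params):
    # Stand-in for a costly real measurement: the same landscape
    # as the twin, plus measurement noise.
    return simulated_yield(params) + random.gauss(0.0, 0.01)

# Phase 1: wide, cheap exploration in simulation.
candidates = [[random.random() for _ in range(4)] for _ in range(500)]
candidates.sort(key=simulated_yield, reverse=True)

# Phase 2: only the top few candidates earn a costly physical run.
best = max(candidates[:5], key=physical_yield)
```

AlphaFlow's actual pre-training is an RL loop rather than this simple rank-and-verify scheme, but the cost structure it exploits is the same: simulation is nearly free, physical experiments are not.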
Their approach successfully explored and optimized a 40-dimensional parameter space, showcasing how RL can be used to solve complex, multi-step reactions. This advancement could be critical for increasing the throughput of experimental validation and helping us unlock advances in a range of fields.
In this post, we explored how reinforcement learning can be applied to self-driving cars and to automating lab work. While challenges remain, applications in both domains show how RL can be useful for automation. The idea of furthering fundamental knowledge through RL is of particular interest to the author, and I look forward to learning more about emerging applications of reinforcement learning in self-driving labs.
Cheers, and thank you for reading this edition of Understanding AI Applications in Bio for ML Engineers.
Reinforcement Learning: Self-Driving Cars to Self-Driving Labs was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.