Blog

  • AlphaFold 2 Through the Context of BERT


    Meghan Heintz

    Understanding AI applications in bio for machine learning engineers

    Photo by Google DeepMind on Unsplash

    AlphaFold 2 and BERT were both developed in the cradle of Google’s deeply lined pockets, albeit by different departments and at different times: BERT came out of Google AI in 2018, while AlphaFold 2 debuted from DeepMind at CASP14 in 2020. They represented huge leaps forward in state-of-the-art models for natural language processing (NLP) and biology respectively. For BERT, this meant topping the leaderboard on benchmarks like GLUE (General Language Understanding Evaluation) and SQuAD (Stanford Question Answering Dataset). For AlphaFold 2 (hereafter just referred to as AlphaFold), it meant achieving near-experimental accuracy in predicting 3D protein structures. In both cases, these advancements were largely attributed to the transformer architecture and its self-attention mechanism.

    I expect most machine learning engineers have a cursory understanding of how BERT, or Bidirectional Encoder Representations from Transformers, works with language, but only a vague, metaphorical understanding of how the same architecture is applied to the field of biology. The purpose of this article is to explain the concepts behind the development and success of AlphaFold through the lens of how they compare and contrast with BERT.

    Forewarning: I am a machine learning engineer and not a biologist, just a curious person.

    BERT Primer

    Before diving into protein folding, let’s refresh our understanding of BERT. At a high level, BERT is trained on two objectives: masked token prediction and next-sentence prediction.

    Example masked token prediction where “natural” was the masked token in the target sentence. (All images, unless otherwise noted, are by the author)
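    To make the first objective concrete, here is a minimal sketch of masked token prediction using the Hugging Face transformers library. The checkpoint name is the standard public one, but treat the snippet as illustrative; it is not BERT’s original training code.

    ```python
    # Illustrative sketch: query a pretrained BERT for a masked token.
    from transformers import pipeline

    unmasker = pipeline("fill-mask", model="bert-base-uncased")

    # BERT scores candidate tokens for the [MASK] position using both
    # the left and right context.
    for prediction in unmasker("Language is a [MASK] phenomenon."):
        print(prediction["token_str"], round(prediction["score"], 3))
    ```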

    BERT falls into the sequence model family. Sequence models are a class of machine learning models designed to handle and make sense of sequential data where the order of the elements matters. Members of the family include Recurrent Neural Networks (RNNs), Long Short-Term Memory networks (LSTMs), and Transformers. As a Transformer model (like its more famous relative, GPT), a key unlock for BERT was that training could be parallelized. RNNs and LSTMs process sequences sequentially, which slows down training and limits the applicable hardware. Transformer models use the self-attention mechanism, which processes the entire sequence in parallel and allows training to leverage modern GPUs and TPUs, which are optimized for parallel computing.

    Processing the entire sequence at once not only decreased training time but also improved embeddings by modeling the contextual relationships between words. This allows the model to better understand dependencies, regardless of their position in the sequence. A classic example illustrates this concept: “I went fishing by the river bank” and “I need to deposit money in the bank.” To readers, bank clearly represents two distinct concepts, but previous models struggled to differentiate them. The self-attention mechanism in transformers enables the model to capture these nuanced differences. For a deeper dive into this topic, I recommend watching this Illustrated Guide to Transformers Neural Network: A step by step explanation.

    Example sentences where previous NLP models would have failed to differentiate the two meanings of “bank” (the river bank versus the financial institution).

    One reason RNNs and LSTMs struggle is that they are unidirectional, i.e., they process a sentence from left to right. So if the sentence were rewritten as “At the bank, I need to deposit money,” then “money” would no longer clarify the meaning of “bank,” because it arrives after it. The self-attention mechanism eliminates this fragility by allowing each word in the sentence to “attend” to every other word, both before and after it, making the model “bidirectional.”
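    We can see this bidirectional, contextual behavior directly. The sketch below, assuming the Hugging Face transformers and PyTorch packages, extracts BERT’s embedding for “bank” from each of the two example sentences and compares them; the exact similarity value will vary, but the two vectors are measurably different.

    ```python
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    def bank_embedding(sentence: str) -> torch.Tensor:
        """Return BERT's contextual embedding for the token 'bank'."""
        inputs = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state[0]  # (seq_len, 768)
        tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
        return hidden[tokens.index("bank")]

    river = bank_embedding("I went fishing by the river bank.")
    money = bank_embedding("I need to deposit money in the bank.")

    # The same surface word gets a different vector in each context.
    similarity = torch.nn.functional.cosine_similarity(river, money, dim=0)
    print(f"cosine similarity between the two 'bank' vectors: {similarity.item():.2f}")
    ```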

    AlphaFold and BERT Comparison

    Now that we’ve reviewed the basics of BERT, let’s compare it to AlphaFold. Like BERT, AlphaFold is a sequence model. However, instead of processing words in sentences, AlphaFold’s inputs are amino acid sequences and multiple sequence alignments (MSAs), and its output/prediction is the 3D structure of the protein.

    Let’s review what these inputs and outputs are before learning more about how they are modeled.

    First input: Amino Acid Sequences

    Amino acid sequences are embedded into high-dimensional vectors, similar to how text is embedded in language models like BERT.
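    As a toy sketch of what that embedding step looks like (the vocabulary, embedding size, and peptide below are invented for illustration and are not AlphaFold’s actual featurization):

    ```python
    import torch

    # The 20 standard amino acids, one-letter codes (toy vocabulary).
    AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
    aa_to_idx = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

    # Toy lookup table mapping each residue type to a 64-dimensional vector.
    embed = torch.nn.Embedding(num_embeddings=len(AMINO_ACIDS), embedding_dim=64)

    sequence = "MKTAYIAKQR"  # hypothetical short peptide
    ids = torch.tensor([aa_to_idx[aa] for aa in sequence])
    vectors = embed(ids)
    print(vectors.shape)  # torch.Size([10, 64]): one vector per residue
    ```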

    Reminder from your high school biology class: the specific sequence of amino acids that make up a protein is determined by mRNA, which is transcribed from the instructions in DNA. As the amino acids are linked together, they interact with one another through various chemical bonds and forces, causing the protein to fold into a unique three-dimensional structure. This folded structure is crucial for the protein’s function, as its shape determines how it interacts with other molecules and performs its biological roles. Because the 3D structure is so important for determining a protein’s function, “protein folding” has been a major research problem for the last half-century.

    Bio 101 reminder on the relationship between DNA, mRNA, and Amino Acid Sequences

    Before AlphaFold, the only reliable way to determine how an amino acid sequence would fold was experimental validation, using techniques like X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, and cryo-electron microscopy (cryo-EM). Though accurate, these methods are time-consuming, labor-intensive, and expensive.

    So what is an MSA (multiple sequence alignment) and why is it another input into the model?

    Second input: Multiple Sequence Alignments (MSAs), represented as matrices in the model

    Amino acid sequences contain the necessary instructions to build a protein, but they also include some less important or more variable regions. Comparing this to language, I think of these less important regions as the “stop words” of protein folding instructions. To determine which regions of the sequence are the analogous stop words, MSAs are constructed from homologous (evolutionarily related) sequences of proteins with similar functions, arranged as a matrix in which the target sequence is the first row.

    Similar regions of the sequences are thought to be “evolutionarily conserved” (parts of the sequence that stay the same). Highly conserved regions across species are structurally or functionally important (like active sites in enzymes). My imperfect metaphor here is to think about lining up sentences from Romance languages to identify shared important words. However, this metaphor doesn’t fully explain why MSAs are so important for predicting the 3D structure. Conserved regions are so critical because they allow us to detect co-evolution between amino acids. If two residues tend to mutate in a coordinated way across different sequences, it often means they are physically close in the 3D structure and interact with each other to maintain protein stability. This kind of evolutionary relationship is difficult to infer from a single amino acid sequence but becomes clear when analyzing an MSA.

    An imperfect metaphor for MSAs: Like comparing similar words in Romance languages (e.g., “branches”: ramas, branches, rami, ramos, ramuri, branques), MSAs align sequences to reveal evolutionary connections, tracing shared origins through small variations.
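    To ground the idea, here is a toy sketch over a fabricated four-sequence alignment: a crude per-column conservation score, plus a crude mutual-information score for spotting columns that co-vary. Real pipelines use far more sophisticated statistics and much deeper alignments.

    ```python
    import numpy as np
    from collections import Counter

    # Fabricated toy MSA: each row is a homologous sequence, the first row
    # is the target. Real MSAs have hundreds or thousands of rows.
    msa = ["MKVLA", "MKILA", "MRVLG", "MKVFA"]
    cols = list(zip(*msa))  # columns of the alignment

    # Conservation: fraction of sequences agreeing with the modal residue.
    for i, col in enumerate(cols):
        top_count = Counter(col).most_common(1)[0][1]
        print(f"column {i}: conservation = {top_count / len(col):.2f}")

    # Co-variation: a crude mutual-information score between two columns.
    def mutual_information(a, b):
        n = len(a)
        pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
        return sum(
            (c / n) * np.log2((c / n) / ((pa[x] / n) * (pb[y] / n)))
            for (x, y), c in pab.items()
        )

    print(f"MI(col 1, col 3) = {mutual_information(cols[1], cols[3]):.2f}")
    ```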

    Here is another place where the comparison between natural language processing and protein folding diverges: MSAs must be constructed, and researchers often manually curate them for optimal results. Biologists use tools like BLAST (Basic Local Alignment Search Tool) to search with their target sequences for “homologs,” or similar sequences. If you’re studying humans, this could mean finding sequences from other mammals, vertebrates, or more distant organisms. The sequences are then manually selected, considering things like comparable lengths and similar functions, because including too many sequences with divergent functions degrades the quality of the MSA. This is a HUGE difference from how training data is collected for natural language models, which are trained on huge swaths of data hoovered up from anywhere and everywhere. Biology models, by contrast, need highly skilled and conscientious dataset curators.
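    For a flavor of the homolog-search step, here is a minimal sketch using Biopython’s interface to NCBI BLAST. The query sequence is hypothetical, and the call requires network access to NCBI, so treat it as illustrative:

    ```python
    # Minimal sketch: submit a protein query to NCBI BLAST via Biopython.
    from Bio.Blast import NCBIWWW, NCBIXML

    query = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # hypothetical target sequence

    # Search the non-redundant protein database for homologs (slow; runs remotely).
    result_handle = NCBIWWW.qblast("blastp", "nr", query)
    record = NCBIXML.read(result_handle)

    # Candidate homologs a biologist would then curate by length and function.
    for alignment in record.alignments[:5]:
        print(alignment.title, alignment.hsps[0].expect)
    ```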

    What is being predicted/output?

    In BERT, the prediction or target is the masked token or next sentence. For AlphaFold, the target is the 3D structure of the protein, represented as the 3D coordinates of protein atoms, which defines the spatial arrangement of amino acids in a folded protein. Each set of 3D coordinates is collected experimentally, reviewed, and stored in the Protein Data Bank (PDB). Recently solved structures serve as a validation set for evaluation.

    The output of AlphaFold is the 3D structure of a protein, which consists of the x, y, z coordinates of the atoms that make up the protein’s amino acids.
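    To see what these targets actually look like, here is a minimal sketch using Biopython to download one solved structure from the PDB and read out its atom coordinates; the PDB ID 1CRN (crambin, a small plant protein) is just an arbitrary example:

    ```python
    from Bio.PDB import PDBList, PDBParser

    # Download an experimentally solved structure from the Protein Data Bank.
    path = PDBList().retrieve_pdb_file("1CRN", file_format="pdb", pdir=".")
    structure = PDBParser(QUIET=True).get_structure("1CRN", path)

    # Each atom carries the x, y, z coordinates AlphaFold learns to predict.
    for atom in list(structure.get_atoms())[:5]:
        print(atom.get_name(), atom.get_coord())
    ```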

    How are the inputs and outputs tied together?

    Both the target sequence and the MSA are processed independently through a series of transformer blocks, using the self-attention mechanism to generate embeddings. The MSA embedding captures evolutionary relationships, while the target sequence embedding captures local context. These contextual embeddings are then fed into downstream layers to predict pairwise interactions between amino acids, ultimately inferring the protein’s 3D structure.

    Within each sequence, the pairwise residue representation (the relationship or interaction between two amino acids within a protein sequence) predicts spatial distances and orientations between residues, which are critical for modeling how distant parts of the protein come into proximity when folded. The self-attention mechanism allows the model to account for both local and long-range dependencies within the sequence and the MSA. This is important because when a sequence is folded, residues that are far from each other in the sequence may end up close to each other spatially.
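    A toy sketch of that geometric idea, with fabricated C-alpha coordinates: residues four positions apart in the sequence can still sit within contact distance in 3D. The 8-angstrom cutoff below is a common convention for defining a residue contact.

    ```python
    import numpy as np

    # Fabricated C-alpha coordinates (angstroms) for a 5-residue chain
    # that bends back on itself.
    coords = np.array([
        [0.0, 0.0, 0.0],
        [3.8, 0.0, 0.0],
        [7.6, 0.0, 0.0],
        [7.6, 3.8, 0.0],
        [3.8, 3.8, 0.0],
    ])

    # Pairwise Euclidean distance matrix, shape (5, 5).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Residues 0 and 4 are far apart in sequence but in contact in space.
    contact_map = (dist < 8.0).astype(int)
    print(contact_map)
    ```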

    The loss function for AlphaFold is considerably more complex than the BERT loss function. BERT faces no spatial or geometric constraints, and its loss function is much simpler because it only needs to predict missing words or sentence relationships. In contrast, AlphaFold’s loss function involves multiple aspects of protein structure (distance distributions, torsion angles, 3D coordinates, etc.), and the model optimizes for both geometric and spatial predictions. This component-heavy loss function ensures that AlphaFold accurately captures the physical properties and interactions that define the protein’s final structure.
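    For intuition only, here is an invented multi-term loss in PyTorch. The terms and weights are illustrative, not AlphaFold’s actual loss (which uses components such as a distogram loss and the frame-aligned point error), but they show how several structural objectives can be combined into one scalar to optimize:

    ```python
    import torch
    import torch.nn.functional as F

    def toy_structure_loss(pred_dist, true_dist, pred_torsion, true_torsion,
                           pred_coords, true_coords):
        """Invented multi-term structural loss; not AlphaFold's actual loss."""
        # Penalize errors in predicted pairwise residue distances.
        distance_loss = F.mse_loss(pred_dist, true_dist)
        # Torsion angles are periodic, so compare their sines and cosines.
        torsion_loss = F.mse_loss(
            torch.stack([pred_torsion.sin(), pred_torsion.cos()]),
            torch.stack([true_torsion.sin(), true_torsion.cos()]),
        )
        # Penalize deviation of the final 3D coordinates.
        coord_loss = F.mse_loss(pred_coords, true_coords)
        # Weighted sum; the weights here are arbitrary.
        return 1.0 * distance_loss + 0.5 * torsion_loss + 1.0 * coord_loss
    ```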

    While there is essentially no meaningful post-processing required for BERT predictions, AlphaFold’s predicted 3D coordinates undergo energy minimization and geometric refinement based on the physical principles of proteins. These steps ensure that predicted structures are physically viable and biologically functional.

    Conclusion

    • AlphaFold and BERT both benefit from the transformer architecture and the self-attention mechanism, which improve contextual embeddings and enable faster training on GPUs and TPUs.
    • AlphaFold has a much more complex data preparation process than BERT. Curating MSAs from experimentally derived data is harder than vacuuming up a large corpus of text!
    • AlphaFold’s loss function must account for spatial and geometric constraints, making it much more complex than BERT’s.
    • AlphaFold predictions require post-processing to confirm that they are physically viable, whereas BERT predictions do not.

    Thank you for reading this far! I’m a big believer in cross-functional learning and I believe as machine learning engineers we can learn more by challenging ourselves to learn outside our immediate domains. I hope to continue this series on Understanding AI Applications in Bio for Machine Learning Engineers throughout my maternity leave. ❤


    AlphaFold 2 Through the Context of BERT was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.


  • Fourth watchOS 11.1, tvOS 18.1, and visionOS 2.1 have arrived


    Continuing its beta testing process for this series, Apple has issued fourth developer builds of watchOS 11.1, tvOS 18.1, and visionOS 2.1.

    watchOS 11 introduces better cycle tracking, more customizable Activity Rings, and more personalization.

    This period of beta releases is a bit unusual, as Apple is providing builds in two distinct groups. This is caused by the beta testing of iOS 18.1, iPadOS 18.1, and macOS Sequoia 15.1 to hammer out issues with Apple Intelligence.

    Since those three entered testing far earlier, the betas are split into two groups, with the Apple Intelligence-infused builds separate from the others that don’t have it.

    Continue Reading on AppleInsider | Discuss on our Forums


  • Apple’s six developer betas land for iOS 18.1, iPadOS 18.1, macOS Sequoia 15.1


    Apple’s testing of Apple Intelligence continues, with the sixth developer betas of iOS 18.1, iPadOS 18.1, and macOS Sequoia 15.1 now available.

    Examples of Apple Intelligence at work.

    The sixth developer betas of iOS 18.1, iPadOS 18.1, and macOS Sequoia 15.1 arrive after the third builds of visionOS 2.1, tvOS 18.1, and watchOS 11.1, which arrived on October 1. The fifth and second builds, respectively, landed on September 23, while the fourth and first were introduced on September 17.

    The difference in build counts is due to Apple beta testing tvOS 18, watchOS 11, and visionOS 2 at the same time as the Apple Intelligence-infused iOS 18.1, iPadOS 18.1, and macOS Sequoia 15.1.

    Continue Reading on AppleInsider | Discuss on our Forums


  • Zombie-horror ‘Resident Evil 2’ heads to Mac on Dec 31


    Capcom’s “Resident Evil 2” is on the way, with the zombie-based game lurching into the Mac App Store on December 31.

    Resident Evil 2 [Mac App Store]

    The Resident Evil franchise has been slowly moving onto Apple’s ecosystem, with Resident Evil 4 and Resident Evil 7: Biohazard having already arrived on Mac and Apple’s other platforms in 2024. There is one more fright-fest on the way, in the form of Resident Evil 2.

    Briefly mentioned as on the way over the summer, Resident Evil 2 has since surfaced in the Mac App Store. It is listed as available for preorder, with an expected release date of December 31, 2024.

    Continue Reading on AppleInsider | Discuss on our Forums


  • Apple iPads are back down to as low as $199 for Prime Big Deals Days


    Amazon’s iPad deals for Prime Day offer discounts of up to $200 off and prices as low as $199.

    Prime Day iPad deals are in effect now. [Apple/AppleInsider]

    Prime Day officially starts at midnight Pacific Time on Oct. 8, but Amazon is getting a head start by discounting iPads today. The entire lineup is on sale, ranging from the budget-friendly $199 9th Gen to $200 off select iPad Pros. Fresh Apple Pencil deals are in effect as well to pair with a new or existing iPad.

    Below is a curated list of the top offers:

    Continue Reading on AppleInsider


  • Congo government plans to crack down on buyers like Apple for conflict minerals

    The Democratic Republic of Congo (DRC) is considering legal action against tech companies such as Apple to reduce the amount of conflict minerals sourced from its eastern provinces.

    Mine with acid lake | Credit: dimitrisvetsikas1969 on Pixabay

    The Eastern Congo is the world’s biggest supplier of tantalum, a conductive metal used in devices like the iPhone. Because of this, more than 100 militias have sought to control the tantalum trade.

    In 2024, rebel group M23 took control of the largest tantalum mine, Rubaya. Congo, the US, and United Nations experts say Rwanda has sent thousands of troops to Congo to back the M23 — though Rwanda denies the allegations.

    Continue Reading on AppleInsider | Discuss on our Forums


  • SmartThings adds Matter 1.3, eufy S3 Cam Pro launches, & more on HomeKit Insider


    On this episode of the HomeKit Insider Podcast, eufy launches a new Apple Home camera system, Sonos makes a pledge, and more gear is released.

    HomeKit Insider Podcast

    To kick off the news recap, we start with the latest from Sonos. After adding alarms back to the app in its most recent update, the company has now made a new commitment to quality.

    Its CEO laid out a multi-point promise to ensure the highest level of quality for its products going forward. This includes a new quality ombudsperson, regular reports, and internal checks.

    Continue Reading on AppleInsider | Discuss on our Forums


  • Best Prime Day Fire TV deals to shop in October 2024

    October Prime Day is just one day away, so it’s a great time to buy a new TV for a discount, especially if you’re interested in Amazon’s Fire TV brand.


  • The 30+ best computer monitor deals for October Prime Day

    We’re seeing discounts on the top computer monitors on the eve of Amazon’s October Prime Day sale. Below is a detailed list of all of the best monitor deals we found.


  • Best Prime Day laptop deals to shop in October 2024

    Amazon’s October Prime Day starts tomorrow, but we’ve got our eyes on early deals live now, including sales on Apple MacBooks and laptops from Asus, Lenovo, Microsoft, and more.
