Probably my most environmentalist view is that species are an enormously valuable public good. I think AlphaFold may be the first real sign that I was right about that. You probably don’t know what I mean, because AlphaFold is widely misunderstood. People think it’s a solution to the “protein folding problem” as formulated decades ago: figuring out the protein fold based on the sequence. Actually, it figures out the protein fold based on the sequence plus hundreds or thousands of related sequences. It’s using the history of life on Earth like a natural experiment that determines which modifications of the protein maintain the functional fold. However, I think almost all of these sequences are from microbes, so it actually seems orthogonal to environmentalist concerns about extinction. But maybe it’s a proof of principle.
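To make the "natural experiment" idea concrete, here is a minimal sketch of how related sequences can carry structural information. It is not how AlphaFold works internally (AlphaFold feeds the alignment to a neural network); it uses plain mutual information between alignment columns, a classic covariation statistic, as a stand-in. The alignment `TOY_MSA` and the function names are made up for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def column_mutual_information(msa, i, j):
    """Mutual information (in bits) between columns i and j of an alignment."""
    n = len(msa)
    pair_counts = Counter((seq[i], seq[j]) for seq in msa)
    counts_i = Counter(seq[i] for seq in msa)
    counts_j = Counter(seq[j] for seq in msa)
    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts
        ratio = p_ab * n * n / (counts_i[a] * counts_j[b])
        mi += p_ab * math.log2(ratio)
    return mi


def top_covarying_pairs(msa, k=1):
    """Return the k column pairs with the highest mutual information."""
    length = len(msa[0])
    scores = {
        (i, j): column_mutual_information(msa, i, j)
        for i, j in combinations(range(length), 2)
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy alignment: columns 1 and 4 swap together (K with E,
# E with K), as if a charge pair had to be preserved for the fold to hold.
# Column 2 varies independently; columns 0 and 3 are fully conserved.
TOY_MSA = [
    "AKLGE",
    "AELGK",
    "AKMGE",
    "AELGK",
    "AKLGE",
    "AEMGK",
]

if __name__ == "__main__":
    print(top_covarying_pairs(TOY_MSA))
```

Columns that mutate together across species hint that the two residues interact, which is the kind of signal a deep alignment supplies and a single sequence cannot. Real alignments need corrections for phylogenetic bias and gaps that this sketch ignores.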
You can never tell exactly where information in ML comes from, but the protein structures that AlphaFold2 was trained on come from the PDB, which is mostly human proteins. AF2 generally uses 30-100 related proteins in the MSA, so some might be from bacteria, but for a protein from a mammal I don’t expect that many to show up, because (1) there are more closely related organisms and (2) bacteria, as prokaryotes, don’t have analogues to eukaryotic proteins.