Hacker News

AlphaGenome: AI for better understanding the genome

472 points by i_love_limes yesterday at 2:16 PM, 151 comments

Comments

LarsDu88 today at 12:20 AM

You know the corporate screws are coming down hard, when the model (which can be run off a single A100) doesn't get a code release or a weight release, but instead sits behind an API, and the authors say fuck it and copy-paste the entirety of the model code in pseudocode on page 31 of the white paper.

Please Google/Demis/Sergey, just release the darn weights. This thing ain't gonna be curing cancer sitting behind an API, and it's not gonna generate that much GCloud revenue when the model is this tiny.

i_am_not_groot today at 1:40 PM

Soooo... Jurassic Park?

RivieraKid yesterday at 8:56 PM

I wish there were some breakthrough in cell simulation that would let us create simulations that are as useful as molecular dynamics but feasible on modern supercomputers. Not being able to see what's happening inside cells seems like the main blocker to biological research.

another_twist today at 8:58 AM

So, a very similar approach to Conformer: a convolution head for downsampling and a transformer for time dependencies. Hmm, surprising that this idea works across application domains.
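
For anyone who hasn't seen the pattern, here is a minimal sketch of "convolve to downsample, then attend" in plain Python. It is not AlphaGenome's or Conformer's actual code. The function names, the kernel values, and the stride are all made up for illustration, tokens are single scalars, and the attention step has no learned weights.

```python
import math
import random


def conv_downsample(x, kernel, stride):
    """Strided 1D convolution: shortens the sequence by roughly `stride`x."""
    k = len(kernel)
    return [
        sum(kernel[j] * x[i + j] for j in range(k))
        for i in range(0, len(x) - k + 1, stride)
    ]


def self_attention(tokens):
    """Single-head dot-product self-attention over scalar tokens.

    Simplification (assumption): query, key and value are the token itself,
    so there are no learned projections.
    """
    out = []
    for q in tokens:
        scores = [q * k for k in tokens]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append(sum(w * v for w, v in zip(weights, tokens)) / total)
    return out


random.seed(0)
signal = [random.uniform(-1, 1) for _ in range(64)]  # stand-in for a long input
tokens = conv_downsample(signal, kernel=[0.25, 0.5, 0.25, 0.0], stride=4)
mixed = self_attention(tokens)  # every token can now see every other token
print(len(signal), len(tokens), len(mixed))
```

The point of the convolution step is that attention compares every token with every other token, so shrinking the sequence first keeps that quadratic step affordable.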

jebarker yesterday at 7:14 PM

I don't think DM is the only lab doing high-impact AI applications research, but they really seem to punch above their weight in it. Why is that? Or is it just that they have better technical marketing for their work?

xipho today at 3:20 AM

"To ensure consistent data interpretation and enable robust aggregation across experiments, metadata were standardized using established ontologies."

Can't emphasize enough how much DNA work requires human data curation to make things work; even from day one, alignment models were driven by biological observations. Glad to see UBERON, which represents a massive amount of human insight and data curation in what is for all intents and purposes a semantic-web product (OWL-based RDF at heart), playing a significant role.

eleveriven today at 7:03 AM

Curious how it'll perform when people start fine-tuning on smaller, specialized datasets.

seydor yesterday at 6:12 PM

This is such an interesting problem. Imagine expanding the input size to 3.2 Gbp, the size of the human genome. I wonder if previously unimaginable interactions would show up. Also interesting how everything revolves around U-nets and transformers these days.
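
A rough back-of-the-envelope sketch of why a whole-genome input is hard for naive dense attention. Only the 3.2 Gbp figure comes from this comment. The 128 bp bin size and the 2-byte score are hypothetical values picked for illustration, and the calculation assumes the full pairwise score matrix is materialized, which real implementations try to avoid.

```python
GENOME_BP = 3_200_000_000  # 3.2 Gbp, the figure quoted in this comment
BIN_BP = 128               # hypothetical downsampling factor (assumption)
BYTES_PER_SCORE = 2        # hypothetical 16-bit attention score (assumption)


def attention_cost(seq_len_bp, bin_bp=BIN_BP, bytes_per_score=BYTES_PER_SCORE):
    """Return (tokens, pairwise scores, bytes) for one dense attention matrix."""
    tokens = seq_len_bp // bin_bp
    pairs = tokens * tokens
    return tokens, pairs, pairs * bytes_per_score


tokens, pairs, nbytes = attention_cost(GENOME_BP)
print(f"{tokens:,} tokens, {pairs:.2e} pairs, {nbytes / 1e15:.2f} PB per head per layer")
# prints: 25,000,000 tokens, 6.25e+14 pairs, 1.25 PB per head per layer
```

Under these assumptions a single attention matrix is already in petabyte territory, so a model at that scale would need sparse, windowed, or otherwise sub-quadratic attention.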

mountainriver yesterday at 6:28 PM

With the huge jump in RNA prediction, this seems like it could be a boon for the wave of mRNA labs.

leoh today at 2:35 AM

Let’s figure out introns pls

dekhn yesterday at 4:56 PM

When I went to work at Google in 2008, I immediately advocated for spending significant resources on the biological sciences (this was well before DM started working on biology). I reasoned that Google had the data mangling and ML capabilities required to demonstrate world-leading results (and hopefully guide the way so other biologists could reproduce their techniques). We made some progress: we used Exacycle to demonstrate some exciting results in protein folding and design, and later launched Cloud Genomics to store and process large datasets for analytics.

I parted ways with Google a while ago (Sundar is a really uninspiring leader), and was never able to transfer into DeepMind, but I have to say that they are executing on my goals far better than I ever could have. It's nice to see ideas that I had germinating for decades finally playing out, and I hope these advances lead to great discoveries in biology.

It will take some time for the community to absorb this most recent work. I skimmed the paper and it's a monster, there's just so much going on.

nextos yesterday at 5:00 PM

I found it disappointing that they ignored one of the biggest problems in the field, i.e. distinguishing between causal and non-causal variants among highly correlated DNA loci. In genetics jargon, this is called fine mapping. Perhaps this is something for the next version, but it is really important for designing effective drugs that target key regulatory regions.

One interesting example of such a problem and why it is important to solve it was recently published in Nature and has led to interesting drug candidates for modulating macrophage function in autoimmunity: https://www.nature.com/articles/s41586-024-07501-1

cwmoore yesterday at 11:37 PM

Just add startofficial intel.

Scaevolus yesterday at 5:03 PM

Naturally, the (AI-generated?) hero image doesn't properly render the major and minor grooves. :-)

twothreeone yesterday at 7:39 PM

Maybe "Release" requires a bit more context, as it clearly means different things to different people:

> AlphaGenome will be available for non-commercial use via an online API at http://deepmind.google.com/science/alphagenome

So, essentially the paper is a sales pitch for a new Google service.

mattigames today at 12:08 AM

I bet the internal pitch is that the genome will help deliver better advertising: if you are at risk of colon cancer, they sell you "colon supplements". It's likely they will be able to infer a bit about your personality just from your genome: "these genes are correlated with liking dark humor, use them to promote our new movie".

hyfgfh today at 1:12 AM

Can't wait for people to use it for CRISPR and have it hallucinate some weird mutation.