It turns out that a human genome, the complete set of genetic material encoded as DNA sequences, takes up 100 gigabytes.
That’s the amount of storage space the average human genome would occupy when the decoded and raw DNA data is moved onto the cloud. Google, through its product Google Genomics, is offering hospitals and universities the ability to store the genomes they have on file. The hope is to start a network that helps researchers around the world compare genetic data and multiplies the rate at which discoveries are made.
“We saw biologists moving from studying one genome at a time to studying millions,” David Glazer, Google Genomics’ software engineer, told MIT Technology Review. “The opportunity is how to apply breakthroughs in data technology to help with this transition.”
The cost of storing the complete raw data of a genome is set at $25 per year, per genome. However, after being cleaned up, the genome data can be pared down to under one gigabyte and stored for only 25 cents per year. Further computations on the genome data would cost extra.
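To put those prices in perspective, here is a rough back-of-the-envelope sketch of the arithmetic. The per-gigabyte rate is inferred from the figures above rather than quoted by Google, the function names are made up for illustration, and the one-million-genome count is a hypothetical picked to match the scale Glazer describes.

```python
# Figures taken from the article.
RAW_GENOME_GB = 100            # size of one raw genome
RAW_COST_PER_YEAR = 25.00      # dollars per raw genome per year
CLEANED_COST_PER_YEAR = 0.25   # dollars per cleaned-up genome (under 1 GB) per year


def implied_rate_per_gb(cost_per_year, size_gb):
    """Dollars per gigabyte per year implied by a price and a size (an inference, not a quoted rate)."""
    return cost_per_year / size_gb


def annual_storage_cost(num_genomes, cost_per_genome):
    """Total yearly storage bill, excluding any extra compute charges."""
    return num_genomes * cost_per_genome


if __name__ == "__main__":
    genomes = 1_000_000  # hypothetical collection size, for illustration only
    print("Implied rate per GB per year:", implied_rate_per_gb(RAW_COST_PER_YEAR, RAW_GENOME_GB))
    print("Raw data, per year:", annual_storage_cost(genomes, RAW_COST_PER_YEAR))
    print("Cleaned data, per year:", annual_storage_cost(genomes, CLEANED_COST_PER_YEAR))
```

Under these assumptions, both price points work out to roughly the same rate per gigabyte, so the savings come from shrinking the data rather than from a cheaper tier.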
It is unknown how many genomes are currently stored, but the project is already off to a healthy start. A collaboration with the Institute for Systems Biology, funded through a $6.5 million grant from the National Cancer Institute, will see the Cancer Genome Atlas uploaded to the Google Genomics platform, making “data related to the molecular basis for cancer” available to anybody around the world.