Today, we generate data faster than we can increase storage
capacity. The volume of digital data worldwide is projected to exceed 16
zettabytes sometime next year, the paper’s authors wrote, citing a forecast by
IDC Research.
A large share of the world’s data sits in archival storage, where the densest medium today is tape, with a maximum density of about 10 GB per cubic millimeter. One research project has demonstrated an optical disk technology that is 10 times denser than tape.
But there’s another approach that promises a storage density of 1 exabyte per cubic millimeter, eight orders of magnitude higher than tape. That approach is to encode data the same way nature encodes the instructions for building every living thing on Earth: in DNA.
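The "eight orders of magnitude" figure can be checked with simple arithmetic. The sketch below assumes decimal units (1 GB = 10^9 bytes, 1 EB = 10^18 bytes) and treats both densities as theoretical maximums. The final figure ignores redundancy, addressing and packaging, so it is an illustration of scale rather than a practical estimate.

```python
import math

# Decimal storage units, in bytes (an assumption; the article does not say
# whether decimal or binary units are meant).
GB = 10**9
EB = 10**18
ZB = 10**21

tape_density = 10 * GB  # bytes per cubic millimeter, as quoted for tape
dna_density = 1 * EB    # bytes per cubic millimeter, as quoted for DNA

ratio = dna_density // tape_density
orders_of_magnitude = round(math.log10(ratio))

# Volume needed to hold the projected 16 zettabytes at the quoted DNA density.
volume_mm3 = (16 * ZB) // dna_density

print(ratio)                # 100000000
print(orders_of_magnitude)  # 8
print(volume_mm3)           # 16000 cubic millimeters, i.e. 16 cubic centimeters
```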
In addition to density, DNA storage addresses another big limitation of archival storage: longevity. Tape can hold data for 10 to 30 years before data integrity starts to degrade, and spinning disks are rated for three to five years. DNA’s observed half-life is more than 500 years in harsh environments, according to the paper.
The idea of storing data in synthetic DNA has been around for a long time, but large improvements in the cost and efficiency of synthesizing and sequencing genes in recent years have made it far more feasible. The state of the art went from a 23-character message in 1999 to a 739 kB message in 2013.
The paper describes an architecture for a DNA storage system that includes a DNA synthesizer, a storage container, and a DNA sequencer. The synthesizer encodes the data to be stored, the container holds pools of DNA that map to a volume, and the sequencer reads DNA sequences and converts them back to digital data.
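To make the write and read paths concrete, here is a minimal sketch of the digital side of that pipeline. It assumes the simplest possible mapping, two bits per nucleotide (00 to A, 01 to C, 10 to G, 11 to T). This mapping is an illustration only and is not taken from the paper. Practical schemes typically add constraints, such as avoiding long runs of the same base, that this sketch ignores.

```python
# Assumed 2-bit mapping: index 0 -> A, 1 -> C, 2 -> G, 3 -> T.
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Synthesizer side: turn bytes into a nucleotide string (4 bases per byte)."""
    out = []
    for byte in data:
        # Emit the byte two bits at a time, most significant pair first.
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def decode(strand: str) -> bytes:
    """Sequencer side: turn a nucleotide string back into bytes."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # BASES.index raises ValueError for anything other than A, C, G, T.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```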
The architecture uses redundancy to address errors introduced when DNA is synthesized and sequenced. Redundancy has been proposed before, but without regard to its impact on storage density. The new encoding scheme introduced in the paper offers “controllable redundancy,” which lets you specify a different balance of reliability and density for each type of data.
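The trade-off between reliability and density can be illustrated with a generic XOR parity scheme. This is a stand-in chosen for simplicity, not the paper's encoding. Here the hypothetical `group_size` parameter is the tunable knob: smaller groups tolerate more lost blocks overall but spend more of the medium on parity.

```python
from functools import reduce
from typing import List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def add_parity(blocks: List[bytes], group_size: int) -> List[List[bytes]]:
    """Split equal-length blocks into groups and append one XOR parity block to each."""
    groups = []
    for i in range(0, len(blocks), group_size):
        group = blocks[i:i + group_size]
        groups.append(group + [reduce(xor_bytes, group)])
    return groups


def recover(group: List[bytes], missing_index: int) -> bytes:
    """Rebuild one lost block in a group by XOR-ing all the surviving blocks."""
    survivors = [b for i, b in enumerate(group) if i != missing_index]
    return reduce(xor_bytes, survivors)


def overhead(group_size: int) -> float:
    """Extra blocks stored per data block for a given group size."""
    return 1 / group_size


blocks = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
groups = add_parity(blocks, group_size=2)

# Pretend the second block of the first group was unreadable, then rebuild it.
print(recover(groups[0], missing_index=1))  # b'BBBB'
print(overhead(2), overhead(8))             # 0.5 0.125
```

In this sketch, data that must not be lost could use `group_size=2` (50 percent overhead), while less critical data could use `group_size=8` (12.5 percent overhead) and store more per unit volume.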
The problem of random access is solved with the same technique molecular biologists use to isolate specific regions of a DNA sequence in research. Polymerase chain reaction (PCR) “amplifies” a piece of DNA through repeated cycles of heating and cooling. The DNA storage researchers use PCR to amplify only the desired data, which they say accelerates reads and allows specific data to be accessed without sequencing the entire DNA pool.
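The effect of PCR-based selection can be modeled in software as a lookup by prefix. The sketch below assumes each stored strand begins with a short primer sequence that acts as its address. The `PRIMERS` table, the object names and the primer sequences are hypothetical. Real PCR is a chemical process that uses primer pairs and amplifies matching strands rather than filtering a list, so this models only the outcome: the sequencer sees the requested data and not the whole pool.

```python
from typing import List

# Hypothetical address table: one primer sequence per stored object.
PRIMERS = {
    "photo.jpg": "ACGTAC",
    "notes.txt": "TGCATG",
}


def write(pool: List[str], key: str, payloads: List[str]) -> None:
    """Tag each payload strand with the object's primer and add it to the pool."""
    primer = PRIMERS[key]
    pool.extend(primer + payload for payload in payloads)


def pcr_select(pool: List[str], primer: str) -> List[str]:
    """Model of selective amplification: keep only strands that carry the primer."""
    return [s[len(primer):] for s in pool if s.startswith(primer)]


pool: List[str] = []
write(pool, "photo.jpg", ["CAGA", "CGGC"])
write(pool, "notes.txt", ["TTTT"])

# Only the strands for the requested object are passed on for sequencing.
print(pcr_select(pool, PRIMERS["photo.jpg"]))  # ['CAGA', 'CGGC']
print(pcr_select(pool, PRIMERS["notes.txt"]))  # ['TTTT']
```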
While DNA storage is not yet practical, the rate of progress in DNA sequencing and synthesis in the biotech industry and the “impending limit of silicon technology” make it something computer architects should seriously consider today, the researchers conclude. They envision hybrid silicon and biochemical archival storage systems as the ultimate cold storage of the future.