Single Molecule Used to Store US Declaration of Independence

A team of researchers was able to store a copy of the United States Declaration of Independence in a single molecule to improve synthetic biology and data-storage technology.

In the ongoing quest for more efficient data storage technology, a team of scientists led by John Chaput, a pharmaceutical sciences professor at the University of California, Irvine (UCI) who also holds appointments in the university's chemistry and molecular biology & biochemistry departments, has made a breakthrough in the emerging field of semipermanent data storage using a synthesized variation of DNA strands.

The technique employed is described in the report titled "Redesigning the Genetic Polymers of Life," published in the journal Accounts of Chemical Research.


Data Storage and Synthetic Genetic Polymers

A press release from UCI notes that the global amount of data is about 44 zettabytes, or roughly 44 billion terabytes. Current storage technologies require a 15-million-square-foot data center to house one billion gigabytes (1 million terabytes) of information. To house all the world's data, imagine 44,000 of these huge warehouses covering a land area almost the same as the entire state of West Virginia.
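
The figures above can be checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope check, not part of the UCI release: it assumes decimal (SI) units, and the variable and function names are made up for illustration.

```python
# Back-of-the-envelope check of the storage figures quoted in the article.
# Assumption: decimal (SI) units, so 1 zettabyte = 10**9 terabytes.
TB_PER_ZB = 10**9
SQFT_PER_SQMI = 5280 ** 2  # 27,878,400 square feet in one square mile

world_data_tb = 44 * TB_PER_ZB      # 44 zettabytes of global data
center_capacity_tb = 10**6          # one data center: 1 billion GB = 1 million TB
center_area_sqft = 15 * 10**6       # 15 million square feet per data center


def centers_needed(total_tb: int, capacity_tb: int) -> int:
    """Number of data centers needed, rounding up to a whole building."""
    return -(-total_tb // capacity_tb)  # ceiling division on integers


def total_area_sq_miles(n_centers: int, area_sqft: int) -> float:
    """Combined footprint of all data centers, in square miles."""
    return n_centers * area_sqft / SQFT_PER_SQMI


n_centers = centers_needed(world_data_tb, center_capacity_tb)
area = total_area_sq_miles(n_centers, center_area_sqft)
print(f"{n_centers:,} data centers covering about {area:,.0f} square miles")
```

The result comes to 44,000 data centers and a little under 24,000 square miles, in line with the West Virginia comparison (the state's area is commonly given as roughly 24,000 square miles).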


The solution found by the UCI team lies in synthetic genetic polymers, a form of artificial DNA carefully replicated and evolved by scientists. Also called "xeno-nucleic acids" or XNA, these are described by the researchers as synthetic genetic polymers in which the natural sugar found in DNA and RNA strands is replaced with a different kind of "sugar moiety."

"Unnatural genetic polymers offer a nice paradigm for developing novel soft materials that are capable of low-energy, high-density information storage without the liabilities of DNA," Chaput said in the UCI release.

Genetic data encoding is a relatively new technology; effective recording and recovery of data from DNA is less than a decade old. However, growing interest has led to more breakthroughs over the last two years. Although cost and time efficiency have improved significantly, much remains to be solved before the technology is practically feasible.

Overcoming DNA Limitations and Encoding the Declaration of Independence

One of the restrictions on genetic data encoding is the inherent fragility of deoxyribonucleic acid: it degrades easily when exposed to factors ranging from naturally occurring enzymes and biological compounds to environmental conditions such as sunlight. This prompted Chaput and his team to turn to threose nucleic acid (TNA), which is hardier and more resistant to degradation.

In their report, the UCI team used the four-letter nucleotide alphabet of DNA instead of the binary system used in earlier genetic data encoding techniques and traditional computers. Each of the four components, adenine (A), thymine (T), cytosine (C), and guanine (G), was assigned a particular binary number, which the researchers used as an equivalent language. To retrieve the encoded data, they only need to apply a special enzyme that connects two sequences, converting the genetic sequence back to its original binary form.
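
To illustrate the idea of assigning binary numbers to the four letters, here is a minimal sketch in Python. The specific mapping (A=00, C=01, G=10, T=11) and the function names are assumptions chosen for illustration. The article does not state which assignment the UCI team used, and the sketch covers only the text-level translation, not the chemistry or the enzyme step.

```python
# Hypothetical 2-bits-per-letter mapping; the team's actual assignment
# is not given in the article.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Translate bytes into a string of nucleotide letters (4 letters per byte)."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(sequence: str) -> bytes:
    """Translate a nucleotide string back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in sequence)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


message = b"We hold these truths to be self-evident"
strand = encode(message)
print(strand[:16])              # first 16 letters, i.e. the first 4 bytes
print(decode(strand).decode())  # round-trips to the original text
```

Under this mapping each letter carries two bits, so a file of n bytes becomes a sequence of 4n letters, and decoding simply reverses the lookup.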

The UCI team led by Chaput tested this mechanism by storing electronic files, including the Declaration of Independence and an image of the UCI seal, in a TNA solution. They were also able to recover both files using the same method.