Developments in DNA data storage show serious potential



In the last decade, academic and corporate researchers have turned to deoxyribonucleic acid as a possible means of storing data. DNA data storage offers much higher density and durability than any of today's storage media – tape drives, flash drives, or optical drives. DNA has also existed for billions of years and therefore should not become obsolete in the near future.

It's not surprising that scientists are paying so much attention to DNA-based data storage. The world is producing more data than ever, and the numbers will only grow. According to the IDC report "Data Age 2025: The Evolution of Data to Life-Critical", the world will produce 163 zettabytes of data each year by 2025. Storing that much data would take more than 13 billion modern 12 TB hard disks. Even if that were financially feasible, the disks would need a great deal of space and energy while having a relatively short life span. DNA can potentially solve many of these problems.
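
A quick back-of-the-envelope calculation makes the scale concrete. This is only a sketch, assuming decimal units (1 ZB = 10^21 bytes, 1 TB = 10^12 bytes) and ignoring redundancy, formatting overhead and drive replacement; the variable names are illustrative.

```python
# Assumption: decimal (SI) units, no redundancy or formatting overhead.
ZETTABYTE = 10**21
TERABYTE = 10**12

annual_data_bytes = 163 * ZETTABYTE    # IDC forecast for 2025
drive_capacity_bytes = 12 * TERABYTE   # one 12 TB hard disk

drives_needed = annual_data_bytes / drive_capacity_bytes
print(f"{drives_needed / 1e9:.1f} billion drives")
```

Under these assumptions the result comes to roughly 13.6 billion drives for a single year of data.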

This is not to say that DNA-based storage does not have its own set of challenges; it's expensive, slow and prone to errors. Nevertheless, researchers have made steady progress in meeting these challenges, with some notable recent successes.

How DNA storage works

DNA is a self-replicating material that occurs naturally in biological cells. DNA encodes information about a cell's characteristics and functions and provides the genetic instructions needed to build the cell's host organism.

DNA contains four molecular structures called nucleotides (adenine, cytosine, guanine and thymine), which are linked together in base pairs, with two different nucleotides per pair. Together, the base pairs form a linear strand, or oligonucleotide, with each base pair forming a rung on the oligonucleotide's ladder. This gives rise to the double helix familiar from scientific journals and company logos.

DNA data storage uses nucleotides to represent the binary ones and zeros that form the basis of today's digital data. Writing data to DNA is basically a two-step process:

  1. Translation software converts a file's binary data into nucleotide sequences, with nucleotides mapped to the bit patterns.
  2. A synthesizer creates strands of DNA based on the nucleotide sequences. A synthesizer is a scientific instrument that uses synthetic bioengineering technologies to create artificial DNA molecules, a process known as synthesis.

Retrieving data encoded in synthetic DNA is also a two-step process:

  1. A sequencer reads the order of the nucleotides within the oligonucleotides and returns their genetic code, a process called sequencing. Like a synthesizer, a sequencer is a scientific instrument, in this case one that automates sequencing operations.
  2. A translation program converts the results returned by the sequencer back into binary, using the same bit patterns originally used to encode the data.
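
The two translation steps can be sketched in a few lines of Python. The article does not say which bit patterns map to which nucleotides, so the two-bits-per-nucleotide table below is an assumption made purely for illustration, as are the function names. Real encoding schemes typically add error correction and avoid sequences that are hard to synthesize, none of which is modeled here.

```python
# Hypothetical mapping: the article does not specify one.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Translate bytes into a nucleotide sequence (the step before synthesis)."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(sequence: str) -> bytes:
    """Translate a sequenced nucleotide string back into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in sequence)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

In this sketch each byte becomes four nucleotides. The synthesizer and sequencer sit between `encode` and `decode`: one turns the string into physical DNA, the other turns it back into a string.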

DNA synthesis and sequencing have become standard practice in today's biotechnology industries. As a result, most of the technologies needed to store data in DNA already exist.

The Promise of DNA

Researchers are turning to DNA storage because it potentially offers many advantages over current storage media. One of the main advantages is its density, which is several orders of magnitude greater than that of any current storage medium. One gram of DNA can hold millions of gigabytes of data.

DNA is also extremely durable. According to some estimates, if DNA is kept cool and dry and is not exposed to light or radiation, it could last for thousands of years and never become obsolete. Moreover, given DNA's central role in cell development, scientists will no doubt continue to study it and look for better ways to synthesize and sequence it, regardless of its use for data storage, so it is unlikely to suffer the same fate as the obsolete floppy disk.

DNA also has the potential to cost less than current media, in part because synthesis and sequencing technologies will continue to gain efficiencies and fall in cost as researchers delve deeper into the inner workings of DNA.

Despite this cost-saving potential, one of the biggest obstacles to widespread adoption of DNA-based data storage today is the high cost of synthesizing and sequencing DNA. Storing a few hundred megabytes of data in this way can easily cost thousands of dollars.

Writing data to DNA is also an extremely slow process, because all those bit patterns have to be converted into nucleotides. In addition, random access has been difficult to achieve with DNA data storage, so DNA has had to be sequenced in large blocks, which slows down reads. Finally, the synthesis and sequencing processes themselves can be subject to errors at the molecular level, which may result in data loss or corruption.

Making DNA work

Despite the difficulties of DNA-based storage, the technology is promising enough for scientists to keep searching for practical answers. For example, researchers at Catalog Technologies have found a way to make DNA storage more economical for long-term archiving by decoupling the synthesis and sequencing processes. Rather than mapping individual bits to nucleotide base pairs, they synthesize large quantities of a relatively small number of DNA molecules that serve as building blocks for encoding data.

Researchers at the University of Padova in Italy are also looking for ways to improve DNA data storage for archival purposes, using bacterial nanonetworks and individual plasmids, the structures in bacterial cells that carry genetic information. Bacteria can be used to reliably access specific data from different storage locations, using a technology known as the Molecular Positioning System, which allows bacteria to detect chemical signals and move to a specific location.

Researchers at the University of Illinois at Urbana-Champaign are working on a way to achieve error-free random access for DNA-based data storage. Their approach relies on selectively amplifying specific data to speed up reads without having to sequence the entire DNA pool. To implement this method, they add two unique sequences (primers) to each oligonucleotide, one at each end, and use a simple key-value architecture to identify the primers.
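
The key-value idea can be illustrated with a small software model. This is only a sketch: the primer sequences, keys and function names are hypothetical, and the filter stands in for laboratory amplification, which physically copies the matching strands rather than scanning a list.

```python
# Hypothetical keys and primer pairs, for illustration only.
PRIMERS = {
    "photo.jpg": ("ACGTAC", "TTGCAA"),
    "notes.txt": ("GGATCC", "CATGAG"),
}


def tag(key: str, payload: str) -> str:
    """Attach the key's primers to both ends of a payload oligonucleotide."""
    forward, reverse = PRIMERS[key]
    return forward + payload + reverse


def select(pool: list[str], key: str) -> list[str]:
    """Return the payloads of only those oligonucleotides tagged with the key."""
    forward, reverse = PRIMERS[key]
    return [
        oligo[len(forward):-len(reverse)]
        for oligo in pool
        if oligo.startswith(forward) and oligo.endswith(reverse)
    ]


pool = [
    tag("photo.jpg", "AAAA"),
    tag("notes.txt", "CCCC"),
    tag("photo.jpg", "GGGG"),
]
print(select(pool, "photo.jpg"))  # ['AAAA', 'GGGG']
```

Only the strands whose ends match the requested key are read, which is the point of the technique: the rest of the pool never has to be sequenced.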

Microsoft and the University of Washington have also worked together on a similar technique for achieving error-free random access. Researchers from these organizations recently demonstrated the ability to recover specific files from more than 400 MB of data. Microsoft plans to introduce a prototype commercial DNA data storage system by 2020.

Many other organizations, such as the Defense Advanced Research Projects Agency, are also taking a serious interest in DNA data storage. At the same time, synthesis and sequencing processes are improving steadily and prices are falling. Given the huge amounts of data analysts expect, any hope of storing it all lies in technologies far more advanced than current media. DNA certainly has the capacity to meet this need, if its practical application can be fully realized.
