Figure 1 | Enzyme nicking-based DNA storage. Tabatabaei et al. proposed the use of enzymatic nicking to write
and read binary encoded data. a. Predetermined nick sites on
the DNA register are either left un-nicked (0) or nicked (1) by the Pf Ago enzyme, transcribing the input binary code. Multiple
enzymes can bind in parallel at the same time, enabling rapid
transcription of the input data into the register. Each Pf Ago
enzyme has a specific recognition sequence which ensures the correct
bond is nicked, greatly reducing writing errors. b. The binary
information from the input file is first encoded into predetermined
nicking sites, where the sites representing a ‘1’ are nicked. The native
DNA is then extracted, and the resulting single-stranded DNA products of
different lengths are sequenced. The sequence data is then analysed and
mapped against the native DNA reference, allowing the position of the
nicks to be identified via the size of the fragments. This is then
translated back into binary code, allowing for recovery of the file.
(figure created using Biorender®)
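The write and read steps in the caption can be illustrated with a small sketch. It is a toy model, not the authors' pipeline: the site coordinates, the register length and the function names `write_bits`, `fragments` and `read_bits` are all hypothetical, and sequencing plus alignment are reduced to knowing each fragment's start and end on the reference.

```python
from typing import List, Tuple

def write_bits(bits: str, sites: List[int]) -> List[int]:
    """Return the positions that get nicked: one site per bit, nicked only for '1'."""
    if len(bits) != len(sites):
        raise ValueError("need exactly one nick site per bit")
    return [pos for bit, pos in zip(bits, sites) if bit == "1"]

def fragments(nicks: List[int], length: int) -> List[Tuple[int, int]]:
    """Split the nicked strand [0, length) into (start, end) fragments at each nick."""
    cuts = [0] + sorted(nicks) + [length]
    return list(zip(cuts, cuts[1:]))

def read_bits(frags: List[Tuple[int, int]], sites: List[int], length: int) -> str:
    """Recover the bits from fragments that have been mapped to the reference."""
    # Every fragment boundary other than the two register ends marks a nick.
    boundaries = {start for start, _ in frags} | {end for _, end in frags}
    boundaries -= {0, length}
    return "".join("1" if pos in boundaries else "0" for pos in sites)

# Hypothetical register: 180 nt long with eight evenly spaced nick sites.
REGISTER_LENGTH = 180
sites = [20, 40, 60, 80, 100, 120, 140, 160]

data = "10110010"
nicks = write_bits(data, sites)
frags = fragments(nicks, REGISTER_LENGTH)
recovered = read_bits(frags, sites, REGISTER_LENGTH)
print(nicks, [end - start for start, end in frags], recovered)
```

In this model the fragment lengths alone are enough to locate the nicks once the fragments are ordered against the reference, which is the role the mapping step plays in panel b.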
The tag is easily removed upon hybridization of the DNA, making the
readout process non-destructive⁵.
The computational process of enzymatic nicking can be further enhanced
by the simultaneous use of multiple registers and sense-antisense
recording. The order of the multiple registers is dictated in the
genome, allowing for greater storage space in a retrievable format.
Sense-antisense recording utilises parallel nicking to further
enhance the storage capabilities of this technique. It involves using Pf Ago to nick both DNA strands, converting the binary format to a
ternary format and enabling more information to be encoded in the same
space. In other words, for each nicking site, no nick signifies 0, a
nick on the sense strand represents 1, and a nick on the antisense
strand represents 2. This technique enables the storage of ≈ 4 exabytes
of data per gram, which is about 50-fold short of the 200 exabytes/g
capability of synthesis-based DNA storage methods, but nonetheless blows
conventional data storage capacities out of the water.
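To make the ternary scheme concrete, the sketch below converts bytes into base-3 digits and maps each digit to one of the three site states described above. It is only an illustration under stated assumptions: the helper names `bytes_to_trits` and `trits_to_bytes` and the `STATES` labels are hypothetical, and the real encoding used by the authors may differ. Each ternary site carries log2(3) ≈ 1.585 bits, compared with 1 bit for a binary site.

```python
import math
from typing import List

# State of each nick site, as described in the text.
STATES = {0: "no nick", 1: "nick on sense strand", 2: "nick on antisense strand"}

def bytes_to_trits(data: bytes) -> List[int]:
    """Convert bytes to a list of base-3 digits, most significant first."""
    n = int.from_bytes(data, "big")
    trits = []
    while n:
        n, r = divmod(n, 3)
        trits.append(r)
    return trits[::-1] or [0]

def trits_to_bytes(trits: List[int], size: int) -> bytes:
    """Inverse of bytes_to_trits; `size` is the original length in bytes."""
    n = 0
    for t in trits:
        n = n * 3 + t
    return n.to_bytes(size, "big")

payload = b"Hi"
trits = bytes_to_trits(payload)
for site, t in enumerate(trits):
    print(f"site {site}: {STATES[t]}")

# Ternary sites needed versus the binary sites the same payload would need.
print(len(trits), "ternary sites vs", 8 * len(payload), "binary sites")
print("bits per ternary site:", round(math.log2(3), 3))
assert trits_to_bytes(trits, len(payload)) == payload
```

The original byte length is passed back in when decoding because leading zero bytes would otherwise be lost in the integer conversion.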
The computational potential of this work is what sets it aside from
previous DNA-based storage methods. Previously, modifying stored data in
synthesised DNA involved sequencing the data encoded, altering it on a
separate (traditional) computer, then writing it into a new DNA
molecule. Enzymatic nicking, by contrast, utilises strand displacement
(made possible by toeholds), making in-memory parallel computations
possible without the need to synthesise new DNA.
The major problem for this technique lies in its cost. It may be cheaper
than DNA-synthesis strategies, but it is still a long way from the
cost-efficient scaling needed to rival its mechanical counterparts.
Despite this, further optimisations in DNA technologies are likely to bring
a sharp decrease in cost, much as happened with sequencing the human genome.
However, these future advances are not limited to DNA
nicking. DNA-synthesis strategies will also improve⁴ (arguably at a faster rate than the DNA nicking approach), possibly
reaching an error rate to rival that of nicking, making one of its key
arguments redundant. Future approaches may also combine the two methods:
writing data in synthetic DNA, then utilising the flexibility of nicking
to encode metadata which is easily modified through ligation/toeholds.
The future looks promising for DNA storage, but only time will tell if
it becomes another expensive novelty or the solution to the world’s big
data crisis.