Figure 1 | Enzymatic nicking-based DNA storage. Tabatabaei et al. proposed the use of enzymatic nicking to write and read binary-encoded data. a. Predetermined nick sites on the DNA register are either left un-nicked (0) or nicked (1) by the Pf Ago enzyme, recording the input binary code. Multiple enzymes can bind in parallel, enabling rapid writing of the input data into the register. Each Pf Ago enzyme has a specific recognition sequence which ensures that the correct bond is nicked, greatly reducing writing errors. b. The binary information from the input file is first encoded into predetermined nicking sites, where the sites representing a ‘1’ are nicked. The native DNA is then extracted, and the resulting single-stranded DNA products of different lengths are sequenced. The sequence data are then analysed and mapped against the native DNA reference, allowing the positions of the nicks to be identified from the sizes of the fragments. These positions are then translated back into binary code, allowing the file to be recovered. (Figure created using BioRender®.)
The tag is easily removed upon hybridization of the DNA, making the readout process non-destructive5.
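The write and read steps described in Figure 1 can be illustrated with a minimal sketch. It is a toy model only, not the authors' pipeline: the site spacing, the register length and the function names (`write_nicks`, `fragments`, `read_bits`) are hypothetical, and each sequenced fragment is assumed to have already been mapped to a start coordinate on the reference.

```python
SITE_SPACING = 10  # hypothetical: nucleotides between adjacent nick sites


def write_nicks(bits):
    """Return the positions (in nt) of the sites that are nicked, i.e. the '1' bits."""
    return [(i + 1) * SITE_SPACING for i, b in enumerate(bits) if b == "1"]


def fragments(nick_positions, register_length):
    """Model the single-stranded products as (mapped start, length) pairs."""
    cuts = [0] + sorted(nick_positions) + [register_length]
    return [(a, b - a) for a, b in zip(cuts, cuts[1:])]


def read_bits(frags, n_sites, register_length):
    """Recover the bit string from mapped fragments, in any order."""
    # Every fragment end that is not the end of the register marks a nick.
    ends = {start + length for start, length in frags}
    ends.discard(register_length)
    return "".join(
        "1" if (i + 1) * SITE_SPACING in ends else "0" for i in range(n_sites)
    )


message = "01101"
register_length = (len(message) + 1) * SITE_SPACING  # assumed register size
frags = fragments(write_nicks(message), register_length)
recovered = read_bits(frags, len(message), register_length)
```

Because each fragment carries its own mapped position, the order in which fragments are sequenced does not matter in this model; only the fragment boundaries are needed to recover the bits.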
The computational process of enzymatic nicking can be further enhanced by the simultaneous use of multiple registers and sense-antisense recording. The order of the multiple registers is dictated in the genome, allowing for greater storage space in a retrievable format. Sense-antisense recording utilises parallel nicking, which further enhances the storage capabilities of this technique. It involves using Pf Ago to nick both DNA strands, converting the binary format to a ternary format and so enabling more information to be encoded in the same space (log2 3 ≈ 1.58 bits per site rather than 1). In other words, for each nicking site, no nick signifies 0, a nick on the sense strand represents 1, and a nick on the antisense strand represents 2. This technique enables the storage of ≈ 4 exabytes (EB) of data per gram, which falls well short of the 200 EB/g capability of synthesis-based DNA storage methods, but nonetheless far exceeds the capacity of conventional data storage.
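The gain from the ternary format can be made concrete with a short sketch. It assumes, purely for illustration, that a file is treated as one large integer and rewritten in base 3, with each ternary digit assigned to one nick site; this conversion scheme and the names `sites_needed`, `encode` and `decode` are hypothetical and not taken from the original work.

```python
# Nick state per site, as described in the text.
STATES = {0: "no nick", 1: "sense nick", 2: "antisense nick"}


def sites_needed(n_bytes):
    """Smallest number of ternary sites that can represent n_bytes of data."""
    k = 0
    while 3 ** k < 256 ** n_bytes:
        k += 1
    return k


def encode(data):
    """Convert bytes to a fixed-length list of ternary digits (one per site)."""
    value = int.from_bytes(data, "big")
    trits = []
    for _ in range(sites_needed(len(data))):
        value, remainder = divmod(value, 3)
        trits.append(remainder)
    return trits[::-1]  # most significant digit first


def decode(trits, n_bytes):
    """Convert ternary digits back to the original bytes."""
    value = 0
    for t in trits:
        value = value * 3 + t
    return value.to_bytes(n_bytes, "big")


payload = b"DNA"
sites = encode(payload)
labels = [STATES[t] for t in sites]
restored = decode(sites, len(payload))
```

Under this scheme two bytes need 11 ternary sites instead of 16 binary ones, which is the "more information in the same space" effect described above.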
The computational potential of this work is what sets it apart from previous DNA-based storage methods. Previously, modifying data stored in synthesised DNA involved sequencing the encoded data, altering it on a separate (traditional) computer, then writing it into a new DNA molecule. By contrast, enzymatic nicking utilises strand displacement (made possible by toeholds), making in-memory parallel computations possible without the need to synthesise new DNA.
The major problem for this technique lies in its cost. It may be cheaper than DNA-synthesis strategies, but it is still a long way from the cost-efficient scaling needed to rival its mechanical counterparts. Further optimizations in DNA technologies are likely to bring a sharp decrease in cost, much as the cost of sequencing the human genome has fallen. However, these future advances are not limited to DNA nicking. DNA-synthesis strategies will also improve4 (arguably at a faster rate than the DNA nicking approach), possibly reaching an error rate that rivals that of nicking and making one of its key arguments redundant. Future approaches may also combine the two methods: writing data in synthetic DNA, then utilising the flexibility of nicking to encode metadata which is easily modified through ligation/toeholds. The future looks promising for DNA storage, but only time will tell if it becomes another expensive novelty or the solution to the world’s big data crisis.