Introduction
Human deoxyribonucleic acid (DNA) has been widely used in human
individual identification (Ambers et al., 2018; Lygo et al., 1994; Meng
et al., 2019), paternity identification (Bertoglio et al., 2020; Habibi
et al., 2019) and other applications in forensics. However, human DNA is
not always available. In such cases, investigators have to resort to
environmental DNA from the crime scene to narrow the search for
criminal suspects and establish the truth.
Environmental materials such as soil, dust and water are very likely
to be carried away unintentionally by suspects on their skin, shoes,
clothes or hair, or even under their fingernails. Among them, soil,
which usually contains plant fragments or pollen grains, is the material
the police can recover in most criminal cases. Plant DNA is well suited
to forensic source tracking because of its ubiquity, stability and
appropriate level of variability.
Plant DNA has high potential to provide definitive evidence during
criminal investigations. With the advent of DNA metabarcoding, it has
recently been used to locate a body dumping site (Yang et al., 2015),
to infer the residence of an unidentified body (Liu et al., 2019), to
locate a drowning site (Fang et al., 2019), and to confirm a suspected
drowning (Kakizaki et al., 2018). Unfortunately, such applications are
still very rare because of three main challenges. The first is the
difficulty of identifying plant species from DNA in environmental
materials. Past projects (e.g., BARCODE 500K (https://ibol.org), BIOSCAN
(Hobern & Hebert, 2019), ISHAM-ITS (Irinyi et al., 2016)) have enriched
the pool of DNA barcodes, but the reference library for DNA barcoding is
still far from comprehensive. Fewer than 5.0% of flowering plant species
have matK or rbcL sequences deposited in GenBank (Liu et al., 2021).
The second challenge is that Sanger sequencing is not applicable to
environmental DNA because the amplicons are a mixture of many species.
Fortunately, next-generation sequencing (NGS) platforms meet the
requirements of environmental DNA metabarcoding, and a straightforward
data processing method is now available
(https://github.com/YanleiLiu1989/Cotu-master).
The last challenge is the lack of an “ideal” DNA barcode for DNA
metabarcoding (Ferri et al., 2015). A DNA barcode is a short DNA
sequence used for species recognition and discrimination. DNA barcoding
is widely used in biology, environmental science, forensics and other
fields (Ferri et al., 2015; Hebert et al., 2003). It is a powerful
molecular diagnostic method for specimen identification. Finding the best DNA
barcodes (Dong et al., 2014; Dong et al., 2015; Kress & Erickson, 2007;
Li et al., 2011) or developing new technical improvements (Yu et al.,
2011; Xu et al., 2015) was one of the main themes of plant DNA
barcoding during the past decade. Unfortunately, no single DNA barcode
is suitable for identifying all plant species, and plant group-specific
DNA barcodes seem more realistic. For example, rbcL is much less
variable than ycf1 in flowering plants, but is acceptable as a DNA
barcode for lower plants (Dong et al., 2015; Liu et al., 2020a).
Lower plants (algae), rather than higher plants (mosses, ferns and
seed plants), play a very important role in the investigation of
criminal cases involving wet environments, and rbcL has been proposed
as a DNA barcode for diatoms (Liu et al., 2020a). The variability of
rbcL is much higher in lower plants than in higher plants, and rbcL is
one of the few DNA barcode choices for lower plants because of the
relatively high species coverage of existing sequences and the
availability of universal PCR primers (Ferri et al., 2015).
In this paper, we demonstrate how mud collected from a criminal
suspect’s pants was used to identify the perpetrator in a murder case
that occurred in China, based on DNA metabarcoding of diatoms using
chloroplast rbcL gene fragments. The diatom communities in the mud
provided solid evidence of the suspect’s presence at the murder scene.