Figure 3: Individual mosquito ITS2 amplicon lengths from the
three development steps (A) and definitive size intervals
defined for eight Anopheles species (B). Each dot
represents a mosquito whose fifth hind tarsus color was either
recorded at time of capture (white or black dots, sometimes erroneously)
or missing (grey dots). Species were determined by sequencing during the
development steps, which allowed us to link Anopheles species to ITS2
size.
When our method cannot determine the species, samples still
need to be sequenced, but the remaining species (An. aquasalis, An. ininii and An. oswaldoi) are found only sporadically.
Anopheles braziliensis, An. darlingi, An.
nuneztovari and An. triannulatus represented 90 % of the
samples (Figure 4A), with An. darlingi alone accounting for 44 % after all three development steps and 68 % in total when routine
samples are included. In comparison, we collected only 1, 2, 3, 4 and 6 samples of An. peryassui, An. ininii, An. oswaldoi, An.
aquasalis and An. medialis, respectively.
Our method correctly identified more than 80 % of samples
during the development phases (steps 2 and 3), and over 99 % in
routine use (Figure 4B). During steps 2 and 3 of development,
1.2 and 2.7 % of samples, respectively, were assigned to the wrong species by the
method; these errors were detected after validation by sequencing. The
misidentifications were due to wrong fifth hind tarsus color data (2/163 Anopheles samples) at step 2 and to the need to adjust the An. aquasalis interval (2/73) at step 3. At these steps, we were
not able to determine the species for 12 and 16 % of samples,
respectively, because of three factors: missing fifth hind tarsus color
information; wrong fifth hind tarsus color information that led to
contradictory results; and interval overlaps. In routine use, the remaining
uncertainties were very low (0.54 %; 2/372) and were due to rare missing fifth
hind tarsus color data and to interval overlaps.
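The three sources of undetermined samples listed above can be illustrated with a minimal sketch of an interval-based assignment rule. This is an assumption about how such a rule could be written, not the procedure actually implemented in this study: the function name `assign_species`, the placeholder species labels, and all interval bounds and colors below are hypothetical and are not the values shown in Figure 3B.

```python
from typing import Optional

# Hypothetical ITS2 size intervals (base pairs) and fifth hind tarsus colors.
# Placeholder values for illustration only, NOT the intervals of Figure 3B.
INTERVALS = {
    "species_A": (400, 420, "white"),
    "species_B": (415, 430, "black"),
    "species_C": (450, 470, "white"),
}


def assign_species(size_bp: float, tarsus_color: Optional[str]) -> Optional[str]:
    """Return a species label, or None when the sample must be sequenced."""
    # Keep every species whose size interval contains the amplicon length.
    by_size = [
        name for name, (low, high, _) in INTERVALS.items()
        if low <= size_bp <= high
    ]
    if not by_size:
        return None  # size falls outside every defined interval

    if tarsus_color is None:
        # Missing color: only unambiguous sizes can be assigned.
        return by_size[0] if len(by_size) == 1 else None

    # Use the recorded color to resolve overlapping intervals.
    by_color = [name for name in by_size if INTERVALS[name][2] == tarsus_color]
    if len(by_color) == 1:
        return by_color[0]
    # Either the color contradicts the size (no match) or the overlap
    # remains unresolved (several matches): sequencing is required.
    return None
```

With these placeholder intervals, a 417 bp amplicon falls in an overlap and is resolved only when the tarsus color is known, whereas a 460 bp amplicon recorded as black yields a contradictory result and is left undetermined.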
We used sequencing as the classical means of species identification: at step
1 to establish the link between species and ITS2 size, and at steps 2
and 3 to verify and optimize our method. Sequencing also turned out to be a
source of errors, with 3.0 and 3.7 % of misidentifications at steps 1
and 2, respectively (Figure 4C). In addition, 7.8, 9.2 and 1.4
% of samples yielded different species identifications when sequenced from
the forward and reverse primers at steps 1, 2 and 3, respectively. This may
be due to cross-contamination occurring during processing, shipping
and/or sequencing.