Computational Performance
Finally, we compared the average performance of the four tools in terms of speed, memory efficiency, disk space usage, gene recovery, and accuracy (Figure 3-A). In Test I, GeneMiner was faster than both Easy353 and HybPiper, while aTRAM lagged behind because of its reliance on BLAST tasks. GeneMiner also used memory efficiently, and users can control its memory footprint by selecting different subsets of reference sequences. Most of the memory GeneMiner consumes goes to constructing the reference hash table, so memory usage depends on the variety of genes in the reference set and on the similarity among sequences of the same gene from different species. When reference sequences of the same gene are highly similar, increasing the number of reference sequences does not substantially raise memory usage. In Test I, the Poaceae reference set comprised 224,049 sequences from 347 species, and the Brassicaceae reference set comprised 53,379 sequences from 349 species. Although the Poaceae set contained more reference sequences, its peak memory usage was lower, mainly because the 353 genes are less variable within Poaceae than within Brassicaceae. In practice, if the reference sequences come only from a closely related genus, GeneMiner requires approximately 0.33 GB of memory per sample. In terms of disk space, GeneMiner ranked second in Test I, using an average of 0.30 GB, slightly more than Easy353 (0.225 GB). In conclusion, GeneMiner’s speed, memory, and disk requirements make it well suited to small servers and even personal computers, which puts gene retrieval within reach of a wide range of users.
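The relationship between reference similarity and hash-table size can be illustrated with a small simulation. This is a minimal sketch and not GeneMiner's actual code: it assumes the reference hash table is keyed by k-mers, and the function names (`build_kmer_table`, `mutate`), the value k = 31, the sequence length, and the substitution rates are all hypothetical choices made for illustration. Under these assumptions, a large set of conserved sequences yields fewer distinct keys than a smaller set of divergent sequences.

```python
import random


def build_kmer_table(sequences, k=31):
    """Map each distinct k-mer to its occurrence count (k is an assumed value)."""
    table = {}
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            table[kmer] = table.get(kmer, 0) + 1
    return table


def mutate(seq, rate, rng):
    """Redraw each base at random with probability `rate` (may redraw the same base)."""
    return "".join(rng.choice("ACGT") if rng.random() < rate else base for base in seq)


rng = random.Random(0)  # fixed seed so the simulation is reproducible
ancestor = "".join(rng.choice("ACGT") for _ in range(1000))

# Hypothetical reference sets: many conserved copies versus fewer divergent copies.
conserved = [mutate(ancestor, 0.001, rng) for _ in range(200)]
divergent = [mutate(ancestor, 0.10, rng) for _ in range(50)]

conserved_table = build_kmer_table(conserved)
divergent_table = build_kmer_table(divergent)

# The number of distinct keys, not the number of input sequences, drives table size.
print("conserved:", len(conserved), "sequences,", len(conserved_table), "distinct k-mers")
print("divergent:", len(divergent), "sequences,", len(divergent_table), "distinct k-mers")
```

In this toy setting the table grows with the number of distinct k-mers rather than with the number of input sequences, which is consistent with the observation that the larger but less variable Poaceae reference set had the lower peak memory usage.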