Domain annotation
Each interacting pathogen protein, the corresponding host protein, and the host first interactor were examined for similarity in structural domains or local sequence motifs. Domain annotation was carried out using the NCBI Batch Conserved Domain Search (Batch CD-Search) (49). CD-Search is a sensitive method that relies on models of structurally conserved domain families; these models are built from multiple sequence alignments, which are converted into position-specific scoring matrices (PSSMs). Query protein sequences are scanned against these matrices with Reverse Position-Specific BLAST (RPS-BLAST), a variant of the PSI-BLAST algorithm. Because structure is more conserved than sequence, CD-Search can identify remote similarity between pathogen and host proteins. For each protein, the domain information was collected as a unique PSSM ID and domain short name for every distinct domain family.
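The collection step can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the tab-delimited layout, the column names (`query`, `pssm_id`, `short_name`), the protein identifiers, and the helper functions `collect_domains` and `shared_domains` are all hypothetical, chosen only to show how one unique (PSSM ID, short name) pair per domain family could be kept for each protein and then compared between a pathogen protein and a host protein.

```python
import csv
import io
from collections import defaultdict

# Hypothetical domain-hit table: one row per hit, tab-delimited.
# Column names and all IDs are placeholders, not real CD-Search output.
EXAMPLE_HITS = (
    "query\tpssm_id\tshort_name\n"
    "pathogen_P1\t1001\tDomA\n"
    "pathogen_P1\t1002\tDomB\n"
    "host_H1\t1001\tDomA\n"
    "host_H1\t1001\tDomA\n"          # repeated hit to the same family
    "interactor_I1\t1003\tDomC\n"
)


def collect_domains(tsv_text):
    """Map each protein ID to its set of unique (pssm_id, short_name) pairs."""
    domains = defaultdict(set)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        # A set keeps a single entry per domain family, even with repeated hits.
        domains[row["query"]].add((row["pssm_id"], row["short_name"]))
    return dict(domains)


def shared_domains(domains, protein_a, protein_b):
    """Return the domain families annotated on both proteins."""
    return domains.get(protein_a, set()) & domains.get(protein_b, set())


domains = collect_domains(EXAMPLE_HITS)
common = shared_domains(domains, "pathogen_P1", "host_H1")
print(sorted(common))
```

Storing the pairs in sets makes the uniqueness requirement implicit, and a set intersection then gives the domain families shared by any two proteins in the triplet.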