Domain annotation
Each interacting pathogen protein, the corresponding host protein and
host first interactor were examined for structural domain or local
sequence motif similarity. Domain annotation was carried out using the
NCBI Batch Conserved Domain (CD) Search (49). CD-Search is a sensitive
method that relies on models of structurally conserved domain
families; each model is derived from a multiple sequence alignment and
converted into a position-specific scoring matrix (PSSM). Protein query
sequences are scanned against these matrices with Reverse
Position-Specific BLAST (RPS-BLAST), a variant of the PSI-BLAST
algorithm. Because structure is more conserved than sequence, CD-Search
can identify remote similarity between pathogen and
host proteins. For every protein, each distinct domain family hit was
recorded by its unique PSSM ID and domain short name.
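
The collection step described above can be sketched as follows. This is a minimal illustration, not the pipeline used in this study: it assumes the CD-Search hits have been exported as a tab-delimited table with columns named "Query", "PSSM-ID" and "Short name", and the function names (collect_domains, shared_domains) as well as the sample identifiers are hypothetical placeholders.

```python
import csv
import io
from collections import defaultdict


def collect_domains(hit_table_text):
    """Map each query protein to its set of (PSSM-ID, short name) pairs.

    Assumes a tab-delimited table with a header row containing the
    columns "Query", "PSSM-ID" and "Short name" (an assumption about the
    export layout). Blank lines and lines starting with '#' are skipped.
    """
    rows = [line for line in hit_table_text.splitlines()
            if line.strip() and not line.startswith("#")]
    reader = csv.DictReader(io.StringIO("\n".join(rows)), delimiter="\t")
    domains = defaultdict(set)
    for row in reader:
        # A set keeps one entry per domain family, even if it hits twice.
        domains[row["Query"]].add((row["PSSM-ID"], row["Short name"]))
    return dict(domains)


def shared_domains(domains, protein_a, protein_b):
    """Return the domain families annotated on both proteins."""
    return domains.get(protein_a, set()) & domains.get(protein_b, set())


# Toy hit table; the protein names, PSSM IDs and domain names are
# invented placeholders, not real CDD entries.
SAMPLE = (
    "# toy hit table\n"
    "Query\tPSSM-ID\tShort name\n"
    "pathogen_P1\t1001\tdomA\n"
    "pathogen_P1\t1001\tdomA\n"
    "pathogen_P1\t1002\tdomB\n"
    "host_H1\t1001\tdomA\n"
    "host_H1\t1003\tdomC\n"
)

domains = collect_domains(SAMPLE)
print(shared_domains(domains, "pathogen_P1", "host_H1"))
# prints {('1001', 'domA')}
```

Storing the hits as sets of (PSSM ID, short name) pairs reduces the comparison between a pathogen protein and a host protein to a set intersection, and repeated hits to the same domain family within one protein are counted only once.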