Identifying Uncharacterized Protein LOC100699110 - X1
The most highly regulated protein analyzed was an uncharacterized protein with the NCBI accession number XP_019220227.1. A search of its amino acid sequence with NCBI BLASTp identified fucolectin, pentraxin fusion-like protein, and tenascin as the closest annotated proteins (Supplementary fig. 3). Inspection of the amino acid sequence showed that XP_019220227.1 consists mainly of eleven repeats of 143 amino acids each, with 94.4% identity between repeats. A BLASTp search with the repeat sequence returned fucolectins from several teleost fish, with E-values lower than 2E-70 and identity greater than 75%. The matching fucolectins were much shorter, between 147 and 321 amino acids, compared with the 1605 amino acids of the uncharacterized protein.
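The repeat analysis described above can be illustrated with a minimal sketch. The text does not state how the 94.4% figure was calculated, so the code below assumes contiguous repeats of fixed length and an ungapped, mean pairwise identity. The function names (`split_repeats`, `percent_identity`, `mean_pairwise_identity`) and the toy sequence are hypothetical and are not taken from the study.

```python
from itertools import combinations


def split_repeats(seq, repeat_len, n_repeats, offset=0):
    """Cut n_repeats consecutive windows of repeat_len residues, starting at offset."""
    end = offset + repeat_len * n_repeats
    if end > len(seq):
        raise ValueError("sequence too short for the requested repeats")
    return [seq[i:i + repeat_len] for i in range(offset, end, repeat_len)]


def percent_identity(a, b):
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def mean_pairwise_identity(repeats):
    """Average percent identity over all unordered pairs of repeats."""
    pairs = list(combinations(repeats, 2))
    return sum(percent_identity(a, b) for a, b in pairs) / len(pairs)


# Toy sequence (NOT the real protein): three 10-residue repeats,
# the middle one carrying a single substitution.
toy = "ACDEFGHIKL" + "ACDEFGHIKV" + "ACDEFGHIKL"
repeats = split_repeats(toy, repeat_len=10, n_repeats=3)
print(round(mean_pairwise_identity(repeats), 1))

# Residues covered by the repeats, using the numbers reported in the text.
covered = 11 * 143
print(covered, round(100 * covered / 1605, 1))
```

With the reported values, eleven repeats of 143 residues account for 1573 of the 1605 residues, which is consistent with the statement that the protein consists mainly of repeats. For the real sequence, an alignment-based identity that allows gaps may be more appropriate than this fixed-window comparison.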