OLD:Evaluating eUniRep and other protein feature representations for in silico directed evolution
  • Ivan Jayapurna
  • Andrew Favor
Ivan Jayapurna

Corresponding Author:[email protected]

Andrew Favor
University of California, Berkeley

Abstract

This study analyzes and extends the work of Biswas et al. \cite{Biswas2020} on low-N protein engineering with data-efficient deep learning. We provide a complete, open-source, end-to-end re-implementation of the in silico protein engineering pipeline with improved computational efficiency, more detailed documentation, a cleaner API, and additional features, lowering the barrier to entry for using this pipeline as an engineering tool. We also evaluate more thoroughly the success and necessity of each step in the pipeline for in silico directed evolution, by re-implementing select portions of the TEM-1 β-lactamase study and by applying the full in silico pipeline to two novel protein engineering tasks: increasing the melting temperature of the plastic-degrading enzyme IsPETase and improving the thermostability of the bacteriophage MS2 viral capsid coat protein. By comparing the performance of various UniRep-based feature representations, we show that linear kernels can be equivalent to additive fitness landscapes and can outperform more complex models on small or simple mutation prediction tasks. This has been assumed in many previous works but never explicitly shown. We believe this result helps to elucidate the main strength of the eUniRep representation: its ability to overcome epistatic effects when proposing extensively mutated candidate sequences with optimized functionality.
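To make the link between linear models and additive fitness landscapes concrete, the following is a minimal sketch and not the pipeline described above. It assumes a plain one-hot sequence encoding in place of UniRep features, a made-up wild-type sequence, and random weights; the helper names `one_hot`, `linear_fitness`, and `mutate` are hypothetical. It illustrates that a model linear in per-position features predicts a double-mutant effect equal to the sum of the two single-mutant effects, i.e. zero epistasis.

```python
import numpy as np

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
AA_INDEX = {aa: i for i, aa in enumerate(ALPHABET)}


def one_hot(seq):
    """Flattened one-hot encoding: one block of 20 entries per position."""
    x = np.zeros((len(seq), len(ALPHABET)))
    x[np.arange(len(seq)), [AA_INDEX[aa] for aa in seq]] = 1.0
    return x.ravel()


def linear_fitness(seq, weights, bias=0.0):
    """Fitness predicted by a linear model on one-hot features."""
    return float(one_hot(seq) @ weights + bias)


def mutate(seq, pos, aa):
    """Return a copy of seq with residue `aa` at 0-based position `pos`."""
    return seq[:pos] + aa + seq[pos + 1:]


# Illustrative assumptions: a made-up wild type and random model weights.
rng = np.random.default_rng(0)
wild_type = "MKTAYIAK"
weights = rng.normal(size=len(wild_type) * len(ALPHABET))

single_a = mutate(wild_type, 1, "R")  # K -> R at position 1
single_b = mutate(wild_type, 5, "V")  # I -> V at position 5
double = mutate(single_a, 5, "V")     # both substitutions together

f_wt = linear_fitness(wild_type, weights)
effect_a = linear_fitness(single_a, weights) - f_wt
effect_b = linear_fitness(single_b, weights) - f_wt
effect_double = linear_fitness(double, weights) - f_wt

# Epistasis is the deviation of the double mutant from additivity.
epistasis = effect_double - (effect_a + effect_b)
print(f"epistasis under the linear model: {epistasis:.2e}")
```

Because each position contributes an independent term, the deviation vanishes up to floating-point error for any choice of weights. Capturing non-additive (epistatic) effects therefore requires either a nonlinear model or features that are themselves nonlinear in the sequence.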