Motivation: Poor protein solubility hinders the production of many therapeutic and industrially useful proteins. Experimental efforts to increase solubility are plagued by low success rates and often reduce biological activity. Computational prediction of protein expressibility and solubility in Escherichia coli using only sequence information could reduce the cost of experimental studies by enabling prioritisation of highly soluble proteins.

Results: A new tool for sequence-based prediction of soluble protein expression in Escherichia coli, SoluProt, was created using the gradient boosting machine technique with the TargetTrack database as a training set. When evaluated against a balanced independent test set derived from the NESG database, SoluProt’s accuracy of 58.4% and AUC of 0.60 exceeded those of a suite of alternative solubility prediction tools. There is also evidence that it could significantly increase the success rate of experimental protein studies. SoluProt is freely available as a standalone program and a user-friendly webserver at https://loschmidt.chemi.muni.cz/soluprot/.

Availability and Implementation: https://loschmidt.chemi.muni.cz/soluprot/

Contact: jiri@chemi.muni.cz

Supplementary Information: Supplementary data are available at Bioinformatics online.
SoluProt: Prediction of Soluble Protein Expression in Escherichia coli, ChemRxiv <Escherichia coli itself expresses more than 2,000 proteins>