Evangelista, G., Minervini, G., Luisi, P.L., Polticelli, F. (2007). RANDOMBLAST A TOOL TO GENERATE RANDOM “NEVER BORN PROTEIN” SEQUENCES. BIO-ALGORITHMS AND MED-SYSTEMS, 3, 27-31.
RANDOMBLAST A TOOL TO GENERATE RANDOM “NEVER BORN PROTEIN” SEQUENCES
MINERVINI, GIOVANNI;LUISI, PIER LUIGI;POLTICELLI, Fabio
2007-01-01
Abstract
In an accompanying paper by Minervini et al., we address the problem of studying sequence-to-structure relationships in “never born proteins” (NBPs), i.e. protein sequences that have never been observed in nature. Studying the structural and functional properties of NBPs requires a large library of protein sequences that show no significant similarity to any known protein sequence. In this paper we describe the implementation of a simple command-line utility that generates random amino acid sequences and filters them against the NCBI non-redundant protein database, using as a threshold the E-value returned by the well-known sequence comparison software BLAST. This utility, named RandomBlast, is written in the C programming language for Windows operating systems. The structural implications of the random amino acid composition of NBPs are discussed in comparison with natural proteins of comparable length.
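The generate-and-filter idea summarized in the abstract can be illustrated with a minimal C sketch. This is not the RandomBlast source. It assumes residues are drawn uniformly from the 20 standard amino acids, that a candidate is kept when its best BLAST E-value is above a chosen threshold, and that the search itself sits behind a hypothetical callback (`best_evalue_fn`). All function names, the FASTA labels, and the stub search are illustrative assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One-letter codes of the 20 standard amino acids. */
static const char AMINO_ACIDS[] = "ACDEFGHIKLMNPQRSTVWY";
#define N_AMINO_ACIDS 20

/* Fill buf with len random residues; buf must hold len + 1 chars.
 * Assumption: uniform composition. rand() % 20 carries a slight
 * modulo bias, which is acceptable for a sketch. */
static void random_sequence(char *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++)
        buf[i] = AMINO_ACIDS[rand() % N_AMINO_ACIDS];
    buf[len] = '\0';
}

/* Hypothetical callback: returns the lowest E-value found for seq
 * in a database search. A real version would run BLAST against the
 * NCBI non-redundant database and parse its report. */
typedef double (*best_evalue_fn)(const char *seq);

/* Keep a sequence only if even its best hit is not significant,
 * i.e. the lowest E-value is above the threshold (assumed rule). */
static int is_never_born(const char *seq, best_evalue_fn best_evalue,
                         double threshold)
{
    return best_evalue(seq) > threshold;
}

/* Generate n_candidates sequences of length len and write those that
 * pass the filter to out in FASTA format. Returns the number kept. */
static size_t build_library(FILE *out, size_t n_candidates, size_t len,
                            best_evalue_fn best_evalue, double threshold)
{
    char *seq = malloc(len + 1);
    size_t kept = 0;

    if (seq == NULL)
        return 0;
    for (size_t i = 0; i < n_candidates; i++) {
        random_sequence(seq, len);
        if (is_never_born(seq, best_evalue, threshold)) {
            kept++;
            fprintf(out, ">nbp_%zu\n%s\n", kept, seq);
        }
    }
    free(seq);
    return kept;
}

/* Stand-in search for demonstration: pretends every query has only
 * a non-significant best hit with E-value 10. */
static double stub_no_hit(const char *seq)
{
    (void)seq;
    return 10.0;
}
```

A caller would seed the generator once with `srand`, then call, for example, `build_library(stdout, 5, 70, stub_no_hit, 1.0)`. The length of 70 residues and the threshold of 1.0 are placeholder values, and `stub_no_hit` would be replaced by a function that actually queries BLAST.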