
Streptococcus pneumoniae polynucleotides and sequences


Patent no: 6420135

Application no: 08961527

Filed date: 1997-10-30

Issue date: 2002-07-16



Abstract:


The present invention provides polynucleotide sequences of the genome of Streptococcus pneumoniae, polypeptide sequences encoded by the polynucleotide sequences, corresponding polynucleotides and polypeptides, vectors and hosts comprising the polynucleotides, and assays and other uses thereof. The present invention further provides polynucleotide and polypeptide sequence information stored on computer readable media, and computer-based systems and methods which facilitate its use.

Claims:


What is claimed is:

1. An isolated polynucleotide fragment comprising the nucleic acid sequence of an ORF selected from the group consisting of: (a) ORF ID NO:9 of Contig ID NO:5, represented by nucleotides 12592-13197 of SEQ ID NO:5; (b) ORF ID NO:1 of Contig ID NO:46, represented by nucleotides 2-1267 of SEQ ID NO:46; (c) ORF ID NO:5 of Contig ID NO:58, represented by nucleotides 6565-7356 of SEQ ID NO:58; (d) ORF ID NO:3 of Contig ID NO:78, represented by nucleotides 1108-3636 of SEQ ID NO:78; (e) ORF ID NO:3 of Contig ID NO:94, represented by nucleotides 951-2741 of SEQ ID NO:94; (f) ORF ID NO:4 of Contig ID NO:94, represented by nucleotides 3006-5444 of SEQ ID NO:94; (g) ORF ID NO:3 of Contig ID NO:32, represented by nucleotides 1885-1076 of SEQ ID NO:32; (h) ORF ID NO:4 of Contig ID NO:92, represented by nucleotides 1753-3276 of SEQ ID NO:92; (i) ORF ID NO:2 of Contig ID NO:89, represented by nucleotides 1007-1627 of SEQ ID NO:89; and (j) ORF ID NO:1 of Contig ID NO:287, represented by nucleotides 2-871 of SEQ ID NO:287.

2. The isolated polynucleotide of claim 1, wherein said polynucleotide comprises a heterologous polynucleotide sequence.

3. The isolated polynucleotide of claim 2, wherein said heterologous polynucleotide sequence encodes a heterologous polypeptide.

4. A method for making a recombinant vector comprising inserting the isolated polynucleotide of claim 1 into a vector.

5. A nucleic acid sequence complementary to the polynucleotide of claim 1.

6. A recombinant vector comprising the isolated polynucleotide of claim 1.

7. The recombinant vector of claim 6, wherein said polynucleotide is operably associated with a heterologous regulatory sequence that controls gene expression.

8. A recombinant host cell comprising the isolated polynucleotide of claim 1.

9. The recombinant host cell of claim 8, wherein said polynucleotide is operably associated with a heterologous regulatory sequence that controls gene expression.

10. An isolated polynucleotide fragment comprising a nucleic acid sequence which hybridizes under hybridization conditions, comprising hybridization in 5×SSC and 50% formamide at 50-65° C. and washing in a wash buffer consisting of 0.5×SSC at 65° C., to the complementary strand of an ORF selected from the group consisting of: (a) ORF ID NO:9 of Contig ID NO:5, represented by nucleotides 12592-13197 of SEQ ID NO:5; (b) ORF ID NO:1 of Contig ID NO:46, represented by nucleotides 2-1267 of SEQ ID NO:46; (c) ORF ID NO:5 of Contig ID NO:58, represented by nucleotides 6565-7356 of SEQ ID NO:58; (d) ORF ID NO:3 of Contig ID NO:78, represented by nucleotides 1108-3636 of SEQ ID NO:78; (e) ORF ID NO:4 of Contig ID NO:94, represented by nucleotides 3006-5444 of SEQ ID NO:94; (f) ORF ID NO:3 of Contig ID NO:32, represented by nucleotides 1885-3876 of SEQ ID NO:32; (g) ORF ID NO:4 of Contig ID NO:92, represented by nucleotides 1753-3276 of SEQ ID NO:92; (h) ORF ID NO:2 of Contig ID NO:89, represented by nucleotides 1007-1627 of SEQ ID NO:89; and (i) ORF ID NO:1 of Contig ID NO:287, represented by nucleotides 2-871 of SEQ ID NO:287.

11. An isolated polynucleotide complementary to the polynucleotide of claim 10.

12. An isolated polynucleotide comprising at least 50 contiguous nucleotides of an ORF selected from the group consisting of: (a) ORF ID NO:9 of Contig ID NO:5, represented by nucleotides 12592-13197 of SEQ ID NO:5; (b) ORF ID NO:1 of Contig ID NO:46, represented by nucleotides 2-1267 of SEQ ID NO:46; (c) ORF ID NO:5 of Contig ID NO:58, represented by nucleotides 6565-7356 of SEQ ID NO:58; (d) ORF ID NO:3 of Contig ID NO:78, represented by nucleotides 1108-3636 of SEQ ID NO:78; (e) ORF ID NO:3 of Contig ID NO:94, represented by nucleotides 951-2741 of SEQ ID NO:94; (f) ORF ID NO:4 of Contig ID NO:94, represented by nucleotides 3006-5444 of SEQ ID NO:94; (g) ORF ID NO:3 of Contig ID NO:32, represented by nucleotides 1885-3876 of SEQ ID NO:32; (h) ORF ID NO:4 of Contig ID NO:92, represented by nucleotides 1753-3276 of SEQ ID NO:92; (i) ORF ID NO:2 of Contig ID NO:89, represented by nucleotides 1007-1627 of SEQ ID NO:89; and (j) ORF ID NO:1 of Contig ID NO:287, represented by nucleotides 2-871 of SEQ ID NO:287.

13. An isolated polynucleotide complementary to the polynucleotide of claim 12.

14. An isolated polynucleotide comprising at least 100 contiguous nucleotides of an ORF selected from the group consisting of: (a) ORF ID NO:9 of Contig ID NO:5, represented by nucleotides 12592-13197 of SEQ ID NO:5; (b) ORF ID NO:1 of Contig ID NO:46, represented by nucleotides 2-1267 of SEQ ID NO:46; (c) ORF ID NO:5 of Contig ID NO:58, represented by nucleotides 6565-7356 of SEQ ID NO:58; (d) ORF ID NO:3 of Contig ID NO:78, represented by nucleotides 1108-3636 of SEQ ID NO:78; (e) ORF ID NO:3 of Contig ID NO:94, represented by nucleotides 951-2741 of SEQ ID NO:94; (f) ORF ID NO:4 of Contig ID NO:94, represented by nucleotides 3006-5444 of SEQ ID NO:94; (g) ORF ID NO:3 of Contig ID NO:32, represented by nucleotides 1885-3876 of SEQ ID NO:32; (h) ORF ID NO:4 of Contig ID NO:92, represented by nucleotides 1753-3276 of SEQ ID NO:92; (i) ORF ID NO:2 of Contig ID NO:89, represented by nucleotides 1007-1627 of SEQ ID NO:89; and (j) ORF ID NO:1 of Contig ID NO:287, represented by nucleotides 2-871 of SEQ ID NO:287.

15. An isolated polynucleotide complementary to the polynucleotide of claim 14.

16. The isolated polynucleotide of claim 1, wherein the selected ORF is (a).

17. The isolated polynucleotide of claim 1, wherein the selected ORF is (b).

18. The isolated polynucleotide of claim 1, wherein the selected ORF is (c).

19. The isolated polynucleotide of claim 1, wherein the selected ORF is (d).

20. The isolated polynucleotide of claim 1, wherein the selected ORF is (e).

21. The isolated polynucleotide of claim 1, wherein the selected ORF is (f).

22. The isolated polynucleotide of claim 1, wherein the selected ORF is (g).

23. The isolated polynucleotide of claim 1, wherein the selected ORF is (h).

24. The isolated polynucleotide of claim 1, wherein the selected ORF is (i).

25. The isolated polynucleotide of claim 1, wherein the selected ORF is (j).

26. The isolated polynucleotide of claim 10, wherein the selected ORF is (a).

27. The isolated polynucleotide of claim 10, wherein the selected ORF is (b).

28. The isolated polynucleotide of claim 10, wherein the selected ORF is (c).

29. The isolated polynucleotide of claim 10, wherein the selected ORF is (d).

30. The isolated polynucleotide of claim 10, wherein the selected ORF is (e).

31. The isolated polynucleotide of claim 10, wherein the selected ORF is (f).

32. The isolated polynucleotide of claim 10, wherein the selected ORF is (g).

33. The isolated polynucleotide of claim 10, wherein the selected ORF is (h).

34. The isolated polynucleotide of claim 10, wherein the selected ORF is (i).

35. The isolated polynucleotide of claim 12, wherein the selected ORF is (a).

36. The isolated polynucleotide of claim 12, wherein the selected ORF is (b).

37. The isolated polynucleotide of claim 12, wherein the selected ORF is (c).

38. The isolated polynucleotide of claim 12, wherein the selected ORF is (d).

39. The isolated polynucleotide of claim 12, wherein the selected ORF is (e).

40. The isolated polynucleotide of claim 12, wherein the selected ORF is (f).

41. The isolated polynucleotide of claim 12, wherein the selected ORF is (g).

42. The isolated polynucleotide of claim 12, wherein the selected ORF is (h).

43. The isolated polynucleotide of claim 12, wherein the selected ORF is (i).

44. The isolated polynucleotide of claim 12, wherein the selected ORF is (j).

45. The isolated polynucleotide of claim 14, wherein the selected ORF is (a).

46. The isolated polynucleotide of claim 14, wherein the selected ORF is (b).

47. The isolated polynucleotide of claim 14, wherein the selected ORF is (c).

48. The isolated polynucleotide of claim 14, wherein the selected ORF is (d).

49. The isolated polynucleotide of claim 14, wherein the selected ORF is (e).

50. The isolated polynucleotide of claim 14, wherein the selected ORF is (f).

51. The isolated polynucleotide of claim 14, wherein the selected ORF is (g).

52. The isolated polynucleotide of claim 14, wherein the selected ORF is (h).

53. The isolated polynucleotide of claim 14, wherein the selected ORF is (i).

54. The isolated polynucleotide of claim 14, wherein the selected ORF is (j).

55. The isolated polynucleotide of claim 10, wherein said polynucleotide comprises a heterologous polynucleotide sequence.

56. The isolated polynucleotide of claim 55, wherein said heterologous polynucleotide sequence encodes a heterologous polypeptide.

57. A method for making a recombinant vector comprising inserting the isolated polynucleotide of claim 10 into a vector.

58. A recombinant vector comprising the isolated polynucleotide of claim 10.

59. The recombinant vector of claim 58, wherein said polynucleotide is operably associated with a heterologous regulatory sequence that controls gene expression.

60. A recombinant host cell comprising the isolated polynucleotide of claim 10.

61. The recombinant host cell of claim 60, wherein said polynucleotide is operably associated with a heterologous regulatory sequence that controls gene expression.

62. The isolated polynucleotide of claim 12, wherein said polynucleotide comprises a heterologous polynucleotide sequence.

63. The isolated polynucleotide of claim 62, wherein said heterologous polynucleotide sequence encodes a heterologous polypeptide.

64. A method for making a recombinant vector comprising inserting the isolated polynucleotide of claim 12 into a vector.

65. A recombinant vector comprising the isolated polynucleotide of claim 12.

66. The recombinant vector of claim 65, wherein said polynucleotide is operably associated with a heterologous regulatory sequence that controls gene expression.

67. A recombinant host cell comprising the isolated polynucleotide of claim 12.

68. The recombinant host cell of claim 67, wherein said polynucleotide is operably associated with a heterologous regulatory sequence that controls gene expression.

69. The isolated polynucleotide of claim 14, wherein said polynucleotide comprises a heterologous polynucleotide sequence.

70. The isolated polynucleotide of claim 69, wherein said heterologous polynucleotide sequence encodes a heterologous polypeptide.

71. A method for making a recombinant vector comprising inserting the isolated polynucleotide of claim 14 into a vector.

72. A recombinant vector comprising the isolated polynucleotide of claim 14.

73. The recombinant vector of claim 72, wherein said polynucleotide is operably associated with a heterologous regulatory sequence that controls gene expression.

74. A recombinant host cell comprising the isolated polynucleotide of claim 14.

75. The recombinant host cell of claim 74, wherein said polynucleotide is operably associated with a heterologous regulatory sequence that controls gene expression.

76. An isolated polynucleotide fragment comprising a nucleic acid sequence encoding an amino acid sequence encoded by an ORF selected from the group consisting of: (a) ORF ID NO:3 of Contig ID NO:78, represented by nucleotides 1108-3636 of SEQ ID NO:78; (b) ORF ID NO:3 of Contig ID NO:94, represented by nucleotides 951-2741 of SEQ ID NO:94; and (c) ORF ID NO:1 of Contig ID NO:287, represented by nucleotides 2-871 of SEQ ID NO:287.

77. The isolated polynucleotide of claim 76, wherein the selected ORF is (a).

78. The isolated polynucleotide of claim 76, wherein the selected ORF is (b).

79. The isolated polynucleotide of claim 76, wherein the selected ORF is (c).

80. The isolated polynucleotide of claim 76, wherein said polynucleotide comprises a heterologous polynucleotide sequence.

81. The isolated polynucleotide of claim 80, wherein said heterologous polynucleotide sequence encodes a heterologous polypeptide.

82. A method for making a recombinant vector comprising inserting the isolated polynucleotide of claim 76 into a vector.

83. A nucleic acid sequence complementary to the polynucleotide of claim 76.

84. A recombinant vector comprising the isolated polynucleotide of claim 76.

85. The recombinant vector of claim 84, wherein said polynucleotide is operably associated with a heterologous regulatory sequence that controls gene expression.

86. A recombinant host cell comprising the isolated polynucleotide of claim 76.

87. The recombinant host cell of claim 86, wherein said polynucleotide is operably associated with a heterologous regulatory sequence that controls gene expression.

88. A method for producing a polypeptide, comprising: (a) culturing a host cell under conditions suitable to produce a polypeptide encoded by the polynucleotide of claim 76; and (b) recovering the polypeptide from the cell culture.


Text:


FIELD OF THE INVENTION

The present invention relates to the field of molecular biology. In particular, it relates to, among other things, nucleotide sequences of Streptococcus pneumoniae, contigs, ORFs, fragments, probes, primers and related polynucleotides thereof, peptides and polypeptides encoded by the sequences, and uses of the polynucleotides and sequences thereof, such as in fermentation, polypeptide production, assays and pharmaceutical development, among others.

BACKGROUND OF THE INVENTION

Streptococcus pneumoniae has been one of the most extensively studied microorganisms since its first isolation in 1881. It was the object of many investigations that led to important scientific discoveries. In 1928, Griffith observed that when heat-killed encapsulated pneumococci and live strains constitutively lacking any capsule were concomitantly injected into mice, the nonencapsulated could be converted into encapsulated pneumococci with the same capsular type as the heat-killed strain. Years later, the nature of this “transforming principle,” or carrier of genetic information, was shown to be DNA. (Avery, O. T., et al., J. Exp. Med., 79:137-157 (1944)).

In spite of the vast number of publications on S. pneumoniae, many questions about its virulence are still unanswered, and this pathogen remains a major causative agent of serious human disease, especially community-acquired pneumonia. (Johnston, R. B., et al., Rev. Infect. Dis. 13(Suppl. 6):S509-517 (1991)). In addition, in developing countries, the pneumococcus is responsible for the death of a large number of children under the age of 5 years from pneumococcal pneumonia. The incidence of pneumococcal disease is highest in infants under 2 years of age and in people over 60 years of age. Pneumococci are the second most frequent cause (after Haemophilus influenzae type b) of bacterial meningitis and otitis media in children. With the recent introduction of conjugate vaccines for H. influenzae type b, pneumococcal meningitis is likely to become increasingly prominent. S. pneumoniae is the most important etiologic agent of community-acquired pneumonia in adults and is the second most common cause of bacterial meningitis behind Neisseria meningitidis.

The antibiotic generally prescribed to treat S. pneumoniae is benzylpenicillin, although resistance to this and to other antibiotics is found occasionally. Pneumococcal resistance to penicillin results from mutations in its penicillin-binding proteins. In uncomplicated pneumococcal pneumonia caused by a sensitive strain, treatment with penicillin is usually successful unless started too late. Erythromycin or clindamycin can be used to treat pneumonia in patients hypersensitive to penicillin, but strains resistant to these drugs exist. Broad spectrum antibiotics (e.g., the tetracyclines) may also be effective, although tetracycline-resistant strains are not rare. In spite of the availability of antibiotics, the mortality of pneumococcal bacteremia in the last four decades has remained stable between 25 and 29%. (Gillespie, S. H., et al., J. Med. Microbiol. 28:237-248 (1989)).

S. pneumoniae is carried in the upper respiratory tract by many healthy individuals. It has been suggested that attachment of pneumococci is mediated by a disaccharide receptor on fibronectin, present on human pharyngeal epithelial cells. (Anderson, B. J., et al., J. Immunol. 142:2464-2468 (1989)). The mechanisms by which pneumococci translocate from the nasopharynx to the lung, thereby causing pneumonia, or migrate to the blood, giving rise to bacteremia or septicemia, are poorly understood. (Johnston, R. B., et al., Rev. Infect. Dis. 13(Suppl. 6):S509-517 (1991)).

Various proteins have been suggested to be involved in the pathogenicity of S. pneumoniae; however, only a few of them have actually been confirmed as virulence factors. Pneumococci produce an IgA1 protease that might interfere with host defense at mucosal surfaces. (Kornfield, S. J., et al., Rev. Inf. Dis. 3:521-534 (1981)). S. pneumoniae also produces neuraminidase, an enzyme that may facilitate attachment to epithelial cells by cleaving sialic acid from the host glycolipids and gangliosides. Partially purified neuraminidase was observed to induce meningitis-like symptoms in mice; however, the reliability of this finding has been questioned because the neuraminidase preparations used were probably contaminated with cell wall products. Other pneumococcal proteins besides neuraminidase are involved in the adhesion of pneumococci to epithelial and endothelial cells. These pneumococcal proteins have as yet not been identified. Recently, Cundell et al. reported that peptide permeases can modulate pneumococcal adherence to epithelial and endothelial cells. It was, however, unclear whether these permeases function directly as adhesins or whether they enhance adherence by modulating the expression of pneumococcal adhesins. (DeVelasco, E. A., et al., Micro. Rev. 59:591-603 (1995)). A better understanding of the virulence factors determining its pathogenicity will need to be developed to cope with the devastating effects of pneumococcal disease in humans.

Ironically, despite the prominent role of S. pneumoniae in the discovery of DNA, little is known about the molecular genetics of the organism. The S. pneumoniae genome consists of one circular, covalently closed, double-stranded DNA and a collection of so-called variable accessory elements, such as prophages, plasmids, transposons and the like. Most physical characteristics and almost all of the genes of S. pneumoniae are unknown. Among the few that have been identified, most have not been physically mapped or characterized in detail. Only a few genes of this organism have been sequenced. (See, for instance, current versions of GENBANK and other nucleic acid databases, and references that relate to the genome of S. pneumoniae such as those set out elsewhere herein.)

It is clear that the etiology of diseases mediated or exacerbated by S. pneumoniae infection involves the programmed expression of S. pneumoniae genes, and that characterizing the genes and their patterns of expression would add dramatically to our understanding of the organism and its host interactions. Knowledge of S. pneumoniae genes and genomic organization would improve our understanding of disease etiology and lead to improved and new ways of preventing, ameliorating, arresting and reversing diseases. Moreover, characterized genes and genomic fragments of S. pneumoniae would provide reagents for, among other things, detecting, characterizing and controlling S. pneumoniae infections. There is a need to characterize the genome of S. pneumoniae and for polynucleotides of this organism.

SUMMARY OF THE INVENTION

The present invention is based on the sequencing of fragments of the Streptococcus pneumoniae genome. The primary nucleotide sequences which were generated are provided in SEQ ID NOS:1-391.

The present invention provides the nucleotide sequence of several hundred contigs of the Streptococcus pneumoniae genome, which are listed in tables below and set out in the Sequence Listing submitted herewith, and representative fragments thereof, in a form which can be readily used, analyzed, and interpreted by a skilled artisan. In one embodiment, the present invention is provided as contiguous strings of primary sequence information corresponding to the nucleotide sequences depicted in SEQ ID NOS:1-391.

The present invention further provides nucleotide sequences which are at least 95% identical to the nucleotide sequences of SEQ ID NOS:1-391.

The nucleotide sequence of SEQ ID NOS:1-391, a representative fragment thereof, or a nucleotide sequence which is at least 95% identical to the nucleotide sequence of SEQ ID NOS:1-391 may be provided in a variety of media to facilitate its use. In one application of this embodiment, the sequences of the present invention are recorded on computer readable media. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media.

The present invention further provides systems, particularly computer-based systems which contain the sequence information herein described stored in a data storage means. Such systems are designed to identify commercially important fragments of the Streptococcus pneumoniae genome.

Another embodiment of the present invention is directed to fragments of the Streptococcus pneumoniae genome having particular structural or functional attributes. Such fragments of the Streptococcus pneumoniae genome of the present invention include, but are not limited to, fragments which encode peptides, hereinafter referred to as open reading frames or ORFs, fragments which modulate the expression of an operably linked ORF, hereinafter referred to as expression modulating fragments or EMFs, and fragments which can be used to diagnose the presence of Streptococcus pneumoniae in a sample, hereinafter referred to as diagnostic fragments or DFs.
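By way of illustration only, the notion of an open reading frame can be sketched in a few lines of Python. The sketch below scans the three forward reading frames of a nucleotide string for stretches that begin at an ATG codon and run to the first in-frame stop codon. The function name `find_orfs`, the `min_len` parameter, and the restriction to ATG starts on the forward strand are assumptions made for this sketch; it is not the procedure used to identify the ORFs of Tables 1-3.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_len=30):
    """Return (start, end) pairs, 1-based inclusive, for forward-strand ORFs.

    Hypothetical helper: an ORF here runs from an ATG to the first
    in-frame stop codon, with the stop codon included in the range.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # 0-based index of the current candidate ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start >= min_len:
                    orfs.append((start + 1, i + 3))
                start = None
    return sorted(orfs)


# Toy sequence, not taken from the Sequence Listing.
print(find_orfs("CCATGAAATTTGGGTAACC", min_len=15))
```

Coordinates are reported 1-based and inclusive so that they read the same way as the nucleotide ranges recited for the ORFs herein; a fuller treatment would also scan the complementary strand and allow alternative start codons.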

Each of the ORFs in fragments of the Streptococcus pneumoniae genome disclosed in Tables 1-3, and the EMFs found 5′ to the ORFs, can be used in numerous ways as polynucleotide reagents. For instance, the sequences can be used as diagnostic probes or amplification primers for detecting or determining the presence of a specific microbe in a sample, to selectively control gene expression in a host, and in the production of polypeptides, such as polypeptides encoded by ORFs of the present invention, particularly those polypeptides that have a pharmacological activity.

The present invention further includes recombinant constructs comprising one or more fragments of the Streptococcus pneumoniae genome of the present invention. The recombinant constructs of the present invention comprise vectors, such as a plasmid or viral vector, into which a fragment of the Streptococcus pneumoniae genome has been inserted.

The present invention further provides host cells containing any of the isolated fragments of the Streptococcus pneumoniae genome of the present invention. The host cells can be a higher eukaryotic host cell, such as a mammalian cell, a lower eukaryotic cell, such as a yeast cell, or a procaryotic cell such as a bacterial cell.

The present invention is further directed to isolated polypeptides and proteins encoded by ORFs of the present invention. A variety of methods, well known to those of skill in the art, routinely may be utilized to obtain any of the polypeptides and proteins of the present invention. For instance, polypeptides and proteins of the present invention having relatively short, simple amino acid sequences readily can be synthesized using commercially available automated peptide synthesizers. Polypeptides and proteins of the present invention also may be purified from bacterial cells which naturally produce the protein. Yet another alternative is to purify polypeptides and proteins of the present invention from cells which have been altered to express them.

The invention further provides methods of obtaining homologs of the fragments of the Streptococcus pneumoniae genome of the present invention and homologs of the proteins encoded by the ORFs of the present invention. Specifically, by using the nucleotide and amino acid sequences disclosed herein as a probe or as primers, and techniques such as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain homologs.

The invention further provides antibodies which selectively bind polypeptides and proteins of the present invention. Such antibodies include both monoclonal and polyclonal antibodies.

The invention further provides hybridomas which produce the above-described antibodies. A hybridoma is an immortalized cell line which is capable of secreting a specific monoclonal antibody.

The present invention further provides methods of identifying test samples derived from cells which express one of the ORFs of the present invention, or a homolog thereof. Such methods comprise incubating a test sample with one or more of the antibodies of the present invention, or one or more of the DFs of the present invention, under conditions which allow a skilled artisan to determine if the sample contains the ORF or product produced therefrom.

In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry out the above-described assays.

Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers which comprises: (a) a first container comprising one of the antibodies, or one of the DFs of the present invention; and (b) one or more other containers comprising one or more of the following: wash reagents, reagents capable of detecting presence of bound antibodies or hybridized DFs.

Using the isolated proteins of the present invention, the present invention further provides methods of obtaining and identifying agents capable of binding to a polypeptide or protein encoded by one of the ORFs of the present invention. Specifically, such agents include, as further described below, antibodies, peptides, carbohydrates, pharmaceutical agents and the like. Such methods comprise steps of: (a) contacting an agent with an isolated protein encoded by one of the ORFs of the present invention; and (b) determining whether the agent binds to said protein.

The present genomic sequences of Streptococcus pneumoniae will be of great value to all laboratories working with this organism and for a variety of commercial purposes. Many fragments of the Streptococcus pneumoniae genome will be immediately identified by similarity searches against GenBank or protein databases and will be of immediate value to Streptococcus pneumoniae researchers and of immediate commercial value for the production of proteins or the control of gene expression.

The methodology and technology for elucidating extensive genomic sequences of bacterial and other genomes have enhanced, and will greatly enhance, the ability to analyze and understand chromosomal organization. In particular, sequenced contigs and genomes will provide the models for developing tools for the analysis of chromosome structure and function, including the ability to identify genes within large segments of genomic DNA, the structure, position, and spacing of regulatory elements, the identification of genes with potential industrial applications, and the ability to perform comparative genomics and molecular phylogeny.

DESCRIPTION OF THE FIGURES

FIG. 1 is a block diagram of a computer system (102) that can be used to implement computer-based systems of the present invention.

FIG. 2 is a schematic diagram depicting the data flow and computer programs used to collect, assemble, edit and annotate the contigs of the Streptococcus pneumoniae genome of the present invention. Both Macintosh and Unix platforms are used to handle the AB 373 and 377 sequence data files, largely as described in Kerlavage et al., Proceedings of the Twenty-Sixth Annual Hawaii International Conference on System Sciences, 585, IEEE Computer Society Press, Washington D.C. (1993). Factura (AB) is a Macintosh program designed for automatic vector sequence removal and end-trimming of sequence files. The program Loadis runs on a Macintosh platform and parses the feature data extracted from the sequence files by Factura to the Unix-based Streptococcus pneumoniae relational database. Assembly of contigs (and whole genome sequences) is accomplished by retrieving a specific set of sequence files and their associated features using Extrseq, a Unix utility for retrieving sequences from an SQL database. The resulting sequence file is processed by seq_filter to trim portions of the sequences with more than 2% ambiguous nucleotides. The sequence files were assembled using TIGR Assembler, an assembly engine designed at The Institute for Genomic Research (TIGR) for rapid and accurate assembly of thousands of sequence fragments. The collection of contigs generated by the assembly step is loaded into the database with the lassie program. Identification of open reading frames (ORFs) is accomplished by processing contigs with zorf or GenMark. The ORFs are searched against S. pneumoniae sequences from GenBank and against all protein sequences using the BLASTN and BLASTP programs, described in Altschul et al., J. Mol. Biol. 215: 403-410 (1990). Results of the ORF determination and similarity searching steps were loaded into the database. As described below, some results of the determination and the searches are set out in Tables 1-3.
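The ambiguity-trimming step described for seq_filter can be illustrated with a short Python sketch. The internal workings of seq_filter are not set out here, so the following is only one plausible reading under stated assumptions: any base other than A, C, G or T is counted as ambiguous, and fixed-size windows are dropped from either end of a read while their ambiguous fraction exceeds the threshold. The function names, the 50-base window, and the end-only trimming strategy are all hypothetical.

```python
def ambiguous_fraction(seq):
    """Fraction of bases that are not an unambiguous A, C, G or T."""
    if not seq:
        return 0.0
    return sum(1 for base in seq.upper() if base not in "ACGT") / len(seq)


def trim_ambiguous_ends(seq, window=50, max_fraction=0.02):
    """Drop leading and trailing windows that exceed max_fraction ambiguity.

    Assumed behavior for illustration; not the actual seq_filter algorithm.
    """
    start, end = 0, len(seq)
    # Trim from the 5' end, one window at a time.
    while end - start >= window and \
            ambiguous_fraction(seq[start:start + window]) > max_fraction:
        start += window
    # Trim from the 3' end, one window at a time.
    while end - start >= window and \
            ambiguous_fraction(seq[end - window:end]) > max_fraction:
        end -= window
    return seq[start:end]


# Toy read: a poor 5' end, a clean middle, and a 3' end with 20% N calls.
read = "N" * 50 + "ACGT" * 50 + "ACGTN" * 10
print(len(trim_ambiguous_ends(read)))
```

Working in windows keeps a single stray ambiguous call in the middle of an otherwise clean read from causing the whole read to be discarded.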

DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS

The present invention is based on the sequencing of fragments of the Streptococcus pneumoniae genome and analysis of the sequences. The primary nucleotide sequences generated by sequencing the fragments are provided in SEQ ID NOS:1-391. (As used herein, the “primary sequence” refers to the nucleotide sequence represented by the IUPAC nomenclature system.)

In addition to the aforementioned Streptococcus pneumoniae polynucleotides and polynucleotide sequences, the present invention provides the nucleotide sequences of SEQ ID NOS:1-391, or representative fragments thereof, in a form which can be readily used, analyzed, and interpreted by a skilled artisan.

As used herein, a “representative fragment of the nucleotide sequence depicted in SEQ ID NOS:1-391” refers to any portion of the SEQ ID NOS:1-391 which is not presently represented within a publicly available database. Preferred representative fragments of the present invention are Streptococcus pneumoniae open reading frames (ORFs), expression modulating fragments (EMFs) and fragments which can be used to diagnose the presence of Streptococcus pneumoniae in a sample (DFs). A non-limiting identification of preferred representative fragments is provided in Tables 1-3. As discussed in detail below, the information provided in SEQ ID NOS:1-391 and in Tables 1-3 together with routine cloning, synthesis, sequencing and assay methods will enable those skilled in the art to clone and sequence all “representative fragments” of interest, including open reading frames encoding a large variety of Streptococcus pneumoniae proteins.

While the presently disclosed sequences of SEQ ID NOS:1-391 are highly accurate, sequencing techniques are not perfect and, in relatively rare instances, further investigation of a fragment or sequence of the invention may reveal a nucleotide sequence error present in a nucleotide sequence disclosed in SEQ ID NOS:1-391. However, once the present invention is made available (i.e., once the information in SEQ ID NOS:1-391 and Tables 1-3 has been made available), resolving a rare sequencing error in SEQ ID NOS:1-391 will be well within the skill of the art. The present disclosure makes available sufficient sequence information to allow any of the described contigs or portions thereof to be obtained readily by straightforward application of routine techniques. Further sequencing of such polynucleotides may proceed in like manner using manual and automated sequencing methods which are ubiquitously employed in the art. Nucleotide sequence editing software is publicly available. For example, Applied Biosystems' (AB) AutoAssembler can be used as an aid during visual inspection of nucleotide sequences. By employing such routine techniques potential errors readily may be identified and the correct sequence then may be ascertained by targeting further sequencing effort, also of a routine nature, to the region containing the potential error.

Even if all of the very rare sequencing errors in SEQ ID NOS:1-391 were corrected, the resulting nucleotide sequences would still be at least 95% identical, nearly all would be at least 99% identical, and the great majority would be at least 99.9% identical to the nucleotide sequences of SEQ ID NOS:1-391.

As discussed elsewhere herein, polynucleotides of the present invention readily may be obtained by routine application of well known and standard procedures for cloning and sequencing DNA. Detailed methods for obtaining libraries and for sequencing are provided below, for instance. A wide variety of Streptococcus pneumoniae strains that can be used to prepare S. pneumoniae genomic DNA for cloning and for obtaining polynucleotides of the present invention are available to the public from recognized depository institutions, such as the American Type Culture Collection (ATCC). While the present invention is enabled by the sequences and other information herein disclosed, the S. pneumoniae strain that provided the DNA of the present Sequence Listing, Strain 7/87 14.8.91, has been deposited in the ATCC, as a convenience to those of skill in the art. As a further convenience, a library of S. pneumoniae genomic DNA, derived from the same strain, also has been deposited in the ATCC. The S. pneumoniae strain was deposited on Oct. 10, 1996, and was given Deposit No. 55840, and the genomic DNA library was deposited on Oct. 11, 1996 and was given Deposit No. 97755. The genomic fragments in the library are 15 to 20 kb fragments generated by partial Sau3AI digestion and they are inserted into the BamHI site in the well-known lambda-derived vector lambda DASH II (Stratagene, La Jolla, Calif.). The provision of the deposits is not a waiver of any rights of the inventors or their assignees in the present subject matter.

The nucleotide sequences of the genomes from different strains of Streptococcus pneumoniae differ somewhat. However, the nucleotide sequences of the genomes of all Streptococcus pneumoniae strains will be at least 95% identical, in corresponding part, to the nucleotide sequences provided in SEQ ID NOS:1-391. Nearly all will be at least 99% identical and the great majority will be 99.9% identical.

Thus, the present invention further provides nucleotide sequences which are at least 95%, preferably 99% and most preferably 99.9% identical to the nucleotide sequences of SEQ ID NOS:1-391, in a form which can be readily used, analyzed and interpreted by the skilled artisan.

Methods for determining whether a nucleotide sequence is at least 95%, at least 99% or at least 99.9% identical to the nucleotide sequences of SEQ ID NOS:1-391 are routine and readily available to the skilled artisan. For example, the well known fasta algorithm described in Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85: 2444 (1988) can be used to generate the percent identity of nucleotide sequences. The BLASTN program also can be used to generate an identity score of polynucleotides compared to one another.

COMPUTER RELATED EMBODIMENTS

The nucleotide sequences provided in SEQ ID NOS:1-391, a representative fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical to a polynucleotide sequence of SEQ ID NOS:1-391 may be “provided” in a variety of mediums to facilitate use thereof. As used herein, “provided” refers to a manufacture, other than an isolated nucleic acid molecule, which contains a nucleotide sequence of the present invention; i.e., a nucleotide sequence provided in SEQ ID NOS:1-391, a representative fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical to a polynucleotide of SEQ ID NOS:1-391. Such a manufacture provides a large portion of the Streptococcus pneumoniae genome and parts thereof (e.g., a Streptococcus pneumoniae open reading frame (ORF)) in a form which allows a skilled artisan to examine the manufacture using means not directly applicable to examining the Streptococcus pneumoniae genome or a subset thereof as it exists in nature or in purified form.

In one application of this embodiment, a nucleotide sequence of the present invention can be recorded on computer readable media. As used herein, “computer readable media” refers to any medium which can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories, such as magnetic/optical storage media. A skilled artisan can readily appreciate how any of the presently known computer readable mediums can be used to create a manufacture comprising computer readable medium having recorded thereon a nucleotide sequence of the present invention. Likewise, it will be clear to those of skill how additional computer readable media that may be developed also can be used to create analogous manufactures having recorded thereon a nucleotide sequence of the present invention.

As used herein, “recorded” refers to a process for storing information on computer readable medium. A skilled artisan can readily adopt any of the presently known methods for recording information on computer readable medium to generate manufactures comprising the nucleotide sequence information of the present invention. A variety of data storage structures are available to a skilled artisan for creating a computer readable medium having recorded thereon a nucleotide sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of data-processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.
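
As a minimal sketch of the two storage approaches just mentioned (an ASCII text file and a database application), the following Python code writes sequences to a FASTA-style text file and to a relational table. SQLite is used here only as a freely available stand-in for the database products named above; the table name, column names and function names are assumptions made for illustration.

```python
import sqlite3


def write_fasta(path, records, width=60):
    """Record sequences as an ASCII text file, one '>' header line per sequence."""
    with open(path, "w", encoding="ascii") as handle:
        for name, seq in records.items():
            handle.write(f">{name}\n")
            for i in range(0, len(seq), width):
                handle.write(seq[i:i + width] + "\n")


def read_fasta(path):
    """Read the file written by write_fasta back into a name -> sequence dict."""
    parts, name = {}, None
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                name = line[1:]
                parts[name] = []
            elif name is not None:
                parts[name].append(line)
    return {n: "".join(chunks) for n, chunks in parts.items()}


def store_in_database(conn, records):
    """Record sequences in a relational table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS contig (name TEXT PRIMARY KEY, sequence TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO contig VALUES (?, ?)", list(records.items()))
    conn.commit()


def fetch_sequence(conn, name):
    """Retrieve one stored sequence by name, or None if absent."""
    row = conn.execute("SELECT sequence FROM contig WHERE name = ?", (name,)).fetchone()
    return row[0] if row else None
```

Which structure is preferable depends, as stated above, on the means chosen to access the stored information: a flat file suits sequential scanning, whereas a table suits retrieval by name.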

Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium. Thus, by providing in computer readable form the nucleotide sequences of SEQ ID NOS:1-391, a representative fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical to a sequence of SEQ ID NOS:1-391 the present invention enables the skilled artisan routinely to access the provided sequence information for a wide variety of purposes.

The examples which follow demonstrate how software which implements the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase system was used to identify open reading frames (ORFs) within the Streptococcus pneumoniae genome which contain homology to ORFs or proteins from both Streptococcus pneumoniae and from other organisms. Among the ORFs discussed herein are protein encoding fragments of the Streptococcus pneumoniae genome useful in producing commercially important proteins, such as enzymes used in fermentation reactions and in the production of commercially useful metabolites.

The present invention further provides systems, particularly computer-based systems, which contain the sequence information described herein. Such systems are designed to identify, among other things, commercially important fragments of the Streptococcus pneumoniae genome.

As used herein, “a computer-based system” refers to the hardware means, software means, and data storage means used to analyze the nucleotide sequence information of the present invention. The minimum hardware means of the computer-based systems of the present invention comprises a central processing unit (CPU), input means, output means, and data storage means. A skilled artisan can readily appreciate that any one of the currently available computer-based systems is suitable for use in the present invention.

As stated above, the computer-based systems of the present invention comprise a data storage means having stored therein a nucleotide sequence of the present invention and the necessary hardware means and software means for supporting and implementing a search means.

As used herein, “data storage means” refers to memory which can store nucleotide sequence information of the present invention, or a memory access means which can access manufactures having recorded thereon the nucleotide sequence information of the present invention.

As used herein, “search means” refers to one or more programs which are implemented on the computer-based system to compare a target sequence or target structural motif with the sequence information stored within the data storage means. Search means are used to identify fragments or regions of the present genomic sequences which match a particular target sequence or target motif. A variety of known algorithms are disclosed publicly and a variety of commercially available software for conducting search means are available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A skilled artisan can readily recognize that any one of the available algorithms or implementing software packages for conducting homology searches can be adapted for use in the present computer-based systems.
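
For illustration, a search means in its simplest form can be sketched as an exact-match scan of stored sequences on both strands. The Python code below is far simpler than MacPattern or the BLAST programs and is not a substitute for them; the function names and the example contigs are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_target(contigs, target):
    """Return (contig name, strand, 1-based position) for every exact match.

    Positions are reported on the stored (forward) strand. A '-' hit means the
    reverse complement of the target was found at that position.
    """
    target = target.upper()
    patterns = [("+", target)]
    rc = reverse_complement(target)
    if rc != target:  # avoid reporting palindromic targets twice
        patterns.append(("-", rc))
    hits = []
    for name, seq in contigs.items():
        seq = seq.upper()
        for strand, pattern in patterns:
            start = seq.find(pattern)
            while start != -1:
                hits.append((name, strand, start + 1))
                start = seq.find(pattern, start + 1)
    return hits


# Hypothetical stored sequences.
demo_contigs = {"c1": "AAACCCGGGTTT", "c2": "TTTGGG"}
demo_hits = find_target(demo_contigs, "AAACCC")
```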

As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. The most preferred sequence length of a target sequence is from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that searches for commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.

As used herein, “a target structural motif,” or “target motif,” refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include, but are not limited to, enzymic active sites and signal sequences. Nucleic acid target motifs include, but are not limited to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).

A variety of structural formats for the input and output means can be used to input and output the information in the computer-based systems of the present invention. A preferred format for an output means ranks fragments of the Streptococcus pneumoniae genomic sequences possessing varying degrees of homology to the target sequence or target motif. Such presentation provides a skilled artisan with a ranking of sequences which contain various amounts of the target sequence or target motif and identifies the degree of homology contained in the identified fragment.
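
A ranked output of the kind described above can be sketched as follows. This Python example scores each stored sequence by its best ungapped window against the target and sorts the results; it is an assumption-laden simplification (no gaps, no statistical scoring), and the names and example sequences are hypothetical.

```python
def best_window_identity(seq, target):
    """Return (best percent identity, 1-based start) over all ungapped windows."""
    best_pct, best_start = 0.0, 0
    for start in range(len(seq) - len(target) + 1):
        window = seq[start:start + len(target)]
        matches = sum(a == b for a, b in zip(window, target))
        pct = 100.0 * matches / len(target)
        if pct > best_pct:
            best_pct, best_start = pct, start + 1
    return best_pct, best_start


def rank_fragments(contigs, target):
    """Rank stored sequences by their best match to the target, highest first."""
    rows = []
    for name, seq in contigs.items():
        pct, start = best_window_identity(seq.upper(), target.upper())
        rows.append((name, pct, start))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical stored sequences and target.
ranking = rank_fragments(
    {"a": "TTTACGTACGTTT", "b": "TTTACGAACGTTT", "c": "GGGGGGGGGGGGG"},
    "ACGTACG",
)
```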

A variety of comparing means can be used to compare a target sequence or target motif with the data storage means to identify sequence fragments of the Streptococcus pneumoniae genome. In the present examples, software which implements the BLAST and BLAZE algorithms, described in Altschul et al., J. Mol. Biol. 215: 403-410 (1990), is used to identify open reading frames within the Streptococcus pneumoniae genome. A skilled artisan can readily recognize that any one of the publicly available homology search programs can be used as the search means for the computer-based systems of the present invention. Of course, suitable proprietary systems that may be known to those of skill also may be employed in this regard.

FIG. 1 provides a block diagram of a computer system illustrative of embodiments of this aspect of the present invention. The computer system 102 includes a processor 106 connected to a bus 104. Also connected to the bus 104 are a main memory 108 (preferably implemented as random access memory, RAM) and a variety of secondary storage devices 110, such as a hard drive 112 and a removable medium storage device 114. The removable medium storage device 114 may represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc. A removable storage medium 116 (such as a floppy disk, a compact disk, a magnetic tape, etc.) containing control logic and/or data recorded therein may be inserted into the removable medium storage device 114. The computer system 102 includes appropriate software for reading the control logic and/or the data from the removable storage medium 116 once it is inserted into the removable medium storage device 114.

A nucleotide sequence of the present invention may be stored in a well known manner in the main memory 108, any of the secondary storage devices 110, and/or a removable storage medium 116. During execution, software for accessing and processing the genomic sequence (such as search tools, comparing tools, etc.) resides in main memory 108, in accordance with the requirements and operating parameters of the operating system, the hardware system and the software program or programs.

BIOCHEMICAL EMBODIMENTS

Other embodiments of the present invention are directed to isolated fragments of the Streptococcus pneumoniae genome. The fragments of the Streptococcus pneumoniae genome of the present invention include, but are not limited to fragments which encode peptides and polypeptides, hereinafter open reading frames (ORFs), fragments which modulate the expression of an operably linked ORF, hereinafter expression modulating fragments (EMFs) and fragments which can be used to diagnose the presence of Streptococcus pneumoniae in a sample, hereinafter diagnostic fragments (DFs).

As used herein, an “isolated nucleic acid molecule” or an “isolated fragment of the Streptococcus pneumoniae genome” refers to a nucleic acid molecule possessing a specific nucleotide sequence which has been subjected to purification means to reduce, from the composition, the number of compounds which are normally associated with the composition. Particularly, the term refers to the nucleic acid molecules having the sequences set out in SEQ ID NOS:1-391, to representative fragments thereof as described above, to polynucleotides at least 95%, preferably at least 99% and especially preferably at least 99.9% identical in sequence thereto, also as set out above.

A variety of purification means can be used to generate the isolated fragments of the present invention. These include, but are not limited to methods which separate constituents of a solution based on charge, solubility, or size.

In one embodiment, Streptococcus pneumoniae DNA can be enzymatically sheared to produce fragments of 15-20 kb in length. These fragments can then be used to generate a Streptococcus pneumoniae library by inserting them into lambda clones as described in the Examples below. Primers flanking, for example, an ORF, such as those enumerated in Tables 1-3 can then be generated using nucleotide sequence information provided in SEQ ID NOS:1-391. Well known and routine techniques of PCR cloning then can be used to isolate the ORF from the lambda DNA library or Streptococcus pneumoniae genomic DNA. Thus, given the availability of SEQ ID NOS:1-391, the information in Tables 1, 2 and 3, and the information that may be obtained readily by analysis of the sequences of SEQ ID NOS:1-391 using methods set out above, those of skill will be enabled by the present disclosure to isolate any ORF-containing or other nucleic acid fragment of the present invention.

The isolated nucleic acid molecules of the present invention include, but are not limited to single stranded and double stranded DNA, and single stranded RNA.

As used herein, an “open reading frame,” ORF, means a series of triplets coding for amino acids without any termination codons and is a sequence translatable into protein.
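
The definition above (a series of triplets without termination codons) can be illustrated by scanning one reading frame for maximal runs of codons that contain no stop codon. The Python sketch below is not GeneMark or zorf and applies no coding-potential statistics; the function name and example sequence are hypothetical, and the three stop codons used are those of the standard and bacterial genetic codes.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_free_runs(seq, frame=0, min_codons=1):
    """Return 1-based (first, last) nucleotide positions of each maximal run of
    in-frame codons containing no stop codon, for one forward reading frame."""
    seq = seq.upper()
    runs, run_start = [], None
    for pos in range(frame, len(seq) - 2, 3):
        if seq[pos:pos + 3] in STOP_CODONS:
            if run_start is not None and (pos - run_start) // 3 >= min_codons:
                runs.append((run_start + 1, pos))
            run_start = None
        elif run_start is None:
            run_start = pos
    # Close a run that reaches the last complete codon of the sequence.
    end = frame + 3 * ((len(seq) - frame) // 3)
    if run_start is not None and (end - run_start) // 3 >= min_codons:
        runs.append((run_start + 1, end))
    return runs


# Hypothetical sequence: TAA | ATG AAA CCC | TAG | GG
demo_runs = stop_free_runs("TAAATGAAACCCTAGGG", frame=0)
```

The remaining five frames can be covered by calling the function with `frame=1` and `frame=2`, and again on the reverse complement of the sequence.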

Tables 1, 2, and 3 list ORFs in the Streptococcus pneumoniae genomic contigs of the present invention that were identified as putative coding regions by the GeneMark software using organism-specific second-order Markov probability transition matrices. It will be appreciated that other criteria can be used, in accordance with well known analytical methods, such as those discussed herein, to generate more inclusive, more restrictive, or more selective lists.

Table 1 sets out ORFs in the Streptococcus pneumoniae contigs of the present invention that over a continuous region of at least 50 bases are 95% or more identical (by BLAST analysis) to a nucleotide sequence available through GenBank in October, 1997.

Table 2 sets out ORFs in the Streptococcus pneumoniae contigs of the present invention that are not in Table 1 and match, with a BLASTP probability score of 0.01 or less, a polypeptide sequence available through GenBank in October, 1997.

Table 3 sets out ORFs in the Streptococcus pneumoniae contigs of the present invention that do not match significantly, by BLASTP analysis, a polypeptide sequence available through GenBank in October, 1997.

In each table, the first and second columns identify the ORF by, respectively, contig number and ORF number within the contig; the third column indicates the first nucleotide of the ORF (actually the first nucleotide of the stop codon immediately preceding the ORF), counting from the 5′ end of the contig strand; and the fourth column, “stop (nt)” indicates the last nucleotide of the stop codon defining the 3′ end of the ORF.

In Tables 1 and 2, column five lists the reference for the closest matching sequence available through GenBank. These reference numbers are the database entry numbers commonly used by those of skill in the art, who will be familiar with their denominations. Descriptions of the nomenclature are available from the National Center for Biotechnology Information. Column six in Tables 1 and 2 provides the gene name of the matching sequence; column seven provides the BLAST identity score and column eight the BLAST similarity score from the comparison of the ORF and the homologous gene; and column nine indicates the length in nucleotides of the highest scoring segment pair identified by the BLAST identity analysis.

Each ORF described in the tables is defined by “start (nt)” (5′) and “stop (nt)” (3′) nucleotide position numbers. These position numbers refer to the boundaries of each ORF and provide orientation with respect to whether the forward or reverse strand is the coding strand and in which reading frame the coding sequence is contained. The “start” position is the first nucleotide of the triplet encoding a stop codon just 5′ to the ORF and the “stop” position is the last nucleotide of the triplet encoding the next in-frame stop codon (i.e., the stop codon at the 3′ end of the ORF). Those of ordinary skill in the art appreciate that preferred fragments within each ORF described in the table include fragments of each ORF which include the entire sequence from the delineated “start” and “stop” positions excepting the first and last three nucleotides since these encode stop codons. Thus, polynucleotides set out as ORFs in the tables but lacking the three (3) 5′ nucleotides and the three (3) 3′ nucleotides are encompassed by the present invention. Those of skill also appreciate that particularly preferred are fragments within each ORF that are polynucleotide fragments comprising polypeptide coding sequence. As defined herein, “coding sequence” includes the fragment within an ORF beginning at the first in-frame ATG (triplet encoding methionine) and ending with the last nucleotide prior to the triplet encoding the 3′ stop codon. Preferred are fragments comprising the entire coding sequence and fragments comprising the entire coding sequence, excepting the coding sequence for the N-terminal methionine. Those of skill appreciate that the N-terminal methionine is often removed during post-translational processing and that polynucleotides lacking the ATG can be used to facilitate production of N-terminal fusion proteins which may be beneficial in the production or use of genetically engineered proteins.
Of course, due to the degeneracy of the genetic code many polynucleotides can encode a given polypeptide. Thus, the invention further includes polynucleotides comprising a nucleotide sequence encoding a polypeptide sequence itself encoded by the coding sequence within an ORF described in Tables 1-3 herein. Further, polynucleotides at least 95%, preferably at least 99% and especially preferably at least 99.9% identical in sequence to the foregoing polynucleotides, are contemplated by the present invention.
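
The “start (nt)” and “stop (nt)” conventions described above can be applied mechanically, as in the following Python sketch. It assumes 1-based, inclusive positions and assumes that a start position greater than the stop position denotes an ORF on the reverse strand; the function names and the example contig are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def orf_region(contig, start, stop):
    """Return the stop-to-stop region read 5' to 3' on its coding strand.

    Positions are 1-based and inclusive. start > stop is treated here as a
    reverse-strand ORF (an assumption about the table convention).
    """
    if start <= stop:
        return contig[start - 1:stop].upper()
    return reverse_complement(contig[stop - 1:start])


def strip_flanking_stops(region):
    """Drop the first and last three nucleotides, which encode stop codons."""
    return region[3:-3]


def coding_sequence(region):
    """Return the sequence from the first in-frame ATG up to, but not
    including, the 3' stop codon; empty string if there is no in-frame ATG."""
    inner = strip_flanking_stops(region)
    for pos in range(0, len(inner) - 2, 3):
        if inner[pos:pos + 3] == "ATG":
            return inner[pos:]
    return ""


# Hypothetical contig: GG TAA CCC ATG AAA TGA CC, with the ORF at positions 3-17.
demo_contig = "GGTAACCCATGAAATGACC"
demo_region = orf_region(demo_contig, 3, 17)
```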

Polypeptides encoded by polynucleotides described above and elsewhere herein are also provided by the present invention, as are polypeptides comprising an amino acid sequence at least about 95%, preferably at least 97% and even more preferably 99% identical to the amino acid sequence of a polypeptide encoded by an ORF shown in Tables 1-3. These polypeptides may or may not comprise an N-terminal methionine.

The concepts of percent identity and percent similarity of two polypeptide sequences are well understood in the art. For example, two polypeptides 10 amino acids in length which differ at three amino acid positions (e.g., at positions 1, 3 and 5) are said to have a percent identity of 70%. However, the same two polypeptides would be deemed to have a percent similarity of 80% if, for example at position 5, the amino acid moieties, although not identical, were “similar” (i.e., possessed similar biochemical characteristics). Many programs for analysis of nucleotide or amino acid sequence similarity, such as fasta and BLAST, specifically list percent identity of a matching region as an output parameter. Thus, for instance, Tables 1 and 2 herein enumerate the percent identity of the highest scoring segment pair in each ORF and its listed relative. Further details concerning the algorithms and criteria used for homology searches are provided below and are described in the pertinent literature highlighted by the citations provided below.
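
The worked example above can be reproduced in a few lines of Python. The two ten-residue peptides and the grouping of “similar” amino acids below are hypothetical and chosen only so that the peptides differ at positions 1, 3 and 5 with a conservative difference at position 5; programs such as fasta and BLAST derive similarity from substitution matrices rather than from fixed groups.

```python
# Hypothetical groups of biochemically similar amino acids (an assumption).
SIMILAR_GROUPS = ["DE", "KRH", "ILVM", "FYW", "ST", "NQ"]


def are_similar(a, b):
    """True if the residues are identical or fall within the same group."""
    return a == b or any(a in group and b in group for group in SIMILAR_GROUPS)


def percent_identity(pep_a, pep_b):
    """Percent of aligned positions with identical residues."""
    matches = sum(a == b for a, b in zip(pep_a, pep_b))
    return 100.0 * matches / len(pep_a)


def percent_similarity(pep_a, pep_b):
    """Percent of aligned positions with identical or similar residues."""
    matches = sum(are_similar(a, b) for a, b in zip(pep_a, pep_b))
    return 100.0 * matches / len(pep_a)


# Differences at positions 1 (A/W), 3 (L/P) and 5 (D/E, both acidic).
peptide_1 = "AKLVDAGSTW"
peptide_2 = "WKPVEAGSTW"
identity = percent_identity(peptide_1, peptide_2)      # 7 of 10 positions
similarity = percent_similarity(peptide_1, peptide_2)  # 8 of 10 positions
```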

It will be appreciated that other criteria can be used to generate more inclusive and more exclusive listings of the types set out in the tables. As those of skill will appreciate, narrow and broad searches both are useful. Thus, a skilled artisan can readily identify ORFs in contigs of the Streptococcus pneumoniae genome other than those listed in Tables 1-3, such as ORFs which are overlapping or encoded by the opposite strand of an identified ORF in addition to those ascertainable using the computer-based systems of the present invention.

As used herein, an “expression modulating fragment,” EMF, means a series of nucleotide molecules which modulates the expression of an operably linked ORF or EMF.

As used herein, a sequence is said to “modulate the expression of an operably linked sequence” when the expression of the sequence is altered by the presence of the EMF. EMFs include, but are not limited to, promoters, and promoter modulating sequences (inducible elements). One class of EMFs are fragments which induce the expression of an operably linked ORF in response to a specific regulatory factor or physiological event.

EMF sequences can be identified within the contigs of the Streptococcus pneumoniae genome by their proximity to the ORFs provided in Tables 1-3. An intergenic segment, or a fragment of the intergenic segment, from about 10 to 200 nucleotides in length, taken from any one of the ORFs of Tables 1-3 will modulate the expression of an operably linked ORF in a fashion similar to that found with the naturally linked ORF sequence. As used herein, an “intergenic segment” refers to fragments of the Streptococcus pneumoniae genome which are between two ORF(s) herein described. EMFs also can be identified using known EMFs as a target sequence or target motif in the computer-based systems of the present invention. Further, the two methods can be combined and used together.
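
Locating intergenic segments from ORF coordinates is a simple interval computation, sketched below in Python. The sketch assumes 1-based, inclusive ORF positions on a single contig and keeps only gaps of about 10 to 200 nucleotides; longer gaps, from which a shorter fragment could equally be taken, are omitted for brevity. The function name and example data are hypothetical.

```python
def intergenic_segments(contig, orfs, min_len=10, max_len=200):
    """Return (first, last, sequence) for each gap between adjacent ORFs.

    `orfs` is a list of (start, stop) pairs; reverse-strand ORFs given as
    (larger, smaller) are normalised before sorting. Overlapping or abutting
    ORFs produce no gap and are skipped by the length filter.
    """
    spans = sorted((min(s, e), max(s, e)) for s, e in orfs)
    segments = []
    for (_, left_end), (right_start, _) in zip(spans, spans[1:]):
        first, last = left_end + 1, right_start - 1
        length = last - first + 1
        if min_len <= length <= max_len:
            segments.append((first, last, contig[first - 1:last]))
    return segments


# Hypothetical contig with two ORFs separated by a 15-nucleotide gap.
demo_contig = "A" * 20 + "C" * 15 + "G" * 20
demo_gaps = intergenic_segments(demo_contig, [(55, 36), (1, 20)])
```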

The presence and activity of an EMF can be confirmed using an EMF trap vector. An EMF trap vector contains a cloning site linked to a marker sequence. A marker sequence encodes an identifiable phenotype, such as antibiotic resistance or a complementing nutrition auxotrophic factor, which can be identified or assayed when the EMF trap vector is placed within an appropriate host under appropriate conditions. A more detailed discussion of various marker sequences is provided below. A sequence which is suspected as being an EMF is cloned in all three reading frames in one or more restriction sites upstream from the marker sequence in the EMF trap vector. The vector is then transformed into an appropriate host using known procedures and the phenotype of the transformed host is examined under appropriate conditions. As described above, an EMF will modulate the expression of an operably linked marker sequence.

As used herein, a “diagnostic fragment,” DF, means a series of nucleotide molecules which selectively hybridize to Streptococcus pneumoniae sequences. DFs can be readily identified by identifying unique sequences within contigs of the Streptococcus pneumoniae genome, such as by using well-known computer analysis software, and by generating and testing probes or amplification primers consisting of the DF sequence in an appropriate diagnostic format which determines amplification or hybridization selectivity.
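
One simple way to look for candidate unique sequences by computer is to list the subsequences of a fixed length that occur in the sequence of interest but in none of a set of comparison sequences, on either strand. The Python sketch below does this by exact matching only; it does not predict hybridization behaviour, and the default length of 17, the function names and the example data are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def kmers(seq, k):
    """Return the set of all subsequences of length k."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_diagnostic_kmers(target, background, k=17):
    """Return k-mers of `target` absent from both strands of every background sequence."""
    background_kmers = set()
    for seq in background:
        background_kmers |= kmers(seq, k)
        background_kmers |= kmers(reverse_complement(seq), k)
    return sorted(kmers(target, k) - background_kmers)


# Small hypothetical example with k=4 so the result is easy to inspect.
demo_candidates = candidate_diagnostic_kmers("AAAACCCC", ["AAAAC"], k=4)
```

Candidates found this way would still need to be tested as probes or primers in an appropriate diagnostic format, as described above.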

The sequences falling within the scope of the present invention are not limited to the specific sequences herein described, but also include allelic and species variations thereof. Allelic and species variations can be routinely determined by comparing the sequences provided in SEQ ID NOS:1-391, a representative fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical to SEQ ID NOS:1-391, with a sequence from another isolate of the same species. Furthermore, to accommodate codon variability, the invention includes nucleic acid molecules coding for the same amino acid sequences as do the specific ORFs disclosed herein. In other words, in the coding region of an ORF, substitution of one codon for another which encodes the same amino acid is expressly contemplated. Any specific sequence disclosed herein can be readily screened for errors by resequencing a particular fragment, such as an ORF, in both directions (i.e., sequence both strands). Alternatively, error screening can be performed by sequencing corresponding polynucleotides of Streptococcus pneumoniae origin isolated by using part or all of the fragments in question as a probe or primer.

Preferred DFs of the present invention comprise at least about 17, preferably at least about 20, and more preferably at least about 50 contiguous nucleotides within an ORF set out in Tables 1-3. Most highly preferred DFs specifically hybridize to a polynucleotide containing the sequence of the ORF from which they are derived. Specific hybridization occurs even under stringent conditions defined elsewhere herein.

Each of the ORFs of the Streptococcus pneumoniae genome disclosed in Tables 1, 2 and 3, and the EMFs found 5′ to the ORFs, can be used as polynucleotide reagents in numerous ways. For example, the sequences can be used as diagnostic probes or diagnostic amplification primers to detect the presence of a specific microbe in a sample, particularly Streptococcus pneumoniae. Especially preferred in this regard are ORFs such as those of Table 3, which do not match previously characterized sequences from other organisms and thus are most likely to be highly selective for Streptococcus pneumoniae. Also particularly preferred are ORFs that can be used to distinguish between strains of Streptococcus pneumoniae, particularly those that distinguish medically important strains, such as drug-resistant strains.

In addition, the fragments of the present invention, as broadly described, can be used to control gene expression through triple helix formation or antisense DNA or RNA, both of which methods are based on the binding of a polynucleotide sequence to DNA or RNA. Triple helix-formation optimally results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide. Information from the sequences of the present invention can be used to design antisense and triple helix-forming oligonucleotides. Polynucleotides suitable for use in these methods are usually 20 to 40 bases in length and are designed to be complementary to a region of the gene involved in transcription, for triple-helix formation, or to the mRNA itself, for antisense inhibition. Both techniques have been demonstrated to be effective in model systems, and the requisite techniques are well known and involve routine procedures. Triple helix techniques are discussed in, for example, Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et al., Science 251:1360 (1991). Antisense techniques in general are discussed in, for instance, Okano, J. Neurochem. 56:560 (1991) and Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca Raton, Fla. (1988).
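
As a sketch of the sequence-design step only, an antisense oligonucleotide complementary to a chosen region of an mRNA can be derived from the coding-strand DNA sequence by taking the reverse complement of that region. The Python code below enforces the 20 to 40 base length range mentioned above; the function names and the example sequence are hypothetical, and no claim is made about the efficacy of any oligonucleotide so designed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def antisense_oligo(coding_strand, start, length=20):
    """Return a DNA oligonucleotide, written 5' to 3', complementary to the
    region of `length` bases beginning at 1-based position `start`."""
    if not 20 <= length <= 40:
        raise ValueError("length must be between 20 and 40 bases")
    region = coding_strand[start - 1:start - 1 + length]
    if start < 1 or len(region) < length:
        raise ValueError("region extends beyond the sequence")
    return reverse_complement(region)


# Hypothetical coding-strand sequence.
demo_oligo = antisense_oligo("A" * 20 + "C" * 5, start=1, length=20)
```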

The present invention further provides recombinant constructs comprising one or more fragments of the Streptococcus pneumoniae genomic fragments and contigs of the present invention. Certain preferred recombinant constructs of the present invention comprise a vector, such as a plasmid or viral vector, into which a fragment of the Streptococcus pneumoniae genome has been inserted, in a forward or reverse orientation. In the case of a vector comprising one of the ORFs of the present invention, the vector may further comprise regulatory sequences, including for example, a promoter, operably linked to the ORF. For vectors comprising the EMFs of the present invention, the vector may further comprise a marker sequence or heterologous ORF operably linked to the EMF.

Large numbers of suitable vectors and promoters are known to those of skill in the art and are commercially available for generating the recombinant constructs of the present invention. The following vectors are provided by way of example. Useful bacterial vectors include phagescript, PsiX174, pBluescript SK, pBS KS, pNH8a, pNH16a, pNH18a, pNH46a (available from Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (available from Pharmacia). Useful eukaryotic vectors include pWLneo, pSV2cat, pOG44, pXT1, pSG (available from Stratagene); pSVK3, pBPV, pMSG, pSVL (available from Pharmacia).

Promoter regions can be selected from any desired gene using CAT (chloramphenicol acetyltransferase) vectors or other vectors with selectable markers. Two appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the appropriate vector and promoter is well within the level of ordinary skill in the art.

The present invention further provides host cells containing any one of the isolated fragments of the Streptococcus pneumoniae genomic fragments and contigs of the present invention, wherein the fragment has been introduced into the host cell using known methods. The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower eukaryotic host cell, such as a yeast cell, or a prokaryotic cell, such as a bacterial cell.

A polynucleotide of the present invention, such as a recombinant construct comprising an ORF of the present invention, may be introduced into the host by a variety of well established techniques that are standard in the art, such as calcium phosphate transfection, DEAE-dextran mediated transfection and electroporation, which are described in, for instance, Davis, L. et al., BASIC METHODS IN MOLECULAR BIOLOGY (1986).

A host cell containing one of the fragments of the Streptococcus pneumoniae genomic fragments and contigs of the present invention, can be used in conventional manners to produce the gene product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a heterologous protein under the control of the EMF. The present invention further provides isolated polypeptides encoded by the nucleic acid fragments of the present invention or by degenerate variants of the nucleic acid fragments of the present invention. By “degenerate variant” is intended nucleotide fragments which differ from a nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to the degeneracy of the Genetic Code, encode an identical polypeptide sequence.
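The definition of a degenerate variant given above can be checked computationally. The following is a minimal sketch, assuming the standard genetic code and in-frame coding sequences; the function names `translate` and `is_degenerate_variant` and the short example sequences are hypothetical and do not come from the Tables or SEQ ID NOS of the present invention.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_degenerate_variant(reference: str, candidate: str) -> bool:
    """True if the nucleotide sequences differ but encode the same polypeptide."""
    different_dna = reference.upper() != candidate.upper()
    return different_dna and translate(reference) == translate(candidate)


# Made-up example: both sequences encode Met-Ala-Leu followed by a stop codon.
reference = "ATGGCTTTATAA"
variant = "ATGGCCCTGTGA"
non_variant = "ATGGATTTATAA"  # encodes Asp in place of Ala
```

In this sketch `is_degenerate_variant(reference, variant)` is true, whereas `non_variant` changes the encoded polypeptide and is therefore not a degenerate variant.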

Preferred nucleic acid fragments of the present invention are the ORFs and subfragments thereof depicted in Tables 2 and 3 which encode proteins.

A variety of methodologies known in the art can be utilized to obtain any one of the isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid sequence can be synthesized using commercially available peptide synthesizers. This is particularly useful in producing small peptides and fragments of larger polypeptides. Such short fragments as may be obtained most readily by synthesis are useful, for example, in generating antibodies against the native polypeptide, as discussed further below.

In an alternative method, the polypeptide or protein is purified from bacterial cells which naturally produce the polypeptide or protein. One skilled in the art can readily employ well-known methods for isolating polypeptides and proteins to isolate and purify polypeptides or proteins of the present invention produced naturally by a bacterial strain, or by other methods. Methods for isolation and purification that can be employed in this regard include, but are not limited to, immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography, and immuno-affinity chromatography.

The polypeptides and proteins of the present invention also can be purified from cells which have been altered to express the desired polypeptide or protein. As used herein, a cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic manipulation, is made to produce a polypeptide or protein which it normally does not produce or which the cell normally produces at a lower level. Those skilled in the art can readily adapt procedures for introducing and expressing either recombinant or synthetic sequences into eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides or proteins of the present invention.

Any host/vector system can be used to express one or more of the ORFs of the present invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells, COS cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis. The most preferred cells are those which do not normally express the particular polypeptide or protein or which express the polypeptide or protein at a low natural level.

“Recombinant,” as used herein, means that a polypeptide or protein is derived from recombinant (e.g., microbial or mammalian) expression systems. “Microbial” refers to recombinant polypeptides or proteins made in bacterial or fungal (e.g., yeast) expression systems. As a product, “recombinant microbial” defines a polypeptide or protein essentially free of native endogenous substances and unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or proteins expressed in yeast will have a glycosylation pattern different from that expressed in mammalian cells.

“Nucleotide sequence” refers to a heteropolymer of deoxyribonucleotides. Generally, DNA segments encoding the polypeptides and proteins provided by this invention are assembled from fragments of the Streptococcus pneumoniae genome and short oligonucleotide linkers, or from a series of oligonucleotides, to provide a synthetic gene which is capable of being expressed in a recombinant transcriptional unit comprising regulatory elements derived from a microbial or viral operon.

“Recombinant expression vehicle or vector” refers to a plasmid or phage or virus or vector, for expressing a polypeptide from a DNA (RNA) sequence. The expression vehicle can comprise a transcriptional unit comprising an assembly of (1) a genetic regulatory element or elements necessary for gene expression in the host, including elements required to initiate and maintain transcription at a level sufficient for suitable expression of the desired polypeptide, including, for example, promoters and, where necessary, an enhancer and a polyadenylation signal; (2) a structural or coding sequence which is transcribed into mRNA and translated into protein; and (3) appropriate signals to initiate translation at the beginning of the desired coding region and terminate translation at its end. Structural units intended for use in yeast or eukaryotic expression systems preferably include a leader sequence enabling extracellular secretion of translated protein by a host cell. Alternatively, where recombinant protein is expressed without a leader or transport sequence, it may include an N-terminal methionine residue. This residue may or may not be subsequently cleaved from the expressed recombinant protein to provide a final product.

“Recombinant expression system” means host cells which have stably integrated a recombinant transcriptional unit into chromosomal DNA or carry the recombinant transcriptional unit extrachromosomally. The cells can be prokaryotic or eukaryotic. Recombinant expression systems as defined herein will express heterologous polypeptides or proteins upon induction of the regulatory elements linked to the DNA segment or synthetic gene to be expressed.

Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate promoters. Cell-free translation systems can also be employed to produce such proteins using RNAs derived from the DNA constructs of the present invention. Appropriate cloning and expression vectors for use with prokaryotic and eukaryotic hosts are described in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989), the disclosure of which is hereby incorporated by reference in its entirety.

Generally, recombinant expression vectors will include origins of replication and selectable markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a promoter derived from a highly expressed gene to direct transcription of a downstream structural sequence. Such promoters can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), alpha-factor, acid phosphatase, or heat shock proteins, among others. The heterologous structural sequence is assembled in appropriate phase with translation initiation and termination sequences, and preferably, a leader sequence capable of directing secretion of translated protein into the periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a fusion protein including an N-terminal identification peptide imparting desired characteristics, e.g., stabilization or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a structural DNA sequence encoding a desired protein together with suitable translation initiation and termination signals in operable reading phase with a functional promoter. The vector will comprise one or more phenotypic selectable markers and an origin of replication to ensure maintenance of the vector and, when desirable, provide amplification within the host.

Suitable prokaryotic hosts for transformation include strains of E. coli, B. subtilis, Salmonella typhimurium and various species within the genera Pseudomonas and Streptomyces. Others may also be employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use can comprise a selectable marker and bacterial origin of replication derived from commercially available plasmids comprising genetic elements of the well known cloning vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3 (available from Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM 1 (available from Promega Biotec, Madison, Wis., USA). These pBR322 “backbone” sections are combined with an appropriate promoter and the structural sequence to be expressed.

Following transformation of a suitable host strain and growth of the host strain to an appropriate cell density, the selected promoter, where it is inducible, is derepressed or induced by appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an additional period to provide for expression of the induced gene product. Thereafter cells are typically harvested, generally by centrifugation, disrupted to release expressed protein, generally by physical or chemical means, and the resulting crude extract is retained for further purification.

Various mammalian cell culture systems can also be employed to express recombinant protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney fibroblasts, described in Gluzman, Cell 23:175 (1981), and other cell lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa and BHK cell lines.

Mammalian expression vectors will comprise an origin of replication, a suitable promoter and enhancer, and also any necessary ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional termination sequences, and 5′ flanking nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example, SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide the required nontranscribed genetic elements.

Recombinant polypeptides and proteins produced in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or more salting-out, aqueous ion exchange or size exclusion chromatography steps. Microbial cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents. Protein refolding steps can be used, as necessary, in completing configuration of the mature protein. Finally, high performance liquid chromatography (HPLC) can be employed for final purification steps.

The present invention further includes isolated polypeptides, proteins and nucleic acid molecules which are substantially equivalent to those herein described. As used herein, substantially equivalent can refer both to nucleic acid and amino acid sequences, for example a mutant sequence, that varies from a reference sequence by one or more substitutions, deletions, or additions, the net effect of which does not result in an adverse functional dissimilarity between reference and subject sequences. For purposes of the present invention, sequences having equivalent biological activity, and equivalent expression characteristics are considered substantially equivalent. For purposes of determining equivalence, truncation of the mature sequence should be disregarded.

The invention further provides methods of obtaining homologs from other strains of Streptococcus pneumoniae, of the fragments of the Streptococcus pneumoniae genome of the present invention and homologs of the proteins encoded by the ORFs of the present invention. As used herein, a sequence or protein of Streptococcus pneumoniae is defined as a homolog of a fragment of the Streptococcus pneumoniae fragments or contigs or a protein encoded by one of the ORFs of the present invention, if it shares significant homology to one of the fragments of the Streptococcus pneumoniae genome of the present invention or a protein encoded by one of the ORFs of the present invention. Specifically, by using the sequence disclosed herein as a probe or as primers, and techniques such as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain homologs.

As used herein, two nucleic acid molecules or proteins are said to “share significant homology” if the two contain regions which possess greater than 85% sequence (amino acid or nucleic acid) homology. Preferred homologs in this regard are those with more than 90% homology. Especially preferred are those with 93% or more homology. Among especially preferred homologs those with 95% or more homology are particularly preferred. Very particularly preferred among these are those with 97% and even more particularly preferred among those are homologs with 99% or more homology. The most preferred homologs among these are those with 99.9% homology or more. It will be understood that, among measures of homology, identity is particularly preferred in this regard.
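The graded homology thresholds recited above can be expressed as a simple classification. The sketch below is illustrative only: it assumes the two regions have already been aligned to equal length and scores identity position by position, which is one possible measure and not a method specified herein. The names `percent_identity` and `homology_tier`, the tier labels, and the treatment of the 97% level as "97% or more" are assumptions made for the example.

```python
from typing import Optional


def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two pre-aligned, equal-length regions."""
    if not a or len(a) != len(b):
        raise ValueError("regions must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def homology_tier(pct: float) -> Optional[str]:
    """Map a percent value onto the preference tiers recited in the text."""
    if pct >= 99.9:
        return "most preferred (99.9% or more)"
    if pct >= 99.0:
        return "even more particularly preferred (99% or more)"
    if pct >= 97.0:  # assumption: the text's "97%" is read as 97% or more
        return "very particularly preferred (97%)"
    if pct >= 95.0:
        return "particularly preferred (95% or more)"
    if pct >= 93.0:
        return "especially preferred (93% or more)"
    if pct > 90.0:
        return "preferred (more than 90%)"
    if pct > 85.0:
        return "significant homology (greater than 85%)"
    return None  # does not meet the "share significant homology" threshold


# Made-up ten-base regions differing at a single position.
pct = percent_identity("ACGTACGTAC", "ACGTACGTAA")
tier = homology_tier(pct)
```

With the made-up regions shown, nine of ten positions match, which exceeds 85% but not 90%, so the pair falls in the lowest qualifying tier.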

Region specific primers or probes derived from the nucleotide sequence provided in SEQ ID NOS:1-391 or from a nucleotide sequence at least 95%, particularly at least 99%, especially at least 99.5% identical to a sequence of SEQ ID NOS:1-391 can be used to prime DNA synthesis and PCR amplification, as well as to identify colonies containing cloned DNA encoding a homolog. Methods suitable to this aspect of the present invention are well known and have been described in great detail in many publications such as, for example, Innis et al., PCR Protocols, Academic Press, San Diego, Calif. (1990).

When using primers derived from SEQ ID NOS:1-391 or from a nucleotide sequence having an aforementioned identity to a sequence of SEQ ID NOS:1-391, one skilled in the art will recognize that by employing high stringency conditions (e.g., annealing at 50-60° C. in 6×SSC and 50% formamide, and washing at 50-65° C. in 0.5×SSC) only sequences which are greater than 75% homologous to the primer will be amplified. By employing lower stringency conditions (e.g., hybridizing at 35-37° C. in 5×SSC and 40-45% formamide, and washing at 42° C. in 0.5×SSC), sequences which are greater than 40-50% homologous to the primer will also be amplified.

When using DNA probes derived from SEQ ID NOS:1-391, or from a nucleotide sequence having an aforementioned identity to a sequence of SEQ ID NOS:1-391, for colony/plaque hybridization, one skilled in the art will recognize that by employing high stringency conditions (e.g., hybridizing at 50-65° C. in 5×SSC and 50% formamide, and washing at 50-65° C. in 0.5×SSC), sequences having regions which are greater than 90% homologous to the probe can be obtained, and that by employing lower stringency conditions (e.g., hybridizing at 35-37° C. in 5×SSC and 40-45% formamide, and washing at 42° C. in 0.5×SSC), sequences having regions which are greater than 35-45% homologous to the probe will be obtained.

Any organism can be used as the source for homologs of the present invention so long as the organism naturally expresses such a protein or contains genes encoding the same. The most preferred organisms for isolating homologs are bacteria which are closely related to Streptococcus pneumoniae.

ILLUSTRATIVE USES OF COMPOSITIONS OF THE INVENTION

Each ORF provided in Tables 1 and 2 is identified with a function by homology to a known gene or polypeptide. As a result, one skilled in the art can use the polypeptides of the present invention for commercial, therapeutic and industrial purposes consistent with the type of putative identification of the polypeptide. Such identifications permit one skilled in the art to use the Streptococcus pneumoniae ORFs in a manner similar to the known type of sequences for which the identification is made; for example, to ferment a particular sugar source or to produce a particular metabolite. A variety of reviews illustrative of this aspect of the invention are available, including the following reviews on the industrial use of enzymes, for example, BIOCHEMICAL ENGINEERING AND BIOTECHNOLOGY HANDBOOK, 2nd Ed., MacMillan Publications, Ltd. NY (1991) and BIOCATALYSTS IN ORGANIC SYNTHESES, Tramper et al., Eds., Elsevier Science Publishers, Amsterdam, The Netherlands (1985). A variety of exemplary uses that illustrate this and similar aspects of the present invention are discussed below.

1. Biosynthetic Enzymes

Open reading frames encoding proteins that mediate the catalytic reactions of intermediary and macromolecular metabolism, the biosynthesis of small molecules, cellular processes and other functions can be used for industrial biosynthesis. These include enzymes involved in the degradation of the intermediary products of metabolism, enzymes involved in central intermediary metabolism, enzymes involved in respiration, both aerobic and anaerobic, enzymes involved in fermentation, enzymes involved in ATP proton motive force conversion, enzymes involved in broad regulatory function, enzymes involved in amino acid synthesis, enzymes involved in nucleotide synthesis, and enzymes involved in cofactor and vitamin synthesis.

The various metabolic pathways present in Streptococcus pneumoniae can be identified based on absolute nutritional requirements as well as by examining the various enzymes identified in Tables 1-3 and SEQ ID NOS:1-391.

Of particular interest are polypeptides involved in the degradation of intermediary metabolites as well as non-macromolecular metabolism. Such enzymes include amylases, glucose oxidases, and catalase.

Proteolytic enzymes are another class of commercially important enzymes. Proteolytic enzymes find use in a number of industrial processes including the processing of flax and other vegetable fibers, in the extraction, clarification and depectinization of fruit juices, in the extraction of vegetable oil and in the maceration of fruits and vegetables to give unicellular fruits. A detailed review of the proteolytic enzymes used in the food industry is provided in Rombouts et al., Symbiosis 21:79 (1986) and Voragen et al. in Biocatalysts In Agricultural Biotechnology, Whitaker et al., Eds., American Chemical Society Symposium Series 389:93 (1989).

The metabolism of sugars is an important aspect of the primary metabolism of Streptococcus pneumoniae. Enzymes involved in the degradation of sugars, such as, particularly, glucose, galactose, fructose and xylose, can be used in industrial fermentation. Some of the important sugar transforming enzymes, from a commercial viewpoint, include sugar isomerases such as glucose isomerase. Other metabolic enzymes have found commercial use, such as glucose oxidases which produce ketogulonic acid (KGA). KGA is an intermediate in the commercial production of ascorbic acid using Reichstein's procedure, as described in Krueger et al., Biotechnology 6(A , Rhine et al., Eds., Verlag Press, Weinheim, Germany (1984).

Glucose oxidase (GOD) is commercially available and has been used in purified form as well as in an immobilized form for the deoxygenation of beer. See, for instance, Hartmeir et al., Biotechnology Letters 1:21 (1979). The most important application of GOD is the industrial scale fermentation of gluconic acid. There is a market for gluconic acids, which are used in the detergent, textile, leather, photographic, pharmaceutical, food, feed and concrete industries, as described, for example, in Bigelis et al., beginning on page 357 in GENE MANIPULATIONS AND FUNGI; Benett et al., Eds., Academic Press, New York (1985). In addition to industrial applications, GOD has found applications in medicine for quantitative determination of glucose in body fluids and, recently, in biotechnology for analyzing syrups from starch and cellulose hydrolysates. This application is described in Owusu et al., Biochim. et Biophys. Acta 872:83 (1986), for instance.

The main sweetener used in the world today is sugar which comes from sugar beets and sugar cane. In the field of industrial enzymes, the glucose isomerase process shows the largest expansion in the market today. Initially, soluble enzymes were used and later immobilized enzymes were developed (Krueger et al., Biotechnology, The Textbook of Industrial Microbiology, Sinauer Associated Incorporated, Sunderland, Mass. (1990)). Today, the use of glucose-produced high fructose syrups is by far the largest industrial business using immobilized enzymes. A review of the industrial use of these enzymes is provided by Jorgensen, Starch 40:307 (1988).

Proteinases, such as alkaline serine proteinases, are used as detergent additives and thus represent one of the largest volumes of microbial enzymes used in the industrial sector. Because of their industrial importance, there is a large body of published and unpublished information regarding the use of these enzymes in industrial processes. (See Faultman et al., Acid Proteases Structure Function and Biology, Tang, J., ed., Plenum Press, New York (1977) and Godfrey et al., Industrial Enzymes, MacMillan Publishers, Surrey, UK (1983) and Hepner et al., Report Industrial Enzymes by 1990, Hel Hepner & Associates, London (1986)).

Another class of commercially usable proteins of the present invention are the microbial lipases, described by, for instance, Macrae et al., Philosophical Transactions of the Royal Society of London 310:227 (1985) and Poserke, Journal of the American Oil Chemist Society 61:1758 (1984). A major use of lipases is in the fat and oil industry for the production of neutral glycerides using lipase catalyzed inter-esterification of readily available triglycerides. Applications of lipases include use as a detergent additive to facilitate the removal of fats from fabrics in the course of the washing procedures.

The use of enzymes, and in particular microbial enzymes, as catalysts for key steps in the synthesis of complex organic molecules is gaining popularity at a great rate. One area of great interest is the preparation of chiral intermediates. Preparation of chiral intermediates is of interest to a wide range of synthetic chemists, particularly those scientists involved with the preparation of new pharmaceuticals, agrochemicals, fragrances and flavors. (See Davies et al., Recent Advances in the Generation of Chiral Intermediates Using Enzymes, CRC Press, Boca Raton, Fla. (1990)). The following reactions catalyzed by enzymes are of interest to organic chemists: hydrolysis of carboxylic acid esters, phosphate esters, amides and nitriles, esterification reactions, trans-esterification reactions, synthesis of amides, reduction of alkanones and oxoalkanoates, oxidation of alcohols to carbonyl compounds, oxidation of sulfides to sulfoxides, and carbon bond forming reactions such as the aldol reaction.

When considering the use of an enzyme encoded by one of the ORFs of the present invention for biotransformation and organic synthesis, it is sometimes necessary to consider the respective advantages and disadvantages of using a microorganism as opposed to an isolated enzyme. The pros and cons of using a whole cell system on the one hand or an isolated, partially purified enzyme on the other hand have been described in detail by Bud et al., Chemistry in Britain (1987), p. 127.

Amino transferases, enzymes involved in the biosynthesis and metabolism of amino acids, are useful in the catalytic production of amino acids. The advantage of using microbial based enzyme systems is that the amino transferase enzymes catalyze the stereo-selective synthesis of only L-amino acids and generally possess uniformly high catalytic rates. A description of the use of amino transferases for amino acid production is provided by Roselle-David, Methods in Enzymology 136:479 (1987).

Another category of useful proteins encoded by the ORFs of the present invention includes enzymes involved in nucleic acid synthesis, repair, and recombination.

2. Generation of Antibodies

As described here, the proteins of the present invention, as well as homologs thereof, can be used in a variety of procedures and methods known in the art which are currently applied to other proteins. The proteins of the present invention can further be used to generate an antibody which selectively binds the protein. Such antibodies can be either monoclonal or polyclonal antibodies, as well as fragments of these antibodies, and humanized forms.

The invention further provides antibodies which selectively bind to one of the proteins of the present invention and hybridomas which produce these antibodies. A hybridoma is an immortalized cell line which is capable of secreting a specific monoclonal antibody.

In general, techniques for preparing polyclonal and monoclonal antibodies as well as hybridomas capable of producing the desired antibody are well known in the art (Campbell, A. M., Monoclonal Antibody Technology: Laboratory Techniques In Biochemistry And Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands (1984); St. Groth et al., J. Immunol. Methods 35:1-21 (1980); Kohler and Milstein, Nature 256:495-497 (1975)), and include the trioma technique and the human B-cell hybridoma technique (Kozbor et al., Immunology Today 4:72 (1983); pgs. 77-96 of Cole et al., in Monoclonal Antibodies And Cancer Therapy, Alan R. Liss, Inc. (1985)). Any animal (mouse, rabbit, etc.) which is known to produce antibodies can be immunized with the polypeptide of the present invention. Methods for immunization are well known in the art. Such methods include subcutaneous or intraperitoneal injection of the polypeptide. One skilled in the art will recognize that the amount of the protein encoded by the ORF of the present invention used for immunization will vary based on the animal which is immunized, the antigenicity of the peptide and the site of injection.

The protein which is used as an immunogen may be modified or administered in an adjuvant in order to increase the protein's antigenicity. Methods of increasing the antigenicity of a protein are well known in the art and include, but are not limited to, coupling the antigen with a heterologous protein (such as globulin or galactosidase) or the inclusion of an adjuvant during immunization.

For monoclonal antibodies, spleen cells from the immunized animals are removed, fused with myeloma cells, such as SP2/0-Ag14 myeloma cells, and allowed to become monoclonal antibody producing hybridoma cells.

Any one of a number of methods well known in the art can be used to identify the hybridoma cell which produces an antibody with the desired characteristics. These include screening the hybridomas with an ELISA assay, western blot analysis, or radioimmunoassay (Lutz et al., Exp. Cell Res. 175:109-124 (1988)).

Hybridomas secreting the desired antibodies are cloned and the class and subclass is determined using procedures known in the art (Campbell, A. M., Monoclonal Antibody Technology: Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands (1984)).

Techniques described for the production of single chain antibodies (U.S. Pat. No. 4,946,778) can be adapted to produce single chain antibodies to proteins of the present invention.

For polyclonal antibodies, antibody containing antisera is isolated from the immunized animal and is screened for the presence of antibodies with the desired specificity using one of the above-described procedures.

The present invention further provides the above-described antibodies in detectably labelled form. Antibodies can be detectably labelled through the use of radioisotopes, affinity labels (such as biotin, avidin, etc.), enzymatic labels (such as horseradish peroxidase, alkaline phosphatase, etc.), fluorescent labels (such as FITC or rhodamine, etc.), paramagnetic atoms, etc. Procedures for accomplishing such labeling are well-known in the art; for example, see Sternberger et al., J. Histochem. Cytochem. 18:315 (1970); Bayer, E. A. et al., Meth. Enzym. 62:308 (1979); Engval, E. et al., Immunol. 109:129 (1972); Goding, J. W., J. Immunol. Meth. 13:215 (1976).

The labeled antibodies of the present invention can be used for in vitro, in vivo, and in situ assays to identify cells or tissues in which a fragment of the Streptococcus pneumoniae genome is expressed.

The present invention further provides the above-described antibodies immobilized on a solid support. Examples of such solid supports include plastics such as polycarbonate, complex carbohydrates such as agarose and sepharose, acrylic resins such as polyacrylamide, and latex beads. Techniques for coupling antibodies to such solid supports are well known in the art (Weir, D. M. et al., “Handbook of Experimental Immunology” 4th Ed., Blackwell Scientific Publications, Oxford, England, Chapter 10 (1986); Jacoby, W. D. et al., Meth. Enzym. 34 Academic Press, N.Y. (1974)). The immobilized antibodies of the present invention can be used for in vitro, in vivo, and in situ assays as well as for immunoaffinity purification of the proteins of the present invention.

3. Diagnostic Assays and Kits

The present invention further provides methods to identify the expression of one of the ORFs of the present invention, or homolog thereof, in a test sample, using one of the DFs or antibodies of the present invention.

In detail, such methods comprise incubating a test sample with one or more of the antibodies or one or more of the DFs of the present invention and assaying for binding of the DFs or antibodies to components within the test sample.

Conditions for incubating a DF or antibody with a test sample vary. Incubation conditions depend on the format employed in the assay, the detection methods employed, and the type and nature of the DF or antibody used in the assay. One skilled in the art will recognize that any one of the commonly available hybridization, amplification or immunological assay formats can readily be adapted to employ the DFs or antibodies of the present invention. Examples of such assays can be found in Chard, T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986); Bullock, G. R. et al., Techniques in Immunocytochemistry, Academic Press, Orlando, Fla. Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice and Theory of Enzyme Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands (1985).

The test samples of the present invention include cells, protein or membrane extracts of cells, or biological fluids such as sputum, blood, serum, plasma, or urine. The test sample used in the above-described method will vary based on the assay format, nature of the detection method and the tissues, cells or extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane extracts of cells are well known in the art and can readily be adapted in order to obtain a sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry out the assays of the present invention.

Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers which comprises: (a) a first container comprising one of the DFs or antibodies of the present invention; and (b) one or more other containers comprising one or more of the following: wash reagents, reagents capable of detecting presence of a bound DF or antibody.

In detail, a compartmentalized kit includes any kit in which reagents are contained in separate containers. Such containers include small glass containers, plastic containers or strips of plastic or paper. Such containers allow one to efficiently transfer reagents from one compartment to another compartment such that the samples and reagents are not cross-contaminated, and the agents or solutions of each container can be added in a quantitative fashion from one compartment to another. Such containers will include a container which will accept the test sample, a container which contains the antibodies used in the assay, containers which contain wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which contain the reagents used to detect the bound antibody or DF.

Types of detection reagents include labelled nucleic acid probes, labelled secondary antibodies, or in the alternative, if the primary antibody is labelled, the enzymatic, or antibody binding reagents which are capable of reacting with the labelled antibody. One skilled in the art will readily recognize that the disclosed DFs and antibodies of the present invention can be readily incorporated into one of the established kit formats which are well known in the art.

4. Screening Assay for Binding Agents

Using the isolated proteins of the present invention, the present invention further provides methods of obtaining and identifying agents which bind to a protein encoded by one of the ORFs of the present invention or to one of the fragments and the Streptococcus pneumoniae fragment and contigs herein described.

In general, such methods comprise steps of:

(a) contacting an agent with an isolated protein encoded by one of the ORFs of the present invention, or an isolated fragment of the Streptococcus pneumoniae genome; and

(b) determining whether the agent binds to said protein or said fragment.

The agents screened in the above assay can be, but are not limited to, peptides, carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected and screened at random or rationally selected or designed using protein modeling techniques.

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and the like are selected at random and are assayed for their ability to bind to the protein encoded by the ORF of the present invention.

Alternatively, agents may be rationally selected or designed. As used herein, an agent is said to be “rationally selected or designed” when the agent is chosen based on the configuration of the particular protein. For example, one skilled in the art can readily adapt currently available procedures to generate peptides, pharmaceutical agents and the like capable of binding to a specific peptide sequence in order to generate rationally designed antipeptide peptides, for example see Hurby et al., “Application of Synthetic Peptides: Antisense Peptides,” in Synthetic Peptides, A User's Guide, W. H. Freeman, NY (1992), pp. 289-307, and Kaspczak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly described, can be used to control gene expression through binding to one of the ORFs or EMFs of the present invention. As described above, such agents can be randomly screened or rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design sequence specific or element specific agents, modulating the expression of either a single ORF or multiple ORFs which rely on the same EMF for expression control.

One class of DNA binding agents are agents which contain base residues which hybridize or form a triple helix by binding to DNA or RNA. Such agents can be based on the classic phosphodiester, ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have base attachment capacity.

Agents suitable for use in these methods usually contain 20 to 40 bases and are designed to be complementary to a region of the gene involved in transcription (triple helix—see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et al., Science 251:1360 (1991)) or to the mRNA itself (antisense—Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca Raton, Fla. (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide. Both techniques have been demonstrated to be effective in model systems. Information contained in the sequences of the present invention can be used to design antisense and triple helix-forming oligonucleotides, and other DNA binding agents.
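As a rough illustration of how sequence information could be used to draft an antisense oligonucleotide, the following minimal sketch returns the DNA reverse complement of a chosen 20 to 40 base window of an mRNA. The function name, the window parameters, and the example mRNA string are hypothetical and are not taken from the sequences of the present invention; a practical design would also weigh target accessibility, specificity, and backbone chemistry, none of which is modeled here.

```python
# Hypothetical helper: draft an antisense oligodeoxynucleotide for an mRNA window.
# mRNA base -> complementary DNA base
_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna, start, length=20):
    """Return the DNA reverse complement of mrna[start:start+length].

    The length is restricted to the 20-40 base range mentioned in the text.
    """
    if not 20 <= length <= 40:
        raise ValueError("length must be between 20 and 40 bases")
    target = mrna[start:start + length].upper()
    if len(target) < length:
        raise ValueError("window runs past the end of the mRNA")
    if set(target) - set("ACGU"):
        raise ValueError("mRNA may contain only A, C, G and U")
    # Complement each base, then reverse so the oligo reads 5' to 3'.
    return target.translate(_COMPLEMENT)[::-1]


# Made-up mRNA fragment used purely for illustration.
example_mrna = "AUGGCUAAAGGUCCUGAACGUAUCGGA"
oligo = antisense_oligo(example_mrna, 0, 20)
print(oligo)
```

The oligonucleotide is written 5' to 3', so it pairs antiparallel with the targeted mRNA window.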

5. Pharmaceutical Compositions and Vaccines

The present invention further provides pharmaceutical agents which can be used to modulate the growth or pathogenicity of Streptococcus pneumoniae, or another related organism, in vivo or in vitro. As used herein, a “pharmaceutical agent” is defined as a composition of matter which can be formulated using known techniques to provide a pharmaceutical composition. As used herein, the “pharmaceutical agents of the present invention” refers to the pharmaceutical agents which are derived from the proteins encoded by the ORFs of the present invention or are agents which are identified using the herein described assays.

As used herein, a pharmaceutical agent is said to “modulate the growth or pathogenicity of Streptococcus pneumoniae or a related organism, in vivo or in vitro,” when the agent reduces the rate of growth, rate of division, or viability of the organism in question. The pharmaceutical agents of the present invention can modulate the growth or pathogenicity of an organism in many fashions, although an understanding of the underlying mechanism of action is not needed to practice the use of the pharmaceutical agents of the present invention. Some agents will modulate the growth by binding to an important protein thus blocking the biological activity of the protein, while other agents may bind to a component of the outer surface of the organism, blocking attachment or rendering the organism more prone to attack by the body's natural immune system. Alternatively, the agent may comprise a protein encoded by one of the ORFs of the present invention and serve as a vaccine. The development and use of a vaccine based on outer membrane components are well known in the art.

As used herein, a “related organism” is a broad term which refers to any organism whose growth can be modulated by one of the pharmaceutical agents of the present invention. In general, such an organism will contain a homolog of the protein which is the target of the pharmaceutical agent or the protein used as a vaccine. As such, related organisms do not need to be bacterial but may be fungal or viral pathogens.

The pharmaceutical agents and compositions of the present invention may be administered in a convenient manner, such as by the oral, topical, intravenous, intraperitoneal, intramuscular, subcutaneous, intranasal or intradermal routes. The pharmaceutical compositions are administered in an amount which is effective for treating and/or prophylaxis of the specific indication. In general, they are administered in an amount of at least about 1 mg/kg body weight and in most cases they will be administered in an amount not in excess of about 1 g/kg body weight per day. In most cases, the dosage is from about 0.1 mg/kg to about 10 g/kg body weight daily, taking into account the routes of administration, symptoms, etc.

The agents of the present invention can be used in native form or can be modified to form a chemical derivative. As used herein, a molecule is said to be a “chemical derivative” of another molecule when it contains additional chemical moieties not normally a part of the molecule. Such moieties may improve the molecule's solubility, absorption, biological half life, etc. The moieties may alternatively decrease the toxicity of the molecule, eliminate or attenuate any undesirable side effect of the molecule, etc. Moieties capable of mediating such effects are disclosed in, among other sources, REMINGTON'S PHARMACEUTICAL SCIENCES (1980) cited elsewhere herein.

For example, such moieties may change an immunological character of the functional derivative, such as affinity for a given antibody. Such changes in immunomodulation activity are measured by the appropriate assay, such as a competitive type immunoassay. Modifications of such protein properties as redox or thermal stability, biological half-life, hydrophobicity, susceptibility to proteolytic degradation or the tendency to aggregate with carriers or into multimers also may be effected in this way and can be assayed by methods well known to the skilled artisan.

The therapeutic effects of the agents of the present invention may be obtained by providing the agent to a patient by any suitable means (e.g., inhalation, intravenously, intramuscularly, subcutaneously, enterally, or parenterally). It is preferred to administer the agent of the present invention so as to achieve an effective concentration within the blood or tissue in which the growth of the organism is to be controlled. To achieve an effective blood concentration, the preferred method is to administer the agent by injection. The administration may be by continuous infusion, or by single or multiple injections.

In providing a patient with one of the agents of the present invention, the dosage of the administered agent will vary depending upon such factors as the patient's age, weight, height, sex, general medical condition, previous medical history, etc. In general, it is desirable to provide the recipient with a dosage of agent which is in the range of from about 1 pg/kg to 10 mg/kg (body weight of patient), although a lower or higher dosage may be administered. The therapeutically effective dose can be lowered by using combinations of the agents of the present invention or another agent.

As used herein, two or more compounds or agents are said to be administered “in combination” with each other when either (1) the physiological effects of each compound, or (2) the serum concentrations of each compound can be measured at the same time. The composition of the present invention can be administered concurrently with, prior to, or following the administration of the other agent.

The agents of the present invention are intended to be provided to recipient subjects in an amount sufficient to decrease the rate of growth (as defined above) of the target organism.

The administration of the agent(s) of the invention may be for either a “prophylactic” or “therapeutic” purpose. When provided prophylactically, the agent(s) are provided in advance of any symptoms indicative of the organism's growth. The prophylactic administration of the agent(s) serves to prevent, attenuate, or decrease the rate of onset of any subsequent infection. When provided therapeutically, the agent(s) are provided at (or shortly after) the onset of an indication of infection. The therapeutic administration of the compound(s) serves to attenuate the pathological symptoms of the infection and to increase the rate of recovery.

The agents of the present invention are administered to a subject, such as a mammal, or a patient, in a pharmaceutically acceptable form and in a therapeutically effective concentration. A composition is said to be “pharmacologically acceptable” if its administration can be tolerated by a recipient patient. Such an agent is said to be administered in a “therapeutically effective amount” if the amount administered is physiologically significant. An agent is physiologically significant if its presence results in a detectable change in the physiology of a recipient patient.

The agents of the present invention can be formulated according to known methods to prepare pharmaceutically useful compositions, whereby these materials, or their functional derivatives, are combined in a mixture with a pharmaceutically acceptable carrier vehicle. Suitable vehicles and their formulation, inclusive of other human proteins, e.g., human serum albumin, are described, for example, in REMINGTON'S PHARMACEUTICAL SCIENCES, 16th Ed., Osol, A., Ed., Mack Publishing, Easton Pa. (1980). In order to form a pharmaceutically acceptable composition suitable for effective administration, such compositions will contain an effective amount of one or more of the agents of the present invention, together with a suitable amount of carrier vehicle.

Additional pharmaceutical methods may be employed to control the duration of action. Controlled release preparations may be achieved through the use of polymers to complex or absorb one or more of the agents of the present invention. The controlled delivery may be effectuated by a variety of well known techniques, including formulation with macromolecules such as, for example, polyesters, polyamino acids, polyvinylpyrrolidone, ethylenevinylacetate, methylcellulose, carboxymethylcellulose, or protamine sulfate, adjusting the concentration of the macromolecules and the agent in the formulation, and by appropriate use of methods of incorporation, which can be manipulated to effectuate a desired time course of release. Another possible method to control the duration of action by controlled release preparations is to incorporate agents of the present invention into particles of a polymeric material such as polyesters, polyamino acids, hydrogels, poly(lactic acid) or ethylene vinylacetate copolymers. Alternatively, instead of incorporating these agents into polymeric particles, it is possible to entrap these materials in microcapsules prepared, for example, by coacervation techniques or by interfacial polymerization with, for example, hydroxymethylcellulose or gelatine-microcapsules and poly(methylmethacrylate) microcapsules, respectively, or in colloidal drug delivery systems, for example, liposomes, albumin microspheres, microemulsions, nanoparticles, and nanocapsules or in macroemulsions. Such techniques are disclosed in REMINGTON'S PHARMACEUTICAL SCIENCES (1980).

The invention further provides a pharmaceutical pack or kit comprising one or more containers filled with one or more of the ingredients of the pharmaceutical compositions of the invention. Associated with such container(s) can be a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals or biological products, which notice reflects approval by the agency of manufacture, use or sale for human administration.

In addition, the agents of the present invention may be employed in conjunction with other therapeutic compounds.

6. Shot-Gun Approach to Megabase DNA Sequencing

The present invention further demonstrates that a large sequence can be sequenced using a random shotgun approach. This procedure, described in detail in the examples that follow, has eliminated the up front cost of isolating and ordering overlapping or contiguous subclones prior to the start of the sequencing protocols.

Certain aspects of the present invention are described in greater detail in the examples that follow. The examples are provided by way of illustration. Other aspects and embodiments of the present invention are contemplated by the inventors, as will be clear to those of skill in the art from reading the present disclosure.

ILLUSTRATIVE EXAMPLES

Libraries and Sequencing

1. Shotgun Sequencing Probability Analysis

The overall strategy for a shotgun approach to whole genome sequencing follows from the Lander and Waterman (Lander and Waterman, Genomics 2:231 (1988)) application of the equation for the Poisson distribution. According to this treatment, the probability, P, that any given base in a sequence of size L, in nucleotides, is not sequenced after a certain amount, n, in nucleotides, of random sequence has been determined can be calculated by the equation $P = e^{-m}$, where m is n/L, the fold coverage. For instance, for a genome of 2.8 Mb, m=1 when 2.8 Mb of sequence has been randomly generated (1× coverage). At that point, $P = e^{-1} = 0.37$. The probability that any given base has not been sequenced is the same as the probability that any region of the whole sequence L has not been determined and, therefore, is equivalent to the fraction of the whole sequence that has yet to be determined. Thus, at one-fold coverage, approximately 37% of a polynucleotide of size L, in nucleotides, has not been sequenced. When 14 Mb of sequence has been generated, coverage is 5× for a 2.8 Mb genome and the unsequenced fraction drops to 0.0067 or 0.67%. 5× coverage of a 2.8 Mb sequence can be attained by sequencing approximately 17,000 random clones from both insert ends with an average sequence read length of 410 bp.

Similarly, the total gap length, G, is determined by the equation $G = Le^{-m}$, and the average gap size, g, follows the equation g=L/n, where n here denotes the number of random sequence reads rather than the number of nucleotides sequenced. Thus, 5× coverage leaves about 240 gaps averaging about 82 bp in size in a polynucleotide 2.8 Mb long.

The treatment above is essentially that of Lander and Waterman, Genomics 2: 231 (1988).
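The figures quoted above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming the average gap size is the genome length divided by the number of sequence reads (two reads per clone); the function name and the returned dictionary keys are illustrative choices, not part of the described method.

```python
import math


def shotgun_stats(genome_len, n_reads, read_len):
    """Poisson (Lander-Waterman style) estimates for random shotgun sequencing."""
    total_bases = n_reads * read_len
    m = total_bases / genome_len           # fold coverage, m = n/L
    p_unsequenced = math.exp(-m)           # P = e^-m
    total_gap = genome_len * p_unsequenced  # G = L * e^-m
    avg_gap = genome_len / n_reads         # g = L / (number of reads)
    return {
        "coverage": m,
        "unsequenced_fraction": p_unsequenced,
        "total_gap_bp": total_gap,
        "avg_gap_bp": avg_gap,
        "n_gaps": total_gap / avg_gap,
    }


# About 17,000 clones read from both ends, 410 bp per read, 2.8 Mb genome.
stats = shotgun_stats(2_800_000, 2 * 17_000, 410)
for name, value in stats.items():
    print(f"{name}: {value:.4g}")
```

With these inputs the coverage comes out just under 5×, the average gap is about 82 bp, and the estimated gap count is in the low 200s, in line with the approximate values given in the text.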

2. Random Library Construction

In order to approximate the random model described above during actual sequencing, a nearly ideal library of cloned genomic fragments is required. The following library construction procedure was developed to achieve this end.

Streptococcus pneumoniae DNA is prepared by phenol extraction. A mixture containing 200 μg DNA in 1.0 ml of 300 mM sodium acetate, 10 mM Tris-HCl, 1 mM Na-EDTA, 50% glycerol is processed through a nebulizer (IPI Medical Products) with a stream of nitrogen adjusted to 35 kPa for 2 minutes. The nebulized DNA is ethanol precipitated and redissolved in 500 μl TE buffer.

To create blunt-ends, a 100 μl aliquot of the resuspended DNA is digested with 5 units of BAL31 nuclease (New England BioLabs) for 10 min at 30° C. in 200 μl BAL31 buffer. The digested DNA is phenol-extracted, ethanol-precipitated, redissolved in 100 μl TE buffer, and then size-fractionated by electrophoresis through a 1.0% low melting temperature agarose gel. The section containing DNA fragments 1.6-2.0 kb in size is excised from the gel, and the LGT agarose is melted and the resulting solution is extracted with phenol to separate the agarose from the DNA. DNA is ethanol precipitated and redissolved in 20 μl of TE buffer for ligation to vector.

A two-step ligation procedure is used to produce a plasmid library with 97% inserts, of which >99% are single inserts. The first ligation mixture (50 μl) contains 2 μg of DNA fragments, 2 μg pUC18 DNA (Pharmacia) cut with SmaI and dephosphorylated with bacterial alkaline phosphatase, and 10 units of T4 ligase (GIBCO/BRL) and is incubated at 14° C. for 4 hr. The ligation mixture then is phenol extracted and ethanol precipitated, and the precipitated DNA is dissolved in 20 μl TE buffer and electrophoresed on a 1.0% low melting agarose gel. Discrete bands in a ladder are visualized by ethidium bromide-staining and UV illumination and identified by size as insert (i), vector (v), v+i, v+2i, v+3i, etc. The portion of the gel containing v+i DNA is excised and the v+i DNA is recovered and resuspended into 20 μl TE. The v+i DNA then is blunt-ended by T4 polymerase treatment for 5 min. at 37° C. in a reaction mixture (50 μl) containing the v+i linears, 500 μM each of the 4 dNTPs, and 9 units of T4 polymerase (New England BioLabs), under recommended buffer conditions. After phenol extraction and ethanol precipitation the repaired v+i linears are dissolved in 20 μl TE. The final ligation to produce circles is carried out in a 50 μl reaction containing 5 μl of v+i linears and 5 units of T4 ligase at 14° C. overnight. After 10 min. at 70° C. the following day, the reaction mixture is stored at −20° C.

This two-stage procedure results in a molecularly random collection of single-insert plasmid recombinants with minimal contamination from double-insert chimeras (<1%) or free vector (<3%).

Since deviation from randomness can arise from propagation of the DNA in the host, E. coli host cells deficient in all recombination and restriction functions (A. Greener, Strategies 3 (1):5 (1990)) are used to prevent rearrangements, deletions, and loss of clones by restriction. Furthermore, transformed cells are plated directly on antibiotic diffusion plates to avoid the usual broth recovery phase which allows multiplication and selection of the most rapidly growing cells.

Plating is carried out as follows. A 100 μl aliquot of Epicurian Coli SURE II Supercompetent Cells (Stratagene 200152) is thawed on ice and transferred to a chilled Falcon 2059 tube on ice. A 1.7 μl aliquot of 1.42 M beta-mercaptoethanol is added to the aliquot of cells to a final concentration of 25 mM. Cells are incubated on ice for 10 min. A 1 μl aliquot of the final ligation is added to the cells and incubated on ice for 30 min. The cells are heat pulsed for 30 sec. at 42° C. and placed back on ice for 2 min. The outgrowth period in liquid culture is eliminated from this protocol in order to minimize the preferential growth of any given transformed cell. Instead the transformation mixture is plated directly on a nutrient rich SOB plate containing a 5 ml bottom layer of SOB agar (5% SOB agar: 20 g tryptone, 5 g yeast extract, 0.5 g NaCl, 1.5% Difco Agar per liter of media). The 5 ml bottom layer is supplemented with 0.4 ml of 50 mg/ml ampicillin per 100 ml SOB agar. The 15 ml top layer of SOB agar is supplemented with 1 ml X-Gal (2%), 1 ml MgCl₂ (1 M), and 1 ml MgSO₄/100 ml SOB agar. The 15 ml top layer is poured just prior to plating. Our titer is approximately 100 colonies/10 μl aliquot of transformation.

All colonies are picked for template preparation regardless of size. Thus, only clones lost due to “poison” DNA or deleterious gene products are deleted from the library, resulting in a slight increase in gap number over that expected.

3. Random DNA Sequencing

High quality double stranded DNA plasmid templates are prepared using a “boiling bead” method developed in collaboration with Advanced Genetic Technology Corp. (Gaithersburg, Md.) (Adams et al., Science 252:1651 (1991); Adams et al., Nature 355:632 (1992)). Plasmid preparation is performed in a 96-well format for all stages of DNA preparation from bacterial growth through final DNA purification. Template concentration is determined using Hoechst Dye and a Millipore Cytofluor. DNA concentrations are not adjusted, but low-yielding templates are identified where possible and not sequenced.

Templates are also prepared from two Streptococcus pneumoniae lambda genomic libraries. An amplified library is constructed in the vector Lambda GEM-12 (Promega) and an unamplified library is constructed in Lambda DASH II (Stratagene). In particular, for the unamplified lambda library, Streptococcus pneumoniae DNA (>100 kb) is partially digested in a reaction mixture (200 μl) containing 50 μg DNA, 1×Sau3AI buffer, 20 units Sau3AI for 6 min. at 23° C. The digested DNA is phenol-extracted and electrophoresed on a 0.5% low melting agarose gel at 2V/cm for 7 hours. Fragments from 15 to 25 kb are excised and recovered in a final volume of 6 μl. One μl of fragments is used with 1 μl of DASHII vector (Stratagene) in the recommended ligation reaction. One μl of the ligation mixture is used per packaging reaction following the recommended protocol with the Gigapack II XL Packaging Extract (Stratagene, #227711). Phage are plated directly without amplification from the packaging mixture (after dilution with 500 μl of recommended SM buffer and chloroform treatment). Yield is about 2.5×10³ pfu/μl. The amplified library is prepared essentially as above except the lambda GEM-12 vector is used. After packaging, about 3.5×10⁴ pfu are plated on the restrictive NM539 host. The lysate is harvested in 2 ml of SM buffer and stored frozen in 7% dimethylsulfoxide. The phage titer is approximately 1×10⁹ pfu/ml.

Liquid lysates (100 μl) are prepared from randomly selected plaques (from the unamplified library) and template is prepared by long-range PCR using T7 and T3 vector-specific primers.

Sequencing reactions are carried out on plasmid and/or PCR templates using the AB Catalyst LabStation with Applied Biosystems PRISM Ready Reaction Dye Primer Cycle Sequencing Kits for the M13 forward (M13-21) and the M13 reverse (M13RP1) primers (Adams et al., Nature 368:474 (1994)). Dye terminator sequencing reactions are carried out on the lambda templates on a Perkin-Elmer 9600 Thermocycler using the Applied Biosystems Ready Reaction Dye Terminator Cycle Sequencing kits. T7 and SP6 primers are used to sequence the ends of the inserts from the Lambda GEM-12 library and T7 and T3 primers are used to sequence the ends of the inserts from the Lambda DASH II library. Sequencing reactions are performed by eight individuals using an average of fourteen AB 373 DNA Sequencers per day. All sequencing reactions are analyzed using the Stretch modification of the AB 373, primarily using a 34 cm well-to-read distance. The overall sequencing success rate is approximately 85% for M13-21 and M13RP1 sequences and 65% for dye-terminator reactions. The average usable read length is 485 bp for M13-21 sequences, 445 bp for M13RP1 sequences, and 375 bp for dye-terminator reactions.

Richards et al., Chapter 28 in AUTOMATED DNA SEQUENCING AND ANALYSIS, M. D. Adams, C. Fields, J. C. Venter, Eds., Academic Press, London, (1994) described the value of using sequence from both ends of sequencing templates to facilitate ordering of contigs in shotgun assembly projects of lambda and cosmid clones. We balance the desirability of both-end sequencing (including the reduced cost of lower total number of templates) against shorter read-lengths for sequencing reactions performed with the M13RP1 (reverse) primer compared to the M13-21 (forward) primer. Approximately one-half of the templates are sequenced from both ends. Random reverse sequencing reactions are done based on successful forward sequencing reactions. Some M13RP1 sequences are obtained in a semi-directed fashion: M13-21 sequences pointing outward at the ends of contigs are chosen for M13RP1 sequencing in an effort to specifically order contigs.

4. Protocol for Automated Cycle Sequencing

The sequencing is carried out using ABI Catalyst robots and AB 373 Automated DNA Sequencers. The Catalyst robot is a publicly available sophisticated pipetting and temperature control robot which has been developed specifically for DNA sequencing reactions. The Catalyst combines pre-aliquoted templates and reaction mixes consisting of deoxy- and dideoxynucleotides, the thermostable Taq DNA polymerase, fluorescently-labelled sequencing primers, and reaction buffer. Reaction mixes and templates are combined in the wells of an aluminum 96-well thermocycling plate. Thirty consecutive cycles of linear amplification (i.e., one primer synthesis) steps are performed including denaturation, annealing of primer and template, and extension; i.e., DNA synthesis. A heated lid with rubber gaskets on the thermocycling plate prevents evaporation without the need for an oil overlay.

Two sequencing protocols are used: one for dye-labelled primers and a second for dye-labelled dideoxy chain terminators. The shotgun sequencing involves use of four dye-labelled sequencing primers, one for each of the four terminator nucleotides. Each dye-primer is labelled with a different fluorescent dye, permitting the four individual reactions to be combined into one lane of the 373 DNA Sequencer for electrophoresis, detection, and base-calling. ABI currently supplies pre-mixed reaction mixes in bulk packages containing all the necessary non-template reagents for sequencing. Sequencing can be done with both plasmid and PCR-generated templates with both dye-primers and dye-terminators with approximately equal fidelity, although plasmid templates generally give longer usable sequences.

Thirty-two reactions are loaded per AB373 Sequencer each day, for a total of 960 samples. Electrophoresis is run overnight following the manufacturer's protocols, and the data is collected for twelve hours. Following electrophoresis and fluorescence detection, the ABI 373 performs automatic lane tracking and base-calling. The lane-tracking is confirmed visually. Each sequence electropherogram (or fluorescence lane trace) is inspected visually and assessed for quality. Trailing sequences of low quality are removed and the sequence itself is loaded via software to a Sybase database (archived daily to 8 mm tape). Leading vector polylinker sequence is removed automatically by a software program. Average edited lengths of sequences from the standard ABI 373 are around 400 bp and depend mostly on the quality of the template used for the sequencing reaction. ABI 373 Sequencers converted to Stretch Liners provide a longer electrophoresis path prior to fluorescence detection and increase the average number of usable bases to 500-600 bp.

Informatics

1. Data Management

A number of information management systems for a large-scale sequencing lab have been developed. (For review see, for instance, Kerlavage et al., Proceedings of the Twenty-Sixth Annual Hawaii International Conference on System Sciences, IEEE Computer Society Press, Washington D. C., 585 (1993).) The system used to collect and assemble the sequence data was developed using the Sybase relational database management system and was designed to automate data flow wherever possible and to reduce user error. The database stores and correlates all information collected during the entire operation from template preparation to final analysis of the genome. Because the raw output of the ABI 373 Sequencers was based on a Macintosh platform and the data management system chosen was based on a Unix platform, it was necessary to design and implement a variety of multi-user, client-server applications which allow the raw data as well as analysis results to flow seamlessly into the database with a minimum of user effort.

2. Assembly

An assembly engine (TIGR Assembler) developed for the rapid and accurate assembly of thousands of sequence fragments was employed to generate contigs. TIGR Assembler simultaneously clusters and assembles fragments of the genome. In order to obtain the speed necessary to assemble more than 10^4 fragments, the algorithm builds a hash table of 12 bp oligonucleotide subsequences to generate a list of potential sequence fragment overlaps. The number of potential overlaps for each fragment determines which fragments are likely to fall into repetitive elements. Beginning with a single seed sequence fragment, TIGR Assembler extends the current contig by attempting to add the best matching fragment based on oligonucleotide content. The contig and candidate fragment are aligned using a modified version of the Smith-Waterman algorithm which provides for optimal gapped alignments (Waterman, M. S., Methods in Enzymology 164:765 (1988)). The contig is extended by the fragment only if strict criteria for the quality of the match are met. The match criteria include the minimum length of overlap, the maximum length of an unmatched end, and the minimum percentage match. These criteria are automatically lowered by the algorithm in regions of minimal coverage and raised in regions with a possible repetitive element. Fragments representing the boundaries of repetitive elements and potentially chimeric fragments are often rejected based on partial mismatches at the ends of alignments and excluded from the current contig. TIGR Assembler is designed to take advantage of clone size information coupled with sequencing from both ends of each template. It enforces the constraint that sequence fragments from two ends of the same template point toward one another in the contig and are located within a certain range of base pairs (definable for each clone based on the known clone size range for a given library).
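The hash-table step described above can be illustrated with a minimal sketch. It is not the TIGR Assembler implementation: the function names and the `max_partners` cutoff are hypothetical, reverse-complement matches are ignored for brevity, and only the 12 bp oligonucleotide length is taken from the text.

```python
from collections import defaultdict
from itertools import combinations

K = 12  # oligonucleotide length named in the text


def build_kmer_index(fragments, k=K):
    """Map each k-mer to the set of fragment ids that contain it."""
    index = defaultdict(set)
    for frag_id, seq in fragments.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(frag_id)
    return index


def potential_overlaps(fragments, k=K):
    """Return {fragment id: ids of other fragments sharing at least one k-mer}."""
    index = build_kmer_index(fragments, k)
    partners = {frag_id: set() for frag_id in fragments}
    for ids in index.values():
        for a, b in combinations(sorted(ids), 2):
            partners[a].add(b)
            partners[b].add(a)
    return partners


def flag_repeat_candidates(partners, max_partners):
    """Fragments with more potential overlaps than max_partners (a
    hypothetical cutoff) are treated as possible repetitive elements."""
    return {f for f, p in partners.items() if len(p) > max_partners}
```

In this sketch the partner list only nominates candidate overlaps; each candidate would still have to pass the gapped alignment and the match criteria described above before a contig is extended.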

The process resulted in 391 contigs as represented by SEQ ID NOs:1-391.

3. Identifying Genes

The predicted coding regions of the Streptococcus pneumoniae genome were initially defined with the program GeneMark, which finds ORFs using a probabilistic classification technique. The predicted coding region sequences were used in searches against a database of all nucleotide sequences from GenBank (October, 1997), using the BLASTN search method to identify overlaps of 50 or more nucleotides with at least 95% identity. Those ORFs with nucleotide sequence matches are shown in Table 1. The ORFs without such matches were translated to protein sequences and compared to a non-redundant database of known proteins generated by combining the Swiss-Prot, PIR and GenPept databases. ORFs that matched a database protein with BLASTP probability less than or equal to 0.01 are shown in Table 2. The table also lists assigned functions based on the closest match in the databases. ORFs that did not match protein or nucleotide sequences in the databases at these levels are shown in Table 3.
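The three-way sorting of ORFs into Tables 1-3 can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: the `Hit` record and the function names are hypothetical, and only the thresholds (50 nucleotides, 95% identity, BLASTP probability of 0.01) come from the text. The sketch does not attempt to reproduce the values reported in the tables.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical record for one BLASTN match of an ORF."""
    orf_id: str
    percent_identity: float  # identity over the aligned segment
    overlap_nt: int          # length of the aligned segment in nucleotides


def has_known_nucleotide_match(hit, min_overlap=50, min_identity=95.0):
    """Apply the nucleotide criteria stated in the text."""
    return hit.overlap_nt >= min_overlap and hit.percent_identity >= min_identity


def assign_table(nt_hits, best_blastp_prob, max_prob=0.01):
    """Return 1, 2 or 3 for the table an ORF would be listed in.

    nt_hits: list of Hit objects for the ORF (may be empty).
    best_blastp_prob: lowest BLASTP probability found, or None if no hit.
    """
    if any(has_known_nucleotide_match(h) for h in nt_hits):
        return 1
    if best_blastp_prob is not None and best_blastp_prob <= max_prob:
        return 2
    return 3
```

The nucleotide test is applied first, so an ORF that satisfies both the nucleotide and the protein criteria is assigned to Table 1, matching the order of the searches described above.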

Illustrative Applications

1. Production of an Antibody to a Streptococcus pneumoniae Protein

Substantially pure protein or polypeptide is isolated from the transfected or transformed cells using any one of the methods known in the art. The protein can also be produced in a recombinant prokaryotic expression system, such as E. coli, or can be chemically synthesized. Concentration of protein in the final preparation is adjusted, for example, by concentration on an Amicon filter device, to the level of a few micrograms/ml. Monoclonal or polyclonal antibody to the protein can then be prepared as follows.

2. Monoclonal Antibody Production by Hybridoma Fusion

Monoclonal antibody to epitopes of any of the peptides identified and isolated as described can be prepared from murine hybridomas according to the classical method of Kohler, G. and Milstein, C., Nature 256:495 (1975) or modifications thereof. Briefly, a mouse is repetitively inoculated with a few micrograms of the selected protein over a period of a few weeks. The mouse is then sacrificed, and the antibody-producing cells of the spleen are isolated. The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess unfused cells are destroyed by growth of the system on selective media comprising aminopterin (HAT media). The successfully fused cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is continued. Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay procedures, such as ELISA, as originally described by Engvall, E., Meth. Enzymol. 70:419 (1980), and modified methods thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for use. Detailed procedures for monoclonal antibody production are described in Davis, L. et al., Basic Methods in Molecular Biology, Elsevier, N.Y. Section 21-2 (1989).

3. Polyclonal Antibody Production by Immunization

Polyclonal antiserum containing antibodies to heterogeneous epitopes of a single protein can be prepared by immunizing suitable animals with the expressed protein described above, which can be unmodified or modified to enhance immunogenicity. Effective polyclonal antibody production is affected by many factors related both to the antigen and the host species. For example, small molecules tend to be less immunogenic than others and may require the use of carriers and adjuvant. Also, host animals vary in response to site of inoculations and dose, with both inadequate and excessive doses of antigen resulting in low titer antisera. Small doses (ng level) of antigen administered at multiple intradermal sites appear to be most reliable. An effective immunization protocol for rabbits can be found in Vaitukaitis, J. et al., J. Clin. Endocrinol. Metab. 33:988-991 (1971).

Booster injections can be given at regular intervals, and antiserum harvested when antibody titer thereof, as determined semi-quantitatively, for example, by double immunodiffusion in agar against known concentrations of the antigen, begins to fall. See, for example, Ouchterlony, O. et al., Chap. 19 in: Handbook of Experimental Immunology, Weir, D., ed., Blackwell (1973). Plateau concentration of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum (about 12 μM). Affinity of the antisera for the antigen is determined by preparing competitive binding curves, as described, for example, by Fisher, D., Chap. 42 in: Manual of Clinical Immunology, second edition, Rose and Friedman, eds., Amer. Soc. For Microbiology, Washington, D.C. (1980).

Antibody preparations prepared according to either protocol are useful in quantitative immunoassays which determine concentrations of antigen-bearing substances in biological samples; they are also used semi-quantitatively or qualitatively to identify the presence of antigen in a biological sample. In addition, antibodies are useful in various animal models of pneumococcal disease as a means of evaluating the protein used to make the antibody as a potential vaccine target or as a means of evaluating the antibody as a potential immunotherapeutic or immunoprophylactic reagent.

4. Preparation of PCR Primers and Amplification of DNA

Various fragments of the Streptococcus pneumoniae genome, such as those of Tables 1-3 and SEQ ID NOS:1-391 can be used, in accordance with the present invention, to prepare PCR primers for a variety of uses. The PCR primers are preferably at least 15 bases, and more preferably at least 18 bases in length. When selecting a primer sequence, it is preferred that the primer pairs have approximately the same G/C ratio, so that melting temperatures are approximately the same. The PCR primers and amplified DNA of this Example find use in the Examples that follow.
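The primer preferences stated above (at least 15 bases, more preferably at least 18, and similar G/C content within a pair) can be checked with a short sketch. The function names and the `max_gc_diff` tolerance are assumptions made for illustration; the text says only that the G/C ratios should be approximately the same.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in a primer sequence."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def check_primer_pair(forward, reverse, min_len=15, preferred_len=18,
                      max_gc_diff=0.10):
    """Report whether a primer pair meets the stated preferences.

    max_gc_diff is a hypothetical tolerance for 'approximately the same'
    G/C content; it is not specified in the text.
    """
    gc_f = gc_fraction(forward)
    gc_r = gc_fraction(reverse)
    shortest = min(len(forward), len(reverse))
    return {
        "long_enough": shortest >= min_len,
        "preferred_length": shortest >= preferred_len,
        "gc_forward": gc_f,
        "gc_reverse": gc_r,
        "gc_matched": abs(gc_f - gc_r) <= max_gc_diff,
    }
```

G/C content is used here only as the proxy for melting temperature that the text itself names; a laboratory design would normally also examine self-complementarity and specificity.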

5. Gene expression from DNA Sequences Corresponding to ORFs

A fragment of the Streptococcus pneumoniae genome provided in Tables 1-3 is introduced into an expression vector using conventional technology. Techniques to transfer cloned sequences into expression vectors that direct protein translation in mammalian, yeast, insect or bacterial expression systems are well known in the art. Vectors and expression systems are commercially available from a variety of suppliers including Stratagene (La Jolla, Calif.), Promega (Madison, Wis.), and Invitrogen (San Diego, Calif.). If desired, to enhance expression and facilitate proper protein folding, the codon context and codon pairing of the sequence may be optimized for the particular expression organism, as explained by Hatfield et al., U.S. Pat. No. 5,082,767, incorporated herein by this reference.

The following is provided as one exemplary method to generate polypeptide(s) from cloned ORFs of the Streptococcus pneumoniae genome fragment. Bacterial ORFs generally lack a poly A addition signal. The addition signal sequence can be added to the construct by, for example, splicing out the poly A addition sequence from pSG5 (Stratagene) using BglI and SalI restriction endonuclease enzymes and incorporating it into the mammalian expression vector pXT1 (Stratagene) for use in eukaryotic expression systems. pXT1 contains the LTRs and a portion of the gag gene of Moloney Murine Leukemia Virus. The positions of the LTRs in the construct allow efficient stable transfection. The vector includes the Herpes Simplex thymidine kinase promoter and the selectable neomycin gene. The Streptococcus pneumoniae DNA is obtained by PCR from the bacterial vector using oligonucleotide primers complementary to the Streptococcus pneumoniae DNA and containing restriction endonuclease sequences for PstI incorporated into the 5′ primer and BglII at the 5′ end of the corresponding Streptococcus pneumoniae DNA 3′ primer, taking care to ensure that the Streptococcus pneumoniae DNA is positioned such that it is followed by the poly A addition sequence. The purified fragment obtained from the resulting PCR reaction is digested with PstI, blunt ended with an exonuclease, digested with BglII, purified and ligated to pXT1, now containing a poly A addition sequence and digested with BglII.

The ligated product is transfected into mouse NIH 3T3 cells using Lipofectin (Life Technologies, Inc., Grand Island, N.Y.) under conditions outlined in the product specification. Positive transfectants are selected after growing the transfected cells in 600 μg/ml G418 (Sigma, St. Louis, Mo.). The protein is preferably released into the supernatant. However, if the protein has membrane binding domains, the protein may additionally be retained within the cell or expression may be restricted to the cell surface. Since it may be necessary to purify and locate the transfected product, synthetic 15-mer peptides synthesized from the predicted Streptococcus pneumoniae DNA sequence are injected into mice to generate antibody to the polypeptide encoded by the Streptococcus pneumoniae DNA.

Alternatively, and if antibody production is not possible, the Streptococcus pneumoniae DNA sequence is additionally incorporated into eukaryotic expression vectors and expressed as, for example, a globin fusion. Antibody to the globin moiety then is used to purify the chimeric protein. Corresponding protease cleavage sites are engineered between the globin moiety and the polypeptide encoded by the Streptococcus pneumoniae DNA so that the latter may be freed from the former by simple protease digestion. One useful expression vector for generating globin chimerics is pSG5 (Stratagene). This vector encodes a rabbit globin. Intron II of the rabbit globin gene facilitates splicing of the expressed transcript, and the polyadenylation signal incorporated into the construct increases the level of expression. These techniques are well known to those skilled in the art of molecular biology. Standard methods are published in methods texts such as Davis et al., cited elsewhere herein, and many of the methods are available from the technical assistance representatives from Stratagene, Life Technologies, Inc., or Promega. Polypeptides of the invention also may be produced using in vitro translation systems such as in vitro Express™ Translation Kit (Stratagene).

While the present invention has been described in some detail for purposes of clarity and understanding, one skilled in the art will appreciate that various changes in form and detail can be made without departing from the true scope of the invention.

All patents, patent applications and publications referred to above are hereby incorporated by reference.

TABLE 1
S. pneumoniae - Coding regions containing known sequences
Contig ID  ORF ID  Start (nt)  Stop (nt)  match accession  match gene name  percent ident  HSP nt length  ORF nt length
1  1  437  1003  gb|U41735|  Streptococcus pneumoniae peptide methionine sulfoxide reductase (msrA) and homoserine kinase homolog (thrB) genes, complete cds  92  200  567
2  5  6169  5720  gb|U04047|  Streptococcus pneumoniae SSZ dextran glucosidase gene and insertion sequence IS1202 transposase gene, complete cds  96  450  450
2  6  6592  6167  emb|Z83335|SPZ8  S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene  98  426  426
3  11  9770  9147  emb|Z83335|SPZ8  S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene  94  624  624
3  12  10489  9671  emb|Z83335|SPZ8  S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene  91  819  819
3  13  11546  12019  gb|U43526|  Streptococcus pneumoniae neuraminidase B (nanB) gene, complete cds, and neuraminidase (nanA) gene, partial cds  99  474  474
3  14  12017  13375  gb|U43526|  Streptococcus pneumoniae neuraminidase B (nanB) gene, complete cds, and neuraminidase (nanA) gene, partial cds  99  1359  1359
3  15  13421  14338  gb|U43526|  Streptococcus pneumoniae neuraminidase B (nanB) gene, complete cds, and neuraminidase (nanA) gene, partial cds  99  918  918
3  16  14329  15171  gb|U43526|  Streptococcus pneumoniae neuraminidase B (nanB) gene, complete cds, and neuraminidase (nanA) gene, partial cds  99  843  843
3  17  15132  17282  gb|U43526|  Streptococcus pneumoniae neuraminidase B (nanB) gene, complete cds, and neuraminidase (nanA) gene, partial cds  99  2151  2151
3  18  17267  18397  gb|U43526|  Streptococcus pneumoniae neuraminidase B (nanB) gene, complete cds, and neuraminidase (nanA) gene, partial cds  99  1069  1131
4  1  46  1188  emb|Y11463|SPDN  Streptococcus pneumoniae dnaG, rpoD, cpoA genes and ORF3 and ORF5  99  1143  1143
4  2  1198  2529  emb|Y11463|SPDN  Streptococcus pneumoniae dnaG, rpoD, cpoA genes and ORF3 and ORF5  99  876  1332
5  7  11297  11473  gb|U41735|  Streptococcus pneumoniae peptide methionine sulfoxide reductase (msrA) and homoserine kinase homolog (thrB) genes, complete cds  82  175  177
6  7  7125  7364  emb|Z77726|SPIS  S. pneumoniae DNA for insertion sequence IS1318 (1372 bp)  93  238  240
6  8  7322  7570  emb|Z77725|SPIS  S. pneumoniae DNA for insertion sequence IS1381 (966 bp)  95  160  249
6  9  7533  7985  emb|Z77725|SPIS  S. pneumoniae DNA for insertion sequence IS1381 (966 bp)  99  453  453
6  23  20197  19733  emb|Z83335|SPZ8  S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene  96  465  465
7  10  8305  7682  emb|Z83335|SPZ8  S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene  95  624  624
7  11  9024  8206  emb|Z83335|SPZ8  S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene  95  819  819
10  13  9304  8078  gb|L29323|  Streptococcus pneumoniae methyl transferase (mtr) gene cluster, complete cds  93  513  1227
11  2  548  919  emb|Z79691|SOOR  S. pneumoniae yorf[A,B,C,D,E], ftsL, pbpX and regR genes  99  316  372
11  3  892  1980  emb|Z79691|SOOR  S. pneumoniae yorf[A,B,C,D,E], ftsL, pbpX and regR genes  99  1089  1089
11  5  3040  3477  emb|Z79691|SOOR  S. pneumoniae yorf[A,B,C,D,E], ftsL, pbpX and regR genes  99  259  438
11  6  3480  3247  emb|Z79691|SOOR  S. pneumoniae yorf[A,B,C,D,E], ftsL, pbpX and regR genes  99  234  234
11  7  3601  4557  emb|Z79691|SOOR  S. pneumoniae yorf[A,B,C,D,E], ftsL, pbpX and regR genes  98  957  957
11  8  4506  4886  emb|Z79691|SOOR  S. pneumoniae yorf[A,B,C,D,E], ftsL, pbpX and regR genes  99  381  381
11  9  4884  7142  emb|X16367|SPPB  Streptococcus pneumoniae pbpX gene for penicillin binding protein 2X  99  2259  2259
11  10  7132  8124  emb|X16367|SPPB  Streptococcus pneumoniae pbpX gene for penicillin binding protein 2X  98  70  993
13  1  53  1126  gb|M31296|  S. pneumoniae recP gene, complete cds  99  437  1074
14  3  1837  2148  emb|Z83335|SPZ8  S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene  87  96  312
14  4  2518  2108  gb|M36180|  Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase (purC) genes, complete cds  98  411  411
15  9  8942  8511  gb|U09239|  Streptococcus pneumoniae type 19F capsular polysaccharide biosynthesis operon, (cps19fABCDEFGHIJKLMNO) genes, complete cds, and aliA gene, partial cds  89  340  432
17  7  3910  3458  emb|Z77726|SPIS  S. pneumoniae DNA for insertion sequence IS1318 (1372 bp)  98  453  453
17  8  4304  3873  emb|Z77727|SPIS  S. pneumoniae DNA for insertion sequence IS1318 (823 bp)  96  382  432
19  1  41  529  emb|X94909|SPIG  S. pneumoniae iga gene  75  368  489
19  2  554  757  gb|L07752|  Streptococcus pneumoniae attachment site (attB), DNA sequence  99  167  204
19  3  946  1827  gb|L07752|  Streptococcus pneumoniae attachment site (attB), DNA sequence  94  100  882
20  1  937  182  gb|U33315|  Streptococcus pneumoniae orfL gene, partial cds, competence stimulating peptide precursor (comC), histidine protein kinase (comD) and response regulator (comE) genes, complete cds, tRNA-Arg and tRNA-Gln genes  99  756  756
20  2  2271  931  gb|U33315|  Streptococcus pneumoniae orfL gene, partial cds, competence stimulating peptide precursor (comC), histidine protein kinase (comD) and response regulator (comE) genes, complete cds, tRNA-Arg and tRNA-Gln genes  98  1341  1341
20  3  3175  2684  gb|U76218|  Streptococcus pneumoniae competence stimulating peptide precursor ComC (comC), histidine kinase homolog ComD (comD), and response regulator homolog ComE (comE) genes, complete cds  99  492  492
20  4  3322  4527  gb|AF000658|  Streptococcus pneumoniae R801 tRNA-Arg gene, partial sequence, and putative serine protease (sphtra), SPSpoJ (spspoJ), initiator protein (spdnaa) and beta subunit of DNA polymerase III (spdnan) genes, complete cds  99  1206  1206
20  5  4573  5343  gb|AF000658|  Streptococcus pneumoniae R801 tRNA-Arg gene, partial sequence, and putative serine protease (sphtra), SPSpoJ (spspoJ), initiator protein (spdnaa) and beta subunit of DNA polymerase III (spdnan) genes, complete cds  99  771  771
20  6  5532  6917  gb|AF000658|  Streptococcus pneumoniae R801 tRNA-Arg gene, partial sequence, and putative serine protease (sphtra), SPSpoJ (spspoJ), initiator protein (spdnaa) and beta subunit of DNA polymerase III (spdnan) genes, complete cds  99  1386  1386
20  7  6995  8212  gb|AF000658|  Streptococcus pneumoniae R801 tRNA-Arg gene, partial sequence, and putative serine protease (sphtra), SPSpoJ (spspoJ), initiator protein (spdnaa) and beta subunit of DNA polymerase III (spdnan) genes, complete cds  99  1218  1218
20  8  8214  8471  gb|AF000658|  Streptococcus pneumoniae R801 tRNA-Arg gene, partial sequence, and putative serine protease (sphtra), SPSpoJ (spspoJ), initiator protein (spdnaa) and beta subunit of DNA polymerase III (spdnan) genes, complete cds  98  258  258
20  9  8534  9670  gb|AF000658|  Streptococcus pneumoniae R801 tRNA-Arg gene, partial sequence, and putative serine protease (sphtra), SPSpoJ (spspoJ), initiator protein (spdnaa) and beta subunit of DNA polymerase III (spdnan) genes, complete cds  99  134  1137
22  14  11887  12267  emb|Z77726|SPIS  S. pneumoniae DNA for insertion sequence IS1318 (1372 bp)  99  226  381
22  15  12708  12256  emb|Z77727|SPIS  S. pneumoniae DNA for insertion sequence IS1318 (823 bp)  97  353  453
22  16  13165  12662  emb|Z77726|SPIS  S. pneumoniae DNA for insertion sequence IS1318 (1372 bp)  98  504  504
22  23  18398  18910  emb|Z86112|SPZ8  S. pneumoniae genes encoding galacturonosyl transferase and transposase and insertion sequence IS1515  95  463  513
22  24  18829  19299  emb|Z86112|SPZ8  S. pneumoniae genes encoding galacturonosyl transferase and transposase and insertion sequence IS1515  99  443  471
23  5  5624  4203  emb|X52474|SPPL  S. pneumoniae ply gene for pneumolysin  99  1422  1422
23  6  6063  5629  gb|M17717|  S. pneumoniae pneumolysin gene, complete cds  98  197  435
26  1  5500  2  emb|X94909|SPIG  S. pneumoniae iga gene  87  3487  5499
26  2  5823  5584  gb|U47687|  Streptococcus pneumoniae immunoglobulin A1 protease (iga) gene, complete cds  99  151  240
26  3  6878  5685  gb|U47687|  Streptococcus pneumoniae immunoglobulin A1 protease (iga) gene, complete cds  100  50  1194
26  8  14498  14854  emb|Z83335|SPZ8  S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene  99  338  357
26  9  14763  14924  emb|Z83335|SPZ8  S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene  100  94  162
26  10  14922  15173  gb|U04047|  Streptococcus pneumoniae SSZ dextran glucosidase gene and insertion sequence IS1202 transposase gene, complete cds  97  242  252
28  1  80  505  emb|Z83335|SPZ8  S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene  99  426  426
28  2  503  952  gb|U04047|  Streptococcus pneumoniae SSZ dextran glucosidase gene and insertion sequence IS1202 transposase gene, complete cds  97  450  450
28  3  780  1298  gb|U04047|  Streptococcus pneumoniae SSZ dextran glucosidase gene and insertion sequence IS1202 transposase gene, complete cds  96  181  519
34  1  207  1523  gb|L08611|  Streptococcus pneumoniae maltose/maltodextrin uptake (malX) and two maltodextrin permease (malC and MalD) genes, complete cds  99  1317  1317
34  2  1477  2367  gb|L08611|  Streptococcus pneumoniae maltose/maltodextrin uptake (malX) and two maltodextrin permease (malC and MalD) genes, complete cds  96  795  891
34  3  2593  3420  gb|L21856|  Streptococcus pneumoniae malA gene, complete cds; malR gene, complete cds  96  446  828
34  4  2790  2647  gb|L21856|  Streptococcus pneumoniae malA gene, complete cds; malR gene, complete cds  98  137  144
34  5  3418  4416  gb|L21856|  Streptococcus pneumoniae malA gene, complete cds; malR gene, complete cds  96  999  999
34  9  7764  7507  gb|U41735|  Streptococcus pneumoniae peptide methionine sulfoxide reductase (msrA) and homoserine kinase homolog (thrB) genes, complete cds  93  201  258
34  16  10562  10257  emb|X63602|SPBO  S. pneumoniae mmsA-Box  92  238  306
35  4  1176  1439  emb|Z83335|SPZ8  S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene  87  248  264
35  5  1458  1961  gb|U09239|  Streptococcus pneumoniae type 19F capsular polysaccharide biosynthesis operon, (cps19fABCDEFGHIJKLMNO) genes, complete cds, and aliA gene, partial cds  98  264  504
35  17  16172  15477  emb|X85787|SPCP  S. pneumoniae dexB, cps14A, cps14B, cps14C, cps14D, cps14E, cps14F, cps14G, cps14H, cps14I, cps14J, cps14K, cps14L, tasA genes  97  696  696
35  18  16961  16170  emb|Z83335|SPZ8  S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene  86  792  792
35  19  17620  16871  gb|U09239|  Streptococcus pneumoniae type 19F capsular polysaccharide biosynthesis operon, (cps19fABCDEFGHIJKLMNO) genes, complete cds, and aliA gene, partial cds  83  750  750
35  20  19061  17604  emb|X85787|SPCP  S. pneumoniae dexB, cps14A, cps14B, cps14C, cps14D, cps14E, cps14F, cps14G, cps14H, cps14I, cps14J, cps14K, cps14L, tasA genes  94  1458  1458
36  19  18960  18352  gb|U40786|  Streptococcus pneumoniae surface antigen A variant precursor (psaA) and 18 kDa protein genes, complete cds, and ORF1 gene, partial cds  99  609  609
36  20  19934  18966  gb|U53509|  Streptococcus pneumoniae surface adhesin A precursor (psaA) gene, complete cds  99  969  969
37  1  2743  179  emb|Z67739|SPPA  S. pneumoniae parC, parE and transposase genes and unknown orf  99  2565  2565
37  2  2985  2824  emb|Z67739|SPPA  S. pneumoniae parC, parE and transposase genes and unknown orf  100  162  162
37  3  5034  3070  emb|Z67739|SPPA  S. pneumoniae parC, parE and transposase genes and unknown orf  99  1965  1965
37  4  5134  5790  emb|Z67739|SPPA  S. pneumoniae parC, parE and transposase genes and unknown orf  99  657  657
37  5  6171  5833  emb|Z67739|SPPA  S. pneumoniae parC, parE and transposase genes and unknown orf  96  339  339
38  19  12969  13268  gb|M28679|  S. pneumoniae promoter region DNA  100  64  300
39  2  1256  2137  gb|U41735|  Streptococcus pneumoniae peptide methionine sulfoxide reductase (msrA) and homoserine kinase homolog (thrB) genes, complete cds  99  882  882
39  3  2405  3370  gb|U41735|  Streptococcus pneumoniae peptide methionine sulfoxide reductase (msrA) and homoserine kinase homolog (thrB) genes, complete cds  99  966  966
40  9  5253  7208  gb|M29686|  S. pneumoniae mismatch repair (hexB) gene, complete cds  99  1956  1956
41  1  3  1037  emb|Z17307|SPRE  S. pneumoniae recA gene encoding RecA  99  1027  1035
41  2  1328  2713  emb|Z34303|SPCI  Streptococcus pneumoniae cin operon encoding the cinA, recA, dinF, lytA genes, and downstream sequences  99  1386  3386
41  3  3083  4045  gb|M13812|  S. pneumoniae autolysin (lytA) gene, complete cds  99  963  963
41  4  3272  3096  gb|M13812|  S. pneumoniae autolysin (lytA) gene, complete cds  100  177  177
41  5  3603  3860  gb|M13812|  S. pneumoniae autolysin (lytA) gene, complete cds  100  258  258
41  6  4755  5162  gb|L36660|  Streptococcus pneumoniae ORF, complete cds  98  408  408
41  7  5270  5716  gb|L36660|  Streptococcus pneumoniae ORF, complete cds  98  447  447
41  8  6112  6918  gb|L36660|  Streptococcus pneumoniae ORF, complete cds  98  431  807
41  9  6916  7119  gb|L36660|  Streptococcus pneumoniae ORF, complete cds  100  204  204
41  10  7082  7660  gb|L36660|  Streptococcus pneumoniae ORF, complete cds  97  552  579
41  11  7680  7979  gb|L36660|  Streptococcus pneumoniae ORF, complete cds  98  81  300
41  12  9169  8717  emb|Z77727|SPIS  S. pneumoniae DNA for insertion sequence IS1318 (823 bp)  97  353  453
41  13  9533  9132  emb|Z77725|SPIS  S. pneumoniae DNA for insertion sequence IS1381 (966 bp)  95  160  402
41  14  9669  9475  emb|Z82001|SPZ8  S. pneumoniae pcpA gene and open reading frames  100  189  195
44  5  7190  7555  emb|Z82001|SPZ8  S. pneumoniae pcpA gene and open reading frames  99  366  366
44  6  8059  7607  emb|Z77726|SPIS  S. pneumoniae DNA for insertion sequence IS1318 (1372 bp)  97  453  453
44  7  8423  8022  emb|Z77725|SPIS  S. pneumoniae DNA for insertion sequence IS1381 (966 bp)  95  160  402
44  8  8559  8365  emb|Z82001|SPZ8  S. pneumoniae pcpA gene and open reading frames  100  189  195
48  9  6480  4687  gb|L39074|  Streptococcus pneumoniae pyruvate oxidase (spxB) gene, complete cds  99  1794  1794
49  2  231  2603  gb|L20561|  Streptococcus pneumoniae Exp7 gene, partial cds  100  216  2373
53  6  2407  2156  gb|U04047|  Streptococcus pneumoniae SSZ dextran glucosidase gene and insertion sequence IS1202 transposase gene, complete cds  97  242  252
53  7  2566  2405  emb|Z83335|SPZ8  S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene  100  94  162
53  8  2831  2475  emb|Z83335|SPZ8  S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene  99  338  357
54  13  12409  11105  emb|Z83335|SPZ8  S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene  67  591  1305
55  22  20488  19949  emb|Z84379|HSZ8  S. pneumoniae dfr gene (isolate 92)  99  540  540
61  11  11864  9900  emb|Z16082|PNAL  Streptococcus pneumoniae aliB gene  98  1965  1965
63  1  3  239  gb|M18729|  S. pneumoniae mismatch repair protein (hexA) gene, complete cds  100  237  237
63  2  233  2611  gb|M18729|  S. pneumoniae mismatch repair protein (hexA) gene, complete cds  99  2330  2379
63  3  2557  2823  gb|M18729|  S. pneumoniae mismatch repair protein (hexA) gene, complete cds  99  266  267
63  4  2958  4664  gb|M18729|  S. pneumoniae mismatch repair protein (hexA) gene, complete cds  95  69  1707
67  6  3770  3399  gb|L20670|  Streptococcus pneumoniae hyaluronidase gene, complete cds  96  372  372
67  7  7161  4171  gb|L20670|  Streptococcus pneumoniae hyaluronidase gene, complete cds  99  2938  2991
70  1  1  702  gb|M14340|  S. pneumoniae DpnI gene region encoding dpnC and dpnD, complete cds  100  693  702
70  2  678  1160  gb|M14340|  S. pneumoniae DpnI gene region encoding dpnC and dpnD, complete cds  100  483  483
70  3  2490  1210  gb|M14339|  S. pneumoniae DpnII gene region encoding dpnM, dpnA, dpnB, complete cds  98  462  1281
70  7  4230  4424  gb|J04234|  S. pneumoniae exodeoxyribonuclease (exoA) gene, complete cds  99  147  195
70  8  5197  4316  gb|J04234|  S. pneumoniae exodeoxyribonuclease (exoA) gene, complete cds  99  881  882
70  13  8108  9874  gb|L20562|  Streptococcus pneumoniae Exp8 gene, partial cds  93  234  1767
71  22  27964  28341  emb|X63602|SPBO  S. pneumoniae mmsA-Box  93  233  378
72  5  4607  3552  emb|Z26850|SPAT  S. pneumoniae (M222) genes for ATPase a subunit, ATPase b subunit and ATPase c subunit  97  102  1056
73  1  471  133  emb|X63602|SPBO  S. pneumoniae mmsA-Box  91  193  339
73  3  3658  977  gb|J04479|  S. pneumoniae DNA polymerase I (polA) gene, complete cds  99  2682  2682
73  8  4864  5379  gb|M36180|  Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase (purC) genes, complete cds  98  318  516
77  3  2622  1999  emb|Z83335|SPZ8  S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene  95  624  624
77  4  3341  2523  emb|Z83335|SPZ8  S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene  91  819  819
78  1  341  3  emb|X77249|SPR6  S. pneumoniae (R6) ciaR/ciaH genes  99  339  339
78  2  1095  325  emb|X77249|SPR6  S. pneumoniae (R6) ciaR/ciaH genes  99  771  771
82  10  11436  10816  gb|U90721|  Streptococcus pneumoniae signal peptidase I (spi) gene, complete cds  97  621  621
82  11  12402  11434  gb|U93576|  Streptococcus pneumoniae ribonuclease HII (rnhB) gene, complete cds  98  953  969
82  12  12381  12704  gb|U93576|  Streptococcus pneumoniae ribonuclease HII (rnhB) gene, complete cds  100  51  324
83  8  3212  3550  emb|Z77727|SPIS  S. pneumoniae DNA for insertion sequence IS1318 (823 bp)  97  290  339
83  10  4662  6851  gb|M36180|  Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase (purC) genes, complete cds  99  2190  2190
83  11  6849  8213  gb|M36180|  Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase (purC) genes, complete cds  99  1365  1365
83  12  8236  9090  gb|M36180|  Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase (purC) genes, complete cds  99  855  855
83  13  9283  13017  gb|L15190|  Streptococcus pneumoniae SAICAR synthetase (purC) gene, complete cds  100  107  3735
83  23  22147  23313  gb|L36923|  Streptococcus pneumoniae beta-N-acetylhexosaminidase (strH) gene, complete cds  98  218  1167
83  24  23268  23450  gb|L36923|  Streptococcus pneumoniae beta-N-acetylhexosaminidase (strH) gene, complete cds  98  172  183
83  25  27527  23505  gb|L36923|  Streptococcus pneumoniae beta-N-acetylhexosaminidase (strH) gene, complete cds  99  3826  4023
83  26  28472  27771  gb|L36923|  Streptococcus pneumoniae beta-N-acetylhexosaminidase (strH) gene, complete cds  99  416  702
84  4  4554  6173  emb|Z83335|SPZ8  S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene  98  697  1620
87  6  5951  5316  emb|Z77725|SPIS  S. pneumoniae DNA for insertion sequence IS1381 (966 bp)  96  439  636
88  5  2957  3511  gb|M36180|  Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase (purC) genes, complete cds  94  555  555
88  6  3466  4269  gb|M36180|  Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase (purC) genes, complete cds  94  804  804
89  13  9878  10093  gb|M36180|  Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase (purC) genes, complete cds  97  211  216
89  14  10062  10412  emb|Z83335|SPZ8  S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene  97  335  351
93105303494 1emb|X63602|SPBOS. pneumoniae mmsA-Box89237363
97417081520gb |U41735|Streptococcus pneumoniae peptide methionine sulfoxide reductase (msrA) and91140189
homoserine kinase homolog (thrB) genes, complete cds
99189700emb|Z83335|SPZ8S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose93592612
biosynthesis genes and aliA gene
9921773775< /td>emb|X17337|SPAMStreptococcus pneumoniae ami locus conferring aminopterin resistance99998999
327941712emb|X17337 |SPAMStreptococcus pneumoniae ami locus conferring aminopterin resistance9910831083
99437322788emb|X173 37|SPAMStreptococcus pneumoniae ami locus conferring aminopterin resistance100945945
99552493714emb|X1733 7|SPAMStreptococcus pneumoniae ami locus conferring aminopterin resistance10015361536
99672625277emb|X17 337|SPAMStreptococcus pneumoniae ami locus conferring aminopterin resistance9919861986
10112161538emb|X542 25|SPENS. pneumoniae epuA and endA genes for 7 kDa protein and membrane991461323
endonuclease
101 214921719emb|X54225|SPEN S. pneumoniae epuA and endA genes for 7 kDa protein and membrane99228228
endonuclease
101< /td>316941855emb|X54225|SPEN< /td>S. pneumoniae epuA and endA genes for 7 kDa protein and membrane100162162
endonuclease
101 417012582emb|X54225|SPEN S. pneumponiae epuA and endA genes for 7 kDa protein and membrane100882882
endonuclease
103755565041emb|Z95914|SPZ9Streptococcus pneumoniae sodA gene100396516
104213471556emb|Z77727|SPISS. pneumoniae DNA for insertion sequence IS1318 (823 bp)83206210
105553815028emb|Z67739|SPPAS. pneumoniae parC, parE and transposase genes and unknown orf98353354
105660895379emb|Z67739|SPPAS. pneumoniae parC, parE and transposase genes and unknown orf9884711
107427851880emb|X16022|SPPES. pneumoniae penA gene9872906
107529134988emb|X16022|SPPES. pneumoniae penA gene9916922076
107649815595emb|X13136|SPPEStreptococcus pneumoniae penA gene for penicillin binding protein 2B91107615
lacking N-term, (penicillin resistant strain)
108990688718emb|Z67739|SPPAS. pneumoniae parC, parE and transposase genes and unknown orf95342351
108121130810922emb|Z67739|SPPAS. pneumoniae parC, parE and transposase genes and unknown orf99199387
109327682241emb|Z77725|SPISS. pneumoniae DNA for insertion sequence IS1381 (966 bp)9661528
109426882855emb|Z77726|SPISS. pneumoniae DNA for insertion sequence IS1318 (1372 bp)96148168
109528623269emb|Z77727|SPISS. pneumoniae DNA for insertion sequence IS1318 (823 bp)97353408
109653203584gb|M18729|1003711737
11314313gb|M36180|Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase95429429
(purC) genes, complete cds
1131097888532emb|X99400|SPDAS. pneumoniae dacA gene and ORF9912571257
11311987010985emb|X99400|SPDAS. pneumoniae dacA gene and ORF9911161116
114325302030gb|M36180|Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase95481501
(purC) genes, complete cds
115111130310932gb|U04047|Streptococcus pneumoniae SSZ dextran glucosidase gene and insertion97372372
sequence IS1202 transposase gene, complete cds
11718973302emb|X72967|SPNAS. pneumoniae nanA gene9924022408
117232773831emb|X72967|SPNAS. pneumoniae nanA gene99237555
117343273899gb|M36180|Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase98429429
(purC) genes, complete cds
121213691941gb|U72720|Streptococcus pneumoniae heat shock protein 70 (dnaK) gene, complete cds99202573
and DnaJ (dnaJ) gene, partial cds
121324124253gb|U72720|Streptococcus pneumoniae heat shock protein 70 (dnaK) gene, complete cds9918421842
and DnaJ (dnaJ) gene, partial cds
122850665587gb|U04047|Streptococcus pneumoniae SSZ dextran glucosidase gene and insertion64451522
sequence IS1202 transposase gene, complete cds
12511811189gb|M36180|Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase92991623
(purC) genes, complete cds
128151249611204emb|Z83335|SPZ8S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose917051293
biosynthesis genes and aliA gene
13411492emb|Y10818|SPY1S. pneumoniae spsA gene99203492
13425562652gb|AF019904|Streptococcus pneumoniae choline binding protein A (cbpA) gene, partial cds866852097
13431160837emb|Y10818|SPY1S. pneumoniae spsA gene86324324
134439522882gb|AF019904|Streptococcus pneumoniae choline binding protein A (cbpA) gene, partial cds982151071
134879929848gb|U12567|Streptococcus pneumoniae P13 glycerol-3-phosphate dehydrogenase (glpD)992851857
gene, partial cds, and glycerol uptake facilitator (glpF) and ORF3 genes,
complete cds
1349984610622gb|U12567|Streptococcus pneumoniae P13 glycerol-3-phosphate dehydrogenase (glpD)99570777
gene, partial cds, and glycerol uptake facilitator (glpF) and ORF3 genes,
complete cds
134101080511122gb|U12567|Streptococcus pneumoniae P13 glycerol-3-phosphate dehydrogenase (glpD)100318318
gene, partial cds, and glycerol uptake facilitator (glpF) and ORF3 genes,
complete cds
1371379708443gb|U09239|Streptococcus pneumoniae type 19F capsular polysaccharide biosynthesis90420474
operon, (cps19fABCDEFGHIJKLMNO) genes, complete cds, and aliA gene,
partial cds
1371485908775emb|Z83335|SPZ8S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose94174186
biosynthesis genes and aliA gene
1371587738967emb|Z83335|SPZ8S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose98195195
biosynthesis genes and aliA gene
1371692239687emb|Z77726|SPISS. pneumoniae DNA for insertion sequence IS1318 (1372 bp)96446465
13717964110051emb|Z77727|SPISS. pneumoniae DNA for insertion sequence IS1318 (823 bp)96293411
139101299812702emb|X63602|SPBOS. pneumoniae mmsA-Box90234297
141878058938emb|Z49988|SPMMStreptococcus pneumoniae mmsA gene993381134
1419893610972emb|Z49988|SPMMStreptococcus pneumoniae mmsA gene9920372037
141101147212467emb|Z49988|SPMMStreptococcus pneumoniae mmsA gene10076996
1422257814gb|M80215|98174558
1423787957gb|M80215|Streptococcus pneumoniae uvs402 protein gene, complete cds100142171
14249803022gb|M80215|9519972043
142530203595gb|M80215|Streptococcus pneumoniae uvs402 protein gene, complete cds100153576
14511219emb|Z35135|SPALS. pneumoniae aliA gene for amiA-like gene A97185219
14521711994gb|L20556|Streptococcus pneumoniae plpA gene, partial cds9918111824
145322877599emb|Z47210|SPDES. pneumoniae dexB, cap3A, cap3B and cap3C genes and orfs9910525313
145499347766gb|M90527|Streptococcus pneumoniae penicillin binding protein (ponA) gene, complete9921692169
cds
1455104889922gb|M90527|Streptococcus pneumoniae penicillin binding protein (ponA) gene, complete99512567
cds
14611594emb|Z82002|SPZ8S. pneumoniae pcpB and pcpC genes98156156
146234490emb|Z82002|SPZ8S. pneumoniae pcpB and pcpC genes98255255
146161179510794emb|Z82002|SPZ8S. pneumoniae pcpB and pcpC genes852761002
147111067810202emb|Z21702|SPUNS. pneumoniae ung gene and mutX genes encoding uracil-DNA glycosylase and 8-98477477
oxodGTP nucleoside triphosphatase
147121133810676emb|Z21702|SPUNS. pneumoniae ung gene and mutX genes encoding uracil-DNA glycosylase and 8-99663663
oxodGTP nucleoside triphosphatase
1481290098815gb|U41735|Streptococcus pneumoniae peptide methionine sulfoxide reductase (msrA) and90180195
homoserine kinase homolog (thrB) genes, complete cds
156411541402 emb|X63602|SPBOS. pneumoniae mmsA-Box94185249
1591390488521 gb|M36180|Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase98526528
(purC) genes, complete cds
16011147emb|Z26851|SPATS. pneumoniae (R6) genes for ATPase a subunit, ATPase b subunit and ATPase c100142147
subunit
1602179898emb|Z26851|SPATS. pneumoniae (R6) genes for ATPase a subunit, ATPase b subunit and ATPase c99720720
subunit
16039061406emb|Z26850|SPATS. pneumoniae (M222) genes for ATPase a subunit, ATPase b subunit and ATPase95501501
c subunit
160413731942emb|Z26850|SPATS. pneumoniae (M222) genes for ATPase a subunit, ATPase b subunit and ATPase87306570
c subunit
16111984emb|X77249|SPR6S. pneumoniae (R6) ciaR/ciaH genes99984984
161769107497emb|X83917|SPGYS. pneumoniae orflgyrB and gyrB gene encoding DNA gyrase B subunit99437588
161874439386emb|X83917|SPGYS. pneumoniae orflgyrB and gyrB gene encoding DNA gyrase B subunit9819121944
122155gb|L20559|Streptococcus pneumoniae Exp5 gene, partial cds983272154
1651321618gb|J01796|9915871587
amylomaltase, complete cds, and malP gene encoding phosphorylase
165216083902gb|J01796|S. pneumoniae malX and malM genes encoding membrane protein and1002802295
amylomaltase, complete cds, and malP gene encoding phosphorylase
16613784emb|Y11463|SPDNStreptococcus pneumoniae dnaG, rpoD, cpoA genes and ORF3 and ORF5100375375
16621507320emb|Y11463|SPDNStreptococcus pneumoniae dnaG, rpoD, cpoA genes and ORF3 and ORF59911881188
166332401432emb|Y11463|SPDNStreptococcus pneumoniae dnaG, rpoD, cpoA genes and ORF3 and ORF5995631809
16711077328emb|Z71552|SPADStreptococcus pneumoniae adcCBA operon94155750
16721844999emb|Z71552|SPADStreptococcus pneumoniae adcCBA operon98405846
167327141842emb|Z71552|SPADStreptococcus pneumoniae adcCBA operon97604873
167433992641emb|Z71552|SPADStreptococcus pneumoniae adcCBA operon99703759
168112259gb|L20558|992822259
1701073387685emb|Z77726|SPISS. pneumoniae DNA for insertion sequence IS1318 (1372 bp)95315348
172624624981gb|47625|973652520
cds
175137320gb|M36180|Streptococcus pneumoniae transposase (comA and comB) and SAICAR synthetase89353354
(purC) genes, complete cds
175418433621 emb|Z47210|SPDES. pneumoniae dexB, cap3A, cap3B and cap3C genes and orfs95891779
176539842980emb|Z67739|SPPAS. pneumoniae parC, parE and transposase genes and unknown orf1005731005
17813425emb|Z67739|SPPAS. pneumoniae parC, parE and transposase genes and unknown orf95423423
179142670emb|Z83335|SPZ8S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose99338357
biosynthesis genes and aliA gene
180330841855emb|X95718|SPGYS. pneumoniae gyrA gene993811230
18617144emb|Z79691|SOORS. pneumoniae yorf[A,B,C,D,E], ftsL, pbpX and regR genes9859711
18622254608emb|Z79691|SOORS. pneumoniae yorf[A,B,C,D,E], ftsL, pbpX and regR genes983151647
1863707880emb|Z79691|SOORS. pneumoniae yorf[A,B,C,D,E], ftsL, pbpX and regR genes98174174
18912259gb|U72720|Streptococcus pneumoniae heat shock protein 70 (dnaK) gene, complete cds99258258
and DnaJ (dnaJ) gene, partial cds
1892600385gb|U72720|Streptococcus pneumoniae heat shock protein 70 (dnaK) gene, complete cds98204216
and DnaJ (dnaJ) gene, partial cds
18931018851gb|U72720|Streptococcus pneumoniae heat shock protein 70 (dnaK) gene, complete cds99168168
and DnaJ (dnaJ) gene, partial cds
189410122154gb|U72720|Streptococcus pneumoniae heat shock protein 70 (dnaK) gene, complete cds9910621143
and DnaJ (dnaJ) gene, partial cds
191978297524emb|X63602|SPBOS. pneumoniae mmsA-Box95234306
19411729gb|M36180|Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase91728729
(purC) genes, complete cds
19921117881emb|Z83335|SPZ8S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose96211237
biosynthesis genes and aliA gene
199414991762emb|Z83335|SPZ8S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose89248264
biosynthesis genes and aliA gene
199517812284emb|Z83335|SPZ8S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose98504504
biosynthesis genes and aliA gene
20311977337 gb|L20563|Streptococcus pneumoniae Exp9 gene, partial cds993421641
204111453gb|L36131|Streptococcus pneumoniae exp10 gene, complete cds, recA gene, 5′ end9911431143
2081592296gb|U89711|904712238
complete cds
213324552123 emb|Z83335|SPZ8S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose96332333
biosynthesis genes and aliA gene
216136812emb|Z83335|SPZ8S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose99338357
biosynthesis genes and aliA gene
216326502327gb|M28678|S. pneumoniae promoter sequence DNA9886324
22214174emb|Z83335|SPZ894414414
biosynthesis genes and aliA gene
227352664238emb|AJ000336|SPStreptococcus pneumoniae ldh gene9910291029
23911804gb|M31296|95484804
247316251807gb|M36180|94178183
(purC) genes, complete cds
24939211364emb|Z83335|SPZ8S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose94443444
biosynthesis genes and aliA gene
25313623gb|M36180|Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase99360360
(purC) genes, complete cds
253512382050emb|Z83335|SPZ8S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose95420813
biosynthesis genes and aliA gene
253620692572emb|Z83335|SPZ8S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose97504504
biosynthesis genes and aliA gene
25413800emb|Z82002|SPZ8S. pneumoniae pcpB and pcpC genes97531798
25527981841emb|Z82002|SPZ8S. pneumoniae pcpB and pcpC genes976721044
255324931969emb|Z67739|SPPAS. pneumoniae parC, parE and transposase genes and unknown orf92435525
2572985770emb|X17337|SPAMStreptococcus pneumoniae ami locus conferring aminopterin resistance96117216
31245907gb|M36180|Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase97339339
(purC) genes, complete cds
26724951208gb|U16156|Streptococcus pneumoniae dihydropteroate synthase (sulA), dihydrofolate9584714
synthetase (sulB), guanosine triphosphate cyclohydrolase (sulC), aldolase-
pyrophosphokinase (sulD) genes, complete cds
267312912277gb|U16156|Streptococcus pneumoniae dihydropteroate synthase (sulA), dihydrofolate97755987
synthetase (sulB), guanosine triphosphate cyclohydrolase (sulC), aldolase-
pyrophosphokinase (sulD) genes, complete cds
267422613601gb|U16156|Streptococcus pneumoniae dihydropteroate synthase (sulA), dihydrofolate9813411341
synthetase (sulB), guanosine triphosphate cyclohydrolase (sulC), aldolase-
pyrophosphokinase (sulD) genes, complete cds
267535614136gb|U16156|Streptococcus pneumoniae dihydropteroate synthase (sulA), dihydrofolate99576576
synthetase (sulB), guanosine triphosphate cyclohydrolase (sulC), aldolase-
pyrophosphokinase (sulD) genes, complete cds
267641644949gb|U16156|Streptococcus pneumoniae dihydropteroate synthase (sulA), dihydrofolate99748786
synthetase (sulB), guanosine triphosphate cyclohydrolase (sulC), aldolase-
pyrophosphokinase (sulD) genes, complete cds
267755445140gb|U16156|Streptococcus pneumoniae dihydropteroate synthase (sulA), dihydrofolate100186405
synthetase (sulB), guanosine triphosphate cyclohydrolase (sulC), aldolase-
pyrophosphokinase (sulD) genes, complete cds
268417931990emb|X63602|SPBOS. pneumoniae mmsA-Box89194198
2711562104gb|429686|S. pneumoniae mismatch repair (hexB) gene, complete cds93160459
291175524gb|U04047|Streptococcus pneumoniae SSZ dextran glucosidase gene and insertion96450450
sequence IS1202 transposase gene, complete cds
29121001525emb|Z83335|SPZ8S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose87205477
biosynthesis genes and aliA gene
2913807559emb|Z83335|SPZ8S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose90170249
biosynthesis genes and aliA gene
291413741099gb|M36180|Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase85264276
(purC) genes, complete cds
293131673emb|Z67740|SPGYS. pneumoniae gyrB gene and unknown orf985531671
29611434151emb|Z47210|SPDES. pneumoniae dexB, cap3A, cap3B and cap3C genes and orfs994301284
3171157510emb|Z67739|SPPAS. pneumoniae parC, parE and transposase genes and unknown orf89353354
32521237485emb|Z83335|SPZ8S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose91299753
biosynthesis genes and aliA gene
32611462emb|Z82001|SPZ8S. pneumoniae pcpA gene and open reading frames100233462
327160364emb|Z83335|SPZ8S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose9489540
biosynthesis genes and aliA gene
3341153545gb|U41735|Streptococcus pneumoniae peptide methionine sulfoxide reductase (msrA) and8791393
homoserine kinase homolog (thrB) genes, complete cds
336130893emb|Z26850|SPATS. pneumoniae (M222) genes for ATPase a subunit, ATPase b subunit and ATPase97102216
c subunit
36011519emb|Z67739|SPPAS. pneumoniae parC, parE and transposase genes and unknown orf95435519
360415981960emb|Z83335|SPZ8S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose94353363
biosynthesis genes and aliA gene
36216732emb|Z83335|SPZ8S. pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose9563672
biosynthesis genes and aliA gene
36221168728 gb|U04047|Streptococcus pneumoniae SSZ dextran glucosidase gene and insertion96441441
sequence IS1202 transposase gene, complete cds
3841347111emb|X85787|SPCPS. pneumoniae dexB, cps14A, cps14B, cps14C, cps14D, cps14E, cps14F, cps14G,9454237
cps14H, cps14I, cps14J, cps14K, cps14L, tasA genes
TABLE 2
S. pneumoniae - Putative coding regions of novel proteins similar to known proteins
Contig ID, ORF ID, Start (nt), Stop (nt), match accession, match gene name, % sim, % ident, length (nt)
22821760 1942pir|F60663|F606translation elongation factor Tu - Streptococcus oralis100 100 183
3191  2 205gi|984927neomycin phosphotransferase [Cloning vector pBSL99]100 100 204
2601  21138pir|F60663|F606translation elongation factor Tu - Streptococcus oralis99981137 
 252 4861394 gi|1574495hypothetical [Haemophilus influenzae]9896909
 942 6851002gi|310627phosphoenolpyruvate:sugar phosphotransferase system HPr9893318
[Streptococcus mutans]
3121  190  2gi|347999ATP-dependent protease proteolytic subunit [Streptococcus9895
salivarius]
3291  1 807gi|924848inosine monophosphate dehydrogenase [Streptococcus pyogenes]9894807
3362 290 589lacZ gene product [unidentified cloning vector]9898300
181959487366gi|153855phospho-beta-D-galactosidase (EC 3.2.1.85) [Lactococcus lactis97941419 
cremoris]
31221044  361gi|347998uracil phosphoribosyltransferase [Streptococcus salivarius]9788684
 32865757486GTP-BINDING PROTEIN ERA HOMOLOG.9691912
 943 9512741gi|153615phosphoenolpyruvate:sugar phosphotransferase system enzyme I96921791 
[Streptococcus salivarius]
1271  1 168gi|581299initiation factor IF-1 [Lactococcus lactis]9689168
12814 10438 11154  gi|1276873DeoD [Streptococcus thermophilus]9693717
181413621598lacD polypeptide (AA 1-326) [Staphylococcus aureus]9680237
2181  1 834gi|1743856intrageneric coaggregation-relevant adhesin [Streptococcus gordonii]9693834
3192 115 441heat-shock protein 82/neomycin phosphotransferase fusion protein9696327
(hsp82-neo) [unidentified cloning vector]
 5412 862210967 gnl|PID|d100972Pyruvate formate-lyase [Streptococcus mutans]95892346 
1812 6061289lacD [Lactococcus lactis]9589684
 46334103045gi|1850606YlxM [Streptococcus mutans]9486366
 8910 79727337thymidine kinase [Streptococcus gordonii]9486636
148964317354gi|995767UDP-glucose pyrophosphorylase [Streptococcus pyogenes]9485924
160744305848gi|153573H+ ATPase [Enterococcus faecalis]94871419 
2345983513gi|153763plasmin receptor [Streptococcus pyogenes]93861086 
 12878776204gi|1103865formyl-tetrahydrofolate synthetase [Streptococcus mutans]93841674 
 6511 47345120gi|40150L14 protein (AA 1-122) [Bacillus subtilis]9387387
 681 531297gi|47341antitumor protein [Streptococcus pyogenes]93871245 
 801  3 299gnl|PID|d101166ribosomal protein S7 [Bacillus subtilis]9384297
1273 6951093gi|142462ribosomal protein S11 [Bacillus subtilis]9386399
160519243462gi|1773264ATPase, alpha subunit [Streptococcus mutans]93851539 
211537573047gi|535273aminopeptidase C [Streptococcus thermophilus]9382711
2621 16 564gi|149394lacB [Lactococcus lactis]9390549
3661 197  3gi|295259tryptophan synthase beta subunit [Synechocystis sp.]9391195
 2 5313921976gi|1574496hypothetical [Haemophilus influenzae]9280585
 3621 20781 19927 gi|310632hydrophobic membrane protein [Streptococcus gordonii]9286855
181312651534gi|149396lacD [Lactococcus lactis]9283270
181736624060gi|149410enzyme III [Lactococcus lactis]9283399
 32456313937gnl|PID|e294090fibronectin-binding protein-like protein A [Streptococcus gordonii]91851695 
 46230541462gi|1850607signal recognition particle Ffh [Streptococcus mutans]91841593 
 6510 44424726pir|S17865|S178ribosomal protein S17 - Bacillus stearothermophilus9180285
 772 2601900groEL gene product [Lactococcus lactis]91821641 
 841  22056gi|871784Clp-like ATP-dependent protease binding subunit [Bos taurus]91792055 
 99810750 9272gi|153740sucrose phosphorylase [Streptococcus mutans]91841479 
 99911947 11072 gi|153739membrane protein [Streptococcus mutans]9178876
127520652469pir|S07223|R5BSribosomal protein L17 - Bacillus stearothermophilus9178405
132695399390gi|143065hubst [Bacillus stearothermophilus]9189150
137847656153gnl|PID|d100347Na+ - ATPase beta subunit [Enterococcus hirae]91791389 
151711119 9734 gi|1815634glutamine synthetase type 1 [Streptococcus agalactiae]91821386 
20121798 278gi|1108998dextran glucosidase DexS [Streptococcus suis]91791521 
2222 6731839gi|153741ATP-binding protein [Streptococcus mutans]91851167 
293541134400gi|1196921unknown protein [Insertion sequence IS861]9171288
 32761666570pir|A36933|A369diacylglycerol kinase homolog - Streptococcus mutans9077405
 332 841527gi|1196921unknown protein [Insertion sequence IS861]9070315
 4827 20908 19757  gnl|PID|e274705lactate oxidase [Streptococcus iniae]90801152 
 5521 19777 18515 gnl|PID|e221213ClpX protein [Bacillus subtilis]90751263 
 562 717 977 gi|1710133flagellar filament cap [Borrelia burgdorferi]9050261
 651  1 606gi|1165303L3 [Bacillus subtilis]9075606
1141  2 988gi|153562aspartate beta-semialdehyde dehydrogenase (EC 1.2.1.11)9080987
[Streptococcus mutans]
1201 1345 827gi|407880ORF1 [Streptococcus equisimilis]9075519
15912 76908298gi|143012GMP synthetase [Bacillus subtilis]9084609
166440763282gi|1661179high affinity branched chain amino acid transport protein9078795
[Streptococcus mutans]
1831 281395gi|308858ATP:pyruvate 2-O-phosphotransferase [Lactococcus lactis]90761368 
191328911662gi|149521tryptophan synthase beta subunit [Lactococcus lactis]90781230 
19821551 436(AF014460) CcpA [Streptococcus mutans]90761116 
3051 37 783gi|1573551asparagine synthetase A (asnA) [Haemophilus influenzae]9080747
8322853343gi|149434 putative [Lactococcus lactis]89781059 
 46875777362ribosomal protein L19 - Bacillus stearothermophilus8976216
 499836310342  gi|153792recP peptide [Streptococcus pneumoniae]89831980 
 5114 18410 gi|308857ATP:D-fructose 6-phosphate 1-phosphotransferase [Lactococcus89811038 
lactis]
 5711 10669 gnl|PID|d100932H2O-forming NADH Oxidase [Streptococcus mutans]8977984
 65524182786gi|1165307S19 [Bacillus subtilis]8981369
 65838064225sp|P14577|RL16—50S RIBOSOMAL PROTEIN L16.8982420
 6518 82198719gi|143417ribosomal protein S5 [Bacillus stearothermophilus]8976501
 739633755315gi|532204prs [Listeria monocytogenes]89701023
 76333601465 gnl|PID|e200671lepA gene product [Bacillus subtilis]89761896 
 9910 12818  11919 gi|153738membrane protein [Streptococcus mutans]8973900
120235521300gi|407881stringent response-like protein [Streptococcus equisimilis]89792253 
122545122791gnl|PID|e280490unknown [Streptococcus pneumoniae]89811722 
1761 669  4gi|473945-oxoprolyl-peptidase [Streptococcus pyogenes]8978666
177630503934gi|912423putative [Lactococcus lactis]8971885
181840335751gi|149411enzyme III [Lactococcus lactis]89801719 
211431492793gi|535273aminopeptidase C [Streptococcus thermophilus]8970408
3611 431 838gi|1196922unknown protein [Insertion sequence IS861]8970408
 3417 11839 10535  sp|P30053|SYH_SHISTIDYL-TRNA SYNTHETASE (EC 6.1.1.21)88781305 
(HISTIDINE--TRNA LIGASE) (HISRS)
 3831646gi|2058544putative ABC transporter subunit ComYA [Streptococcus gordonii]8878978
 541  3 227gnl|PID|d101320YggU [Bacillus subtilis]8866225
 572 6111468putative reductase 1 [Saccharomyces cerevisiae]8875858
 6513 54976069pir|A29102|R5BSribosomal protein L5 - Bacillus stearothermophilus8875573
 6520 90309500gi|2078381ribosomal protein L15 [Staphylococcus aureus]8883471
 78336361108gnl|PID|d100781lysyl-aminopeptidase [Lactococcus lactis]88802529 
10612 12965 12054 gi|2407215(AF017421) putative heat shock protein HtpX [Streptococcus8872
gordonii]
1072 219 962gnl|PID|e339862putative acylneuraminate lyase [Clostridium tertium]8875744
111814073 10420 gi|402363RNA polymerase beta-subunit [Bacillus subtilis]88743654 
126913096 12062⠀‚gnl|PID|e311468unknown [Bacillus subtilis]88741035 
14017 19143 18874 gi|1573659H. influenzae predicted coding region HI0659 [Haemophilus8861270
influenzae]
1441 394 555gnl|PID|e274705lactate oxidase [Streptococcus iniae]8875162
148427233493gi|159672phosphate transport system ATP-binding protein [Methanococcus8868
jannaschii]
160858536278gi|1773267ATPase, epsilon subunit [Streptococcus mutans]8865426
177417702885gi|149426putative [Lactococcus lactis]88721116 
211641493613gi|535273aminopeptidase C [Streptococcus thermophilus]8874528
2314 580 957gi|40186homologous to E. coli ribosomal protein L27 [Bacillus subtilis]8878378
260523872998gi|1196922unknown protein [Insertion sequence IS861]8869612
291620173375gnl|PID|d100571adenylosuccinate synthetase [Bacillus subtilis]88751359 
3194 658 317gi|603578serine/threonine kinase [Phytophthora capsici]8888342
 40543534514 gi|153672lactose repressor [Streptococcus mutans]8756162
 4910 10660 10929 gi|1196921unknown protein [Insertion sequence IS861]8772270
 65731403808gi|1165309S3 [Bacillus subtilis]8773669
 6515 66237039gi|1044978ribosomal protein S8 [Bacillus subtilis]8773417
 75854116625gi|1877422galactokinase [Streptococcus mutans]87781215 
 802 7032805gnl|PID|d101166elongation factor G [Bacillus subtilis]87762103 
 821 541 248 gi|1196921unknown protein [Insertion sequence IS861]8769294
14023 25033 23897 gnl|PID|e254999phenylalanyl-tRNA synthetase beta subunit [Bacillus subtilis]87741137 
21414 10441 8516gi|2281305glucose inhibited division protein homolog GidA [Lactococcus lactis87751926 
cremoris]
22022742  874gnl|PID|e324358product highly similar to elongation factor EF-G [Bacillus subtilis]87731869 
260420962389unknown protein [Insertion sequence IS861]8772294
3231 27 650gi|89779530S ribosomal protein [Pediococcus acidilactici]8773624
3571 154 570gi|1044978ribosomal protein S8 [Bacillus subtilis]8773417
 4911 10927 11445 gi|1196922unknown protein [Insertion sequence IS861]8663519
 5912 74619224gi|951051relaxase [Streptococcus pneumoniae]86681764 
 65415532401pir|A02759|R5BSribosomal protein L2 - Bacillus stearothermophilus8677849
 6523 10957 11610  gi|44074adenylate kinase [Lactococcus lactis]8676654
 82443744856gi|153745mannitol-specific enzyme III [Streptococcus mutans]8672483
102442704986gnl|PID|e264705OMP decarboxylase [Lactococcus lactis]8676717
106678246880gnl|PID|e137598aspartate transcarbamylase [Lactobacillus leichmannii]8668945
1071  1 273gnl|PID|e339862putative acylneuraminate lyase [Clostridium tertium]8671273
111710432 6710DNA-dependent RNA polymerase [Streptococcus pyogenes]86803723 
131957044892prolipoprotein diacylglycerol transferase [Streptococcus mutans]8671813
134764307980gi|2388637glycerol kinase [Enterococcus faecalis]86731551 
14611 74736583gi|1591731mevalonate kinase [Methanococcus jannaschii]8672891
1532 5952010dipeptidase [Lactococcus lactis]86781416 
1541  21435gi|18572466-phosphogluconate dehydrogenase [Lactococcus lactis]86741434 
161550256284gi|47529Unknown [Streptococcus salivarius]86661260 
1841  21483gi|642667NADP-dependent glyceraldehyde-3-phosphate dehydrogenase86731482 
[Streptococcus mutans]
210836596571gi|153661translational initiation factor IF2 [Enterococcus faecium]86762913 
2501  2 187gi|1573551asparagine synthetase A (asnA) [Haemophilus influenzae]8668186
 36426443909cell division protein [Enterococcus faecalis]85731266 
 38424753587gi|2058545putative ABC transporter subunit ComYB [Streptococcus gordonii]85721113 
 38535773915gi|2058546ComYC [Streptococcus gordonii]8580339
 57527973789gnl|PID|d101316YqfJ [Bacillus subtilis]8572993
 82549156054gi|153746mannitol-phosphate dehydrogenase [Streptococcus mutans]85681140 
 8315 14690 15793 gi|143371phosphoribosyl aminoimidazole synthetase (PUR-M) [Bacillus85691104 
subtilis]
 87214172388gi|1184967ScrR [Streptococcus mutans]8569972
108326663154gi|153566ORF (19K protein) [Enterococcus faecalis]8567489
1272 312 692ribosomal protein S13 [Bacillus subtilis]8572381
128315342409gi|1685110tetrahydrofolate dehydrogenase/cyclohydrolase [Streptococcus8571
thermophilus]
1377 29624767gnl|PID|d100347Na+ -ATPase alpha subunit [Enterococcus hirae]85741806 
17022622 709(AB001488) FUNCTION UNKNOWN, SIMILAR PRODUCT IN E.85701914 
COLI, H. INFLUENZAE AND NEISSERIA MENINGITIDIS.
[Bacillus subtilis]
187537604386gi|727436putative 20-kDa protein [Lactococcus lactis]8565627
2332 7281873gi|1163116ORF-5 [Streptococcus pneumoniae]85671146 
2343 9621255gi|2293155(AF008220) YtiA [Bacillus subtilis]8561294
2401 3091931gi|143597CTP synthetase [Bacillus subtilis]85701623 
61 1991521gi|508979GTP-binding protein [Bacillus subtilis]84721323 
 10443753443gnl|PID|e339862putative acylneuraminate lyase [Clostridium tertium]8470933
 141 632093gi|520753DNA topoisomerase I [Bacillus subtilis]84692031 
 19417932593gi|2352484(AF005098) RNAseH II [Lactococcus lactis]8468801
 2017 17720 19687 gnl|PID|d100584cell division protein [Bacillus subtilis]84711968
 2228 21723  20884 gi|299163alanine dehydrogenase [Bacillus subtilis]8468840
 3010 77306792gnl|PID|d100296fructokinase [Streptococcus mutans]8475939
 33956505300gi|147194phnA protein [Escherichia coli]8471351
 3622 21551 20772  gi|310631ATP binding protein [Streptococcus gordonii]8472780
 48428372505gi|8826096-phospho-beta-glucosidase [Escherichia coli]8469333
 581 411516gi|450849amylase [Streptococcus bovis]84731476 
 5910 67157116gi|951053ORF10, putative [Streptococcus pneumoniae]8474402
 621 21 644gi|806487ORF211, putative [Lactococcus lactis]8466624
 6517 77798207ribosomal protein L18 [Bacillus subtilis]8473429
 6521 950710397  gi|44073SecY protein [Lactococcus lactis]8468891
106454742262gnl|PID|e199387carbamoyl-phosphate synthase [Lactobacillus plantarum]84733213
1591 47  4gi|806487ORF211; putative [Lactococcus lactis]8463144
163446905910gi|2293164(AF008220) SAM synthase [Bacillus subtilis]84691221
1921 461308gi|495046tripeptidase [Lactococcus lactis]84731263 
3481 671  6gi|1787753(AE000245) f346; 70 pct identical to 336 amino acids of ADH1_ZYMMO SW; F20368 but has 10 additional N-ter residues [Escherichia coli]8471666
3415721375gi|143766 (thrSv) (EC 6.1.1.3) [Bacillus subtilis]83652004 
9638933417gnl|PID|d100576single strand DNA binding protein [Bacillus subtilis]8368477
 1715 74268457gi|520738comA protein [Streptococcus pneumoniae]83661032
 2012 13860 gnl|PID|d100583unknown [Bacillus subtilis]8361285
 23433582606gi|1788294(AE000290) o238; This 238 aa orf is 40 pct identical (5 gaps) to 231 residues of an approx. 248 aa protein YEBC_ECOLI SW; P24237 [Escherichia coli]8374753
 286330 43005gi|1573659H. influenzae predicted coding region HI0659 [Haemophilus influenzae]8357300
 35751083867hypothetical nucleotide binding protein [Acholeplasma laidlawii]83631242 
 5519 17932 17528 gi|537085ORF_f141 [Escherichia coli]8359405
 5520 18539 17919  gi|496558orfx [Bacillus subtilis]8369621
 65627953142gi|1165308L22 [Bacillus subtilis]8364348
 68668776683gi|1213494immunoglobulin A1 protease [Streptococcus pneumoniae]8354195
 8715 15112 14771gnl|PID|e323522putative rpoZ protein [Bacillus subtilis]8354342
 9612 89639631gi|473945-oxoprolyl-peptidase [Streptococcus pyogenes]8373669
 981  3 263gi|1183885glutamine-binding subunit [Bacillus subtilis]8355261
120471705233gi|310630zinc metalloprotease [Streptococcus gordonii]83721938
127729984347M. jannaschii predicted coding region MJ1665 [Methanococcus jannaschii]8372
1371 440gi|472918v-type Na-ATPase [Enterococcus hirae]8360438
160634664356gi|1773265ATPase, gamma subunit [Streptococcus mutans]8367891
214422782964gi|663279transposase [Streptococcus pneumoniae]8372687
226323672020gi|142154thioredoxin [Synechococcus PCC6301]8358348
3031  31049gi|40046phosphoglucose isomerase A (AA 1-449) [Bacillus stearothermophilus]83671047
303211551931gi|289282glutamyl-tRNA synthetase [Bacillus subtilis]8367777
617 15370 14318 gi|633147ribose-phosphate pyrophosphokinase [Bacillus caldolyticus]82641053
71 299 96gi|143648ribosomal protein L28 [Bacillus subtilis]8269204
9314791090gi|385178unknown [Bacillus subtilis]8246390
9742133899gnl|PID|d100576ribosomal protein S6 [Bacillus subtilis]8260315
 12646883942gnl|PID|d100571unknown [Bacillus subtilis]8268747
 2217 13422 14837 gi|520754putative [Bacillus subtilis]82691416
 2218 14897  15658 gnl|PID|d101929uridine monophosphate kinase [Synechocystis sp.]8262762
 3316 11471 10641 gnl|PID|d101190ORF4 [Streptococcus mutans]8268831
 35974006255gi|1881543UDP-N-acetylglucosamine-2-epimerase [Streptococcus pneumoniae]82681146
 4010 80037533gi|1173519riboflavin synthase beta subunit [Actinobacillus pleuropneumoniae]8268471
 4832 23159  23437 gi|1930092outer membrane protein [Campylobacter jejuni]8261279
 5214 13833 14765 gi|142521deoxyribodipyrimidine photolyase [Bacillus subtilis]8261933
 60447371849gnl|PID|d102221(AB001610) uvrA [Deinococcus radiodurans]82662889
 62421311457gi|2246749(AF009622) thioredoxin reductase [Listeria monocytogenes]8263675
 7111 16586 17518 gnl|PID|e322063ss-1,4-galactosyltransferase [Streptococcus pneumoniae]8260933
 7313 92227837gnl|PID|d100586unknown [Bacillus subtilis]82651386 
 741  13771gnl|PID|d101199alkaline amylopullulanase [Bacillus sp.]82683771 
 83936963983gnl|PID|e305362unnamed protein product [Streptococcus thermophilus]8252288
 8613 10776  9394gi|6835835-enolpyruvylshikimate-3-phosphate synthase [Lactococcus lactis]82671383
 8912 82959752gi|40025homologous to E. coli 50K [Bacillus subtilis]82661458 
115910347 8812gnl|PID|d102090(AV003927) phospho-beta-galactosidase [Lactobacillus gasseri]82741536 
1181  11332gnl|PID|d100579seryl-tRNA synthetase [Bacillus subtilis]82711332 
151146576246type I site-specific deoxyribonuclease (EC 3.1.21.3) CfrA chain S - Citrobacter freundii82661590
173641 833503gi|2313836(AE000584) conserved hypothetical protein [Helicobacter pylori]8268681
17712 54817442gnl|PID|d101999(AV001341) NcrB [Escherichia coli]82581962 
1932 178 576ribosomal protein S9 - Bacillus stearothermophilus8270399
2452 258 845EcoA type I restriction-modification enzyme S subunit [Escherichia coli]8268588
9534003146gnl|PID|d100576ribosomal protein S18 [Bacillus subtilis]8166255
 16774848413gi|1100074tryptophanyl-tRNA synthetase [Clostridium longisporum]8170930
 2011 10308 13820 gnl|PID|d100583transcription-repair coupling factor [Bacillus subtilis]81633513
 38212321606gi|2058543putative DNA binding protein [Streptococcus gordonii]8163375
 45230611751gi|460259enolase [Bacillus subtilis]81671311 
 461  21267gi|431231uracil permease [Bacillus caldolyticus]81611266
 48324531440 gnl|PID|d100453Mannosephosphate Isomerase [Streptococcus mutans]81701014 
 5421106 336gi|154752transport protein [Agrobacterium tumefaciens]8164771
 6522 10306 10821 gi|44073SecY protein [Lactococcus lactis]8166516
 89438742603gi|556886Serine hydroxymethyltransferase [Bacillus subtilis]81691272
 9916 19126  18929 gi|2313526(AE000557) H. pylori predicted coding region HP0411 [Helicobacter pylori]8175
106783737822gnl|PID|e199384pyrR [Lactobacillus plantarum]8161552
108650546877 gi|1469939group B oligopeptidase PepB [Streptococcus agalactiae]81661824
11315 15899  18283 pir|S09411|S094spoIIIE protein - Bacillus subtilis81652385 
128533593634orf1091 [Streptococcus thermophilus]8169276
1511 8303211gi|304896EcoE type I restriction-modification enzyme R subunit [Escherichia coli]81592382
15911 67227837gi|2239288GMP synthetase [Bacillus subtilis]81691116
1701 739 458gnl|PID|d102006(AB001488) FUNCTION UNKNOWN [Bacillus subtilis]8155282
19121759 893gi|149522tryptophan synthase alpha subunit [Lactococcus lactis]8165867
214322901994gi|157587reverse transcriptase endonuclease [Drosophila virilis]8143297
217444154008gi|466473cellobiose phosphotransferase enzyme II′ [Bacillus stearothermophilus]8159408
2622 569 868gi|153675tagatose 6-P kinase [Streptococcus mutans]8168300
2991 663  4gnl|PID|e301154StySKI methylase [Salmonella enterica]8160660
3662 376 83gi|149521tryptophan synthase beta subunit [Lactococcus lactis]8165294
 1210 87669242DNA/pantothenate metabolism flavoprotein [Streptococcus mutans]8064477
 1711 60505748unnamed protein product [Streptococcus thermophilus]8067303
 1716 84559066 gi|703126leucocin A translocator [Leuconostoc gelidum]8059612
 18324401613 gi|1591672phosphate transport system ATP-binding protein [Methanococcus jannaschii]8058
 273 42481579gi|452309valyl-tRNA synthetase [Bacillus subtilis]80692670 
 28736713288gi|1573660H. influenzae predicted coding region HI0660 [Haemophilus influenzae]8063384
 322 9021933gnl|PID|e264499dihydroorotate dehydrogenase B [Lactococcus lactis]80661032
 391  1266gnl|PID|e2340478hom [Lactococcus lactis]80631266 
 52543633593ATP-binding subunit [Bacillus subtilis]8057771
 54545504744gi|2198820(AF004225) Cux/CDP(1B1); Cus/CDP homeoprotein [Mus musculus]8060195
 5911 71097486gi|951052ORF9, putative [Streptococcus pneumoniae]8068378
 65312301550ribosomal protein L23 - Bacillus stearothermophilus8069321
 6512 51745503pir|A02819|R5BSribosomal protein L24 - Bacillus stearothermophilus8070330
 669988410687 gi|2313836(AE000584) conserved hypothetical protein [Helicobacter pylori]8066804
 822 6482438gi|622991mannitol transport protein [Bacillus stearothermophilus]80651791
 851 950  630gi|528995polyketide synthase [Bacillus subtilis]8046321
 89868705779gi|853776peptide chain release factor 1 [Bacillus subtilis]80631092 
 9312 87187438 gnl|PID|d101959hypothetical protein [Synechocystis sp.]80601281 
106568545751gnl|PID|e199386glutaminase of carbamoyl-phosphate synthase [Lactobacillus plantarum]8065
109221601450gi|40056phoP gene product [Bacillus subtilis]8059711
124942463953gnl|PID|d10225430S ribosomal protein S16 [Bacillus subtilis]8065294
128851486428gi|2281308phosphopentomutase [Lactococcus lactis cremoris]80661281
33719 12665 11376 gi|159109NADP-dependent glutamate dehydrogenase [Giardia intestinalis]80681290
14019 19699 gi|517210putative transposase [Streptococcus pyogenes]8070243
15822474 984gi|1877423galactose-1-P-uridyl transferase [Streptococcus mutans]80651491 
17110 74747728gi|397800cyclophilin C-associated protein [Mus musculus]8060255
1811  2 619gi|149395lacC [Lactococcus lactis]8066618
3131 27 539gi|143467ribosomal protein S4 [Bacillus subtilis]8080513
32921652 858gi|533080RecF protein [Streptococcus pyogenes]8063795
3711  2 958gi|442360ClpC adenosine triphosphatase [Bacillus subtilis]8058957
8743125580gi|149435putative [Lactococcus lactis]79641269
 2311175 135gi|1542975AbcB [Thermoanaerobacterium thermosulfurigenes]79611041
 3314 92448201gnl|PID|e253891UDP-glucose 4-epimerase [Bacillus subtilis]79621044
 36312422633gnl|PID|e324218ftsA [Enterococcus hirae]79581392 
 3813 71558378gi|405134acetate kinase [Bacillus subtilis]79581224 
 55790118229gi|1146234dihydrodipicolinate reductase [Bacillus subtilis]7956783
 6519 86618915gi|2078380ribosomal protein L30 [Staphylococcus aureus]7968255
 69436782128gnl|PID|e311452unknown [Bacillus subtilis]79641551
 69978817279gi|677850hypothetical protein [Staphylococcus aureus]7959603
 7210 84919783hypothetical protein [Synechocystis sp.]79621293 
 80329067300gi|143342polymerase III [Bacillus subtilis]79654395
 8214 13326  15689 gnl|PID|e255093hypothetical protein [Bacillus subtilis]79652364 
 8613 12233  11118 gi|683582prephenate dehydrogenase [Lactococcus lactis]79581116 
 923 9401734gi|537286triosephosphate isomerase [Lactococcus lactis]7965795
 98640234742gnl|PID|d100262LivG protein [Salmonella typhimurium]7963720
 9912 16315 14150 gi|153736a-galactosidase [Streptococcus mutans]79642166
107756846406gi|460080D-alanine:D-alanine ligase-related protein [Enterococcus faecalis]7958723
113968588303gi|466882ppsl; B1496_C2_189 [Mycobacterium leprae]79641446
15110 13424 12213 gi|4506863-phosphoglycerate kinase [Thermotoga maritima]79601212
162211583017CapD [Staphylococcus aureus]79671860 
177528763052gi|912423putative [Lactococcus lactis]7961177
177841984563gi|149429putative [Lactococcus lactis]7961366
187327282907gnl|PID|d102002(AB001488) FUNCTION UNKNOWN [Bacillus subtilis]7953180
189735894350gnl|PID|e183449putative ATP-binding protein of ABC-type [Bacillus subtilis]7961762
191542493449gi|149519indoleglycerol phosphate synthase [Lactococcus lactis]7966801
211318052737gi|147404mannose permease subunit II-M-Man [Escherichia coli]7957933
212338633621gnl|PID|e209004glutaredoxin-like protein [Lactococcus lactis]7958243
2151 987 715gi|1183242(AF008220) arginine succinate synthase [Bacillus subtilis]7964273
3232 530 78130S ribosomal protein [Pediococcus acidilactici]7967252
3801 694  2gi|1184680polynucleotide phosphorylase [Bacillus subtilis]7964693
3842 655 239phoP protein (put.); putative [Bacillus subtilis]7959417
6328204091gi|853767UDP-N-acetylglucosamine 1-carboxyvinyltransferase [Bacillus subtilis]78621272
81 501786gi|149432putative [Lactococcus lactis]78631737
91 351 124gi|897793y98 gene product [Pediococcus acidilactici]7859228
 15873648314gnl|PID|d100585cystein synthetase [Bacillus subtilis]7863951
 2010 978310310  gnl|PID|d100583stage V sporulation [Bacillus subtilis]7858573
 2016 17165 17713 gi|49105hypoxanthine phosphoribosyltransferase [Lactococcus lactis]7859549
 2222 17388 18416 gnl|PID|d101315Ygfe [Bacillus subtilis]78601029
 2227 20971  20612 gi|299163alanine dehydrogenase [Bacillus subtilis]7859360
 34874077105gi|41015aspartate-tRNA ligase [Escherichia coli]7855303
 35862575196gi|1657644Cap8E [Staphylococcus aureus]78601062
 4011 92878001gi|1173518GTP cyclohydrase II 3,4-dihydroxy-2-butanone-4-phosphate synthase [Actinobacillus pleuropneumoniae]78581287
 4831 22422 23183 gi|2314330(AE000623) glutamine ABC transporter, ATP-binding protein (glnQ) [Helicobacter pylori]7858762
 5222 1011430gi|1183887integral membrane protein [Bacillus subtilis]7854672
 5514 13605 12712 gnl|PID|d102026(AB002150) YbbP [Bacillus subtilis]7858894
 5517 16637 15612 gnl|PID|e313027hypothetical protein [Bacillus subtilis]78511026
 7114 19756  19598 gi|179764calcium channel alpha-1D subunit [Homo sapiens]7857159
 7411 15031 14018  gi|1573279Holliday junction DNA helicase (ruvB) [Haemophilus influenzae]78571014
 75966237972gi|1877423galactose-1-P-uridyl transferase [Streptococcus mutans]78621350 
 8112 12125 13906 gi|1573607L-fucose isomerase (fucI) [Haemophilus influenzae]78661782
 82324234417gi|153744ORF X; putative [Streptococcus mutans]78641995 
 8318 16926 18500 gi|143373phosphoribosyl aminoimidazole carboxy formyl formyltransferase/inosine monophosphate cyclohydrolase (PUR-H(J)) [Bacillus subtilis]78631575
 8320 20212 20775 gi|143364phosphoribosyl aminoimidazole carboxylase I (PUR-E) [Bacillus subtilis]7864564
 922 165 878gnl|PID|d101190ORF2 [Streptococcus mutans]7862714
 98858636909gi|2331287(AF013188) release factor 2 [Bacillus subtilis]78631047
113310712741dnaZX [Bacillus subtilis]78641671 
127411332071RNA polymerase alpha-core-subunit [Bacillus subtilis]7859939
13212782 497gi|1561763pullulanase [Bacteroides thetaiotaomicron]78582286
135426983537gi|1788036(AE000269) NH3-dependent NAD synthetase [Escherichia coli]7866840
14024 26853 25423 gi|1100077phospho-beta-glucosidase [Clostridium longisporum]78641431 
150546904514gi|149464amino peptidase [Lactococcus lactis]7842177
1521  1 795gi|639915NADH dehydrogenase subunit [Thunbergia alata]7843795
162449974110gnl|PID|e323528putative YhaP protein [Bacillus subtilis]7864888
18110 86517947lactose repressor (lacR; alt.) [Lactococcus lactis]7848705
200436274958gnl|PID|d100172invertase [Zymomonas mobilis]78611332
203332303015CycK [Pseudomonas fluorescens]7857216
210967897172ORF6 gene product [Bacillus subtilis]7842384
214638102797gnl|PID|d102049P. haemolytica o-sialoglycoprotein endopeptidase; P36174 (660) transmembrane [Bacillus subtilis]78601014
21413 8163gi|1377831unknown [Bacillus subtilis]78621842 
2171  92717gi|488430alcohol dehydrogenase 2 [Entamoeba histolytica]78642709 
222323163098gi|15733047spore germination and vegetative growth protein (gerC2) [Haemophilus influenzae]7865783
2681 742  8gi|517210putative transposase [Streptococcus pyogenes]7865735
2761 223 753ribosomal protein L1 [Bacillus subtilis]7865531
312315671079gi|289261comE ORF2 [Bacillus subtilis]7854489
3391 117 794CadD [Staphylococcus aureus]7853678
3422 762 265gi|1842439phosphatidylglycerophosphate synthase [Bacillus subtilis]7859498
3831 737  3gi|1184680polynucleotide phosphorylase [Bacillus subtilis]7864735
715 11923 11018 gi|1399855carboxyltransferase beta subunit [Synechococcus PCC7942]7763906
8216982255gi|149433putative [Lactococcus lactis]7759558
 1714 69487550comA protein [Streptococcus pneumoniae]7760603
 3012 97618967gi|1000451TreP [Bacillus subtilis]7743795
 3614 11421 12131 gi|1573766phosphoglyceromutase (gpmA) [Haemophilus influenzae]7764711
 55338364096YeaB [Bacillus subtilis]7755261
 61883778054gi|1890649multidrug resistance protein LmrA [Lactococcus lactis]7751324
 652 6071254gi|40103ribosomal protein L4 [Bacillus stearothermophilus]7763648
 68875097240gi|47551MRP [Streptococcus suis]7768270
 6911083 118gnl|PID|e311493unknown [Bacillus subtilis]7757966
 77545834026gnl|PID|e281578hypothetical 12.2 kd protein [Bacillus subtilis]7760558
 8314 13104 14552 gi|1590947amidophosphoribosyltransferase [Methanococcus jannaschii]77561449
 94430065444 gnl|PID|e329895(AJ000496) cyclic nucleotide-gated channel beta subunit [Rattus norvegicus]77662439
 9611 85188880gi|551879ORF 1 [Lactococcus lactis]7762363
 9911 14082 12799 gi|153737sugar-binding protein [Streptococcus mutans]77611284
1062 3611176LicD protein [Haemophilus influenzae]7751816
108431524030gi|1574730tellurite resistance protein (tehB) [Haemophilus influenzae]7758879
118435203131gi|1573900D-alanine permease (dagA) [Haemophilus influenzae]7757390
124417961071gi|1573162tRNA (guanine-N1)-methyltransferase (trmD) [Haemophilus influenzae]7758
1264590 94614gnl|PID|d101163Srb [Bacillus subtilis]77621296 
1282 6301373gnl|PID|d101328YqiZ [Bacillus subtilis]7758744
1301  11287gnl|PID|e325013hypothetical protein [Bacillus subtilis]77611287
139543883639(AF008220) YtqA [Bacillus subtilis]7759750
14011 10931 9582gi|289284cysteinyl-tRNA synthetase [Bacillus subtilis]77641350 
14018 19451 19263 gi|517210putative transposase [Streptococcus pyogenes]7766189
1412 9761683gnl|PID|e157887URF5 (aa 1-573) [Drosophila yakuba]7750708
141427355293gi|556258secA [Listeria monocytogenes]77592559
1442 6712173gnl|PID|d100585lysyl-tRNA synthetase [Bacillus subtilis]77611503
163564127398dihydroorotate dehydrogenase A [Lactococcus lactis]7762987
16410 78417074gnl|PID|d100964homologue of iron dicitrate transport ATP-binding protein FecE of E. coli [Bacillus subtilis]7752768
19187 2575791gi|149516anthranilate synthase alpha subunit [Lactococcus lactis]77571467 
198853775177gi|1573856hypothetical [Haemophilus influenzae]7766201
2131 202 462gi|1743860Brac2 [Mus musculus]7750261
2502 231 509YlbH protein [Bacillus subtilis]7760279
289317371276gnl|PID|d100947Ribosomal Protein L10 [Bacillus subtilis]7762462
29221399 668gi|143004transfer RNA-Gln synthetase [Bacillus stearothermophilus]7758732
7327341166gnl|PID|d101824peptide-chain-release factor 3 [Synechocystis sp.]76531569
723 18474 18235 gi|455157acyl carrier protein [Cryptomonas phi]7657240
9857064342gi|1146247asparaginyl-tRNA synthetase [Bacillus subtilis]76611365 
 10545314385gnl|PID|e314495hypothetical protein [Clostridium perfringens]7653147
 1821615 842gi|1591672phosphate transport system ATP-binding protein [Methanococcus jannaschii]7656
 2237  27796 28173 gnl|PID|e13389translation initiation factor IF3 (AA 1-172) [Bacillus stearothermophilus]7664378
 3562682gi|1773346Cap5G [Staphylococcus aureus]76611188 
 4828 21113 21787 gi|2314328(AE000623) glutamine ABC transporter, permease protein (glnP) [Helicobacter pylori]7652675
 5212 13786 gi|142521deoxyribodipyrimidine photolyase [Bacillus subtilis]7658906
 5510 11521 10571 gnl|PID|e283110femD [Staphylococcus aureus]7661951
 57878246559gi|290561o188 [Escherichia coli]76471266
 62524062095gnl|PID|e313024hypothetical protein [Bacillus subtilis]7659312
 65942234441gi|40148L29 protein (AA 1-66) [Bacillus subtilis]7658219
 68213282371gnl|PID|e284233anabolic ornithine carbamoyltransferase [Lactobacillus plantarum]76611044 
 69872976005gnl|PID|d101420Pyrimidine nucleoside phosphorylase [Bacillus stearothermophilus]76611293
 7312 78397267gnl|PID|e243629unknown [Mycobacterium tuberculosis]7653573
 74584337039gnl|PID|d102048C. thermocellum beta-glucosidase; P2208 (985) [Bacillus subtilis]76601395 
 80576437936gi|2314030(AE000599) conserved hypothetical protein [Helicobacter pylori]7661294
 8215 16019 16996 gi|1573900D-alanine permease (dagA) [Haemophilus influenzae]7656978
 8319 18616 19884 gi|143374phosphoribosyl glycinamide synthetase (PUR-D; gtg start codon) [Bacillus subtilis]76601269
 8614 13409 12231 gi|143806Aro F [Bacillus subtilis]76581179 
 871  31442gi|153804sucrose-6-phosphate hydrolase [Streptococcus mutans]76591440
 8716 15754 15110 gnl|PID|e323500putative Gmk protein [Bacillus subtilis]7656645
 93417691539gi|15748201,4-alpha-glucan branching enzyme (glgB) [Haemophilus influenzae]7646231
 941 51 365gi|1443136.0 kd ORF [Plasmid ColE1]7673315
116221511678gi|153841pneumococcal surface protein A [Streptococcus pneumoniae]7659474
123634425895gi|1314297ClpC ATPase [Listeria monocytogenes]76592454
126221562932gnl|PID|d101328YqiZ [Bacillus subtilis]7661777
12810 69737797purine nucleoside phosphorylase [Bacillus subtilis]7660825
13111 61865812(AE000058) Mycoplasma pneumoniae, MG085 homolog, from M. genitalium [Mycoplasma pneumoniae]7647375
139436413192gi|2293302(AF008220) YtgA [Bacillus subtilis]7653450
14014 14872 12536 gi|1184680polynucleotide phosphorylase [Bacillus subtilis]76622337
143225833905transfer RNA-Tyr synthetase [Bacillus subtilis]76611323 
170650956114ycgQ [Bacillus subtilis]76441020 
18021927 557gi|40019ORF 821 (aa 1-821) [Bacillus subtilis]76531371 
191758155228anthranilate synthase beta subunit [Lactococcus lactis]7661588
195338292444gi|2149905D-glutamic acid adding enzyme [Enterococcus faecalis]76601386
200319143629lysis protein [Bacillus subtilis]76581716 
2011 431 207gi|2208998dextran glucosidase DexS [Streptococcus suis]7657225
214212832380gi|553278transposase [Streptococcus pneumoniae]76551098
225323383411gi|1552775ATP-binding protein [Escherichia coli]76561074 
2331  2 724gi|1163115neuraminidase B [Streptococcus pneumoniae]7660723
3471 523 38gi|537033ORF_f356 [Escherichia coli]7660486
3562 842 165gi|2149905D-glutamic acid adding enzyme [Enterococcus faecalis]7661678
3663 734 348phosphoribosyl anthranilate isomerase [Lactococcus lactis]7669387
5812599 11484 gi|1574293fimbrial transcription regulation repressor (pilB) [Haemophilus influenzae]75611116
613 12553 11894 gnl|PID|d102050ydiH [Bacillus subtilis]7551660
910 72826062gi|142538aspartate aminotransferase [Bacillus sp.]75551221
 1012 80807940gi|149493SCRFI methylase [Lactococcus lactis]7556141
 18542663301gnl|PID|d101319YqgH [Bacillus subtilis]7552966
 22418382728gi|1373157orf-X; hypothetical protein; Method: conceptual translation supplied by author [Bacillus subtilis]7562891
 3011 90157828gi|153801enzyme scr-II [Streptococcus mutans]75641188 
 31523622030(AF008220) putative thioredoxin [Bacillus subtilis]7553333
 32974848359gnl|PID|d100560formamidopyrimidine-DNA glycosylase [Streptococcus mutans]7561876
 33417351448gi|413976ipa-52r gene product [Bacillus subtilis]7553288
 3310 64705769gi|533105unknown [Bacillus subtilis]7556702
 3312 68787183pir|A00205|FECLferredoxin [4Fe-4S] - Clostridium thermaceticum7556306
 361 181  2gi|2088739(AF003141) strong similarity to the FABP/P2/CRBP/CRABP family of transporters [Caenorhabditis elegans]7543180
 3822  14510 15379 gi|1574058hypothetical [Haemophilus influenzae]7556870
 4833 23398 24066 gi|1930092outer membrane protein [Campylobacter jejuni]7556669
 511  2 319gi|43985nifS-like gene [Lactobacillus delbrueckii]7555318
 5110 831811683  gi|537192CG Site No. 620; alternate gene names hs, hsp, hsr, rm; apparent frameshift in GenBank Accession Number X06545 [Escherichia coli]75503366
 5418 19566 20759 gi|666069orf2 gene product [Lactobacillus leichmannii]75581194 
 57984487822gi|290561o188 [Escherichia coli]7550627
 6514 60726356gi|60624130S ribosomal subunit protein S14 [Escherichia coli]7564285
 70430712472gi|1256617adenine phosphoribosyltransferase [Bacillus subtilis]7557600
 7124 30399 29404 gi|1574390C4-dicarboxylate transport protein [Haemophilus influenzae]7557996
 732 910 455gnl|PID|e249656YneT [Bacillus subtilis]7557456
 7911810 49128.2% of identity to the Escherichia coli GTP-binding protein Era; putative [Bacillus subtilis]75591320
 82563606536gi|1655715BztD [Rhodobacter capsulatus]7555177
 83619382975putative PlsX protein [Bacillus subtilis]75561038 
 9311 73685317 gi|39989methionyl-tRNA synthetase [Bacillus stearothermophilus]75582052
 9313 94098699gi|1591493glutamine transport ATP-binding protein Q [Methanococcus jannaschii]7554
 951 1795 47gnl|PID|e323510Ylov protein [Bacillus subtilis]75571749 
1032 3621186gnl|PID|e266928unknown [Mycobacterium tuberculosis]7564825
1041 691 915gi|460026repressor protein [Streptococcus pneumoniae]7554225
113529513883gnl|PID|d101119ABC transporter subunit [Synechocystis sp.]7555933
1211 3201390gi|2145131repressor of class I heat shock gene expression HrcA [Streptococcus mutans]7558
127626 143000gi|1500451M. jannaschii predicted coding region MJ1558 [Methanococcus jannaschii]7544
13718 10687 gi|393116P-glycoprotein 5 [Entamoeba histolytica]7552606
14911 84999338gnl|PID|d100582unknown [Bacillus subtilis]7555840
151691007673gi|40467HsdS polypeptide, part of CfrA family [Citrobacter freundii]75571428
1581 986  3gnl|PID|e253891UDP-glucose 4-epimerase [Bacillus subtilis]7563984
172856536774gi|142978glycerol dehydrogenase [Bacillus stearothermophilus]75561122
17297139 9730gnl|PID|e268456unknown [Mycobacterium tuberculosis]75582592
1731 261 79gnl|PID|e236469C10C5.6 [Caenorhabditis elegans]7550183
185330662014gi|1574806spermidine/putrescine transport ATP-binding protein (potA) [Haemophilus influenzae]75561053
191652354213gi|149518phosphoribosyl anthranilate transferase [Lactococcus lactis]75611023 
226217741181gi|2314588(AE000642) conserved hypothetical protein [Helicobacter pylori]7565594
2311  1 153gi|40173homolog of E. coli ribosomal protein L21 [Bacillus subtilis]7557153
2341  2 418gi|2293259(AF008220) YtqI [Bacillus subtilis]7559417
2791 552 151unknown protein [Bacillus subtilis]7550402
291735583827gi|40011ORF17 (AA 1-161) [Bacillus subtilis]7548270
3752 137628 gi|410137ORFX13 [Bacillus subtilis]7558492
620 16721 17560 gi|2293323(AF008220) YtdI [Bacillus subtilis]7453840
7646826052gi|1354211PET112-like protein [Bacillus subtilis]74601371 
 18433412427gnl|PID|d101319YqgI [Bacillus subtilis]7454915
 21658854800gi|1072381glutamyl-aminopeptidase [Lactococcus lactis]74591086 
 242 739 548gi|2314762(AE000655) ABC transporter, permease protein (yaeE) [Helicobacter pylori]7446
 251  2 367gnl|PID|d100932H2O-forming NADH Oxidase [Streptococcus mutans]7463366
 3818 11432 12964 gi|537034ORF_o488 [Escherichia coli]74571533
 4810 89246669gi|1513069P-type adenosine triphosphatase [Listeria monocytogenes]74532256
 5511 11964 11401 gnl|PID|e283110femD [Staphylococcus aureus]7464564
 6121782 427gi|2293216(AF008220) putative UDP-N-acetylmuramate-alanine ligase [Bacillus subtilis]74551356
 7610 94148065gnl|PID|d101325YaiB [Bacillus subtilis]74541350 
 832 666 926 pir|C33496|C334hisC homolog - Bacillus subtilis7455261
 86989858080 gi|683585prephenate dehydratase [Lactococcus lactis]7455906
102550055652gi|143394OMP-PRPP transferase [Bacillus subtilis]7457648
103543643267gnl|PID|e323524YloN protein [Bacillus subtilis]74621098
108768647592methyltransferase [Lactococcus lactis]7456729
1312 478 146gnl|PID|d101320YqgZ [Bacillus subtilis]7445333
13321380 919gnl|PID|e313025hypothetical protein [Bacillus subtilis]7460462
137961676787gnl|PID|d100479Na+ -ATPase subunit D [Enterococcus hirae]7453621
149430083883gnl|PID|d100581high level kasugamycin resistance [Bacillus subtilis]7455876
1572 243 824methylated-DNA--protein-cysteine methyltransferase (dat1) [Haemophilus influenzae]7448582
164635154249gi|410131ORFX7 [Bacillus subtilis]7448735
167754465201gi|413927ipa-3r gene product [Bacillus subtilis]7455246
1711  11818gnl|PID|d102251beta-galactosidase [Bacillus circulans]74621818
172410642392 gi|466474cellobiose phosphotransferase enzyme II″ [Bacillus stearothermophilus]74501329
1851 326  3gi|1573646Mg(2+) transport ATPase protein C (mgtC) (SP:P22037) [Haemophilus influenzae]7468324
188210892018gi|1573008ATP dependent translocator homolog (msbA) [Haemophilus influenzae]7444930
18911  64917174gi|1661199sakacin A production response regulator [Streptococcus mutans]7460684
2102 5201287gi|2293207(AF008220) YtmQ [Bacillus subtilis]7460768
2611 836 192putative ATP binding subunit [Bacillus subtilis]7455645
263316193655gi|663232Similarity with S. cerevisiae hypothetical 137.7 kD protein in subtelomeric Y′ repeat region [Saccharomyces cerevisiae]74422037
2652 8441227gi|49272Asparaginase [Bacillus licheniformis]7464384
3681  1 942gi|603998unknown [Saccharomyces cerevisiae]7439942
716 13357 11921 gnl|PID|d101324YqhX [Bacillus subtilis]73571437
 1710 57065449 gnl|PID|e305362unnamed protein product [Streptococcus thermophilus]7347258
 312 522 244 gnl|PID|d100576single strand DNA binding protein [Bacillus subtilis]7355279
 32656676194gnl|PID|d101315YqfG [Bacillus subtilis]7358528
 3415 10281 9790 gnl|PID|d102151(AB001684) ORF42c [Chlorella vulgaris]7346492
 4012 98769226gi|1173517riboflavin synthase alpha subunit [Actinobacillus pleuropneumoniae]7355651
 5523592 839gnl|PID|d101887cation-transporting ATPase PacL [Synechocystis sp.]73602754
 5518 17494 16586 unknown [Mycobacterium tuberculosis]7352909
 6516 72137767 gi|143419ribosomal protein L6 [Bacillus stearothermophilus]7360555
 66333003659gnl|PID|e269883LacF [Lactobacillus casei]7352360
 7010 55575733envelope protein [Human immunodeficiency virus type 1]7360177
 71461338262gnl|PID|e322063ss-1,4-galactosyltransferase [Streptococcus pneumoniae]73452130
 721  3 851gi|1183177(AF008220) transporter [Bacillus subtilis]7350849
 76770196195gnl|PID|d101325YqiF [Bacillus subtilis]7366825
 7612 10009 9533 gi|1573086uridine kinase (uridine monophosphokinase) (udk) [Haemophilus influenzae]7354477
 80781139372gi|1377823aminopeptidase [Bacillus subtilis]73601260
 97533891668gnl|PID|d101954dihydroxyacid dehydratase [Synechocystis sp.]73541722
 98969127619gnl|PID|e314991FtsE [Mycobacterium tuberculosis]7354708
10811 10928 10440 gi|388109regulatory protein [Enterococcus faecalis]7354489
128636324222gi|1685111orf1091 [Streptococcus thermophilus]7363591
13821575 394gi|147326transport protein [Escherichia coli]73601182 
14013 12538 11903 pir|E53402|E534serine O-acetyltransferase (EC 2.3.1.30) - Bacillus stearothermophilus7355636
162557014991gnl|PID|e323511putative YhaQ protein [Bacillus subtilis]7350711
164423232790gi|1592076hypothetical protein (SP:P25768) [Methanococcus jannaschii]7352468
164848155546gi|410137ORFX13 [Bacillus subtilis]7356732
170543945302gnl|PID|d100959homologue of unidentified protein of E. coli [Bacillus subtilis]7346909
178738934855gi|46242nodulation protein B, 5′end [Rhizobium loti]7356963
204650964278gnl|PID|e214719PlcR protein [Bacillus thuringiensis]7341819
2132 8322037gi|156296ribosomal protein S1 homolog; sequence specific DNA-binding protein [Leuconostoc lactis]73551206
2312 84 287gi|40173homolog of E. coli ribosomal protein L21 [Bacillus subtilis]7361204
2371  2 505gi|1773151adenine phosphoribosyltransferase [Escherichia coli]7351504
2691  2 691gnl|PID|d101328YqiX [Bacillus subtilis]7336690
28921272 832pir|A02771|R7MCribosomal protein L/L12 - Micrococcus luteus7366441
3431 14 484gi|1788125(AE000276) hypothetical 30.4 kD protein in manZ-cpsC intergenic region [Escherichia coli]7347471
3561 222  4gi|2149905D-glutamic acid adding enzyme [Enterococcus faecalis]7350219
7531654691gnl|PID|d101833amidase [Synechocystis sp.]72521527
7971957647gi|146976nusB [Escherichia coli]7254453
717 13743 13300 gnl|PID|e289141similar to hydroxymyristoyl-(acyl carrier protein) dehydratase [Bacillus subtilis]7259444
 2219 15367 16224 gnl|PID|d1019297251588
 3317 12111 11425 gnl|PID|d101190ORF3 [Streptococcus mutans]7255687
 34771475627gi|396501aspartyl-tRNA synthetase [Thermus thermophilus]72521521
 3823 15372  16085 pir|H64108|H641L-ribulose-phosphate 4-epimerase (araD) homolog - Haemophilus7254714
influenzae (strain Rd KW20)
 3955094 6905gnl|PID|e254877unknown [Mycobacterium tuberculosis]72561812
 40644694636 gi|153672lactose repressor [Streptococcus mutans]7258168
 48214591253gi|310380inhibin beta-A-subunit [Ovis aries]7233207
 4829 21729 22424 gi|2314329(AE000623) glutamine ABC transporter, permease protein (glnP)7249696
[Helicobacter pylori]
 5054 5293288gi|1750108YnbA [Bacillus subtilis]72541242 
 51310442282gi|2293230(AF008220) YtbJ [Bacillus subtilis]72541239 
 5213 13681  13938 gi|142521deoxyribodipyrimidine photolyase [Bacillus subtilis]7245258
 551 841 35gi|882518ORF_o304; GTG start [Escherichia coli]7259807
 75528323191gnl|PID|e209886mercuric resistance operon regulatory protein [Bacillus subtilis]7244360
 76662295771gi|142450ahrC protein [Bacillus subtilis]7253459
 79550654592gi|2293279(AF008220) YtcG [Bacillus subtilis]7246474
 8714 14726 12309 gnl|PID|e323502putative PriA protein [Bacillus subtilis]72522418
 911 444 662 gi|500691MY01 gene product [Saccharomyces cerevisiae]7250219
 91745164764skeletal muscle sodium channel alpha-subunit [Equus caballus]7238249
 95220041717gnl|PID|e323527putative Asp23 protein [Bacillus subtilis]7240288
10911452 118gi|143331alkaline phosphatase regulatory protein [Bacillus subtilis]72521335 
1261  32192gnl|PID|d101831glutamine-binding periplasmic protein [Synechocystis sp.]72462190
130317352478gi|2415396(AF015775) carboxypeptidase [Bacillus subtilis]7253744
137625852929gi|472922v-type Na-ATPase [Enterococcus hirae]7246345
14010 96019203 gi|49224URF 4 [Synechococcus sp.]7248399
146519061247gnl|PID|e324945hypothetical protein [Bacillus subtilis]7245660
147220841083gnl|PID|e325016hypothetical protein [Bacillus subtilis]72561002
147561565146TPP-dependent acetoin dehydrogenase beta-subunit [Clostridium72561011 
magnum]
14885381 6433gi|974332NAD(P)H-dependent dihydroxyacetone-phosphate reductase [Bacillus72541053
subtilis]
14814 10256 9675gnl|PID|d101319YqgN [Bacillus subtilis]7250582
159840054949gi|1788770(AE000330) o463; 24 pct identical (44 gaps) to 338 residues from7243945
penicillin-binding protein 4*, PBPE_BACSU SW; P32959 (451 aa)
[Escherichia coli]
17210 990710620 gi|763387unknown [Saccharomyces cerevisiae]7255714
220328623602gi|1574175hypothetical [Haemophilus influenzae]7250741
2671  3 449gi|290513f470 [Escherichia coli]7248447
2812 899 540gnl|PID|d100964homologue of aspartokinase 2 alpha and beta subunits LysC of7245360
B. subtilis [Bacillus subtilis]
29011018 14gi|474195This ORF is homologous to a 40.0 kd hypothetical protein in the htrB72541005
3′ region from E. coli, Accession Number X61000 [Mycoplasma-like
organism]
3001 63 587gi|746399transcription elongation factor [Escherichia coli]7250525
31611326  4gi|158127protein kinase C [Drosophila melanogaster]72401323
3421 227  3gnl|PID|d101164unknown [Bacillus subtilis]7254225
3541  11005gnl|PID|d102048C. thermocellum beta-glucosidase; P26208 (985) [Bacillus subtilis]72521005 
610 813410467 gnl|PID|e264229unknown [Mycobacterium tuberculosis]71572334
720 16231 15464 gi|180463-oxoacyl-[acyl-carrier protein] reductase [Cuphea lanceolata]7152768
 1511297  2gnl|PID|d100571replicative DNA helicase [Bacillus subtilis]71511296 
 15444353869gi|499384orf189 [Bacillus subtilis]7147567
 18651204218gnl|PID|d101318YqgG [Bacillus subtilis]7151903
 291  1 540gi|17773142similar to the 20.2kd protein in TETB-EXOA region of B. subtilis7156540
[Escherichia coli]
 3820 13327 13830 gi|537036ORF_o158 [Escherichia coli]7148504
 5112 15015 12676  gi|149528dipeptidyl peptidase IV [Lactococcus lactis]71592340 
 5523 21040 20585 gi|2343285(AF015453) surface located protein [Lactobacillus rhamnosus]7158456
 602 705 265gnl|PID|d101320YqgZ [Bacillus subtilis]7144441
 7118 24679 26226 gi|580920rodD (gtaA) polypeptide (AA 1-673) [Bacillus subtilis]71441548
 7125 30587  30360 gi|606028ORF_o414; Geneplot suggests frameshift near start but none found7150228
[Escherichia coli]
 72652396729gi|580835lysine decarboxylase [Bacillus subtilis]71481491
 7214 11991  12878 gi|624085similar to rat beta-alanine synthetase encoded by GenBank Accession7154888
Number S27881; contains ATP/GTP binding motif [Paramecium
bursaria Chlorella virus 1]
 7311 72697033gi|1906594PN1 [Rattus norvegicus]7142237
 74610385 8517gi|1573733prolyl-tRNA synthetase (proS) [Haemophilus influenzae]71521869
 81957726578gi|147404mannose permease subunit II-M-Man [Escherichia coli]7145807
 86546023604gnl|PID|e322063beta-1,4-galactosyltransferase [Streptococcus pneumoniae]7153999
105436194707gi|2323341(AF014460) PepQ [Streptococcus mutans]71581089 
10613 13557 12955 gi|1519287LemA [Listeria monocytogenes]7148603
114210291979 gi|310303mosA [Rhizobium meliloti]7155951
1222 5641205gi|1649037glutamine transport ATP-binding protein GLNQ [Salmonella7150642
typhimurium]
13259018 7063gnl|PID|d102049H. influenzae hypothetical ABC transporter; P44808 (974)71511956 
[Bacillus subtilis]
14011 141 227gi|1673788(AE00015) Mycoplasma pneumonia, fructose-bisphosphate aldolase;7149915
similar to Swiss-Prot Accession Number P13243, from B. subtilis
[Mycoplasma pneumoniae]
140556354973gnl|PID|d100964 homologue of hypothetical protein in a rapamycin synthesis gene7148663
cluster of Streptomyces hygroscopicus [Bacillus subtilis]
141773697845gnl|PID|d102005(AB001488) FUNCTION UNKNOWN, SIMILAR PRODUCT IN7151477
E. COLI AND MYCOPLASMA PNEUMONIAE. [Bacillus subtilis]
1931  1 165gi|46912ribosomal protein L13 [Staphylococcus carnosus]7159165
194322051594gi|535351CodY [Bacillus subtilis]7152612
199315101319gi|2182574(AE000090) Y4pE [Rhizobium sp. NGR234]7145192
208226163752gi|1787378(AE000213) hypothetical protein in purB 5′ region [Escherichia coli]71571137
209220221141gi|41432fepC gene product [Escherichia coli]7146882
210519113071gi|49316ORF2 gene product [Bacillus subtilis]71451161
210630693386ORF3 gene product [Bacillus subtilis]7148318
212235611381gi|557567ribonucleotide reductase R1 subunit [Mycobacterium tuberculosis]71532181
233320032920gnl|PID|d101320YqgR [Bacillus subtilis]7150918
2441 131053gnl|PID|d100964homologue of aspartokinase 2 alpha and beta subunits LysC of71551041
B. subtilis [Bacillus subtilis]
251210081874gi|755601unknown [Bacillus subtilis]7146867
2822 906 712unknown [Rhodobacter capsulatus]7146195
312421371565gnl|PID|d102245(AB005554) yxbF [Bacillus subtilis]7134573
3381  3 683gi|1591045hypothetical protein (SP:P31466) [Methanococcus jannaschii]7148681
3461  3 164gi|1591234hypothetical protein (SP:P42297) [Methanococcus jannaschii]7136162
3741 619  2gi|397526clumping factor [Staphylococcus aureus]7123618
3771 688  2gi|397526clumping factor [Staphylococcus aureus]7123687
3874196958gnl|PID|e269486Unknown [Bacillus subtilis]7042462
310 83959075gnl|PID|e255543putative iron dependant repressor [Staphylococcus epidermidis]7046681
714 11024 10254 gnl|PID|d100290undefined open reading frame [Bacillus stearothermophilus]7055771
718 14213 13719 gnl|PID|d101090biotin carboxyl carrier protein of acetyl-CoA carboxylase7056495
[Synechocystis sp.]
921057 287gnl|PID|d100581unknown [Bacillus subtilis]7052771
 12426101789gnl|PID|d101195yycJ [Bacillus subtilis]7052822
 21225861846gi|2293447(AF008930) ATPase [Bacillus subtilis]7054741
 2213 10955 11512 gi|1165295Ydr540cp [Saccharomyces cerevisiae]7050558
 30643153980ATP binding protein of transport ATPase [Bacillus firmus]7051336
 311 370 113single-stranded DNA binding protein [unidentified eubacterium]7036258
 3315 10639 9521homologous to D-amino acid dehydrogenase enzyme [Pseudomonas70501119
aeruginosa]
 3864312gi|2058547ComYD [Streptococcus gordonii]7048501
 3825 17986 18477 gi|537033ORF_f356 [Escherichia coli]7058492
 4013 11054 9846gi|1173516riboflavin-specific deaminase [Actinobacillus pleuropneumoniae]70521209
 422 722gi|1146183putative [Bacillus subtilis]70511233 
 43323731612gi|1591493glutamine transport ATP-binding protein Q [Methanococcus7048
jannaschii]
 458 91978049gnl|PID|d102036subunit of ADP-glucose pyrophosphorylase [Bacillus70541149
stearothermophilus]
 592 567 956gnl|PID|d100302neopullulanase [Bacillus sp.]7042390
 6031874 795gnl|PID|e276466aminopeptidase P [Lactococcus lactis]70481080
 61455532437SNF [Bacillus cereus]70513117 
 61779146802cystathionine gamma-synthase (metB) [Haemophilus influenzae]70521113
 63753727222gnl|PID|d100974unknown [Bacillus subtilis]70541851 
 68771266962gi|1263014emm18.1 gene product [Streptococcus pyogenes]7037165
 7212 10081 10911 gi|2313093(AE000524) carboxynorspermidine decarboxylase (nspC)7056831
[Helicobacter pylori]
 7510 8124gi|1877423galactose-1-P-uridyl transferase [Streptococcus mutans]7059237
 79334242525gi|39881ORK 311 (AA 1-311) [Bacillus subtilis]7047900
 8710 93697324gnl|PID|e323506putative Pkn2 protein [Bacillus subtilis]70522046 
 9614 10640  11788 gi|1573209tRNA-guanine transglycosylase (tgt) [Haemophilus influenzae]70521149
1132 5741086gi|433630A180 [Saccharomyces cerevisiae]7059513
123529013461gnl|PID|d100585unknown [Bacillus subtilis]7045561
125545934282gnl|PID|e276474capacitative calcium entry channel 1 [Bos taurus]7035312
129545003454gnl|PID|d101314YqeT [Bacillus subtilis]70471047
133326081394(AF008220) YtfP [Bacillus subtilis]70501215 
1351 420 662gnl|PID|e265530yorfE [Streptococcus pneumoniae]7047243
1373 438 932gi|472919v-type Na-ATPase [Enterococcus hirae]7057495
1381 440  1gi|147336transmembrane protein [Escherichia coli]7042438
14016 18796 16364 gi|976441N5-methyltetrahydrofolate homocysteine methyltransferase70532433 
[Saccharomyces cerevisiae]
16710 82636695gi|149535D-alanine activating enzyme [Lactobacillus casei]70521569
204432262747 gnl|PID|d102049E. coli hypothetical protein; P31805 (267) [Bacillus subtilis]7051480
207326272869gnl|PID|e309213racGAP [Dictyostelium discoideum]7045243
28231136 882unknown [Rhodobacter capsulatus]7050255
621 17554 18453 gnl|PID|e233879hypothetical protein [Bacillus subtilis]6944900
622 18482 19474 gi|580883ipa-88d gene product [Bacillus subtilis]6953990
 22646825824gi|2209379(AF006720) ProJ [Bacillus subtilis]69481143 
 22979928651gnl|PID|d100580unknown [Bacillus subtilis]6951660
 2212 987110767  gnl|PID|d100581unknown [Bacillus subtilis]6951897
 27758575348gnl|PID|d102012(AB001488) FUNCTION UNKNOWN. [Bacillus subtilis]6928510
 3610 729410116  gi|437916isoleucyl-tRNA synthetase [Staphylococcus aureus]69532823 
 381  21090gi|141900alcohol dehydrogenase (EC 1.1.1.1) [Alcaligenes eutrophus]69481089 
 4014 11333 11944 gi|1573280Holliday junction DNA helicase (ruvA) [Haemophilus influenzae]6944612
 4015 11942 12517 gi|1573653DNA-3-methyladenine glycosidase I (tagI) [Haemophilus influenzae]6950576
 45669475490starch (bacterial glycogen) synthase [Bacillus subtilis]69471458 
 4834 24932  24153 gnl|PID|e233870hypothetical protein [Bacillus subtilis]6936780
 49661836521gi|396297similar to phosphotransferase system enzyme II [Escherichia coli]6950339
 49875868338gi|396420similar to Alcaligenes eutrophus pHG1 D-ribulose-5-phosphate 3-6949753
epimerase [Escherichia coli]
 55682627033gi|1146238poly(A) polymerase [Bacillus subtilis]69501230
 593 9542333gnl|PID|e313038hypothetical protein [Bacillus subtilis]69541380 
 62311701418gnl|PID|d101915hypothetical protein [Synechocystis sp.]6949249
 63872987762gi|293017 ORF3 (put.); putative [Lactococcus lactis]6942465
 66436575081gi|153755phospho-beta-D-galactosidase (EC 3.2.1.85) [Lactococcus lactis69491425
cremoris]
 66551266829gi|433809enzyme II [Streptococcus mutans]69461704
 71610017 10664 gnl|PID|e322063beta-1,4-galactosyltransferase [Streptococcus pneumoniae]6939648
 7121 27730 27966 gnl|PID|d400649DE-cadherin [Drosophila melanogaster]6930237
 771  1 237gi|287870groES gene product [Lactococcus lactis]6944237
 81536224101gi|1573605fucose operon protein (fucU) [Haemophilus influenzae]6952480
 831 40 714pir|C33496|C334hisC homolog - Bacillus subtilis6946675
 8316 15742 16335  gi|143372phosphoribosyl glycinamide formyltransferase (PUR-N) [Bacillus6946594
subtilis]
 8521212 916gi|194097IFN-response element binding factor 1 [Mus musculus]6948297
 91536784274gi|1574712anaerobic ribonucleoside-triphosphate reductase activating protein6944597
(nrdG) [Haemophilus influenzae]
 9854032gnl|PID|d100262LivF protein [Salmonella typhimurium]6951786
108540855056transcription factor [Lactococcus lactis]6949972
126330784568gnl|PID|d101329YqjJ [Bacillus subtilis]69491491
131641212889YqeR [Bacillus subtilis]69471233 
136215052299unknown [Bacillus subtilis]6947795
149538524763gnl|PID|e323525YloQ protein [Bacillus subtilis]6950912
14912 933610655 gi|151571Homology with E. coli and P. aeruginosa lysA gene; product of69521320 
unknown function; putative [Pseudomonas syringae]
15343 1913829gi|1710373BrnQ [Bacillus subtilis]6944639
1693 8492324gnl|PID|d100582temperature sensitive cell division [Bacillus subtilis]69491476 
1801 566  3gi|488339alpha-amylase [unidentified cloning vector]6950564
21211196 231gi|1395209ribonucleotide reductase R2-2 small subunit [Mycobacterium6953
tuberculosis]
2261  2 661pir|JQ2285|JQ22nodulin-26 - soybean6941660
233532494766gi|472918v-type Na-ATPase [Enterococcus hirae]69561518
2353 6601766methylase [Haemophilus influenzae]69431107
2432 8652361gnl|PID|d100225ORF5 [Barley yellow dwarf virus]69691497 
328991967gi|2289231 macrolide-efflux protein [Streptococcus agalactiae]6951933
3101  1 282gnl|PID|e322442peptide deformylase [Clostridium beijerinckii]6955282
3691 868  2gi|397526clumping factor [Staphylococcus aureus]6955282
3701 749  3gi|397526clumping factor [Staphylococcus aureus]6921747
3791 44 280gnl|PID|d100649DE-cadherin [Drosophila melanogaster]6930237
3881 260 72gi|1787524(AE000225) hypothetical 32.7 kD protein in trpL-btuR intergenic6944189
region [Escherichia coli]
1220063040gnl|PID|d101809ABC transporter [Synechocystis sp.]68431035
 12539582600gi|2182992 histidine kinase [Lactococcus lactis cremoris]68451359 
 15217901311pir|S16974|R5BSribosomal protein L9 - Bacillus stearothermophilus6856480
 16673535701gi|1787041(AE000184) o530; This 530 aa orf is 33 pct identical (14 gaps) to 52568451653
residues of an approx. 640 aa protein YHES_HAEIN SW; P44808
[Escherichia coli]
 1712 64796805gi|553165acetylcholinesterase [Homo sapiens]6868327
 2013 14128 14505  gi|142700P competence protein (ttg start codon) (put.); putative [Bacillus6840378
subtilis]
 2232 24612 25397 gi|289262comE ORF3 [Bacillus subtilis]6836786
 30745484288gi|311388ORF1 [Azorhizobium caulinodans]6846261
 36539114585 gi|1573041hypothetical [Haemophilus influenzae]6854675
 46652196040(AE000446) hypothetical 29.7 kD protein in ibpA-gyrB intergenic6847822
region [Escherichia coli]
 5410 62357086gi|882579CF Site No. 29739 [Escherichia coli]6855852
 55570695165gnl|PID|d101914ABC transporter [Synechocystis sp.]68451905
 71361345613gi|1573353 outer membrane integrity protein (tolA) [Haemophilus influenzae]6850522
 7110 15342 16613 gi|580866ipa-12d gene product [Bacillus subtilis]68311272
 7112 17560  18792 gi|44073SecY protein [Lactococcus lactis]68351233 
 7117 22295 24703 gi|1762349involved in protein export [Bacillus subtilis]68502409
 7316 10208  9729gi|1353537dUTPase [Bacteriophage rit]6851480
 8618 17198 16011 gi|413943ipa-19d gene product [Bacillus subtilis]68531188
 8717 17491  15866 gi|150209ORF 1 [Mycoplasma mycoides]68431626 
 89651391454gi|1498824M. jannaschii predicted coding region MJ0062 [Methanococcus6840
jannaschii]
 8911  80218242gi|1509744-oxalocrotonate tautomerase [Pseudomonas putida]6843222
 97867555394gi|2367358(AE000491) hypothetical 52.9 kD protein in aidB-rspF intergenic68411362
region [Escherichia coli]
 98314182308gnl|PID|d100261LivA protein [Salmonella typhimurium]6840891
 9913 16414 17280 gi|455363regulatory protein [Streptococcus mutans]6850867
115350543693gi|466474cellobiose phosphotransferase enzyme II″ [Bacillus68441362
stearothermophilus]
124733943221gnl|PID|d100702cut14 protein [Schizosaccharomyces pombe]6856174
125229231922gi|450566transmembrane protein [Bacillus subtilis]68501002
132248582888DNA ligase [Synechocystis sp.]68521971 
140777657580gi|1209711unknown [Saccharomyces cerevisiae]6847186
1501 539  3gi|402490ADP-ribosylarginine hydrolase [Mus musculus]6859537
1641 58 867gnl|PID|e255114glutamate racemase [Bacillus subtilis]6849810
1642 8191835gnl|PID|e255117hypothetical protein [Bacillus subtilis]68501017 
169739464104hypothetical protein - Lactococcus lactis subsp. lactis plasmid pSL26840159
170442474396gi|3041466852150
171860027054gi|38722precursor (aa −20 to 381) [Acinetobacter calcoaceticus]68541053
198324731871gnl|PID|e313075hypothetical protein [Bacillus subtilis]6846603
2112 9691802gi|1439528EIIC-man [Lactobacillus curvatus]6845834
214849264231gnl|PID|d102049H. influenzae hypothetical protein, P43990 (182) [Bacillus subtilis]6850696
217649555170gnl|PID|e326966similar to B. vulgaris CMS-associated mitochondrial . . . (reverse6836216
transcriptase) [Arabidopsis thaliana]
218739304745gi|2293198(AF008220) YtgP [Bacillus subtilis]6838816
220646284338gnl|PID|e325791(AJ000005) orf1 [Bacillus megaterium]6851291
2361 746 108gi|410137ORFX13 [Bacillus subtilis]6846639
2372 6751451gi|396348homoserine transsuccinylase [Escherichia coli]6849777
2504 7711229gi|310859ORF2 [Synechococcus sp.]6850459
2541 517 155gi|1787105(AE000189) o648 was o669; This 669 aa orf is 40 pct identical (16844363
gaps) to 217 residues of an approx. 232 aa protein YBBA_HAEIN
SW; P45247 [Escherichia coli]
3371  1 774gnl|PID|e261990putative orf [Bacillus subtilis]6847774
3451  3 653gi|149513thymidylate synthase (EC 2.1.1.45) [Lactococcus lactis]6861651
3862 417  4gi|1573353outer membrane integrity protein (tolA) [Haemophilus influenzae]6851414
2457224697gi|1592141M. jannaschii predicted coding region MJ1507 [Methanococcus6726
jannaschii]
3653974591gi|2293175(AF008220) signal transduction regulator [Bacillus subtilis]6744807
522301 574gi|2313385 (AE000547) para-aminobenzoate synthetase (pabB) [Helicobacter6748
pylori]
619 16063 16758 gi|413931ipa-7d gene product [Bacillus subtilis]6741696
 22870947897gi|1928962pyrroline-5-carboxylate reductase [Actinidia deliciosa]6751804
 2910 83359072gi|468745gtcR gene product [Bacillus brevis]6741738
 3131379 585gi|2425123(AF019986) PksB [Dictyostelium discoideum]6749795
 3211 884910150 gi|42029ORF1 gene product [Escherichia coli]67471302
 3616 14830 15546 gi|1592142ABC transporter, probable ATP-binding subunit [Methanococcus6743
jannaschii]
 389 49585392gnl|PID|e214803T2283.3 [Caenorhabditis elegans]6747435
 3821 13775 14512  gi|537037ORF_o216 [Escherichia coli]6752738
 45910428 9181gi|551710branching enzyme (glgB) (EC 2.4.1.18) [Bacillus stearothermophilus]67511248
 4823 18334 17514 gi|413949ipa-25d gene product [Bacillus subtilis]6750831
 5021773 952YqjQ [Bacillus subtilis]6755822
 531 431  3gi|1574291fimbrial transcription regulation repressor (pilB) [Haemophilus6740429
influenzae]
 5513 11946 gnl|PID|e252990 ORF YDL037c [Saccharomyces cerevisiae]6751795
 61992108329ATP-binding cassette transporter A [Staphylococcus aureus]6750882
 71256146117gi|1197667vitellogenin [Anolis pulchellus]6736504
 81744894983phosphoenolpyruvate:mannose phosphotransferase element IIB6742495
[Lactobacillus curvatus]
 83729573214gi|1276746Acyl carrier protein [Porphyra purpurea]6737258
 86881406809gi|1147744PSR [Enterococcus hirae]67451332 
 973 9861366 gnl|PID|d102235(AB000631) unnamed protein product [Streptococcus mutans]6743381
1021 6011413gi|682765mccB gene product [Escherichia coli]6736813
106311091987gi|148921LicD protein [Haemophilus influenzae]6743879
115459825656gi|8955750putative cellobiose phosphotransferase enzyme III [Bacillus subtilis]6744327
115784218077gi|466473cellobiose phosphotransferase enzyme II′ [Bacillus6751345
stearothermophilus]
12713 81277021gi|147326transport protein [Escherichia coli]67451107 
136322152859gnl|PID|d100581unknown [Bacillus subtilis]6749645
14021 23317 20906 gnl|PID|d101912phenylalanyl-tRNA synthetase [Synechocystis sp.]67432412
146628941893gi|2182994histidine kinase [Lactococcus lactis cremoris]67441002 
151811476 11117 gnl|PID|d100085ORF129 [Bacillus cereus]6748360
16010 74538646gi|2281317OrfB; similar to a Streptococcus pneumoniae putative membrane67461194 
protein encoded by GenBank Accession Number X99400;
inactivation of the OrfB gene leads to UV-sensitivity and to decrease
of homologous recombination (plasmidic test) [Lactococcus 1
163330994505gnl|PID|d101317YqfR [Bacillus subtilis]67471407 
167867045454DibB [Lactobacillus casei]67451251 
169423222879 gnl|PID|d101331YqkG [Bacillus subtilis]6741558
17111 76568384pneumococcal surface protein A [Streptococcus pneumoniae]6750729
188319303723gi|1542975AbcB [Thermoanaerobacterium thermosulfurigenes]67461794
18963599 3141gnl|PID|e325178Hypothetical protein [Bacillus subtilis]6752459
205316632211gi|606073ORF_o169 [Escherichia coli]6747549
207428963456gi|2276374DtxR/iron regulated lipoprotein precursor [Corynebacterium6749561
diphtheriae]
217340863703gi|895750putative cellobiose phosphotransferase enzyme III [Bacillus subtilis]6742384
2462 291 662unknown [Bacillus subtilis]6743372
2521  2 745gi|2341768PspA [Streptococcus pneumoniae]6741744
265311341811gi|2313847(AE000585) L-asparaginase II (ansB) [Helicobacter pylori]6742678
2951  1 375gi|2276374DtxR/iron regulated lipoprotein precursor [Corynebacterium6743375
diphtheriae]
1748985146gnl|PID|e255179unknown [Mycobacterium tuberculosis]6656249
31 389  3gnl|PID|e269548Unknown [Bacillus subtilis]6648387
320 19267 20805 gi|39956IIGlc [Bacillus subtilis]68501539
4325452718gi|1787564(AE000228) phage shock protein C [Escherichia coli]6636174
5913197 12592 gi|1574291fimbrial transcription regulation repressor (pilB) [Haemophilus6646606
influenzae]
9428721451gnl|PID|e266928unknown [Mycobacterium tuberculosis]66431422
 12214691200 gi|520407orf2; GTG start codon [Bacillus thuringiensis]6642270
 1512 10979 9897gi|2314738(AE0000653) translation elongation factor EF-Ts (tsf) [Helicobacter6649
pylori]
 1621 312 734gnl|PID|d102245(AB005554) yxbF [Bacillus subtilis]6635579
 22313721851gi|1480916signal peptidase type II [Lactococcus lactis]6638480
 22758287096gnl|PID|e206261gamma-glutamyl phosphate reductase [Streptococcus thermophilus]66511269
 2220 16194  17138 gnl|PID|e281914YitL [Bacillus subtilis]6650945
 302 530 976gi|2314379(AE000627) ABC transporter, ATP-binding protein (yhcG)6640447
[Helicobacter pylori]
 321 199 984gi|312444ORF2 [Bacillus caldolyticus]6649786
 3313 83527234 gi|138797944% identity over 302 residues with hypothetical protein from66441119 
Synechocystis sp, accession D64006_CD; expression induced by
environmental stress; some similarity to glycosyl transferases; two
potential membrane-spanning helices [Bacillus subtil
 34656584708gnl|PID|e250724orf2 [Lactobacillus sake]6639951
 3414 97929574gi|1590997M jannaschii predicted coding region MJ0272 [Methanococcus6648
jannaschii]
 3516  15163 14501 gi|1773352Cap 5M [Staphylococcus aureus]6646663
 36961736976gi|1518680minicell-associated protein DivIVA [Bacillus subtilis]6635804
 3611 10396 10824 bbs|155344insulin activator factor, INSAF [human, Pancreatic insulinoma,6643429
Peptide Partial, 744 aa] [Homo sapiens]
 481 281419gnl|PID|e325204hypothetical protein [Bacillus subtilis]66501392 
 48738104112gi|2182574(AE000090) Y4pE [Rhizobium sp. NGR234]6640303
 52435952789gi|388565major cell-binding factor [Campylobacter jejuni]6652807
 54326621076gnl|PID|d101831glutamine-binding periplasmic protein [Synechocystis sp.]66431587
 6110 97409183gnl|PID|e154144mdr gene product [Staphylococcus aureus]6644558
 7213 10893 11993 gi|2313129(AE000526) H. pylori predicted coding region HP0049 [Helicobacter6644
pylori]
 749 13267 12476 gi|1573941hypothetical [Haemophilus influenzae]6643792
 751  2 868gi|1574631nicotinamide mononucleotide transporter (pnuC) [Haemophilus6648867
influenzae]
 75753034275gi|41312put. EBG repressor protein [Escherichia coli]66401029
 82768138123gnl|PID|e255128trigger factor [Bacillus subtilis]66531311 
 833 9051219pir|C33496|C334hisC homolog - Bacillus subtilis6644315
 8610 94078925 gi|683584shikimate kinase [Lactococcus lactis]6641483
 8810 70016060putative fimbrial-associated protein [Actinomyces naeslundii]6652942
1 951  4gi|410118ORFX19 [Bacillus subtilis]6641948
 93736612711gi|1787936(AE000260) f298; This 298 aa orf is 51 pct identical (5 gaps) to 2976649951
residues of an approx. 304 aa protein YCSN_BACSU SW; P42972
[Escherichia coli]
104318053049gi|1469784putative cell division protein ftsW [Enterococcus hirae]66481245
10614 13576 14253  gi|40027homologous to E. coli gidB [Bacillus subtilis]6652678
1073 9651864gi|144858ORF A [Clostridium perfringens]6649900
112757186593DprA [Haemophilus influenzae]6643876
1151  3 302gi|727367Hyrlp [Saccharomyces cerevisiae]6656300
1221  3 566gnl|PID|d101328YqiY [Bacillus subtilis]6636564
126811759 11046 gnl|PID|d101163ORF3 [Bacillus subtilis]6648714
12811 82018431growth associated protein GAP-43 [Xenopus laevis]6641231
131848944508gi|486661TMnm related protein [Saccharomyces cerevisiae]6639387
140332362574gi|40056phoP gene product [Bacillus subtilis]6636663
14015 16318 15434 gi|16581895,10-methylenetetrahydrofolate reductase [Erwinia carotovora]6648885
14612 79267636gnl|PID|d101140transposase [Synechocystis sp.]6642291
147671376154gi|4723266648984
magnum]
149644355430gnl|PID|d101887pentose-5-phosphate-3-epimerase [Synechocystis sp.]6646996
14913 10754 11575 gi|42371pyruvate formate-lyase activating enzyme (AA 1-246) [Escherichia6642822
coli]
18642578gnl|PID|d101199ORF11 [Enterococcus faecalis]664309
207223402597gnl|PID|e321893envelope glycoprotein gp160 [Human immunodeficiency virus6646258
type 1]
210733583678gi|49318ORF4 gene product [Bacillus subtilis]6646321
217851435355gi|49538thrombin receptor [Cricetulus longicaudatus]6638213
220438753642 gi|466648alternate name ORFD of L23635 [Escherichia coli]6633234
22311070 138gnl|PID|e247187zinc finger protein [Bacteriophage phigle]6645933
224218642640gi|1176399putative ABC transporter subunit [Staphylococcus epidermidis]6641777
2431  3 872dbj||AB000617_2(AB000617) YcdH [Bacillus subtilis]6645870
2682 891 568putative transposase [Streptococcus pyogenes]6660324
3221  2 643gi|1499836Zn protease [Methanococcus jannaschii]6640642
510 13909 13178 gi|1574292hypothetical [Haemophilus influenzae]6534732
611 10465 11190 gi|142854homologous to E. coli radC gene product and to unidentified protein6548726
from Staphylococcus aureus [Bacillus subtilis]
72 647 405pir|C64146|C641hypothetical protein HI0259 - Haemophilus influenzae (strain RD6542243
KW20)
7762466821gnl|PID|d101323YqhU [Bacillus subtilis]6550576
 10218731397gi|1163111ORF-1 [Streptococcus pneumoniae]6554477
 16314282222hypothetical protein [Bacillus subtilis]6545795
 21438153357gnl|PID|e314910hypothetical protein [Staphylococcus sciuri]6540459
 2234 25776 26384 gi|1123030CpxA [Actinobacillus pleuropneumoniae]6542609
 4321648 290gi|1044826F14E5.1 [Caenorhabditis elegans]65381359
 4813 10062 10856 gi|1573390hypothetical [Haemophilus influenzae]6545795
 4822 17521 16883 gi|1573391hypothetical [Haemophilus influenzae]6537639
 4825 19027 18533 gnl|PID|e264484YCR020c, len:215 [Saccharomyces cerevisiae]6538495
 49338565334putative transcriptional regulator [Bacillus stearothermophilus]65321479
 5065337gi|171963tRNA isopentenyl transferase [Saccharomyces cerevisiae]6542819
 5215 14728 15588 gi|1499745M. jannaschii predicted coding region MJ0912 [Methanococcus6546
jannaschii]
 597 39634745gi|496514orf zeta [Streptococcus pyogenes]5442783
 68325003483gi|887824ORF_o310 [Escherichia coli]6546984
 69321711077gnl|PID|e311453unknown [Bacillus subtilis]65421095
 69760295325gi|809660deoxyribose-phosphate aldolase [Bacillus subtilis]6555705
 71585369783gi|1573224glycosyl transferase lgtC (GP:U14554_4) [Haemophilus influenzae]65421248
 72876648527gnl|PID|e267589Unknown, highly similar to several spermidine synthases [Bacillus6539864
subtilis]
 76557734097gnl|PID|d101723DNA REPAIR PROTEIN RECN (RECOMBINATION PROTEIN65441677 
N). [Escherichia coli]
 76980997875gi|1574276exodeoxyribonuclease, small subunit (xseB) [Haemophilus6538225
influenzae]
 84228702352gi|2313188(AE000532) conserved hypothetical protein [Helicobacter pylori]6541519
 8615 14495 13407 gnl|PID|d1018803-dehydroquinate synthase [Synechocystis sp.]65441089
 87337062423gi|151259HMG-CoA reductase (EC 1.1.1.88) [Pseudomonas mevalonii]65511284
 88324252736gi|1098510unknown [Lactococcus lactis]6530213
 89216271007gnl|PID|d102008(AB001488) SIMILAR TO ORF14 OF ENTEROCOCCUS6541621
FAECALIS TRANSPOSON TN916. [Bacillus subtilis]
111666356186gnl|PID|e246063NM23/nucleoside diphosphate kinase [Xenopus laevis]6550450
1161  31016gnl|PID|d101125queuosine biosynthesis protein QueA [Synechocystis sp.]65441014 
1231 69 389gi|498839ORF2 [Clostridium perfringens]6536321
123765227190DNA-binding response regulator [Thermotoga maritima]6539669
125338212859gnl|PID|e257609sugar-binding transport protein [Anaerocellum thermophilum]6547963
13712 80157818gi|2182574(AE000090) Y4pE [Rhizobium sp. NGR234]6541198
147450213884gi|472329dihydrolipoamide acetyltransferase [Clostridium magnum]65471137
148210531931gnl|PID|d101319YqgH [Bacillus subtilis]6542879
151232124687gi|304987EcoE type I restriction modification enzyme M subunit [Escherichia65501476
coli]
1562 730 437gi|310893membrane protein [Theileria parva]6547294
164742564837gi|410132ORFX8 [Bacillus subtilis]6548582
169531923914gi|1552737similar to purine nucleoside phosphorylase (deoD) [Escherichia coli]6541723
176429512220gnl|PID|e339500oligopeptide binding lipoprotein [Streptococcus pneumoniae]6543732
195445563900gi|1592142ABC transporter, probable ATP-binding subunit [Methanococcus6540
jannaschii]
1961 1601572gnl|PID|d102004(AB001488) PROBABLE UDP-N-65511413
ACETYLMURAMOYLALANYL-D-GLUTAMYL-2,6-
DIAMINOLIGASE (EC 6.3.2.15). [Bacillus subtilis]
20422 2461215gi|143156membrane bound protein [Bacillus subtilis]65371032 
210415441891ORF1 gene product [Bacillus subtilis]6548348
24221625 723gi|1787540(AE000226) f249; This 249 aa orf is 32 pct identical (8 gaps) to 2446542903
residues of an approx. 272 aa protein AGAR_ECOLI SW: P42902
[Escherichia coli]
2841  1 900gi|559861clyM [Plasmid pAD1]6536900
3041  2 574gnl|PID|e290934unknown [Mycobacterium tuberculosis]6552573
3151  21483gi|790694mannuronan C-5-epimerase [Azotobacter vinelandii]65571482
1201  3 569gnl|PID|d102048 K. aerogenes, histidine utilization repressor; P12380 (199) DNA6546567
binding [Bacillus subtilis]
3581  1 309gnl|PID|e323508YloS protein [Bacillus subtilis]6555309
2775716696gi|1498753nicotinate-nucleotide pyrophosphorylase [Rhodospirillum rubrum]6447876
6659246802gnl|PID|d101111methionine aminopeptidase [Synechocystis sp.]6452879
8434173686gi|1045935DNA helicase II [Mycoplasma genitalium]6458270
 11432492689OrfB [Streptococcus pneumoniae]6446561
 15765047145Ycr59c/YigZ homolog [Bacillus subtilis]6445642
 2211 95489895gnl|PID|d100581unknown [Bacillus subtilis]6438348
 2230 22503 23174 gi|289260comE ORF1 [Bacillus subtilis]6444672
 26714375 14199  gi|409286bmrU [Bacillus subtilis]6430177
 27215101334gi|40795DdeI methylase [Desulfovibrio vulgaris]6451177
 292 614 297gi|2326168type VII collagen [Mus musculus]6450318
 352 368 721pir|JC1151|JC11hypothetical 20.3K protein (insertion sequence IS1131) -6450354
Agrobacterium tumefaciens (strain PO22) plasmid Ti
 401  3 449gi|46970epiD gene product [Staphylococcus epidermidis]6441447
 40746834976 gnl|PID|e325792(AJ000005) glucose kinase [Bacillus megaterium]6445294
 45780686920subunit of ADP-glucose pyrophosphorylase [Bacillus64401149
stearothermophilus]
 512 3011059gi|43985nifS-like gene [Lactobacillus delbrueckii]6454759
 5113 15251 18397 gi|2293260(AF008220) DNA-polymerase III alpha-chain [Bacillus subtilis]64463147
 5331157 555gi|1574292hypothetical [Haemophilus influenzae]6447603
 58242361606alanyl-tRNA synthetase (alaS) [Haemophilus influenzae]64512631
 661  31259gi|895749putative cellobiose phosphotransferase enzyme II″ [Bacillus subtilis]64421257 
 68552136556gi|436965[malA] gene products [Bacillus stearothermophilus]64471344
 6965356gnl|PID|d101316Cdd [Bacillus subtilis]6452408
 74459485038gi|726480L-glutamine-D-fructose-6-phosphate amidotransferase [Bacillus6450911
subtilis]
 75312831465bbs|133379TLS-CHOP = fusion protein (CHOP = C/EBP transcription factor,6457183
TLS = nuclear RNA-binding protein) (human, myxoid liposarcomas
cells, Peptide Mutant, 462 aa) [Homo sapiens]
 8113  14016 14231 gi|143175methanol dehydrogenase alpha-10 subunit [Bacillus sp.]6435216
 8322 21851 22090 gnl|PID|d01315YqfA [Bacillus subtilis]6444240
 8711 10046 9300 gnl|PID|e323505putative Ptcl protein [Bacillus subtilis]6443747
 98750325706gnl|PID|e233880hypothetical protein [Bacillus subtilis]6438675
1051  21276gi|1657503similar to S. aureus mercury(II) reductase [Escherichia coli]64451275 
113751366410gnl|PID|d101119NifS [Synechocystis sp.]64501275
1191  21297gnl|PID|e320520hypothetical protein [Natronobacterium pharaonis]64371296
123311252156 gnl|PID|e253284ORF YDL44w [Saccharomyces cerevisiae]64401032
124523311780gnl|PID|d101884hypothetical protein [Synechocystis sp.]6450552
129434672709gnl|PID|d101314YqeU [Bacillus subtilis]6452759
1311 152  3gi|1377841unknown [Bacillus subtilis]6442150
13711 71967549hypothetical 20.3K protein (insertion sequence IS1131) -6450354
Agrobacterium tumefaciens (strain PO22) plasmid Ti
139332262651gi|2293301(AF008220) YtqB [Bacillus subtilis]6444576
14610 67305648mevalonate pyrophosphate decarboxylase [Rattus norvegicus]64451083
1471  21018gnl|PID|e137033unknown gene product [Lactobacillus leichmannii]64461017 
14811 8430878 3gi|2130630(AF000430) dynamin-like protein [Homo sapiens]6428354
156743133612gnl|PID|d102050transmembrane [Bacillus subtilis]6431702
157412992114gnl|PID|d100892homologous to Gln transport system permease proteins [Bacillus6443816
subtilis]
16265880 6362gi|517204ORF1, putative 42 kDa protein [Streptococcus pyogenes]6458483
16413 97078769homologue of ferric anguibactin transport system permease protein6440939
FatD of V. anguillarum [Bacillus subtilis]
17553 9063598gi|534045antiterminator [Bacillus subtilis]6439693
18910 61546507response regulator [Lactobacillus plantarum]6433354
191435192863 gi|149520phosphoribosyl anthranilate isomerase [Lactococcus lactis]6446657
2021 761140gnl|PID|e293806o-acetylhomoserine sulfhydrylase [Leptospira meyeri]64471065
2241 2341571collagenase (prtC) [Haemophilus influenzae]64421338
2313 291 647 gi|40174ORF X [Bacillus subtilis]6443357
2533 7091089pir|JC1151|JC11hypothetical 20.3K protein (insertion sequence IS1131) -6450381
Agrobacterium tumefaciens (strain PO22) plasmid Ti
2651 820  2gi|1377832unknown [Bacillus subtilis]6431819
2971  1 660gi|1590871collagenase [Methanococcus jannaschii]6448660
3281 263 21gi|992651Gln4p [Saccharomyces cerevisiae]6441243
5487308098gi|556885 Unknown [Bacillus subtilis]6448633
 10651784483gi|1573101hypothetical [Haemophilus influenzae]6340696
 1211 93249902gi|806536membrane protein [Bacillus acidopullulyticus]6342579
 1510 88979187gi|722339unknown [Acetobacter xylinum]6340291
 1721031 309PlnU [Lactobacillus plantarum]6332723
 18877786975unknown [Bacillus subtilis]6345804
 26497807078gi|142440ATP-dependent nuclease [Bacillus subtilis]63462703 
 29534884192gi|1377829unknown [Bacillus subtilis]6335705
 3411 88307988gnl|PID|d101198ORF8 [Enterococcus faecalis]6345843
 3531187 876unknown [Acetobacter xylinum]6339312
 4815 12509 11691  gi|1573389hypothetical [Haemophilus influenzae]6341819
 5111 12719 12189 gi|142450ahrC protein [Bacillus subtilis]6335531
 55439795022gi|1708640YeaB [Bacillus subtilis]63411044 
 5515 13669  14670 gnl|PID|e311502thioredoxine reductase [Bacillus subtilis]63441002 
 6810 92428919 sp|P37686|YIAY_HYPOTHETICAL 40.2 KD PROTEIN IN AVTA-SELB6340324
INTERGENIC REGION (F382)
 86765545685gi|1574382lic-1 operon protein (licD) [Haemophilus influenzae]6341870
 88860855180putative fimbrial-associated protein [Actinomyces naeslundii]6343906
 96858586484orflgyrb gene product [Streptococcus pneumoniae]6338627
1001 2401940fucosidase [Dictyostelium discoideum]63361701
104430635765gi|144985phosphoenolpyruvate carboxylase [Corynebacterium glutamicum]63462703
106891898554gi|533099endonuclease II [Bacillus subtilis]6345636
122647044886gnl|PID|d101139transposase [Synechocystis sp.]6339183
128745175203gnl|PID|d101434orf2 [Methanobacterium thermoautotrophicum]6350687
1374 9631 547gi|472920v-type Na-ATPase [Enterococcus hirae]6327585
142741004585gnl|PID|e313025hypothetical protein [Bacillus subtilis]6344486
159517412571gi|1787043(AE000184) f271; This 271 aa orf is 24 pct identical (16 gaps) to 2656339831
residues of an approx. 272 aa protein YIDA_ECOLI SW: P09997
[Escherichia coli]
17112 880314406 gnl|PID|e324918Iga1 protease [Streptococcus sanguis]63485604
1771  3 347gi|1773150hypothetical 14.8kd protein [Escherichia coli]6334345
1782 423 917gi|722339unknown [Acetobacter xylinum]6341495
1783 7941012 gi|1591582cobalamin biosynthesis protein N [Methanococcus jannaschii]6336219
19511377 175ftsQ [Enterococcus hirae]63331203 
234517391527 gi|1591582cobalamin biosynthesis protein N [Methanococcus jannaschii]6336213
2491 81 257gi|1000453TreR [Bacillus subtilis]6341177
2831 1271347gi|396486ORF8 [Bacillus subtilis]63441221 
293328043466unknown [Acetobacter xylinum]6337663
3111 905 486UDP-galactose 4-epimerase [Streptococcus mutans]6346420
3241  2 556gi|1477741histidine periplasmic binding protein P29 [Campylobacter jejuni]6336555
3651 219 13gi|2252843(AF013293) No definition line found [Arabidopsis thaliana]6333207
3821 88 378gi|722339unknown [Acetobacter xylinum]6340291
3853 364 158(AF013293) No definition line found [Arabidopsis thaliana]6333207
212495 288gnl|PID|e325007penicillin-binding protein [Bacillus subtilis]62422208
323 23374 24231 gnl|PID|e254993hypothetical protein [Bacillus subtilis]6235858
616 14320 13193 gnl|PID|e349614nifS-like protein [Mycobacterium leprae]62371128
7868197232gnl|PID|d101324YqhY [Bacillus subtilis]6232414
719 15466 14207 gnl|PID|d101804beta ketoacyl-acyl carrier protein synthase [Synechocystis sp.]62431260
721 17155 16229 gnl|PID|e323514putative FabD protein [Bacillus subtilis]6246927
724 19526 18519 gi|1276434beta-ketoacyl-ACP synthase III [Cuphea wrightii]62371008
 12759044702gi|1573768A/G-specific adenine glycosylase (mutY) [Haemophilus influenzae]62431203
 12980328793gi|1591587pantothenate metabolism flavoprotein [Methanococcus jannaschii]6233762
 1511 96789328pir|JC1151|JC11hypothetical 20.3K protein (insertion sequence IS1131) -6243351
Agrobacterium tumefaciens (strain PO22) plasmid Ti
 17426092442gi|1591081M. jannaschii predicted coding region MJ0374 [Methanococcus6243
jannaschii]
 175 30532835gi|149570role in the expression of lactacin F, part of the laf operon6244219
[Lactobacillus sp.]
 2210 8627gnl|PID|d100580similar to B. subtilis DnaH [Bacillus subtilis]6243912
 303 8652043(AE000627) ABC transporter, ATP-binding protein (yhcG)62431179 
[Helicobacter pylori]
 3352 2351636gi|413976ipa-52r gene product [Bacillus subtilis]6244600
 3811 56896123gi|148231o251 [Escherichia coli]6234435
 4017 14272 13328  gnl|PID|d101904hypothetical protein [Synechocystis sp.]6243945
 421  3 311gi|1146182putative [Bacillus subtilis]6241309
 44212674005gi|1786952(AE000176) o877; 100 pct identical to the first 86 residues of the 10062432739 
aa hypothetical protein fragment YBGB_ECOLI SW: P54746
[Escherichia coli]
 4812 97329304gi|662920repressor protein [Enterococcus hirae]6232429
 51856647181gnl|PID|e301153StySKI methylase [Salmonella enterica]62441518
 52327912099gi|1183886integral membrane protein [Bacillus subtilis]6241693
 5516 15702 14704 gnl|PID|e313028hypothetical protein [Bacillus subtilis]6240999
 59634183984gi|2065483unknown [Lactococcus lactis lactis]6232567
 63549974809gi|149771pilin gene inverting protein (PivML) [Moraxella lacunata]6228189
 7014 10002 10739 gi|992977hplG gene product [Bordetella pertussis]6245738
 7113 18790 20382 gi|1280135coded for by C. elegans cDNA cm21e6; coded for by C. elegans62621593
cDNA cm01e2; similar to melibiose carrier protein
(thiomethylgalactoside permease II) [Caenorhabditis elegans]
 7128  32217 32768 gnl|PID|d1013126235552
 74711666 10383  gi|1552753hypothetical [Escherichia coli]62381284 
 80893709609gnl|PID|d102002(AB001488) FUNCTION UNKNOWN. [Bacillus subtilis]6246240
 9710 90687041gi|882463protein-N(pi)-phosphohistidine-sugar phosphotransferase [Escherichia62422028
coli]
 9842306 3268gnl|PID|d101496BraE (integral membrane protein) [Pseudomonas aeruginosa]6242963
102328233539gnl|PID|e313010hypothetical protein [Bacillus subtilis]6224717
103327951242gnl|PID|d102049H. influenzae hypothetical ABC transporter; P44808 (974)62411554
[Bacillus subtilis]
111220353462gi|581297NisP [Lactococcus lactis]62441428
112431544080gi|1574379lic-1 operon protein (licA) [Haemophilus influenzae]6239927
112649395649gi|1574381lic-1 operon protein (licC) [Haemophilus influenzae]6239711
12431137 721anaerobic ribonucleoside-triphosphate reductase (nrdD) [Haemophilus6245417
influenzae]
12463162 2329gi|609076leucyl aminopeptidase [Lactobacillus delbrueckii]6240834
126711073 7516gnl|PID|d101163ORF4 [Bacillus subtilis]62383558 
129649834540zinc finger protein EF6 - Chilo iridescent virus6248444
131 745104103gi|1857245 unknown [Lactococcus lactis]6242408
149219232579gi|1592142ABC transporter, probable ATP-binding subunit [Methanococcus6241
jannaschii]
149753606055gnl|PID|e323508YloS protein [Bacillus subtilis]6240696
1561 450 238membrane protein [Streptococcus pneumoniae]6240213
156636062935gnl|PID|d102050transmembrane [Bacillus subtilis]6237672
171217792291gi|43941EIII-B Sor PTS [Klebsiella pneumoniae]6235513
1722 385 723gi|895750putative cellobiose phosphotransferase enzyme III [Bacillus subtilis]6239339
17332599 893gi|1591732cobalt transport ATP-binding protein O [Methanococcus jannaschii]62421707
1792 4921754gi|1574071H. influenzae predicted coding region HI1038 [Haemophilus influenzae]62381263
181628563707gi|1777435LacT [Lactobacillus casei]6242852
18522074 311gi|2182397(AE000073) Y4fN [Rhizobium sp. NGR234]62411764
210611984gi|450566 transmembrane protein [Bacillus subtilis]6237924
202325833473gi|42219P35 gene product (AA 1-314) [Escherichia coli]6241891
210313741565gi|49315ORF1 gene product [Bacillus subtilis]6245192
2111  3 971gi|147402mannose permease subunit III-Man [Escherichia coli]6243969
223214951034gnl|PID|d101190ORF2 [Streptococcus mutans]6241462
2281 34 909gi|530063glycerol uptake facilitator [Streptococcus pneumoniae]6244876
2342 90 917gi|2293259(AF008220) YtqI [Bacillus subtilis]6238828
282517651487gnl|PID|e273475galactokinase [Arabidopsis thaliana]6233279
3751  1 159gi|1674231(AE000052) Mycoplasma pneumoniae, hypothetical protein homolog;6240159
similar to Swiss-Prot Accession Number P35155, from B. subtilis
[Mycoplasma pneumoniae]
3855 584 357gi|1573353outer membrane integrity protein (tolA) [Haemophilus influenzae]6247228
319 18550 19269 gi|606162ORF_f229 [Escherichia coli]6141720
7427253225gi|2114425similar to Synechocystis sp. hypothetical protein, encoded by6142501
GenBank Accession Number D64006 [Bacillus subtilis]
 17633263054gi|149569lactacin F [Lactobacillus sp.]6143273
 44340614957gnl|PID|d101068xylose repressor [Synechocystis sp.]6138897
 5411 83887234gnl|PID|d101329YqjH [Bacillus subtilis]61421155
 57639746037gnl|PID|d101316YqfK [Bacillus subtilis]61422064 
 58573566565sp|P45169|POTC_SPERMIDINE/PUTRESCINE TRANSPORT SYSTEM PERMEASE6134792
PROTEIN POTC.
 671  3 692gi|537108ORF_f254 [Escherichia coli]6146690
 68988167890gi|19501pPLX12 gene product (AA 1-184) [Lupinus polyphyllus]6141927
 7015 10737 12008 gi|992976bplF gene product [Bordetella pertussis]61441272
 7211 975910202 gnl|PID|d101833carboxynorspermidine decarboxylase [Synechocystis sp.]6136444
 76878817003gnl|PID|d100305farnesyl diphosphate synthase [Bacillus stearothermophilus]6145879
 87449143697gi|528991unknown [Bacillus subtilis]61421218
 8713 12311  11361 gi|1789683(AE000407) methionyl-tRNA formyltransferase [Escherichia coli]6144951
 912 7312989gi|537080ribonucleoside triphosphate reductase [Escherichia coli]61452259
105327113499gnl|PID|d101851hypothetical protein [Synechocystis sp.]6144789
115679866478gi|89574761361491
123871818518protein histidine kinase [Enterococcus faecalis]61401338 
126675256725(AE000184) f271; This 271 aa orf is 24 pct identical (16 gaps) to 2656138801
residues of an approx. 272 aa protein YIDA_ECOLI SW: P09997
[Escherichia coli]
1281 11 1 639gnl|PID|d101328YqiY [Bacillus subtilis]6141639
139747945054gi|1022726unknown [Staphylococcus haemolyticus]6141261
139912632 5913gnl|PID|e270014beta-galactosidase [Thermoanaerobacter ethanolicus]61416720 
14312552 42gi|520541penicillin-binding proteins 1A and 1B [Bacillus subtilis]61422511 
14816 12125 11424 gi|1552743tetrahydrodipicolinate N-succinyltransferase [Escherichia coli]6142702
162341123456gnl|PID|d101829phosphoglycolate phosphatase [Synechocystis sp.]6130657
1723 7271077gnl|PID|d102048B. subtilis, cellobiose phosphotransferase system, celA; P46318 (220)6144351
[Bacillus subtilis]
177311011772gnl|PID|d100574unknown [Bacillus subtilis]6143672
202212782585gi|1045831hypothetical protein (GB:L18965_6) [Mycoplasma genitalium]61361308
224327823144gi|1591144M. jannaschii predicted coding region MJ0440 [Methanococcus6130
jannaschii]
225433953766gi|1552774hypothetical [Escherichia coli]6140372
2492 212 802gi|1000453TreR [Bacillus subtilis]6142591
2542 843 484ORF120 [Escherichia coli]6136360
2571  3 350gnl|PID|e255315unknown [Mycobacterium tuberculosis]6142348
293439713657hypothetical 20.3K protein (insertion sequence IS1131) -6145315
Agrobacterium tumefaciens (strain PO22) plasmid Ti
3011 949 17gi|2291209(AF016424) contains similarity to acyltransferases [Caenorhabditis6133933
elegans]
37311066 287gi|393396Tb-292 membrane associated protein [Trypanosoma brucei subgroup]6138780
324 24473 24955 gi|537093ORF_o153b [Escherichia coli]6027483
6546365739gi|2293258(AF008220) YtoI [Bacillus subtilis]60351104 
612 11936 11187 gi|293017ORF3 (put.); putative [Lactococcus lactis]6044750
 1713 67086484lactacin F [Lactobacillus sp.]6032225
 18769775670gi|1788140(AE000278) o481; This 481 aa orf is 35 pct identical (19 gaps) to 30960431308
residues of an approx. 856 aa protein NOL1_HUMAN SW: P46087
[Escherichia coli]
 2015 15878 17167 gnl|PID|d100584unknown [Bacillus subtilis]60441290
 221  1 243gnl|PID|d102050transmembrane [Bacillus subtilis]6036243
 3210 82968964gi|2293275(AF008220) YtaG [Bacillus subtilis]6037669
 3815 88379697gi|40023B. subtilis genes rpmH, rnpA, 50kd, gidA and gidB [Bacillus subtilis]6035861
 43666105944gi|171787protein kinase 1 [Saccharomyces cerevisiae]60362667
 441  11269gnl|PID|e235823unknown [Schizosaccharomyces pombe]60441269 
 4510 11138 10368 gi|3974881,4-alpha-glucan branching enzyme [Bacillus subtilis]6043771
 4819 15766 14378 gnl|PID|e205173orf1 [Lactobacillus helveticus]60391389
 4821 16727 gnl|PID|d102041(AB002668) unnamed protein product [Haemophilus6032225
actinomycetemcomitans]
 501  2 898gnl|PID|e246537ORF286 protein [Pseudomonas stutzeri]6031897
 622 6381177unknown [Bacillus subtilis]6042540
 68435905203gi|1573583H. influenzae predicted coding region HI0594 [Haemophilus60361614 
influenzae]
 7011 57816182gnl|PID|d102014(AB001488) SIMILAR TO YDFR GENE PRODUCT OF THIS6033402
ENTRY (YDFR_BACSU) [Bacillus subtilis]
 7012 63438133gnl|PID|e324970hypothetical protein [Bacillus subtilis]60381791
 71811701 14157 gi|580866ipa-12d gene product [Bacillus subtilis]60332457
 74812509 11664 gnl|PID|d101832phosphatidate cytidylyltransferase [Synechocystis sp.]6045846
 76441163367gi|2352096orf; similar to serine/threonine protein phosphatase [Fervidobacterium6039750
islandicum]
 8047665gi|1786420(AE000131) f86; 100 pct identical to GB: ECODINJ_66030294
ACCESSION: D38582 [Bacillus subtilis]
 81640734522gi|147402mannose permease subunit III-Man [Escherichia coli]6035450
 861 940 155gi|143177putative [Bacillus subtilis]6026786
 921  1 192gi|396348homoserine transsuccinylase [Escherichia coli]6045192
 9314 10619 9384gi|1788389(AE000297) o464; This 464 aa orf is 33 pct identical (9 gaps) to 33160271236 
residues of an approx. 416 aa protein MTRC_NEIGO SW: P43505
[Escherichia coli]
 94555488121gnl|PID|e329895(AJ000496) cyclic nucleotide-gated channel beta subunit [Rattus60502574
norvegicus]
 97753974533gi|1591396transketolase′ [Methanococcus jannaschii]6043864
102220812833gnl|PID|e320929hypothetical protein [Mycobacterium tuberculosis]6043753
106997739183YlbN protein [Bacillus subtilis]6031591
113863616837gi|466875nifU; BB1496_C1_157 [Mycobacterium leprae]6043477
11522755 524gnl|PID|e328143(AJ000332) Glucosidase II [Homo sapiens]60322232
122747635068transposase [Synechocystis sp.]6039306
127845105283gi|17779386038774
138430822672gnl|PID|e325196hypothetical protein [Bacillus subtilis]6036411
1391 177  4gnl|PID|d100680ORF [Thermus thermophilus]6039174
13911 14520 13009 gi|537145ORF_f437 [Escherichia coli]60301512
140225921249gi|1209527protein histidine kinase [Enterococcus faecalis]60371344
1411 2101049gi|463181E5 ORF from bp 3842 to 4081; putative [Human papillomavirus6034840
type 33]
141553686405 gi|145362tyrosine-sensitive DAHP synthase (aroF) [Escherichia coli]60411038 
142635584049gi|600711putative [Bacillus subtilis]6037492
14810 77428713hypothetical protein [Bacillus subtilis]6027972
153536674278gi|2293322(AF008220) branch-chain amino acid transporter [Bacillus subtilis]6042612
15511413 748gi|2104504putative UDP-glucose dehydrogenase [Escherichia coli]6040666
158331162472gnl|PID|d100872a negative regulator of pho regulon [Pseudomonas aeruginosa]6037645
1593 7783386product highly similar to Bacillus anthracis CapA [Bacillus subtilis]6048609
163780498468gnl|PID|d101313YqeN [Bacillus subtilis]6038420
170341302688gi|1574179H. influenzae predicted coding region HI1244 [Haemophilus60391443
influenzae]
1717 47175901gi|606076ORF_o384 [Escherichia coli]60441185 
183324402135gi|1877427repressor [Streptococcus pyogenes phage T12]6038306
19110 94448428gi|415664catabolite control protein [Bacillus megaterium]60421017
2001 1391083gi|438462transmembrane protein [Bacillus subtilis]6037945
201338951928gi|475112enzyme IIabe [Pediococcus pentosaceus]60391968
21415 10930 10439 gi|1573407hypothetical [Haemophilus influenzae]6039492
218421452363gi|608520myosin heavy chain kinase A [Dictyostelium discoideum]6031219
226425182351gi|437705hyaluronidase [Streptococcus pneumoniae]6053168
2421 725  3gi|43938Sor regulator [Klebsiella pneumoniae]6041723
2451  1 288gi|304897EcoE type I restriction modification enzyme M subunit [Escherichia6056288
coli]
2511 905 45gi|671632unknown [Staphylococcus aureus]6036861
2591 969 82gi|153794rgg [Streptococcus gordonii]6032888
260214921662pir|S31840|S318probable transposase - Bacillus stearothermophilus6026171
2741 836 96gi|1592173N-ethylammeline chlorohydrolase [Methanococcus jannaschii]6040741
3081 463  2gi|1787397(AE000214) o157 [Escherichia coli]6043462
3181  3 308gnl|PID|e137594xerC recombinase [Lactobacillus leichmannii]6042306
3441 73 522gi|509672repressor protein [Bacteriophage Tuc2009]6032450
51 576  4gi|2293147(AF008220) YtxM [Bacillus subtilis]5931573
722 18140 17142 gnl|PID|e280724unknown [Mycobacterium tuberculosis]5939999
 1011413  4gi|1353880sialidase L [Macrobdella decora]59411410 
 15664635156F1 [Bacillus subtilis]59351308 
 222 4791393gi|142469als operon regulatory protein [Bacillus subtilis]5934915
 22526984614gnl|PID|e280623PCPA [Streptococcus pneumoniae]59441917
 301 208558gnl|PID|e233868hypothetical protein [Bacillus subtilis]5937351
 30436782455gnl|PID|e202290unknown [Lactobacillus sake]59331224 
 3513 12201 11071 gnl|PID|e238664hypothetical protein [Bacillus subtilis]59351131
 3514 13288  12182 gi|1657647Cap8H [Staphylococcus aureus]59391107 
 3618 18076 17897 gi|1500535M. jannaschii predicted coding region MJ1635 [Methanococcus5933
jannaschii]
 3812  61727137gi|2293239(AF008220) YtxK [Bacillus subtilis]5934966
 42319523361gi|1684845pinin [Canis familiaris]59401410
 50326781728gnl|PID|d101329YqjK [Bacillus subtilis]5941951
 56518702388gnl|PID|e137594xerC recombinase [Lactobacillus leichmannii]5941519
 61668125628 gnl|PID|e311516aminotransferase [Bacillus subtilis]59401185 
 67523823023gi|11461902-keto-3-deoxy-6-phosphogluconate aldolase [Bacillus subtilis]5936642
 6910 85678899gi|1573628pantothenate kinase (coaA) [Haemophilus influenzae]5938333
 8712 11383 10055 gnl|PID|e323504putative Fmu protein [Bacillus subtilis]59441325
11314 13927 15894 gi|1673731(AE000010) Mycoplasma pneumoniae, fructose-permease IIBC59431968
component; similar to Swiss-Prot Accession Number P20966, from
E. coli [Mycoplasma pneumoniae]
115887668521gi|1590886M. jannaschii predicted coding region MJ0110 [Methanococcus5938
jannaschii]
119219661526gnl|PID|e209005homologous to ORF2 in nrdEF operons of E. coli and S. typhimurium5943441
[Lactococcus lactis]
12817 13438 13178 gnl|PID|e279632unknown [Mycobacterium tuberculosis]5938261
14022 23903 23388 gi|482922protein with homology to pail repressor of B. subtilis [Lactobacillus59516
delbrueckii]
14813 96979014gnl|PID|d1020055932684
H. INFLUENZAE AND SYNECHOCYSTIS. [Bacillus subtilis]
14910 8244gi|710422cmp-binding-factor 1 [Staphylococcus aureus]59401032
164969936013gnl|PID|d100965ferric anguibactin-binding protein precursor FabT of V. anguillarum5941981
[Bacillus subtilis]
16412 7823gnl|PID|d100964homologue of ferric anguibactin transport system permease protein59351014
FatC of V. anguillarum [Bacillus subtilis]
17724011072gi|289759coded for by C. elegans cDNA CE2G3 (GenBank:Z14728); putative5940672
[Caenorhabditis elegans]
177738414200gi|2313445(AE000551) H. pylori predicted coding region HP0342 [Helicobacter5938
pylori]
183427682508gi|509672repressor protein [Bacteriophage Tuc2009]5950261
186633982820gi|606080ORF_o290; Geneplot suggests frameshift linking to o267, not found5938579
[Escherichia coli]
190331201711gi|1613768histidine protein kinase [Streptococcus pneumoniae]59321410
194216211019gnl|PID|d100579unknown [Bacillus subtilis]5940603
198752054306gnl|PID|e313073hypothetical protein [Bacillus subtilis]5938900
220543623958gnl|PID|d101322YqhL [Bacillus subtilis]5946405
242315732367gi|1787045(AE000184) f308; This 308 aa orf is 35 pct identical (35 gaps) to 3055942795
residues of an approx. 296 aa protein PFLC_ECOLI SW: P32675
[Escherichia coli]
247211541480gi|40073ORF107 [Bacillus subtilis]5939327
2561 868  2gnl|PID|d101924hemolysin [Synechocystis sp.]5939867
2581 65 820gi|2246532ORF 73, contains large complex repeat CR 73 [Kaposi's5920756
sarcoma-associated herpesvirus]
2701 3861126gnl|PID|d102092YfnB [Bacillus subtilis]5940741
2811 552 166putative [Lactococcus lactis]5931387
3091  3 479gi|405879yeiH [Escherichia coli]5938477
3631  21894gi|915208gastric mucin [Sus scrofa]59311893 
3872 425 84gi|160671S antigen precursor [Plasmodium falciparum]5944342
5511223 10465 gnl|PID|d101812LumQ [Synechocystis sp.]5829759
 29420983513gnl|PID|d100479Na+-ATPase subunit J [Enterococcus hirae]58391416
 30540583651ATP binding protein of transport ATPases [Bacillus firmus]5834408
 33629832210gnl|PID|d101164unknown [Bacillus subtilis]5845774
 36853166179gi|1518679orf [Bacillus subtilis]5832864
 43559263971gi|1788150(AE000278) protease II [Escherichia coli]58371956 
 46537045221gnl|PID|e267329Unknown [Bacillus subtilis]58421518 
 4814 11722  11066 gnl|PID|d101771thiamin biosynthetic bifunctional enzyme [Synechocystis sp.]5834657
 5211229  3gnl|PID|d101291reductase [Pseudomonas aeruginosa]58351227
 532 702 412gi|2313357(AE000545) cytochrome c biogenesis protein (ccdA) [Helicobacter5825
pylori]
 58465865498gi|147329transport protein [Escherichia coli]58411089
 69549343807gnl|PID|e311492unknown [Bacillus subtilis]58411128 
 7127 31357  32277 gi|2408014hypothetical protein [Schizosaccharomyces pombe]5833921
 72435862882gi|18694nodulin-21 (AA 1-201) [Glycine max]5834705
 74349374230gi|2293252(AF008220) YtmO [Bacillus subtilis]5833708
 79445943422gi|1217989ORF3 [Streptococcus pneumoniae]58441173
 82810585 8171gi|882711exonuclease V alpha-subunit [Escherichia coli]58382415
 8617 16017 15337 gi|476425-dehydroquinate hydrolyase (3-dehydroquinase) [Salmonella typhi]5832681
 972 931 560rgg [Streptococcus gordonii]5832372
1082 3582724gi|537020vacB gene product [Escherichia coli]58372367 
111545935240gi|1592142ABC transporter, probable ATP-binding subunit [Methanococcus5836
jannaschii]
120344215110gnl|PID|d101320YqgX [Bacillus subtilis]5847690
12816 13131 12673 gi|662919ORF U [Enterococcus hirae]5842459
132361744939gi|1800301macrolide-efflux determinant [Streptococcus pneumoniae]58351236
1331 111 890 gnl|PID|e269488Unknown [Bacillus subtilis]5836780
16011 86159865ORF1 [Lactococcus lactis]58391251 
161662686849gnl|PID|d101024DJ-1 protein [Homo sapiens]5832582
1691 214  2gnl|PID|d100447translation elongation factor-3 [Chlorella virus]5831213
1871 487  2gi|475114regulatory protein [Pediococcus pentosaceus]5838486
187643844620desiccation-related protein [Craterostigma plantagineum]5855237
190214641640competence pheromone [Streptococcus gordonii]5838177
192220121344gnl|PID|d100556rat GCP360 [Rattus rattus]5844669
20611292 696gnl|PID|e202579product similar to WrbA [Lactobacillus sake]5835597
21622333 555gnl|PID|e325036hypothetical protein [Bacillus subtilis]58331779
217552504321cellobiose phosphotransferase enzyme II″ [Bacillus5838930
stearothermophilus]
217756365106gnl|PID|d102048B. subtilis cellobiose phosphotransferase system celB; P46317 (998)5844531
transmembrane [Bacillus subtilis]
2321  2 811gi|1573777cell division ATP-binding protein (ftsE) [Haemophilus influenzae]5839810
2641  2 715gi|973330NatA [Bacillus subtilis]5832714
2801 33 767gi|1786187(AE000111) hypothetical 29.6 kD protein in thrC-talB intergenic5831735
region [Escherichia coli]
3061 845  3gnl|PID|e334780YlbL protein [Bacillus subtilis]5847843
360315561092sp|P46351|YZGD_HYPOTHETICAL 45.4 KD PROTEIN IN THIAMINASE I5832465
5′REGION
363521601867gi|160671S antigen precursor [Plasmodium falciparum]5851294
3721 806  3gi|393394Tb-291 membrane associated protein [Trypanosoma brucei subgroup]5837804
3822 749 519hypothetical 20.3K protein (insertion sequence IS1131) -5841231
Agrobacterium tumefaciens (strain PO22) plasmid Ti
3084097471gi|1499745M. jannaschii predicted coding region MJ0912 [Methanococcus5738
jannaschii]
 1010  76747507gi|1737169homologue to SKP1 [Arabidopsis thaliana]5730168
 111  2 412gnl|PID|d100139ORF [Acetobacter pasteurianus]5742411
 31420321388gi|2293213(AF008220) YtpR [Bacillus subtilis]5737645
 3311 69316449gnl|PID|e324949hypothetical protein [Bacillus subtilis]5736483
 45554465060gi|1592204phosphoserine phosphatase [Methanococcus jannaschii]5744387
 49765237632PTS enzyme-II fructose [Xanthomonas campestris]57351110
 52545206850gi|1574144single-stranded-DNA-specific exonuclease (recJ) [Haemophilus57352331
influenzae]
 5351795gi|1843580replicase-associated polyprotein [oat blue dwarf virus]5746285
 63653124995gi|2182608(AE000094) Yr4J [Rhizobium sp. NGR234]5739318
 7215 13883 13059 gnl|PID|d100892homologous to SwissProt:YIDA_ECOLI hypothetical protein5740825
[Bacillus subtilis]
 79225611815gnl|PID|d100965homologue of NADPH-flavin oxidoreductase Frp of V. harveyi5744747
[Bacillus subtilis]
 82995969763gi|1206045short region of similarity to glycerophosphoryl diester5735168
phosphodiesterases [Caenorhabditis elegans]
 8616  15371 14493 gi|1787983(AE000264) o288; 92 pct identical (1 gaps) to 222 residues of5734879
fragment YDIB_ECOLI SW: P28244 (223 aa) [Escherichia coli]
 93316951177gi|1500003mutator mutT protein [Methanococcus jannaschii]5733519
 96630264519threonine synthase [Arabidopsis thaliana]57431494 
 9914 17211  18212 gi|773349BirA protein [Bacillus subtilis]57441002 
112874487903M. jannaschii predicted coding region MJ0678 [Methanococcus5730
jannaschii]
11316 18328 pir|A45605|A456 mature-parasite-infected erythrocyte surface antigen MESA -5722300
Plasmodium falciparum
1232  3431110pir|F64149|F641hypothetical protein HI0335 - Haemophilus influenzae (strain Rd5738768
KW20)
1234 21082884gnl|PID|d102148(AB001684) sulfate transport system permease protein [Chlorella5739777
vulgaris]
12710 6477 5587gi|1573082nitrogenase C (nifC) [Haemophilus influenzae]5735891
12813 92519790gi|153692pneumolysin [Streptococcus pneumoniae]5738540
131421391363gi|42081nagD gene product (AA 1-250) [Escherichia coli]5736777
1361 2141221bbs|148453SpA = endocarditis immunodominant antigen (Streptococcus5744
sobrinus, MUCOB 263, Peptide, 1566 aa) [Streptococcus sobrinus]
14025 26851 gi|505576beta-glucoside permease [Bacillus subtilis]57381851
141663957438unknown [Schizosaccharomyces pombe]57411044 
144332312785 gnl|PID|d100139ORF [Acetobacter pasteurianus]5742447
155454544564glycosyl transferase [Erwinia amylovora]5734891
159948775854 gi|290509o307 [Escherichia coli]5735978
16711 97109429gnl|PID|d100139ORF [Acetobacter pasteurianus]5742462
171640234436mannose permease subunit III-Man [Escherichia coli]5729414
178421701076gnl|PID|d102004(AB001488) ATP-DEPENDENT RNA HELICASE DEAD57391095
HOMOLOG. [Bacillus subtilis]
1901 1451455gi|149420export/processing protein [Lactococcus lactis]57301311
1981 298 95gi|522268unidentified ORF22 [Bacteriophage bIL67]5736204
203231952110gnl|PID|e283915orf c01003 [Sulfolobus solfataricus]57411086
2051 40 507gi|1439527EIIA-man [Lactobacillus curvatus]5728468
214742433797gnl|PID|d102049H. influenzae, ribosomal protein alanine acetyltransferase; P443055748447
(189) [Bacillus subtilis]
268317671276gi|43979L. curvatus small cryptic plasmid gene for rep protein [Lactobacillus5736
curvatus]
3511 324 34gnl|PID|e275871T03F6.b [Caenorhabditis elegans]5731291
3861226  2gi|160671S antigen precursor [Plasmodium falciparum]5745225
5510486 8777gi|405857yehU [Escherichia coli]56331710
8536743910gi|467199 pksC; L518_F1_2 [Mycobacterium leprae]5639237
 10334421874gnl|PID|d101907sodium-coupled permease [Synechocystis sp.]56361569
 2111880 333gi|2313949(AE000593) osmoprotection protein (proWX) [Helicobacter pylori]56331548
 2229 21968 22456 gnl|PID|d102001(AB001488) PROBABLE ACETYLTRANSFERASE. [Bacillus5637489
subtilis]
 2711361  3gi|215132ea59 (525) [Bacteriophage lambda]56301359 
 28946674278DNA repair protein RAD2 [Methanococcus jannaschii]5629390
 331  3 386gnl|PID|d100139ORF [Acetobacter pasteurianus]5641384
 36751225397pir|PQ0053|PQ00hypothetical protein (proC 3′ region) - Pseudomonas aeruginosa5628276
(strain PAO) (fragment)
 40431374318gi|1800301macrolide-efflux determinant [Streptococcus pneumoniae]56271182
 4016 12511 gnl|PID|e217602PlnU [Lactobacillus plantarum]5638681
 4817 13775 13023 gi|143729transcription activator [Bacillus subtilis]5635753
 75416742594gnl|PID|d102036membrane protein [Bacillus stearothermophilus]5625921
 85318421459gnl|PID|d100139ORF [Acetobacter pasteurianus]5641384
 89758154940gi|853777product similar to E. coli PRFA2 protein [Bacillus subtilis]5642876
105213602718gnl|PID|d101913hypothetical protein [Synechocystis sp.]56371359
112321513194gi|537201ORF_o345 [Escherichia coli]56311044 
113427542963gnl|PID|d100340ORF [Plum pox virus]5628210
122312032054gi|1649035high-affinity periplasmic glutamine binding protein [Salmonella5630852
typhimurium]
12483939 3694gnl|PID|e248893unknown [Mycobacterium tuberculosis]5627246
125444034107human non-muscle myosin heavy chain [Homo sapiens]5632297
12711 66086405(AE000073) Y4fN [Rhizobium sp. NGR234]5635204
134547693849gnl|PID|d101870hypothetical protein [Synechocystis sp.]5639921
13710 68147245gi|1592011sulfate permease (cysA) [Methanococcus jannaschii]5634432
142850194582pir|A47071|A470orf1 immediately 5′ of nifS - Bacillus subtilis5629438
146846763660gnl|PID|d101911hypothetical protein [Synechocystis sp.]56321017
148319062739gnl|PID|d101099phosphate transport system permease protein PstA5636834
[Synechocystis sp.]
150444492743gnl|PID|e304628probably site-specific recombinase of the resolvase family enzyme56271707
[Bacteriophage TP21]
1721  2 208gi|1787791(AE000249) f317; This 317 aa orf is 27 pct identical (16 gaps) to 3015634207
residues of an approx. 320 aa protein YXXC_BACSU SW: P39140
[Escherichia coli]
172749795668gi|396293similar to Bacillus subtilis hypoth. 20 kDa protein, in tsr 3′ region5640690
[Escherichia coli]
186737323367gi|1732200PTS permease for mannose subunit IIPMan [Vibrio furnissii]5636366
18722402 819virR49 protein - Streptococcus pyogenes (strain CS101, serotype56351584 
M49)
20427722239gi|606376ORF_o162 [Escherichia coli]5635534
206233421633gi|559861clyM [Plasmid pAD1]56381710
219316891096gi|1146197putative [Bacillus subtilis]5627594
2302 4091485pir|C60328|C603hypothetical protein 2 (sr 5′ region) - Streptococcus mutans (strain56401077 
OMZ175, serotype f)
233429303268gi|1041785rhoptry protein [Plasmodium yoelii]5624339
273215432724gi|143089iep protein [Bacillus subtilis]56321182
3531  1 516gnl|PID|e325000hypothetical protein [Bacillus subtilis]5641516
3591 87 641gi|1786952(AE000176) o877; 100 pct identical to the first 86 residues of the5635555
100 aa hypothetical protein fragment YBGB_ECOLI SW: P54746
[Escherichia coli]
363744824198gi|1573353outer membrane integrity protein (tolA) [Haemophilus influenzae]5638285
3761  2 508gnl|PID|e325031hypothetical protein [Bacillus subtilis]5633507
 181 836 177gnl|PID|d100872a negative regulator of pho regulon [Pseudomonas aeruginosa]5531660
 28418241618STAT protein [Dictyostelium discoideum]5540207
 29644965041unknown protein [Anabaena sp.]5531546
 3816 969510702 gi|580905B. subtilis genes rpmH, rnpA, 50kd, gidA and gibB [Bacillus subtilis]55311008
 49557276182gi|1786951(AE000176) heat-responsive regulatory protein [Escherichia coli]5529456
 51423813241gnl|PID|d101293YbbA [Bacillus subtilis]5542861
 529964010866 gi|153016ORF 419 protein [Staphylococcus aureus]55231227 
 53418131349OspF [Borrelia burgdorferi]5530465
 60547945756 gi|1499876magnesium and cobalt transport protein [Methanococcus jannaschii]5538963
 71914176 15408 gi|1857120glycosyl transferase [Neisseria meningitidis]55411233
 75531894229 gnl|PID|e108780NAD alcohol dehydrogenase [Bacillus subtilis]55441041 
10810 10488 9820 gnl|PID|e324997hypothetical protein [Bacillus subtilis]5536669
11312 12273 13037 gnl|PID|e311496unknown [Bacillus subtilis]5534765
11313 13007 13945 gi|15734231-phosphofructokinase (fruK) [Haemophilus influenzae]5539939
126567645907gi|1790131(AE000446) hypothetical 29.7 kD protein in ibpA-gyrB intergenic5537858
region [Escherichia coli]
12932719 902gnl|PID|d101425Pz-peptidase [Bacillus licheniformis]55351818
138325931610gi|142833ORF2 [Bacillus subtilis]5537984
140669165633gnl|PID|d100964homologue of hypothetical protein in a rapamycin synthesis gene55261284
cluster of Streptomyces hygroscopicus [Bacillus subtilis]
14733 8542136gi|472330dihydrolipoamide dehydrogenase [Clostridium magnum]55391719 
14710 10204 8921 gnl|PID|e73078dihydroorotase [Lactobacillus leichmannii]55381284 
148534304119gi|290572peripheral membrane protein U [Escherichia coli]5529690
148641714650gi|695769transposase [Xanthobacter autotrophicus]5537480
14914 12564 11650 gnl|PID|d101329YqjG [Bacillus subtilis]5532915
15631113 550gi|2314496(AE000634) conserved hypothetical integral membrane protein5534564
[Helicobacter pylori]
15910 66255897gi|290533similar to E. coli ORF adjacent to suc operon; similar to gntH class of5529729
regulatory proteins [Escherichia coli]
164317842332gnl|PID|e255118hypothetical protein [Bacillus subtilis]5537549
164527723521gi|40348put. resolvase Tnp I (AA 1-284) [Bacillus thuringiensis]5535750
16411 74287216gnl|PID|e249407unknown [Mycobacterium tuberculosis]5538213
167538603345involved in protein secretion [Bacillus subtilis]5528516
186528802563gi|606080ORF_o290; Geneplot suggests frameshift linking to o267, not found5535318
[Escherichia coli]
189843115396gnl|PID|e183450hypothetical EcsB protein [Bacillus subtilis]55321086
192532703079vitellogenin convertase [Aedes aegypti]5538192
195224541384gi|1574693transferase, peptidoglycan synthesis (murG) [Haemophilus55331071
influenzae]
1984 30132471gnl|PID|e313074hypothetical protein [Bacillus subtilis]5529543
2141 373 744transposase [Synechocystis sp.]5533372
21921115 456gi|288301 ORF2 gene product [Bacillus megaterium]5530660
263737423443gi|18137cgcr-4 product [Chlamydomonas reinhardtii]5548300
2851  2 829gnl|PID|d100974unknown [Bacillus subtilis]5540828
2861 650 249ORF (18 kDa) [Vibrio cholerae]5531402
297212291696gi|150848prtc [Porphyromonas gingivalis]5539468
3092 218 982gi|1574491hypothetical [Haemophilus influenzae]5535765
3282 646 224gi|571500prohibition [Saccharomyces cerevisiae]5527423
33011340 474soxS [Escherichia coli]5529867
364325381546gi|393394Tb-291 membrane associated protein [Trypanosoma brucei subgroup]5536993
3683 941 105S antigen precursor [Plasmodium falciparum]5540837
3546043624gi|2293176(AF008220) signal transduction protein kinase [Bacillus subtilis]5426981
911 77467246gi|1146245 putative [Bacillus subtilis]5438501
 3824 16213 17937 gi|1480429putative transcriptional regulator [Bacillus stearothermophilus]54271725
 4085076gi|39989methionyl-tRNA synthetase [Bacillus stearothermophilus]5351 95
 43439802367gnl|PID|e148611ABC transporter [Lactobacillus helveticus]54251614
 5210 10844 gi|1762962FemA [Staphylococcus simulans]54291260 
 571  3 512gi|558177endo-1,4-beta-xylanase [Cellulomonas fimi]5436510
 58347494246gnl|PID|d101237hypothetical [Bacillus subtilis]5429504
 71710684 11703  gi|510255orf3 [Escherichia coli]54311020 
 7120 27546 27737 gi|202543serotonin receptor [Rattus norvegicus]5431192
 722 8441098 gi|148613arnB gene product [Plasmid F]5437255
 72774386695gi|11964965438744
 7410 14043 13465 gi|1200342ORF 3 gene product [Bradyrhizobium japonicum]5432579
 7412 16483 15995 gi|2317798maturase-related protein [Pseudomonas alcaligenes]5430489
 86328772155 gi|46988orf9.6 possibly encodes the O unit polymerase [Salmonella enterica]5434723
 89544333921gi|147211phnO protein [Escherichia coli]5441513
 901  3 464gi|2317798maturase-related protein [Pseudomonas alcaligenes]5430462
 9610 80588510gnl|PID|d102015(AB001488) SIMILAR TO SALMONELLA TYPHIMURIUM SLYY5432453
GENE REQUIRED FOR SURVIVAL IN MACROPHAGE.
[Bacillus subtilis]
 97646623604gi|1591394transketolase [Methanococcus jannaschii]54301059
10611 10406  12010 gi|1606286ORF_o637 [Escherichia coli]54321605
147886637404gnl|PID|d101615ORF_ID:o319#7; similar to (SwissProt Accession Number P37340)54351260
[Escherichia coli]
171424773223gi|1439528EIIC-man [Lactobacillus curvatus]5436747
174220681787gnl|PID|d100518motor protein [Homo sapiens]5435282
1881 5261188 gnl|PID|e250352unknown [Mycobacterium tuberculosis]5431663
198535822884hypothetical protein [Bacillus subtilis]5433699
2071  11641gnl|PID|d101813hypothetical protein [Synechocystis sp.]54241641
2101  2 655gi|2293206(AF008220) YtmP [Bacillus subtilis]5429654
2252 9662357gnl|PID|e330194R11H6.1 [Caenorhabditis elegans]54391392 
24111681 347 gnl|PID|d101813hypothetical protein [Synechocystis sp.]54261335 
2632 9071395gnl|PID|d101886transposase [Synechocystis sp.]5430489
263634502977gi|1606715447474
277325171363gi|1196926unknown protein [Streptococcus mutans]54301155 
3071 828  4gi|2293198(AF008220) YtgP [Bacillus subtilis]5428825
3251 19 768gi|2182507(AE000083) Y41H (Rhizobium sp. NGR234)5437750
3322 898 590gi|1591815ADP-ribosylglycohydrolase (draG) [Methanococcus jannaschii]5432309
3854 240 479gi|530878amino acid feature: N-glycosylation sites, aa 41 . . . 43, 46 . . . 48,5449240
51 . . . 53, 72 . . . 74, 107 . . . 109, 128 . . . 130, 132 . . . 143,
158 . . . 160, 153 . . . 165, amino acid feature: Rod protein domain,
aa 169 . . . 340; amino acid feature: globular protein domai
725 19702 19493 gnl|PID|e255111hypothetical protein [Bacillus subtilis]5332210
 23324972033gnl|PID|d102015(AB001488) SIMILAR TO SALMONELLA TYPHIMURIUM SLYY5325465
GENE REQUIRED FOR SURVIVAL IN MACROPHAGE.
[Bacillus subtilis]
 2911 904210121 gi|143331alkaline phosphatase regulatory protein [Bacillus subtilis]53311080
 33314791009pir|S10655|S106hypothetical protein X - Pyrococcus woesei (fragment)5333471
 36645835134unknown [Mycobacterium tuberculosis]5330552
 3814 85218898 gi|580904homologous to E. coli rnpA [Bacillus subtilis]5330378
 52770078686gi|1377831unknown [Bacillus subtilis]53291680 
 5417 17555  19564 gi|666069orf2 gene product [Lactobacillus leichmannii]53362010 
 561  1 681gi|1592266restriction modification system S subunit [Methanococcus jannaschii]5332681
 571094318487 gi|1788543(AE000310) f351; Residues 1-121 are 100 pct identical to5331945
YOJL_ECOLI SW; P33944 (122 aa) and aa 152-351 are 100 pct
identical to YOJK_ECOLI SW; P33943 [Escherichia coli]
 611  429  4gnl|PID|e236467B0024.13 [Caenorhabditis elegans]5333426
 7115772  4gi|393394Tb-291 membrane associated protein [Trypanosoma brucei subgroup]53335769 
 723 8942840gi|2293178(AF008220) YtsD [Bacillus subtilis]53271947 
 7314 97939212 gi|1778556putative cobalamin synthesis protein [Escherichia coli]5332582
 88752174342gi|2098719putative fimbrial-associated protein [Actinomyces naeslundii]5338876
 93523951688gluconate oxidoreductase [Gluconobacter oxydans]5333708
 96966327762 gi|517204ORF1, putative 42 kDa protein [Streptococcus pyogenes]53421131 
108876298600maturation protein [Lactobacillus paracasei]5332972
128964126972 gnl|PID|e317237unknown [Mycobacterium tuberculosis]5336561
12812 84299253gi|311070pentraxin fusion protein [Xenopus laevis]5331825
141  3 950pir|A61607|A616probable hemolysin precursor - Streptococcus agalactiae (strain5336948
74-360)
163221623022gi|1755150nocturnin [Xenopus laevis]5330861
171323042624gi|1732200PTS permease for mannose subunit IIPMan [Vibrio furnissii]5332321
182537853051 gnl|PID|d100572unknown [Bacillus subtilis]5335735
209329481935gi|1778505ferric enterobactin transport protein [Escherichia coli]53281014
218538842406gi|140162murE gene product [Bacillus subtilis]53341479
2503 473 790gnl|PID|e334776YlbH protein [Bacillus subtilis]5330318
2751  11611gnl|PID|d101314YqeW [Bacillus subtilis]53351611 
3321 544  2gi|409286bmrU [Bacillus subtilis]5331543
2225433445gnl|PID|e233879hypothetical protein [Bacillus subtilis]5239903
322 22402 23376 gi|38969lacF gene product [Agrobacterium radiobacter]5236975
5380942356gnl|PID|e324915IgA1 protease [Streptococcus sanguis]52325739
 2226 19961 20212 gi|152901ORF 3 [Spirochaeta aurantia]5235252
 2231 23140 24666 gi|289262comE ORF3 [Bacillus subtilis]52321527
 27653974801gi|39573P20 (AA 1-178) [Bacillus licheniformis]5235597
 3510 86047357gi|508241putative O-antigen transporter [Escherichia coli]52271248
 45448013662gnl|PID|d102243(AB005554) homologs are found in E. coli and H. influenzae; see52361140 
SWISS_PROT ACC#: P42100 [Bacillus subtilis]
 4818 14385 13726 gnl|PID|e2051745225660
 49453215755(AF013987) nitrogen regulatory IIA protein [Vibrio cholerae]5219435
 54427734668gi|1500472M. jannaschii predicted coding region MJ1577 [Methanococcus5236
jannaschii]
 546 52504969gi|2182453(AE000079) Y4iO [Rhizobium sp. NGR234]5240282
 66684006955gi|43140TrkG protein [Escherichia coli]52301446
 7126 30659 31312 gnl|PID|e314993unknown [Mycobacterium tuberculosis]5223654
 75216731035gnl|PID|d102271(AB001683) FarA [Streptomyces sp.]5227639
 81314392893gnl|PID|e311458rhamnulose kinase [Bacillus subtilis]52321455
 81849875781gi|147403mannose permease subunit II-P-Man [Escherichia coli]5237795
 8321 20687 21853  gi|143365phosphoribosyl aminoimidazole carboxylase II (PUR-K; ttg start52371167 
codon) [Bacillus subtilis]
 86657854592gi|1276879EpsF [Streptococcus thermophilus]52261194
 8620 19390  17861 gi|454844ORF 3 [Schistosoma mansoni]52261530
 9613 10540 9659 gi|288299ORF1 gene product [Bacillus megaterium]5233882
1111  22026gi|148309cytolysin B transport protein [Enterococcus faecalis]52272025 
112214572167orf1 [Haemophilus influenzae]5233711
118329312365bbs|151233Mip = 24 kda macrophage infectivity potentiator protein [Legionella5233567
pneumophila, Philadelphia-1, Peptide, 184 aa] [Legionella
pneumophila]
122956465951gi|8214 myosin heavy chain [Drosophila melanogaster]5236306
12211 61596374gi|434025dihydrolipoamide acetyltransferase [Pelobacter carbinolicus]5252216
134648806313M protein trans-acting positive regulator [Streptococcus pyogenes]52431434 
135312382716unknown [Mycobacterium tuberculosis]52351479
141316812319gnl|PID|d100573unknown [Bacillus subtilis]5232639
161425625024gi|114624322.4% identity with Escherichia coli DNA-damage inducible52362463
protein . . . ; putative [Bacillus subtilis]
1732 968 183gi|1215693putative orf; GT9_orf434 [Mycoplasma pneumoniae]5230786
198644003567gnl|PID|e313010hypothetical protein [Bacillus subtilis]5226834
21012 88449107DNA gyrase subunit B [Mycoplasma genitalium]5238264
21410 52645431gi|550697envelope protein [Human immunodeficiency virus type 1]5236168
2251 15 884gi|1552773hypothetical [Escherichia coli]5234870
2301 39 362gnl|PID|d100582unknown [Bacillus subtilis]5228324
2871 871  2gnl|PID|e335028protease/peptidase [Mycobacterium leprae]5229870
36321305  4gi|393394Tb-291 membrane associated protein [Trypanosoma brucei subgroup]52321302 
 23220481173gnl|PID|e254943unknown [Mycobacterium tuberculosis]5130876
 293 7421521gi|9299005′-methylthioadenosine phosphorylase [Sulfolobus solfataricus]5131780
 451 4101597gi|1877429integrase [Streptococcus pyogenes phage T12]51321188 
 4826 19227 18946 (AE000633) transcriptional regulator (tenA) [Helicobacter pylori]5133282
 73542764016gi|474177alpha-D-1,4-glucosidase [Staphylococcus xylosus]5131261
 8111 893512057 gi|311070pentraxin fusion protein [Xenopus laevis]51312123
 83511951986yqfI [Bacillus subtilis]5133792
 9810 75318538gi|41500ORF 3 (AA 1-352); 38 kD (put. ftsX) [Escherichia coli]51281008 
113639085173gi|466882pps1; B1496_C2_189 [Mycobacterium leprae]51271266
1241 326 57gi|2191168(AF007270) contains similarity to myosin heavy chain [Arabidopsis5132270
thaliana]
12910 72866816gi|1046241orf14 [Bacteriophage HP1]5130471
143349633983gi|13549355126981
14815 11359 10226 gi|2293256(AF008220) putative hippurate hydrolase [Bacillus subtilis]51361134 
149860037313Herpesvirus saimiri ORF73 homolog [Kaposi's sarcoma-associated51211311 
herpes-like virus]
151912092 gnl|PID|e281580hypothetical 40.7 kd protein [Bacillus subtilis]5134543
159625553208gi|146944CMP-N-acetylneuraminic acid synthetase [Escherichia coli]5136654
17411797  4gi|1773166probable copper-transporting atpase [Escherichia coli]51281794 
265422311773gnl|PID|e256400anti-P. falciparum antigenic polypeptide [Saimiri sciureus]5118459
2772 6431311pir|S32915|S329pilD protein - Neisseria gonorrhoeae5133669
3501 890  3gi|290509o307 [Escherichia coli]5130888
363412284485gi|1707247partial CDS [Caenorhabditis elegans]51233258
36711701  4gi|393394Tb-291 membrane associated protein [Trypanosoma brucei subgroup]51321698
 15551744497gnl|PID|e58151F3 [Bacillus subtilis]5038678
 16422202582gnl|PID|e325010hypothetical protein [Bacillus subtilis]5029363
 19525914159gi|1552733similar to voltage-gated chloride channel protein [Escherichia coli]50301569 
 25427011997gi|887849ORF_f219 [Escherichia coli]5027705
 351 211 417gnl|PID|e236697unknown [Saccharomyces cerevisiae]5033207
 39434165152unknown [Bacillus subtilis]50271737 
 51740005181gi|1592027carbamoyl-phosphate synthase, pyrimidine-specific, large subunit50271182 
[Methanococcus jannaschii]
 5198303gi|1591847type I restriction-modification enzyme, S subunit [Methanococcus5028
jannaschii]
 528 87409534gi|144297acetyl esterase (XynC) [Caldocellum saccharolyticum]5034795
 5216 16951 gi|2108229basic surface protein [Lactobacillus fermentum]5034822
 5776031633660S ribosomal protein L7B [Schizosaccharomyces pombe]5040306
 7123 29348 28383†‚gnl|PID|d101328YqjA [Bacillus subtilis]5030966
 8612 11155 1076 9 gnl|PID|e324964hypothetical protein [Bacillus subtilis]5024387
 9321205 330similar to Escherichia coli pyruvate, water dikinase, Swiss-Prot5024876
Accession Number P23538 [Pyrococcus furiosus]
 96516732959gnl|PID|e322433gamma-glutamylcysteine synthetase [Brassica juncea]50291287
 982 2181171gi|151110leucine-, isoleucine-, and valine-binding protein [Pseudomonas5030954
aeruginosa]
10343303 2785gi|154330O-antigen ligase [Salmonella typhimurium]5031519
115564805980putative cel operon regulator [Bacillus subtilis]5026501
12911 75597305skeletal muscle ryanodine receptor [Homo sapiens]5032255
12913 81927965319-kDA protein [Rhizobium meliloti]5030228
151576346819gi|40348put. resolvase Tnp I (AA 1-284) [Bacillus thuringiensis]5035816
1531  1 597gnl|PID|d102015(AB001488) SIMILAR TO NITROREDUCTASE. [Bacillus subtilis]5029597
155559865432gi|1276880EpsG [Streptococcus thermophilus]5028555
160973906323(AE000179) o331; 92 pct identical to the 333 aa hypothetical protein50301068
YBHE_ECOLI SW: P52697; 26 pct identical (7 gaps) to 167
residues of the 373 aa protein MLE_TRICU SW: P46057;
SW: P52697 [Escherichia coli]
163673968091gnl|PID|d101313YqeN [Bacillus subtilis]5022696
167652323940gi|413926ipa-2r gene product [Bacillus subtilis]50271293
1692 807 130gnl|PID|e304540endolysin [Bacteriophage Bastille]5035678
171531584025gi|606080ORF_o290; Geneplot suggests frameshift linking to o267, not found5027858
[Escherichia coli]
21013 81518414gi|330038HRV 2 polyprotein [Human rhinovirus]5025264
36411538 135Tb-292 membrane associated protein [Trypanosoma brucei subgroup]50311404 
 10759115090gi|144859ORF B [Clostridium perfringens]4924822
 26510754 9768gi|142440ATP-dependent nuclease [Bacillus subtilis]4931987
 66797778398gi|414170trkA gene product [Methanosarcina mazeii]49261380 
 77653644648RecX protein [Mycobacterium smegmatis]4928717
 8213 12689 13249 gnl|PID|e255091hypothetical protein [Bacillus subtilis]4920561
 93948664531gi|40067X gene product [Bacillus sphaericus]4926336
112540194948gi|1574380lic-1 operon protein (licB) [Haemophilus influenzae]4927930
129760584949gnl|PID|e267587Unknown [Bacillus subtilis]49351110 
135538754438P20 (AA 1-178) [Bacillus licheniformis]4925564
154214231953 gnl|PID|d101102regulatory components of sensory transduction system4929531
[Synechocystis sp.]
156528781637gnl|PID|d101732hypothetical protein [Synechocystis sp.]49251242
173535002940gi|490324LORF X gene product [unidentified]4930561
18211057  2gi|331002first methionine codon in the ECLF1 ORF [Saimirine herpesvirus 2]49251056 
192653523667gi|2394472(AF024499) contains similarity to homeobox domains49231686
[Caenorhabditis elegans]
253411 291350gi|531116SIR4 protein [Saccharomyces cerevisiae]4923222
2771 600 136gi|396844ORF (18 kDa) [Vibrio cholerae]4932465
32731435 887gi|733524phosphatidylinositol-4,5-diphosphate 3-kinase [Dictyostelium4924
discoideum]
36531436 132gi|393394Tb-291 membrane associated protein [Trypanosoma brucei subgroup]49311305
 33744613277gi|145644codes for a protein of unknown function [Escherichia coli]48261185 
 402 6521776ornithine decarboxylase [Nicotiana tabacum]48291125 
 67413772384 gi|17726522-keto-3-deoxygluconate kinase [Haloferax alicantei]48301008 
 74242693871gi|12182678(AE000101) Y4vJ [Rhizobium sp. NGR234]4827399
 8121326 541gi|153672 lactose repressor [Streptococcus mutans]4833786
 81429813646gi|146042fuculose-1-phosphate aldolase (fucA) [Escherichia coli]4830666
 971 602 51gi|153794rgg [Streptococcus gordonii]4829552
1101  13132gi|1381114prtB gene product [Lactobacillus delbrueckii]48233132 
131529142147gnl|PID|e183811Acyl-ACP thioesterase [Brassica napus]4827768
133434942628gnl|PID|e261988putative ORF [Bacillus subtilis]4827867
139642314599gi|10983882K470.1 gene product [Caenorhabditis elegans]4823369
139850365665gi|1022725unknown [Staphylococcus haemolyticus]4829630
14012 11936 11007 gnl|PID|d102049H. influenzae, ribosomal protein alanine acetyltransferase; P443054827930
(189) [Bacillus subtilis]
14695 6704654gi|1591731mevalonate kinase [Methanococcus jannaschii]48241017
161312802374gnl|PID|d101578Collagenase precursor (EC 3.4.—.—), [Escherichia coli]48241095 
17211 10581 11048 gnl|PID|d101132hypothetical protein [Synechocystis sp.]4827468
182429302586gi|40067X gene product [Bacillus sphaericus]4837345
21015 10786 11196 sp|P13940|LE29_LATE EMBRYOGENESIS ABUNDANT PROTEIN D029 (LEA4830411
D-29)
21412 62316482gi|40389non-toxic components [Clostridium botulinum]4826252
2211 704  3gi|11573364H. influenzae predicted coding region HI0392 [Haemophilus4827702
influenzae]
2272 6473928gi|1673693(AE000005) Mycoplasma pneumoniae, C09_orf718 Protein48303282 
[Mycoplasma pneumoniae]
2532 480 758gnl|PID|e236697unknown [Saccharomyces cerevisiae]4831279
363318741122gi|18137cgcr-4 product [Chlamydomonas reinhardtii]4840753
3891 505  2gi|18137cgcr-4 product [Chlamydomonas reinhardtii]4838504
321 20879 22258 gnl|PID|e264778putative maltose-binding protein [Streptomyces coelicolor]47331380
6440894658gi|395734723570
 15337361760gnl|PID|d100572unknown [Bacillus subtilis]47251977 
 3515 14516  13263 gi|17773351Cap5L [Staphylococcus aureus]47201254 
 5163547400232K antigen precursor - Mycobacterium tuberculosis4738456
 55810154 9273gi|39848U3 [Bacillus subtilis]4726882
 92417533276gnl|PID|e280611PCPC [Streptococcus pneumoniae]47351524
127955895386gi|1786458(AE000134) f120; This 120 aa orf is 76 pct identical (0 gaps) to 424732204
residues of an approx. 48 aa protein Y127_HAEIN SW: P43949
[Escherichia coli]
130212321759gnl|PID|e266555unknown [Mycobacterium tuberculosis]4723528
140449513542homologue of hypothetical protein in a rapamycin synthesis gene47241410 
cluster of Streptomyces hygroscopicus [Bacillus subtilis]
15146 8146200gi|1522674M. jannaschii predicted coding region MJ3CL41 [Methanococcus4727
jannaschii]
1573 8031174gnl|PID|d101320YqgZ [Bacillus subtilis]4725372
178532672155gi|2367190(AE000390) o334; sequence change joins ORFs ygjR & ygjS from47301113
earlier version (YGJR_ECOLI SW: P42599 and YGJS_ECOLI SW:
P42600) [Escherichia coli]
2731  21549gnl|PID|e254973autolysin sensor kinase [Bacillus subtilis]47321548 
3002 880 644gi|1835755zinc finger protein Png-1 [Mus musculus]4722237
 5414 14182 12638  pir|S43609|S436rofA protein - Streptococcus pyogenes46241545 
 881  21018gnl|PID|e223891xylose repressor [Anaerocellum thermophilum]46271017
 96745535860 gnl|PID|d101652ORF_ID:0347#5; similar to (SwissProt Accession Number P45272)46231308 
[Escherichia coli]
11211127  3gi|2209215(AF004325) putative oligosaccharide repeat unit transporter46241125
[Streptococcus pneumoniae]
12213 73087982gi|1054776hr44 gene product [Homo sapiens]4634675
12714 91988125afuA gene product [Actinobacillus pleuropneumoniae]46281074
132470936197gi|153794rgg [Streptococcus gordonii]4626897
140882207723gi|1235795pullulanase [Thermoanaerobacterium thermosulfurigenes]4621498
140992058315 gi|407878leucine rich protein [Streptococcus equisimilis]4627891
1621  11125gi|1143209ORF7; Method: conceptual translation supplied by author [Shigella46251125
sonnei]
1991  1 585gi|1947171(AF000299) No definition line found [Caenorhabditis elegans]4628585
223319711477sp|P02562|MYSS_MYOSIN HEAVY CHAIN, SKELETAL MUSCLE (FRAGMENTS)4627495
2 7601608gi|1016112ycf38 gene product [Cyanophora paradoxa]4628849
2921 687 220(AE000011) Mycoplasma pneumoniae, cytidine deaminase; similar to4629468
GenBank Accession Number C53312, from M. pirum [Mycoplasma
pneumoniae]
 30858436472gi|1788049(AE000270) o235; This 235 aa orf is 29 pct identical (10 gaps) to 1984524630
residues of an approx. 216 aa protein YTXB_BACSU SW: P06568
[Escherichia coli]
 486346 13868gi|722339unknown [Acetobacter xylinum]4529408
 601 307  2gi|1699079coded for by C. elegans cDNA yk41h4.3; coded for by C. elegans4536306
cDNA yk148g10.5; coded for by C. elegans cDNA yk152g5.5; coded
for by C. elegans cDNA yk59a10.5; coded for by C. elegans cDNA
yk41h4.5; coded for by C. elegans cDNA cm20g10; coded
 721614371 14874 gi|1321900NADH dehydrogenase (ubiquinone) [Artemia franciscana]4525504
 99791587941 gi|152192mutation causes a succinoglucan-minus phenotype; ExoQ is45281218
a transmembrane protein; third gene of the exoYFQ operon;; putative
[Rhizobium meliloti]
12712 70466606bbs|153689HitB = iron utilization protein [Haemophilus influenzae, type b,4524441
DL42, NTHI TN106, Peptide, 506 aa] [Haemophilus influenzae]
137515612619gi|472921v-type Na-ATPase [Enterococcus hirae]45331059 
2091 774 364 gi|304141restriction endonuclease beta [Bacillus coagulans]4528411
3141 604  2gi|1480457latex allergen [Hevea brasiliensis]4531603
 2018 19782  20288 gi|433942ORF [Lactococcus lactis]4426507
 87870306452gi|537207ORF_f277 [Escherichia coli]4426579
166549094037gnl|PID|e308082membrane transport protein [Bacillus subtilis]4425873
2471 818 75gnl|PID|d100718ORF1 [Bacillus sp.]4420744
 32318853876gi|2351768PspA [Streptococcus pneumoniae]43241992
 3617 15467 gi|1045739M. genitalium predicted coding region MG064 [Mycoplasma43262790
genitalium]
 5415 14656 17343 gi|520541penicillin-binding proteins 1A and 1B [Bacillus subtilis]43272688
 672 6961352gi|536934yjcA gene product [Escherichia coli]4329657
13922416 338gi|396400similar to eukaryotic Na+/H+ exchangers [Escherichia coli]43242079
2981  3 809gi|413972ipa-48r gene product [Bacillus subtilis]4324807
3871 47 427gi|2315652(AF016669) No definition line found [Caenorhabditis elegans]4330381
185442213127gi|2182399(AE000073) Y4fP [Rhizobium sp. NGR234]41251095
1 582 70gnl|PID|e218681CDP-diacylglycerol synthetase [Arabidopsis thaliana]4120513
363642051914gi|1256742R27-2 protein [Trypanosoma cruzi]41272292
3682  2 943gi|21783LMW glutenin (AA 1-356) [Triticum aestivum]4134942
155344892861gi|42023member of ATP-dependent transport family, very similar to mdr40181629
proteins and hemolysin B, export protein [Escherichia coli]
3652 951438gi|1633572Herpesvirus saimiri ORF73 homolog [Kaposi's sarcoma-associated40211344
herpes-like virus]
1329793860gnl|PID|d101908hypothetical protein [Synechocystis sp.]3926882
1538144647gnl|PID|d101961hypothetical protein [Synechocystis sp.]3919834
 26614035 10724 gi|142439ATP-dependent nuclease [Bacillus subtilis]38203312
 471  34916gi|632549NF-180 [Petromyzon marinus]36234914
TABLE 3
S. pneumoniae - Putative coding regions of novel proteins not similar
to known proteins
Contig ID  ORF ID  Start (nt)  Stop (nt)
1434283009
2646114964
32 818 994
3311821574
3753826497
325 25046 25396 
326 25625 26317 
6215191689
614 12875 12618 
615 13215 12841 
618 15977 15390 
712 99559419
713 10161 9910
8639154280
9960245704
 10869096298
 10971366888
 1011 79687672
 1211140  4
 1231779 1456
 14219131434
 161  1 243
 16556753087
 17 1 324 34
 1731451
 17948904465
 2014 14544 15893 
 2133592589
 21548024482
  2221 17099 17362 
 2225 19467 19982
 2233 25540 25764
 2235 26388 26218
 2236 26382 27572 
 23766556032
 23871326653
 241 36 518
 255 30092641
 27448194223
 2747894956
 28530171797
 28842723850
 2810 50284597
 2811 57465072
 29755964919
 29850395518
 29955958207
 30965116263
 3162664 2344
 325 52035538
 338 53274668
 3410  80247740
 3412 93608641
 3413 96679377
 3418 13104 11902
 3511 96888588
 3512  11073 9670
 362 3341041
 3612 11120 10893
 3613 10993 11388
 3615 12172 14595 
 3874577
 38844805001
 3810 55175711
 3817 10732 11376
 40317283143
 431 172  5
 43788848732
 43895689071
 4444831 6831
 453 32043665
 464 38753468
 46760747081
 4831963582
 48845794229
 4811 93238922
 4816 13042 12494 
 4820 16342 15764 
 4824  17971 18351 
 4830 21979 21776 
 491 209  3
 50433072672
 51532393598
 5211  12146 12883 
 5455885187
 54860135459
 54960046210
 5416 17685 17506 
 55910515 10123
 5512 11947 12141
 563 9351387
 56414961939
 57316242130
 57421002501
 58675417335
 591  2 430
 5942736
 59527343063
 59847435549
 59954595929
 60657416451
 61323951772
 61533163176
 6412722  2
 6621180 3147
 66890829495
 6731 3431182
 692 980
 705 40593922
 7042154057
 70952685504
  7115 20351 21901 
 7116 21859 22338
 7119 26204 27556
 729 84588081
 7338154216
 73642144582
  73743694773
 7310 71836428
 7315 94629668
 761 524 195
 762 867  535
 7611 86029210
 80679248109
 811  244  2
 8110 66318931
 8341872 1150
 8317  16810 16460 
 84344642929
 86221471092
 86436062875
 8619 16767 17114 
 87553265000
 87764596001
 87972247006
 8718  17930 17670 
 8718275 17928 
 88216191840
 88427112878
 88962526016
 89326341621
 89973716868
 902 899
 9031143 952
 913 29593141
 914 31703691
 91642534573
 93 391  2
 93626482379
 93845333712
 961  3 182
 962 632
 96314071147
 96412501420
 97970436753
 9915 18522 18692
 9917 19717  19541 
100240941980
1031 48 299
10364373
1045 61426735
105760986517
1061  1 363
10610 983210212
1081  2 268
1113 34173788
111438094606
11510  10854 10438 
116328732121
118222741357
122426982333
12210 58586199
12212 63017416
1242 346 690
128425443 368
1291 689
12921011 724
129864546056
1299 65406277
12912 78097621
13131433 756
13159725673
13411 11838 11209
1352 6251140
136429133830
1372 325 134
13912 14027 14521 
13913 14532 
13915363 14875 
14020 19822 20838 
1421  1 285
1463 760 479
14641149 778
146736042885
14682239401
14614 939910676
14615 10052 9750
147774887276
147989138647
148752984765
1491  21936
14932880
1499 62586070
15021355 579
150325561909
15320612642
154319531741
155221811411
156845504311
1571 37 294
1592 780
159413841722
159732714017
16113321018
165355354945
166654064972
167960756395
169528283205
170764856243
170869646362
170973036962
17011 87907906
171971507476
172522981948
173429132677
1752 659 835
1753 8931789
17621487
176322001466
17794686 4925
17710 5177
17711 51115347
17713 73968703
178634523724
181518532473
182221121102
182326172006
183221262320
185546834219
185648464634
187429403557
188436864363
188541834821
188658826493
189531432844
189959565564
1911 618  4
19111 10357 10001
19232268
192430812878
192768005331
1933  997 839
194423152127
195562494543
195666206231
196215531849
1971  1 861
1989 68446644
200553295769
200659936595
204 539143276
205 4471709
209420382460
209524582682
21010 73708230
21013 902910441
21014 10439 10705
21452581
214950655277
21411 59965754
2172  541 194
2182 9141432
218314301972
21836393821
2191 458 39
2201 869
223426171964
2271  1 510
234415391312
23461838
2351 52 312
2352 687
2381 660 64
2461  1 270
2481  3 362
2482 4431222
25432789 792
2582 11791616
260317702123
263 653 177
263422441900
263535692973
2661  1 342
2662  1771022
270211241681
2721 857 186
275216842295
2781  2 406
2821  714 391
282414631134
2871119 826
2881 540  4
2891 684  4
29151589
29322539 2925
2941 21 608
2962 700
2963 670 843
3021 261 530
3093 559 350
3102 2491889
316220871818
31721048 584
3182 313 777
3193 477  133
3272 912
3311  1 549
3331  2 535
3332 465 82
3333 127
34111
3452 895  701
3462  750 199
3491  1 198
3502 81 413
3551 973
3582 448
3602 948 628
364216391265
3781 3451004
3792 683 510
3811 109 693
3851 150  4
3852 269  30
SEQUENCE LISTING
The patent contains a lengthy “Sequence Listing” section. A copy of the “Sequence Listing” is available in electronic form from the USPTO
web site (http://seqdata.uspto.gov/sequence.html?DocID=06420135B1). An electronic copy of the “Sequence Listing” will also be available from the
USPTO upon request and payment of the fee set forth in 37 CFR 1.19(b)(3).

Field of search:


Foreign documents:

EP0622081 / 1969-12-31
EP0687688 / 1969-12-31
WO/1993/010238 / 1969-12-31
WO/1995/006732 / 1969-12-31
WO/1995/014712 / 1969-12-31
WO/1995/031548 / 1969-12-31
WO/1996/005859 / 1969-12-31
WO/1996/008582 / 1969-12-31
WO/1996/016082 / 1969-12-31
WO/1996/033276 / 1969-12-31
WO/1997/043303 / 1969-12-31
WO/1998/018930 / 1969-12-31
WO/1998/026072 / 1969-12-31


