Looking for Hypothetical Proteins in Clusters of Related Features
In this exercise we will look for hypothetical proteins found inside clusters of features that routinely occur together on the chromosome. We will postulate that these proteins perform functions related to those in the cluster and that this provides a clue to their actual purpose.
To begin, we need to select a group of genomes and find the functionally-coupled roles. Protein-encoding genes (pegs) that are functionally related (e.g., enzymes from a single pathway or subunits of a transport cassette) tend to occur close to one another on the chromosome. Simply scanning the genes in chromosome order reveals these clusters of functionally-related pegs, and the clusters often suggest the functions of the hypothetical proteins that sit inside them.
To illustrate the use of p3-scripts, we will compose pipelines that allow you to locate clusters of functionally-related genes.
The Input: a List of Pegs from a Set of Genomes
As input, we will start with a large set of pegs from which we will extract possible clusters. There are many ways one can amass a set of pegs, but let’s just select a set of genomes and take the set of features that occur in the genomes. Here we ask for all genomes of the species Yersinia pestis, and then select for all the pegs. For each peg we want its ID, the ID of the sequence on which it is located, its location on that sequence, and its functional assignment.
p3-all-genomes --eq "genome_name,Yersinia pestis" | p3-get-genome-features --eq feature_type,CDS --attr patric_id,sequence_id,location,product >features.tbl
This pipeline takes a while to run, but it gets us a nice large set of features (1.5 million) from a collection of 324 distinct genomes. The first script gets us the genome IDs; the second expands those genomes into features. The output starts something like this.
genome.genome_id feature.patric_id feature.sequence_id feature.location feature.product
385964.3 fig|385964.3.peg.1001 JMUF01000016 complement(28922..29248) FIG002060: uncharacterized protein YggL
385964.3 fig|385964.3.peg.1003 JMUF01000016 29951..30067 hypothetical protein
385964.3 fig|385964.3.peg.1005 JMUF01000016 31598..31870 FIG001341: Probable Fe(2+)-trafficking protein YggX
385964.3 fig|385964.3.peg.101 JMUF01000001 complement(90500..90613) hypothetical protein
385964.3 fig|385964.3.peg.1023 JMUF01000017 complement(7081..8124) Sucrose specific transcriptional regulator CscR, LacI family
385964.3 fig|385964.3.peg.102 JMUF01000001 90601..92034 Putative virulence factor
385964.3 fig|385964.3.peg.1027 JMUF01000017 12280..12681 Redox-sensing transcriptional regulator QorR
Note that we were able to specify the output columns for p3-get-genome-features using a comma-delimited list of field names. We could also have used multiple invocations of the --attr option.
p3-all-genomes --eq "genome_name,Yersinia pestis" | p3-get-genome-features --eq feature_type,CDS --attr patric_id --attr sequence_id --attr location --attr product >features.tbl
The position on the contig (location column) is represented by a leftmost position and rightmost position separated by two dots. If the feature is on the negative strand, the entire thing is enclosed in parentheses and preceded by the modifier complement. So, for example, 29951..30067 is on the positive strand from position 29951 (1-based) through position 30067. complement(28922..29248) would be on the negative strand starting at position 29248 and ending at position 28922.
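The location notation described above can be decoded with a few lines of code. The following is a minimal Python sketch, not part of the p3-scripts; the names `Location`, `parse_location`, and `start_and_end` are hypothetical, and it handles only the two forms shown in this exercise (a plain range and a `complement(...)` range).

```python
import re
from typing import NamedTuple, Tuple


class Location(NamedTuple):
    """A feature location: 1-based leftmost/rightmost positions plus strand."""
    left: int
    right: int
    strand: str  # "+" for the positive strand, "-" for the negative strand


def parse_location(text: str) -> Location:
    """Parse 'L..R' or 'complement(L..R)' into a Location.

    Assumption: only the two forms shown in the tutorial are supported.
    """
    text = text.strip()
    strand = "+"
    if text.startswith("complement(") and text.endswith(")"):
        strand = "-"
        text = text[len("complement("):-1]
    match = re.fullmatch(r"(\d+)\.\.(\d+)", text)
    if match is None:
        raise ValueError(f"unsupported location string: {text!r}")
    return Location(int(match.group(1)), int(match.group(2)), strand)


def start_and_end(loc: Location) -> Tuple[int, int]:
    """Return (start, end) in the direction of the strand."""
    if loc.strand == "+":
        return (loc.left, loc.right)
    # On the negative strand the feature starts at the rightmost position.
    return (loc.right, loc.left)
```

For instance, `start_and_end(parse_location("complement(28922..29248)"))` yields the start and end in strand order, matching the description above.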
We will use this output file features.tbl again later, but first we need to convert the functional assignments to roles. Most of the time, a functional assignment is a single role; however, sometimes multiple roles are performed by one peg. For example, a product value of
Glutamate 5-kinase (EC 2.7.2.11) / RNA-binding C-terminal domain PUA
indicates that the protein contains at least two domains, one of which performs the role Glutamate 5-kinase and the other of which performs the role RNA-binding C-terminal domain PUA. We use the script p3-function-to-role to replace functions with roles. For example, if features.tbl contained the lines
385964.3 fig|385964.3.peg.1466 JMUF01000027 31258..31698 Hypothetical protein YaeJ with similarity to translation release factor
385964.3 fig|385964.3.peg.1467 JMUF01000027 31783..32466 Copper homeostasis protein CutF precursor / Lipoprotein NlpE involeved in surface adhesion
385964.3 fig|385964.3.peg.1470 JMUF01000027 complement(35126..35533) Protein RcsF
then the command
p3-function-to-role <features.tbl >feature.roles.tbl
would convert these lines to
385964.3 fig|385964.3.peg.1467 JMUF01000027 31783..32466 Copper homeostasis protein CutF precursor
385964.3 fig|385964.3.peg.1467 JMUF01000027 31783..32466 Lipoprotein NlpE involeved in surface adhesion
385964.3 fig|385964.3.peg.1470 JMUF01000027 complement(35126..35533) Protein RcsF
Here the hypothetical protein has been eliminated, and the multiple-role peg has been duplicated once for each role.
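To make the transformation concrete, here is a rough Python sketch of the behavior just described. It is not the implementation of p3-function-to-role. It assumes that roles are separated by " / " (the only delimiter shown in this exercise) and that a role is dropped when it begins with "hypothetical protein" in any letter case; the names `function_to_roles` and `expand_rows` are hypothetical.

```python
from typing import Iterable, Iterator, List

# Assumption: " / " is the only multi-role delimiter handled here.
ROLE_DELIMITER = " / "


def function_to_roles(product: str) -> List[str]:
    """Split a functional assignment into roles, dropping hypotheticals."""
    roles = [part.strip() for part in product.split(ROLE_DELIMITER)]
    return [
        role for role in roles
        # Assumption: a role is "hypothetical" if it starts with this phrase.
        if role and not role.lower().startswith("hypothetical protein")
    ]


def expand_rows(rows: Iterable[List[str]]) -> Iterator[List[str]]:
    """Emit one output row per role; the role replaces the last column."""
    for row in rows:
        *prefix, product = row
        for role in function_to_roles(product):
            yield [*prefix, role]
```

Applied to the three sample lines above, `expand_rows` drops the YaeJ row and emits the peg.1467 row twice, once per role.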
In our case, we need to remove a lot more than the hypothetical proteins: we want to restrict ourselves to roles in which we have a high degree of confidence. We do this by specifying a role file that identifies the roles of interest and contains the information needed to disambiguate substantially identical roles. The following script generates this file.
p3-subsys-roles >subsys_roles.tbl
Now we use this file as input to p3-function-to-role.
p3-function-to-role --roles=subsys_roles.tbl <features.tbl >feature.roles.tbl
The output starts like this:
genome.genome_id feature.patric_id feature.sequence_id feature.location feature.role
385964.3 fig|385964.3.peg.1005 JMUF01000016 31598..31870 FIG001341: Probable Fe(2+)-trafficking protein YggX
385964.3 fig|385964.3.peg.1046 JMUF01000017 32247..32828 Flagellar transcriptional activator FlhC
385964.3 fig|385964.3.peg.1047 JMUF01000017 33033..33920 Flagellar motor rotation protein MotA
385964.3 fig|385964.3.peg.1073 JMUF01000018 15496..15663 LSU ribosomal protein L33p
385964.3 fig|385964.3.peg.1073 JMUF01000018 15496..15663 LSU ribosomal protein L33p, zinc-independent
Note that the role column replaces the product column, and remains the last column in the file.
Creating the Clusters
So now we have a five-column file feature.roles.tbl that contains location and role information and we have a similar file containing all the features and showing the functional assignment rather than the role. We are going to work with the role-oriented file to compute the clusters. We are looking for groups of roles that commonly occur together.
Creating clusters is a two-step process: first we compute the functionally-coupled roles (that is, roles that commonly occur close to each other on the chromosome), then we use transitive closure to group them into clusters. The scripts used are p3-find-couples and p3-generate-clusters. This gives us the following pipe.
p3-find-couples --location=location --sequence=sequence_id --col=patric_id role <feature.roles.tbl | p3-generate-clusters 1 2 >clusters.tbl
p3-find-couples uses the feature ID as its key column. We inform the script of this with the parameter --col=patric_id. We tell it that the coupling is to be by the roles listed in the role column with the positional parameter role. Note that even though the column name is technically feature.role, we can just specify the part after the dot (role) and the column specification will still work, since there is only the one role column in the input file.
p3-find-couples will look up location information if necessary, but we already have it in the input file, so we tell it to use the sequence_id column for the sequence IDs and the location column for the feature locations. The output file contains the coupled roles in the first two columns, so we tell p3-generate-clusters to look in those columns with the positional parameters (1 and 2).
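The transitive-closure step can be illustrated with a small union-find sketch in Python. This is only an illustration of the general technique, not the code inside p3-generate-clusters; the function name `cluster_pairs` is hypothetical, and the input is assumed to be a list of coupled role pairs such as the first two columns produced by p3-find-couples.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def cluster_pairs(pairs: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Group items into clusters by transitive closure over coupled pairs.

    If A is coupled to B and B is coupled to C, then A, B, and C end up
    in the same cluster even if A and C were never directly coupled.
    """
    parent: Dict[str, str] = {}

    def find(item: str) -> str:
        """Return the representative of item's cluster."""
        parent.setdefault(item, item)
        root = item
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the path at the root.
        while parent[item] != root:
            parent[item], item = root, parent[item]
        return root

    for first, second in pairs:
        root_first, root_second = find(first), find(second)
        if root_first != root_second:
            parent[root_second] = root_first  # merge the two clusters

    groups: Dict[str, List[str]] = defaultdict(list)
    for item in list(parent):
        groups[find(item)].append(item)
    # Largest clusters first.
    return sorted(groups.values(), key=len, reverse=True)
```

Because chains of pairwise couplings merge into a single group, transitive closure can produce very large clusters, which may be why the first cluster in the output is so big.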
The output starts like this.
cluster_id size cluster
1 1160 Molybdenum cofactor guanylyltransferase (EC 2.7.7.77)::Molybdopterin-guanine dinucleotide biosynthesis protein MobB::DNA polymerase I (EC 2.7.7.7)::Periplasmic thiol:disulfide interchange protein DsbA::GTP-binding protein TypA/BipA::Glutamine synthetase type I (EC 6.3.1.2)::Inner membrane protein YihY, formerly thought to be RNase BN::D-aminoacyl-tRNA deacylase (EC 3.1.1.96)::Coproporphyrinogen III oxidase, oxygen-independent (EC 1.3.99.22)::Nitrogen regulation protein NR(I), GlnG (=NtrC)::Nitrogen regulation protein NtrB (EC 2.7.13.3)::GTP-binding protein EngB::Aspartate aminotransferase (EC 2.6.1.1)::Outer membrane porin OmpF::Asparaginyl-tRNA synthetase (EC 6.1.1.22)::Nicotinate phosphoribosyltransferase (EC 6.3.4.21)::Membrane alanine aminopeptidase N (EC 3.4.11.2)::Dihydroorotate dehydrogenase (quinone) (EC 1.3.5.2)::Hypothetical metal-binding enzyme, YcbL homolog::Lactose permease::alpha-galactosidase (EC 3.2.1.22)::Peptide transport periplasmic protein sapA (TC 3.A.1.5.5)::Peptide transport system permease protein sapB (TC 3.A.1.5.5)::Peptide transport system ATP-binding protein SapD (TC 3.A.1.5.5)::Peptide transport system ATP-binding protein sapF (TC 3.A.1.5.5)::Peptide transport system permease protein sapC (TC 3.A.1.5.5)::Phage shock protein C::Phage shock protein D::Phage shock protein A::Suppressor of sigma54-dependent transcription, PspA-like::Psp operon transcriptional activator::Phage shock protein B::Anhydro-N-acetylmuramic acid kinase (EC 2.7.1.170)::Inhibitor of vertebrate c-type lysozyme, outer membrane => MliC::Pyridoxamine 5'-phosphate oxidase (EC 1.4.3.5)::Pyridoxal kinase (EC 2.7.1.35)::Tyrosyl-tRNA synthetase (EC 6.1.1.1)::Glutathione S-transferase (EC 2.5.1.18)::ABC transporter, permease protein (cluster 10, nitrate/sulfonate/bicarbonate)::ABC transporter, substrate-binding protein (cluster 10, nitrate/sulfonate/bicarbonate)::AMP nucleosidase (EC 3.2.2.4)::5-keto-D-gluconate 5-reductase (EC 1.1.1.69)::Gluconokinase (EC 
2.7.1.12)::Gluconate utilization system Gnt-I transcriptional repressor::High-affinity gluconate transporter GntT::Quercetin 2,3-dioxygenase => YhhW (EC 1.13.11.24)::Glutathione S-transferase, omega (EC 2.5.1.18)::Aspartate-semialdehyde dehydrogenase (EC 1.2.1.11)::MarC family integral membrane protein::16S rRNA (cytidine(1402)-2'-O)-methyltransferase (EC 2.1.1.198)::Cell division protein FtsA::Cell division protein FtsQ::Cell division protein FtsZ (EC 3.4.24.-)::D-alanine--D-alanine ligase (EC 6.3.2.4)::UDP-3-O-[3-hydroxymyristoyl] N-acetylglucosamine deacetylase (EC 3.5.1.108)::Cell division protein FtsW::UDP-N-acetylglucosamine--N-acetylmuramyl-(pentapeptide) pyrophosphoryl-undecaprenol N-acetylglucosamine transferase (EC 2.4.1.227)::UDP-N-acetylmuramoylalanine--D-glutamate ligase (EC 6.3.2.9)::UDP-N-acetylmuramate--alanine ligase (EC 6.3.2.8)::Phospho-N-acetylmuramoyl-pentapeptide-transferase (EC 2.7.8.13)::UDP-N-acetylmuramoyl-tripeptide--D-alanyl-D-alanine ligase (EC 6.3.2.10)::UDP-N-acetylmuramoylalanyl-D-glutamate--2,6-diaminopimelate ligase (EC 6.3.2.13)::Cell division protein FtsI [Peptidoglycan synthetase] (EC 2.4.1.129)::16S rRNA (cytosine(1402)-N(4))-methyltransferase EC 2.1.1.199)::Cell division protein FtsL::Cell division protein MraZ::Protein translocase subunit SecA::Secretion monitor SecM::Zn-ribbon-containing, possibly RNA-binding protein and truncated derivatives::Mutator MutT protein (7,8-dihydro-8-oxoguanine-triphosphatase) (EC 3.6.1.-)::Sulfate and thiosulfate binding protein CysP::Sulfate transport system permease protein CysW::Sulfate transport system permease protein CysT::Sulfate and thiosulfate import ATP-binding protein CysA (EC 3.6.3.25)::Sialic acid transporter (permease) NanT::Cysteine synthase B (EC 2.5.1.47)::N-acetylmannosamine kinase (EC 2.7.1.60)::Sialic acid utilization regulator, RpiR family::N-acetylneuraminate lyase (EC 4.1.3.3)::Predicted outer membrane lipoprotein YfeY::N-acetylmannosamine-6-phosphate 2-epimerase (EC 
5.1.3.9)::IncF plasmid conjugative transfer surface exclusion protein TraT::Type III secretion negative modulator of injection (YopK,YopQ,controls size of translocator pore)::Type III secretion injected virulence protein (EC 3.4.22.-,YopT,cysteine protease,depolymerizes actin filaments of cytoskeleton,causes cytotoxicity)::Type III secretion chaperone protein for YopT (SycT)::GTP pyrophosphokinase, (p)ppGpp synthetase II (EC 2.7.6.5)::Guanosine-3',5'-bis(diphosphate) 3'-pyrophosphohydrolase (EC 3.1.7.2)::tRNA (guanosine(18)-2'-O)-methyltransferase (EC 2.1.1.34)::Guanylate kinase (EC 2.7.4.8)::ATP-dependent DNA helicase RecG (EC 3.6.4.12)::DNA ligase (NAD(+)) (EC 6.5.1.2)::DNA-directed RNA polymerase omega subunit (EC 2.7.7.6)::Cysteine synthase (EC 2.5.1.47)::Sulfate transporter, CysZ-type::Phosphotransferase system, phosphocarrier protein HPr::Sodium/glutamate symporter::Xanthine permease::2-deoxyglucose-6-phosphate hydrolase YniC::Ribulosamine/erythrulosamine 3-kinase potentially involved in protein deglycation::Cysteinyl-tRNA synthetase (EC 6.1.1.16)::Peptidyl-prolyl cis-trans isomerase PpiB (EC 5.2.1.8)::N5-carboxyaminoimidazole ribonucleotide mutase (EC 5.4.99.18)::N5-carboxyaminoimidazole ribonucleotide synthase (EC 6.3.4.18)::UDP-2,3-diacylglucosamine diphosphatase (EC 3.6.1.54)::Methenyltetrahydrofolate cyclohydrolase (EC 3.5.4.9)::Methylenetetrahydrofolate dehydrogenase (NADP+) (EC 1.5.1.5)::Uncharacterized protein YbcJ::Anthranilate phosphoribosyltransferase (EC 2.4.2.18)::Phosphoribosylanthranilate isomerase (EC 5.3.1.24)::Indole-3-glycerol phosphate synthase (EC 4.1.1.48)::Anthranilate synthase, amidotransferase component (EC 4.1.3.27)::Anthranilate synthase, aminase component (EC 4.1.3.27)::Tryptophan synthase beta chain (EC 4.2.1.20)::Tryptophan synthase alpha chain (EC 4.2.1.20)::FIG00031715: Predicted metal-dependent phosphoesterases (PHP family)::Hypothetical YciO protein, TsaC/YrdC paralog::Cob(I)alamin adenosyltransferase (EC 2.5.1.17)::Ribosomal 
large subunit pseudouridine synthase B (EC 5.4.99.22)::Aliphatic amidase AmiE (EC 3.5.1.4)::Glutamine-dependent 2-keto-4-methylthiobutyrate transaminase::Methylthioribulose-1-phosphate dehydratase (EC 4.2.1.109)::2,3-diketo-5-methylthiopentyl-1-phosphate enolase-phosphatase (EC 3.1.3.77)::1,2-dihydroxy-3-keto-5-methylthiopentene dioxygenase (EC 1.13.11.54)::5-methylthioribose kinase (EC 2.7.1.100)::Methylthioribose-1-phosphate isomerase (EC 5.3.1.23)::Antitoxin HigA::Toxin HigB::Outer membrane protease Pla::Protease VII (Omptin) precursor (EC 3.4.23.49)::DEAD-box ATP-dependent RNA helicase DeaD (= CshA) (EC 3.6.4.13)::Polyribonucleotide nucleotidyltransferase (EC 2.7.7.8)::RND efflux system, inner membrane transporter::RND efflux system, membrane fusion protein::Efflux transport system, outer membrane factor (OMF) lipoprotein::Uncharacterized peptidase U32 family member YhbV::Uncharacterized protease YhbU::UPF0213 protein YhbQ::Bacteriocin/lantibiotic efflux ABC transporter, permease/ATP-binding protein::L-arabinose isomerase (EC 5.3.1.4)::Ribulokinase (EC 2.7.1.16)::DNA-damage-inducible protein I::Dihydroorotase (EC 3.5.2.3)::Cytoplasmic protein YaiE::DNA recombination-dependent growth factor C::SAM-dependent methyltransferase (EC 2.1.1.-)::Shikimate kinase III (EC 2.7.1.71)::Glutamate 5-kinase (EC 2.7.2.11)::RNA-binding C-terminal domain PUA::Gamma-glutamyl phosphate reductase (EC 1.2.1.41)::Fermentation/respiration switch protein::Curlin genes transcriptional activator::Flagellar brake protein YcgR::Xanthine-guanine phosphoribosyltransferase (EC 2.4.2.22)::SAM-dependent methyltransferase YafE (UbiE paralog)::Response regulator of zinc sigma-54-dependent two-component system::Sensor protein of zinc sigma-54-dependent two-component system::Aminomethyltransferase (glycine cleavage system T protein) (EC 2.1.2.10)::Glycine dehydrogenase [decarboxylating] (glycine cleavage system P protein) (EC 1.4.4.2)::2-octaprenylphenol hydroxylase::Glycine cleavage system H 
protein::2-polyprenyl-6-methoxyphenol hydroxylase::Xaa-Pro aminopeptidase (EC 3.4.11.9)::5-formyltetrahydrofolate cyclo-ligase (EC 6.3.3.2)::ABC-type efflux pump permease component YbhS::ABC-type efflux pump, duplicated ATPase component YbhF::ABC-type efflux pump membrane fusion component YbhG::ABC-type efflux pump permease component YbhR::3'-to-5' oligoribonuclease (orn)::Ribosome small subunit biogenesis RbfA-release protein RsgA::Phosphatidylserine decarboxylase (EC 4.1.1.65)::Ferrous iron transport permease EfeU::Proline/sodium symporter PutP (TC 2.A.21.2.1)::Ferrous iron transport periplasmic protein EfeO, contains peptidase-M75 domain and (frequently) cupredoxin-like domain::Delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88)::Transcriptional repressor of PutA and PutP::Proline dehydrogenase (EC 1.5.5.2)::Ferrous iron transport peroxidase EfeB::Glycine betaine/L-proline transport substrate-binding protein ProX (TC 3.A.1.12.1)::Glycine betaine/L-proline transport system permease protein ProW (TC 3.A.1.12.1)::Glycine betaine/L-proline transport ATP-binding protein ProV (TC 3.A.1.12.1)::Ribonucleotide reductase of class Ib (aerobic), alpha subunit (EC 1.17.4.1)::Ribonucleotide reduction protein NrdI::Glutaredoxin-like protein NrdH, required for reduction of Ribonucleotide reductase class Ib::Ribonucleotide reductase of class Ib (aerobic), beta subunit (EC 1.17.4.1)::Gamma-glutamyltranspeptidase (EC 2.3.2.2)::Glutathione hydrolase (EC 3.4.19.13)::Alkyl hydroperoxide reductase subunit C-like protein::Acyl carrier protein phosphodiesterase (EC 3.1.4.14)::Protein translocase subunit SecD::tRNA-guanine transglycosylase (EC 2.4.2.29)::S-adenosylmethionine:tRNA ribosyltransferase-isomerase (EC 2.4.99.17)::Protein translocase subunit SecF::Protein translocase subunit YajC::Maltodextrin glucosidase (EC 3.2.1.20)::Acetyltransferase YpeA::Coproporphyrinogen III oxidase, aerobic (EC 1.3.3.3)::GDP-mannose pyrophosphatase NudK::NADP-dependent malic enzyme (EC 
1.1.1.40)::Cytochrome c-type protein NapC::Nitrate reductase cytochrome c550-type subunit::Periplasmic nitrate reductase precursor (EC 1.7.99.4)::Periplasmic nitrate reductase component NapD::Ferredoxin-type protein NapF (periplasmic nitrate reductase)::Anaerobic respiratory reductase chaperone::Cytochrome d ubiquinol oxidase subunit I (EC 1.10.3.-)::Cytochrome d ubiquinol oxidase subunit II (EC 1.10.3.-)::Cyd operon protein YbgE::Cytochrome d ubiquinol oxidase subunit X (EC 1.10.3.-)::Tol biopolymer transport system, TolR protein::Tol-Pal system protein TolQ::Tol-Pal system-associated acyl-CoA thioesterase::Tol-Pal system beta propeller repeat protein TolB::TolA protein::Cell division coordinator CpoB::Tol-Pal system peptidoglycan-associated lipoprotein PAL::Ribosome association toxin RatA::UPF0125 protein RatB::DNA repair protein RecN::Outer membrane beta-barrel assembly protein BamE::tmRNA-binding protein SmpB::Citrate synthase (si) (EC 2.3.3.1)::NAD kinase (EC 2.7.1.23)::Heat shock protein GrpE::2-oxoglutarate dehydrogenase E1 component (EC 1.2.4.2)::Succinate dehydrogenase iron-sulfur protein (EC 1.3.5.1)::Succinate dehydrogenase flavoprotein subunit (EC 1.3.5.1)::Succinate dehydrogenase cytochrome b-556 subunit::Succinate dehydrogenase hydrophobic membrane anchor protein::Dihydrolipoamide succinyltransferase component (E2) of 2-oxoglutarate dehydrogenase complex (EC 2.3.1.61)::Succinyl-CoA ligase [ADP-forming] beta chain (EC 6.2.1.5)::Succinyl-CoA ligase [ADP-forming] alpha chain (EC 6.2.1.5)::5'-methylthioadenosine nucleosidase (EC 3.2.2.16)::Vitamin B12 ABC transporter, substrate-binding protein BtuF::S-adenosylhomocysteine nucleosidase (EC 3.2.2.9)::Cysteine desulfurase => sulfur transfer pathway protein CsdA (EC 2.8.1.7)::Sulfur acceptor protein => sulfur transfer pathway protein CsdE::HesA/MoeB/ThiF family protein => sulfur transfer pathway protein CsdL::Membrane-bound lytic murein transglycosylase A (EC 3.2.1.-)::Glycine cleavage system transcriptional 
activator GcvA::Arginine pathway regulatory protein ArgR, repressor of arg regulon::Malate dehydrogenase (EC 1.1.1.37)::CTP synthase (EC 6.3.4.2)::Enolase (EC 4.2.1.11)::GTP pyrophosphokinase, (p)ppGpp synthetase I (EC 2.7.6.5)::Inactive (p)ppGpp 3'-pyrophosphohydrolase domain::Nucleoside triphosphate pyrophosphohydrolase MazG (EC 3.6.1.8)::23S rRNA (uracil(1939)-C(5))-methyltransferase (EC 2.1.1.190)::Signal transduction histidine-protein kinase BarA (EC 2.7.13.3)::7-carboxy-7-deazaguanine synthase (EC 4.3.99.3)::Superoxide dismutase [Cu-Zn] precursor (EC 1.15.1.1)::Helicase PriA essential for oriC/DnaA-independent DNA replication::Transcriptional (co)regulator CytR::Cell division protein FtsN::ATP-dependent protease subunit HslV (EC 3.4.25.2)::ATP-dependent hsl protease ATP-binding subunit HslU::1,4-dihydroxy-2-naphthoate polyprenyltransferase (EC 2.5.1.74)::LSU ribosomal protein L31p::Multidrug efflux system AcrAB-TolC, inner-membrane proton/drug antiporter AcrB (RND type)::LSU ribosomal protein L31p, zinc-independent::LSU ribosomal protein L36p, zinc-independent::LSU ribosomal protein L36p::LSU ribosomal protein L31p, zinc-dependent::Ribonuclease E inhibitor RraA::Multidrug efflux system AcrAB-TolC, membrane fusion component AcrA::Transcriptional regulator of acrAB operon, AcrR::LSU ribosomal protein L3p (L3e)::LSU ribosomal protein L4p (L1e)::LSU ribosomal protein L22p (L17e)::LSU ribosomal protein L2p (L8e)::LSU ribosomal protein L23p (L23Ae)::SSU ribosomal protein S19p (S15e)::SSU ribosomal protein S10p (S20e)::SSU ribosomal protein S3p (S3e)::LSU ribosomal protein L16p (L10e)::LSU ribosomal protein L14p (L23e)::LSU ribosomal protein L24p (L26e)::LSU ribosomal protein L18p (L5e)::LSU ribosomal protein L6p (L9e)::SSU ribosomal protein S8p (S15Ae)::Protein translocase subunit SecY::SSU ribosomal protein S11p (S14e)::SSU ribosomal protein S13p (S18e)::LSU ribosomal protein L15p (L27Ae)::SSU ribosomal protein S5p (S2e)::LSU ribosomal protein L30p (L7e)::SSU 
ribosomal protein S4p (S9e)::SSU ribosomal protein S4p (S9e), zinc-independent::SSU ribosomal protein S14p (S29e), zinc-independent::SSU ribosomal protein S14p (S29e)::LSU ribosomal protein L5p (L11e)::LSU ribosomal protein L29p (L35e)::DNA-directed RNA polymerase alpha subunit (EC 2.7.7.6)::SSU ribosomal protein S17p (S11e)::LSU ribosomal protein L17p::LSU ribosomal protein L36p, zinc-dependent::Bacterioferritin (EC 1.16.3.1)::16S rRNA (cytosine(967)-C(5))-methyltransferase (EC 2.1.1.176)::Methionyl-tRNA formyltransferase (EC 2.1.2.9)::Trk potassium uptake system protein TrkA::Peptide deformylase (EC 3.5.1.88)::Protein of unknown function Smg::Rossmann fold nucleotide-binding protein Smf possibly involved in DNA uptake::Uncharacterized protein YrdD::Protein YrdA::Shikimate 5-dehydrogenase I alpha (EC 1.1.1.25)::Threonylcarbamoyl-AMP synthase (EC 2.7.7.87)::Bacterioferritin-associated ferredoxin::Ferric hydroxamate ABC transporter, ATP-binding protein FhuC (TC 3.A.1.14.3)::Ferric hydroxamate ABC transporter, periplasmic substrate binding protein FhuD (TC 3.A.1.14.3)::Ferric hydroxamate ABC transporter, permease component FhuB (TC 3.A.1.14.3)::3-dehydroquinate synthase (EC 4.2.3.4)::Shikimate kinase I (EC 2.7.1.71)::Septum-associated cell division protein DamX::Methyl-directed repair DNA adenine methylase (EC 2.1.1.72)::Multimodular transpeptidase-transglycosylase (EC 3.4.-.-) (EC 2.4.1.129)::Type IV pilus biogenesis protein PilM::ADP compounds hydrolase NudE (EC 3.6.1.-)::Type IV pilus biogenesis protein PilN::Type IV pilus biogenesis protein PilO::Type IV pilus biogenesis protein PilP::Type IV pilus biogenesis protein PilQ::Glutamate-1-semialdehyde 2,1-aminomutase (EC 5.4.3.8)::probable iron binding protein from the HesB_IscA_SufA family::DNA mismatch repair protein MutS::Ribose 5-phosphate isomerase B (EC 5.3.1.6)::N,N'-diacetylchitobiose-specific 6-phospho-beta-glucosidase (EC 3.2.1.86)::Peptidyl-prolyl cis-trans isomerase PpiA precursor (EC 5.2.1.8)::ABC 
transporter, substrate-binding protein (cluster 8, B12/iron complex)::Iron(III) dicitrate transport system permease protein FecD (TC 3.A.1.14.1)::ABC transporter, ATP-binding protein (cluster 8, B12/iron complex)::Ferrichrome transport ATP-binding protein FhuC (TC 3.A.1.14.3)::Ferrichrome-iron receptor::Iron compound ABC transporter, permease protein::Lysine-specific permease::Transcriptional regulator YeiE, LysR family::Iron(III) dicitrate transport ATP-binding protein FecE (TC 3.A.1.14.1)::Nitrite reductase [NAD(P)H] large subunit (EC 1.7.1.4)::Nitrite reductase [NAD(P)H] small subunit (EC 1.7.1.4)::Cytosine deaminase (EC 3.5.4.1)::TonB-dependent receptor::Endonuclease IV (EC 3.1.21.2)::Ribulose-phosphate 3-epimerase (EC 5.1.3.1)::Tryptophanyl-tRNA synthetase (EC 6.1.1.2)::Phosphoglycolate phosphatase (EC 3.1.3.18)::Glucosamine-1-phosphate N-acetyltransferase (EC 2.3.1.157)::N-acetylglucosamine-1-phosphate uridyltransferase (EC 2.7.7.23)::Glucosamine--fructose-6-phosphate aminotransferase [isomerizing] (EC 2.6.1.16)::ATP synthase beta chain (EC 3.6.3.14)::ATP synthase epsilon chain (EC 3.6.3.14)::Branched-chain amino acid transport system carrier protein::Proline-specific permease proY::Phosphate transport system permease protein PstA (TC 3.A.1.7.1)::Phosphate transport system permease protein PstC (TC 3.A.1.7.1)::Phosphate transport ATP-binding protein PstB (TC 3.A.1.7.1)::Phosphate ABC transporter, periplasmic phosphate-binding protein PstS (TC 3.A.1.7.1)::Phosphate regulon sensor protein PhoR (SphS) (EC 2.7.13.3)::Phosphate regulon transcriptional regulatory protein PhoB (SphR)::Exonuclease SbcD::Phosphate transport system regulatory protein PhoU::ATP synthase alpha chain (EC 3.6.3.14)::ATP synthase gamma chain (EC 3.6.3.14)::ATP synthase delta chain (EC 3.6.3.14)::ATP synthase F0 sector subunit b (EC 3.6.3.14)::ATP synthase F0 sector subunit c (EC 3.6.3.14)::Exonuclease sbcC (EC 3.1.11.-)::ATP synthase F0 sector subunit a (EC 3.6.3.14)::ATP synthase protein 
I::Gamma-aminobutyrate:alpha-ketoglutarate aminotransferase (EC 2.6.1.19)::Spermidine/putrescine import ABC transporter substrate-binding protein PotD (TC 3.A.1.11.1)::Spermidine/putrescine import ABC transporter permease protein PotC (TC 3.A.1.11.1)::Exopolyphosphatase (EC 3.6.1.11)::Guanosine-5'-triphosphate,3'-diphosphate pyrophosphatase (EC 3.6.1.40)::ATP-dependent DNA helicase rep (EC 3.6.1.-)::ATP-dependent RNA helicase rhlB (EC 3.6.1.-)::Thioredoxin::Polyphosphate kinase (EC 2.7.4.1)::Mg/Co/Ni transporter MgtE, CBS domain-containing::Peptidyl-prolyl cis-trans isomerase PpiC (EC 5.2.1.8)::5'-nucleotidase SurE (EC 3.1.3.5)::Uncharacterized chaperone protein YegD::2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase (EC 4.6.1.12)::2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase (EC 2.7.7.60)::tRNA pseudouridine(13) synthase (EC 5.4.99.27)::Cell division protein DivIC (FtsB), stabilizes FtsL against RasP cleavage::Transcription termination factor Rho::Protein-L-isoaspartate O-methyltransferase (EC 2.1.1.77)::Murein hydrolase activator NlpD::Phosphoadenylyl-sulfate reductase [thioredoxin] (EC 1.8.4.8)::Sulfite reductase [NADPH] hemoprotein beta-component (EC 1.8.1.2)::Sulfite reductase [NADPH] flavoprotein alpha-component (EC 1.8.1.2)::6-carboxy-5,6,7,8-tetrahydropterin synthase (EC 4.1.2.50)::Precorrin-2 oxidase (EC 1.3.1.76)::Uroporphyrinogen-III methyltransferase (EC 2.1.1.107)::Sirohydrochlorin ferrochelatase activity of CysG (EC 4.99.1.4)::Sulfate adenylyltransferase subunit 2 (EC 2.7.7.4)::Sulfate adenylyltransferase subunit 1 (EC 2.7.7.4)::Adenylylsulfate kinase (EC 2.7.1.25)::Aspartate--ammonia ligase (EC 6.3.1.1)::Flavoprotein MioC::16S rRNA (guanine(527)-N(7))-methyltransferase (EC 2.1.1.170)::tRNA-5-carboxymethylaminomethyl-2-thiouridine(34) synthesis protein MnmG::ABC transporter, permease protein (cluster 3, basic aa/glutamine/opines)::ABC transporter, substrate-binding protein (cluster 3, basic aa/glutamine/opines)::Glutamine ABC 
transporter, ATP-binding protein GlnQ::Polymer-forming bactofilin::L-alanine:glyoxylate aminotransferase (EC 2.6.1.44)::Serine--pyruvate aminotransferase (EC 2.6.1.51)::N-carbamoyl-L-amino acid hydrolase (EC 3.5.1.87)::Glutamine ABC transporter, permease protein GlnP::Glutamine ABC transporter, substrate-binding protein GlnH::Asp-tRNAAsn/Glu-tRNAGln amidotransferase A subunit and related amidases::RNA polymerase sigma factor RpoS::Phosphoribosylglycinamide formyltransferase (EC 2.1.2.2)::Spermidine N1-acetyltransferase (EC 2.3.1.57)::Lipid III flippase::dTDP-4-amino-4,6-dideoxygalactose transaminase (EC 2.6.1.59)::dTDP-fucosamine acetyltransferase (EC 2.3.1.210)::Putative ECA polymerase WzyE::TDP-N-acetylfucosamine:lipid II N-acetylfucosaminyltransferase (EC 2.4.1.325)::Lipopolysaccharide N-acetylmannosaminouronosyltransferase (EC 2.4.1.180)::Glucose-1-phosphate thymidylyltransferase (EC 2.7.7.24)::UDP-N-acetyl-D-mannosamine dehydrogenase (EC 1.1.1.336)::dTDP-glucose 4,6-dehydratase (EC 4.2.1.46)::Lipopolysaccharide biosynthesis protein WzzE::UDP-N-acetylglucosamine 2-epimerase (EC 5.1.3.14)::Undecaprenyl-phosphate alpha-N-acetylglucosaminyl 1-phosphate transferase (EC 2.7.8.33)::Dephospho-CoA kinase (EC 2.7.1.24)::GMP reductase (EC 1.7.1.7)::Anaerobic dimethyl sulfoxide reductase chain A, molybdopterin-binding domain (EC 1.8.5.3)::Anaerobic dimethyl sulfoxide reductase chain B, iron-sulfur binding subunit (EC 1.8.5.3)::Anaerobic dimethyl sulfoxide reductase chain C, anchor subunit (EC 1.8.5.3)::Anaerobic dimethyl sulfoxide reductase chaperone DmsD::3'-to-5' exoribonuclease RNase R::Adenylosuccinate synthetase (EC 6.3.4.4)::HflC protein::HflK protein::Putative inner membrane protein YjeT (clustered with HflC)::Arsenic efflux pump protein::Arsenical resistance operon repressor::Putative dihydroxyacetone kinase, ADP-binding subunit (EC 2.7.1.29)::Putative dihydroxyacetone kinase, dihydroxyacetone binding subunit (EC 2.7.1.29)::Flp pilus assembly protein 
RcpC/CpaB::Type II/IV secretion system secretin RcpA/CpaC, associated with Flp pilus assembly::Type II/IV secretion system ATPase TadZ/CpaE, associated with Flp pilus assembly::Type IV prepilin peptidase TadV/CpaA::Type II/IV secretion system ATP hydrolase TadA/VirB11/CpaF, TadA subfamily::Flp pilus assembly protein TadB::Biopolymer transport protein ExbD/TolR::Protein TadG, associated with Flp pilus assembly::Flp pilus assembly surface protein TadF, ATP/GTP-binding motif::Flp pilus assembly membrane protein TadE::Flp pilus assembly protein TadD, contains TPR repeat::Flagellar motor rotation protein MotA::Flagellar motor rotation protein MotB::Flagellar transcriptional activator FlhC::Flagellar transcriptional activator FlhD::Positive regulator of CheA protein activity (CheW)::Signal transduction histidine kinase CheA (EC 2.7.3.-)::RidA/YER057c/UK114 superfamily protein::Aerobic respiration control protein arcA::Conserved uncharacterized protein CreA::Cytosine/purine/uracil/thiamine/allantoin permease family protein::DNA polymerase III epsilon subunit (EC 2.7.7.7)::FIG005121: SAM-dependent methyltransferase (EC 2.1.1.-)::Hydroxyacylglutathione hydrolase (EC 3.1.2.6)::Membrane-bound lytic murein transglycosylase D::Ribonuclease HI (EC 3.1.26.4)::Cell division protein FtsH (EC 3.4.24.-)::Dihydropteroate synthase (EC 2.5.1.15)::Phosphoglucosamine mutase (EC 5.4.2.10)::Penicillin-binding protein 2 (PBP-2)::Rod shape-determining protein RodA::D-alanyl-D-alanine carboxypeptidase (EC 3.4.16.4)::Sensor protein BasS (activates BasR)::Transcription elongation factor GreA::Septum-associated rare lipoprotein A::GTP-binding protein Obg::Proposed lipoate regulatory protein YbeD::Two-component transcriptional regulatory protein BasR (activated by BasS)::Octanoate-[acyl-carrier-protein]-protein-N-octanoyltransferase (EC 2.3.1.181)::Cystathionine beta-lyase (EC 4.4.1.8)::Serine transporter::2,5-didehydrogluconate reductase (2-dehydro-D-gluconate-forming) (EC 
1.1.1.274)::Methylglyoxal reductase, dihydroxyacetone producing::NADPH-dependent broad range aldehyde dehydrogenase YqhD::Transcriptional regulator YqhC, positively regulates YqhD and DkgA::2,5-didehydrogluconate reductase (2-dehydro-L-gulonate-forming) (EC 1.1.1.346)::LSU ribosomal protein L21p::LSU ribosomal protein L27p::Octaprenyl diphosphate synthase (EC 2.5.1.90)::Protein translocase membrane subunit SecG::Lipoyl synthase (EC 2.8.1.8)::Twin-arginine translocation protein TatE::23S rRNA (pseudouridine(1915)-N(3))-methyltransferase (EC 2.1.1.177)::ABC-type hemin transport system, ATPase component::Hemin ABC transporter, permease protein::Cystathionine gamma-synthase (EC 2.5.1.48)::Methionine repressor MetJ::Aspartokinase (EC 2.7.2.4)::Homoserine dehydrogenase (EC 1.1.1.3)::Homoserine kinase (EC 2.7.1.39)::Threonine synthase (EC 4.2.3.1)::5,10-methylenetetrahydrofolate reductase (EC 1.5.1.20)::Hemin transport protein HmuS::TonB-dependent hemin, ferrichrome receptor::Periplasmic hemin-binding protein::Putative heme iron utilization protein::Radical SAM family protein HutW, similar to coproporphyrinogen III oxidase, oxygen-independent, associated with heme uptake::Glucose-6-phosphate isomerase (EC 5.3.1.9)::DNA polymerase III delta subunit (EC 2.7.7.7)::Leucyl-tRNA synthetase (EC 6.1.1.4)::Nicotinate-nucleotide adenylyltransferase (EC 2.7.7.18)::Ribosomal silencing factor RsfA::Ferredoxin--NADP(+) reductase (EC 1.18.1.2)::Fructose-1,6-bisphosphatase, GlpX type (EC 3.1.3.11)::4-carboxy-4-hydroxy-2-oxoadipate aldolase (EC 4.1.3.17)::4-oxalmesaconate hydratase (EC 4.2.1.83)::Transcriptional regulator, LysR family::Triosephosphate isomerase (EC 5.3.1.1)::Glycerol kinase (EC 2.7.1.30)::Glycerol uptake facilitator protein::Uricase (urate oxidase) (EC 1.7.3.3)::D-glycero-beta-D-manno-heptose-1,7-bisphosphate 7-phosphatase (EC 3.1.3.82)::Methionine ABC transporter substrate-binding protein::Quorum-sensing transcriptional activator YpeR::Acyl-homoserine-lactone synthase 
YpeI (EC 2.3.1.184)::Phosphoenolpyruvate synthase (EC 2.7.9.2)::UPF0118 inner membrane protein YdiK::Arogenate dehydratase (EC 4.2.1.91)::Cytochrome c551 peroxidase (EC 1.11.1.5)::Chorismate mutase I (EC 5.4.99.5)::Prephenate dehydratase (EC 4.2.1.51)::2-keto-3-deoxy-D-arabino-heptulosonate-7-phosphate synthase I alpha (EC 2.5.1.54)::Cyclohexadienyl dehydrogenase (EC 1.3.1.43) (EC 1.3.1.12)::Aldose 1-epimerase (EC 5.1.3.3)::Galactokinase (EC 2.7.1.6)::Galactose-1-phosphate uridylyltransferase (EC 2.7.7.10)::Phosphoglycerate mutase (EC 5.4.2.11)::UDP-glucose 4-epimerase (EC 5.1.3.2)::Inosine/xanthosine triphosphatase (EC 3.6.1.-)::Transcriptional repressor protein TrpR::Phosphoenolpyruvate synthase regulatory protein::Type II/IV secretion system protein TadC, associated with Flp pilus assembly::Flagellar hook-associated protein FlgK::Flagellar hook-associated protein FlgL::Flagellar basal-body rod protein FlgF::Flagellar basal-body rod protein FlgG::Flagellar L-ring protein FlgH::Flagellar basal-body P-ring formation protein FlgA::Flagellar basal-body rod protein FlgB::Flagellar basal-body rod protein FlgC::Flagellar basal-body rod modification protein FlgD::Flagellar hook protein FlgE::Flagellar P-ring protein FlgI::Flagellar protein FlgJ [peptidoglycan hydrolase] (EC 3.2.1.-)::Negative regulator of flagellin synthesis FlgM (anti-sigma28)::Flagellar biosynthesis protein FlgN::Flagellar M-ring protein FliF::Flagellar motor switch protein FliG::Flagellar assembly protein FliH::Flagellum-specific ATP synthase FliI::Flagellar hook-basal body complex protein FliE::Flagellar hook-length control protein FliK::Flagellar biosynthesis protein FlhA::Flagellar biosynthesis protein FlhB::Flagellar biosynthesis protein FliP::Flagellar biosynthesis protein FliR::Flagellar motor switch protein FliM::Flagellar motor switch protein FliN::Flagellar biosynthesis protein FliQ::Flagellar protein FlhE::Flagellar biosynthesis protein FliO::Flagellar protein FliJ::Flagellar basal 
body-associated protein FliL::DNA repair protein RadA::Phosphoserine phosphatase (EC 3.1.3.3)::Deoxyribose-phosphate aldolase (EC 4.1.2.4)::Thymidine phosphorylase (EC 2.4.2.4)::Phosphopentomutase (EC 5.4.2.7)::Purine nucleoside phosphorylase (EC 2.4.2.1)::Undecaprenyl-diphosphatase BcrC (EC 3.6.1.27)::ABC transporter, ATP-binding protein (cluster 2, ribose/xylose/arabinose/galactose)::ABC transporter, permease protein (cluster 2, ribose/xylose/arabinose/galactose)::ABC transporter, substrate-binding protein (cluster 2, ribose/xylose/arabinose/galactose)::Ribokinase (EC 2.7.1.15)::D-xylose transport ATP-binding protein XylG::Xylose ABC transporter, permease protein XylH::Xylose ABC transporter, periplasmic xylose-binding protein XylF::NadR transcriptional regulator::Nicotinamide-nucleotide adenylyltransferase, NadR family (EC 2.7.7.1)::Ribosylnicotinamide kinase (EC 2.7.1.22)::Uncharacterized glutathione S-transferase-like protein::Flagellar regulatory protein FleQ::Xylulose kinase (EC 2.7.1.17)::PTS system, fructose-specific IIB component (EC 2.7.1.202)::PTS system, fructose-specific IIC component (EC 2.7.1.69)::1-phosphofructokinase (EC 2.7.1.56)::PTS system, fructose-specific IIA component (EC 2.7.1.202)::Na+ dependent nucleoside transporter NupC::D-malate dehydrogenase [decarboxylating] (EC 1.1.1.83)::LysR family transcriptional regulator DmlR::3-oxoacyl-[acyl-carrier-protein] synthase, KASIII (EC 2.3.1.180)::Phosphate:acyl-ACP acyltransferase PlsX (EC 2.3.1.n2)::Malonyl CoA-acyl carrier protein transacylase (EC 2.3.1.39)::LSU ribosomal protein L32p, zinc-independent::LSU ribosomal protein L32p::FIG01269488: protein, clustered with ribosomal protein L32p::3-oxoacyl-[acyl-carrier protein] reductase (EC 1.1.1.100)::3-oxoacyl-[acyl-carrier-protein] synthase, KASII (EC 2.3.1.179)::Acyl carrier protein::3-hydroxybutyryl-CoA epimerase (EC 5.1.2.3)::Enoyl-CoA hydratase (EC 4.2.1.17)::3-hydroxyacyl-CoA dehydrogenase (EC 1.1.1.35)::3-ketoacyl-CoA thiolase (EC 
2.3.1.16)::Hydroxymethylglutaryl-CoA synthase (EC 2.3.3.10)::FIG000605: protein co-occurring with transport systems (COG1739)::Trk potassium uptake system protein TrkH::Xaa-Pro dipeptidase PepQ (EC 3.4.13.9)::Protoporphyrinogen IX oxidase, oxygen-independent, HemG (EC 1.3.-.-)::Porphobilinogen synthase (EC 4.2.1.24)::Transcriptional activator RfaH::3-polyprenyl-4-hydroxybenzoate carboxy-lyase (EC 4.1.1.98)::Deoxyribonuclease TatD::NAD(P)H-flavin reductase (EC 1.16.1.3) (EC 1.5.1.41)::Carbon starvation protein A::Exoribonuclease II (EC 3.1.13.1)::L-Proline/Glycine betaine transporter ProP::Transaldolase (EC 2.2.1.2)::Molybdopterin adenylyltransferase (EC 2.7.7.75)::L-ribulose-5-phosphate 4-epimerase (EC 5.1.3.4)::Orotidine 5'-phosphate decarboxylase (EC 4.1.1.23)::Translation initiation factor SUI1-related protein::Ubiquinone biosynthesis protein UbiJ::Ubiquinone biosynthesis regulatory protein kinase UbiB::DNA recombination protein RmuC::2-methoxy-6-polyprenyl-1,4-benzoquinol methylase (EC 2.1.1.201)::Demethylmenaquinone methyltransferase (EC 2.1.1.163)::Twin-arginine translocation protein TatB::Twin-arginine translocation protein TatA::Twin-arginine translocation protein TatC::5-methyltetrahydropteroyltriglutamate--homocysteine methyltransferase (EC 2.1.1.14)::Uridine phosphorylase (EC 2.4.2.3)::L-cystine ABC transporter (wide substrate range), ATP-binding protein YecC::L-cystine ABC transporter (wide substrate range), permease protein YecS::Flagellar biosynthesis protein FliS::Flagellar cap protein FliD::Flagellar biosynthesis protein FliT::D-cysteine desulfhydrase (EC 4.4.1.15)::L-cystine ABC transporter (wide substrate range), substrate-binding protein FliY::RNA polymerase sigma factor for flagellar operon::Regulator of sigma S factor FliZ::Flagellin protein FlaA::Phenylalanyl-tRNA synthetase alpha chain (EC 6.1.1.20)::Phenylalanyl-tRNA synthetase beta chain (EC 6.1.1.20)::Integration host factor alpha subunit::LSU ribosomal protein L20p::Threonyl-tRNA 
synthetase (EC 6.1.1.3)::Translation initiation factor 3::LSU ribosomal protein L35p::UPF0056 inner membrane protein MarC::UDP-4-amino-4-deoxy-L-arabinose--oxoglutarate aminotransferase (EC 2.6.1.87)::Undecaprenyl-phosphate 4-deoxy-4-formamido-L-arabinose transferase (EC 2.4.2.53)::Vitamin B12 ABC transporter, ATP-binding protein BtuD::Glutathione peroxidase (EC 1.11.1.9)::Vitamin B12 ABC transporter, permease protein BtuC::UDP-4-amino-4-deoxy-L-arabinose formyltransferase (EC 2.1.2.13)::UDP-glucuronic acid oxidase (UDP-4-keto-hexauronic acid decarboxylating) (EC 1.1.1.305)::4-deoxy-4-formamido-L-arabinose-phosphoundecaprenol deformylase ArnD (EC 3.5.1.n3)::Undecaprenyl phosphate-alpha-4-amino-4-deoxy-L-arabinose arabinosyl transferase (EC 2.4.2.43)::Undecaprenyl phosphate-aminoarabinose flippase subunit ArnF::Undecaprenyl phosphate-aminoarabinose flippase subunit ArnE::Lipoate-protein ligase A::Mg(2+) transport ATPase, P-type (EC 3.6.3.2)::Mg(2+)-transport-ATPase-associated protein MgtC::Modular polyketide synthase (EC 2.3.1.- )::Cobalt-zinc-cadmium resistance protein CzcD::Quinolinate synthetase (EC 2.5.1.72)::Ribosyl nicotinamide transporter, PnuC-like::Cold shock protein of CSP family => CspE (naming convention as in E.coli)::Uncharacterized fimbrial chaperone YbgP::Chemotaxis protein methyltransferase CheR (EC 2.1.1.80)::Chemotaxis response regulator protein-glutamate methylesterase CheB (EC 3.1.1.61)::Chemotaxis regulator - transmits chemoreceptor signals to flagellar motor components CheY::Chemotaxis response - phosphatase CheZ::DNA mismatch repair protein MutL::tRNA dimethylallyltransferase (EC 2.5.1.75)::RNA-binding protein Hfq::Ribosome LSU-associated GTP-binding protein HflX::Exodeoxyribonuclease V alpha chain (EC 3.1.11.5)::Exodeoxyribonuclease V beta chain (EC 3.1.11.5)::1,6-anhydro-N-acetylmuramyl-L-alanine amidase::N-acetylmuramoyl-L-alanine amidase (EC 3.5.1.28)::ADP-dependent (S)-NAD(P)H-hydrate dehydratase (EC 4.2.1.136)::NAD(P)H-hydrate epimerase 
(EC 5.1.99.6)::tRNA threonylcarbamoyladenosine biosynthesis protein TsaE::Epoxyqueuosine reductase (EC 1.17.99.6)::Type IV fimbrial assembly protein PilC::Type IV fimbrial assembly, ATPase PilB::Type IV pilin PilA::Quinolinate phosphoribosyltransferase [decarboxylating] (EC 2.4.2.19)::Membrane-bound metal-dependent hydrolase YdjM, induced during SOS response::N-acetylglutamate synthase (EC 2.3.1.1)::Oligogalacturonate lyase (EC 4.2.2.6)::Transcriptional regulator KdgR, KDG operon repressor::Exodeoxyribonuclease V gamma chain (EC 3.1.11.5)::Protease III precursor (EC 3.4.24.55)::D-tagatose-1,6-bisphosphate aldolase subunit KbaZ::Transcriptional repressor of aga operon::Galactosamine-6-phosphate isomerase AgaS::2-dehydro-3-deoxy-D-gluconate 5-dehydrogenase (EC 1.1.1.127)::2-deoxy-D-gluconate 3-dehydrogenase (EC 1.1.1.125)::N-acetylgalactosamine-6-phosphate deacetylase (EC 3.5.1.25)::PTS system, N-acetylgalactosamine-specific IIA component::PTS system, galactosamine-specific IIA component (EC 2.7.1.69)::Rhamnogalacturonides degradation protein RhiN::PTS system, N-acetylgalactosamine-specific IID component (EC 2.7.1.69)::4-deoxy-L-threo-5-hexosulose-uronate ketol-isomerase (EC 5.3.1.17)::Pectin degradation protein KdgF::PTS system, N-acetylgalactosamine-specific IIC component (EC 2.7.1.69)::D-3-phosphoglycerate dehydrogenase (EC 1.1.1.95)::PTS system, N-acetylgalactosamine-specific IIB component (EC 2.7.1.69)::Ribose 5-phosphate isomerase A (EC 5.3.1.6)::ABC transporter, ATP-binding protein (cluster 1, maltose/g3p/polyamine/iron)::ABC transporter, ATP-binding protein (cluster 10, nitrate/sulfonate/bicarbonate)::ABC transporter, permease protein 2 (cluster 1, maltose/g3p/polyamine/iron)::ABC transporter, substrate-binding protein (cluster 1, maltose/g3p/polyamine/iron)::ABC transporter, permease protein 1 (cluster 1, maltose/g3p/polyamine/iron)::Exopolygalacturonate lyase (EC 4.2.2.9)::Transcriptional regulator ArgP, LysR family::AaeAB efflux system for hydroxylated, 
aromatic carboxylic acids, inner membrane subunit AaeB::Succinate-semialdehyde dehydrogenase [NAD(P)+] (EC 1.2.1.16)::Ribonuclease (Barnase), secreted::Barstar, ribonuclease (Barnase) inhibitor::UPF0307 protein YjgA::AaeAB efflux system for hydroxylated, aromatic carboxylic acids, membrane fusion component AaeA::LysR family transcriptional regulator AaeR::TldE protein, part of TldE/TldD proteolytic complex::AaeX protein, function unknown::Oligogalacturonate-specific porin protein KdgM::Arginine exporter protein ArgO::Fumarate hydratase class I (EC 4.2.1.2)::Methionine aminopeptidase (EC 3.4.11.18)::[Protein-PII] uridylyltransferase (EC 2.7.7.59)::[Protein-PII]-UMP uridylyl-removing enzyme::2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase (EC 2.3.1.117)::SSU ribosomal protein S2p (SAe)::Translation elongation factor Ts::Hypothetical flavoprotein YqcA (clustered with tRNA pseudouridine synthase C)::UPF0325 protein YaeH::tRNA pseudouridine(65) synthase (EC 5.4.99.26)::Outer membrane protein H precursor::Outer membrane protein assembly factor YaeT::UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase (EC 2.3.1.191)::3-hydroxyacyl-[acyl-carrier-protein] dehydratase, FabZ form (EC 4.2.1.59)::Acyl-CoA:1-acyl-sn-glycerol-3-phosphate acyltransferase (EC 2.3.1.51)::Topoisomerase IV subunit A (EC 5.99.1.-)::Acyl-ACP:1-acyl-sn-glycerol-3-phosphate acyltransferase (EC 2.3.1.n4)::1-deoxy-D-xylulose 5-phosphate reductoisomerase (EC 1.1.1.267)::Phosphatidate cytidylyltransferase (EC 2.7.7.41)::Undecaprenyl diphosphate synthase (EC 2.5.1.31)::CDP-diacylglycerol--glycerol-3-phosphate 3-phosphatidyltransferase (EC 2.7.8.5)::Ribosome recycling factor::Uridine monophosphate kinase (EC 2.7.4.22)::BarA-associated response regulator UvrY (= GacA = SirA)::Excinuclease ABC subunit C::Acetyl-coenzyme A carboxyl transferase alpha chain (EC 6.4.1.2)::DNA polymerase III alpha subunit (EC 2.7.7.7)::Ribonuclease HII (EC 3.1.26.4)::Lipid-A-disaccharide synthase (EC 
2.4.1.182)::Acyl-[acyl-carrier-protein]--UDP-N-acetylglucosamine O-acyltransferase (EC 2.3.1.129)::Allophanate hydrolase 2 subunit 1 (EC 3.5.1.54)::Allophanate hydrolase 2 subunit 2 (EC 3.5.1.54)::3',5'-cyclic-nucleotide phosphodiesterase (EC 3.1.4.17)::Deoxyribodipyrimidine photolyase (EC 4.1.99.3)::GTP cyclohydrolase 1 type 2 homolog YbgI::ADP-ribose pyrophosphatase (EC 3.6.1.13)::Outer membrane channel TolC (OpmH)::Topoisomerase IV subunit B (EC 5.99.1.-)::Lactam utilization protein LamB::FIG143828: Hypothetical protein YbgA::Osmosensitive K+ channel histidine kinase KdpD (EC 2.7.3.-)::Potassium-transporting ATPase B chain (TC 3.A.3.7.1) (EC 3.6.3.12)::Potassium-transporting ATPase A chain (TC 3.A.3.7.1) (EC 3.6.3.12)::Potassium-transporting ATPase C chain (TC 3.A.3.7.1) (EC 3.6.3.12)::DNA-binding response regulator KdpE::N,N'-diacetylchitobiose-specific regulator ChbR, AraC family::PTS system, N,N'-diacetylchitobiose-specific IIC component::PTS system, N,N'-diacetylchitobiose-specific IIA component (EC 2.7.1.196)::PTS system, N,N'-diacetylchitobiose-specific IIB component (EC 2.7.1.196)::N,N'-diacetylchitobiose utilization operon protein YdjC::Phosphoglucomutase (EC 5.4.2.2)::SeqA protein, negative modulator of initiation of replication::Pyrrolidone-carboxylate peptidase (EC 3.4.19.3)::Uracil-DNA glycosylase, family 1 (EC 3.2.2.27)::L-xylulose 5-phosphate 3-epimerase (EC 5.1.3.-)::UPF0246 protein YaaA::DNA-3-methyladenine glycosylase II (EC 3.2.2.21)::Betaine/carnitine/choline transporter (BCCT) family::Carnitine monooxygenase, oxygenase component CntA::Carnitine monooxygenase, reductase component CntB::Channel-forming transporter/cytolysins activator of TpsB family::Putative haemolysin/cytolysin secreted via TPS pathway::Putative large exoprotein involved in heme utilization or adhesion of ShlA/HecA/FhaA family::Arginyl-tRNA synthetase (EC 6.1.1.19)::5-methyltetrahydrofolate--homocysteine methyltransferase (EC 2.1.1.13)::Isocitrate dehydrogenase phosphatase 
/kinase (EC 3.1.3.-) (EC 2.7.11.5)::Multidrug resistance protein MdtH::Protein of unknown function YceH::Ribosomal-protein-S5p-alanine acetyltransferase (EC 2.3.1.128)::Proposed peptidoglycan lipid II flippase MurJ::Cytoplasmic copper homeostasis protein cutC::tRNA (mo5U34)-methyltransferase::tRNA (cmo5U34)-methyltransferase::Copper homeostasis protein CutF precursor::Hypothetical protein YaeJ with similarity to translation release factor::Prolyl-tRNA synthetase, bacterial type (EC 6.1.1.15)::tRNA (adenine(37)-N6)-methyltransferase::Rho-specific inhibitor of transcription termination (YaeO)::Uncharacterized protein YaeR with similarity to glyoxylase family::tRNA(Ile)-lysidine synthetase (EC 6.3.4.19)::Phosphoribosylformylglycinamidine cyclo-ligase (EC 6.3.3.1)::Uracil phosphoribosyltransferase (EC 2.4.2.9)::Catalase-peroxidase KatG (EC 1.11.1.21)::Regulator of nucleoside diphosphate kinase::Ribonuclease E (EC 3.1.26.12)::Ribosomal large subunit pseudouridine synthase C (EC 5.4.99.24)::Transcription termination protein NusA::Translation initiation factor 2::Ribosome-binding factor A::tRNA pseudouridine(55) synthase (EC 5.4.99.25)::Bacterial ribosome SSU maturation protein RimP::SSU ribosomal protein S15p (S13e)::Isocitrate lyase (EC 4.1.3.1)::Malate synthase (EC 2.3.3.9)::Energy-dependent translational throttle protein EttA::Copper resistance protein CopC::Copper resistance protein CopD::DNA polymerase III theta subunit (EC 2.7.7.7)::Proline iminopeptidase (EC 3.4.11.5)::2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase (EC 4.1.2.52)::4-hydroxyphenylacetate symporter, major facilitator superfamily (MFS)::4-hydroxyphenylacetate 3-monooxygenase (EC 1.14.14.9)::2-oxo-hepta-3-ene-1,7-dioic acid hydratase (EC 4.2.-.-)::4-hydroxyphenylacetate 3-monooxygenase, reductase component (EC 1.6.8.-)::3,4-dihydroxyphenylacetate 2,3-dioxygenase (EC 1.13.11.15)::5-carboxymethyl-2-hydroxymuconate semialdehyde dehydrogenase (EC 1.2.1.60)::2-hydroxyhepta-2,4-diene-1,7-dioate isomerase 
(EC 5.3.3.-)::5-carboxymethyl-2-oxo-hex-3- ene-1,7-dioate decarboxylase (EC 4.1.1.68)::5-carboxymethyl-2-hydroxymuconate delta-isomerase (EC 5.3.3.10)::Homoprotocatechuate degradative operon repressor::L-serine dehydratase, alpha subunit (EC 4.3.1.17)::L-serine dehydratase, beta subunit (EC 4.3.1.17)::ATP-dependent 23S rRNA helicase DbpA::Phosphoribosylglycinamide formyltransferase 2 (EC 2.1.2.-)::Para-aminobenzoate synthase, aminase component (EC 2.6.1.85)::Uncharacterized Nudix hydrolase NudL::Protease II (EC 3.4.21.83)::Glutamyl-tRNA synthetase (EC 6.1.1.17)::DNA polymerase III psi subunit (EC 2.7.7.7)::Peptide chain release factor 3::Osmotically inducible protein OsmY::Putative deoxyribonuclease YjjV::A/G-specific adenine glycosylase (EC 3.2.2.-)::FIG001341: Probable Fe(2+)-trafficking protein YggX::tRNA (guanine(46)-N(7))-methyltransferase (EC 2.1.1.33)::Fructose-1,6-bisphosphatase, type I (EC 3.1.3.11)::UDP-N-acetylmuramate:L-alanyl-gamma-D-glutamyl-meso-diaminopimelate ligase (EC 6.3.2.-)::Inner membrane component of TAM transport system::Outer membrane component of TAM transport system::Peptide-methionine (S)-S-oxide reductase MsrA (EC 1.8.4.11)::UPF0131 protein YtfP::Inorganic pyrophosphatase (EC 3.6.1.1)::Type III secretion host injection and negative regulator protein (YopD)::Type III secretion host injection protein (YopB)::Type III secretion chaperone protein for YopD (SycD)::Type III secretion cytoplasmic LcrG inhibitor (LcrV,secretion and targeting control protein, V antigen)::Type III secretion bridge between inner and outermembrane lipoprotein (YscJ,HrcJ,EscJ, PscJ)::Type III secretion protein SsaI::Type III secretion outermembrane pore forming protein (YscC,MxiD,HrcC, InvG)::Type III secretion thermoregulatory protein (LcrF,VirF,transcription regulation of virulence plasmid)::Type III secretion inner membrane protein (YscT,HrcT,SpaR,EscT,EpaR1,homologous to flagellar export components)::Type III secretion inner membrane protein 
(YscU,SpaS,EscU,HrcU,SsaU, homologous to flagellar export components)::Type III secretion inner membrane protein (YscR,SpaR,HrcR,EscR,homologous to flagellar export components)::Type III secretion inner membrane protein (YscS,homologous to flagellar export components)::Type III secretion inner membrane protein (YscQ,homologous to flagellar export components)::Type III secretion cytoplasmic ATP synthase (EC 3.6.3.14, YscN,SpaL,MxiB,HrcN,EscN)::Type III secretion spans bacterial envelope protein (YscO)::Type III secretion inner membrane channel protein (LcrD,HrcV,EscV,SsaV)::Type III secretion protein SsaK::Type III secretion protein SsaH::Type III secretion inner membrane protein (YscD,homologous to flagellar export components)::Type III secretion cytoplasmic protein (YscL)::Type III secretion low calcium response protein (LcrR)::Type III secretion protein (YscP)::Chaperone protein YscY (Yop proteins translocation protein Y)::Type III secretion protein SctX::Type III secretion cytoplasmic protein (YscK)::Type III secretion protein (YscE)::Type III secretion chaperone SycN::Type III secretion chaperone protein for YopN (SycN,YscB)::Type III secretion negative regulator of effector production protein (LcrQ,YscM, YscM1 and YscM2)::Type III secretion outermembrane contact sensing protein (YopN,Yop4b,LcrE)::Type III secretion cytoplasmic protein (YscI)::Type III secretion cytoplasmic protein (YscF)::Type III secretion spans bacterial envelope protein (YscG)::Type III secretion effector protein (YopR, encoded by YscH)::Type III secretion cytoplasmic plug protein (LcrG)::Type III secretion outermembrane negative regulator of secretion (TyeA)::Type III secretion protein SsaG::Type III secretion transporter lipoprotein (YscW,VirG)::D-erythrose-4-phosphate dehydrogenase (EC 1.2.1.72)::Phosphoglycerate kinase (EC 2.7.2.3)::Fructose-bisphosphate aldolase class II (EC 4.1.2.13)::Transcriptional regulator YbiH, TetR family::ATP-dependent RNA helicase RhlE::Zinc 
resistance-associated protein::Type III secretion protein (YscA)::Type III secretion possible injected virulence protein (YopM)::internalin, putative::Type III secretion, hypothetical protein::ClpB protein::FIG00003370: Multicopper polyphenol oxidase::Outer membrane beta-barrel assembly protein BamD::Ribosomal large subunit pseudouridine synthase D (EC 5.4.99.23)::Ribosome-associated inhibitor A::Cold shock protein of CSP family => CspC (naming convention as in E.coli)::beta-galactosidase (EC 3.2.1.23)::FIG004453: protein YceG like::Thymidylate kinase (EC 2.7.4.9)::DNA polymerase III delta prime subunit (EC 2.7.7.7)::Uncharacterized metal-dependent hydrolase YcfH::Aminodeoxychorismate lyase (EC 4.1.3.38)::Virulence factor MviM::4-hydroxythreonine-4-phosphate dehydrogenase (EC 1.1.1.262)::SSU rRNA (adenine(1518)-N(6)/adenine(1519)-N(6))-dimethyltransferase (EC 2.1.1.182)::Bis(5'-nucleosyl)-tetraphosphatase, symmetrical (EC 3.6.1.41)::ApaG protein::DnaJ-like protein DjlA::Organic solvent tolerance protein precursor::Ribosomal large subunit pseudouridine(746) synthase (EC 5.4.99.29)::tRNA pseudouridine(32) synthase (EC 5.4.99.28)::Isoleucyl-tRNA synthetase (EC 6.1.1.5)::Riboflavin kinase (EC 2.7.1.26)::FMN adenylyltransferase (EC 2.7.7.2)::Lipoprotein signal peptidase (EC 3.4.23.36)::Chaperone protein DnaJ::Chaperone protein DnaK::Na+/H+ antiporter NhaA type::Transcriptional activator NhaR::4-hydroxy-3-methylbut-2-enyl diphosphate reductase (EC 1.17.7.4)::SSU ribosomal protein S20p::FKBP-type peptidyl-prolyl cis-trans isomerase SlpA (EC 5.2.1.8)::Hydroxymethylpyrimidine phosphate synthase ThiC (EC 4.1.99.17)::Thiamin-phosphate pyrophosphorylase (EC 2.5.1.3)::2-iminoacetate synthase (ThiH) (EC 4.1.99.19)::Thiazole synthase (EC 2.8.1.10)::Sulfur carrier protein ThiS adenylyltransferase (EC 2.7.7.73)::Sulfur carrier protein ThiS::IMP cyclohydrolase (EC 3.5.4.10)::Phosphoribosylaminoimidazolecarboxamide formyltransferase (EC 2.1.2.3)::Phosphoribosylamine--glycine ligase 
(EC 6.3.4.13)::DNA-binding protein HU-alpha::Endonuclease V (EC 3.1.21.7)::Uroporphyrinogen III decarboxylase (EC 4.1.1.37)::NADH pyrophosphatase, decaps 5'-NAD modified RNA (EC 3.6.1.22)::LSU ribosomal protein L11p (L12e)::LSU ribosomal protein L1p (L10Ae)::LSU ribosomal protein L10p (P0)::DNA-directed RNA polymerase beta subunit (EC 2.7.7.6)::DNA-directed RNA polymerase beta' subunit (EC 2.7.7.6)::LSU ribosomal protein L7p/L12p (P1/P2)::Translation elongation factor G::Translation elongation factor Tu::SSU ribosomal protein S7p (S5e)::Protein translocase subunit SecE::SSU ribosomal protein S12p (S23e)::Transcription antitermination protein NusG::tRNA 5-methylaminomethyl-2-thiouridine synthase subunit TusC::tRNA 5-methylaminomethyl-2-thiouridine synthase subunit TusD::tRNA 5-methylaminomethyl-2-thiouridine synthase subunit TusB::FKBP-type peptidyl-prolyl cis-trans isomerase FkpA precursor (EC 5.2.1.8)::Carbamoyl-phosphate synthase large chain (EC 6.3.5.5)::Transporter, LysE family::Carbamoyl-phosphate synthase small chain (EC 6.3.5.5)::Taurine-binding periplasmic protein TauA::Taurine transport ATP-binding protein TauB::Taurine transport system permease protein TauC::Phosphoribulokinase homolog, function unknown (EC 2.7.1.19)::Alpha-ketoglutarate-dependent taurine dioxygenase (EC 1.14.11.17)::4-hydroxy-tetrahydrodipicolinate reductase (EC 1.17.1.8)::Bis-ABC ATPase YheS::Glycosyltransferase (EC 2.4.1.-)::Glutathione-regulated potassium-efflux system ancillary protein KefG::Glutathione-regulated potassium-efflux system protein KefB::Cytoplasmic protein, probably associated with glutathione-regulated potassium-efflux::FKBP-type peptidyl-prolyl cis-trans isomerase SlyD (EC 5.2.1.8)::Glycosyl transferase, family 2::Cyclic AMP receptor protein::GDP-L-fucose synthetase (EC 1.1.1.271)::GDP-mannose 4,6-dehydratase (EC 4.2.1.47)::Fumarate hydratase class II (EC 4.2.1.2)::Mannose-6-phosphate isomerase (EC 5.3.1.8)::Mannose-1-phosphate guanylyltransferase (EC 
2.7.7.13)::5'-nucleotidase (EC 3.1.3.5)::Inner membrane protein YbaL, KefB/KefC family::Cys-tRNA(Pro) deacylase YbaK::O-antigen chain length determinant protein WzzB::Phosphomannomutase (EC 5.4.2.8)::Dihydrofolate reductase (EC 1.5.1.3)::Anaerobic glycerol-3-phosphate dehydrogenase subunit A (EC 1.1.5.3)::Anaerobic glycerol-3-phosphate dehydrogenase subunit B (EC 1.1.5.3)::Anaerobic glycerol-3-phosphate dehydrogenase subunit C (EC 1.1.5.3)::Copper-translocating P-type ATPase (EC 3.6.3.4)::Lead, cadmium, zinc and mercury transporting ATPase (EC 3.6.3.5) (EC 3.6.3.3)::Putative activity regulator of membrane protease YbbK::Putative preQ0 transporter YhhQ::tRNA 5-methylaminomethyl-2-thiouridine synthesis sulfur carrier protein TusA::Protein QmcA (possibly involved in integral membrane quality control)::FIG000875: Thioredoxin domain-containing protein EC-YbbN::Transketolase (EC 2.2.1.1)::DnaA inactivator Hda (shorter homolog of DnaA)::Aminopeptidase YpdF (MP-, MA-, MS-, AP-, NP- specific)::Maltodextrin phosphorylase (EC 2.4.1.1)::Transcriptional activator of maltose regulon, MalT::4-alpha-glucanotransferase (amylomaltase) (EC 2.4.1.25)::Thiosulfate sulfurtransferase GlpE (EC 2.8.1.1)::Glycerol-3-phosphate regulon repressor GlpR::Ferrous iron transport protein A::Ferrous iron transport protein B::Osmolarity sensory histidine kinase EnvZ::Transcription elongation factor GreB::Two-component system response regulator OmpR::Transcription accessory protein (S1 RNA-binding domain)::Competence protein F homolog, phosphoribosyltransferase domain::[4Fe-4S] cluster carrier protein NfuA::Pimeloyl-[acyl-carrier protein] methyl ester esterase BioH (EC 3.1.1.85)::Phosphoenolpyruvate carboxykinase [ATP] (EC 4.1.1.49)::Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog)::Acetate kinase (EC 2.7.2.1)::Phosphate acetyltransferase (EC 2.3.1.8)::PTS system, IIA component (EC 2.7.1.69 )::Gamma-D-Glutamyl-meso-Diaminopimelate 
Amidase::L-alanine-DL-glutamate epimerase (EC 5.1.1.n1)::Magnesium and cobalt transport protein CorA::Murein peptide ABC transporter, substrate-binding protein (requires DppBCDF)::Thiol peroxidase, Tpx-type (EC 1.11.1.15)::Transcriptional repressor protein TyrR::Glycerol-3-phosphate ABC transporter, permease protein UgpA (TC 3.A.1.1.3)::Glycerol-3-phosphate ABC transporter, substrate-binding protein UgpB::Glycerol-3-phosphate ABC transporter, permease protein UgpE (TC 3.A.1.1.3)::Glycerol-3-phosphate ABC transporter, ATP-binding protein UgpC (TC 3.A.1.1.3)::Cytosol nonspecific dipeptidase (EC 3.4.13.18)::DNA polymerase IV (EC 2.7.7.7)::Possible ABC transporter, periplasmic substrate X binding protein precursor::Putative ABC transporter of substrate X, permease subunit II::Glycerophosphoryl diester phosphodiesterase (EC 3.1.4.46)::Putative ABC transporter of substrate X, ATP-binding subunit::Putative ABC transporter of substrate X, permease subunit I::Na(+)-translocating NADH-quinone reductase subunit A (EC 1.6.5.-)::Na(+)-translocating NADH-quinone reductase subunit B (EC 1.6.5.-)::Na(+)-translocating NADH-quinone reductase subunit C (EC 1.6.5.-)::Na(+)-translocating NADH-quinone reductase subunit D (EC 1.6.5.-)::Na(+)-translocating NADH-quinone reductase subunit E (EC 1.6.5.-)::Na(+)-translocating NADH-quinone reductase subunit F (EC 1.6.5.-)::Ferric enterobactin-binding periplasmic protein FepB (TC 3.A.1.14.2)::Formaldehyde activating enzyme::ATP-dependent Clp protease ATP-binding subunit ClpA::ATP-dependent Clp protease adaptor protein ClpS::Macrolide export ATP-binding/permease protein MacB (EC 3.6.3.-)::Macrolide-specific efflux protein MacA::CoA-disulfide reductase (EC 1.8.1.14)::Polysulfide binding and transferase domain::Cold shock protein of CSP family => CspD (naming convention as in E.coli)::Virulence factor VirK::Predicted ATP-dependent endonuclease of the OLD family, YbjD subgroup::Formate efflux transporter (TC 2.A.44 family)::Pyruvate formate-lyase 
(EC 2.3.1.54)::L-asparaginase (EC 3.5.1.1)::Ribosomal protein S12p methylthiotransferase accessory factor YcaO::Pyruvate formate-lyase activating enzyme (EC 1.97.1.4)::DNA translocase FtsK::Outer membrane lipoprotein carrier protein LolA::Leucine-responsive regulatory protein, regulator for leucine (or lrp) regulon and high-affinity branched-chain amino acid transport system::Seryl-tRNA synthetase (EC 6.1.1.11)::Efflux ABC transporter for glutathione/L-cysteine, essential for assembly of bd-type respiratory oxidases => CydC subunit::Efflux ABC transporter for glutathione/L-cysteine, essential for assembly of bd-type respiratory oxidases => CydD subunit::Leucyl/phenylalanyl-tRNA--protein transferase (EC 2.3.2.6)::Thioredoxin reductase (EC 1.8.1.9)::Translation initiation factor 1::Aerobic cobaltochelatase CobT subunit (EC 6.6.1.2)::Arginine decarboxylase, catabolic (EC 4.1.1.19)::Arginine/agmatine antiporter::Osmoprotectant ABC transporter binding protein YehZ::Osmoprotectant ABC transporter permease protein YehY::Osmoprotectant ABC transporter ATP-binding subunit YehX::Osmoprotectant ABC transporter inner membrane protein YehW::ADP-ribose pyrophosphatase of COG1058 family (EC 3.6.1.13)::Nicotinamide-nucleotide amidase paralog YfaY, no functional activity::2-polyprenyl-6-hydroxyphenyl methylase (EC 2.1.1.222)::3-demethylubiquinol 3-O-methyltransferase (EC 2.1.1.64)::Ribonucleotide reductase of class Ia (aerobic), alpha subunit (EC 1.17.4.1)::Ribonucleotide reductase of class Ia (aerobic), beta subunit (EC 1.17.4.1)::DNA gyrase subunit A (EC 5.99.1.3)::DNA-binding capsular synthesis response regulator RcsB::Phosphotransferase RcsD::Sensor histidine kinase RcsC (EC 2.7.13.3)::ATP-dependent helicase DinG/Rad3::Catalase KatE (EC 1.11.1.6)::Outer membrane porin PhoE::Tyrosine-specific transport protein::Uncharacterized outer membrane protein YfaZ::Elongation factor P-like protein::Mannonate dehydratase (EC 4.2.1.8)::D-mannonate oxidoreductase (EC 1.1.1.57)::Uxu operon 
transcriptional regulator::NADPH-dependent 7-cyano-7-deazaguanine reductase (EC 1.7.1.13)::DNA-binding domain of ModE::Molybdate-binding domain of ModE::Ferric iron ABC transporter, iron-binding protein::Membrane-bound lytic murein transglycosylase B (EC 3.2.1.-)::ABC transporter ATP-binding protein (EC 3.6.3.- )::Ferric iron ABC transporter, permease protein::Ferric iron ABC transporter, ATP-binding protein::Dethiobiotin synthetase (EC 6.3.3.3)::Acetolactate synthase large subunit (EC 2.2.1.6)::Acetolactate synthase small subunit (EC 2.2.1.6)::Long-chain-fatty-acid--CoA ligase (EC 6.2.1.3)::Ribonuclease D (EC 3.1.26.3)::Pyrimidine 5'-nucleotidase YjjG (EC 3.1.3.5)::Dihydroxy-acid dehydratase (EC 4.2.1.9)::Threonine dehydratase biosynthetic (EC 4.3.1.19)::Branched-chain amino acid aminotransferase (EC 2.6.1.42)::Cell division topological specificity factor MinE::Septum site-determining protein MinC::Septum site-determining protein MinD::Excinuclease ABC subunit B::LysR family transcriptional regulator YnfL::6-phosphogluconolactonase (EC 3.1.1.31)::Adenosylmethionine-8-amino-7-oxononanoate aminotransferase (EC 2.6.1.62)::Biotin synthase (EC 2.8.1.6)::8-amino-7-oxononanoate synthase (EC 2.3.1.47)::Malonyl-[acyl-carrier protein] O-methyltransferase (EC 2.1.1.197)::Molybdenum ABC transporter ATP-binding protein ModC::Molybdenum ABC transporter permease protein ModB::Molybdenum ABC transporter, substrate-binding protein ModA::AcrZ membrane protein associated with AcrAB-TolC multidrug efflux pump::Ornithine cyclodeaminase (EC 4.3.1.12)::Cyclopropane-fatty-acyl-phospholipid synthase (EC 2.1.1.79)::Multidrug efflux transporter MdtK/NorM (MATE family)::Riboflavin synthase eubacterial/eukaryotic (EC 2.5.1.9)::2-dehydro-3-deoxyphosphogluconate aldolase (EC 4.1.2.14)::4-hydroxy-2-oxoglutarate aldolase (EC 4.1.3.16)::Glucose-6-phosphate 1-dehydrogenase (EC 1.1.1.49)::DinG family ATP-dependent helicase YoaA::tRNA threonylcarbamoyladenosine biosynthesis protein 
TsaB::RidA/YER057c/UK114 superfamily, group 2, YoaB-like protein::Spermidine export protein MdtI::Spermidine export protein MdtJ::Pyruvate kinase (EC 2.7.1.40)::Glycerol-3-phosphate dehydrogenase (EC 1.1.5.3)
2 40 Histidinol dehydrogenase (EC 1.1.1.23)::Histidinol-phosphate aminotransferase (EC 2.6.1.9)::Histidinol-phosphatase (EC 3.1.3.15)::Imidazoleglycerol-phosphate dehydratase (EC 4.2.1.19)::ATP phosphoribosyltransferase => HisGl (EC 2.4.2.17)::Imidazole glycerol phosphate synthase amidotransferase subunit (EC 2.4.2.-)::Imidazole glycerol phosphate synthase cyclase subunit (EC 4.1.3.-)::Phosphoribosylformimino-5-aminoimidazole carboxamide ribotide isomerase (EC 5.3.1.16)::Iron-siderophore [Alcaligin-like] transport system, permease component::Iron-siderophore transport system, permease component::Iron-siderophore [Alcaligin-like] transport system, substrate-binding component::Iron-siderophore transport system, substrate-binding component::Siderophore [Alcaligin-like] biosynthesis complex, long chain::Siderophore synthetase component, ligase::Iron-siderophore [Alcaligin-like] receptor::TonB-dependent siderophore receptor::Aerobactin siderophore receptor IutA::Siderophore [Alcaligin-like] biosynthesis complex, medium chain::Siderophore synthetase large component, acetyltransferase::6-phosphogluconate dehydrogenase, decarboxylating (EC 1.1.1.44)::UTP--glucose-1-phosphate uridylyltransferase (EC 2.7.7.9)::Galactose permease::Siderophore [Alcaligin-like] decarboxylase (EC 4.1.1.-)::Siderophore biosynthesis L-2,4-diaminobutyrate decarboxylase::N(2)-citryl-N(6)-acetyl-N(6)-hydroxylysine synthase, aerobactin biosynthesis protein IucA (EC 6.3.2.38)::Possible H+-antiporter clustered with aerobactin genes::Siderophore biosynthesis protein, monooxygenase::Siderophore synthetase small component, acetyltransferase::N6-hydroxylysine O-acetyltransferase, aerobactin biosynthesis protein IucB (EC 2.3.1.102)::L-lysine 6-monooxygenase [NADPH], aerobactin biosynthesis protein IucD (EC 1.14.13.59)::Siderophore [Alcaligin-like] biosynthetic enzyme (EC 1.14.13.59)::Aerobactin synthase, aerobactin biosynthesis protein IucC (EC 6.3.2.39)::Iron-siderophore [Alcaligin-like] transport system, 
ATP-binding component::Iron-siderophore transport system, ATP-binding component::Siderophore [Alcaligin-like] biosynthesis complex, short chain::Phosphoribosyl-AMP cyclohydrolase (EC 3.5.4.19)::Phosphoribosyl-ATP pyrophosphatase (EC 3.6.1.31)::Iron-siderophore [Alcaligin-like] transport system, transmembran component::Iron-siderophore transport system, transmembran component::Iron-siderophore [Alcaligin-like] ferric reductase (1.6.99.14)
3 30 Cytochrome O ubiquinol oxidase subunit I (EC 1.10.3.-)::Cytochrome O ubiquinol oxidase subunit III (EC 1.10.3.-)::Cytochrome O ubiquinol oxidase subunit IV (EC 1.10.3.-)::Heme O synthase, protoheme IX farnesyltransferase COX10-CtaB (EC 2.5.1.-)::1-deoxy-D-xylulose 5-phosphate synthase (EC 2.2.1.7)::Dimethylallyltransferase (EC 2.5.1.1)::(2E,6E)-farnesyl diphosphate synthase (EC 2.5.1.10)::Rhodanese-like domain required for thiamine synthesis::tRNA 4-thiouridine synthase (EC 2.8.1.4)::DJ-1/YajL/PfpI superfamily, includes chaperone protein YajL (former ThiJ), parkinsonism-associated protein DJ-1, peptidases PfpI, Hsp31::2-dehydropantoate 2-reductase (EC 1.1.1.169)::UPF0234 protein Yitk::Exodeoxyribonuclease VII small subunit (EC 3.1.11.6)::ATP-dependent Clp protease ATP-binding subunit ClpX::ATP-dependent protease La Type I (EC 3.4.21.53)::DNA-binding protein HU-beta::Cell division protein BolA::Cell division trigger factor (EC 5.2.1.8)::ATP-dependent Clp protease proteolytic subunit (EC 3.4.21.92)::AmpG permease::Cytochrome O ubiquinol oxidase subunit II (EC 1.10.3.-)::7-cyano-7-deazaguanine synthase (EC 6.3.4.20)::Peptidyl-prolyl cis-trans isomerase PpiD (EC 5.2.1.8)::5-amino-6-(5-phosphoribosylamino)uracil reductase (EC 1.1.1.193)::Diaminohydroxyphosphoribosylaminopyrimidine deaminase (EC 3.5.4.26)::6,7-dimethyl-8-ribityllumazine synthase (EC 2.5.1.78)::Ribonucleotide reductase transcriptional regulator NrdR::Transcription termination protein NusB::Phosphatidylglycerophosphatase A (EC 3.1.3.27)::Thiamine-monophosphate kinase (EC 2.7.4.16)
4 26 Acetylglutamate kinase (EC 2.7.2.8)::N-acetyl-gamma-glutamyl-phosphate reductase (EC 1.2.1.38)::Acetylornithine deacetylase (EC 3.5.1.16)::ABC exporter for hemopore HasA, ATP-binding component HasD::Type I secretion system ATPase::ABC exporter for hemopore HasA, membrane fusion protein (MFP) family component HasE::Type I secretion membrane fusion protein, HlyD family::Dihydrolipoamide dehydrogenase (EC 1.8.1.4)::Dihydrolipoamide dehydrogenase of pyruvate dehydrogenase complex (EC 1.8.1.4)::Dihydrolipoamide acetyltransferase component of pyruvate dehydrogenase complex (EC 2.3.1.12)::Ferric siderophore transport system, periplasmic binding protein TonB::Pyruvate dehydrogenase E1 component (EC 1.2.4.1)::Hemophore HasA::Hydrogen peroxide-inducible genes activator => OxyR::Soluble pyridine nucleotide transhydrogenase (EC 1.6.1.1)::Transcriptional repressor for pyruvate dehydrogenase complex::Glutamate racemase (EC 5.1.1.3)::Outer membrane vitamin B12 receptor BtuB::tRNA (uracil(54)-C5)-methyltransferase (EC 2.1.1.35)::tmRNA (uracil(341)-C5)-methyltransferase::YciL protein::Hemophore HasA outer membrane receptor HasR::Iron siderophore receptor protein::Argininosuccinate lyase (EC 4.3.2.1)::Phosphoenolpyruvate carboxylase (EC 4.1.1.31)::Aconitate hydratase 2 (EC 4.2.1.3)
5 23 Iron siderophore ABC transporter, permease/ATP-binding protein YbtP::Iron siderophore ABC transporter, permease/ATP-binding protein YbtQ::iron aquisition regulator (YbtA,AraC-like,required for transcription of FyuA/psn,Irp2)::Salicylate synthetase (EC 4.2.99.21) (EC 5.4.4.2)::Salicylate synthetase (EC 4.2.99.21) of siderophore biosynthesis (EC 5.4.4.2)::Stresses-induced protein Ves (HutD)::Formiminoglutamic iminohydrolase (EC 3.5.3.13)::Histidine utilization repressor::Imidazolonepropionase (EC 3.5.2.7)::N-formylglutamate deformylase (EC 3.5.1.68)::Siderophore biosynthesis non-ribosomal peptide synthetase modules::iron aquisition yersiniabactin synthesis enzyme (Irp1,polyketide synthetase)::iron aquisition yersiniabactin synthesis enzyme (Irp2)::2,3-dihydroxybenzoate-AMP ligase of siderophore biosynthesis (EC 2.7.7.58)::Thiazolinyl imide reductase in siderophore biosynthesis gene cluster::Thioesterase in siderophore biosynthesis gene cluster::iron aquisition 2,3-dihydroxybenzoate-AMP ligase (EC 2.7.7.58,Irp5)::Putative reductoisomerase in siderophore biosynthesis gene cluster::Yersiniabactin synthetase, thiazolinyl reductase component Irp3::Outer membrane receptor for ferric siderophore::iron aquisition outermembrane yersiniabactin receptor (FyuA,Psn,pesticin receptor)::Iron aquisition yersiniabactin synthesis enzyme YbtT::Putative ABC iron siderophore transporter, fused permease and ATPase domains
6 22 Diacylglycerol kinase (EC 2.7.1.107)::Glycerol-3-phosphate acyltransferase (EC 2.3.1.15)::4-hydroxybenzoate polyprenyltransferase (EC 2.5.1.39)::SOS-response repressor and protease LexA (EC 3.4.21.88)::Chorismate--pyruvate lyase (EC 4.1.3.40)::Alanine racemase (EC 5.1.1.1)::Biosynthetic Aromatic amino acid aminotransferase alpha (EC 2.6.1.57)::Replicative DNA helicase (DnaB) (EC 3.6.4.12)::Quinone oxidoreductase (EC 1.6.5.5)::Zinc uptake regulation protein Zur::tRNA-dihydrouridine(20/20a) synthase (EC 1.3.1.91)::Excinuclease ABC subunit A::Lactaldehyde dehydrogenase involved in fucose or rhamnose utilization (EC 1.2.1.22)::Rhamnulose-1-phosphate aldolase (EC 4.1.2.19)::L-rhamnose isomerase (EC 5.3.1.14)::L-rhamnose mutarotase (EC 5.1.3.n3)::L-rhamnose operon transcriptional activator RhaR::L-rhamnose-proton symporter::L-rhamnose operon regulatory protein RhaS::Rhamnulokinase (EC 2.7.1.5)::Single-stranded DNA-binding protein::IncN plasmid KikA protein
7 22 Alpha-D-ribose 1-methylphosphonate 5-triphosphate diphosphatase (EC 3.6.1.63)::Ribose 1,5-bisphosphate phosphokinase PhnN (EC 2.7.4.23)::Alpha-D-ribose 1-methylphosphonate 5-phosphate C-P lyase (EC 4.7.1.1)::Phosphonates utilization ATP-binding protein PhnK::Alpha-D-ribose 1-methylphosphonate 5-triphosphate synthase subunit PhnI (EC 2.7.8.37)::Alpha-D-ribose 1-methylphosphonate 5-triphosphate synthase subunit PhnL (EC 2.7.8.37)::Alpha-D-ribose 1-methylphosphonate 5-triphosphate synthase subunit PhnH (EC 2.7.8.37)::Alpha-D-ribose 1-methylphosphonate 5-triphosphate synthase subunit PhnG (EC 2.7.8.37)::Ribonucleotide reductase of class III (anaerobic), activating protein (EC 1.97.1.4)::Transcriptional regulator PhnF::ABC transporter, ATP-binding protein (cluster 5, nickel/peptides/opines)::ABC transporter, permease protein 2 (cluster 5, nickel/peptides/opines)::Ribonucleotide reductase of class III (anaerobic), large subunit (EC 1.17.4.2)::ABC transporter, permease protein 1 (cluster 5, nickel/peptides/opines)::ABC transporter, substrate-binding protein (cluster 5, nickel/peptides/opines)::Ornithine carbamoyltransferase (EC 2.1.3.3)::Ribonuclease E inhibitor RraB::Lipopolysaccharide export system permease protein LptF::Lipopolysaccharide export system permease protein LptG::Cytosol aminopeptidase PepA (EC 3.4.11.1)::DNA polymerase III chi subunit (EC 2.7.7.7)::Valyl-tRNA synthetase (EC 6.1.1.9)
8 21 Phospholipid ABC transporter shuttle protein MlaC::UDP-N-acetylglucosamine 1-carboxyvinyltransferase (EC 2.5.1.7)::Phospholipid ABC transporter permease protein MlaE::Phospholipid ABC transporter substrate-binding protein MlaD::Phospholipid ABC transporter ATP-binding protein MlaF::Phospholipid ABC transporter-binding protein MlaB::D-arabinose 5-phosphate isomerase (EC 5.3.1.13)::PTS IIA-like nitrogen-regulatory protein PtsN::Ribosome hibernation promoting factor Hpf::RNase adapter protein RapZ::Phosphocarrier protein, nitrogen regulation associated::RNA polymerase sigma-54 factor RpoN::Lipopolysaccharide ABC transporter, ATP-binding protein LptB::Aspartate carbamoyltransferase (EC 2.1.3.2)::Aspartate carbamoyltransferase regulatory chain (PyrI)::2-iminobutanoate/2-iminopropanoate deaminase RidA/TdcF (EC 3.5.99.10)::Lipopolysaccharide export system protein LptC::3-deoxy-D-manno-octulosonate 8-phosphate phosphatase (EC 3.1.3.45)::PTS system, trehalose-specific IIB component (EC 2.7.1.201)::PTS system, trehalose-specific IIC component (EC 2.7.1.69)::Inhibitor of invertebrate i-type lysozyme, periplasmic => PliI
9 20 Sensory histidine kinase in two-component regulatory system with RstA::Thermostable carboxypeptidase 1 (EC 3.4.17.19)::Transcriptional regulatory protein RstA::Arginine/ornithine antiporter ArcD::NAD(P) transhydrogenase alpha subunit (EC 1.6.1.2)::NAD(P) transhydrogenase subunit beta (EC 1.6.1.2)::ADA regulatory protein (EC 2.1.1.63 )::Methylated-DNA--protein-cysteine methyltransferase (EC 2.1.1.63)::Copper sensory histidine kinase CpxA::Copper-sensing two-component system response regulator CpxR::Alkaline phosphatase (EC 3.1.3.1)::Universal stress protein E::Glycerol-3-phosphate dehydrogenase [NAD(P)+] (EC 1.1.1.94)::Protein-export protein SecB (maintains pre-export unfolded state)::2,3-bisphosphoglycerate-independent phosphoglycerate mutase (EC 5.4.2.12)::Murein hydrolase activator EnvC::Rhodanese-related sulfurtransferase YibN::Glutaredoxin 3 (Grx3)::Serine acetyltransferase (EC 2.3.1.30)::tRNA (cytidine(34)-2'-O)-methyltransferase (EC 2.1.1.207)
10 19 IncF plasmid conjugative transfer pilus assembly protein TraF::IncF plasmid conjugative transfer pilus assembly protein TraH::IncF plasmid conjugative transfer protein TrbB::IncF plasmid conjugative transfer protein TraG::IncF plasmid conjugative transfer pilus assembly protein TraU::IncF plasmid conjugative transfer protein TraN::IncF plasmid conjugative transfer pilus assembly protein TraC::IncF plasmid conjugative transfer protein TrbI::IncF plasmid conjugative transfer pilus assembly protein TraW::IncF plasmid conjugative transfer protein TrbC::IncF plasmid conjugative transfer pilus assembly protein TraB::IncF plasmid conjugative transfer pilus assembly protein TraK::IncF plasmid conjugative transfer pilus assembly protein TraE::IncF plasmid conjugative transfer pilus assembly protein TraL::IncF plasmid conjugative transfer pilin protein TraA::IncF plasmid conjugative transfer pilus assembly protein TraV::IncF plasmid conjugative transfer DNA-nicking and unwinding protein TraI::IncF plasmid conjugative transfer protein TraD::IncF plasmid conjugative transfer pilin acetylase TraX
11 17 Adenylosuccinate lyase (EC 4.3.2.2)::SAICAR lyase (EC 4.3.2.2)::Isocitrate dehydrogenase [NADP] (EC 1.1.1.42)::Ribosomal large subunit pseudouridine synthase E (EC 5.4.99.20)::tRNA-specific 2-thiouridylase MnmA (EC 2.8.1.13)::Nudix-like NDP and NTP phosphohydrolase NudJ::Sensor histidine kinase PhoQ (EC 2.7.13.3)::Transcriptional regulatory protein PhoP::LSU ribosomal protein L16p arginine hydroxylase::Ynd::Lipoprotein releasing system transmembrane protein LolE::N-acetyl-D-glucosamine kinase (EC 2.7.1.59)::NAD-dependent protein deacetylase of SIR2 family::Tripeptide aminopeptidase (EC 3.4.11.4)::Lipoprotein-releasing system ATP-binding protein LolD::Lipoprotein releasing system transmembrane protein LolC::Transcription-repair coupling factor
The output contains one very large group plus around one hundred small ones. We have our clusters, and we are now in a position to look for hypothetical proteins embedded in them.
Finding the Hypotheticals¶
Our goal of finding the hypothetical proteins embedded in the role clusters is accomplished in two steps.
1. Find the start and end points of each cluster in our target genomes.
2. Look between those points for hypothetical proteins.
The script p3-identify-clusters performs the first task. It reads our file of roles in the genomes (feature.roles.tbl) from the standard input and takes the cluster file produced by p3-generate-clusters (clusters.tbl) as a reference input on the command line. The command is as follows.
p3-identify-clusters --showRoles clusters.tbl <feature.roles.tbl >real.clusters.tbl
The output tells us which clusters were found where, and which roles were involved. It will look something like this.
cluster_id genome_id sequence_id start end roles
2 1035377.4 CP002956 1961 9099 L-lysine 6-monooxygenase [NADPH], aerobactin biosynthesis protein IucD (EC 1.14.13.59)::Siderophore biosynthesis protein, monooxygenase::Aerobactin synthase, aerobactin biosynthesis protein IucC (EC 6.3.2.39)::N6-hydroxylysine O-acetyltransferase, aerobactin biosynthesis protein IucB (EC 2.3.1.102)::Siderophore synthetase small component, acetyltransferase::N(2)-citryl-N(6)-acetyl-N(6)-hydroxylysine synthase, aerobactin biosynthesis protein IucA (EC 6.3.2.38)::N(2)-citryl-N(6)-acetyl-N(6)-hydroxylysine synthase, aerobactin biosynthesis protein IucA (EC 6.3.2.38)::Possible H+-antiporter clustered with aerobactin genes
1 1035377.4 CP002956 48338 55756 Ferrichrome-iron receptor::ABC transporter, substrate-binding protein (cluster 8, B12/iron complex)::FIG001341: Probable Fe(2+)-trafficking protein YggX::A/G-specific adenine glycosylase (EC 3.2.2.-)::tRNA (guanine(46)-N(7))-methyltransferase (EC 2.1.1.33)
15 1035377.4 CP002956 67985 84937 Radical SAM family enzyme, similar to coproporphyrinogen III oxidase, oxygen-independent, clustered with nucleoside-triphosphatase RdgB::Nucleoside 5-triphosphatase RdgB (dHAPTP, dITP, XTP-specific) (EC 3.6.1.15)::Cell division integral membrane protein, YggT and half-length relatives::Pyrroline-5-carboxylate reductase (EC 1.5.1.2)::UPF0001 protein YggS::Twitching motility protein PilT::Agmatine deiminase (EC 3.5.3.12)::N-carbamoylputrescine amidase (EC 3.5.1.53)::Putative pre-16S rRNA nuclease::UPF0301 protein YqgE::Glutathione synthetase (EC 6.3.2.3)::16S rRNA (uracil(1498)-N(3))-methyltransferase (EC 2.1.1.193)::S-adenosylmethionine synthetase (EC 2.5.1.6)::Biosynthetic arginine decarboxylase (EC 4.1.1.19)
1 1035377.4 CP002956 92010 113986 D-erythrose-4-phosphate dehydrogenase (EC 1.2.1.72)::Phosphoglycerate kinase (EC 2.7.2.3)::Fructose-bisphosphate aldolase class II (EC 4.1.2.13)::Arginine exporter protein ArgO::Transcriptional regulator ArgP, LysR family::Ribose 5-phosphate isomerase A (EC 5.3.1.6)::D-3-phosphoglycerate dehydrogenase (EC 1.1.1.95)::5-formyltetrahydrofolate cyclo-ligase (EC 6.3.3.2)::Xaa-Pro aminopeptidase (EC 3.4.11.9)::2-polyprenyl-6-methoxyphenol hydroxylase::2-octaprenylphenol hydroxylase::Aminomethyltransferase (glycine cleavage system T protein) (EC 2.1.2.10)::Glycine cleavage system H protein::Glycine dehydrogenase [decarboxylating] (glycine cleavage system P protein) (EC 1.4.4.2)
23 1035377.4 CP002956 122234 135177 Succinate dehydrogenase flavin-adding protein, antitoxin of CptAB toxin-antitoxin::Response regulator CreB of two-component signal transduction system CreBC::Sensory histidine kinase CreC of two-component signal transduction system CreBC::Inner membrane protein CreD::Flavodoxin 2::Site-specific tyrosine recombinase XerD::Thiol:disulfide interchange protein DsbC::Single-stranded-DNA-specific exonuclease RecJ (EC 3.1.-.-)::Peptide chain release factor 2::programmed frameshift-containing::Lysyl-tRNA synthetase (class II) (EC 6.1.1.6)::Phage integrase, Phage P4-associated
72 1035377.4 CP002956 137985 141261 Zinc binding domain::DNA primase, Phage P4-associated (EC 2.7.7.-)::Replicative helicase RepA, Phage P4-associated::FIG033266: Phage DNA binding protein
1 1035377.4 CP002956 184662 193899 Rhamnogalacturonides degradation protein RhiN::2-dehydro-3-deoxy-D-gluconate 5-dehydrogenase (EC 1.1.1.127)::2-deoxy-D-gluconate 3-dehydrogenase (EC 1.1.1.125)::N-acetylgalactosamine-6-phosphate deacetylase (EC 3.5.1.25)::PTS system, N-acetylgalactosamine-specific IIA component::PTS system, galactosamine-specific IIA component (EC 2.7.1.69)::PTS system, N-acetylgalactosamine-specific IID component (EC 2.7.1.69)::PTS system, N-acetylgalactosamine-specific IIC component (EC 2.7.1.69)::PTS system, N-acetylgalactosamine-specific IIB component (EC 2.7.1.69)::Galactosamine-6-phosphate isomerase AgaS::D-tagatose-1,6-bisphosphate aldolase subunit KbaZ::Transcriptional repressor of aga operon
14 1035377.4 CP002956 207180 219726 Carbonic anhydrase, beta class (EC 4.2.1.1)::General secretion pathway protein C::General secretion pathway protein D::General secretion pathway protein E::General secretion pathway protein F::General secretion pathway protein G::General secretion pathway protein I::General secretion pathway protein J::General secretion pathway protein K::General secretion pathway protein L::General secretion pathway protein M::Leader peptidase (Prepilin peptidase) (EC 3.4.23.43)::N-methyltransferase (EC 2.1.1.-)
46 1035377.4 CP002956 223321 233505 Thioredoxin-like protein clustered with PA0057::Metallo-beta-lactamase superfamily protein PA0057::LysR-family transcriptional regulator clustered with PA0057::Transcriptional activator protein LysR::Diaminopimelate decarboxylase (EC 4.1.1.20)::Galactose operon repressor, GalR-LacI family of transcriptional regulators::Free methionine-(S)-sulfoxide reductase
5 1035377.4 CP002956 248492 270247 Siderophore biosynthesis non-ribosomal peptide synthetase modules::2,3-dihydroxybenzoate-AMP ligase of siderophore biosynthesis (EC 2.7.7.58)::Siderophore biosynthesis non-ribosomal peptide synthetase modules::Putative reductoisomerase in siderophore biosynthesis gene cluster::Putative reductoisomerase in siderophore biosynthesis gene cluster::Thiazolinyl imide reductase in siderophore biosynthesis gene cluster::Thioesterase in siderophore biosynthesis gene cluster::Putative ABC iron siderophore transporter, fused permease and ATPase domains::Putative ABC iron siderophore transporter, fused permease and ATPase domains
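Each row of this output packs several roles into the last column, separated by a double colon. As a rough illustration of how such a row could be unpacked for further processing, here is a minimal Python sketch. It assumes the file is tab-delimited; the names ClusterHit and parse_cluster_line are invented for this example and are not part of the p3-scripts.

```python
from typing import NamedTuple


class ClusterHit(NamedTuple):
    """One row of real.clusters.tbl (hypothetical helper type)."""
    cluster_id: str
    genome_id: str
    sequence_id: str
    start: int
    end: int
    roles: list


def parse_cluster_line(line):
    """Parse one data row, assuming six tab-delimited columns."""
    cluster_id, genome_id, sequence_id, start, end, roles = (
        line.rstrip("\n").split("\t", 5)
    )
    # Roles are joined with "::"; a single colon inside a role is left alone.
    return ClusterHit(cluster_id, genome_id, sequence_id,
                      int(start), int(end), roles.split("::"))


# The cluster 72 row from the sample output above.
row = ("72\t1035377.4\tCP002956\t137985\t141261\t"
       "Zinc binding domain::DNA primase, Phage P4-associated (EC 2.7.7.-)::"
       "Replicative helicase RepA, Phage P4-associated::"
       "FIG033266: Phage DNA binding protein")
hit = parse_cluster_line(row)
print(hit.cluster_id, hit.end - hit.start + 1, len(hit.roles))
```

For the cluster 72 row this prints the cluster ID, a span of 3277 base pairs, and 4 roles.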
The hypothetical proteins can be found in the file features.tbl generated way back at the beginning of this tutorial. We want to compare them to the cluster locations computed above, but first we have to extract them from the file. This is accomplished using p3-match.
p3-match --col=product "hypothetical protein" <features.tbl >hypotheticals.tbl
The result is a subset of features.tbl consisting only of records that have the value hypothetical protein in the product column. About 296,000 proteins are filtered into the output file, which looks something like this.
genome.genome_id feature.patric_id feature.sequence_id feature.location feature.product
385964.3 fig|385964.3.peg.1003 JMUF01000016 29951..30067 hypothetical protein
385964.3 fig|385964.3.peg.101 JMUF01000001 complement(90500..90613) hypothetical protein
385964.3 fig|385964.3.peg.1035 JMUF01000017 17773..17895 hypothetical protein
385964.3 fig|385964.3.peg.1055 JMUF01000017 complement(42310..42438) hypothetical protein
385964.3 fig|385964.3.peg.1101 JMUF01000018 complement(39479..39601) hypothetical protein
385964.3 fig|385964.3.peg.1144 JMUF01000019 39640..39753 hypothetical protein
385964.3 fig|385964.3.peg.1146 JMUF01000019 complement(41198..41398) hypothetical protein
385964.3 fig|385964.3.peg.1200 JMUF01000021 19613..19729 hypothetical protein
385964.3 fig|385964.3.peg.1207 JMUF01000021 complement(26958..27077) hypothetical protein
385964.3 fig|385964.3.peg.1218 JMUF01000021 complement(36030..36161) hypothetical protein
385964.3 fig|385964.3.peg.1223 JMUF01000021 complement(39935..40057) hypothetical protein
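The filtering step can be pictured as a simple column test on a tab-delimited file. The Python sketch below is only an approximation of that idea: it uses exact equality on the product column, whereas the real p3-match may apply looser matching rules. The function name match_column and the second sample row are invented for illustration.

```python
import csv
import io


def match_column(lines, column, value):
    """Yield the header row, then every row whose `column` equals `value`.

    `column` is compared with the part of each header after the last dot,
    so "product" selects "feature.product" (an assumption of this sketch).
    """
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    names = [name.rsplit(".", 1)[-1] for name in header]
    idx = names.index(column)
    yield header
    for row in reader:
        if row[idx] == value:
            yield row


sample = io.StringIO(
    "genome.genome_id\tfeature.patric_id\tfeature.sequence_id\t"
    "feature.location\tfeature.product\n"
    "385964.3\tfig|385964.3.peg.1003\tJMUF01000016\t29951..30067\t"
    "hypothetical protein\n"
    # Invented non-hypothetical row, included only for contrast.
    "385964.3\tfig|385964.3.peg.9999\tJMUF01000016\t100..400\t"
    "Pyruvate kinase (EC 2.7.1.40)\n"
)
rows = list(match_column(sample, "product", "hypothetical protein"))
for row in rows:
    print("\t".join(row))
```

Only the header and the peg.1003 row survive the filter.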
This filtered set of hypothetical proteins is fed as input to p3-find-in-clusters, which compares the feature locations to the cluster locations and outputs a record if there is a match.
p3-find-in-clusters --col=patric_id real.clusters.tbl <hypotheticals.tbl >clustered.hypo.tbl
Note that the p3-match and p3-find-in-clusters steps could have been combined into a single pipeline, which avoids the intermediate hypotheticals.tbl file.
p3-match --col=product "hypothetical protein" <features.tbl | p3-find-in-clusters --col=patric_id real.clusters.tbl >clustered.hypo.tbl
For the purposes of this tutorial, however, we wanted to show the intermediate result. The output including the clusters looks like this.
genome.genome_id feature.patric_id feature.sequence_id feature.location feature.product cluster_id genome_id sequence_id start end roles
385964.3 fig|385964.3.peg.1003 JMUF01000016 29951..30067 hypothetical protein 1 385964.3 JMUF01000016 29248 36665 tRNA (guanine(46)-N(7))-methyltransferase (EC 2.1.1.33)::A/G-specific adenine glycosylase (EC 3.2.2.-)::FIG001341: Probable Fe(2+)-trafficking protein YggX::ABC transporter, substrate-binding protein (cluster 8, B12/iron complex)::Ferrichrome-iron receptor
385964.3 fig|385964.3.peg.101 JMUF01000001 complement(90500..90613) hypothetical protein 1 385964.3 JMUF01000001 85974 90395 ABC transporter, substrate-binding protein (cluster 10, nitrate/sulfonate/bicarbonate)::ABC transporter, permease protein (cluster 10, nitrate/sulfonate/bicarbonate)::ABC transporter, permease protein (cluster 10, nitrate/sulfonate/bicarbonate)::AMP nucleosidase (EC 3.2.2.4)
385964.3 fig|385964.3.peg.1200 JMUF01000021 19613..19729 hypothetical protein 1 385964.3 JMUF01000021 18473 23601 Membrane-bound lytic murein transglycosylase B (EC 3.2.1.-)::Membrane-bound lytic murein transglycosylase B (EC 3.2.1.-)::Ferric iron ABC transporter, iron-binding protein::Ferric iron ABC transporter, permease protein::Ferric iron ABC transporter, permease protein::Ferric iron ABC transporter, ATP-binding protein
385964.3 fig|385964.3.peg.1207 JMUF01000021 complement(26958..27077) hypothetical protein 1 385964.3 JMUF01000021 27052 32405 Anaerobic dimethyl sulfoxide reductase chain A, molybdopterin-binding domain (EC 1.8.5.3)::Anaerobic dimethyl sulfoxide reductase chain B, iron-sulfur binding subunit (EC 1.8.5.3)::Anaerobic dimethyl sulfoxide reductase chain C, anchor subunit (EC 1.8.5.3)::Anaerobic respiratory reductase chaperone::Ferredoxin-type protein NapF (periplasmic nitrate reductase)
385964.3 fig|385964.3.peg.1237 JMUF01000022 complement(13686..13844) hypothetical protein 32 385964.3 JMUF01000022 15208 23148 ABC transporter involved in cytochrome c biogenesis, ATPase component CcmA::ABC transporter involved in cytochrome c biogenesis, CcmB subunit::Cytochrome c-type biogenesis protein CcmC, putative heme lyase for CcmE::Cytochrome c-type biogenesis protein CcmD, interacts with CcmCE::Cytochrome c-type biogenesis protein CcmE, heme chaperone::Cytochrome c heme lyase subunit CcmF::Cytochrome c-type biogenesis protein CcmG/DsbE, thiol:disulfide oxidoreductase::Cytochrome c heme lyase subunit CcmL::Cytochrome c heme lyase subunit CcmH::Outer-membrane-phospholipid-binding lipoprotein MlaA
385964.3 fig|385964.3.peg.1253 JMUF01000022 complement(29533..29670) hypothetical protein 1 385964.3 JMUF01000022 25517 29151 3-ketoacyl-CoA thiolase (EC 2.3.1.16)::Enoyl-CoA hydratase (EC 4.2.1.17)::3-hydroxyacyl-CoA dehydrogenase (EC 1.1.1.35)::3-hydroxybutyryl-CoA epimerase (EC 5.1.2.3)
385964.3 fig|385964.3.peg.1253 JMUF01000022 complement(29533..29670) hypothetical protein 1 385964.3 JMUF01000022 25517 29151 3-ketoacyl-CoA thiolase (EC 2.3.1.16)::Enoyl-CoA hydratase (EC 4.2.1.17)::3-hydroxyacyl-CoA dehydrogenase (EC 1.1.1.35)::3-hydroxybutyryl-CoA epimerase (EC 5.1.2.3) 58 385964.3 JMUF01000022 31462 39818 Ribosomal protein L3 N(5)-glutamine methyltransferase (EC 2.1.1.298)::Chorismate synthase (EC 4.2.3.5)::Translation elongation factor P Lys34 hydroxylase::tRNA (5-methylaminomethyl-2-thiouridylate)-methyltransferase (EC 2.1.1.61)::FAD-dependent cmnm(5)s(2)U34 oxidoreductase::3-oxoacyl-[acyl-carrier-protein] synthase, KASI (EC 2.3.1.41)
385964.3 fig|385964.3.peg.1278 JMUF01000023 complement(11479..11595) hypothetical protein 1 385964.3 JMUF01000023 1030 17470 Cold shock protein of CSP family => CspE (naming convention as in E.coli)::Twin-arginine translocation protein TatE::Lipoyl synthase (EC 2.8.1.8)::Octanoate-[acyl-carrier-protein]-protein-N-octanoyltransferase (EC 2.3.1.181)::Proposed lipoate regulatory protein YbeD::D-alanyl-D-alanine carboxypeptidase (EC 3.4.16.4)::D-alanyl-D-alanine carboxypeptidase (EC 3.4.16.4)::Septum-associated rare lipoprotein A::Rod shape-determining protein RodA::Penicillin-binding protein 2 (PBP-2)::23S rRNA (pseudouridine(1915)-N(3))-methyltransferase (EC 2.1.1.177)::Ribosomal silencing factor RsfA::Nicotinate-nucleotide adenylyltransferase (EC 2.7.7.18)::DNA polymerase III delta subunit (EC 2.7.7.7)::Leucyl-tRNA synthetase (EC 6.1.1.4)
385964.3 fig|385964.3.peg.1300 JMUF01000023 32920..33093 hypothetical protein 24 385964.3 JMUF01000023 22342 37129 Apolipoprotein N-acyltransferase (EC 2.3.1.-)::Copper homeostasis protein CutE::Magnesium and cobalt efflux protein CorC::Metal-dependent hydrolase YbeY, involved in rRNA and/or ribosome maturation and assembly::Phosphate starvation-inducible protein PhoH, predicted ATPase::tRNA-i(6)A37 methylthiotransferase (EC 2.8.4.3)::2-polyprenyl-3-methyl-6-methoxy-1,4-benzoquinol hydroxylase::Asparagine synthetase [glutamine-hydrolyzing] (EC 6.3.5.4)::Asparagine synthetase [glutamine-hydrolyzing] (EC 6.3.5.4)::N-acetylglucosamine-6P-responsive transcriptional repressor NagC, ROK family::N-acetylglucosamine-6-phosphate deacetylase (EC 3.5.1.25)::N-acetylglucosamine-6-phosphate deacetylase (EC 3.5.1.25)::Glucosamine-6-phosphate deaminase (EC 3.5.99.6)
385964.3 fig|385964.3.peg.1310 JMUF01000024 complement(2364..2477) hypothetical protein 1 385964.3 JMUF01000024 138 25612 Glycerol-3-phosphate regulon repressor GlpR::Thiosulfate sulfurtransferase GlpE (EC 2.8.1.1)::Transcriptional activator of maltose regulon, MalT::Maltodextrin phosphorylase (EC 2.4.1.1)::Maltodextrin phosphorylase (EC 2.4.1.1)::4-alpha-glucanotransferase (amylomaltase) (EC 2.4.1.25)::[4Fe-4S] cluster carrier protein NfuA::Competence protein F homolog, phosphoribosyltransferase domain::Pimeloyl-[acyl-carrier protein] methyl ester esterase BioH (EC 3.1.1.85)::Ferrous iron transport protein B::Ferrous iron transport protein B::Ferrous iron transport protein A::Transcription accessory protein (S1 RNA-binding domain)::Transcription accessory protein (S1 RNA-binding domain)::Transcription elongation factor GreB::Two-component system response regulator OmpR::Osmolarity sensory histidine kinase EnvZ::Phosphoenolpyruvate carboxykinase [ATP] (EC 4.1.1.49)::Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog)
385964.3 fig|385964.3.peg.1371 JMUF01000025 22506..22685 hypothetical protein 1 385964.3 JMUF01000025 22675 34197 Acyl-CoA:1-acyl-sn-glycerol-3-phosphate acyltransferase (EC 2.3.1.51)::Acyl-ACP:1-acyl-sn-glycerol-3-phosphate acyltransferase (EC 2.3.1.n4)::Topoisomerase IV subunit A (EC 5.99.1.-)::Topoisomerase IV subunit B (EC 5.99.1.-)::3',5'-cyclic-nucleotide phosphodiesterase (EC 3.1.4.17)::ADP-ribose pyrophosphatase (EC 3.6.1.13)::Outer membrane channel TolC (OpmH)
385964.3 fig|385964.3.peg.1374 JMUF01000025 25951..26199 hypothetical protein 1 385964.3 JMUF01000025 22675 34197 Acyl-CoA:1-acyl-sn-glycerol-3-phosphate acyltransferase (EC 2.3.1.51)::Acyl-ACP:1-acyl-sn-glycerol-3-phosphate acyltransferase (EC 2.3.1.n4)::Topoisomerase IV subunit A (EC 5.99.1.-)::Topoisomerase IV subunit B (EC 5.99.1.-)::3',5'-cyclic-nucleotide phosphodiesterase (EC 3.1.4.17)::ADP-ribose pyrophosphatase (EC 3.6.1.13)::Outer membrane channel TolC (OpmH)
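Some rows above show a feature that lies inside the cluster span (peg.1003), while others show one just outside it (peg.101 starts about 100 base pairs past the cluster end). The exact rule used by p3-find-in-clusters is not stated here, so the following Python sketch is only a guess at the general shape of the test: parse the location string, then check for overlap with the cluster span widened by a margin. The function names and the margin value of 500 are assumptions made for illustration, and multi-part locations are not handled. A full comparison would also require the genome and sequence IDs to agree.

```python
import re

# Handles "29951..30067" and "complement(90500..90613)" only.
_LOCATION = re.compile(r"^(?:complement\()?(\d+)\.\.(\d+)\)?$")


def parse_location(location):
    """Return (left, right) coordinates for a simple location string."""
    m = _LOCATION.match(location)
    if m is None:
        raise ValueError(f"unsupported location: {location!r}")
    left, right = int(m.group(1)), int(m.group(2))
    return min(left, right), max(left, right)


def near_cluster(location, start, end, margin=0):
    """True if the feature overlaps the span [start - margin, end + margin]."""
    left, right = parse_location(location)
    return left <= end + margin and right >= start - margin


# peg.1003 against its cluster span: inside.
print(near_cluster("29951..30067", 29248, 36665))
# peg.101 against its cluster span: just past the end.
print(near_cluster("complement(90500..90613)", 85974, 90395))
# Same feature with a hypothetical 500 bp margin.
print(near_cluster("complement(90500..90613)", 85974, 90395, margin=500))
```

With these inputs the three checks give True, False, and True, which matches the intuition that the output lists features both in and near a cluster.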
You can see a lot of features are in or near the giant cluster 1, but we also picked up some hypotheticals crowding the smaller clusters (32 and 24). Using the p3-sort script, we can see how many hypotheticals we found for each cluster.
p3-sort cluster_id --count <clustered.hypo.tbl
cluster count
1 83823
10 263
101 6
102 18
103 21
104 61
105 1
106 6
107 34
108 10
109 17
11 827
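The listing above is ordered by cluster ID as text, which is why 10 and 101 come before 11. If a numeric ordering is more convenient, the same tally can be reproduced outside the p3-scripts. This Python sketch assumes the rows of clustered.hypo.tbl have already been split into fields, with cluster_id as the sixth column; count_by_cluster is a name invented for this example, and the sample rows are trimmed to their first six fields.

```python
from collections import Counter


def count_by_cluster(rows, cluster_col=5):
    """Count rows per cluster ID; `cluster_col` is a 0-based column index."""
    return Counter(row[cluster_col] for row in rows)


# A few rows from the sample output, trimmed to the first six columns.
rows = [
    ["385964.3", "fig|385964.3.peg.1003", "JMUF01000016",
     "29951..30067", "hypothetical protein", "1"],
    ["385964.3", "fig|385964.3.peg.1237", "JMUF01000022",
     "complement(13686..13844)", "hypothetical protein", "32"],
    ["385964.3", "fig|385964.3.peg.1300", "JMUF01000023",
     "32920..33093", "hypothetical protein", "24"],
    ["385964.3", "fig|385964.3.peg.1371", "JMUF01000025",
     "22506..22685", "hypothetical protein", "1"],
]
counts = count_by_cluster(rows)
# Sort by the numeric value of the cluster ID rather than as text.
ordered = sorted(counts.items(), key=lambda item: int(item[0]))
for cluster_id, n in ordered:
    print(cluster_id, n)
```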
The next step is to choose a cluster and compare the hypotheticals for that cluster to see if any of them are substantially similar. We would then have reason to believe the similar hypotheticals have a function related to the cluster.
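As a starting point for that comparison, one could pull out the distinct hypothetical features reported for a single cluster. The sketch below assumes rows already split into fields, with the feature ID in the second column and the cluster ID in the sixth. It removes repeated feature IDs, since a feature can appear on more than one output row. The function name is invented for this example, and the similarity comparison itself is not attempted.

```python
def hypotheticals_for_cluster(rows, cluster_id, id_col=1, cluster_col=5):
    """Return the distinct feature IDs for one cluster, in input order."""
    ids = (row[id_col] for row in rows if row[cluster_col] == cluster_id)
    # dict.fromkeys keeps the first occurrence of each ID and preserves order.
    return list(dict.fromkeys(ids))


# Rows trimmed to the first six columns; peg.1253 appears twice.
rows = [
    ["385964.3", "fig|385964.3.peg.1253", "JMUF01000022",
     "complement(29533..29670)", "hypothetical protein", "1"],
    ["385964.3", "fig|385964.3.peg.1253", "JMUF01000022",
     "complement(29533..29670)", "hypothetical protein", "1"],
    ["385964.3", "fig|385964.3.peg.1300", "JMUF01000023",
     "32920..33093", "hypothetical protein", "24"],
    ["385964.3", "fig|385964.3.peg.1371", "JMUF01000025",
     "22506..22685", "hypothetical protein", "1"],
]
selected = hypotheticals_for_cluster(rows, "1")
print(selected)
```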