ohnosequences!

era7 bioinformatics R&D group

Escherichia coli plasmid pEC_Bactec is part of the EHEC H112180280 strain

As suggested by Kat Holt (http://bacpathgenomics.wordpress.com/2011/06/05/ehec-genomes-snp-locations/) and others, a plasmid very similar to pEC_Bactec is part of the genome of the recently sequenced EHEC H112180280 strain.

The figure displays a simple alignment, obtained using the Mauve Move Contigs tool, between the pEC_Bactec plasmid (above) and scaffolds 7 and 13 of the H112180280 genome (below):

Scaffolds 7 and 13 cover practically all of the pEC_Bactec plasmid sequence. The two white pEC_Bactec regions that have no similar regions in H112180280 correspond to the genes:

  • pndC and TnpA OrfB IS66 (the left white region within the red similarity block)
  • TnpA IS26 transposase (the white patch within the blue block)

The H112180280 sequence regions with no similarity connection to pEC_Bactec, flanking the two small green blocks, correspond to N regions that remain undefined in the H112180280 sequence obtained with paired-end 454 technology.

HPA announces a new E. coli genome. Strain H112180280. We do a quick & visual comparison with TY-2482

The HPA (Health Protection Agency, http://www.hpa.org.uk/) has just announced the sequence of an E. coli strain, H112180280.

They have sequenced the strain with 454 and have released:

  • sff files
  • FASTA file with the scaffolds
  • The annotation (done by Anthony Underwood) in GenBank format

Data available here http://www.hpa-bioinformatics.org.uk/lgp/genomes

They got

  • 13 scaffolds
  • 5405081 bp
  • 88748 Ns (1.64%)
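The figures above (scaffold count, total length, number and percentage of Ns) are the kind of summary that can be recomputed directly from a scaffold FASTA file. Here is a minimal sketch in Python; the function names `parse_fasta` and `n_stats` and the toy input are our own illustrative assumptions, not part of the HPA release or of any tool mentioned in this post.

```python
def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def n_stats(records):
    """Return (number of scaffolds, total bp, N count, percent N)."""
    records = list(records)
    total_bp = sum(len(seq) for _, seq in records)
    n_count = sum(seq.upper().count("N") for _, seq in records)
    percent_n = 100.0 * n_count / total_bp if total_bp else 0.0
    return len(records), total_bp, n_count, percent_n


# Toy input (hypothetical sequences, only to show the call).
toy = [">scaffold_1", "ACGTNNNN", "ACGT", ">scaffold_2", "NNACGT"]
print(n_stats(parse_fasta(toy)))

# Percentage of Ns for the numbers reported in the post.
print(round(100.0 * 88748 / 5405081, 2))
```

For a real file, the lines could be read with `open(path)` and passed to `parse_fasta` in the same way.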

When we saw the data, with the genome in only 13 scaffolds, we couldn’t help aligning it with the other high-quality de novo assembly we have so far (BGI version 2 of the TY-2482 strain).

How similar are these two strains? Could the 454 assembly help to scaffold the Illumina/Ion Torrent contigs?

Here’s what we got after aligning both genomes using Mauve (http://gel.ahabs.wisc.edu/mauve/):

You can get the results of this Mauve analysis in the GitHub repository https://github.com/ehec-outbreak-crowdsourced/BGI-data-analysis/tree/master/strains/comparativeAnalysis/era7bioinformatics/Mauve_H112180280_TY2482

This quick analysis could give us some hints to reduce the number of contigs in both assemblies.

For example, scaffolds 1, 2, 3 and 4 in the HPA assembly (the one above) could be merged into one contig (subject to confirmation by PCR, Sanger sequencing, etc.).

And from the point of view of the TY-2482 assembly, even more contigs could be merged. See, for instance, the green similarity region at the bottom left (red vertical lines mark contig boundaries), as well as the other similarity regions along the whole assembly (the pink, light green, turquoise and purple blocks).

Restriction Modification systems found in E. coli TY-2482 genome

Analysing our automatic annotation of the second BGI assembly of the TY-2482 genome (see post here http://blog.ohnosequences.com/2011/06/automatic-annotation-of-the-second-bgi-assembly-of-e-coli-ty-2482-genome/) we have found that this isolate has 3 restriction-modification systems:

  • Type I restriction-modification system encoded in an operon in contig 42. The specificity protein is encoded by gene 79712, the modification protein by gene 84400 and the restriction protein by gene 66267
  • Type II system encoded in an operon in contig 486. The nuclease is encoded by gene 21919 and the methyltransferase by gene 23135
  • Type III system encoded in contig 493. The nuclease is encoded by gene 3634 and the methyltransferase by gene 5265

Type I restriction-modification system encoded in contig 42

Type II restriction-modification system encoded in contig 486

Type III restriction-modification system encoded in contig 493

Are there plasmids in the E. coli TY-2482 genome?

After taking a first look at the annotation of the second BGI assembly of the E. coli TY-2482 genome, lots of questions arise. The sequence and assembly quality of this second version of the TY-2482 genome is much higher, allowing more detailed analysis.

The presence of plasmids is clearly an interesting one.

In the automatic annotation we published yesterday (see post here and wiki page here) we’ve detected 4 contigs that are probably part of plasmids:

  • Contig 63 with 48 predicted genes
  • Contig 74 with 7 predicted genes, all of them involved in mercury resistance
  • Contig 98 with 12 predicted genes. This contig contains SOS inhibition proteins, nucleases, a single-stranded DNA binding protein and other uncharacterized proteins
  • Contig 503 with 37 predicted proteins. Among them we find a beta-lactamase and some interesting proteins (in terms of mobility and gene transfer) such as several transposons and proteins involved in conjugation and plasmid maintenance.

We can’t conclude yet how many different plasmids TY-2482 may have.

We’ll keep on analysing this genome and posting updates here and in the EHEC GitHub wiki

**Update** (10-Jun-2011):

Kat Holt has noticed that the protein annotated as beta-lactamase CTX-M-3 in contig 503 is actually beta-lactamase CTX-M-15. Read the whole story in the comments on Kat’s post http://bacpathgenomics.wordpress.com/2011/06/07/new-german-stecehec-data-from-bgi/

Automatic annotation of E. coli LB226692 genome

This morning we finished the automatic annotation of the other outbreak isolate whose sequence is also available, LB226692.

Life Tech and the University of Muenster have sequenced the genome and done a hybrid mapping assembly, finally obtaining 364 contigs. More details on the assembly method here

We used the BG7 system to annotate the genome.

In this case we’ve selected as reference proteins the same protein set we used to annotate the other E. coli isolate (TY-2482). This protein set has 137,063 proteins and includes:

  • The representative Uniprot proteins corresponding to all Uniref90 clusters for all Escherichia coli proteins
  • All Uniprot proteins from organisms including in their name the terms “EHEC” or “EAEC”
  • All Uniprot proteins from bacteria that have in any Uniprot field the term “toxin”
  • All Uniprot proteins from bacteria that have in any Uniprot field  “hemolysin”
  • All the proteins from Salmonella typhi, Yersinia pestis and Shigella dysenteriae
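The selection criteria listed above can be read as a simple filter over protein records. The following is only a sketch of that logic, not the way the set was actually built: the record fields (`organism`, `is_bacterial`, `uniref90_representative`, `description`) are hypothetical names we chose for illustration, and the plain substring match on "toxin" would also catch words such as "antitoxin".

```python
# Species whose whole protein sets are included, per the list above.
WHOLE_PROTEOME_SPECIES = ("Salmonella typhi", "Yersinia pestis", "Shigella dysenteriae")
# Terms searched in the organism name.
ORGANISM_TERMS = ("EHEC", "EAEC")
# Terms searched in any field of bacterial proteins.
FIELD_TERMS = ("toxin", "hemolysin")


def is_reference_protein(rec):
    """Return True if a record (a dict with hypothetical fields) meets any criterion."""
    organism = rec.get("organism", "")
    # 1. Uniref90 cluster representatives for Escherichia coli proteins.
    if rec.get("uniref90_representative") and "Escherichia coli" in organism:
        return True
    # 2. Organisms whose name includes "EHEC" or "EAEC".
    if any(term in organism for term in ORGANISM_TERMS):
        return True
    # 3. All proteins from the three listed species.
    if any(organism.startswith(species) for species in WHOLE_PROTEOME_SPECIES):
        return True
    # 4. Bacterial proteins mentioning "toxin" or "hemolysin" in any field.
    if rec.get("is_bacterial"):
        text = " ".join(str(value) for value in rec.values()).lower()
        if any(term in text for term in FIELD_TERMS):
            return True
    return False


# Hypothetical records, only to exercise the filter.
records = [
    {"id": "A", "organism": "Escherichia coli K-12", "uniref90_representative": True,
     "is_bacterial": True, "description": "DNA polymerase"},
    {"id": "B", "organism": "Escherichia coli K-12", "uniref90_representative": False,
     "is_bacterial": True, "description": "DNA polymerase"},
    {"id": "C", "organism": "Vibrio cholerae", "is_bacterial": True,
     "description": "Cholera enterotoxin subunit A"},
    {"id": "D", "organism": "Homo sapiens", "is_bacterial": False,
     "description": "toxin receptor"},
    {"id": "E", "organism": "Yersinia pestis CO92", "is_bacterial": True,
     "description": "ribosomal protein"},
]
selected = [rec["id"] for rec in records if is_reference_protein(rec)]
print(selected)
```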

Results

We’ve detected 6,302 genes

  • 6,132 protein encoding genes
  • 170 RNA genes

4,504 out of the 6,132 (73.45%) protein encoding genes have canonical start and stop codons and neither frameshifts nor intragenic stop codons.

1,125 out of the 6,132 (18.34%) protein encoding genes have frameshifts or intragenic stop codons in their sequences, probably caused by errors inherent to the sequencing technology.
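To make the two categories above more concrete, here is a rough sketch of how a predicted coding sequence could be classified. It is not the BG7 implementation: we assume ATG, GTG and TTG as start codons and TAA, TAG and TGA as stop codons, and we use "length not a multiple of three" as a crude stand-in for a frameshift, whereas a real pipeline would detect frameshifts from the alignment against the reference protein.

```python
# Assumed codon sets (common bacterial usage); adjust as needed.
START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def classify_cds(seq):
    """Classify a nucleotide coding sequence given 5' to 3' on the coding strand."""
    seq = seq.upper()
    if len(seq) < 6:
        return "too short"
    # Crude proxy: a length that is not a multiple of 3 suggests a frameshift.
    if len(seq) % 3 != 0:
        return "frameshift"
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # Any stop codon before the last codon is an intragenic stop.
    if any(codon in STOP_CODONS for codon in codons[1:-1]):
        return "intragenic stop"
    if codons[0] in START_CODONS and codons[-1] in STOP_CODONS:
        return "canonical"
    return "non-canonical ends"


# Toy sequences (hypothetical), one per category.
examples = ["ATGAAATAA", "ATGTAAAAATAA", "ATGAAATA", "CCCAAATAA"]
for cds in examples:
    print(cds, classify_cds(cds))
```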

You can get the results of the annotation here https://github.com/ehec-outbreak-crowdsourced/BGI-data-analysis/tree/master/strains/LB226692/annotations/era7bioinformatics

Automatic annotation of the second BGI assembly of E. coli TY-2482 genome

We’ve just finished the automatic annotation of the second BGI assembly of the E. coli TY-2482 genome (https://github.com/ehec-outbreak-crowdsourced/BGI-data-analysis/blob/master/strains/TY2482/assemblies/BGI/Escherichia_coli_TY-2482.contig.20110606.fa.gz). In this case BGI combined 200x of Illumina single-end reads and 12x of Ion Torrent reads. They did a de novo assembly with Newbler v. 2.0.00.22, SOAPdenovo v. 1.06 and AMOS minimus2 v. 1.59, finally obtaining 513 contigs.

Here we have used the same set of reference proteins that we used to annotate Nick Loman’s assembly of the same isolate, TY-2482 (see annotation in this post).

This set has 137,063 proteins and includes:

  • The representative Uniprot proteins corresponding to all Uniref90 clusters for all Escherichia coli proteins
  • All Uniprot proteins from organisms including in their name the terms “EHEC” or “EAEC”
  • All Uniprot proteins from bacteria that have in any Uniprot field the term “toxin”
  • All Uniprot proteins from bacteria that have in any Uniprot field “hemolysin”
  • All the proteins from Salmonella typhi, Yersinia pestis and Shigella dysenteriae

Results

We have predicted 5,982 genes

  • 5,849 protein encoding genes
  • 133 RNA genes (rRNA and tRNA)

4,797 out of the 5,849 (82.01%) protein encoding genes have canonical start and stop codons and neither frameshifts nor intragenic stop codons.

658 out of the 5,849 (11.24%) protein encoding genes have frameshifts or intragenic stop codons in their sequences.

You can get the results here https://github.com/ehec-outbreak-crowdsourced/BGI-data-analysis/tree/master/strains/TY2482/annotations/era7bioinformatics/BGI_V2

We have only had time to take a quick glance at these results, but they look promising. We’ll come back with more results on this genome, that’s for sure :)

EU outbreak Escherichia coli EHEC genome plasticity

Transposases

It seems that there are 121 putative transposases in this genome. It probably implies a high genomic plasticity and flexibility for adaptation to changing environments.

Table 1: Transposases

Contig ID | Tags | Similar to | Protein names
husec41_c1004 | Transposase | D3H358 | Transposase
husec41_c1066 | Transposase | P30192 | Putative uncharacterized protein ychG
husec41_c1147 | Transposase | B7L8U5 | Putative transposase ORF A, IS609 family
husec41_c1214 | Transposase | C2DME2 | Possible transposase insP for IS630
husec41_c130 | Transposase | B7L5X1 | Tn7-like transposition protein TnsB
husec41_c130 | Transposase | B7L5W7 | Tn7-like transposition protein TnsC
husec41_c1390 | Transposase | E9Z073 | Transposase
husec41_c1412 | Transposase | C8TNV4 | Predicted transposase
husec41_c1539 | Transposase | Q8XC14 | Putative IS encoded protein encoded within prophage CP-933O
husec41_c1554 | Transposase | P16943 | Insertion element IS630 uncharacterized 39 kDa protein (ISO-IS200 39 kDa protein)
husec41_c1562 | Transposase, plasmid | B7LWU0 | Transposon Tn3 resolvase
husec41_c1562 | Transposase, plasmid | C7S9E2 | Transposase for transposon Tn3 (Truncated TnpA)
husec41_c1569 | Transposase | B3HPU8 | Transposase OrfA, ISEc8
husec41_c1595 | Transposase | Q6Q6S9 | Transposase
husec41_c1628 | Transposase | B7LBW1 | Transposase
husec41_c1683 | Transposase | B7LHE6 | Putative uncharacterized protein yncI
husec41_c1687 | Transposase | B3HAL5 | Transposon Tn21 resolvase
husec41_c1689 | Transposase | P03008 | Transposase for transposon Tn3
husec41_c1781 | Transposase | B7LFE0 | IS3 element protein InsE
husec41_c1781 | Transposase, plasmid | B7LA99 | IS3 element protein InsF (Transposase ORF B, IS3)
husec41_c1799 | Transposase | C8TKW2 | Putative uncharacterized protein
husec41_c1859 | Transposase | B6IAL2 | H repeat-associated protein
husec41_c1864 | Transposase, plasmid | C9WXH4 | Transposase (Fragment)
husec41_c1864 | Transposase, plasmid | Q935I2 | Putative transposase
husec41_c188 | Transposase | D3GR79 | Putative transposase
husec41_c188 | Transposase | D3GR79 | Putative transposase
husec41_c1905 | Transposase, plasmid | B7LWW4 | Transposase ORF A, IS1
husec41_c1921 | Transposase | D3H358 | Transposase
husec41_c1974 | Transposase | B7LWV0 | Transposase ORF A, IS629
husec41_c1974 | Transposase, plasmid | B7LWV1 | Putative uncharacterized protein
husec41_c199 | Transposase | P03008 | Transposase for transposon Tn3
husec41_c1995 | Transposase | B7LFN9 | Transposase ORF A, IS3 family
husec41_c1998 | Transposase | D3GV20 | Transposase (Fragment)
husec41_c2024 | Transposase | B5AXD2 | Putative transposase
husec41_c2049 | Transposase | Q6Q6S9 | Transposase
husec41_c2055 | Transposase | A8A1D5 | Transposase, IS605 family
husec41_c2063 | Transposase | B7L8E6 | Putative transposase
husec41_c2064 | Transposase | E7H9R9 | Transposase IS66 family protein
husec41_c2066 | Transposase | B7L5X1 | Tn7-like transposition protein TnsB
husec41_c211 | Transposase | B7L4U1 | Putative transposase
husec41_c2125 | Transposase, plasmid | B7LA98 | Putative uncharacterized protein
husec41_c2125 | Transposase | D3GV17 | Transposase
husec41_c2125 | Transposase, plasmid | B7LX19 | Putative transposase, IS110 family
husec41_c2132 | Transposase | B7LFP9 | Transposase ORF B, IS629
husec41_c2176 | Transposase | E9Z008 | Transposase
husec41_c2181 | Transposase | B7LFP9 | Transposase ORF B, IS629
husec41_c2203 | Transposase, plasmid | F2W481 | ISL3 family transposase
husec41_c226 | Transposase, plasmid | B7UHC5 | Transposase of ISEc13 of IS110 family (Transposase of ISEc21)
husec41_c226 | Transposase | B3IA39 | Truncated transposase
husec41_c230 | Transposase | B7LBF9 | Putative transposase
husec41_c273 | Transposase | C4HU49 | Transposase
husec41_c30 | Transposase | P03008 | Transposase for transposon Tn3
husec41_c30 | Transposase, plasmid | D9Z5B6 | TnpA
husec41_c37 | Transposase | D7ZCL8 | Transposase, IS4 family (Fragment)
husec41_c42 | Transposase | A7ZMR8 | Transposase, IS605 family
husec41_c524 | Transposase, plasmid | C7S9T2 | Transposon Tn21 modulator protein
husec41_c524 | Transposase | D3H359 | Transposon Tn21 resolvase
husec41_c524 | Transposase | D3H358 | Transposase
husec41_c526 | Transposase | D3GX03 | Insertion sequence IS100, ATP-binding protein
husec41_c526 | Transposase | P59697 | Transposase for insertion sequence element IS200
husec41_c6 | Transposase, plasmid | D3H553 | Transposase
husec41_c642 | Transposase | B3WUE9 | TniA transposase protein
husec41_c66 | Transposase, plasmid | Q0H058 | IS1N transposase
husec41_c662 | Transposase | Q19NI3 | TnsD
husec41_c699 | Transposase? | B7LHE6 | Putative uncharacterized protein yncI
husec41_c826 | Transposase | D8AYZ0 | Transposase
husec41_c826 | Transposase | B7L939 | Putative transposase ORF 2, IS66 family
husec41_c835 | Transposase | B7LDR0 | Transposase ORF B, IS1
husec41_c838 | Transposase | E7SID2 | Transposase
husec41_c87 | Transposase | B7L5X0 | Tn7-like transposase TnsA
husec41_c885 | Transposase | E9Z073 | Transposase
husec41_c887 | Transposase | C8TWQ4 | Predicted IS602 transposase OrfB
husec41_rep_c2269 | Transposase | C8TGN1 | Putative IS621 transposase
husec41_rep_c2271 | Transposase | P11901 | Transposase for insertion sequence element IS421
husec41_rep_c2273 | Transposase | E1U309 | Transposase InsAB’
husec41_rep_c2275 | Transposase, plasmid | Q7AQT7 | Putative transposase
husec41_rep_c2276 | Transposase | B7LDT8 | Putative uncharacterized protein
husec41_rep_c2286 | Transposase | B7L8E6 | Putative transposase
husec41_rep_c2306 | Transposase | E9YLK7 | Transposase (Fragment)
husec41_rep_c2315 | Transposase | E9YLK7 | Transposase (Fragment)
husec41_rep_c2317 | Transposase | Q8XC14 | Putative IS encoded protein encoded within prophage CP-933O
husec41_rep_c2317 | Transposase | C6UPH5 | Transposase ISEc8 (Transposase, ISEc8)
husec41_rep_c2321 | Transposase | C8UCH9 | Putative uncharacterized protein
husec41_rep_c2323 | Transposase | C8TND9 | Putative uncharacterized protein
husec41_rep_c2336 | Transposase | A8A1D5 | Transposase, IS605 family
husec41_rep_c2336 | Transposase | C8TP26 | Putative IS609 transposase TnpB
husec41_rep_c2356 | Transposase | B7L8Y0 | IS30 transposase; KpLE2 phage-like element
husec41_rep_c2370 | Transposase | Q8VRB2 | Putative transposase
husec41_rep_c2372 | Transposase | B7L5W7 | Tn7-like transposition protein TnsC
husec41_rep_c2408 | Transposase | E8IR83 | IS66 transposase
husec41_rep_c2429 | Transposase | P03008 | Transposase for transposon Tn3
husec41_rep_c2437 | Transposase, plasmid | B7LWW4 | Transposase ORF A, IS1
husec41_rep_c2441 | Transposase | D3GV27 | Transposase
husec41_rep_c2445 | Transposase | B7LDR0 | Transposase ORF B, IS1
husec41_rep_c2446 | Transposase, plasmid | B7LWV1 | Putative uncharacterized protein
husec41_rep_c2449 | Transposase | D3GV14 | Transposase
husec41_rep_c2478 | Transposase | E7IJ32 | Putative transposase
husec41_rep_c2487 | Transposase | E1U309 | Transposase InsAB’
husec41_rep_c2498 | Transposase | B7LFP9 | Transposase ORF B, IS629
husec41_rep_c2499 | Transposase | E3Y631 | ISL3 family transposase domain protein
husec41_rep_c2515 | Transposase | Q2EEQ8 | Putative defective transposase ybfQ
husec41_rep_c2516 | Transposase | E3XVP6 | Putative transposase
husec41_rep_c2520 | Transposase | B6ICP6 | Truncated transposase
husec41_rep_c2540 | Transposase | P76102 | Putative transposase InsQ for insertion sequence element IS609
husec41_rep_c2544 | Transposase | D6HU78 | Putative uncharacterized protein
husec41_rep_c2565 | Transposase | B7L6S2 | Putative transposase
husec41_rep_c2570 | Transposase | D6HU78 | Putative uncharacterized protein
husec41_rep_c2572 | Transposase | C8ULY2 | Putative IS609 transposase TnpA
husec41_rep_c2577 | Transposase | Q4E6D2 | Transposase (IS4 family)
husec41_rep_c2578 | Transposase | E3Y0F9 | Transposase, IS605 OrfB family
husec41_rep_c2639 | Transposase | B7LBV5 | Transposase, IS110 family
husec41_rep_c2668 | Transposase | E9YA15 | Transposase
husec41_rep_c2679 | Transposase | E8Y4A6 | Transposase IS200-family protein
husec41_rep_c2743 | Transposase, plasmid | B5YPI1 | ISSd1, transposase OrfB
husec41_rep_c2812 | Transposase | B7L6S2 | Putative transposase
husec41_rep_c2873 | Transposase | Q8VRB2 | Putative transposase
husec41_rep_c2875 | Transposase | P0CE57 | Transposase insH for insertion sequence element IS5R
husec41_rep_c2889 | Transposase | D8B2Q9 | ISPsy10, transposase family protein (Fragment)
husec41_rep_c2890 | Transposase | B1LJZ9 | IS5 transposase
husec41_rep_c2903 | Transposase | D6HU78 | Putative uncharacterized protein
husec41_rep_c2907 | Transposase | B7L6S2 | Putative transposase
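As a rough illustration of how rows like those in Table 1 can be tallied by functional tag, here is a small Python sketch. The function name `tally_by_tag` and the tuple layout (contig ID, tags, similar to, protein name) are our own assumptions for the example; the rows are a handful copied from the annotation tables in this post, and the full annotation spreadsheet would be loaded instead in practice.

```python
from collections import Counter

# A few rows in (contig ID, tags, similar to, protein name) form.
ROWS = [
    ("husec41_c1004", "Transposase", "D3H358", "Transposase"),
    ("husec41_c130", "Transposase", "B7L5X1", "Tn7-like transposition protein TnsB"),
    ("husec41_c130", "Transposase", "B7L5W7", "Tn7-like transposition protein TnsC"),
    ("husec41_c1562", "Transposase, plasmid", "B7LWU0", "Transposon Tn3 resolvase"),
    ("husec41_c101", "Adhesion", "C8TPC9", "Adhesin AIDA-I"),
]


def tally_by_tag(rows, keyword="transposase"):
    """Count rows whose tags contain `keyword`, per contig and with a plasmid tag."""
    tagged = [row for row in rows if keyword in row[1].lower()]
    per_contig = Counter(row[0] for row in tagged)
    on_plasmid = sum(1 for row in tagged if "plasmid" in row[1].lower())
    return len(tagged), per_contig, on_plasmid


total, per_contig, on_plasmid = tally_by_tag(ROWS)
print(total, on_plasmid)
print(per_contig.most_common(1))
```

Counting per contig is a quick way to spot contigs that carry several transposase-related genes, which is one hint of a mobile region.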

In the next table we have collected some comments about this set of transposases from the E. coli EHEC genome.

UNIPROT

Annotation

COMMENTS

FAMILY

D3H358

Transposase

Transposase_7, according to InterPro

Tn3

P30192

Putative uncharacterized protein ychG

Caution,  it can be a pseudogene

B7L8U5

Putative transposase ORF A, IS609 family

Putative

IS609

C2DME2

Possible transposase insP for IS630

Length 115 aa

IS630

E9Z073

Transposase

REP-associated transposases (Pubmed: 16563168)

IS110

C8TNV4

Predicted transposase

ISL3

B7LWU0

Transposon Tn3 resolvase

On plasmid, in some cases is associated to mercury resistance

Tn3

C7S9E2

Transposase for transposon Tn3 (Truncated TnpA)

125 aa on plasmid ?

Tn3

B3HPU8

Transposase OrfA, ISEc8

120 aa

ISEc8

Q6Q6S9

Transposase

Molecular characterization of cefoxitin-resistant Escherichia coli from Canadian hospitals

IS911

B7LBW1

Transposase

Pubmed 19165319

B7LHE6

Putative uncharacterized protein yncI

REP-associated transposases (Pubmed: 16563168)

IS4

P03008

Transposase for transposon Tn3

Tn3

B7LFE0

IS3 element protein InsE

Pubmed  19165319

IS3

B7LA99

IS3 element protein InsF (Transposase ORF B, IS3)

IS3

C8TKW2

Putative uncharacterized protein

IS66

B6IAL2

H repeat-associated protein

REP-associated transposases (Pubmed: 16563168)

IS4

C9WXH4

Transposase (Fragment)

Tni?

Q935I2

Putative transposase

IS26

D3GR79

Putative transposase

D3GR79

Putative transposase

B7LWW4

Transposase ORF A, IS1

REP-associated transposases (Pubmed: 16563168)

IS110

D3H358

Transposase

On plasmid, in some cases is associated to mercury resistance

Tn3

B7LWV0

Transposase ORF A, IS629

IS629

B7LWV1

Putative uncharacterized protein

P03008

Transposase for transposon Tn3

On plasmid, in some cases is associated to mercury resistance

Tn3

B7LFN9

Transposase ORF A, IS3 family

IS911

D3GV20

Transposase (Fragment)

Mutator

B5AXD2

Putative transposase

IS30

Q6Q6S9

Transposase

IS911

A8A1D5

Transposase, IS605 family

IS609

B7L8E6

Putative transposase

ISEc3

E7H9R9

Transposase IS66 family protein

IS66

B7L5X1

Tn7-like transposition protein TnsB

Tn7

B7L4U1

Putative transposase

YhgA

B7LA98

Putative uncharacterized protein

D3GV17

Transposase

IS66

B7LX19

Putative transposase, IS110 family

IS116/IS110/IS902 family REP-associated transposases (Pubmed: 16563168)

IS110

B7LFP9

Transposase ORF B, IS629

IS629

E9Z008

Transposase

B7LFP9

Transposase ORF B, IS629

IS629

F2W481

ISL3 family transposase

ISL3

B7UHC5

Transposase of ISEc13 of IS110 family (Transposase of ISEc21)

IS116/IS110/IS902  family REP-associated transposases (Pubmed: 16563168)

IS110

B3IA39

Truncated transposase

B7LBF9

Putative transposase

YhgA

C4HU49

Transposase

P03008

Transposase for transposon Tn3

On plasmid, in some cases is associated to mercury resistance

Tn3

D9Z5B6

TnpA

ISEcp1  TnpA

D7ZCL8

Transposase, IS4 family (Fragment)

REP-associated transposases (Pubmed: 16563168)

IS4

A7ZMR8

Transposase, IS605 family

IS605

D3H358

Transposase

In some cases is associated to mercury resistance

Tn3

P59697

Transposase for insertion sequence element IS200

TpnA2 related

IS200

D3H553

Transposase

IS66

B3WUE9

TniA transposase protein

Q0H058

IS1N transposase

ISN1

B7LHE6

Putative uncharacterized protein yncI

REP-associated transposases (Pubmed: 16563168)

IS4

D8AYZ0

Transposase

B7L939

Putative transposase ORF 2, IS66 family

IS66

B7LDR0

Transposase ORF B, IS1

IS1

E7SID2

Transposase

B7L5X0

Tn7-like transposase TnsA

Tn7-like

E9Z073

Transposase

REP-associated transposases (Pubmed: 16563168)

IS110

C8TWQ4

Predicted IS602 transposase OrfB

IS911

C8TGN1

Putative IS621 transposase

IS116/IS110/IS902 family

IS621

P11901

Transposase for insertion sequence element IS421

IS4 related

IS186

E1U309

Transposase InsAB’

IS1

Q7AQT7

Putative transposase

IS1216 related

TnpA  - IS26

B7LDT8

Putative uncharacterized protein

ISSd1 related

IS600

B7L8E6

Putative transposase

ISEc3

E9YLK7

Transposase (Fragment)

E9YLK7

Transposase (Fragment)

C6UPH5

Transposase ISEc8 (Transposase, ISEc8)

IS66

C8UCH9

Putative uncharacterized protein

IS682 related

IS66

C8TND9

Putative uncharacterized protein

ISEc8 ??

A8A1D5

Transposase, IS605 family

IS609

C8TP26

Putative IS609 transposase TnpB

IS605

B7L8Y0

IS30 transposase; KpLE2 phage-like element

IS30

Q8VRB2

Putative transposase

E8IR83

IS66 transposase

IS66

P03008

Transposase for transposon Tn3

Tn3 - TnpA

B7LWW4

Transposase ORF A, IS1

IS1

D3GV27

Transposase

IS30

B7LDR0

Transposase ORF B, IS1

IS1

B7LWV1

Putative uncharacterized protein

D3GV14

Transposase

IS116/IS110/IS902 family

IS110

E7IJ32

Putative transposase

E1U309

Transposase InsAB’

IS1

B7LFP9

Transposase ORF B, IS629

IS629

E3Y631

ISL3 family transposase domain protein

ISL3

Q2EEQ8

Putative defective transposase ybfQ

ISEc2

E3XVP6

Putative transposase

B6ICP6

Truncated transposase

P76102

Putative transposase InsQ for insertion sequence element IS609

IS605

D6HU78

Putative uncharacterized protein

B7L6S2

Putative transposase

IS605

D6HU78

Putative uncharacterized protein

C8ULY2

Putative IS609 transposase TnpA

IS 200 related

IS609

Q4E6D2

Transposase (IS4 family)

InsH  -   IS4  related REP-associated transposases (Pubmed: 16563168)

IS5

E3Y0F9

Transposase, IS605 OrfB family

IS605

B7LBV5

Transposase, IS110 family

IS116/IS110/IS902

IS110

E9YA15

Transposase

IS1 1/5/6

E8Y4A6

Transposase IS200-family protein

IS200

B5YPI1

ISSd1, transposase OrfB

ISSd1

B7L6S2

Putative transposase

IS 609 related

IS605

Q8VRB2

Putative transposase

P0CE57

Transposase insH for insertion sequence element IS5R

REP-associated transposases (Pubmed: 16563168)

IS5

D8B2Q9

ISPsy10, transposase family protein (Fragment)

ISpsy10

B1LJZ9

IS5 transposase

REP-associated transposases (Pubmed: 16563168)

IS5

D6HU78

Putative uncharacterized protein

B7L6S2

Putative transposase

IS 609 related

IS605

Plasmids

Around 246 predicted proteins appear to be related to plasmids. See the genome annotation table with functional tags at:

http://www.era7bioinformatics.com/docs/EHEC_E_COLI_GERMANY_OUTBREAK_Annotation_Era7Bioinformatics_v1562011.xls

This strain has many genetic capabilities that probably give it a competitive advantage and a high capacity for environmental adaptation. Some important features encoded in its genome:

  • A restriction-modification system
  • Many proteins involved in Fe transport and utilization. Siderophores: aerobactin, enterobactin.
  • Lysozyme
  • A general inhibitor of pancreatic serine proteases: inhibits chymotrypsin, trypsin, elastases, factor X, kallikrein as well as a variety of other proteases
  • Proteins involved in anaerobic respiration
  • Antimicrobial peptides
  • Proteins involved in quorum-sensing and biofilm formation
  • Proteins involved in Ni, Cu, Zn and Co resistance
  • More than 170 phage proteins

EU outbreak Escherichia coli EHEC: some genes involved in adhesion, colonization, pathogenicity and metal resistance

This strain has many genes involved in adhesion, flagellum and fimbria functions, colonization (invasins), and a set of genes related to secretion systems. Some of them are collected in Table 1.

Table 1: Adhesion related, secretion system and pathogenicity and virulence related proteins

Contig ID | Tags | Similar to | Protein names
husec41_c1007 | Adhesion | C8ULG1 | Predicted fimbrial-like adhesin protein
husec41_c101 | Adhesion | C8TPC9 | Adhesin AIDA-I
husec41_c1040 | Adhesion | C8TNN7 | Curlin major subunit CsgA
husec41_c110 | Adhesion | C8U1Y5 | Predicted fimbrillin
husec41_c117 | Adhesion | C8TI10 | AidA-I adhesin-like protein
husec41_c123 | Adhesion | C8TI54 | Predicted fimbrial-like adhesin protein
husec41_c1256 | Adhesion | B7LH27 | Putative adhesin major subunit pilin
husec41_c128 | Adhesion | C8U8H5 | Predicted fimbrial-like adhesin protein
husec41_c1306 | Adhesion | C8TM57 | Predicted fimbrial-like adhesin protein
husec41_c1331 | Adhesion | C8TUN4 | Adhesin YfaL
husec41_c1427 | Adhesion | C8U5A7 | Predicted fimbrial-like adhesin protein
husec41_c1427 | Adhesion | C8U5A6 | Predicted fimbrial-like adhesin protein
husec41_c1503 | Adhesion | C6UM96 | Predicted fimbrial-like adhesin protein
husec41_c151 | Adhesion | C8TNL5 | Putative AidA-I adhesin-like protein
husec41_c1510 | Adhesion | C8TQI1 | AidA-I adhesin-like protein
husec41_c1570 | Adhesion | C8TGZ3 | Putative Iha adhesin
husec41_c1732 | Adhesion | C8TZB2 | Putative adhesin
husec41_c1801 | Adhesion | C6UM96 | Predicted fimbrial-like adhesin protein
husec41_c1871 | Adhesion | C8TLM3 | Putative adhesin
husec41_c2014 | Adhesion | D3GX42 | Putative adhesin autotransporter
husec41_c2017 | Adhesion | C8TTS6 | AidA-I adhesin-like protein
husec41_c202 | Adhesion | C8TU34 | Predicted fimbrial-like adhesin protein
husec41_c279 | Adhesion | C8TI51 | Predicted fimbrial-like adhesin protein
husec41_c297 | Adhesion | C8TGZ3 | Putative Iha adhesin
husec41_c311 | Adhesion | C6V2W1 | Adhesin
husec41_c353 | Adhesion | B7L4E2 | Putative uncharacterized protein
husec41_c571 | Adhesion | C8UG07 | Conserved predicted protein
husec41_c678 | Adhesion | C8TUN4 | Adhesin YfaL
husec41_c749 | Adhesion | C8TJB6 | Predicted fimbrial-like adhesin protein SfmA
husec41_c822 | Adhesion | C8TKM1 | Predicted fimbrial-like adhesin protein
husec41_c92 | Adhesion | C8TJB9 | Predicted fimbrial-like adhesin protein SfmH
husec41_c98 | Adhesion | C8TH66 | Predicted fimbrial-like adhesin protein
husec41_rep_c2345 | Adhesion | C8TNL5 | Putative AidA-I adhesin-like protein
husec41_rep_c2430 | Adhesion | C6UM96 | Predicted fimbrial-like adhesin protein
husec41_rep_c2436 | Adhesion | C8TPC9 | Adhesin AIDA-I
husec41_c1650 | Flagellum | B7LBH5 | Flagella control of anti-sigma factor FlgM secretion into the periplasm
husec41_c106 | Flagellum | C8TPF0 | Flagellar P-ring protein 2 (Basal body P-ring protein 2)
husec41_c106 | Flagellum | C8TPE9 | Flagellar L-ring protein 2 (Basal body L-ring protein 2)
husec41_c106 | Flagellum | C8TPE8 | Flagellar component FlgG of cell-distal portion of basal-body rod
husec41_c106 | Flagellum | B7LG11 | Flagellar component of cell-proximal portion of basal-body rod
husec41_c106 | Flagellum | C8TPE6 | Flagellar hook protein FlgE
husec41_c106 | Flagellum | C8TPE5 | Flagellar hook assembly protein FlgD
husec41_c106 | Flagellum | C8TPE4 | Flagellar component FlgC of cell-proximal portion of basal-body rod
husec41_c106 | Flagellum | C8TPE3 | Flagellar component FlgB of cell-proximal portion of basal-body rod
husec41_c106 | Flagellum | C8TPE2 | Assembly protein FlgA for flagellar basal-body periplasmic P ring
husec41_c106 | Flagellum | B7LG04 | Export chaperone for FlgK and FlgL
husec41_c1100 | Flagellum | B7LG17 | Flagellar hook-filament junction protein
husec41_c1100 | Flagellum | C8TPF2 | Flagellar hook-filament junction protein FlgK
husec41_c1100 | Flagellum | B5YVV1 | Flagellar rod assembly protein/muramidase FlgJ
husec41_c1317 | Flagellum | C8TJW2 | EAL domain containing protein involved in flagellar function
husec41_c1479 | Flagellum | C8TTB9 | Conserved predicted protein
husec41_c1600 | Flagellum | Q4W8H8 | Flagellar transcriptional regulator FlhD 1
husec41_c1656 | Flagellum | D3GXI6 | Flagellar hook-basal body complex protein FliE 2
husec41_c178 | Flagellum | B7L8T6 | Flagellin
husec41_c178 | Flagellum | C6UYC8 | Flagellar filament capping protein
husec41_c180 | Flagellum | C8TTC1 | Predicted flagellar export pore protein FlhB
husec41_c180 | Flagellum | C8TTC0 | Predicted flagellar export pore protein FlhA
husec41_c229 | Flagellum | B7L524 | Putative flagellin structural protein; putative exported protein
husec41_c27 | Flagellum | C8TTM1 | Flagellar motor switching and energizing component FliM
husec41_c27 | Flagellum | C8TTM0 | Flagellar biosynthesis protein FliL
husec41_c27 | Flagellum | B7L8V5 | Flagellar hook-length control protein
husec41_c27 | Flagellum | C8TTL8 | Flagellar protein FliJ
husec41_c27 | Flagellum | C8TTL7 | Flagellum-specific ATP synthase FliI
husec41_c27 | Flagellum | B7L8V2 | Flagellar biosynthesis protein
husec41_c27 | Flagellum | C8TTL5 | Flagellar motor switching and energizing component protein FliG
husec41_c27 | Flagellum | C8TTL4 | Flagellar basal-body MS-ring and collar protein FliF
husec41_c518 | Flagellum | C8TTK6 | Flagellar protein FliS, potentiates polymerization
husec41_c518 | Flagellum | D3GXH8 | Flagellar protein FliT
husec41_c591 | Flagellum | B7LGV4 | Flagellar brake protein YcgR (Cyclic di-GMP binding protein YcgR)
husec41_c677 | Flagellum | C8THR0 | Predicted lateral flagellar system protein
husec41_c723 | Flagellum | C8TTM2 | Flagellar motor switching and energizing component FliN
husec41_c723 | Flagellum | B7L8V9 | Flagellar biosynthesis protein
husec41_c723 | Flagellum | B5YRX7 | Flagellar biosynthetic protein FliP
husec41_c723 | Flagellum | C8TTM5 | Flagellar biosynthesis protein FliQ
husec41_c723 | Flagellum | B7L8W2 | Flagellar biosynthetic protein fliR
husec41_c751 | Flagellum | C8TUT4 | DNA-binding transcriptional repressor LrhA of flagellar, motility and chemotaxis genes
husec41_c807 | Flagellum | C8TMI2 | Predicted outer membrane lipoprotein
husec41_c1479 | Flagellum | C8UC56 | Predicted flagellar export pore protein FlhA
husec41_c1940 | Immune system interaction | B7N797 | Surface presentation of antigens protein
husec41_c1047 | Invasin | D3GYK0 | Putative invasin
husec41_c215 | Invasin | C8THZ4 | Putative invasin
husec41_c540 | Invasin | Q8Z7G3 | Putative invasin
husec41_c955 | Invasin | D3GYK0 | Putative invasin
husec41_c1172 | Pathogenesis | B7LBK5 | Putative uncharacterized protein
husec41_c2051 | Pathogenesis | C8TGA6 | Conserved predicted protein
husec41_rep_c2406 | Pathogenesis | B7LF41 | Putative uncharacterized protein
husec41_c1167 | Secretion system | C8TGA7 | Type III secretion system lipoprotein EprK
husec41_c146 | Secretion system | B7LG99 | Putative secretion pathway M-type protein, membrane anchored
husec41_c146 | Secretion system | B7LGA0 | Putative secretion pathway protein, L-type protein
husec41_c146 | Secretion system | B7LGA1 | Putative type II secretion protein (GspK-like)
husec41_c146 | Secretion system | B7LG98 | Putative uncharacterized protein
husec41_c146 | Secretion system | B7LGA2 | Putative type II secretion protein (GspI-like)
husec41_c146 | Secretion system | B7LGA4 | Putative uncharacterized protein
husec41_c146 | Secretion system | B7LGA5 | Putative general secretion pathway protein G (EpsG-like)
husec41_c146 | Secretion system | B7LGA3 | General secretion pathway protein F
husec41_c146 | Secretion system | D3GVL9 | Type II secretion system protein E
husec41_c146 | Secretion system | B7LGA8 | General secretion pathway protein D
husec41_c146 | Secretion system | B7LGA9 | Putative secretion pathway protein, C-type protein
husec41_c1468 | Secretion system | D8AT29 | Type III secretion system protein PrgH-EprH
husec41_c1538 | Secretion system | B7LF39 | Putative Type III secretion EprH protein
husec41_c1598 | Secretion system | C8TGB3 | Type III secretion protein EpaQ
husec41_c1598 | Secretion system | B7LF43 | Putative Type III secretion protein EpaR
husec41_c16 | Secretion system | D3GUQ5 | Microcin H47
husec41_c16 | Secretion system | D3GUQ6 | Microcin H47 immunity protein
husec41_c1639 | Secretion system | C8TGF0 | T3SS effector-like protein EspX-homolog
husec41_c1643 | Secretion system | E9Z029 | DotU family protein type IV/VI secretion system protein
husec41_c1674 | Secretion system | C8U8E4 | T3SS effector-like protein EspR-homolog
husec41_c1783 | Secretion system | D3GUZ5 | Putative type VI secretion protein
husec41_c1791 | Secretion system | Q3SBC5 | EpaS1
husec41_c1791 | Secretion system | B7LF43 | Putative Type III secretion protein EpaR
husec41_c1849 | Secretion system | D3GV03 | Putative type VI secretion protein
husec41_c1983 | Secretion system | D3GV03 | Putative type VI secretion protein
husec41_c2008 | Secretion system | D3GV03 | Putative type VI secretion protein
husec41_c2036 | Secretion system | D3GV03 | Putative type VI secretion protein
husec41_c2178 | Secretion system | D3GUY9 | Putative type VI secretion protein
husec41_c323 | Secretion system | D3GUZ0 | Putative type VI secretion protein
husec41_c323 | Secretion system | D3GUZ1 | Putative type VI secretion protein
husec41_c333 | Secretion system | C8THD4 | Assembly protein HofC in type IV pilin biogenesis, transmembrane protein
husec41_c343 | Secretion system | C8TKE9 | Sec-independent protein translocase protein tatA/E homolog 1
husec41_c539 | Secretion system | Q3SBB5 | EpaO
husec41_c542 | Secretion system | B7LF47 | Putative type III secretion system protein EpaP

husec41_c576

Secretion system

B5YZD3

Secretion monitor protein

husec41_c716

Secretion system

C8TIZ8

SecYEG protein translocase auxillary subunit SecD

husec41_c716

Secretion system

C8TIZ9

SecYEG protein translocase auxillary subunit SecF

husec41_c72

Secretion system

D3GUQ1

Probable microcin H47 secretion/processing ATP-binding protein (EC 3.4.22.-)

husec41_c734

Secretion system

C8TIZ8

SecYEG protein translocase auxillary subunit SecD

husec41_c734

Secretion system

C8TIZ7

SecYEG protein translocase auxillary subunit

husec41_c771

Secretion system

D3GU38

Putative type VI secretion protein

husec41_c813

Secretion system

C8UKR4

T3SS effector-like protein EspL-homolog

husec41_c854

Secretion system

D3GSV7

Type III secretion system protein

husec41_c854

Secretion system

C8UAK5

Type III secretion protein EprJ

husec41_c862

Secretion system

D3GUY9

Putative type VI secretion protein

husec41_c929

Secretion system

D3GUZ5

Putative type VI secretion protein

husec41_c975

Secretion system

C8TKY1

Sec-independent protein translocase protein tatA/E homolog 2

husec41_c335

Secretion system?

C8TVQ3

Predicted transglycosylase

husec41_c553

Secretion System?

B7L5T3

Putative HlyD family secretion protein

husec41_c902

virulence, plasmid

D3H5A9

Virulence protein required for expression/correct membrane localisation of IcsA (VirG)

husec41_c1048

fimbria

B7L7K6

Putative major fimbrial subunit FmlA

husec41_c1130

fimbria

B7LAC4

Putative uncharacterized protein ybgQ

husec41_c1130

fimbria

C8TKM4

Putative fimbrial-like protein

husec41_c123

fimbria

C8TI53

Predicted outer membrane protein

husec41_c1256

fimbria

C8THB2

Putative fimbrial protein

husec41_c1279

fimbria

C8U6Q7

Putative fimbrial subunit protein

husec41_c128

fimbria

B7L7K6

Putative major fimbrial subunit FmlA

husec41_c128

fimbria

C8TQ64

Outer membrane usher protein FimD

husec41_c128

fimbria

B7L7K2

Putative fimbrial-like adhesin exported protein

husec41_c128

fimbria

B7L7K1

Putative fimbrial-like exported adhesin protein

husec41_c1369

fimbria

C8TL83

Putative fimbrial protein

husec41_c1400

fimbria

C8U6Q8

Putative fimbrial subunit protein

husec41_c1400

fimbria

B7LBJ0

Putative minor fimbrial subunit

husec41_c1503

fimbria

C8THG4

Putative fimbrial protein

husec41_c1503

fimbria

C8THG3

Putative fimbrial protein

husec41_c1527

fimbria, plasmid

P46005

Outer membrane usher protein AggC

husec41_c1592

fimbria

C8TMJ0

Putative fimbrial usher protein

husec41_c1651

fimbria

C8TUY3

Putative fimbrial-like protein

husec41_c1651

fimbria

C8TUY2

Outer membrane usher protein

husec41_c1675

fimbria

D3H3A6

Fimbrial outer membrane usher protein

husec41_c1701

fimbria, plasmid

P46005

Outer membrane usher protein AggC

husec41_c1953

fimbria, plasmid

B7LWW0

14 kDa aggregative adherence fimbriae I protein (Modular protein)

husec41_c202

fimbria

C8TU35

Putative outer membrane protein

husec41_c202

fimbria

B7L9Y3

Putative fimbrial-like adhesin protein

husec41_c2205

fimbria, plasmid

P46005

Outer membrane usher protein AggC

husec41_c279

fimbria

C8TI53

Predicted outer membrane protein

husec41_c353

fimbria

D3H3A6

Fimbrial outer membrane usher protein

husec41_c487

fimbria

B7L8D9

Putative fimbrial-like adhesin protein

husec41_c498

fimbria

C8TUY2

Outer membrane usher protein

husec41_c507

fimbria, plasmid

P46005

Outer membrane usher protein AggC

husec41_c514

fimbria

C8THG6

Probable outer membrane porin protein

husec41_c636

fimbria

C8TMJ2

Putative fimbrial adhesin protein

husec41_c636

fimbria

C8TMJ1

Putative minor fimbrial subunit

husec41_c636

fimbria

C8TM56

Putative outer membrane usher protein

husec41_c636

fimbria

C8TMI8

Putative fimbrial major protein

husec41_c679

fimbria

C8THG8

Putative fimbrial-like protein

husec41_c775

fimbria, plasmid

P46005

Outer membrane usher protein AggC

husec41_c777

fimbria

C8THG4

Putative fimbrial protein

husec41_c777

fimbria

C8U1K8

Putative fimbrial protein

husec41_c777

fimbria

C8THG6

Probable outer membrane porin protein

husec41_c92

fimbria

C8TQ64

Outer membrane usher protein FimD

husec41_c972

Fimbria

Q8Z2Q1

Probable fimbrial subunit protein

husec41_c98

fimbria

C8TH67

Predicted outer membrane usher protein

husec41_rep_c2402

fimbria

D3GT73

Putative fimbrial protein



Genes involved in metal resistance

This strain carries a set of genes involved in metal resistance.

It may bear a mercury resistance plasmid. The predicted proteins in Table 2 are all located on the same contig and are probably plasmid-borne, forming a functional operon in which a MerR family regulator is oriented divergently from the other components of the operon.

Table 2: Mercury resistance plasmid genes

| Contig ID | tags | Similar to | Protein names |
|---|---|---|---|
| husec41_c784 | Mercuric resistance | C8UQM8 | Mercuric resistance operon regulatory protein MerR |
| husec41_c784 | Mercuric resistance, plasmid | C8UQM9 | Mercuric ion transport protein MerT |
| husec41_c784 | Mercuric resistance | D3H375 | Mercuric ion transport protein |
| husec41_c784 | Mercuric resistance, plasmid | Q0ZKU6 | Mercuric resistance protein MerC (Mercury resistance operon transport protein MerC) (Putative uncharacterized protein) |
| husec41_c784 | Mercuric | Q935L3 | Putative mercuric reductase (EC 1.16.1.1) |
| husec41_c784 | Mercuric resistance | D3H372 | MerR-family transcriptional regulator |
| husec41_c784 | plasmid | Q5J458 | Urf2 |

Table 3 lists the genes involved in tellurium resistance.


Table 3: Tellurium resistance genes

| Contig ID | tags | Similar to | Protein names |
|---|---|---|---|
| husec41_c1152 | Tellurium | C8TNH6 | Putative tellurium resistance protein TerC |
| husec41_c1243 | Tellurium | C8TNH3 | Putative tellurium resistance protein TerZ |
| husec41_c1243 | Tellurium | C8TNH4 | Putative tellurium resistance protein TerA |
| husec41_c1243 | Tellurium | C8TNH5 | Putative tellurium resistance protein TerB |
| husec41_c1512 | Tellurium | C8TX21 | Putative tellurium resistance protein TerF |
| husec41_c1713 | Tellurium | C8TNH6 | Putative tellurium resistance protein TerC |
| husec41_c1713 | Tellurium | C8TNH7 | Putative tellurium resistance protein TerD |
| husec41_c273 | Tellurium resistance | Q19NM8 | TerY3 |
| husec41_c2960 | Tellurium resistance, plasmid | Q19NL2 | TerE |
| husec41_c750 | Tellurium resistance | Q19NM7 | TerY2 |
| husec41_c750 | Tellurium resistance, plasmid | P75012 | Tellurium resistance protein TerX |
| husec41_c750 | Tellurium resistance | Q19NM5 | TerY1 |
| husec41_c750 | Tellurium resistance, plasmid | P75010 | Tellurium resistance protein terW |
| husec41_c794 | Tellurium | C8TNH8 | Putative tellurium resistance protein TerE |

In addition to the genes involved in mercury and tellurium resistance, we have predicted and annotated many genes in this genome involved in resistance to other metals, such as cobalt and copper. See the complete annotation at:

Escherichia coli EHEC Germany outbreak preliminary manual functional annotation

Based on the preliminary results of our semi-automated annotation method, we have selected some predicted proteins with potential implications for pathogenicity and virulence. There are 33 predicted genes annotated as toxins, and we have found three putative hemolysins: Hemolysin E, a putative hemolysin expression modulating protein, and a channel protein of the hemolysin III family. We have found 31 predicted genes that could be related to resistance to specific antibiotics and other compounds: beta-lactams, aminoglycosides, macrolides, polymyxin, tetracycline, fosfomycin and deoxycholate, novobiocin, chloramphenicol, bicyclomycin, norfloxacin and enoxacin, and 6-mercaptopurine. This strain is rich in proteins related to adhesion, secretion systems, pathogenicity and virulence. It seems to have a restriction-modification system, many proteins involved in iron transport and utilization (siderophores such as aerobactin and enterobactin), lysozyme, one inhibitor of pancreatic serine proteases, proteins involved in anaerobic respiration, antimicrobial peptides, and proteins involved in quorum sensing and biofilm formation, all of which could confer a competitive advantage on this strain.


**Fast manual annotation**

Based on the preliminary results of our semi-automated annotation method, we have manually reviewed the annotation and tagged the principal genes and functions. This preliminary tagging was carried out by analyzing the annotation of each predicted protein with respect to specific points of interest: toxins, hemolysins, antibiotic resistance, pathogenicity, adhesion, plasmid, phage and other features.

We have selected and clustered genes with specific functions that are especially important from a human health perspective. The tagged genes are displayed in the following simplified tables. The complete annotation tables are available at:

http://www.era7bioinformatics.com/docs/EHEC_E_COLI_GERMANY_OUTBREAK_Annotation_Era7Bioinformatics_v1562011.xls

**Table 1: Toxins**

This table lists predicted proteins whose annotations are related to toxin function.

Contig ID

Gene ID

tags

Similar to

Protein names

husec41_c1134

36180

Toxin

Q83Z99

Putative acyltransferase MchD

husec41_c1252

46188

Toxin

D8CGU2

Toxin-antitoxin system, toxin component, PIN family

husec41_c136

40883

Toxin

D7YRT1

Toxin-antitoxin system, antitoxin component, Xre family

husec41_c145

66526

Toxin

C8THQ6

Predicted antitoxin of YafQ-DinJ toxin-antitoxin system

husec41_c145

54284

Toxin

B7LHF6

Toxin of the YafQ-DinJ toxin-antitoxin system

husec41_c1518

5119

Toxin

Q8VSL2

Serine protease sepA autotransporter (EC 3.4.21.-) [Cleaved into: Serine protease sepA; Serine protease sepA translocator]

husec41_c1554

45805

Toxin

C1NDL5

Secreted autotransporter toxin Sat

husec41_c1932

50684

Toxin

B7LBW0

Serine protease pic (ShMu)

husec41_c2009

41962

Toxin

D8ET41

Toxin-antitoxin system protein

husec41_c2132

76041

Toxin

B7MC95

Vacuolating autotransporter toxin

husec41_c291

110361

Toxin

B7LBY5

Serine protease pet (Plasmid-encoded toxin pet) (EC 3.4.21.72)

husec41_c380

110360

Toxin

B7LBY5

Serine protease pet (Plasmid-encoded toxin pet) (EC 3.4.21.72)

husec41_c482

84923

Toxin

C8TUV8

Membrane protein

husec41_c543

42847

Toxin

E6ARB5

Toxin-antitoxin system, antitoxin component, Xre family

husec41_c567

110363

Toxin

B7LBY5

Serine protease pet (Plasmid-encoded toxin pet) (EC 3.4.21.72)

husec41_c581

45751

Toxin

E7J0S6

Putative shET2 enterotoxin

husec41_c59

63485

Toxin

C8TP46

Toxin ChpB of the ChpB-ChpS toxin-antitoxin system

husec41_c672

46347

Toxin

D7X7P6

ShET2 enterotoxin, region

husec41_c69

41563

Toxin

D8BT94

Toxin-antitoxin system, antitoxin component, HicB family

husec41_c74

69806

Toxin

C8TTV0

Toxin of the YoeB-YefM toxin-antitoxin system

husec41_c74

68869

Toxin

A1ACN0

Antitoxin of the YoeB-YefM toxin-antitoxin system

husec41_c746

89468

Toxin

C8TG29

Toxin ChpA

husec41_c782

5124

Toxin

Q8VSL2

Serine protease sepA autotransporter (EC 3.4.21.-) [Cleaved into: Serine protease sepA; Serine protease sepA translocator]

husec41_c797

36434

Toxin

D7ZZI1

Toxin-antitoxin system, antitoxin component, AbrB family

husec41_rep_c2274

50678

Toxin

B7LBW0

Serine protease pic (ShMu)

husec41_rep_c2292

59806

Toxin

B7LBU0

Toxin of the YeeV-YeeU toxin-antitoxin system

husec41_rep_c2292

25611

Toxin

E3PD62

Putative antitoxin

husec41_rep_c2295

19119

Toxin

B3HJU4

Serine proteAse eata (EC 3.4.21.-)

husec41_rep_c2297

108631

Toxin

C6UP08

Shiga toxin II subunit B

husec41_rep_c2297

42022

Toxin

B3BQ93

Shiga toxin subunit A (EC 3.2.2.22)

husec41_rep_c2359

5123

Toxin

Q8VSL2

Serine protease sepA autotransporter (EC 3.4.21.-) [Cleaved into: Serine protease sepA; Serine protease sepA translocator]

husec41_rep_c2607

104486

Toxin

B7MN77

Toxin of the YeeV-YeeU toxin-antitoxin system

husec41_rep_c2848

5126

Toxin

Q8VSL2

Serine protease sepA autotransporter (EC 3.4.21.-) [Cleaved into: Serine protease sepA; Serine protease sepA translocator]

husec41_c1549

66741

Toxin?

D3QPV9

Small toxic membrane polypeptide

husec41_c685

78980

Toxin?

E8YB84

Hok/gef cell toxic protein

husec41_c201

19118

Toxin?

B3HJU4

Serine proteAse eata (EC 3.4.21.-)

Table 2: Hemolysins and heme metabolism related proteins

| Contig ID | tags | Similar to | Protein names |
|---|---|---|---|
| husec41_c1375 | Hemolysin | B5YXK6 | Hemolysin E, chromosomal |
| husec41_c1786 | Hemolysin | Q0TDJ8 | Putative hemolysin expression modulating protein |
| husec41_c604 | Hemolysin | D3RHI6 | Channel protein, hemolysin III family |
| husec41_c1103 | Heme | C8U6D2 | Heme lyase, CcmH subunit |
| husec41_c1469 | Heme | C8U6D2 | Heme lyase, CcmH subunit |
| husec41_c1469 | Heme | C8TUD4 | Heme lyase, CcmF subunit |
| husec41_c1914 | Heme | C8TUD7 | Heme exporter protein C |
| husec41_c194 | Heme | C8TTP2 | Sulfoxide reductase heme-binding subunit yedZ (Flavocytochrome yedZ) |
| husec41_c231 | Heme | D3GXL9 | Sulfoxide reductase heme-binding subunit yedZ (Flavocytochrome yedZ) |
| husec41_c458 | Heme | C8TIQ7 | Siroheme synthase |
| husec41_c656 | Heme | C8TMW6 | Heme lyase, NrfG subunit |
| husec41_c656 | Heme | C8TMW5 | Heme lyase, NrfF subunit |
| husec41_c656 | Heme | C8TMW4 | Heme lyase, NrfE subunit |
| husec41_c849 | Heme | B7LAM4 | Heme exporter subunit ; ATP-binding component of ABC superfamily |
| husec41_c849 | Heme | B7LAM3 | Heme exporter subunit ; membrane component of ABC superfamily |
| husec41_c849 | Heme | C8TUD7 | Heme exporter protein C |

**Table 3: Antibiotic resistance**

Table 3 lists genes involved in resistance to specific antibiotics, together with genes encoding efflux pumps and multidrug resistance proteins that could give this strain additional antibiotic resistance capabilities.

Contig ID

tags

Similar to

Protein names

husec41_c1261

Aminoglycoside resistance

C8TVG3

Aminoglycoside/multidrug efflux system protein AcrD

husec41_c177

Aminoglycoside resistance

C8TVG3

Aminoglycoside/multidrug efflux system protein AcrD

husec41_c626

Aminoglycoside resistance

C8TVG3

Aminoglycoside/multidrug efflux system protein AcrD

husec41_c697

Aminoglycoside resistance

C8TVG3

Aminoglycoside/multidrug efflux system protein AcrD

husec41_c738

Aminoglycoside resistance

C8TVG3

Aminoglycoside/multidrug efflux system protein AcrD

husec41_c312

Macrolide resistance

C8TLZ5

Fused macrolide transporter subunits of ABC superfamily: ATP-binding component/membrane component

husec41_c584

Macrolide resistance

C8TLZ5

Fused macrolide transporter subunits of ABC superfamily: ATP-binding component/membrane component

husec41_c312

Macrolide resistance

C8TLZ4

Macrolide transporter subunit, membrane fusion protein component

husec41_c48

Penicillin resistance

B7LBI2

Penicillin-insensitive murein endopeptidase (EC 3.4.24.-) (D-alanyl-D-alanine-endopeptidase) (DD-endopeptidase)

husec41_c1681

Polymyxin resistance

B7LAS5

Polymyxin resistance protein B

husec41_c497

Polymyxin resistance

D6I9J9

Polymyxin resistance protein PmrM

husec41_c2170

beta-lactam resistance

C8UQP5

TEM-1 beta-lactamase

husec41_c642

Tetracycline resistance

D3H382

Tetracycline resistance protein

husec41_c559

beta-lactam resistance

C8TJ28

Regulator of penicillin binding proteins and beta-lactamase transcription

husec41_c1436

Tetracycline resistance

C8TNP7

Multidrug resistance protein mdtG

husec41_c55

Fosfomycin and deoxycholate resistance

C8TU05

Multidrug resistance protein mdtA (Multidrug transporter mdtA)

husec41_c1334

Novobiocin and deoxycholate resistance

C6V0N5

Multidrug resistance protein MdtB (Multidrug transporter MdtB)

husec41_c1334

Novobiocin and deoxycholate resistance

C8TU07

Multidrug resistance protein MdtC (Multidrug transporter MdtC)

husec41_c55

Novobiocin and deoxycholate resistance

C8TU07

Multidrug resistance protein MdtC (Multidrug transporter MdtC)

husec41_c1063

chloramphenicol resistance

B7L855

Multidrug resistance protein mdtL

husec41_c654

beta-lactam resistance

C8TNX4

Beta-lactamase/D-alanine carboxypeptidase AmpC

husec41_c229

beta-lactam resistance

C8TIW8

Beta-lactamase/D-alanine carboxypeptidase

husec41_c1120

beta-lactam resistance

B7L5H7

Beta-lactam resistance membrane protein

husec41_c30

beta-lactam resistance

Q6BBP7

Beta-lactamase (Beta-lactamase CTX-M-3) (CTX-M-3 extended-spectrum beta-lactamase) (Extended-spectrum class A beta-lactamase CTX-M-3)

husec41_rep_c2845

beta-lactam resistance

B2CD48

Beta-lactamase TEM (Fragment)

husec41_c132

Bicyclomycin resistance

B7LAK6

Bicyclomycin/multidrug efflux system

husec41_c720

Bicyclomycin resistance

C6V2S0

Bicyclomycin/multidrug efflux system

husec41_c1017

Polymyxin resistance

C8TUQ2

Bifunctional polymyxin resistance protein ArnA

husec41_c497

Polymyxin resistance

C8TUQ2

Bifunctional polymyxin resistance protein ArnA

husec41_c770

norfloxacin and enoxacin resistance

B7LFZ9

Multidrug resistance protein mdtH

husec41_c310

6-mercaptopurine resistance

C8TLE9

Purine ribonucleoside efflux pump nepI

husec41_c212

Antibiotic resistance?

B5Z494

Drug resistance transporter, Bcr/CflA subfamily

husec41_c642

Antibiotic resistance?

D3H381

Tetracycline repressor

husec41_c18

Antibiotic resistance?

C8TME8

Membrane fusion protein (MFP) component of efflux pump, signal anchor

husec41_c348

Antibiotic resistance?

B7LB31

Membrane fusion protein of efflux pump

husec41_c1372

Antibiotic resistance?

D3GVT1

Modulator of drug activity B

husec41_c1352

Antibiotic resistance?

B7LEA2

Multidrug efflux system

husec41_c348

Antibiotic resistance?

B7LB24

Multidrug efflux system component

husec41_c1670

Antibiotic resistance?

B7LHX4

Multidrug efflux system protein

husec41_c684

Antibiotic resistance?

C8TQ83

Multidrug efflux system protein

husec41_c709

Antibiotic resistance?

C8UN45

Multidrug efflux system protein

husec41_c697

Antibiotic resistance?

C8TJ55

Multidrug efflux system protein AcrA

husec41_c1084

Antibiotic resistance?

C8TLV8

Multidrug efflux system protein Cmr

husec41_c75

Antibiotic resistance?

C8TFT3

Multidrug efflux system protein EmrB

husec41_c654

Antibiotic resistance?

C8TNX2

Multidrug efflux system protein SugE

husec41_c214

Antibiotic resistance?

B5YX79

Multidrug resistance protein D

husec41_c212

Antibiotic resistance?

C8TSE0

Multidrug resistance protein mdtK (Multidrug-efflux transporter)

husec41_c288

Antibiotic resistance?

Q3I3P0

ORF3-QacEdelta1 fusion protein

husec41_c656

Antibiotic resistance?

B7LB29

Outer membrane factor of efflux pump

husec41_c407

Antibiotic resistance?

C8TJM6

Predicted antibiotic transporter

husec41_c993

Antibiotic resistance?

C8TTQ9

Predicted multidrug efflux system

husec41_c1456

Antibiotic resistance?

C8TV77

Predicted multidrug efflux system protein Y

husec41_c481

Antibiotic resistance?

C8TV77

Predicted multidrug efflux system protein Y

husec41_c1726

Antibiotic resistance?

C8ULT4

Predicted outer membrane factor of efflux pump

husec41_c1014

Antibiotic resistance?

C8TSY9

Predicted transporter

husec41_c497

Antibiotic resistance?

B7LAS1

Probable 4-deoxy-4-formamido-L-arabinose-phosphoundecaprenol deformylase ArnD (EC 3.5.1.n3)

husec41_c1334

Antibiotic resistance?

D3GZE1

Putative multidrug resistance protein mdtD

husec41_c464

Antibiotic resistance?

C8TU08

Putative multidrug resistance protein mdtD

husec41_c1656

Antibiotic resistance?

B7L8U8

Putative multidrug resistance protein; DLP12 prophage

husec41_c1277

Antibiotic resistance?

B7L7J5

Putative multidrug transporter fused subunits of ABC superfamily transporter: permease component; ATP-binding component

husec41_c225

Antibiotic resistance?

C8TUQ0

UDP-4-amino-4-deoxy-L-arabinose--oxoglutarate aminotransferase (EC 2.6.1.87) (UDP-(beta-L-threo-pentapyranosyl-4''-ulose diphosphate) aminotransferase) (UDP-4-amino-4-deoxy-L-arabinose aminotransferase)

husec41_c1017

Antibiotic resistance?

C6V2Y6

Undecaprenyl-phosphate 4-deoxy-4-formamido-L-arabinose transferase (EC 2.7.8.30) (Undecaprenyl-phosphate Ara4FN transferase)

husec41_c225

Antibiotic resistance?

D3H1C1

Undecaprenyl-phosphate 4-deoxy-4-formamido-L-arabinose transferase (EC 2.7.8.30) (Undecaprenyl-phosphate Ara4FN transferase)

husec41_c181

Antibiotic resistance?

B7LAR9

Undecaprenyl-phosphate 4-deoxy-4-formamido-L-arabinose transferase (EC 2.7.8.30) (Undecaprenyl-phosphate Ara4FN transferase) (Ara4FN transferase)

husec41_c908

Antibiotic resistance?

C8TMT4

DNA-damage-inducible SOS response protein DinF

husec41_c481

Antibiotic resistance?

C8TV78

EmrKY-TolC multidrug resistance efflux pump protein K, membrane fusion protein component

husec41_c2258

Antibiotic resistance?

B7LBS1

EmrKY-TolC multidrug resistance efflux pump, membrane fusion protein component

husec41_c2050

Antibiotic resistance?

C8U6F0

Fused predicted multidrug transport subunits of ABC superfamily: membrane component/ATP-binding component

husec41_c931

Antibiotic resistance?

C8TUF0

Fused predicted multidrug transport subunits of ABC superfamily: membrane component/ATP-binding component

husec41_c449

Antibiotic resistance?

C8TJ42

Fused predicted multidrug transporter subunits of ABC superfamily: ATP-binding components

husec41_c449

Antibiotic resistance?

C8TJ41

Fused predicted multidrug transporter subunits of ABC superfamily: ATP-binding components

husec41_c897

Antibiotic resistance?

C8U8G6

Fused predicted multidrug transporter subunits of ABC superfamily: membrane component/ATP-binding component

husec41_c1382

Antibiotic resistance?

B7L9Q8

Wzx

husec41_c2032

Antibiotic resistance?

Q93NP6

Wzx

husec41_c2082

Antibiotic resistance?

Q93NP6

Wzx

husec41_c2085

Antibiotic resistance?

B7L9Q8

Wzx

husec41_c832

Antibiotic resistance?

C8TRP6

DNA-binding transcriptional dual activator MarA of multiple antibiotic resistance

husec41_c832

Antibiotic resistance?

C8TRP5

DNA-binding transcriptional repressor MarR of multiple antibiotic resistance

husec41_c795

Antibiotic resistance?

C8THH6

Fused glycosyl transferase and transpeptidase

husec41_c440

Antibiotic resistance?

C8TJI6

Fused penicillin-binding protein 1a: murein transglycosylase/murein transpeptidase

husec41_c1203

Antibiotic resistance?

C8TJV0

Multidrug resistance efflux transporter MdtE

husec41_c1099

Antibiotic resistance?

D3GR96

Penicillin-binding protein 1B [includes: penicillin-insensitive transglycosylase; penicillin sensitive transpeptidase] (EC 2.4.1.129) (EC 3.4.-.-)

husec41_c1011

Antibiotic resistance?

C6UUQ5

p-hydroxybenzoic acid efflux pump subunit AaeA (pHBA efflux pump protein A)

husec41_c1123

Antibiotic resistance?

C8TIE2

p-hydroxybenzoic acid efflux pump subunit AaeA (pHBA efflux pump protein A)

husec41_c1011

Antibiotic resistance?

C8TIE1

p-hydroxybenzoic acid efflux pump subunit AaeB (pHBA efflux pump protein B)

husec41_c150

Antibiotic resistance?

C8TIE1

p-hydroxybenzoic acid efflux pump subunit AaeB (pHBA efflux pump protein B)

husec41_c228

Antibiotic resistance?

C6UU60

Predicted methyl viologen efflux pump

husec41_c1148

Antibiotic resistance?

C8TL54

Predicted multidrug or homocysteine efflux system protein HsrA

husec41_c124

Antibiotic resistance?

B7L8U8

Putative multidrug resistance protein; DLP12 prophage

Escherichia coli EHEC Germany outbreak semi-automated annotation

Semi-automated annotation using BG7 system

We carried out a semi-automated annotation of the genome sequenced by BGI (6-2-2011, http://www.bgisequence.com/eu/index.php?cID=194 ) and assembled with MIRA by Nick Loman (6-2-2011, http://pathogenomics.bham.ac.uk/blog/2011/06/ehec-genome-assembly/ ).

Our system, BG7 (Bacterial Genome annotation of Era7 Bioinformatics; https://registration.hinxton.wellcome.ac.uk/display_info.asp?id=227 , http://www.slideshare.net/marina_manrique/bg7-a-new-system-for-bacterial-genome-annotation-designed-for-ngs-data ), predicts ORFs and annotates them based on fragments of similarity to UniProt proteins.

In contrast to other annotation pipelines, where finding ORFs is the first step and annotation follows, BG7 first searches for protein similarity and then defines each ORF by searching for start and stop signals. It is specifically designed for annotating prokaryotic genomes obtained from NGS data, since it handles the principal errors of these technologies: false indels in homopolymer regions and substitutions. Annotation systems based on initial, exact ORF detection can miss ORFs because these sequencing errors may introduce or remove stop codons and alter start signals. BG7 is also designed to work with genomes fragmented into many contigs, which addresses the problem of detecting incomplete genes at the ends of contigs. The system is especially suitable for detecting rare genes similar to proteins from taxonomically distant organisms. BG7 takes advantage of cloud computing to perform extensive computing tasks in a reasonable time: the annotation of a 3 Mb bacterial genome can be performed in less than 12 hours.
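The similarity-first idea can be sketched in a few lines. The Python snippet below is only an illustration under simplifying assumptions, not BG7's actual code: the function name `extend_hit_to_orf`, the forward-strand-only handling and the set of start codons (ATG, GTG, TTG) are all assumptions made here, and the frameshift tolerance described above is left out.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODONS = {"ATG", "GTG", "TTG"}  # assumed set of bacterial start codons


def extend_hit_to_orf(contig, hit_start, hit_end):
    """Grow an in-frame similarity hit [hit_start, hit_end) into an ORF.

    Hypothetical helper: forward strand only, no frameshift handling.
    Returns (orf_start, orf_end, has_start, has_stop); a False flag marks
    a gene that runs off the contig or lacks a canonical signal.
    """
    # Walk upstream codon by codon until a start codon or a stop codon.
    orf_start, has_start = hit_start, False
    pos = hit_start
    while pos >= 0:
        codon = contig[pos:pos + 3]
        if codon in STOP_CODONS:
            break  # an upstream in-frame stop bounds the ORF
        orf_start = pos
        if codon in START_CODONS:
            has_start = True
            break
        pos -= 3

    # Walk downstream codon by codon until a stop codon or the contig end.
    orf_end, has_stop = hit_end, False
    pos = hit_end
    while pos + 3 <= len(contig):
        codon = contig[pos:pos + 3]
        pos += 3
        orf_end = pos
        if codon in STOP_CODONS:
            has_stop = True
            break
    return orf_start, orf_end, has_start, has_stop


# Toy contig: the hit covers "CCCGGG"; the ORF grows to ATG ... TAG.
contig = "CCATGAAACCCGGGTTTTAGCC"
print(extend_hit_to_orf(contig, 8, 14))
```

Because the similarity hit is the seed, a gene cut off by the end of a contig is still reported, just with `has_start` or `has_stop` set to False, rather than being discarded for lacking a complete reading frame.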

Dataset of proteins for similarity-based ORF prediction

A set of 137063 proteins was selected as the reference protein set for BG7:

  • All representative proteins of Escherichia coli UniRef90 clusters from organisms whose names include the terms “EHEC” or “EAEC”
  • All UniProt proteins from bacteria that include the term “toxin” in any field
  • All UniProt proteins from bacteria that include the term “hemolysin” in any field
  • All the proteins from Salmonella typhi, Yersinia pestis and Shigella dysenteriae

The system searches the sequenced contigs for similarity to each protein of the set. These BLAST similarity results are the seeds for ORF prediction.
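The four selection criteria above can be read as a simple filter over protein records. The sketch below is a hedged illustration only: the `ProteinRecord` fields, the accessions and the `is_reference` function are hypothetical, the UniRef90 clustering step is reduced to a check on the organism name, and "any field" is approximated by the organism and description text.

```python
from dataclasses import dataclass

WHOLE_PROTEOME_SPECIES = ("Salmonella typhi", "Yersinia pestis", "Shigella dysenteriae")
KEYWORDS = ("toxin", "hemolysin")


@dataclass
class ProteinRecord:
    accession: str      # made-up identifiers in the example below
    organism: str
    description: str
    is_bacterial: bool


def is_reference(rec):
    """Return True if the record matches any of the four selection criteria."""
    org = rec.organism
    # Criterion 1 (simplified): E. coli strains labelled EHEC or EAEC.
    if org.startswith("Escherichia coli") and ("EHEC" in org or "EAEC" in org):
        return True
    # Criteria 2 and 3: bacterial proteins mentioning a keyword in their text.
    text = f"{org} {rec.description}".lower()
    if rec.is_bacterial and any(k in text for k in KEYWORDS):
        return True
    # Criterion 4: every protein from the three additional species.
    return org.startswith(WHOLE_PROTEOME_SPECIES)


records = [
    ProteinRecord("X0001", "Escherichia coli (strain 55989 / EAEC)", "Putative fimbrial protein", True),
    ProteinRecord("X0002", "Escherichia coli (strain K12)", "Siroheme synthase", True),
    ProteinRecord("X0003", "Serratia marcescens", "Hemolysin", True),
    ProteinRecord("X0004", "Yersinia pestis", "Heme exporter protein C", True),
]
selected = [r.accession for r in records if is_reference(r)]
print(selected)
```

Note that a plain substring match on "toxin" also picks up antitoxins, which is consistent with the antitoxin entries that appear under the Toxin tag in the tables.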

RESULTS

We have predicted 6327 genes: 6156 encoding proteins and 171 corresponding to ribosomal RNA and tRNA.

Only 1326 of the 6156 protein-encoding genes have canonical start and stop codons and contain neither frameshifts nor intragenic stop codons. 2479 of the 6156 predicted protein-encoding genes include a frameshift or an intragenic stop codon in their sequences, probably caused by errors inherent to the sequencing technology. However, our system tolerates the errors of massive sequencing technologies and has been able to detect a rich set of genes even from very preliminary sequencing results.
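To put these counts in proportion, the short calculation below turns them into percentages of the protein-encoding genes. The numbers are the ones reported above; the variable names are ours, and the interpretation of the remaining genes (neither fully canonical nor carrying a frameshift or intragenic stop, for example genes lacking a canonical start or stop) is an assumption, since the text does not describe that group.

```python
total_genes = 6327
protein_genes = 6156
rna_genes = 171
clean = 1326        # canonical start and stop, no frameshift or intragenic stop
with_errors = 2479  # at least one frameshift or intragenic stop codon

# The two gene classes add up to the reported total.
assert protein_genes + rna_genes == total_genes


def share(count, total=protein_genes):
    """Percentage of the protein-encoding genes, rounded to one decimal."""
    return round(100 * count / total, 1)


# Genes in neither category (assumed: missing a canonical start or stop).
other = protein_genes - clean - with_errors

print(f"clean: {share(clean)}%")
print(f"frameshift or intragenic stop: {share(with_errors)}%")
print(f"other: {other} genes, {share(other)}%")
```

Roughly one gene in five is free of any anomaly and about two in five carry a frameshift or intragenic stop, which illustrates how much an exact-ORF approach could lose on this preliminary assembly.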

Some of the detected proteins are probably fragmented, and some could appear as two different predicted genes if their parts lie in different contigs.

We have analyzed the taxonomic origin of the proteins responsible for the prediction of the detected genes. Table 1, Figure 1 and Figure 2 display the results of this analysis.
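A tally of this kind is straightforward to compute once each predicted gene is paired with the source organism of the reference protein that seeded it. The sketch below assumes such a list is available; the function name `tally_organisms` and the toy input are hypothetical and do not reproduce the real data.

```python
from collections import Counter


def tally_organisms(seed_organisms):
    """Count predicted genes per source organism, most frequent first."""
    return Counter(seed_organisms).most_common()


# Toy input: one organism name per predicted gene (not the real data).
seed_organisms = (
    ["Escherichia coli O26:H11 (strain 11368 / EHEC)"] * 3
    + ["Escherichia coli (strain 55989 / EAEC)"] * 2
    + ["Salmonella typhi"]
)

for organism, count in tally_organisms(seed_organisms):
    print(f"{organism}\t{count}")
```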

Table 1: Taxonomic origin of **proteins responsible for the prediction of the detected genes**

Organism

number of proteins

Escherichia coli O26:H11 (strain 11368 / EHEC)

2810

Escherichia coli (strain 55989 / EAEC)

1166

Escherichia coli O44:H18 (strain 042 / EAEC)

339

Escherichia coli O103:H2 (strain 12009 / EHEC)

296

Escherichia coli

221

Escherichia coli O111:H- (strain 11128 / EHEC)

151

Escherichia coli O157:H7 (strain EC4115 / EHEC)

148

Escherichia coli O157:H7 (strain TW14359 / EHEC)

144

Escherichia coli (strain K12)

51

Salmonella typhi

51

Escherichia coli O1:K1 / APEC

50

Escherichia coli (strain UTI89 / UPEC)

40

Escherichia coli O81 (strain ED1a)

30

Yersinia pestis

29

Escherichia coli O139:H28 (strain E24377A / ETEC)

18

Escherichia coli B354

14

Escherichia coli O55:H7 (strain CB9615 / EPEC)

13

Escherichia coli O6:K15:H31 (strain 536 / UPEC)

13

Escherichia coli B088

12

Escherichia coli MS 119-7

12

Escherichia coli O6

12

Escherichia coli TA007

12

Escherichia coli (strain SE11)

11

Escherichia coli 1827-70

11

Escherichia coli MS 107-1

11

Escherichia coli O127:H6 (strain E2348/69 / EPEC)

11

Escherichia coli EPECa14

10

Escherichia coli M863

10

Escherichia coli MS 124-1

10

Escherichia coli B7A

9

Escherichia coli H120

9

Escherichia coli MS 117-3

9

Escherichia coli MS 198-1

9

Escherichia coli MS 21-1

9

Escherichia coli O157:H7

9

Shigella dysenteriae

9

Shigella flexneri 2a str. 2457T

9

Escherichia coli 3431

8

Escherichia coli LT-68

8

Escherichia coli MS 116-1

8

Escherichia coli MS 182-1

8

Escherichia coli O17:K52:H18 (strain UMN026 / ExPEC)

8

Escherichia coli O45:K1 (strain S88 / ExPEC)

8

Escherichia coli EC4100B

7

Escherichia coli MS 145-7

7

Escherichia coli MS 196-1

7

Escherichia coli MS 84-1

7

Escherichia coli (strain SMS-3-5 / SECEC)

6

Escherichia coli B185

6

Escherichia coli E128010

6

Escherichia coli E482

6

Escherichia coli FVEC1302

6

Escherichia coli MS 16-3

6

Escherichia coli MS 175-1

6

Escherichia coli MS 187-1

6

Escherichia coli MS 69-1

6

Escherichia coli O9:H4 (strain HS)

6

Escherichia sp. 3253FAA

6

Shigella boydii serotype 18 (strain CDC 3083-94 / BS512)

6

Shigella dysenteriae serotype 1 (strain Sd197)

6

Shigella flexneri

6

Escherichia coli E1520

5

Escherichia coli MS 110-3

5

Escherichia coli MS 115-1

5

Escherichia coli MS 45-1

5

Escherichia coli O157:H7 str. TW14588

5

Escherichia coli O7:K1 (strain IAI39 / ExPEC)

5

Enterobacter sakazakii (strain ATCC BAA-894)

4

Escherichia coli 1357

4

Escherichia coli 53638

4

Escherichia coli B171

4

Escherichia coli E110019

4

Escherichia coli E22

4

Escherichia coli F11

4

Escherichia coli MS 185-1

4

Escherichia coli MS 78-1

4

Escherichia coli MS 85-1

4

Escherichia coli O111:H-

4

Escherichia coli O55:H7 str. USDA 5905

4

Escherichia coli O78:H11 (strain H10407 / ETEC)

4

Escherichia fergusonii (strain ATCC 35469 / DSM 13698 / CDC 0568-73)

4

Citrobacter koseri (strain ATCC BAA-895 / CDC 4225-83 / SGSC4696)

3

Enterobacteria phage lambda (Bacteriophage lambda)

3

Escherichia albertii TW07627

3

Escherichia coli (strain ATCC 55124 / KO11)

3

Escherichia coli (strain B / BL21)

3

Escherichia coli (strain B / REL606)

3

Escherichia coli 1180

3

Escherichia coli 83972

3

Escherichia coli H263

3

Escherichia coli MS 57-2

3

Escherichia coli MS 60-1

3

Escherichia coli RN587/1

3

Escherichia sp. 1143

3

Salmonella typhimurium

3

Shigella boydii serotype 4 (strain Sb227)

3

Shigella flexneri serotype 5b (strain 8401)

3

Bacillus cereus G9241

2

Citrobacter youngae ATCC 29220

2

Enterobacteria phage CUS-3

2

Enterobacteria phage VT2phi_272

2

Escherichia coli (strain K12 / DH10B)

2

Escherichia coli (strain K12 / MC4100 / BW2952)

2

Escherichia coli 2362-75

2

Escherichia coli FVEC1412

2

Escherichia coli H489

2

Escherichia coli MS 146-1

2

Escherichia coli MS 200-1

2

Escherichia coli NC101

2

Escherichia coli O150:H5 (strain SE15)

2

Escherichia coli O157:H- str. H 2687

2

Escherichia coli O157:H7 str. 1125

2

Escherichia coli O157:H7 str. EC869

2

Escherichia coli O55:H7 str. 3256-97

2

Escherichia coli O8 (strain IAI1)

2

Escherichia coli TW10509

2

Salmonella choleraesuis

2

Serratia marcescens

2

Shigella flexneri CDC 796-83

2

Shigella sonnei (strain Ss046)

2

Citrobacter rodentium (strain ICC168) (Citrobacter freundii biotype 4280)

1

Citrobacter sp. 30_2

1

Cronobacter turicensis (strain DSM 18703 / LMG 23827 / z3032)

1

Enterobacteria phage H19B (Bacteriophage H19B)

1

Enterobacteria phage Sf6 (Shigella flexneri bacteriophage VI) (Bacteriophage SfVI)

1

Enterobacteria phage VT2-Sa (Bacteriophage VT2-Sa)

1

Erwinia amylovora (strain CFBP1430)

1

Escherichia coli (strain UM146)

1

Escherichia coli 101-1

1

Escherichia coli H252

1

Escherichia coli MS 153-1

1

Escherichia coli O157:H- str. 493-89

1

Escherichia coli O157:H7 str. 1044

1

Escherichia coli O157:H7 str. EC1212

1

Escherichia coli O157:H7 str. EC4042

1

Escherichia coli O157:H7 str. EC4486

1

Escherichia coli O157:H7 str. G5101

1

Escherichia coli O83:H1 (strain NRG 857C / AIEC)

1

Escherichia coli OR:K5:H- (strain ABU 83972)

1

Haemophilus ducreyi

1

Klebsiella pneumoniae (strain 342)

1

Klebsiella pneumoniae subsp. pneumoniae (strain ATCC 700721 / MGH 78578)

1

Klebsiella pneumoniae subsp. pneumoniae NTUH-K2044

1

Klebsiella variicola (strain At-22)

1

Pantoea sp. (strain At-9b)

1

Proteus mirabilis

1

Salmonella enterica subsp. enterica serovar Typhimurium str. TN061786

1

Salmonella paratyphi A (strain AKU_12601)

1

Shigella boydii ATCC 9905

1

Shigella dysenteriae 1617

1

Shigella dysenteriae CDC 74-1112

1

Shigella sonnei

1

Shigella sonnei 53G

1

Uncultured gamma proteobacterium HF0010_10D20

1

Uncultured Oceanospirillales bacterium HF4000_21D01

1

Vibrio cholerae serotype O1 (strain ATCC 39541 / Ogawa 395 / O395)

1

Wolbachia endosymbiont of Drosophila simulans

1

Yersinia pestis Pestoides A

1


Figure 1: Taxonomic origin of the proteins responsible for the prediction of the detected genes