We are the R&D group at era7bioinformatics.
Hi everyone! Christmas Eve is almost here and there’s still time for a last-minute present. Thanks to CloudFormation and this template I’m about to show you, Neo4j Server is now friends with AWS (Amazon Web Services) and together they bring you the opportunity of getting your own fresh Neo4j Server machine running in just a [...]
Hi everyone! A couple of days ago I published a post describing how to obtain cool GO annotation visualizations with Gephi + Bio4j. As an example I used data from one of the first assemblies of the EHEC genome, and today I was wondering: why not use the latest version of the BGI assembly we annotated [...]
Hi! This afternoon I finished a small project on the identification of microsatellites in DNA sequences. As with every new project I start, I looked for something that I hadn’t tried before, is worth learning, and is applicable to the needs of the specific project. These last few [...]
Today I found an interesting discussion on the Neo4j user list and found myself in the mood to write down a couple of related thoughts I have had in mind over the last few months. Here they are: (the titles are taken from the guidelines for building your domain model - Use reference and subreference nodes to organize entry [...]
We have just finished the automatic annotation of the assembly of the new strain that the HPA (Health Protection Agency, UK) made available yesterday (get the assembly file here: http://www.hpa-bioinformatics.org.uk/lgp/resource/454Scaffolds.fna). Once more we’ve used BG7 and the same set of reference proteins (137,063 proteins in total): the representative Uniprot proteins corresponding to all Uniref90 clusters for all Escherichia [...]
We’ve finished the automatic annotation of the third BGI assembly of the E. coli TY-2482 strain genome (get the assembly file here: ftp://ftp.genomics.org.cn/pub/Ecoli_TY-2482/Escherichia_coli_TY-2482.scaffold.20110610.fa.gz). As in the other annotations we’ve done so far, we used the BG7 system to annotate the genome, with the same set of reference proteins (137,063 proteins in total): The [...]
David Studholme detected a missed region in the EHEC TY-2482-v1 assembly (http://www.genomic.org.uk/blog/?p=523) that was also absent in TY-2482-v2. This is the region with the type VI secretion system surrounding the missed region detected by Studholme:
contig | Era7 geneID | ini | Era7 tags | Protein name
106 | 108864 | 2776 | secretion system VI | Putative uncharacterized protein
106 | 83591 | 6193 | secretion system [...]
As suggested by Kat Holt (http://bacpathgenomics.wordpress.com/2011/06/05/ehec-genomes-snp-locations/) and others, a plasmid very similar to pEC_Bactec is part of the genome of the recently sequenced EHEC H112180280 strain. The figure displays a simple alignment, obtained using the MAUVE Move Contigs tool, between the pEC_Bactec plasmid (above) and scaffolds 7 and 13 of the H112180280 genome (below): [...]
The HPA (Health Protection Agency, http://www.hpa.org.uk/) has just announced the sequence of an E. coli strain, H112180280. They sequenced the strain with 454 and have released the sff files, a FASTA file with the scaffolds, and the annotation (done by Anthony Underwood) in GenBank format (data available here: http://www.hpa-bioinformatics.org.uk/lgp/genomes). They got 13 scaffolds, 5405081 bp, 88748 Ns [...]
Analysing the automatic annotation we did of the second BGI assembly of the TY-2482 genome (see post here: http://blog.ohnosequences.com/2011/06/automatic-annotation-of-the-second-bgi-assembly-of-e-coli-ty-2482-genome/), we have found that this isolate has 3 restriction-modification systems. A Type I restriction-modification system is encoded in an operon in contig 42. The specific protein encoded by the gene 79712, the modification protein encoded by the [...]
Nature Precedings (2011).
Raquel Tobes, Marina Manrique, Pablo Pareja-Tobes, Eduardo Pareja-Tobes, Eduardo Pareja et al.
We have annotated the European outbreak E. coli EHEC genome sequenced by BGI (6-2-2011) and assembled with MIRA by Nick Loman (6-2-2011). Our system BG7, Bacterial Genome annotation of Era7 Bioinformatics, predicts ORFs and annotates them based on fragments of similarity with Uniprot proteins. We have predicted 6327 genes, 6156 encoding proteins and 171 corresponding to ribosomal and tRNA. Based on the preliminary results of our semi-automated method of annotation we have selected some predicted proteins with potential implications in pathogenicity and virulence. There are 33 predicted genes annotated as toxins and we have found three putative hemolysins: Hemolysin E, a putative hemolysin expression modulating protein and a channel protein, hemolysin III family. We have found 31 predicted genes that could be related to specific antibiotic resistance: beta-lactamic, aminoglycoside, macrolide, polymyxin, tetracycline, fosfomycin and deoxycholate, novobiocin, chloramphenicol, bicyclomycin, norfloxacin and enoxacin and 6-mercaptopurine. This strain is rich in adhesion, secretion systems, pathogenicity and virulence related proteins. It seems to have a restriction-modification system, many proteins involved in Fe transport and utilization (siderophores such as aerobactin and enterobactin), lysozyme, one inhibitor of pancreatic serine proteases, proteins involved in anaerobic respiration, antimicrobial peptides, and proteins involved in quorum sensing and biofilm formation that could confer competitive advantage to this strain.
Published using Mendeley: The bibliography manager for researchers
Expert Systems with Applications (2008). Volume: 34, Issue: 4. Pages: 2891-2895.
P Pareja-Tobes, D Pelta, A Sancho-Royo, J Verdegay et al.
We present formally a search space representation method in combinatorial optimization problems, as well as a tool for the visualization of the trajectories of heuristic algorithms based on this method. A software that develops the ideas of the formal method is presented. Several numerical examples are given.
1st International Student Symposium in Computational Biology (2005).
Pablo Pareja-Tobes, Marina Manrique, Eduardo Pareja-Tobes, Raquel Tobes, Eduardo Pareja et al.
Journal of immunology (Baltimore, Md. : 1950) (2007). Volume: 179, Issue: 1. Pages: 31-5.
Mattias Magnusson, Raquel Tobes, Jaime Sancho, Eduardo Pareja et al.
Bacterial DNA exerts immunostimulatory effects on mammalian cells via the intracellular TLR9. Although broad analysis of TLR9-mediated immunostimulatory potential of synthetic oligonucleotides has been developed, which kinds of natural bacterial DNA sequences are responsible for immunostimulation are not known. This work provides evidence that the natural DNA sequences named repetitive extragenic palindromic (REPs) sequences present in Gram-negative bacteria are able to produce innate immune system stimulation via TLR9. A strong induction of IFN-alpha production by REPs from Escherichia coli, Salmonella enterica, Pseudomonas aeruginosa, and Neisseria meningitidis was detected in splenocytes from 129 mice. In addition, the involvement of TLR9 in immune stimulation by REPs was confirmed using B6.129P2-Tlr9(tm1Aki) knockout mice. Considering the involvement of TLRs in Gram-negative septic shock, it is conceivable that REPs play a role in its pathogenesis. This study highlights REPs as a potential novel target in septic shock treatment.
Traffic (Copenhagen, Denmark) (2008). Volume: 9, Issue: 3. Pages: 325-37.
Carmen Alvarez-Dominguez, Fidel Madrazo-Toca, Lorena Fernandez-Prieto, Joël Vandekerckhove, Eduardo Pareja, Raquel Tobes, Maria Teresa Gomez-Lopez, Elida Del Cerro-Vadillo, Manuel Fresno, Francisco Leyva-Cobián, Eugenio Carrasco-Marín et al.
Listeria monocytogenes (LM) phagocytic strategy implies recruitment and inhibition of Rab5a. Here, we identify a Listeria protein that binds to Rab5a and is responsible for Rab5a recruitment to phagosomes and impairment of the GDP/GTP exchange activity. This protein was identified as a glyceraldehyde-3-phosphate dehydrogenase (GAPDH) from Listeria (p40 protein, Lmo 2459). The p40 protein was found within the phagosomal membrane. Analysis of the sequence of LM p40 protein revealed two enzymatic domains: the nicotinamide adenine dinucleotide (NAD)-binding domain at the N-terminal and the C-terminal glycolytic domain. The putative ADP-ribosylating ability of this Listeria protein located in the N-terminal domain was examined and showed some similarities to the activity and Rab5a inhibition exerted by Pseudomonas aeruginosa ExoS onto endosome-endosome fusion. Listeria p40 caused Rab5a-specific ADP ribosylation and blocked Rab5a-exchange factor (Vps9) and GDI interaction and function, explaining the inhibition observed in Rab5a-mediated phagosome-endosome fusion. Meanwhile, ExoS impaired Rab5-early endosomal antigen 1 (EEA1) interaction and showed a wider Rab specificity. Listeria GAPDH might be the first intracellular gram-positive enzyme targeted to Rab proteins with ADP-ribosylating ability and a putative novel virulence factor.
Molecular microbiology (2009). Volume: 72, Issue: 3. Pages: 668-82.
Eugenio Carrasco-Marín, Fidel Madrazo-Toca, Juan R de los Toyos, Eva Cacho-Alonso, Raquel Tobes, Eduardo Pareja, Alberto Paradela, Juan Pablo Albar, Wei Chen, Maria Teresa Gomez-Lopez, Carmen Alvarez-Dominguez et al.
Listeriolysin O (LLO) is a thiol-activated cytolysin secreted by Listeria monocytogenes. LLO and phosphatidylinositol phospholipase C are two essential virulence factors, which this bacterium needs to escape from the phagosomal compartment to the cytoplasm. Cathepsin-D specifically cleaves LLO, between the Trp-491 (tryptophan amino acid in three letter nomenclature) and Trp-492 residues of the conserved undecapeptide sequence, ECTGLAWEWWR, in the domain 4 of LLO (D4). Moreover, these residues also correspond to the phagosomal-binding epitope. Cathepsin-D had no effect on phosphatidylinositol phospholipase C. We have observed that cathepsin-D cleaved the related cholesterol-dependent cytolysin pneumolysin at the same undecapeptide sequence between Trp-435 and Trp-436 residues. These studies also revealed an additional cathepsin-D cleavage site in the pneumolysin D4 domain localized in the 361-GDLLLD-366 sequence. These differences might confer a pathogenic advantage to listeriolysin O, increasing its resistance to phagosomal cathepsin-D action by reducing the number of cleavages sites in the D4 domain. Using ΔLLO/W491A and ΔLLO/W492A bacterial mutants, we reveal that the Trp-491 residue has an important role linked to cathepsin-D in Listeria innate immunity.
Nucleic acids research (2002). Volume: 30, Issue: 1. Pages: 318-21.
Raquel Tobes, Juan L Ramos et al.
The AraC-XylS database contains information about a family of positive transcriptional regulators broadly distributed in bacteria. This specific database focuses on protein sequences and on the biological and functional features of each of the proteins that belong to this family. Each entry provides information on the protein itself, the annotated protein sequence and, when the crystal is available, a comprehensive representation of its three-dimensional structure. The organization of the database is based on an exhaustive analysis of the scientific literature. The data are interconnected and linked with other databases. Multiple alignments of the members of the family, an extensive collection of references and a tutorial about the family provide additional information. The AraC-XylS database is accessible on the World Wide Web at http://www.AraC-XylS.org.
BMC genomics (2006). Pages: 62.
Raquel Tobes, Eduardo Pareja et al.
BACKGROUND: Mobile elements are involved in genomic rearrangements and virulence acquisition, and hence, are important elements in bacterial genome evolution. The insertion of some specific Insertion Sequences had been associated with repetitive extragenic palindromic (REP) elements. Considering that there are a sufficient number of available genomes with described REPs, and exploiting the advantage of the traceability of transposition events in genomes, we decided to exhaustively analyze the relationship between REP sequences and mobile elements. RESULTS: This global multigenome study highlights the importance of repetitive extragenic palindromic elements as target sequences for transposases. The study is based on the analysis of the DNA regions surrounding the 981 instances of Insertion Sequence elements with respect to the positioning of REP sequences in the 19 available annotated microbial genomes corresponding to species of bacteria with reported REP sequences. This analysis has allowed the detection of the specific insertion into REP sequences for ISPsy8 in Pseudomonas syringae DC3000, ISPa11 in P. aeruginosa PA01, ISPpu9 and ISPpu10 in P. putida KT2440, and ISRm22 and ISRm19 in Sinorhizobium meliloti 1021 genome. Preference for insertion in extragenic spaces with REP sequences has also been detected for ISPsy7 in P. syringae DC3000, ISRm5 in S. meliloti and ISNm1106 in Neisseria meningitidis MC58 and Z2491 genomes. Probably, the association with REP elements that we have detected analyzing genomes is only the tip of the iceberg, and this association could be even more frequent in natural isolates. CONCLUSION: Our findings characterize REP elements as hot spots for transposition and reinforce the relationship between REP sequences and genomic plasticity mediated by mobile elements. In addition, this study defines a subset of REP-recognizer transposases with high target selectivity that can be useful in the development of new tools for genome manipulation.
Research in microbiology (2005). Volume: 156, Issue: 3. Pages: 424-33.
Raquel Tobes, Eduardo Pareja et al.
Repetitive extragenic palindromic (REPs) sequences were first described in Enterobacteriaceae and later in Pseudomonas putida. We have detected a new variant (51 base pairs) of REP sequences that appears to be disseminated in more than 300 copies in the Pseudomonas syringae DC3000 genome. The finding of REP sequences in P. syringae confirms the broad presence of this type of repetitive sequence in bacteria. We analyzed the distribution of REP sequences and the structure of the clusters, and we show that palindromy is conserved. REP sequences appear to be allocated to the extragenic space, with a special preference for the intergenic spaces limited by convergent genes, while their presence is scarce between divergent genes. Using REP sequences as markers of extragenicity we re-annotated a set of genes of the P. syringae DC3000 genome demonstrating that REP sequences can be used for refinement of annotation of a genome. The similarity detected between virulence genes from evolutionarily distant pathogenic bacteria suggests the acquisition of clusters of virulence genes by horizontal gene transfer. We did not detect the presence of P. syringae REP elements in the principal pathogenicity gene clusters. This absence suggests that genome fragments lacking REP sequences could point to regions recently acquired from other organisms, and REP sequences might be new tracers for gaining insight into key aspects of bacterial genome evolution, especially when studying pathogenicity acquisition. In addition, as the P. syringae REP sequence is species-specific with respect to the sequenced genomes, it is an exceptional candidate for use as a fingerprint in precise genotyping and epidemiological studies.
4th Meeting of the Spanish Systems Biology Network (REBS) (2008).
Eduardo Pareja-Tobes, Marina Manrique, Raquel Tobes, Eduardo Pareja et al.
BMC microbiology (2006). Pages: 29.
Eduardo Pareja, Pablo Pareja-Tobes, Marina Manrique, Eduardo Pareja-Tobes, Javier Bonal, Raquel Tobes et al.
BACKGROUND: Transcriptional regulation processes are the principal mechanisms of adaptation in prokaryotes. In these processes, the regulatory proteins and the regulatory DNA signals located in extragenic regions are the key elements involved. As all extragenic spaces are putative regulatory regions, ExtraTrain covers all extragenic regions of available genomes and regulatory proteins from bacteria and archaea included in the UniProt database. DESCRIPTION: ExtraTrain provides integrated and easily manageable information for 679816 extragenic regions and for the genes delimiting each of them. In addition ExtraTrain supplies a tool to explore extragenic regions, named Palinsight, oriented to detect and search palindromic patterns. This interactive visual tool is totally integrated in the database, allowing the search for regulatory signals in user defined sets of extragenic regions. The 26046 regulatory proteins included in ExtraTrain belong to the families AraC/XylS, ArsR, AsnC, Cold shock domain, CRP-FNR, DeoR, GntR, IclR, LacI, LuxR, LysR, MarR, MerR, NtrC/Fis, OmpR and TetR. The database follows the InterPro criteria to define these families. The information about regulators includes manually curated sets of references specifically associated to regulator entries. In order to achieve a sustainable and maintainable knowledge database ExtraTrain is a platform open to the contribution of knowledge by the scientific community providing a system for the incorporation of textual knowledge. CONCLUSION: ExtraTrain is a new database for exploring Extragenic regions and Transcriptional information in bacteria and archaea. ExtraTrain database is available at http://www.era7.com/ExtraTrain/.