It had been a while since I last had some fun with Gephi, so today I told myself: why not try visualizing the whole Gene Ontology and see what happens?
First of all, I had to generate a file in GEXF format containing all the terms and relationships of the ontology. For that I wrote a small program (GenerateGexfGo.java) which uses Bio4j to retrieve the term and relationship information, plus a couple of GEXF XML wrapper classes from the GitHub project BioinfoXML.
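To give an idea of what such a file looks like, here is a minimal sketch of writing GEXF by hand. It is not GenerateGexfGo.java: it does not use Bio4j or the BioinfoXML wrappers, the class and method names (`GexfSketch`, `Term`, `Relation`, `toGexf`) are made up for illustration, the two GO terms are hard-coded, and it assumes the GEXF 1.2 draft namespace.

```java
import java.util.List;

/** Hypothetical sketch: builds a tiny GEXF document from in-memory terms. */
final class GexfSketch {

    /** One ontology term, which becomes a GEXF node. */
    static final class Term {
        final String id;
        final String name;
        Term(String id, String name) { this.id = id; this.name = name; }
    }

    /** One relationship between two terms, which becomes a directed GEXF edge. */
    static final class Relation {
        final String sourceId;
        final String targetId;
        final String type;
        Relation(String sourceId, String targetId, String type) {
            this.sourceId = sourceId;
            this.targetId = targetId;
            this.type = type;
        }
    }

    /** Escapes the characters that are special inside XML attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    /** Serializes the terms and relations as a static, directed GEXF graph. */
    static String toGexf(List<Term> terms, List<Relation> relations) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        // Assumption: GEXF 1.2 draft namespace; the original file may use another version.
        sb.append("<gexf xmlns=\"http://www.gexf.net/1.2draft\" version=\"1.2\">\n");
        sb.append("  <graph mode=\"static\" defaultedgetype=\"directed\">\n");
        sb.append("    <nodes>\n");
        for (Term t : terms) {
            sb.append("      <node id=\"").append(escape(t.id))
              .append("\" label=\"").append(escape(t.name)).append("\"/>\n");
        }
        sb.append("    </nodes>\n");
        sb.append("    <edges>\n");
        int edgeId = 0; // GEXF edges need their own ids; a running counter is enough here
        for (Relation r : relations) {
            sb.append("      <edge id=\"").append(edgeId++)
              .append("\" source=\"").append(escape(r.sourceId))
              .append("\" target=\"").append(escape(r.targetId))
              .append("\" label=\"").append(escape(r.type)).append("\"/>\n");
        }
        sb.append("    </edges>\n");
        sb.append("  </graph>\n");
        sb.append("</gexf>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two real GO terms, hard-coded here instead of being read from Bio4j.
        List<Term> terms = List.of(
                new Term("GO:0008150", "biological_process"),
                new Term("GO:0009987", "cellular process"));
        List<Relation> relations = List.of(
                new Relation("GO:0009987", "GO:0008150", "is_a"));
        System.out.print(toGexf(terms, relations));
    }
}
```

Running `main` prints a two-node, one-edge graph that Gephi should be able to import. For the full ontology the same loop would simply run over every term and relationship.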
Once I had my GEXF file (~17 MB), I tried opening it with Gephi on my laptop with no success: Gephi froze forever while trying to import it. After a quick search on Google, I found out that the amount of memory available to Gephi is really easy to change: just open the file ‘etc/gephi07beta.conf’ and raise the -Xmx value.
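As a rough illustration, the setting usually sits on the `default_options` line of that file, where JVM flags carry a `-J` prefix. The line below is an assumption about what the excerpt looks like: the 1024m value is only an example, and the other options on the line differ between Gephi versions, so only the `-J-Xmx` part should be touched.

```
# etc/gephi07beta.conf (excerpt, illustrative values)
default_options="--branding gephi -J-Xms64m -J-Xmx1024m"
```

Gephi reads this file at startup, so it has to be restarted for the new limit to apply.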
With the file finally imported, I first applied the OpenOrd layout algorithm (which is designed for large graphs). Once the graph had an acceptable distribution, I ran a few iterations of the Fruchterman Reingold algorithm for a better visualization. And this is what I got:
- Green: Cellular component
- Blue: Molecular function
- Orange: Biological process
UPDATE: zoomable visualizations of each ontology on its own, made with the Gephi SeaDragon plugin.
Here you can download the gexf file in case you want to experiment a bit with it.