Genomic arrays

Some tasks require mapping features onto an array (e.g. a chromosome). Here I post a simple Python class that stores features over an array and makes it easy to add values to it. You need numpy installed to run it. A simple use is presented. Please remember that it follows Python conventions: positions are zero-based and the end position is not included (add 1 to the end to include the last position). In other words, the interval is right-open.
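As a minimal sketch of the idea (not the original class), the following shows how such an array could be built on numpy. The names GenomicArray, add and get, and the example coordinates, are assumptions made for illustration.

```python
import numpy as np


class GenomicArray:
    """Store one numeric value per position of a sequence (e.g. a chromosome)."""

    def __init__(self, length, dtype=float):
        # One slot per position; positions are zero-based.
        self.values = np.zeros(length, dtype=dtype)

    def add(self, start, end, value=1):
        """Add `value` to every position in the right-open interval [start, end)."""
        if not 0 <= start <= end <= len(self.values):
            raise ValueError("interval outside the array")
        self.values[start:end] += value

    def get(self, start, end):
        """Return a copy of the values in [start, end)."""
        return self.values[start:end].copy()


# Hypothetical example: a 10-position "chromosome" with two overlapping features.
chrom = GenomicArray(10)
chrom.add(2, 5)        # covers positions 2, 3 and 4 (position 5 is excluded)
chrom.add(4, 8, 2.0)   # covers positions 4, 5, 6 and 7
coverage = chrom.get(0, 10)
```

Position 4 is covered by both features, so it accumulates both values, while position 5 only receives the second one because the first interval is right-open.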


Getting species lineage

One often needs to get the lineage for a list of species. I did not find anything easily adaptable, so I wrote a small script that uses the NCBI Taxonomy to annotate the lineage of each species listed in a file.

To get all the data and the script, you can visit the Download section and get
To execute the script (posted below or found in the) in the shell (terminal), you need a file with species names (for example, if you have a file species.txt, the code would be):

After the script finishes, it produces a file, species_lineage.txt, where the complete lineage is reported (the elements are separated by “|”).

Alternatively, for the newest annotation, one can visit the NCBI Taxonomy FTP and download a file. Afterwards, one should unzip this archive. A file called should be created inside this new directory with a copy of the script posted above. Execution is performed as stated above.
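The lineage lookup itself can be sketched as follows. This is not the original script. It assumes the taxonomy tables use the layout of NCBI's nodes.dmp and names.dmp dump files (fields separated by a tab, a pipe and a tab; the root node listed as its own parent). The function names and the toy records, which form a deliberately shortened tree, are made up for illustration.

```python
def parse_dmp_line(line):
    """Split one line of a taxonomy dump table into its fields."""
    line = line.rstrip("\n")
    if line.endswith("\t|"):
        line = line[:-2]  # drop the end-of-record marker
    return line.split("\t|\t")


def load_taxonomy(nodes_lines, names_lines):
    """Build child -> parent, id -> name and name -> id lookups."""
    parent = {}
    for line in nodes_lines:
        fields = parse_dmp_line(line)
        parent[fields[0]] = fields[1]
    name_of, id_of = {}, {}
    for line in names_lines:
        tax_id, name, _unique, name_class = parse_dmp_line(line)[:4]
        if name_class == "scientific name":  # skip synonyms and common names
            name_of[tax_id] = name
            id_of[name] = tax_id
    return parent, name_of, id_of


def lineage(species, parent, name_of, id_of, sep="|"):
    """Walk from the species up to the root and join the names with `sep`."""
    tax_id = id_of[species]
    path = []
    while True:
        path.append(name_of[tax_id])
        if parent[tax_id] == tax_id:  # the root points to itself
            break
        tax_id = parent[tax_id]
    return sep.join(reversed(path))


# Toy records (a shortened, made-up tree) in the assumed dump layout.
nodes = [
    "1\t|\t1\t|\tno rank\t|",
    "9605\t|\t1\t|\tgenus\t|",
    "9606\t|\t9605\t|\tspecies\t|",
]
names = [
    "1\t|\troot\t|\t\t|\tscientific name\t|",
    "9605\t|\tHomo\t|\t\t|\tscientific name\t|",
    "9606\t|\tHomo sapiens\t|\t\t|\tscientific name\t|",
    "9606\t|\thuman\t|\t\t|\tgenbank common name\t|",
]
parent, name_of, id_of = load_taxonomy(nodes, names)
result = lineage("Homo sapiens", parent, name_of, id_of)
```

With real data, the two lists would be replaced by open file handles, and a species name missing from the table would raise a KeyError that the caller has to handle.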

Plotting – dos and don’ts

One of my previous posts, on the XKCD Python library, was inspired by one of the articles in PLoS Computational Biology.

I agree with most of their comments, and I have added some points that I think are also important. Some of the points seem to be common sense, but it is worth turning them into a checklist so that nothing is missed when producing your great figure.

A checklist for preparing figures:

  1. Who will be looking at your figure:
    • Experts or students…
  2. How will you present your graphs?
    • Screen display, electronic form, paper?
    • Avoid complicated graphs in presentations
    • Font size appropriate for the circumstances
  3. What is your message?
    • One chart type will convey it better than another
    • Think and discuss your plots
  4. Provide an adequate legend and caption
    • Put effort into making the figure and its caption stand on their own.
    • A reader should be able to draw appropriate conclusions from the figure.
  5. Optimize plot graphics:
    • Check if a scale is adequate on each of the axes
    • Defaults do not always provide the best solution
  6. Select colors that convey your message:
    • Caring for colorblind readers is a virtue
    • Not all colors are well displayed by projectors
  7. Show your data exactly:
    • Again look at the scales
    • Avoid 3D if your data is 2D
  8. Avoid unnecessary elements
  9. A clear message is better than amazing graphics
  10. Use tools that suit your needs
  11. Try not to overcomplicate the graph content
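A few of the points above (font size, colors, unnecessary elements) can be sketched in matplotlib. This is only one possible reading of the checklist. The helper name make_figure, the default font size, the choice of Okabe-Ito colors as a colorblind-friendly palette, and the data are all assumptions made for this example.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt

# Three colors from the Okabe-Ito palette, often recommended as
# colorblind-friendly (an assumption, not part of the checklist itself).
COLORS = ["#0072B2", "#D55E00", "#009E73"]


def make_figure(series, xlabel, ylabel, title, font_size=14):
    """Plot up to three labeled (x, y) series with readable fonts and little clutter."""
    fig, ax = plt.subplots(figsize=(6, 4))
    for color, (label, (x, y)) in zip(COLORS, series.items()):
        ax.plot(x, y, color=color, linewidth=2, label=label)
    # Explicit labels and font sizes instead of relying on defaults.
    ax.set_xlabel(xlabel, fontsize=font_size)
    ax.set_ylabel(ylabel, fontsize=font_size)
    ax.set_title(title, fontsize=font_size)
    ax.tick_params(labelsize=font_size - 2)
    # Remove elements that carry no information.
    ax.spines["top"].set_visible(False)
    ax.spines["right"].set_visible(False)
    ax.legend(frameon=False, fontsize=font_size - 2)
    return fig, ax


# Made-up data, for illustration only.
days = [1, 2, 3, 4, 5]
series = {
    "control": (days, [1.0, 1.1, 1.2, 1.2, 1.3]),
    "treated": (days, [1.0, 1.4, 1.9, 2.5, 3.2]),
}
fig, ax = make_figure(series, "Day", "Relative growth", "Growth over time")
```

For a slide one would probably raise font_size, while for a printed figure the default may already be enough.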

XKCD – cartoon-like plots

Have you wondered how to add a cartoon-like touch to your graphs? One option is to invoke the xkcd function of matplotlib in Python. This small code snippet also provides useful information on how to customize a plot (adding annotations, coloring, removing axes) using matplotlib.
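A minimal sketch of such a plot is given here. It is not the original snippet. It uses matplotlib's plt.xkcd() context manager, and the curve, the annotation text and the function name xkcd_plot are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def xkcd_plot():
    """Draw a damped oscillation in xkcd style with an annotation and bare axes."""
    with plt.xkcd():  # the sketchy style applies to everything created inside
        fig, ax = plt.subplots()
        x = np.linspace(0, 10, 200)
        ax.plot(x, np.exp(-0.3 * x) * np.cos(3 * x), color="red")
        # Annotation with an arrow pointing at the flattened part of the curve.
        ax.annotate(
            "friction wins",
            xy=(6, 0.05),
            xytext=(4, 0.6),
            arrowprops=dict(arrowstyle="->"),
        )
        # Remove the top and right axis lines and all tick marks.
        ax.spines["top"].set_visible(False)
        ax.spines["right"].set_visible(False)
        ax.set_xticks([])
        ax.set_yticks([])
        ax.set_xlabel("time")
        ax.set_ylabel("swing")
    return fig, ax


fig, ax = xkcd_plot()
```

If the fonts that the xkcd style asks for are not installed, matplotlib falls back to another font and prints a warning, so the lettering may look less hand-drawn.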