Getting species lineage

A common task is to retrieve the taxonomic lineage for a list of species. I did not find anything easily adaptable, so I wrote a small script that uses the NCBI Taxonomy to annotate the lineage of each species listed in an input file.

To get all the data and the script, visit the Download section and get lineage.zip.
To run the script (posted below or found in lineage.zip) in the shell (terminal), you need a file with species names (for example, if you have a file species.txt, the code would be):

After the script finishes, it produces a file, species_lineage.txt, in which the complete lineage is reported (the elements are separated by “|”).
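
If you want to post-process the result, a lineage line can be split back into its elements. This is only a minimal sketch: it assumes one lineage per line with elements separated by “|”, and both the helper name split_lineage and the example line are made up for illustration, not taken from the script's real output.

```python
def split_lineage(line):
    """Split one '|'-separated lineage line into a list of elements."""
    # Strip surrounding whitespace and drop empty pieces (e.g. from a trailing "|").
    return [part.strip() for part in line.strip().split("|") if part.strip()]


# Made-up example line; a real file may contain many more elements per line.
example_line = "root|Homo|Homo sapiens\n"
elements = split_lineage(example_line)
print(elements[-1])  # the last element of the line
```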

Alternatively, for the newest annotation, download the file taxdmp.zip from the NCBI Taxonomy FTP site and unzip the archive. Then create a file called lineage.py inside the new directory and paste in a copy of the script posted above. Run it as described above.
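
To give an idea of how a lineage can be built from the unzipped dump, here is a rough sketch. It is not the script from lineage.zip. It relies on two files from taxdmp.zip, nodes.dmp (tax_id, parent tax_id, rank, ...) and names.dmp (tax_id, name, unique name, name class), whose fields are separated by a tab, “|”, tab. All function names are my own, and the demo uses a heavily shortened toy tree written to a temporary directory rather than the real files.

```python
import os
import tempfile


def parse_dmp_line(line):
    """Split one taxdump row; fields are separated by '\t|\t', rows end with '\t|'."""
    line = line.rstrip("\n")
    if line.endswith("\t|"):
        line = line[:-2]
    return line.split("\t|\t")


def load_parents(nodes_path):
    """Map each tax_id to its parent tax_id (first two columns of nodes.dmp)."""
    parent = {}
    with open(nodes_path, encoding="utf-8") as handle:
        for line in handle:
            fields = parse_dmp_line(line)
            parent[fields[0]] = fields[1]
    return parent


def load_scientific_names(names_path):
    """Return (tax_id -> name, name -> tax_id), keeping only scientific names."""
    name_of, id_of = {}, {}
    with open(names_path, encoding="utf-8") as handle:
        for line in handle:
            tax_id, name, _unique, name_class = parse_dmp_line(line)[:4]
            if name_class == "scientific name":
                name_of[tax_id] = name
                id_of[name] = tax_id
    return name_of, id_of


def lineage(species, parent, name_of, id_of):
    """Walk from the species up to the root and join the names with '|'."""
    tax_id = id_of[species]
    path = []
    while True:
        path.append(name_of[tax_id])
        up = parent[tax_id]
        if up == tax_id:  # the root node lists itself as its parent
            break
        tax_id = up
    return "|".join(reversed(path))


def demo():
    """Run the functions on a toy tree (NOT the real NCBI hierarchy)."""
    nodes = "1\t|\t1\t|\tno rank\t|\n9605\t|\t1\t|\tgenus\t|\n9606\t|\t9605\t|\tspecies\t|\n"
    names = (
        "1\t|\troot\t|\t\t|\tscientific name\t|\n"
        "9605\t|\tHomo\t|\t\t|\tscientific name\t|\n"
        "9606\t|\tHomo sapiens\t|\t\t|\tscientific name\t|\n"
        "9606\t|\thuman\t|\t\t|\tgenbank common name\t|\n"
    )
    with tempfile.TemporaryDirectory() as tmp:
        nodes_path = os.path.join(tmp, "nodes.dmp")
        names_path = os.path.join(tmp, "names.dmp")
        with open(nodes_path, "w", encoding="utf-8") as handle:
            handle.write(nodes)
        with open(names_path, "w", encoding="utf-8") as handle:
            handle.write(names)
        parent = load_parents(nodes_path)
        name_of, id_of = load_scientific_names(names_path)
        return lineage("Homo sapiens", parent, name_of, id_of)


result = demo()
print(result)  # root|Homo|Homo sapiens
```

With the real dump you would point load_parents and load_scientific_names at the unzipped nodes.dmp and names.dmp instead of the toy files. Note that the full names.dmp is large, so loading it into dictionaries takes a noticeable amount of memory.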

Plotting – dos and don’ts

One of my previous posts, on the XKCD Python library, was inspired by an article in PLoS Computational Biology.

I agree with most of their comments, and I have added some points that I think are also important. Some of them seem like common sense, but it is worth keeping them as a checklist so that nothing is missed when producing your great figure.

A checklist for preparing figures:

  1. Who will be looking at your figure:
    • Experts or students…
  2. How will you present your graphs?
    • Screen display, electronic form, paper?
    • Avoid complicated graphs in presentations
    • Use a font size appropriate for the circumstances
  3. What is your message?
    • One chart type may convey it better than another
    • Think about and discuss your plots
  4. Provide an adequate legend and caption
    • Put effort into making the figure and its caption stand on their own.
    • A reader should be able to draw appropriate conclusions from the figure.
  5. Optimize plot graphics:
    • Check that the scale is adequate on each axis
    • Defaults do not always provide the best solution
  6. Select colors that convey your message:
    • Caring for colorblind readers is a virtue
    • Not all colors are displayed well by projectors
  7. Show your data exactly:
    • Again look at the scales
    • Avoid 3D if your data is 2D
  8. Avoid unnecessary elements
  9. A clear message is better than amazing graphics
  10. Use tools that suit your needs
  11. Try not to overcomplicate the graph content