Genomic arrays

Some tasks require mapping features onto an array that represents a sequence (e.g. a chromosome). Here I post a simple Python class that stores features over such an array and makes it easy to add values to it. You need NumPy installed to run it. A simple usage example is presented. Please remember that it follows Python's slicing rules: positions are zero-based and the end position is not included (add 1 to the end to include the last position; in other words, the interval is right-open).
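As a rough illustration of the idea described above, here is a minimal sketch of such a class. The name GenomicArray and the methods add and get are hypothetical choices for this example, not necessarily those of the original class; the only assumption is that values live in a NumPy array with one slot per position.

```python
import numpy as np


class GenomicArray:
    """Hypothetical sketch: one numeric value per position of a sequence."""

    def __init__(self, length, dtype=float):
        # One slot per position, all starting at zero.
        self.values = np.zeros(length, dtype=dtype)

    def add(self, start, end, value=1):
        """Add value to every position in the right-open interval [start, end)."""
        if not 0 <= start <= end <= len(self.values):
            raise ValueError("interval is outside the array")
        self.values[start:end] += value

    def get(self, start, end):
        """Return a copy of the values in [start, end)."""
        return self.values[start:end].copy()


chrom = GenomicArray(10)
chrom.add(2, 5)       # touches positions 2, 3 and 4, but not 5
chrom.add(4, 8, 2)    # touches positions 4 to 7
coverage = chrom.values.tolist()
region = chrom.get(4, 6).tolist()
```

Position 4 is covered by both intervals, so it accumulates both values, while position 5 only receives the second one because the first interval stops before it.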

Getting species lineage

One often needs to get the lineage for a list of species. I did not find anything easily adaptable, so I wrote a small script that uses the NCBI Taxonomy to annotate the lineage of each species listed in an input file.

To get all the data and the script, visit the Download section and get lineage.zip.
To execute the script (posted below or found in lineage.zip) in the shell (terminal), you need a file with species names (for example, if you have a file species.txt, the code would be):

After the script finishes, it produces a file, species_lineage.txt, where the complete lineage is reported (each element is separated by “|”).

Alternatively, for the newest annotation, you can visit the NCBI Taxonomy FTP site, download the file taxdmp.zip, and unzip the archive. Then create a file called lineage.py inside the new directory containing a copy of the script posted above. Execution is performed as described above.
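For readers curious how a lineage can be derived from the unzipped archive, here is a minimal sketch, not the lineage.py script itself. It assumes the standard dump layout, where nodes.dmp holds the tax ID and parent tax ID in its first two columns, names.dmp holds the tax ID, name and name class, and fields are separated by a tab, a “|” and a tab. The function names are hypothetical, and the records below are a simplified toy excerpt (intermediate ranks skipped, trailing columns omitted) used in place of the real files.

```python
import io


def read_dmp(handle):
    """Yield the list of fields for each record of a taxdump .dmp file."""
    for line in handle:
        line = line.rstrip("\r\n")
        if not line:
            continue
        if line.endswith("\t|"):
            line = line[:-2]          # drop the record terminator
        yield line.split("\t|\t")


def load_taxonomy(nodes_handle, names_handle):
    """Build parent, ID-to-name and name-to-ID lookups from the two dumps."""
    parent = {}
    for fields in read_dmp(nodes_handle):
        parent[fields[0]] = fields[1]
    name_of, id_of = {}, {}
    for fields in read_dmp(names_handle):
        # Keep only the scientific name; synonyms and common names are skipped.
        if fields[3] == "scientific name":
            name_of[fields[0]] = fields[1]
            id_of[fields[1]] = fields[0]
    return parent, name_of, id_of


def lineage(species, parent, name_of, id_of):
    """Walk from the species up to the root and join the names with '|'."""
    tax_id = id_of[species]
    path = [name_of[tax_id]]
    while parent[tax_id] != tax_id:   # the root node is its own parent
        tax_id = parent[tax_id]
        path.append(name_of[tax_id])
    return "|".join(reversed(path))


def record(*fields):
    """Format one toy record the way the dump files do."""
    return "\t|\t".join(fields) + "\t|\n"


# Simplified toy excerpt standing in for nodes.dmp and names.dmp.
toy_nodes = io.StringIO(
    record("1", "1", "no rank")
    + record("9604", "1", "family")
    + record("9605", "9604", "genus")
    + record("9606", "9605", "species")
)
toy_names = io.StringIO(
    record("1", "root", "", "scientific name")
    + record("9604", "Hominidae", "", "scientific name")
    + record("9605", "Homo", "", "scientific name")
    + record("9606", "Homo sapiens", "", "scientific name")
    + record("9606", "human", "", "genbank common name")
)

parent, name_of, id_of = load_taxonomy(toy_nodes, toy_names)
result = lineage("Homo sapiens", parent, name_of, id_of)
```

With the real files, the two StringIO objects would be replaced by open("nodes.dmp") and open("names.dmp"), and the lineage function would be called once per line of the species file.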