Public access genome sequences. Programmatic download
GenBank hosts a dedicated site for COVID-19 resources here.
Follow the link Entrez nucleotide. A summary of the available sequences appears. At the top right, Send to... lets you download the
list of sequences as a File in Accession List format. Sorting by sequence size helps pick out whole genomes (more than 29,000 bp).
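The size criterion can also be applied after download. The following is a minimal sketch, assuming the sequences are already saved locally in FASTA format; the function names and the file name in the usage note are illustrative and do not come from the original workflow.

```python
import io

MIN_GENOME_LENGTH = 29000  # threshold quoted in the text (whole genomes > 29,000 bp)


def read_fasta(handle):
    """Yield (header, sequence) pairs from an iterable of FASTA text lines."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def whole_genomes(handle, min_length=MIN_GENOME_LENGTH):
    """Return {header: sequence} for records strictly longer than min_length."""
    return {h: s for h, s in read_fasta(handle) if len(s) > min_length}


if __name__ == "__main__":
    # Tiny in-memory demo with made-up records: one long, one short.
    demo = io.StringIO(">long_record\n" + "A" * 29500 + "\n>short_record\nACGT\n")
    print(sorted(whole_genomes(demo)))
```

With a real file this would be called as `whole_genomes(open("sequences.fasta"))`, where `sequences.fasta` is a placeholder name.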
This one-liner downloads the genomes from the command line and makes use of the EDirect utilities.
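As a rough illustration of the same step, and not the EDirect one-liner itself, the sketch below fetches FASTA records through the NCBI E-utilities `efetch` endpoint from Python. It assumes the accession list was saved one accession per line; the file names, the batch size, and the helper names are assumptions made for this example.

```python
import time
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def read_accessions(path):
    """Read one accession per line, skipping blank lines."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def efetch_fasta_url(accessions):
    """Build an efetch URL that returns the given nucleotide records as FASTA text."""
    params = {
        "db": "nuccore",
        "id": ",".join(accessions),
        "rettype": "fasta",
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


def download_fasta(accessions, out_path, batch_size=100):
    """Download records in batches and append them to a single multifasta file."""
    with open(out_path, "wb") as out:
        for start in range(0, len(accessions), batch_size):
            batch = accessions[start:start + batch_size]
            with urlopen(efetch_fasta_url(batch)) as response:
                out.write(response.read())
            time.sleep(0.4)  # stay under NCBI's limit of 3 requests/second without an API key
```

A call such as `download_fasta(read_accessions("accessions.txt"), "GenBank.fa")` would then write all records to one file; both file names are placeholders.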
Multiple sequence alignments (MSA)
The alignment of complete genomes was carried out in PRANK. This tool claims to perform codon-aware alignment. The running time
to obtain the aligned multifasta file was 620 seconds. Visualization of the aligned sequences is possible by many methods. One of them is
PRANK's companion, WASABI. Please copy the URL http://www.egarmo.com/GenBank.msa.fa.best.xml into its graphical interface.
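Before moving on to tree building, it can be useful to confirm that the aligned multifasta is well formed, that is, that every sequence has the same number of columns. The following is a minimal sketch of such a check; it is independent of PRANK and WASABI, and the function names and the reported fields are choices made for this example.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA text lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def alignment_summary(lines):
    """Check that all aligned sequences share one length and report gap content."""
    records = dict(read_fasta(lines))
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) > 1:
        raise ValueError("sequences differ in length; not a valid alignment")
    columns = lengths.pop() if lengths else 0
    # Fraction of gap characters ('-') per sequence.
    gap_fraction = {
        name: (seq.count("-") / columns if columns else 0.0)
        for name, seq in records.items()
    }
    return {
        "n_sequences": len(records),
        "columns": columns,
        "gap_fraction": gap_fraction,
    }
```

For a file on disk, `alignment_summary(open("aligned.fasta"))` would return the number of sequences, the alignment width, and the gap fraction per sequence; `aligned.fasta` is a placeholder name.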
WASABI shows a first draft of the phylogenetic tree, but the tree was optimized by feeding the data into BEAST.
After a running time of 24 minutes and the use of the ancillary programs BEAUti and TreeAnnotator, a phylogenetic tree in
NEXUS format was obtained. The running time of BEAUti and TreeAnnotator is negligible.
The file in NEXUS format is also available.
This file may be visualized in iTOL from embl.de.
The toolbar on the left includes zoom in and zoom out buttons and a search tool under the 'Aa' icon. This button pops up a window for entering search terms that are matched against the names
of the tree leaves (taxa). For example, entering 'Valencia' highlights the position of the genomes from Spain uploaded to that database so far. Genomes 1 and 7
cluster together, but far from all the other genomes from Spain. This observation would support the already communicated idea of a possible double entry
of SARS-CoV-2 into Spain, or of ongoing mutation across the country.
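The same kind of leaf-name search can be done outside the browser. The following is a minimal sketch, assuming the NEXUS file contains a TAXLABELS block listing the leaf names. The case-insensitive substring match is a convenience chosen for this example, not a description of how iTOL searches, and the labels in the usage note are made up.

```python
import re


def find_taxa(nexus_text, term):
    """Return taxon labels from the TAXLABELS block that contain `term`."""
    # Capture everything between the TAXLABELS keyword and the closing semicolon.
    match = re.search(r"\btaxlabels\b(.*?);", nexus_text, re.IGNORECASE | re.DOTALL)
    if match is None:
        return []
    # Labels are either single-quoted (may contain spaces) or plain tokens.
    tokens = re.findall(r"'[^']*'|\S+", match.group(1))
    labels = [token.strip("'") for token in tokens]
    return [label for label in labels if term.lower() in label.lower()]
```

For instance, with a TAXLABELS block holding the hypothetical labels `Spain_Valencia_1`, `China_Wuhan_1` and `Spain_Valencia_7`, `find_taxa(text, "Valencia")` returns the first and third labels in file order.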