From: Sequence embedding for fast construction of guide trees for multiple sequence alignment

PCA visualisation of embedded H3 influenza virus sequences. An embedding of 3994 GenBank haemagglutinin sequences from H3N2 influenza viruses, generated using mBed and visualised using the first three axes of a principal component analysis (PCA) of the embedded vectors. Each sequence is coloured by year of isolation to show the progression of sequence change between 1967 (blue) and 2008 (red).
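The projection described in this caption can be sketched roughly as follows. This is only an illustration, not the authors' code: the function names `pca_project` and `year_to_colour` are hypothetical, the input is random placeholder data standing in for mBed output, and a linear blue-to-red ramp is an assumption about how the colouring might be done.

```python
import numpy as np

def pca_project(vectors, n_components=3):
    """Project row vectors onto their first n_components principal axes."""
    centred = vectors - vectors.mean(axis=0)
    # SVD of the centred data: the rows of vt are the principal axes,
    # ordered by decreasing explained variance.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:n_components].T

def year_to_colour(years, first=1967, last=2008):
    """Map years linearly to RGB, from blue (first) to red (last)."""
    t = (np.asarray(years, dtype=float) - first) / (last - first)
    t = np.clip(t, 0.0, 1.0)
    return np.stack([t, np.zeros_like(t), 1.0 - t], axis=1)

# Placeholder data only: random vectors stand in for embedded sequences,
# and random years stand in for the years of isolation.
rng = np.random.default_rng(0)
embedded = rng.normal(size=(50, 10))
years = rng.integers(1967, 2009, size=50)

coords = pca_project(embedded)    # one 3-D point per sequence
colours = year_to_colour(years)   # one RGB triple per sequence
```

The resulting `coords` and `colours` arrays could then be passed to any 3-D scatter plotting routine to produce a figure of this kind.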

