Words in pictures: exploring repetition in song lyrics

SongSim matrix

It may not be immediately obvious, but the image to the left represents part of the 2001 hit single, “Can’t get you out of my head”, by the Australian singer Kylie Minogue. If you know the song and its lyrics, you might be able to figure out what you are seeing. If you don’t, we’ll explain.

The full image below comes from the SongSim website, built by computer scientist Colin Morris, and it is an example of what is known as a self-similarity matrix, which in this case is being used to explore repetition in song lyrics.

The size of the matrix corresponds to the number of words in a song, and each word has its own column and row – picture the lyrics running left to right and top to bottom along the outside of the matrix. In the Minogue example, there are 374 individual (not unique) words, meaning the matrix consists of 374 × 374 cells, and a cell is filled with colour whenever there is a match between words along the horizontal and vertical axes. Or, as Morris puts it: “The cell at position (x, y) is filled in if the xth and yth words of the song are the same.” The diagonal line running top left to bottom right through the centre of the matrix is the entire song, from beginning to end; when words or phrases are repeated, patterns begin to form to the left and right of this line.
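The rule Morris describes can be sketched in a few lines of Python. This is only an illustration, not SongSim's own code: the function names tokenize and self_similarity_matrix are hypothetical, and the way the text is split into words (lowercased, punctuation dropped) is an assumption that may differ from what the website does.

```python
import re

def tokenize(lyrics):
    """Lowercase the text and pull out the words.

    Assumption: a "word" is a run of letters and apostrophes;
    SongSim may split lyrics differently.
    """
    return re.findall(r"[a-z'’]+", lyrics.lower())

def self_similarity_matrix(lyrics):
    """Return (words, matrix); matrix[y][x] is True when words x and y match."""
    words = tokenize(lyrics)
    matrix = [[wx == wy for wx in words] for wy in words]
    return words, matrix

# A short fragment stands in for the full song.
words, matrix = self_similarity_matrix("La la la, can't get you out of my head")
print(len(words), "words give", len(matrix), "x", len(matrix[0]), "cells")
```

Every word matches itself, which is why the main diagonal is always filled in, and a match between words x and y is also a match between y and x, so the picture is mirrored across that diagonal.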



So, for example, Minogue’s song begins with the repetition of the phrase “La la la, la la la-la la”, and this refrain appears first as the filled black square at the top left. It crops up frequently throughout the song, as evidenced by the recurrence of black quadrilateral shapes within the matrix. You can also pick out the repetition of the song title, “Can’t get you out of my head”, as short diagonal lines of magenta cells. Meanwhile, the repetition of a two-word sequence – in this case, “and ever and ever and ever” – creates the checkerboard effect seen roughly one-third and two-thirds of the way through.
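To see where the checkerboard comes from, the short sketch below prints a text version of the matrix for the phrase “and ever and ever and ever” on its own. It is a toy illustration under simple assumptions: the helper name similarity_rows is hypothetical, words are split on whitespace, and “#” stands in for a filled cell.

```python
def similarity_rows(text):
    """Return one string per word: '#' where two words match, '.' where they differ."""
    words = text.lower().split()
    return ["".join("#" if a == b else "." for b in words) for a in words]

for row in similarity_rows("and ever and ever and ever"):
    print(row)

# Prints:
# #.#.#.
# .#.#.#
# #.#.#.
# .#.#.#
# #.#.#.
# .#.#.#
```

Because the two words simply alternate, every other cell in each row matches, and the filled cells shift by one from row to row, which produces the checkerboard.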

The SongSim website allows visitors to import their own lyrics (or text of any sort, like this article, for instance – see below). Data scientist Giora Simchoni has also built the songsim package for the statistical software R to enable further analysis of the matrices. The package includes a repetitiveness metric, for which Minogue’s song scores 23%.