This article is my examination and explanation of how musical form is utilised in much of modern media music (i.e. in video games and motion pictures).

What is musical form?

All songs, from popular to classical, are made up of parts. “Part” is a technical term, which means a clearly identifiable section of music. I’m sure you’re intimately familiar with what a chorus is, as well as a verse. You probably also know what a bridge is. These three are the different kinds of parts that make up the most common musical form of all (in the Western part of the world, anyway).

Pop form

Verse | Chorus | Verse | Chorus | Bridge | Chorus

Listen to Drops of Jupiter, by Train. (You know you love it.) If we analyse the form of this track from the perspective of pop form, we get the following.

  • 0:00 Verse 1 (opens with four measures of intro)
  • 0:48 Chorus 1
  • 1:12 Verse 2
  • 2:00 Chorus 2
  • 2:24 Bridge
  • 3:00 Chorus 3
  • 3:36 Outro

Easily recognisable parts. The verses are the baseline and the choruses are the more dramatic, stronger parts. The way these parts come together forms a predictable and enjoyable progression. The first verse and chorus are very often softer than the rest of the piece. In this track, the drums don’t come in until the second verse, which is very common.

Pairing the gentler verse with a chorus results in good dynamics, which is variation in strength and type of expression. Rather than having a piece be just four minutes of powerful chorus, each part gets more room and sticks out more by use of dynamic variation. It creates contrast. Variety being the spice of life and all that.

A bridge is a part which appears once in the latter half of a piece. Its purpose is to relieve the verse-chorus repetition. Playing the same two parts over and over would work against the very variation we get from pairing them. Hence, after they have been repeated, we introduce a different part, the bridge.

Bridges typically sit between the verse and chorus in terms of dynamics, and they often start out softer and rise to become strong right before the third chorus. In other words, bridges serve the function of replacing the verse for increased variation.

Pop form examined further

If we look more closely at pop form, we’ll find that verses, choruses and bridges are not actually technical parts. They can be, but that is not their definition. Many verses, for instance, are made up of multiple parts, and the same is true of bridges. In other words, verse, chorus and bridge are functional definitions, which means we can apply them outside of a popular music context. (Whether modern media music really sits outside popular music is debatable, but I digress.)

  • A verse is a repeated part or group of parts that is not used for the climaxes of the piece.
  • A chorus is a repeated part or group of parts that is used for the climaxes of the piece.
  • A bridge is a part or group of parts that is only played once, not repeated, which typically occurs either before the final chorus or a final verse-chorus repetition. Some songs also end on the bridge.
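
The three functional definitions above can be sketched in code. This is only a minimal illustration under my own assumptions: the name `assign_roles` is hypothetical, the form is given as a plain list of part letters, and you have to decide by ear which letters carry the climaxes. It also treats a bridge as a single part played once, so it won’t recognise a bridge built from a group of parts.

```python
from collections import Counter

def assign_roles(parts, climax_letters):
    """Map each part letter to verse, chorus or bridge (hypothetical helper)."""
    counts = Counter(parts)
    roles = {}
    for letter, count in counts.items():
        if count == 1:
            roles[letter] = "bridge"   # played once, not repeated
        elif letter in climax_letters:
            roles[letter] = "chorus"   # repeated and used for the climaxes
        else:
            roles[letter] = "verse"    # repeated, not used for the climaxes
    return roles

# Pop form (Verse | Chorus | Verse | Chorus | Bridge | Chorus) written as letters.
pop_form = ["A", "B", "A", "B", "C", "B"]
roles = assign_roles(pop_form, climax_letters={"B"})
print(roles)
```

The point of the sketch is that the roles depend on repetition and climaxes, not on what the parts sound like.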

Classical form analysis

Don’t let the word “classical” fool you. We won’t be talking about classical music. Here, classical doesn’t refer to a style of music, just a different way of looking at musical architecture. To show you what I mean, let’s begin by analysing a piece that is modern and relevant: Jeremy Soule’s Far Horizons from The Elder Scrolls V: Skyrim.

The way you analyse the musical form of a piece is to listen to it and write down where each new part begins and whether it’s new or a variation of one we’ve already heard before. We use capital letters to name each part, starting from A and then in alphabetical order. In the case of Far Horizons, we get the following.

  • 0:00, start of the track, a part begins right away. There’s no intro. We’ll call this part A.
  • 0:30, first part ends and another begins. We’ll call this one B.
  • 1:02, a repetition of part A, with variation.
  • 1:32, a new, unheard part begins. We’ll name it part C.
  • 2:02, another new, unheard part begins. Part D.
  • 2:31, the previous part, D, repeats with variation.
  • 3:02, the very first part, part A, repeats, again with a different variation to the arrangement.
  • 3:30, the second part, part B, is repeated again, with some variation.
  • 4:00, another repeat of the first part, A, with a larger and stronger arrangement.
  • 4:31, repeat of the third part, C.
  • 5:00, outro of sustained notes.

We use apostrophes (’) to mark each variation. They’re spoken as “prime,” so A’ is “A prime” and A” is “A double prime.” The result is as follows.

A B A’ C D D’ A” B’ A”’ C’
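
The prime notation can also be generated mechanically. The sketch below is a minimal illustration, not a formal method: `label_form` is a hypothetical name, the input is the list of part letters in the order I heard them, and it assumes every repeat counts as a new variation (which happens to be how I marked this piece). It writes primes as plain ASCII apostrophes.

```python
from collections import Counter

def label_form(parts):
    """Turn a sequence of part letters into form notation with primes."""
    seen = Counter()
    labels = []
    for letter in parts:
        # First occurrence gets no prime; each later occurrence adds one more.
        labels.append(letter + "'" * seen[letter])
        seen[letter] += 1
    return " ".join(labels)

# Part letters for Far Horizons, in listening order.
far_horizons = ["A", "B", "A", "C", "D", "D", "A", "B", "A", "C"]
form = label_form(far_horizons)
print(form)
```

If a repeat is exact rather than varied, you would leave its prime off by hand; the sketch doesn’t try to judge that.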

If we were to apply the pop functions to this form, we would have to concede that part A serves the functions of both the verse and the chorus. We do have an obvious bridge, however, which is made up of two parts in the middle, D D’. The other repeated group of parts is A B A C.

By looking at this form, we can conclude that the piece has repetition at the start and end with a bridge for variation in between. We can also clearly see how the A and B parts rise in dynamics with each variation.

Why do we use musical form?

In its most basic intention, musical form is used to balance variation and repetition. At this point, keep in mind that while musical form is theory, it’s theory that’s based on convention. What sounds good comes from how our music has naturally developed over time. In other words, the pop form is good form because we like it and not the other way around.

Repetition and variation

These two concepts make up the entirety of our music. Observe the form below.

A B A B A A’

We’ve got two parts, A and B. By analysing the repetition and variation of this form, we can glean the following.

  • Variation by having B follow A.
  • Repetition by repeating a group of parts, A B.
  • Variation by following the third A with A’ instead of B.

It’s not only from part to part that we find variation and repetition. Nor does a repeated part always imply more repetition than variation. If you’ve established that part A is followed by part B, then following part A with A’ or C instead will result in variation by way of repetition, also known as relieved variation.
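
The idea of an established follow-up can be traced in code. This is a rough sketch under my own assumptions: `trace_follow_ups` is a hypothetical name, the form is a space-separated string, and “established” simply means the first follow-up heard after a given part.

```python
def trace_follow_ups(form):
    """Label each step in a form as new, repetition or variation."""
    parts = form.split()
    established = {}  # part -> the part that first followed it
    report = []
    for current, following in zip(parts, parts[1:]):
        if current not in established:
            established[current] = following
            kind = "new"
        elif established[current] == following:
            kind = "repetition"  # the expected follow-up arrives
        else:
            kind = "variation"   # the expectation is broken
        report.append((current, following, kind))
    return report

report = trace_follow_ups("A B A B A A'")
for current, following, kind in report:
    print(f"{current} -> {following}: {kind}")
```

For the form above, only the last step, A followed by A’, is flagged as variation, because the listener has twice heard A lead to B.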

Dynamic curve

The secondary function of musical form is to manage the dynamic curve of a piece, that is, how the strength of expression varies throughout it. In a typical context, you want a rising curve with a late climax followed by a drop-off before the end. If you’re writing music for media such as film or a cut-scene, your dynamic curve might be constrained by the inherent drama of that medium, in which case you’ll need to adapt your form accordingly.

A A’ B A A’ D D’

Imagine that the form above belongs to a piece of music played in a film. The A part is sad and mourning. On-screen, a man is watching his wife die in hospital. Her death occurs in part B, which has a sudden and strong climax to accentuate the event. The coming A A’ is played during her funeral.

Then there’s a time lapse and all of a sudden it’s years later. The man is climbing to the top of a mountain. When he gets there, he takes out a picture of his wife and looks at it, smiling. The music, D D’, is a new set of parts that’s happy if a bit bittersweet.

Defining “part”

When it comes to music, the ear hears everything in multiples of two. Thus, the smallest group of measures required to form a recognisable section is four measures (to give us enough space to move away from and back to the tonic chord).

I don’t have an explanation for why parts are either eight or sixteen measures long as standard. It’s not a hard rule, just a guideline. My suspicion is that our ears like a mix of familiarity and novelty. New melodies, familiar structures.

Think of it in terms of other media, like books and TV series. We structure stories in acts and in episodes, even if the stories themselves are different. It’s the basic structure of a medium, the foundation we use to build upon. Just like our music comes from the overtone series, a natural phenomenon, there is perhaps something inherent to our brains about how they rationalise abstract concepts, applying structure where there is none.

Perhaps the why doesn’t matter so much. Even someone without any musical training can identify and enjoy the different parts in a piece of music.

As a final note, the ideal length in time for a part is between 25 and 50 seconds.
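
As a quick check of that guideline, the part lengths in Far Horizons can be worked out from the timestamps in my analysis earlier in this article. The sketch below is only arithmetic on those timestamps; the helper name `to_seconds` is my own, and the timings are approximate since I noted them by ear.

```python
def to_seconds(stamp):
    """Convert an 'm:ss' timestamp to whole seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)

# Start of each part in Far Horizons, ending with the start of the outro.
starts = ["0:00", "0:30", "1:02", "1:32", "2:02", "2:31",
          "3:02", "3:30", "4:00", "4:31", "5:00"]

times = [to_seconds(stamp) for stamp in starts]
# Each part lasts from its own start to the start of the next one.
durations = [later - earlier for earlier, later in zip(times, times[1:])]

print(durations)
print(all(25 <= length <= 50 for length in durations))
```

Every part comes out at roughly half a minute, which sits inside the 25 to 50 second range.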

Secondary parts

A secondary part is one of the following three types.

  • Intro
  • Outro
  • Filler

Let’s use another Jeremy Soule example for this one, Wings of Kynareth. First, an analysis.

  • 0:00, presumably the first verse, part A
  • 0:22, chorus, part B
  • 0:49, repetition of part A
  • 1:10, chorus repetition, B
  • 1:37, repetition of part A
  • 1:58, another chorus, B
  • 2:20, unheard part, C
  • 2:41, another unheard part, D
  • 3:05, outro based on part A

A [2] B A’ [2] B’ A” B” C D [Outro]

While listening to this track, you may notice that before the first and second choruses, there is some extra space, two measures. These are filler parts. Soule uses them here to relax the ear from the lively verse and prepare it for the chorus. On the third chorus, it’s strong enough that doing so isn’t necessary, and it’s a good example of variation by removing existing variation. That chorus becomes all the stronger because it starts right after the verse.

A filler part is separate from, and exists only between, its surrounding parts. Fillers are slot-in measures, used for certain effects, often as an extension of the previous part or preparation for the coming one. A filler doesn’t have to be two measures; you could use just one, three or even four, but the longer you make it, the more it’ll interrupt the structure and the further you’ll move away from lyrical form.

As for intro and outro, the reason they’re written in brackets rather than as letters is that they’re not repeated and they don’t always follow part length. Sometimes the outro is just a repetition of a proper part, in which case you can write it as either.
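
The bracket convention makes it easy to pull the secondary parts out of a written form. Here is a small sketch, assuming the form is a space-separated string with primes typed as plain apostrophes; `split_form` is a hypothetical name of my own.

```python
def split_form(form):
    """Separate lettered parts from bracketed secondary parts."""
    primary, secondary = [], []
    for token in form.split():
        if token.startswith("[") and token.endswith("]"):
            secondary.append(token[1:-1])  # filler length, intro or outro
        else:
            primary.append(token)
    return primary, secondary

# Wings of Kynareth, as analysed above.
primary, secondary = split_form("A [2] B A' [2] B' A'' B'' C D [Outro]")
print(primary)
print(secondary)
```

Read this way, the two fillers and the outro fall out as secondary parts, while the lettered parts carry the main structure.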

Outro types

There are many different ways to end a piece. Arguably the most common is sustained harmony, as in Far Horizons. The last chord progression is resolved and then held, slowly fading out, possibly with some small embellishments.

Another way is to repeat an earlier part in a conclusory fashion. Note that the repeated part should not be the chorus.

The last common way is to have a dedicated outro with new material (or a heavily modified version of a previous part). Very long and through-composed pieces benefit the most from this type.

Through-composed form

In short, form without any significant amount of repetition. Analysing it would result in something like A B C D E and so on, or possibly A A B B C C and on. This is common in film music. The lack of repetition makes cohesion difficult, as one minute could sound very different to the next.
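
One way to test a form against this description is sketched below. It is a deliberately strict reading of “no significant repetition” and my own assumption rather than a standard definition: immediate repeats such as A A are allowed, but no part may return once another part has been heard. The name `is_through_composed` is hypothetical.

```python
from itertools import groupby

def is_through_composed(parts):
    """True if no part returns after a different part has been heard."""
    # Collapse immediate repeats, so A A B B C C becomes A B C.
    collapsed = [letter for letter, _ in groupby(parts)]
    # If a letter still appears twice, that part has come back later on.
    return len(collapsed) == len(set(collapsed))

print(is_through_composed("A B C D E".split()))
print(is_through_composed("A A B B C C".split()))
print(is_through_composed("A B A C D D A B A C".split()))
```

The first two forms pass, while the Far Horizons form does not, since its A part keeps returning.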


Good form relies not only on a well-structured linear arrangement of parts but also on how the parts relate to one another; just because we can assign letters to each part doesn’t mean we can create good form without modifying the individual parts.

This is known as synergy, which by definition means two or more entities (musical parts, in our case) coming together to form a greater whole. After all, any verse or chorus taken out of a popular song doesn’t have nearly the same effect on its own as it does along with the rest of the track.

Achieving synergy

Like the overall form itself, synergy is based on variation and repetition. If we briefly jump back to our analysis of Far Horizons, we can find a good example of this.

A B A’ C D D’ A” B’ A”’ C’

If we examine the parts A and B, we can make the following observations.

  • A establishes a recognisable melodic pattern.
  • B uses a very similar pattern to A with variation in pitch and some extra notes added to the rhythm.

In other words, B forms synergy with A by using both repetition and variation.

In our second example, Wings of Kynareth, we find that the verses are very different to the chorus; in other words, synergy by variation.

In closing

This article explains my own take on form as seen in modern media music, but it is by no means exhaustive, nor does it cover all styles. It’s influenced mostly by composers such as Nobuo Uematsu, Yasunori Shiono and Jeremy Soule but also Inon Zur, Alexander Brandon and Yasunori Mitsuda.

As a final piece of advice, be prepared to kill or shelve your darlings. It may be that the part you love just doesn’t fit where you placed it.

All the best,

© Raniel Dan MMXXIII.