Context Layer Models

Most computer systems that process music data operate on sequential, time-based models describing music as one or more sequences of notes and rests. This seems adequate given that the most common form of music notation is the score, which primarily contains sequences of notes and rests together with instructions on how they are to be played. While sequential representations are suitable for many computer-aided music processing scenarios and convenient for performing musicians, they are not sufficient for describing all aspects and relations of a musical composition from a composer's point of view.

Introductory Example

The following example demonstrates a model of the first four measures of the well-known Beatles song Hey Jude. The vocal part looks like this:

The corresponding context layer model is depicted below:

Model Structure

Context layer models contain one or multiple streams, which are comparable to voices or parts in scores. Each staff in a score is represented by at least one stream. A single part is divided into multiple streams if it in turn contains multiple voices (e.g. a fugue in four voices might be notated on two piano staves, but its context layer model representation will contain four streams).

Streams contain individual layers for various musical context dimensions, which are explained in the following sections. The layers in turn contain time-dependent context elements, each of which has a start time and a duration. Technical details of time representation are set out in section Time Model.
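To make this structure concrete, here is a minimal sketch in Python. The class names (ContextElement, ContextLayer, Stream) and their fields are illustrative assumptions, not part of the MPS API; fractions are used for time values as described in section Time Model.

```python
from dataclasses import dataclass, field
from fractions import Fraction
from typing import Any, Dict, List

@dataclass
class ContextElement:
    start: Fraction     # start time of the context element
    duration: Fraction  # how long the context is valid
    value: Any          # e.g. a time signature, chord symbol or pitch

@dataclass
class ContextLayer:
    name: str           # e.g. "instrument", "meter", "harmony"
    elements: List[ContextElement] = field(default_factory=list)

@dataclass
class Stream:
    layers: Dict[str, ContextLayer] = field(default_factory=dict)

    def layer(self, name: str) -> ContextLayer:
        # create the layer on first access
        return self.layers.setdefault(name, ContextLayer(name))
```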

Instrumentation Context

The instrument layer specifies which instrument is used at which point in time in a stream. In the context layer model in the previous example, this context never changes and indicates a vocal part.
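Using the sketch above, the unchanging instrumentation of the Hey Jude excerpt could be expressed as a single element spanning the whole excerpt; the time values follow section Time Model, where the upbeat starts at t=-1/4 and four 4/4 measures end at t=4.

```python
vocal = Stream()
vocal.layer("instrument").elements.append(
    # one element covering the upbeat plus four full measures
    ContextElement(start=Fraction(-1, 4), duration=Fraction(17, 4), value="voice")
)
```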

Metric Contexts

The meter layer provides the metric context of a musical stream. It contains time signatures, the start time and duration of which correspond to measures. Note that pieces may commence with an anacrusis (also known as pickup or upbeat) which implies a shortened initial measure. The measure numbers are shown in a timeline at the bottom of the context layer model visualization.
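In terms of the sketch above, a meter layer for the Hey Jude excerpt could contain a shortened element for the anacrusis followed by one element per full measure (a 4/4 measure lasting one whole note, i.e. a duration of 1):

```python
meter = vocal.layer("meter")
meter.elements.append(ContextElement(Fraction(-1, 4), Fraction(1, 4), "4/4"))  # anacrusis
for m in range(4):  # four full measures starting at t = 0, 1, 2, 3
    meter.elements.append(ContextElement(Fraction(m), Fraction(1), "4/4"))
```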

The current tempo is determined by an individual context layer. It usually contains elements specifying a constant tempo, as shown in the previous example. However, the tempo layer also supports gradual tempo changes to model accelerando and decelerando.
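How gradual changes are realized internally is not detailed here; one plausible realization stores the start and end tempo in an element's value and interpolates between them, for instance linearly:

```python
def tempo_at(element: ContextElement, t: Fraction) -> float:
    """Linearly interpolated tempo inside a gradual tempo element."""
    start_bpm, end_bpm = element.value
    progress = float((t - element.start) / element.duration)
    return start_bpm + (end_bpm - start_bpm) * progress

accelerando = ContextElement(Fraction(0), Fraction(2), (72, 96))  # 72 to 96 BPM
print(tempo_at(accelerando, Fraction(1)))  # 84.0 halfway through
```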

Harmonic Contexts

The current key context of streams is given by a corresponding context layer, which can change in the course of the composition.

Another harmonic context is given by context harmonies, which usually change more frequently than the key. In the previous example the key remains constant, while the context harmonies change in each measure.
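For the Hey Jude excerpt this could look as follows in the sketch; the chord symbols are assumptions based on the song's familiar progression, not values taken from the model above:

```python
key = vocal.layer("key")
key.elements.append(ContextElement(Fraction(-1, 4), Fraction(17, 4), "F major"))

harmony = vocal.layer("harmony")
for m, chord in enumerate(["F", "C", "C7", "F"]):  # one harmony per measure
    harmony.elements.append(ContextElement(Fraction(m), Fraction(1), chord))
```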

Rhythmic Contexts

Rhythm is one of the most crucial dimensions of music. This is also reflected in context layer models: rhythm context layers are obligatory for each stream. The rhythmic dimension defines the durations and proportions of the notes or sound events produced by the stream.

Another rhythmic dimension is the harmonic rhythm, which specifies the durational proportions of context harmonies.
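Both rhythmic layers boil down to sequences of durations laid out back to back, which a small helper can build in the sketch (the durations shown are illustrative, not the actual Hey Jude rhythm):

```python
def append_rhythm(layer: ContextLayer, start: Fraction, durations: List[Fraction]) -> None:
    """Append consecutive rhythm elements; each element's value is its duration."""
    t = start
    for d in durations:
        layer.elements.append(ContextElement(t, d, d))
        t += d

rhythm = vocal.layer("rhythm")
append_rhythm(rhythm, Fraction(-1, 4), [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)])
```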

Pitch Contexts

The context layer model in the previous example also contains context layers regarding pitches, namely Scale, Degrees and Pitches.

Pitches are often derived from a contextually suitable scale, whose notes can be addressed using scale degrees. In the example, pitches are derived from the F major scale (which in turn matches the key context) using zero-based scale degrees: degree 0 resolves to F, 1 to G, 2 to A, 3 to Bb and so on.

The resulting absolute note names (including the octave) are visible in the Pitches context layer. If no octave is specified, the middle octave is implied, which is encoded as octave number 4 according to scientific pitch notation (see section Pitches for more details).
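A sketch of this degree resolution for F major follows; note that in scientific pitch notation the octave number increments at C, so the degrees C, D and E carry into the next octave:

```python
# (letter, octave carry) for each zero-based degree of F major
F_MAJOR = [("F", 0), ("G", 0), ("A", 0), ("Bb", 0), ("C", 1), ("D", 1), ("E", 1)]

def resolve(degree: int, octave: int = 4) -> str:
    """Resolve a zero-based F major scale degree to an absolute note name."""
    name, carry = F_MAJOR[degree % 7]
    return f"{name}{octave + carry + degree // 7}"

print(resolve(0))  # F4
print(resolve(3))  # Bb4
print(resolve(7))  # F5, one octave above degree 0
```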

Loudness Contexts

Another musical context layer represents the progression of loudness throughout musical streams. It contains elements with static loudness instructions such as piano or forte. Gradual loudness progressions are also supported to model crescendo and decrescendo.
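Analogously to gradual tempo changes, a crescendo could be realized by interpolating between the two enclosing dynamic marks; the velocity values below are an assumption for illustration, not defined by MPS:

```python
DYNAMICS = {"pp": 33, "p": 49, "mp": 64, "mf": 80, "f": 96, "ff": 112}  # assumed mapping

def loudness_at(element: ContextElement, t: Fraction) -> float:
    """Linearly interpolated loudness inside a gradual loudness element."""
    start_mark, end_mark = element.value
    progress = float((t - element.start) / element.duration)
    return DYNAMICS[start_mark] + (DYNAMICS[end_mark] - DYNAMICS[start_mark]) * progress

crescendo = ContextElement(Fraction(0), Fraction(2), ("p", "f"))
print(loudness_at(crescendo, Fraction(1)))  # 72.5 halfway between p and f
```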

Another loudness-related context layer, which is not covered in the previous example, accommodates dynamic accents such as sforzando (notated as sfz, sf or fz), sforzando followed immediately by piano (sfp), rinforzando (rfz) or fortepiano (fp), forte followed immediately by piano.

Lyrics

Vocal streams can contain lyrics as an individual context. Using this layer, syllables can be assigned to individual notes as shown in the previous example.
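In the sketch, a lyrics layer is simply another layer whose element values are syllables; the alignment below is assumed for illustration:

```python
lyrics = vocal.layer("lyrics")
lyrics.elements.append(ContextElement(Fraction(-1, 4), Fraction(1, 4), "Hey"))
lyrics.elements.append(ContextElement(Fraction(0), Fraction(3, 4), "Jude"))
```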

Labels

Another context can be supplied in the form of labels for individual parts of a composition. These could be, for example, Verse, Chorus, Bridge and Solo for popular music, or Exposition, Development and Recapitulation for a piece in sonata form.

Custom Contexts

MPS provides a number of default context layer types, most of which have been discussed in the previous sections. Yet, the number of layers is not fixed and the model was designed to be extensible in order to accommodate new context layers. For example, new context dimensions for fingering instructions, the spatial position of a musical stream or the emotional character of certain sections could be added. Refer to section Custom Contexts for detailed instructions on how to add custom context layers.
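In the sketch used throughout this section, a custom context is simply another named layer, so no structural changes are required; the "fingering" layer below is a hypothetical example:

```python
piano = Stream()
piano.layer("fingering").elements.append(
    ContextElement(Fraction(0), Fraction(1, 4), 1)  # thumb on the first quarter note
)
```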

Time Model

Each context layer model has an internal timeline which by definition starts at t=0 at the beginning of the first full measure. The earliest point of time can become negative in the case of anacruses at the beginning of the piece, as shown in the previous example, which starts at t=-1/4 due to the upbeat.

Because floating point numbers cannot represent all fractional durations exactly (consider, for instance, triplet durations such as 1/3), MPS uses fractions (which internally store an integer numerator and denominator separately) for all points of time and durations.
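Python's fractions module behaves the same way and illustrates why exact fractions are preferable to floats for musical time:

```python
from fractions import Fraction

# a triplet eighth note is exactly 1/12 of a whole note
print(Fraction(1, 12) * 3 == Fraction(1, 4))  # True: exact arithmetic
print(0.1 + 0.1 + 0.1 == 0.3)                 # False: binary floats round 1/10
```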

Parallel Streams

To demonstrate the combination of multiple parts, a context layer model of the first four measures of Ludwig van Beethoven’s Piano Sonata No. 14 in C# minor is presented. Compare the original score with the context layer model depicted below:

The model contains two streams, namely one for the arpeggios in the right hand and an individual stream for the arpeggio-based accompaniment of the left hand. Note that both streams contain the same information in the instrument, meter, tempo, key, harmony, harmonic rhythm, scale and loudness layers, whereas the rhythm, degree and pitch layers are individual to each stream.

The fact that streams can either share common information or specify individual information can be utilized to represent arbitrary musical relationships between streams. For example, in polytonal compositions concurrent streams could contain different harmonic and tonal contexts. For compositions which do not rely on tonality, all context layers relating to tonality can be removed. In this way, the proposed model is suitable for representing a number of musical concepts and configurations that cannot always be made visible in musical scores.
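One possible realization of this sharing in the sketch is to let both streams reference the very same layer objects where their contexts coincide, while keeping individual layer objects elsewhere:

```python
right_hand, left_hand = Stream(), Stream()

shared_meter = ContextLayer("meter")      # identical context, stored once
right_hand.layers["meter"] = shared_meter
left_hand.layers["meter"] = shared_meter  # same object, no duplication

right_hand.layers["rhythm"] = ContextLayer("rhythm")  # individual per stream
left_hand.layers["rhythm"] = ContextLayer("rhythm")
```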