Harmonic Visualiser

The Harmonic Visualiser is an audio editor which can capture individual pitched notes from real audio recordings. It can capture notes from clean and noisy recordings, and from monophonic audio and polyphonic mixtures. The notes may be perfectly harmonic or inharmonic, and their pitch may be stable or may vary with portamento or vibrato.

Once individual notes are captured, they can be manipulated in a great variety of ways. The Harmonic Visualiser can extract a note from a recording, enhance or suppress a note, duplicate a note for use elsewhere, time-shift a note to meet a desired onset time, time-stretch a note to a satisfactory duration, pitch-shift a note to correct an out-of-tune instrument or singer, apply amplitude or pitch modulation to create a wobbling effect, reduce, remove or reconfigure existing amplitude or pitch modulation where the wobbling effect is overdone or badly done, change a male voice into a female one and vice versa, correct a wrong vowel, combine features of piano and violin, and more.
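Several of the edits listed above can be pictured as simple transformations of a note's parameter tracks. The sketch below is illustrative only and is not the editor's actual code: it assumes a captured note is stored as a list of fundamental-frequency values in Hz (one per analysis frame) plus one amplitude track per harmonic, and every function name in it is hypothetical.

```python
import math

def pitch_shift(f0_frames, semitones):
    """Scale every fundamental-frequency value by a musical interval."""
    ratio = 2.0 ** (semitones / 12.0)
    return [f * ratio for f in f0_frames]

def time_stretch(frames, factor):
    """Resample one parameter track to a new frame count by linear interpolation.

    Applying this to the pitch track and to every amplitude track changes the
    note's duration without changing its pitch.
    """
    last = len(frames) - 1
    new_len = max(2, round(last * factor) + 1)
    out = []
    for k in range(new_len):
        pos = k * last / (new_len - 1)       # position on the original frame axis
        i = min(int(pos), last - 1)          # left neighbour, clamped at the end
        frac = pos - i
        out.append(frames[i] * (1.0 - frac) + frames[i + 1] * frac)
    return out

def add_vibrato(f0_frames, frame_rate, rate_hz, depth_semitones):
    """Impose a sinusoidal pitch modulation (assumed vibrato shape) on a pitch track."""
    out = []
    for k, f in enumerate(f0_frames):
        offset = depth_semitones * math.sin(2.0 * math.pi * rate_hz * k / frame_rate)
        out.append(f * 2.0 ** (offset / 12.0))
    return out

def scale_amplitude(amp_frames, gain):
    """Enhance (gain > 1) or suppress (gain < 1) a note by scaling all harmonics."""
    return [[a * gain for a in track] for track in amp_frames]

# Example: raise a slightly flat note by 30 cents and double its length.
f0 = [218.0, 218.5, 218.2, 218.0]
corrected = time_stretch(pitch_shift(f0, 0.30), 2.0)
```

Because these functions touch only the parameters, the edited note still has to be turned back into audio by a synthesizer before it can be heard.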

Harmonic Visualiser screenshot

How does it work?

The Harmonic Visualiser uses sinusoid-based inference: it creates a detailed description of the captured note in the form of a harmonic sinusoid model.
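As a rough illustration of what such a description contains, the sketch below renders audio from a small set of sparsely sampled parameters: a fundamental frequency per frame and an amplitude per harmonic per frame. It is a simplified sketch, not the Harmonic Visualiser's implementation. The function names, the linear interpolation between frames, the phase-accumulation scheme and the restriction to exact integer multiples of the fundamental (no inharmonicity) are all assumptions made for brevity.

```python
import math

def interpolate(frames, hop, n):
    """Linearly interpolate a per-frame parameter track at sample index n."""
    pos = n / hop
    i = min(int(pos), len(frames) - 2)   # left frame, clamped at the last interval
    frac = pos - i
    return frames[i] * (1.0 - frac) + frames[i + 1] * frac

def synthesize(f0_frames, amp_frames, hop, sample_rate):
    """Render a note as a sum of harmonic sinusoids.

    f0_frames   -- fundamental frequency in Hz, one value per frame
    amp_frames  -- amp_frames[m] is the amplitude track of harmonic m + 1
    hop         -- number of samples between consecutive frames
    sample_rate -- output sample rate in Hz
    """
    num_samples = (len(f0_frames) - 1) * hop + 1
    phases = [0.0] * len(amp_frames)     # running phase of each harmonic
    out = []
    for n in range(num_samples):
        f0 = interpolate(f0_frames, hop, n)
        sample = 0.0
        for m, amps in enumerate(amp_frames):
            freq = (m + 1) * f0          # assumed perfectly harmonic partials
            if freq < sample_rate / 2:   # leave out partials above Nyquist
                sample += interpolate(amps, hop, n) * math.cos(phases[m])
            phases[m] += 2.0 * math.pi * freq / sample_rate
        out.append(sample)
    return out

# Example: a short three-harmonic note with a slight pitch bend.
f0 = [220.0, 220.0, 225.0, 220.0]
amps = [[0.0, 0.5, 0.4, 0.0],
        [0.0, 0.25, 0.2, 0.0],
        [0.0, 0.1, 0.08, 0.0]]
note = synthesize(f0, amps, hop=256, sample_rate=8000)
```

Four frames of parameters are enough here to generate 769 audio samples, which gives a sense of how compact the parameter description is compared with the waveform.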

Its creator Xue Wen explains:

"Audio and score are the two major representations of music. Score is the guidelines given by the composer to the performer for music making, covering as much information of the music piece as he feels necessary. Score is usually recorded as a collection of music notes plus specific instructions, in which the music information is displayed in a straight and compact manner. Audio, on the other hand, is the direct record of sound, on which the production, transmission and perception of music depends. Audio is usually recorded as sound pressure levels or spectrums, in which the music information is embedded in a loose and redundant manner.

By harmonic sinusoid modeling we try to make a direct connection between the two representations for a subset of music events: the pitched notes. The harmonic sinusoid model represents pitched music notes with sparsely-sampled parameters of harmonic sinusoids, therefore it has two sides: a parameter side and a signal side. The parameter side consists of the sinusoid parameters. On this side the harmonic sinusoid model is a symbolic representation, in the sense it describes a music symbol, i.e. the note, as an individual object. Part of this description is explicitly related to the score, while the other part captures more details of the physical process not instructed in the score, which can be attributed to the performer, specific instrument, environment, etc. The signal side provides an audio fragment made up of harmonic sinusoids, which approximates the original note perceptually or physically, depending on certain synthesizer specifications. With these two sides connected by an analyzer-synthesizer pair, the harmonic sinusoid model becomes a natural instrument for bridging the gap between audio and symbols.

As an example application of harmonic sinusoid modeling, we have created an audio editor which can capture individual pitched notes, both from clean recording and from noisy recording, both from monophonic audio and from polyphonic mixture, both for perfect harmonic notes and for notes with inharmonicity, both for stable-pitched notes and for notes with portamento or vibrato. A detailed description of the captured note in harmonic sinusoid model is immediately available. With this description we are able to extract a note from a recording, enhance or suppress a note, duplicate a note for use in somewhere else, time-shift a note to meet a desired onset time, time-stretch a note to get a satisfactory duration, pitch-shift a note to correct out-of-tune instrument or singer, applying amplitude or pitch modulation to create wobbling effect, reduce, remove or reconfigure existing amplitude or pitch modulation where wobbling effect is overdone or badly done, change a male voice into a female one and vice versa, correct a wrong vowel, combining features of piano and violin, and many more."

See: X. Wen and M. Sandler, “New audio editor functionality using harmonic sinusoids,” in Proc. AES 122nd Convention, Vienna, 2007.