# shiftPitch

Shift audio pitch

## Description

`audioOut = shiftPitch(audioIn,nsemitones)` shifts the pitch of the audio input by the specified number of semitones, `nsemitones`.

`audioOut = shiftPitch(audioIn,nsemitones,Name,Value)` specifies options using one or more `Name,Value` pair arguments.

## Examples

### Apply Pitch-Shifting to Time-Domain Audio

Read in an audio file and listen to it.

```
[audioIn,fs] = audioread('Counting-16-44p1-mono-15secs.wav');
sound(audioIn,fs)
```

Increase the pitch by 3 semitones and listen to the result.

```
nsemitones = 3;
audioOut = shiftPitch(audioIn,nsemitones);

sound(audioOut,fs)
```

Decrease the pitch of the original audio by 3 semitones and listen to the result.

```
nsemitones = -3;
audioOut = shiftPitch(audioIn,nsemitones);

sound(audioOut,fs)
```
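To get a feel for what a shift of a few semitones means in terms of frequency, the following sketch computes the frequency ratio implied by each shift. It assumes 12 equal-tempered semitones per octave, which matches the assumption stated in the Algorithms section. The fundamental frequency `f0` is a hypothetical value chosen for illustration, and the snippet uses only base MATLAB.

```matlab
% Frequency ratio implied by a shift of n semitones,
% assuming 12 equal-tempered semitones per octave.
nsemitones = [3 -3 12];
ratio = 2.^(nsemitones/12);   % one ratio per requested shift

f0 = 220;                     % hypothetical fundamental frequency in Hz
fShifted = f0*ratio;          % where that fundamental would land after each shift
```

Shifts of +3 and -3 semitones give reciprocal ratios, and a shift of 12 semitones doubles the frequency.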

### Apply Pitch-Shifting to Frequency-Domain Audio

Read in an audio file and listen to it.

```
[audioIn,fs] = audioread("SpeechDFT-16-8-mono-5secs.wav");
sound(audioIn,fs)
```

Convert the audio signal to a time-frequency representation using `stft`. Use a 512-point `kbdwin` with 75% overlap.

```
win = kbdwin(512);
overlapLength = 0.75*numel(win);

S = stft(audioIn, ...
    "Window",win, ...
    "OverlapLength",overlapLength, ...
    "Centered",false);
```

Increase the pitch by 8 semitones and listen to the result. Specify the window and overlap length you used to compute the STFT.

```
nsemitones = 8;
lockPhase = false;
audioOut = shiftPitch(S,nsemitones, ...
    "Window",win, ...
    "OverlapLength",overlapLength, ...
    "LockPhase",lockPhase);

sound(audioOut,fs)
```

Decrease the pitch of the original audio by 8 semitones and listen to the result. Specify the window and overlap length you used to compute the STFT.

```
nsemitones = -8;
lockPhase = false;
audioOut = shiftPitch(S,nsemitones, ...
    "Window",win, ...
    "OverlapLength",overlapLength, ...
    "LockPhase",lockPhase);

sound(audioOut,fs)
```

### Increase Fidelity Using Phase Locking

Read in an audio file and listen to it.

```
[audioIn,fs] = audioread('FemaleSpeech-16-8-mono-3secs.wav');
sound(audioIn,fs)
```

Increase the pitch by 6 semitones and listen to the result.

```
nsemitones = 6;
lockPhase = false;
audioOut = shiftPitch(audioIn,nsemitones, ...
    'LockPhase',lockPhase);

sound(audioOut,fs)
```

To increase fidelity, set `LockPhase` to `true`. Apply pitch shifting, and listen to the results.

```
lockPhase = true;
audioOut = shiftPitch(audioIn,nsemitones, ...
    'LockPhase',lockPhase);

sound(audioOut,fs)
```

### Increase Fidelity Using Formant Preservation

Read in the first 11.5 seconds of an audio file and listen to it.

```
[audioIn,fs] = audioread('Rainbow-16-8-mono-114secs.wav',[1,8e3*11.5]);
sound(audioIn,fs)
```

Increase the pitch by 4 semitones and apply phase locking. Listen to the results. The resulting audio has a "chipmunk effect" that sounds unnatural.

```
nsemitones = 4;
lockPhase = true;
audioOut = shiftPitch(audioIn,nsemitones, ...
    "LockPhase",lockPhase);

sound(audioOut,fs)
```

To increase fidelity, set `PreserveFormants` to `true`. Use the default cepstral order of `30`. Listen to the result.

```
cepstralOrder = 30;
audioOut = shiftPitch(audioIn,nsemitones, ...
    "LockPhase",lockPhase, ...
    "PreserveFormants",true, ...
    "CepstralOrder",cepstralOrder);

sound(audioOut,fs)
```

## Input Arguments

`audioIn`

— Input signal

column vector | matrix | 3-D array

Input signal, specified as a column vector, matrix, or 3-D array. How the function interprets `audioIn` depends on the complexity of `audioIn`:

- If `audioIn` is real, `audioIn` is interpreted as a time-domain signal. In this case, `audioIn` must be a column vector or matrix. Columns are interpreted as individual channels.
- If `audioIn` is complex, `audioIn` is interpreted as a frequency-domain signal. In this case, `audioIn` must be an *L*-by-*M*-by-*N* array, where *L* is the FFT length, *M* is the number of individual spectra, and *N* is the number of channels.

**Data Types:** `single` | `double`

**Complex Number Support:** Yes
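The rule above can be sketched in a few lines of base MATLAB. This is only an illustration of the documented interpretation, not the function's internal validation code. The input arrays are random placeholder data and the variable names are hypothetical.

```matlab
% Illustrative inputs (random data, not real audio).
timeDomain = randn(16000,2);                              % real: 16000 samples, 2 channels
freqDomain = complex(randn(512,40,2),randn(512,40,2));    % complex: L-by-M-by-N

inputs = {timeDomain, freqDomain};
kind = cell(size(inputs));
for k = 1:numel(inputs)
    x = inputs{k};
    if isreal(x)
        % Real input: treated as time-domain audio, one channel per column.
        kind{k} = sprintf('time domain, %d channel(s)', size(x,2));
    else
        % Complex input: FFT length by number of spectra by number of channels.
        kind{k} = sprintf('frequency domain, FFT length %d, %d spectra, %d channel(s)', ...
            size(x,1), size(x,2), size(x,3));
    end
end
```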

`nsemitones`

— Number of semitones to shift audio by

real scalar

Number of semitones to shift the audio by, specified as a real scalar.

The range of `nsemitones` depends on the window length (`numel(Window)`) and the overlap length (`OverlapLength`):

`-12*log2(numel(Window)-OverlapLength)` ≤ `nsemitones` ≤ `-12*log2((numel(Window)-OverlapLength)/numel(Window))`

**Data Types:** `single` | `double`
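As a quick check of this range, the sketch below evaluates both bounds for an assumed 512-sample window with 75% overlap (the settings used in the frequency-domain example) and tests whether a requested shift falls inside them. The variable names are illustrative. The snippet uses only base MATLAB and does not call `shiftPitch`.

```matlab
% Assumed settings for illustration: 512-sample window with 75% overlap.
windowLength  = 512;
overlapLength = 0.75*windowLength;            % 384 samples
hop = windowLength - overlapLength;           % 128 samples

minSemitones = -12*log2(hop);                 % largest downward shift
maxSemitones = -12*log2(hop/windowLength);    % largest upward shift

% Check a requested shift against the documented range.
requested = 8;
inRange = requested >= minSemitones && requested <= maxSemitones;
```

For these settings the bounds evaluate to -84 and 24 semitones, so a shift of 8 semitones is within range.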

### Name-Value Arguments

Specify optional pairs of arguments as `Name1=Value1,...,NameN=ValueN`, where `Name` is the argument name and `Value` is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

*Before R2021a, use commas to separate each name and value, and enclose* `Name` *in quotes.*

**Example:** `'Window',kbdwin(512)`

`Window`

— Window applied in time domain

`sqrt(hann(1024,'periodic'))` (default) | real vector

Window applied in the time domain, specified as the comma-separated pair consisting of `'Window'` and a real vector. The number of elements in the vector must be in the range [1, `size(audioIn,1)`]. The number of elements in the vector must also be greater than `OverlapLength`.

**Note**

If using `shiftPitch` with frequency-domain input, you must specify `Window` as the same window used to transform `audioIn` to the frequency domain.

**Data Types:** `single` | `double`

`OverlapLength`

— Number of samples overlapped between adjacent windows

`round(0.75*numel(Window))` (default) | scalar in the range [0, `numel(Window)`)

Number of samples overlapped between adjacent windows, specified as the comma-separated pair consisting of `'OverlapLength'` and an integer in the range [0, `numel(Window)`).

**Note**

If using `shiftPitch` with frequency-domain input, you must specify `OverlapLength` as the same overlap length used to transform `audioIn` to a time-frequency representation.

**Data Types:** `single` | `double`
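The constraints on `Window` and `OverlapLength` can be checked before calling the function. The sketch below is one hypothetical way to do so in base MATLAB. The signal and window are placeholders, and the checks restate the documented rules rather than reproduce the function's own validation.

```matlab
audioIn = randn(8000,1);                  % placeholder signal, 8000 samples
win = ones(1024,1);                       % placeholder window; any real vector of valid length
overlapLength = round(0.75*numel(win));   % documented default overlap for this window length

% Window length must lie in [1, size(audioIn,1)].
windowOK = numel(win) >= 1 && numel(win) <= size(audioIn,1);

% Overlap must be an integer in [0, numel(win)), so it is always
% strictly smaller than the window length.
overlapOK = overlapLength >= 0 && overlapLength < numel(win) ...
    && overlapLength == round(overlapLength);
```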

`LockPhase`

— Apply identity phase locking

`false` (default) | `true`

Apply identity phase locking, specified as the comma-separated pair consisting of `'LockPhase'` and `false` or `true`.

**Data Types:** `logical`

`PreserveFormants`

— Preserve formants

`false` (default) | `true`

Preserve formants, specified as the comma-separated pair consisting of `'PreserveFormants'` and `true` or `false`. Formant preservation is attempted using spectral envelope estimation with cepstral analysis.

**Data Types:** `logical`

`CepstralOrder`

— Cepstral order used for formant preservation

30 (default) | nonnegative integer

Cepstral order used for formant preservation, specified as the comma-separated pair consisting of `'CepstralOrder'` and a nonnegative integer.

#### Dependencies

To enable this name-value pair argument, set `PreserveFormants` to `true`.

**Data Types:** `single` | `double`

## Output Arguments

`audioOut`

— Pitch-shifted audio

column vector | matrix

Pitch-shifted audio, returned as a column vector or matrix of independent channels.

## Algorithms

To apply pitch shifting, `shiftPitch` modifies the time-scale of audio using a phase vocoder and then resamples the modified audio. The time-scale modification algorithm is based on [1] and [2] and is implemented as in `stretchAudio`.

After time-scale modification, `shiftPitch` performs sample rate conversion using an interpolation factor equal to the analysis hop length and a decimation factor equal to the synthesis hop length. The interpolation and decimation factors of the resampling stage are selected as follows: The analysis hop length is determined as `analysisHopLength = numel(Window)-OverlapLength`. The `shiftPitch` function assumes that there are 12 semitones in an octave, so the speedup factor used to stretch the audio is `speedupFactor = 2^(-nsemitones/12)`. The speedup factor and analysis hop length determine the synthesis hop length for time-scale modification as `synthesisHopLength = round((1/speedupFactor)*analysisHopLength)`.
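The hop-length relations above can be evaluated directly. The sketch below plugs in the default window length (1024) and default overlap, with a 4-semitone shift chosen for illustration. It only reproduces the arithmetic of the stated formulas in base MATLAB and does not call `shiftPitch`. The variable `realizedSemitones` is a name introduced here.

```matlab
windowLength  = 1024;                        % length of the default window
overlapLength = round(0.75*windowLength);    % default overlap: 768
nsemitones    = 4;                           % requested shift (illustrative)

analysisHopLength  = windowLength - overlapLength;                 % 256
speedupFactor      = 2^(-nsemitones/12);                           % about 0.794
synthesisHopLength = round((1/speedupFactor)*analysisHopLength);   % 323

% Because the synthesis hop length is rounded to an integer, the shift
% implied by the two hop lengths can differ slightly from the request.
realizedSemitones = -12*log2(analysisHopLength/synthesisHopLength);
```

Under these formulas, the hop-length ratio corresponds to a shift of roughly 4.02 semitones rather than exactly 4, which is a consequence of the rounding step.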

The achievable pitch shift is determined by the window length (`numel(Window)`) and `OverlapLength`. To see the relationship, note that the equation for speedup factor can be rewritten as `nsemitones = -12*log2(speedupFactor)`, and the equation for synthesis hop length can be rewritten as `speedupFactor = analysisHopLength/synthesisHopLength`. Using simple substitution, `nsemitones = -12*log2(analysisHopLength/synthesisHopLength)`. The practical range of a synthesis hop length is [1, `numel(Window)`]. The range of achievable pitch shifts is:

- Max number of semitones lowered: `-12*log2(numel(Window)-OverlapLength)`
- Max number of semitones raised: `-12*log2((numel(Window)-OverlapLength)/numel(Window))`
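As a worked check of these bounds, write *H*_{a} for the analysis hop length, *H*_{s} for the synthesis hop length, and *N* for `numel(Window)`. These symbols are shorthand introduced here, not part of the function's interface. Because the shift grows as *H*_{s} grows, the two extremes of *H*_{s} give the two bounds:

$$n_{\text{semitones}}=-12\log_2\left(\frac{H_{\text{a}}}{H_{\text{s}}}\right),\qquad 1\le H_{\text{s}}\le N\;\Longrightarrow\;-12\log_2(H_{\text{a}})\le n_{\text{semitones}}\le -12\log_2\left(\frac{H_{\text{a}}}{N}\right).$$

With the default window (*N* = 1024) and the default overlap (768), *H*_{a} = 256, so:

$$-12\log_2(256)=-96\le n_{\text{semitones}}\le -12\log_2\left(\frac{256}{1024}\right)=24.$$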

### Formant Preservation

Pitch shifting can alter the spectral envelope of the pitch-shifted signal. To diminish this effect, you can set `PreserveFormants` to `true`. If `PreserveFormants` is set to `true`, the algorithm attempts to estimate the spectral envelope using an iterative procedure in the cepstral domain, as described in [3] and [4]. For both the original spectrum, *X*, and the pitch-shifted spectrum, *Y*, the algorithm estimates the spectral envelope as follows.

For the first iteration, *EnvX*_{a} is set to *X*. Then, the algorithm repeats these two steps in a loop:

1. Lowpass filters the cepstral representation of *EnvX*_{a} to get a new estimate, *EnvX*_{b}. The `CepstralOrder` parameter controls the quefrency bandwidth.
2. To update the current best fit, the algorithm takes the element-by-element maximum of the current spectral envelope estimate and the previous spectral envelope estimate:

   $$Env{X}_{\text{a}}=\mathrm{max}(Env{X}_{\text{a}},Env{X}_{\text{b}}).$$

The loop ends if either a maximum number of iterations (`100`) is reached, or if all bins of the estimated log envelope are within a given tolerance of the original log spectrum. The tolerance is set to `log(10^(1/20))`.

Finally, the algorithm scales the spectrum of the pitch-shifted audio by the ratio of estimated envelopes, element-wise:

$$Y=Y\times \left(\frac{Env{X}_{\text{b}}}{Env{Y}_{\text{b}}}\right).$$
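The loop described above can be sketched in base MATLAB as follows. This is a minimal illustration of the iterative cepstral smoothing idea, not the toolbox's actual implementation. It assumes a two-sided log-magnitude spectrum, a rectangular lowpass lifter, and a one-sided stopping test (the smoothed envelope may not fall below the log spectrum by more than the tolerance). The test signal and all variable names are hypothetical.

```matlab
% Synthetic harmonic test signal (illustrative only, not real audio).
N = 512;
n = (0:N-1)';
x = zeros(N,1);
for k = 1:20
    x = x + (1/k)*cos(2*pi*8*k*n/N);     % harmonics at bins 8, 16, ..., 160
end
w = 0.5 - 0.5*cos(2*pi*n/N);             % periodic Hann window, base MATLAB only
X = fft(w.*x);

cepstralOrder = 30;                      % documented default order
maxIter = 100;                           % documented iteration limit
tol = log(10^(1/20));                    % documented tolerance

% Lowpass lifter: keep quefrencies 0..cepstralOrder and their mirror images.
lifter = zeros(N,1);
lifter(1:cepstralOrder+1) = 1;
lifter(N-cepstralOrder+1:N) = 1;

logX = log(abs(X) + eps);                % eps avoids log(0)
envA = logX;                             % first iteration: EnvX_a starts from X
for iter = 1:maxIter
    cepstrum = real(ifft(envA));
    envB = real(fft(cepstrum.*lifter));  % lowpass-filtered estimate, EnvX_b
    if all(logX - envB <= tol)           % assumed one-sided stopping test
        break
    end
    envA = max(envA, envB);              % element-by-element maximum
end
envX = exp(envB);                        % envelope on a linear magnitude scale
```

Running the same loop on the pitch-shifted spectrum `Y` would give an envelope `envY`, and the final scaling step would then be `Y.*(envX./envY)`.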

## References

[1] Driedger, Jonathan, and Meinard Müller. "A Review of Time-Scale Modification of Music Signals." *Applied Sciences*. Vol. 6, Issue 2, 2016.

[2] Driedger, Jonathan. "Time-Scale Modification Algorithms for Music Audio Signals." Master's Thesis. Saarland University, Saarbrücken, Germany, 2011.

[3] Roebel, Axel, and Xavier Rodet. "Efficient Spectral Envelope Estimation and Its Application to Pitch Shifting and Envelope Preservation." International Conference on Digital Audio Effects, pp. 30–35. Madrid, Spain, September 2005. hal-01161334

[4] Imai, S., and Y. Abe. "Spectral Envelope Extraction by Improved Cepstral Method." *Electron. and Commun. in Japan*. Vol. 62-A, Issue 4, 1979, pp. 10–17.

## Extended Capabilities

### C/C++ Code Generation

Generate C and C++ code using MATLAB® Coder™.

### GPU Arrays

Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.

Usage notes and limitations:

- `LockPhase` must be set to `false`.
- Using `gpuArray` (Parallel Computing Toolbox) input with `shiftPitch` is only recommended for a GPU with compute capability 7.0 ("Volta") or above. Other hardware might not offer any performance advantage. To check your GPU compute capability, see `ComputeCapability` in the output from the `gpuDevice` (Parallel Computing Toolbox) function. For more information, see GPU Computing Requirements (Parallel Computing Toolbox).

For an overview of GPU usage in MATLAB®, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).

## Version History

**Introduced in R2019b**

## See Also

`stretchAudio` | `reverberator` | `audioTimeScaler` | `audioDataAugmenter`
