Algorithm to resample the pitch of a sample
-
- KVRist
- Topic Starter
- 154 posts since 15 Feb, 2012
I have written a ringtone generator application in android using Java.
I have implemented a linear interpolation algorithm to change the pitch. I wrote it a long time ago and have since lost the article I based the algorithm on.
Either way, I am experiencing aliasing when I pitch the sample roughly +/- 10 or more semitones.
The samples being resampled are sine, square, and saw samples at A4.
So I am hoping to replace the pitch shifting algorithm with a new one that is of a higher quality.
I am not bound by performance, as the samples do not need to be processed in real time. The user sets the parameters, and then hits play, and it generates the entire output sample before playing it. It is not an inconvenience if there is even a 400 ms delay before the resampling completes.
So ideally I would like an algorithm that prioritizes quality (to a lesser extent ease of implementation) over performance.
I am hoping to get pointed in the right direction. At the very least the suggestion of some algorithms to investigate; at the very most an implementation.
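For reference, a minimal sketch of the linear-interpolation approach described above (class and method names here are mine, not from the actual app). It reads the source at a fractional step of 2^(semitones/12) and blends adjacent samples. Nothing band-limits the result, which is exactly why upward shifts alias:

```java
// Minimal linear-interpolation pitch shifter: reads the source buffer at a
// fractional step and interpolates between neighbouring samples.
// No band-limiting is applied, so large upward shifts will alias.
public class LinearResampler {
    public static float[] pitchShift(float[] src, double semitones) {
        double step = Math.pow(2.0, semitones / 12.0);   // playback rate ratio
        int outLen = (int) Math.floor((src.length - 1) / step);
        float[] out = new float[outLen];
        double pos = 0.0;
        for (int i = 0; i < outLen; i++) {
            int idx = (int) pos;
            double frac = pos - idx;                     // fractional part
            out[i] = (float) ((1.0 - frac) * src[idx] + frac * src[idx + 1]);
            pos += step;
        }
        return out;
    }
}
```

Shifting up one octave halves the output length; any source harmonic that the doubled rate pushes past Nyquist folds back down as an alias.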
-
- Banned
- 12368 posts since 30 Apr, 2002 from i might peeramid
i'll volunteer that for this class of signals i'd be more inclined to render than resample, especially since you're ok with render time.
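For the class of signals in question, rendering could look like the sketch below (names are illustrative): sum sine harmonics of a sawtooth only up to Nyquist, so the output is alias-free at any fundamental. Slow, but fine given the stated render-time budget:

```java
// Band-limited sawtooth rendered additively: sum 1/k-weighted sine
// harmonics up to the Nyquist limit, so no aliasing at any pitch.
public class SawRenderer {
    public static float[] renderSaw(double freq, double sampleRate, int numSamples) {
        float[] out = new float[numSamples];
        int maxHarmonic = (int) (sampleRate / 2.0 / freq); // highest safe harmonic
        for (int n = 0; n < numSamples; n++) {
            double t = n / sampleRate;
            double sum = 0.0;
            for (int k = 1; k <= maxHarmonic; k++) {
                sum += Math.sin(2.0 * Math.PI * k * freq * t) / k;
            }
            out[n] = (float) (2.0 / Math.PI * sum);
        }
        return out;
    }
}
```

Square and triangle fall out of the same idea by changing the harmonic weights (odd harmonics only, 1/k or 1/k²).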
-
- KVRist
- Topic Starter
- 154 posts since 15 Feb, 2012
You are correct. In the current application, it seems to make more sense to just render the saw, sine, and square waves.
But when the app was initially conceived, there were to be more elaborate samples as well, say with reverb and effects baked in. So instead of writing a synth engine for a simple ringtone app, it would just resample a handful of samples, say 4 to 8.
So I think for the time being I might want to have it resample, in case I want to introduce more samples later.
-
- KVRist
- 106 posts since 14 Nov, 2009
Why not use an FFT to pitch shift your samples?
http://blogs.zynaptiq.com/bernsee/pitch ... ng-the-ft/
-
- KVRist
- Topic Starter
- 154 posts since 15 Feb, 2012
I was hoping to find a nice middle ground of quality and ease of implementation. I don't want to have to read an entire paper and do trial and error, at least at this point...
-
- KVRAF
- 2256 posts since 29 May, 2012
The original version of this is already in Android as part of the OS: https://github.com/pedrolcl/Linux-SonivoxEas
~stratum~
-
- KVRist
- Topic Starter
- 154 posts since 15 Feb, 2012
I have realised that the aliasing occurs only when pitching samples very high, which introduces aliasing that folds down into low frequencies.
This is because upper harmonics, once pitched, end up well above 22050 Hz (the Nyquist frequency), which a 44100 Hz sample rate cannot represent.
So I was thinking I would upsample my audio to 96000 Hz, then pitch shift it, then downsample. But am I correct in assuming that the same aliasing will be introduced by the downsample?
So I am assuming (bear in mind I am quite new to audio programming) that I would first upsample the audio, then pitch shift, then apply some sort of low-pass filter, then resample down to 44100 Hz?
If this assumption is correct, what sort of low pass filter would suffice?
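The fold-down can be made concrete with a one-liner (class name is illustrative): a sampled frequency aliases to its distance from the nearest multiple of the sample rate. A harmonic pitched up to 43 kHz therefore lands at 1.1 kHz at a 44.1 kHz rate, which is the low-frequency aliasing described:

```java
// Where an above-Nyquist frequency lands after sampling at rate fs:
// frequencies alias to their distance from the nearest multiple of fs.
public class AliasCalc {
    public static double aliasedFreq(double f, double fs) {
        return Math.abs(f - fs * Math.round(f / fs));
    }
}
```

Frequencies already below Nyquist pass through unchanged; only the out-of-band content folds.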
-
- KVRAF
- 2256 posts since 29 May, 2012
We had a similar thread recently http://www.kvraudio.com/forum/viewtopic ... 3&t=475012
Yes, that's about right. During pitch shifting there shouldn't be anything that crosses the new Nyquist frequency, and before downsampling the same thing applies, but this time the Nyquist frequency is the same as the original, of course. Have a look at the WDL-OL IPlug distortion example; it's more or less the same thing. (BTW, since a wavetable can be processed offline, you do not need to run the upsampling part of the algorithm in realtime.)
~stratum~
-
- KVRist
- Topic Starter
- 154 posts since 15 Feb, 2012
So to further flesh out my assumptions...
1. I will upsample the audio. I assume I just apply an interpolation algorithm of my choice to upsample, since this is basically the same as increasing the frequency of the audio, correct?
2. Apply a decently suited low-pass filter (care to name-drop one for me to research?).
3. Downsample using an interpolation algorithm.
Sounds about right?
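On step 1: the textbook formulation of integer-factor upsampling is slightly different from "interpolation of your choice". You insert zeros between the input samples and then low-pass below the original Nyquist to remove the spectral images. A sketch of the zero-stuffing half (names are mine; the filtering half is deliberately left to a proper low-pass design):

```java
// Integer-factor upsampling, step one: zero-stuffing. The output must then
// be low-pass filtered below the original Nyquist to remove the images.
public class Upsampler2x {
    public static float[] zeroStuff(float[] in, int factor) {
        float[] out = new float[in.length * factor];
        for (int i = 0; i < in.length; i++) {
            // gain compensation: the later filtering spreads each impulse out
            out[i * factor] = in[i] * factor;
        }
        return out;
    }
}
```

Linear interpolation between samples is in effect a (rather leaky) choice of that image-removal filter, which is why the textbook version sounds cleaner.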
-
- KVRAF
- 2256 posts since 29 May, 2012
Down-sampling is not an interpolation algorithm. It's actually the same process as sampling an analog signal, i.e. it is a process that throws away data, and for that to be lossless, the signal must not contain anything above the Nyquist frequency.
Similarly, wavetable synthesis is just like sampling an analog signal where you are changing the motor speed of the tape that supplies the signal, i.e. changing the wavetable scan rate. During this process no frequencies above the Nyquist should be encountered at any scan rate, because unlike the process of sampling an analog tape, you cannot low-pass filter a wavetable during sampling; it must be prepared beforehand to be suitable for any scan rate to be used. That's why more than one wavetable is used, usually one or more per octave.
For a suitable low-pass filter, you can even look at what an A/D converter uses. Of course there are better things that can be implemented in software, but the point is, it is possible to view the whole process as sampling an analog tape and nothing else (observing the differences is also necessary). This kind of thinking is conceptually helpful and can clear up a lot of confusion.
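The per-octave table idea can be sketched like this (a hypothetical helper, one sawtooth table per octave): each table one octave up keeps only half as many harmonics, so scanning any table at its intended rate range never produces content above Nyquist:

```java
// One band-limited sawtooth table per octave ("mip-mapped" wavetables).
// The table for octave o keeps only harmonics that stay below Nyquist
// over that octave's range of scan rates; playback picks the right table.
public class WavetableMips {
    public static float[][] buildSawTables(int tableSize, int numOctaves) {
        float[][] tables = new float[numOctaves][tableSize];
        for (int o = 0; o < numOctaves; o++) {
            // highest safe harmonic halves with each octave up
            int maxHarmonic = Math.max(1, (tableSize / 2) >> (o + 1));
            for (int n = 0; n < tableSize; n++) {
                double phase = 2.0 * Math.PI * n / tableSize;
                double sum = 0.0;
                for (int k = 1; k <= maxHarmonic; k++) {
                    sum += Math.sin(k * phase) / k;
                }
                tables[o][n] = (float) (2.0 / Math.PI * sum);
            }
        }
        return tables;
    }
}
```

This is the "prepared beforehand" part: the band-limiting happens at table-build time, not during scanning.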
~stratum~
-
- KVRist
- Topic Starter
- 154 posts since 15 Feb, 2012
But if I am downsampling from 96000 to 44100, I am not simply dropping every other sample; the step between output samples is 96000/44100 ≈ 2.18 source samples, which is fractional and would require interpolation... correct?
-
- KVRAF
- 2256 posts since 29 May, 2012
Why not downsample from 88200 to 44100 instead and drop one of two subsequent samples, after low-pass filtering the signal with cutoff = 20 kHz or so? You do not have to make things harder than necessary.
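With a 2:1 ratio, decimation reduces to "filter, then keep every second sample". A sketch (names are mine; the two-tap average here is only a stand-in for a proper 20 kHz low-pass, which would use many more taps):

```java
// Decimation by 2, e.g. 88200 Hz -> 44100 Hz: anti-alias filter first,
// then keep every second sample. The pairwise average below is a crude
// placeholder for a real low-pass with cutoff around 20 kHz.
public class Decimator {
    public static float[] downsample2x(float[] in) {
        float[] out = new float[in.length / 2];
        for (int i = 0; i < out.length; i++) {
            // average the pair being collapsed (weak anti-alias filtering)
            out[i] = 0.5f * (in[2 * i] + in[2 * i + 1]);
        }
        return out;
    }
}
```

The key point is the ordering: whatever filter is used must run before the samples are discarded, never after.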
~stratum~
-
- KVRist
- Topic Starter
- 154 posts since 15 Feb, 2012
I was just thinking I would upsample to 96000 since that is a common sample rate. I guess that doesn't really matter...
EDIT: On the matter of the filter, I am totally lost. I have never written a low-pass filter before, and I cannot find a tutorial specifying one in code. Is there any resource you might know of that I could look into?
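One common, code-friendly choice for this job is a windowed-sinc FIR low-pass. A sketch (illustrative names, Hamming window, direct-form convolution; offline use means the O(N·taps) cost is fine):

```java
// Windowed-sinc FIR low-pass: the ideal sinc impulse response, truncated
// and shaped by a Hamming window, applied by direct convolution.
public class FirLowpass {
    public static double[] design(double cutoffHz, double sampleRate, int numTaps) {
        double fc = cutoffHz / sampleRate;               // normalized cutoff
        double[] h = new double[numTaps];
        double sum = 0.0;
        for (int n = 0; n < numTaps; n++) {
            double x = n - (numTaps - 1) / 2.0;          // centered time index
            double sinc = (x == 0.0)
                    ? 2.0 * fc
                    : Math.sin(2.0 * Math.PI * fc * x) / (Math.PI * x);
            double window = 0.54 - 0.46 * Math.cos(2.0 * Math.PI * n / (numTaps - 1));
            h[n] = sinc * window;
            sum += h[n];
        }
        for (int n = 0; n < numTaps; n++) h[n] /= sum;   // unity gain at DC
        return h;
    }

    public static float[] filter(float[] in, double[] h) {
        float[] out = new float[in.length];
        for (int i = 0; i < in.length; i++) {
            double acc = 0.0;
            for (int k = 0; k < h.length; k++) {
                if (i - k >= 0) acc += h[k] * in[i - k];
            }
            out[i] = (float) acc;
        }
        return out;
    }
}
```

More taps give a steeper rolloff; something like 63-127 taps is a reasonable starting point for a 20 kHz cutoff at 88.2 kHz.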
-
- KVRist
- 251 posts since 7 Feb, 2017
Maybe https://en.wikipedia.org/wiki/PSOLA or TD-PSOLA methods. Essentially chopping up your waveform into Hann-windowed overlapping blocks and then rearranging them so that there's no overlap before sending to the stream.
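To make the windowing idea concrete, here is just the grain machinery, without the pitch-mark analysis that full (TD-)PSOLA adds (names are illustrative): Hann windows at 50% overlap sum to exactly one, so overlap-adding the grains at their original positions reconstructs the input; pitch and time changes then come from re-spacing the grains.

```java
// Hann-windowed grains at 50% overlap, overlap-added back in place.
// With unchanged grain spacing this reconstructs the interior of the
// input exactly (the windows sum to 1); PSOLA-style pitch shifting
// re-spaces the grains instead of putting them back where they came from.
public class GrainOla {
    public static float[] chopAndReassemble(float[] in, int grainSize) {
        int hop = grainSize / 2;                         // 50% overlap
        float[] out = new float[in.length];
        for (int start = 0; start + grainSize <= in.length; start += hop) {
            for (int n = 0; n < grainSize; n++) {
                double w = 0.5 * (1.0 - Math.cos(2.0 * Math.PI * n / grainSize));
                out[start + n] += (float) (w * in[start + n]);
            }
        }
        return out;
    }
}
```

The windowing hides the seams between grains; the hard part of PSOLA is choosing where to cut (one grain per pitch period) so the rearranged grains stay phase-coherent.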