@Maelstorm said:
Why did I not know about Google Patents? Thanks, katjav!
Elden came up with the first Google Patents link in this thread.
Found another interesting one. Peter Neubäcker's patent application for Melodyne DNA:
http://www.google.com/patents/US8022286?printsec=description#v=onepage&q&f=false
The description shows that there is indeed no special trick, as Peter Neubäcker already mentioned in the video portrait. It is just a very detailed analysis, mainly of frequency domain data, where frequencies are found by the phase-delta method using an FFT overlap of 4 times or more and Hann windowing. For anyone who has ever done basic frequency domain pitch shifting plus pitch detection and attack detection from scratch, all the elements of the DNA technique will be very familiar. But I can imagine how it takes months or years of patient coding to get all the details right. Understanding a concept is one thing, building a robust implementation is something completely different.
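For readers who have not met the phase-delta method, here is a minimal Python sketch of the idea. It is not taken from the patent. The function names, the 1024-point frame and the direct single-bin DFT (used instead of a full FFT to keep the example self-contained) are my own assumptions for illustration. Two Hann-windowed frames are taken one hop apart, and the deviation of the measured phase advance from the advance expected for the bin centre gives the frequency offset within the bin.

```python
import cmath
import math

def hann(n):
    # Periodic Hann window of length n.
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def bin_value(frame, window, k):
    # Direct DFT of one bin k (hypothetical helper; a real patch would use an FFT).
    n = len(frame)
    return sum(frame[i] * window[i] * cmath.exp(-2j * math.pi * k * i / n)
               for i in range(n))

def phase_delta_frequency(signal, start, n, hop, k, sample_rate):
    # Estimate the frequency in bin k from the phase advance between two
    # frames that start `hop` samples apart (hop = n / 4 is 4 times overlap).
    w = hann(n)
    a = bin_value(signal[start:start + n], w, k)
    b = bin_value(signal[start + hop:start + hop + n], w, k)
    expected = 2 * math.pi * k * hop / n          # advance of the bin centre
    deviation = cmath.phase(b) - cmath.phase(a) - expected
    deviation = (deviation + math.pi) % (2 * math.pi) - math.pi  # wrap to [-pi, pi)
    true_bin = k + deviation * n / (2 * math.pi * hop)
    return true_bin * sample_rate / n

# Usage: a 1000 Hz sine at 48 kHz lies between bins 21 and 22 of a 1024-point frame.
sr = 48000
sine = [math.sin(2 * math.pi * 1000 * t / sr) for t in range(2048)]
estimate = phase_delta_frequency(sine, 0, 1024, 256, 21, sr)
```

With a hop of a quarter frame, the wrapped deviation covers plus or minus two bins, which is why the overlap needs to be 4 times or more when a Hann window spreads a partial over several bins.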
Anyhow, a DNA approach is really suitable for studio work but it could not operate on live input. In Pd, we need a low-latency and preferably efficient pitch shifter which doesn't produce artifacts like amplitude modulation, phase randomization, or false pitch detection. For vocals, one would like to control pitch and formants independently. How to do this in Pd was the initial question of this forum thread. So let's go back to that question.
From the information so far, it seems that a time domain pitch synchronous method is most appropriate for this purpose, essentially the Brian Gibson method. This method assumes a monophonic periodic signal. What elements are needed to build this in Pd?
To start with, an accurate and robust periodicity tracker. This could be loosely modeled on Hildebrand's concept, where the signal stream is downsampled to limit the analysis frequency range and improve efficiency. Alternatively, autocorrelation could be done in the frequency domain as a multiplication, where it would also be easy to restrict the frequency range. Prototyping and testing for this can be done as Pd patches.
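As a starting point for such a tracker, here is a rough Python sketch of period detection by autocorrelation. Note that this is plain time domain autocorrelation over a restricted lag range, not Hildebrand's method and not the frequency domain variant. The function name, the window length and the lag bounds are assumptions chosen for illustration only.

```python
import math

def estimate_period(signal, min_lag, max_lag, window):
    # Return the lag (in samples) with the highest autocorrelation.
    # Restricting the lags to [min_lag, max_lag] restricts the pitch range.
    # The signal must hold at least window + max_lag samples.
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        value = sum(signal[i] * signal[i + lag] for i in range(window))
        if value > best_value:   # strict comparison keeps the shortest lag on ties
            best_lag, best_value = lag, value
    return best_lag

# Usage: a tone with a period of 100 samples plus a second harmonic.
tone = [math.sin(2 * math.pi * t / 100) + 0.5 * math.sin(4 * math.pi * t / 100)
        for t in range(600)]
period = estimate_period(tone, 20, 150, 400)
```

For a periodic signal, multiples of the period score as high as the period itself, so in this sketch max_lag should stay below twice the longest expected period. A robust tracker needs more than this: peak interpolation for fractional periods, a confidence threshold, and some handling of octave errors.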
Next, a method for fractional resampling. Using [tabread4~], it is easy to interpolate between samples, and it's even better than the minimum requirement as proposed by Gibson (i.e. linear interpolation). The interpolation quality of [tabread4~] is good enough for moderate resampling ratios like +/- 1 octave. But the problem is that both the input and output of [tabread4~] operate at the current samplerate. Say you have read 48 values at increased speed from a 64-point block and you're done reading: there is no easy way to store these values at indexes 0 to 47 in a resampled array, and continue writing at index 48 in the next signal block. There may be some trick employing [phasor~] objects to address the correct write indexes. Alternatively, a specialized Pd class could be written for this.
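To make the bookkeeping concrete, here is a small Python sketch of what such a specialized class would have to do. It is only a model of the logic, not Pd code. The class name and its fields are hypothetical, and it uses linear interpolation (Gibson's minimum requirement) rather than the four-point interpolation of [tabread4~]. The point is that the fractional read position and the write index both survive from one signal block to the next.

```python
import math

class StreamingResampler:
    # Hypothetical helper: reads fixed-size blocks at a fractional speed and
    # appends the results to one continuous output array.
    def __init__(self, ratio):
        self.ratio = ratio   # input samples consumed per output sample
        self.pos = 0.0       # read position relative to the current block start
        self.prev = 0.0      # last sample of the previous block
        self.out = []        # resampled array; len(self.out) is the write index

    def process(self, block):
        n = len(block)
        buf = [self.prev] + list(block)   # buf[j] holds block index j - 1
        while self.pos < n - 1:
            i = math.floor(self.pos)      # can be -1 right after a block boundary
            frac = self.pos - i
            a, b = buf[i + 1], buf[i + 2]
            self.out.append(a + frac * (b - a))   # linear interpolation
            self.pos += self.ratio
        self.pos -= n                     # carry the read position into the next block
        self.prev = block[-1]
        return len(self.out)              # next write index

# Usage: read 64-point blocks at 4/3 speed, which yields 48 values per block.
resampler = StreamingResampler(4 / 3)
first = resampler.process(list(range(64)))
second = resampler.process(list(range(64, 128)))
```

After the first block the write index is 48, and the second block continues writing from there. Keeping the last sample of the previous block lets the interpolation run across the block boundary without a gap.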
These things are only the very first steps towards a pitch shifter implementation. But they're interesting experiments anyway, and such subroutines can be employed for other purposes as well, if they succeed.
Katja