@Jwif said:
Would it make sense to have modular objects so that different methods in transient/pitch could be used to create specialised analysis algorithms?
...
I guess an issue (amongst others) is syncing the two objects.
Indeed, the problem with analysis objects in Pd is always the transition from signal rate to messages and back again. It's not always helpful to have a message that merely says 'bang, a transient happened somewhere in the last 20 milliseconds'.
Anyway, my approach to analysis would be this: develop a set of subprocesses as C++ classes, which can be individually wrapped in Pd classes, or combined into more comprehensive processes if so desired.
For example, you'd have several pitch estimator classes, each according to a different method. These classes can be thoroughly tested in Pd patches, compared against each other, and eventually optimized. The same goes for transient detectors. All such Pd implementations could be used on their own as independent objects.
In addition to that, you'd develop helper classes for things like polyphase decimation filtering. When all subprocesses are tested and working, they can be combined into more advanced, integrated and optimized classes, for example a fine vocal processor with pitch and formant manipulation, as proposed by Elden in the thread that ignited the recent pitch discussions.
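To make the layout concrete, here is a rough sketch in C++; all class and method names below are made up for illustration, not taken from any existing code:

```cpp
// Hypothetical sketch of the modular layout described above.
#include <cstddef>

// Common interface for interchangeable pitch estimator subprocesses.
class PitchEstimator {
public:
    virtual ~PitchEstimator() {}
    // Analyze one frame, return estimated frequency in Hz (0 = unvoiced).
    virtual float analyze(const float* frame, std::size_t size) = 0;
};

// One implementation per estimation method; each can be wrapped in its
// own Pd class, so the methods can be tested and compared in patches.
class SnacPitchEstimator : public PitchEstimator {
public:
    float analyze(const float*, std::size_t) override
    { /* SNAC analysis would go here */ return 0.f; }
};

class CepstrumPitchEstimator : public PitchEstimator {
public:
    float analyze(const float*, std::size_t) override
    { /* cepstrum analysis would go here */ return 0.f; }
};

// A more comprehensive process (say, a vocal processor) can own several
// such subprocesses and combine their results internally.
```

The point is that each estimator first lives as its own Pd object for testing, and is later composed into bigger processes without rewriting the analysis code.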
Following this concept, I am at the moment working on a pitch estimator according to Philip McLeod's Specially Normalized Autocorrelation (SNAC) method, and will report on it soon, because it has some excellent characteristics, not only in theory but also in practice.
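The core of the method as McLeod publishes it is easy to state: n(tau) = 2*r(tau) / m(tau), where r(tau) is the plain autocorrelation and m(tau) the sum of squared terms over the same overlap, which bounds n(tau) to [-1, 1]. A minimal, unoptimized sketch of that core (my illustrative code, not the actual implementation):

```cpp
// Minimal SNAC sketch following McLeod's published definition:
// n(tau) = 2 * r(tau) / m(tau), with values in [-1, 1].
#include <cstddef>
#include <vector>

std::vector<float> snac(const std::vector<float>& x)
{
    const std::size_t W = x.size();
    std::vector<float> n(W, 0.f);
    for (std::size_t tau = 0; tau < W; ++tau) {
        double r = 0.0, m = 0.0;
        for (std::size_t j = 0; j + tau < W; ++j) {
            r += x[j] * x[j + tau];                     // autocorrelation term
            m += x[j] * x[j] + x[j + tau] * x[j + tau]; // normalization term
        }
        n[tau] = (m > 0.0) ? static_cast<float>(2.0 * r / m) : 0.f;
    }
    return n;
}

// Crude period pick: highest local maximum after the main lobe has
// decayed through zero. A serious estimator would add peak thresholding
// and parabolic interpolation for sub-sample accuracy.
std::size_t pickPeriod(const std::vector<float>& n)
{
    std::size_t tau = 1;
    while (tau < n.size() && n[tau] > 0.f) ++tau;   // skip past the main lobe
    std::size_t best = 0;
    for (; tau + 1 < n.size(); ++tau)
        if (n[tau] >= n[tau - 1] && n[tau] > n[tau + 1]
            && (best == 0 || n[tau] > n[best]))
            best = tau;
    return best;   // period as a lag in samples; 0 means no peak found
}
```

Frequency is then samplerate divided by the picked lag. Because of the normalization, a fixed peak threshold works regardless of signal level, which is part of what makes the method behave so well in practice.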
@Jwif said:
By multirate do you mean upsampling? From some google-fu it seems like polyphase decimation is moving the downsampling process before the filters so that you can process them at the original samplerate? Is this right?
With 'multirate' I mean doing analysis at different sample rates. The point with pitch and transient detection is that the large analysis frame required to accommodate the lowest frequencies is too long for the fast fluctuations. It would be best to have short and long analysis frames running in parallel. The long frames can run in downsampled time, so effectively they would hold as many samples as the short frames, which is efficient. Each rate requires its own bandlimiting operation. Essentially it's the wavelet filter bank technique.
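And on the polyphase question: the trick is that you only ever compute the output-rate samples, so the filtering effectively runs at the decimated rate, not the original one. A naive decimate-by-2 sketch written that way (the coefficients h are assumed to come from a lowpass design, for example a halfband filter, which I leave out here):

```cpp
// Naive decimate-by-2 FIR: only output-rate samples are computed, which
// is the saving that the polyphase structure formalizes into subfilters.
// Coefficients h are assumed to come from a lowpass (e.g. halfband) design.
#include <cstddef>
#include <vector>

std::vector<float> decimateBy2(const std::vector<float>& x,
                               const std::vector<float>& h)
{
    std::vector<float> y;
    for (std::size_t m = 0; 2 * m < x.size(); ++m) {
        double acc = 0.0;
        // y[m] = sum_k h[k] * x[2m - k], skipping samples before the start
        for (std::size_t k = 0; k < h.size() && k <= 2 * m; ++k)
            acc += h[k] * x[2 * m - k];
        y.push_back(static_cast<float>(acc));
    }
    return y;
}
```

Applied recursively, this gives the octave-spaced rates for the long analysis frames, wavelet-style.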
Katja