Sigmund, Fiddle, or Helmholtz faster than realtime?

ablue

Is it possible to do pitch detection faster than real-time? I'd love to be able to analyze the pitch of a file (currently stored in an array) faster than real-time. Are there any options for being able to do analysis like this?

sunji

While you can upsample to make frames go by faster, the speed increase is the same one you'd get by shrinking the window size. And what you gain in speed you lose in accuracy, starting in the lower frequencies. Doubling the SR with a 512 point window is effectively identical to using a 256 point window at the original rate, just with twice the computations.

One issue with the spectral domain is that you need a history of a certain number of samples to infer frequency information. I.e., if you want a 10 ms response time (just the spectral analysis, not total overhead), the window will not even register 60 Hz tones, since 10 ms covers only 0.6 of a cycle at that frequency.

You can get away with smaller windows for higher voices. If you are tracking a piccolo or a violin, a 128 point window might be sufficient. 128 samples at SR 48k is only about 2.7 ms of added latency, which is not too bad!
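
To put numbers on the window trade-off above, here is a minimal C sketch. The function names are made up for illustration and are not part of Pd or any of the pitch externals.

```c
/* Duration of an analysis window in milliseconds. */
double window_ms(int window_samples, double sample_rate_hz)
{
    return 1000.0 * window_samples / sample_rate_hz;
}

/* How many cycles of a tone fit inside the window.
   A value below 1.0 means the window never sees a full period. */
double cycles_in_window(int window_samples, double sample_rate_hz,
                        double tone_hz)
{
    return tone_hz * window_samples / sample_rate_hz;
}
```

For example, window_ms(128, 48000.0) is roughly 2.67 ms, and cycles_in_window(480, 48000.0, 60.0) is 0.6, i.e. a 10 ms window holds less than one cycle of a 60 Hz tone. window_ms(512, 88200.0) and window_ms(256, 44100.0) come out the same, which is the "doubling the SR" point from earlier.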

ablue

@sunji said:

While you can upsample to make frames go by faster, the speed increase is the same one you'd get by shrinking the window size. And what you gain in speed you lose in accuracy, starting in the lower frequencies. Doubling the SR with a 512 point window is effectively identical to using a 256 point window at the original rate, just with twice the computations.

Thanks. Right now, the pitch detection range is 25 Hz to ~3000 Hz. Basically the frequency range of the most common instruments.

I have an app that does pitch detection in real time as people tune their instruments. It records the results and gives a printout of how well they played in tune. I'd also like to give them this same printout for things they previously recorded (so not in real time). So, that's the challenge right now: to do the same pitch detection that's happening in real time, but as fast as possible when using a pre-recorded source.

jancsika

I believe there is a flag to put Pd into "batch mode" where it processes everything as fast as possible.

ablue

I'm finally getting a chance to look back at this.

@jancsika I couldn't find a reference to batch mode. I'm also using libpd, so I'm not sure if that's something that's implemented there.

I tried upsampling again using the re-sampling example patch that helmholtz~ comes with (http://www.katjaas.nl/helmholtz/helmholtz.html). From what I can tell, processing was faster and the pitches were correct, but the lowest detectable pitch doubles every time the sample rate doubles. For example, at 44100 the lowest is 25 Hz, and upsampling 8 - 16x raised the lowest to around 400 Hz, which is too high.
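
The scaling above can be written out as a small C sketch. It assumes the analysis window stays the same length in samples while the signal is upsampled; the function names and the 2048-sample base window used below are hypothetical, not helmholtz~ settings.

```c
#include <stddef.h>

/* Lowest detectable pitch after upsampling by `factor`,
   assuming the window length in samples is left unchanged. */
double lowest_hz_upsampled(double base_lowest_hz, double factor)
{
    return base_lowest_hz * factor;
}

/* Window length (in samples) needed to keep the original low limit
   after upsampling by `factor`. */
size_t window_to_keep_range(size_t base_window, size_t factor)
{
    return base_window * factor;
}
```

With a 25 Hz floor at 44100, lowest_hz_upsampled(25.0, 8.0) gives 200 Hz and lowest_hz_upsampled(25.0, 16.0) gives 400 Hz. Keeping the 25 Hz floor at 16x would need window_to_keep_range(2048, 16) = 32768 samples per window, which gives back the speed that the upsampling bought.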

Are there any other potential methods for this?

jancsika

@ablue It changes Pd's scheduler to this:

int m_batchmain(void)
{
    while (sys_quit != SYS_QUIT_QUIT)
        sched_tick();
    return (0);
}

In other words, it computes your audio blocks as fast as possible until the program halts.

You can select this scheduler by starting Pd with the flag "-batch". (E.g., pd -batch foo.pd)
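
Outside of Pd, the same "don't wait for the audio clock" idea can be sketched in plain C for a recording that is already sitting in an array. This is only an illustration under loud assumptions: estimate_pitch_hz and analyze_offline are hypothetical names, and the crude autocorrelation estimator is a stand-in, not what sigmund~, fiddle~ or helmholtz~ actually do.

```c
#include <stddef.h>

/* Crude autocorrelation pitch estimate for one frame (illustrative
   stand-in only). Returns 0.0 when no pitch is found. */
double estimate_pitch_hz(const float *frame, size_t n, double sr,
                         double min_hz)
{
    size_t max_lag, best_lag = 0;
    double best = 0.0;
    int seen_negative = 0;

    if (n < 2 || min_hz <= 0.0)
        return 0.0;
    max_lag = (size_t)(sr / min_hz);
    if (max_lag > n - 1)
        max_lag = n - 1;

    for (size_t lag = 1; lag <= max_lag; lag++) {
        double sum = 0.0;
        for (size_t i = 0; i + lag < n; i++)
            sum += (double)frame[i] * (double)frame[i + lag];
        /* Skip the positive lobe around lag 0, then keep the largest peak. */
        if (sum < 0.0) {
            seen_negative = 1;
        } else if (seen_negative && sum > best) {
            best = sum;
            best_lag = lag;
        }
    }
    return best_lag ? sr / (double)best_lag : 0.0;
}

/* Walk the whole recording frame by frame as fast as the CPU allows.
   Nothing here sleeps or waits for a sound card, which is the same idea
   as the batch scheduler loop. Returns the number of estimates written. */
size_t analyze_offline(const float *samples, size_t total, double sr,
                       size_t frame, size_t hop, double min_hz,
                       double *pitches_out, size_t max_out)
{
    size_t count = 0;

    if (hop == 0)
        return 0;
    for (size_t start = 0;
         start + frame <= total && count < max_out;
         start += hop)
        pitches_out[count++] =
            estimate_pitch_hz(samples + start, frame, sr, min_hz);
    return count;
}
```

Since the sample rate and window size stay the same as in the realtime case, the detectable range doesn't change. The only speedup comes from not being tied to the audio clock.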

lacuna

wow, can you even set this flag with [pd~]?

ablue

@jancsika Thanks for this information.

I left a comment on the libpd repo to see if there's a way to set this in libpd, since it appears libpd doesn't use the args passed in.

jancsika

Here's my understanding of the situation:

The [pd~] process plugs in its own special scheduler to essentially do its own, deterministic version of "batch" processing. So technically speaking [pd~] will process multiple blocks of audio as fast as it receives them (i.e., faster than realtime).

The constraint is that the Pd process it communicates with sends it audio on a regular schedule. So you can only set the number of blocks of audio you want pd~ to process "faster than realtime" with the "-fifo" flag. And the number you set is bound by the maximum size allowed by your OS for the pipe between the Pd and pd~ process.

On top of that, you can only get the output from [pd~] on the regular schedule determined by block size, sample rate, and the latency from the value you gave to the "-fifo" flag. So while pd~ technically processes blocks "faster than realtime" (it would be pointless if it couldn't), you can't actually receive data from it faster than your current audio rate allows.

Now, one could design a [batchpd] object that spawns a separate Pd process which runs in batch mode and pipes Pd messages back to the original Pd process before it dies. But you'd either have to make the output of [batchpd] block (i.e., wait) until the subprocess dies, or deliver output at an arbitrary future time. The former would cause dropouts if you're trying to compute audio, and the latter would be non-deterministic.

chmod

@ablue

Did you make any progress with this issue? I'm in a similar situation myself where I've been asked to build an automated offline testing system for a pitch-tracking rhythm game. Timing is a really important aspect of my patch (it tracks not only pitches but rhythms), so I'm not sure that this kind of system would be feasible in my case. I'm thinking my best bet would just be a system that automatically goes through each level one by one using a test MIDI file as input (with random variances to test accuracy). The game currently works with instruments as low as the tuba, so I'm not sure if I'd be able to use oversampling in my case; but please let me know if I'm incorrect!