Serious problem with poly object!!
@skatias said:
But when I press a note twice and want it sustained by the finger while I release the pedal, the pedal sends all-notes-off and the finger sustain doesn't work.
I don't think that should happen unless the previous note has been stolen. But if that is the case, you might want to handle this inside the voice itself. I don't have time to make an example right now, but I'll try to explain. Right now I imagine the voice is set up to go into its release stage when it receives a velocity of 0. You could put a [ctlin] inside the voice and have it check whether the sustain pedal is down when the velocity 0 arrives. If it is, have the voice wait for the pedal to be released before going into its release stage; otherwise let the note-off pass. This should definitely prevent new voices from being turned off.
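That per-voice logic could be sketched like this (the class and method names are made up for illustration; in Pd this would be the [ctlin] check inside each voice of the [poly] patch):

```python
# Sketch of the sustain-pedal check described above: a voice that gets
# velocity 0 while the pedal is down defers its release until pedal-up.

class Voice:
    def __init__(self):
        self.playing = False
        self.pending_release = False

    def note_on(self, pitch, velocity):
        self.playing = True
        self.pending_release = False  # a retrigger cancels a deferred release

    def note_off(self, sustain_down):
        if sustain_down:
            self.pending_release = True   # wait for the pedal instead
        else:
            self.playing = False          # release immediately

    def pedal_up(self):
        if self.pending_release:          # the deferred release fires now
            self.playing = False
            self.pending_release = False
```

With this, a pedal-up only releases voices that actually received a note-off while the pedal was held, so freshly triggered voices survive it.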
As for making sure a specific note isn't duplicated, you'll have to keep track of which notes have been sent to which voice. Here's what I'm thinking: for every note-on/note-off coming out of [poly], store the note value in a table the same length as the number of voices. Use [poly]'s voice number as the table index; for every note-on store the note number, and for every note-off store something out of range like -1. When a new note-on comes through, run a quick check through the table to see if the same pitch is already in use before [route]ing it. If you find a match, change the voice number to that table index so the note goes to the same voice and retriggers it, making sure to reset the voice if it's being held by the sustain pedal.
Or something like that.
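The table idea above could be sketched like this (the voice count and the -1 sentinel are assumptions; in Pd the table would be an array indexed by [poly]'s voice number):

```python
# Track which pitch each voice is playing; redirect duplicate note-ons
# to the voice already holding that pitch, as described above.

NUM_VOICES = 8
NO_NOTE = -1                       # out-of-range marker for "voice free"
voice_table = [NO_NOTE] * NUM_VOICES

def note_on(voice, pitch):
    # If the same pitch is already sounding, retrigger that voice instead.
    if pitch in voice_table:
        voice = voice_table.index(pitch)
    voice_table[voice] = pitch
    return voice                   # voice number to [route] the note to

def note_off(voice):
    voice_table[voice] = NO_NOTE
```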
Filtering speech from adc
@flo said:
separating voice from any other sound is quite a task, even with professional studio techniques, as far as i can see.
maybe you could approach this in a different way, such as having an array of directional mics pointed into the room at the average height of a human mouth, set up to record everything as soon as a certain level of noise (hence, presumably a voice) is detected.
then you could use some EQing to filter out everything outside the human voice's spectral range.
anyway, keep in mind that it will be difficult to get a really clean recording.
Filtering speech from adc
you mean you have a stereo music signal, e.g. a complete band with guitars, bass, drums and vocals, on the inlet, and you only want the voices on the outlet? no, i'm pretty sure that's not possible, or at least not a trivial task.
if you're thinking of the opposite of those effects that filter a voice out of a music file so you get only the instruments as a result - those effects don't really filter the voice, they filter everything in the middle of the stereo field, such as the voice, but unfortunately the bass guitar and other instruments are often panned to the middle too, so they get filtered as well.
edit: if you have a constant noise in a voice recording, then it's possible to reduce it more or less without big changes to the voice. this is a task for FFT analysis and FIR filters. there's a nice freeware plugin called ReaFIR, look here:
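For what it's worth, the "filter everything in the middle of the stereo field" trick those effects use boils down to subtracting the channels; a minimal sketch (plain sample lists stand in for audio buffers):

```python
# "Center channel" removal: the side signal L - R cancels anything panned
# dead center (the voice, but also bass and anything else in the middle).
def remove_center(left, right):
    return [l - r for l, r in zip(left, right)]
```

Identical samples in both channels cancel to zero, which is exactly why center-panned bass disappears along with the vocal.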
http://www.reaper.fm/reaplugs/
"winds of doom" soundscape-generator
@toxonic said:
i have to improve some little things in my structure to save some CPU power (for example, i use a [noise~] object in every(!) voice, but i only need one for all voices) - at the moment it's not possible to run 100 voices on my machine. but i find that this many voices aren't needed to get a good, fat sound.
yes, a single instance of [noise~] would help reduce the CPU load. Another (maybe useful) suggestion is to use
a [switch~] for the voices that are not in use at a given time. Anyway you're right, a nice sound does not need
too many voices (96 are a lot, 128 too many).
@toxonic said:
what i found out in your patch is that reducing the voices leads to a reduction of mainly the high voices - was this your intention?
I approached that problem by using a "frequency spacing" parameter, so even with few voices I can span
the 20-20000 Hz region (each "voice" has a center frequency f = f0 x n x dF, where f0 is the "base" frequency,
n is the voice number and dF is the frequency spacing). In that case, if you reduce the number of voices and
want to preserve the full spectrum span, you need to increase the frequency spacing.
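As a quick sketch, the center-frequency formula above in Python (the function name is just for illustration):

```python
# Center frequencies f = f0 * n * dF for voices n = 1..num_voices,
# as described above: fewer voices + larger dF preserves the span.
def center_freqs(f0, num_voices, dF):
    return [f0 * n * dF for n in range(1, num_voices + 1)]
```

Halving the voice count while doubling dF keeps the top frequency the same, which is the trade-off described above.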
Anyway, I really enjoyed winds of doom, thanks for sharing.
I particularly like the "drops" effect (preset 2, if I'm not mistaken), it's really good.
@toxonic said:
p.s.: a Skrewell adaptation would be a nice idea too, but i haven't gotten into it yet.... i hate reverse engineering...
Skrewell is an instrument I'd pay to have ported to Pd (ok, kudos, not real euros).
In case you change your mind, let me know - maybe sometime we can join efforts on this...
Anyone else? 
Alberto
"winds of doom" soundscape-generator
cool, nice sounding patch too! it sounds a little different from mine, but great as well. i have to improve some little things in my structure to save some CPU power (for example, i use a [noise~] object in every(!) voice, but i only need one for all voices) - at the moment it's not possible to run 100 voices on my machine. but i find that this many voices aren't needed to get a good, fat sound.
what i found out in your patch is that reducing the voices leads to a reduction of mainly the high voices - was this your intention? i solved that by creating the voices dynamically as abstractions with their base pitch as an argument - but i create them in steps of ten and use modulo to start again at the lowest voice.... sorry, my english is too poor.
what i mean is, with a count of ten voices i have already covered the whole frequency spectrum.
thank you for the link, i'll have a look at your other creations later!
p.s.: a Skrewell adaptation would be a nice idea too, but i haven't gotten into it yet.... i hate reverse engineering...
Dynamic voice management
@mod said:
how about making your own version of [poly] as an abstraction? that way you can do what you want with it.
?? uh, i guess that's not really a trivial thing to do!? ok, you could use a counter to generate a voice number and a [pack] object to pack the pitch/velocity pairs together with that voice number. but when i hit, for example, one key and keep holding it down while i hit another key (now the second voice is triggered...) - when i let the first key go after a while, how can i set the velocity of the corresponding "triple" (voice#, pitch, velocity) back to zero?
has somebody built something like this yet? it would be cool if i could have a look at an existing patch... i don't have a clue!
edit: ok, after hours of trial and error, i'm quite sure that the way i planned it - using a midi-triggered counter to generate the voice number - won't work for building a poly-like abstraction:
when a key is held down while other keys are pressed, the corresponding voice number has to be "saved" somehow, so that only the remaining voices are used... the other issue i have is how to get the velocity values back to zero within the "midi triple" messages when the key is released...
i really have no idea how to figure that out - can someone help?
sorry for my poor english.....
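One way to attack the "how do I free the right voice on note-off" problem is to remember which voice holds which pitch; here is a minimal sketch of a [poly]-style allocator (names are made up, and it does no voice stealing when all voices are busy, which a real replacement would need):

```python
# A note-on takes a free voice and records pitch -> voice; a note-off
# (velocity 0) looks the voice up by pitch, so the right "midi triple"
# (voice#, pitch, 0) can be emitted and the voice returned to the pool.

class PolyAlloc:
    def __init__(self, num_voices):
        self.held = {}                      # pitch -> voice number
        self.free = list(range(1, num_voices + 1))

    def event(self, pitch, velocity):
        if velocity > 0:
            voice = self.free.pop(0)        # grab the next free voice
            self.held[pitch] = voice
        else:
            voice = self.held.pop(pitch)    # find who holds this pitch
            self.free.append(voice)
        return (voice, pitch, velocity)     # the "midi triple"
```

The pitch-to-voice dictionary is exactly the "saved" voice number the post asks about: the note-off doesn't need a counter at all, it just looks the voice up by pitch.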
Audio recognition with FFT
Check this post
http://puredata.hurleur.com/sujet-2508-frequency-analyzer
I posted a patch in there, "spect.pd", which analyzes the spectrum and writes the data into a table which you can then save, recall, and so on.
There is a problem with what you have in mind, though: you want to compare the spectra of two sounds, but the spectrum of a sound is NOT stable - you won't get the "exact" same result each time you analyze even exactly the same sound. You should focus on the specific partials that dominate a particular sound's spectrum, and then compare two sounds that share the same characteristics in their timbre.
There is another patch in there called "spect2.pd" which has a threshold you can set so that only partials with an amplitude above the threshold pass through. Using such a function, you can detect which partials are dominant in the sound you want to use as the basis for comparison, and make a patch that looks for those in the spectral data you provide.
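The threshold idea can be sketched like this (a naive DFT for clarity rather than speed; the function name and signature are made up, not taken from spect2.pd):

```python
import cmath

def dominant_partials(signal, sr, threshold):
    # Magnitude spectrum via a naive DFT; keep only bins whose magnitude
    # exceeds the threshold, like the spect2.pd idea described above.
    n = len(signal)
    partials = []
    for k in range(n // 2 + 1):
        mag = abs(sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                      for i, x in enumerate(signal)))
        if mag > threshold:
            partials.append((k * sr / n, mag))   # (frequency Hz, magnitude)
    return partials
```

Comparing two sounds then reduces to comparing their surviving (frequency, magnitude) lists instead of whole, unstable spectra.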
How to populate 1 array with 4 incomming number streams
ahhh, I was wrong about how [poly] assigns voices - I mistakenly thought it would always try the lowest voice first, so you could do voice 1 -> pad, voice 2 -> perc, voice 3 -> bass, etc. to get what you want (which is: each blob gets one sound until it dies, with some sounds preferred when there aren't many blobs - as far as I understand; I might have misinterpreted though).
BUT: it seems [poly] just cycles through all the voices, which isn't good for this problem.
sorry for leading you up the wrong path.
I'll think some more about this problem - maybe I'll have something useful to show to make up for the confusion.
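For reference, the "lowest voice first" assignment I had assumed would look something like this sketch (a hypothetical helper, not what [poly] actually does):

```python
# Always take the smallest free voice number, so voice 1 (pad) is
# preferred over voice 2 (perc) etc. when only a few blobs are active.
def assign_lowest(free_voices):
    voice = min(free_voices)
    free_voices.remove(voice)
    return voice
```

Since [poly] cycles instead, getting this behavior means doing the allocation yourself (e.g. in an abstraction) rather than relying on [poly]'s voice numbers.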
Programmable sweep management?
This may sound confusing; if so, no worries.
OK... right now I'm working with 4 voices, each with 5 values that I want to be able to occasionally sweep (fundamental freq, vibrato range & speed, tremolo level & speed). I don't have a piano keyboard, and may never get one, because I prefer to send ordered, pre-loaded lists of notes and envelopes to the instruments to control their sounds, rather than MIDI device inputs. I'm wondering if there's an easier way to program sweep information than the method I'm building now (which is to store each upcoming sweep as its own pre-loaded table of offset values, and have each list of tables accessed by sending an address referencing the voice modifier I wish to sweep). That way I can encode the ranges, direction changes, and speeds of a particular sweep motion in a single, easily resized data structure.
Basically, when I want a certain input to a particular voice swept, I just negate the value I'm sending (since they are all otherwise positive), then use cell logic to return the absolute values to the voice player; if the value is negative, I also send a message to my sweep sequencer consisting of "addresses", which are just numbers representing which voice's modifiers I want swept on the current beat. The sweep sequencer then decodes this address using a large [select] to choose which of 20 lists of sweep-offset tables to use: it [tabread]s through the table at 1 index/ms, sends and adds the sweep-offset values to the proper modifier input of my voice player, and finally primes the next offset table to use the next time the same address is sent to the sweep sequencer.
This means that my little 4-voice synth, with 5 sweepable modifiers per voice, will require 20 separate lists of tables (I figure anywhere from 0-30 tables per list in a given track). To me this seems like kind of an ugly/expensive approach, but it's a lot faster than the only other programmable-sweeps method I could come up with.
I bet this post looks like a train wreck, but hopefully someone else has "been there" and can visualize what I'm trying to do. Is this going to be the easiest way to send programmed sweeps? Maybe there's something I'm overlooking that would do this with less work and code volume?
And if this is my best bet, then is 1 index/ms a good speed for extracting sweep offsets from the tables? This is kind of a guess... I shouldn't need my sweep response to be faster, but maybe it needs to be slower? The most voice modifiers I'll ever sweep simultaneously on a given beat is 12, with the average being more like 1-2.
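As a sanity check of the 1 index/ms idea, here is a tiny sketch of stepping through a sweep-offset table once per millisecond and adding each offset to a modifier's base value (names are hypothetical; in Pd this would be [tabread] driven by a [metro 1], or [line~]/[tabread4~] at audio rate):

```python
# One table index per millisecond: the table index doubles as the
# elapsed time in ms, and each offset is added to the base value.
def run_sweep(offset_table, base_value):
    return [(ms, base_value + off) for ms, off in enumerate(offset_table)]
```

At 1 index/ms a 500-entry table is a half-second sweep, which is a handy mental conversion when sizing the tables; control-rate sweeps rarely need to be faster than that.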
Error: tabsend~: $0-hann: no such array
- To cross synthesise two voices you must ensure that two speakers make exactly the same utterances which are phonetically aligned. This is hard as I can tell you from experience of recording many voice artists. Even the same person will not speak a phrase the same way twice.
<< This is not possible in my experiment, as I am supposed to morph an actual conversation, so it is up to the subjects what they want to speak. There is some work done by Oytun Türk and Levent M. Arslan, who conducted an experiment in a passive environment (not in real time). >>
- The result is not a "timbral morph" between the two speakers. The human voice is very complex. Most likely the experiment will be invalidated by distracting artifacts.
Here are some suggestions.
- Don't "morph" the voices, simply crossfade/mix them.
<< yes, I also want to do this - crossfading and mixing - as I just want to create an illusion, so that the listener starts wondering whether it is B's voice or A's voice >>
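A plain crossfade between the two voices could look like this sketch (equal-power gains are one common choice; plain sample lists stand in for audio blocks):

```python
import math

# Crossfade voice A into voice B; equal-power gains (cos/sin) keep the
# perceived loudness roughly constant through the middle of the fade.
def crossfade(a, b, mix):              # mix: 0.0 = all A, 1.0 = all B
    ga = math.cos(mix * math.pi / 2)
    gb = math.sin(mix * math.pi / 2)
    return [ga * x + gb * y for x, y in zip(a, b)]
```

Sweeping `mix` slowly from 0 to 1 over the conversation gives the A-to-B illusion without any spectral processing at all.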
- For repeatable results (essential to an experiment) a real-time solution is probably no good. Real time processing is very sensitive to initial conditions. I would prepare all the material beforehand and carefully screen it to make sure each set of subjects hears exactly the same signals.
<< Yes, I agree, but it is a demand of the experiment; I cannot control the environment. To create a good illusion (or to distract the listener), I may add noise to the signal in real time - it would challenge the listener's brain in identification, so I may use such tricks for success. >>
- If you want a hybrid voice (somewhere between A and B), then vocoding is not the best way. There are many tools better suited than Pure Data which are open source and possible to integrate into a wider system.
<< Actually, I now have a little familiarity with Pure Data, so it is better for me to stick with it for a while (due to short time). Yes, if I continue my PhD in this domain, I will explore other tools as well; for the time being it is a kind of pilot study. >>
Question 1:
Is a vocoder alone sufficient to morph/mix/crossfade between two voices,
or should I also add a pitch-shifting module to the vocoder to get better results?
Question 2:
I already tried this vocoder example, but could not change it according to my requirements. In my requirements, I have a target voice (which is phonetically rich), and the source speaker is speaking (whatever he wants to speak) while the source voice is being changed into the target voice (illusion/crossfade/mix).
The changetimbre1.pd file that I attached first gives you an idea of what kind of operational interface I am looking for.
Question 3:
What should be the ideal length of the target wave file?
Before the start of experiment, I would collect the voice sample of all participants.
I am highly obliged for your earlier help and am looking for more (greedy). Meanwhile, I will once again study this vocoder example and try to change it according to my requirement (though I doubt I can).
Thanks.

