How can I measure the durations of the voiced and unvoiced segments in an input signal?