• Jona

    @whale-av thanks, everything works fine. I was just wondering whether every Pd instance needs its own port. Obviously not if it receives UDP. Thanks for your explanations.

    posted in technical issues read more
  • Jona

    @whale-av thanks a lot for your explanations, I think I understand the netsend object better now, and it works with multiple Pd instances. It seems that each Pd instance needs its own port?
    pdtexttospeechwindows.zip
    This is the GUI instance:
    textospeechgui.PNG
    And these are two of the netreceive instances:
    textospeech1.PNG
    textospeech2.PNG
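
    On the port question: a TCP port can normally have only one listener at a time, which is why each receiving instance ends up with its own port. A minimal sketch to check this from Python, assuming the instances run on the local machine; `tcp_port_is_free` and `first_free_port` are hypothetical helper names, and 3000 is just the port used in the scripts in this thread:

```python
import socket


def tcp_port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently be bound to host:port."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        probe.bind((host, port))
        return True
    except OSError:
        # Typically "address already in use": something else holds the port.
        return False
    finally:
        probe.close()


def first_free_port(start=3000, count=10):
    """Return the first free TCP port in [start, start + count), or None."""
    for port in range(start, start + count):
        if tcp_port_is_free(port):
            return port
    return None
```

    Calling `first_free_port()` before starting another instance would give a port number to use for its [netreceive].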

  • Jona

    I noticed one problem: while the text-to-speech command prompt is running via [system], the Pure Data patch is "frozen". Is there a way to solve this (so that I can, for example, play a female and a male voice together)?
    I also tried the [sys_gui] method: https://forum.pdpatchrepo.info/topic/10168/is-it-possible-to-execute-an-exe-from-within-puredata/10
    But somehow I cannot execute an .exe file with it (message to [sys_gui] : exec "path to .exe")...
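
    One possible workaround, sketched here as an assumption rather than something tested with this patch: let a small Python helper start the text-to-speech program instead of Pd, because `subprocess.Popen` returns immediately and does not wait for the program to finish. `start_command` and `start_all` are hypothetical names, and the command line of the speech tool is deliberately left to the caller:

```python
import subprocess


def start_command(args):
    """Start an external command and return at once, without waiting for it."""
    return subprocess.Popen(args)


def start_all(commands):
    """Start several commands so that they run side by side.

    Each entry of `commands` is an argument list, for example the path to
    the speech tool followed by its own arguments.
    """
    return [start_command(args) for args in commands]
```

    Two voices would then be two entries in the list passed to `start_all`; calling `wait()` on the returned handles is only needed if the script has to know when they are done.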

  • Jona

    It does work now on Windows with this tool: https://www.elifulkerson.com/projects/commandline-text-to-speech.php
    and the [system] object from the motex library,
    with this Python code:

    import socket

    import speech_recognition as sr

    # Connect to the [netreceive] that listens on port 3000 in the Pd patch.
    s = socket.socket()
    host = socket.gethostname()
    port = 3000
    s.connect((host, port))

    r = sr.Recognizer()

    while True:
        with sr.Microphone() as source:
            r.adjust_for_ambient_noise(source, duration=1)
            print("Say something!")
            audio = r.listen(source, phrase_time_limit=5)

        try:
            # Recognize once and reuse the text for Pd and for the console.
            message = r.recognize_google(audio, language="en-US")
            # The trailing " ;" terminates the message for [netreceive].
            s.sendall((message + " ;").encode("utf-8"))
            print("PD Message: " + message)
        except sr.UnknownValueError:
            print("Google could not understand audio")
        except sr.RequestError as e:
            print("Google error; {0}".format(e))
    

    and this patch: speechtoquote.pd
    here is the csv file that I used: https://github.com/akhiltak/inspirational-quotes/blob/master/Quotes.csv


    and here is a texttospeech minimal example (which does not need python):
    texttospeech.PNG

  • Jona

    @Johnny-Mauser exactly that (even if it is not a real-world project (yet), but rather discovering the possibilities of speech recognition and Pure Data). I can probably transfer it to Windows if you (or somebody else) can give me some more hints on how that is done with Pure Data and OSX. This is what I found about speech recognition and OSX in the mailing list: https://www.mail-archive.com/pd-list@iem.at/msg45054.html

  • Jona

    This is a combination of speech to text and text to speech. It is mainly copied from here: https://pythonspot.com/speech-recognition-using-google-speech-api/
    It works offline with Sphinx too, but then it is less reliable.

    import socket

    import speech_recognition as sr
    import vlc
    from gtts import gTTS

    # Connect to the [netreceive] that listens on port 3000 in the Pd patch.
    s = socket.socket()
    host = socket.gethostname()
    port = 3000
    s.connect((host, port))

    r = sr.Recognizer()

    while True:
        with sr.Microphone() as source:
            r.adjust_for_ambient_noise(source, duration=1)
            print("Say something!")
            audio = r.listen(source, phrase_time_limit=5)

        try:
            # Recognize once and reuse the text for Pd, the console and the speech output.
            message = r.recognize_google(audio, language="en-US")
            # The trailing " ;" terminates the message for [netreceive].
            s.sendall((message + " ;").encode("utf-8"))
            print("PD Message: " + message)
            # Turn the recognized text back into speech and play it.
            tts = gTTS(text=str(message), lang="en-us")
            tts.save("message.mp3")
            p = vlc.MediaPlayer("message.mp3")
            p.play()
        except sr.UnknownValueError:
            print("Google could not understand audio")
        except sr.RequestError as e:
            print("Google error; {0}".format(e))
    

    What I would like to achieve: I have a CSV file with ~60000 quotes. The recognized word is sent to Pure Data, where [netreceive] picks it up, and the patch randomly chooses one of the quotes in which the recognized word appears. That does work.
    My question: Is it possible to send the chosen quote with [netsend] back to Python and transform it into speech there?
    Somebody says a word and the computer answers with a quote where the word appears...
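
    A rough sketch of what the Python side of that could look like, assuming [netsend] connects over TCP to a port that Python listens on and sends text messages terminated by ";". The port 3001 and the names `split_fudi` and `receive_quotes` are assumptions made up for this sketch, and escaped characters inside a quote (for example "\;" or "\,") are not handled:

```python
import socket


def split_fudi(buffer):
    """Split received bytes into complete ';'-terminated messages.

    Returns (messages, rest): the decoded messages and the unfinished
    tail that has to be kept for the next recv() call.
    """
    *complete, rest = buffer.split(b";")
    messages = [" ".join(part.decode("utf-8").split()) for part in complete]
    return [m for m in messages if m], rest


def receive_quotes(port=3001, host="127.0.0.1"):
    """Accept one connection from [netsend] and yield each message as text."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind((host, port))
        server.listen(1)
        conn, _ = server.accept()
        with conn:
            pending = b""
            while True:
                data = conn.recv(4096)
                if not data:
                    break  # Pd closed the connection
                messages, pending = split_fudi(pending + data)
                yield from messages
```

    Each text yielded by `receive_quotes()` could then be handed to gTTS in the same way as `message` in the script above.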
    Does it make sense to use Pure Data for that task, or is it better to just use Python (I do not know how to do that with Python alone yet...)?
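
    For the Python-only route, the lookup step itself is small. A minimal sketch, assuming the quotes have already been read into a list of strings (for example with the `csv` module; the column layout of the CSV file is not assumed here). `quotes_containing` and `pick_quote` are hypothetical names:

```python
import random
import re


def quotes_containing(word, quotes):
    """Return the quotes in which `word` occurs as a whole word, ignoring case."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [quote for quote in quotes if pattern.search(quote)]


def pick_quote(word, quotes, rng=random):
    """Pick one matching quote at random, or return None if nothing matches."""
    matches = quotes_containing(word, quotes)
    return rng.choice(matches) if matches else None
```

    The recognized `message` from the script above would be passed as `word`, and the returned quote could go straight to gTTS without a round trip through Pd.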
    The Universal Sentence Encoder sounds really promising for "understanding" the meaning of a sentence and finding the most similar quote. But that is far too complicated for me to implement...
    https://www.learnopencv.com/universal-sentence-encoder/

  • Jona

    @EEight that is nice. Your 10-year-old approach seems almost more reliable than mine with Google speech recognition now. What language or library did you use back then?

  • Jona

    The initial idea of my "research" was to find the subtitle of a movie that is most similar to the spoken input and jump to the corresponding position.
    It seems to be more complicated than I naively thought, but still theoretically possible with machine learning technologies like word2vec or doc2vec ;) I do not think that it is possible with Pure Data alone (perhaps I am wrong)...
    Still, it is nice that basic speech recognition can be implemented in Pure Data...

  • Jona

    I am trying to use speech recognition in Pure Data with this Python script:

    import socket

    import speech_recognition as sr

    # Connect to the [netreceive] that listens on port 3000 in the Pd patch.
    s = socket.socket()
    host = socket.gethostname()
    port = 3000
    s.connect((host, port))

    r = sr.Recognizer()

    while True:
        with sr.Microphone() as source:
            r.adjust_for_ambient_noise(source, duration=1)
            print("Say something!")
            audio = r.listen(source, phrase_time_limit=5)

        try:
            # The trailing " ;" terminates the message for [netreceive].
            message = r.recognize_google(audio) + " ;"
            s.sendall(message.encode("utf-8"))
            print("Google PD Message: " + message)
        except sr.UnknownValueError:
            print("Google Speech Recognition could not understand audio")
        except sr.RequestError as e:
            print("Could not request results from Google Speech Recognition service; {0}".format(e))
    

    this is the test patch: pythonspeech.pd
    It should change the canvas color when it recognizes one of the words red / blue / green / yellow / white / black in the microphone input while the Python script is running.
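
    The patch does the word matching itself, but the same filtering could also happen in Python before sending. A small sketch under that assumption, with `color_words` and `to_fudi` as hypothetical helper names:

```python
COLORS = ("red", "blue", "green", "yellow", "white", "black")


def color_words(phrase):
    """Return the color names found in a recognized phrase, in spoken order."""
    return [word for word in phrase.lower().split() if word in COLORS]


def to_fudi(words):
    """Format the words as one ';'-terminated message for [netreceive]."""
    return (" ".join(words) + " ;\n").encode("utf-8")
```

    In the script above, `s.sendall(to_fudi(color_words(text)))` would then send only the color names, where `text` is the recognized phrase without the trailing " ;".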

    I have two questions regarding the patch:

    It seems that the Python script sometimes stops working. Does anybody know what the error is? I think it has something to do with the while loop, because it worked fine without it, but I am not sure (and I do want the loop).

    Is there something like pyext for Python 3.x and Pure Data 64-bit (Windows)?

    At the moment the script uses Google Speech Recognition and needs an internet connection because of that, but it seems to be possible to use CMUSphinx offline instead (I do not know a lot about all that).
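
    One thing that might be worth ruling out for the first question (a guess, not a diagnosis): the script only catches `sr.UnknownValueError` and `sr.RequestError`, so any other exception, for example a socket error raised while sending, ends the loop. A minimal sketch of a loop that logs such errors and keeps going; `run_forever` is a hypothetical helper, and `max_rounds` exists only so that the loop can be stopped when testing:

```python
import logging
import time


def run_forever(step, pause=1.0, max_rounds=None):
    """Call step() repeatedly and log exceptions instead of stopping.

    Returns the number of failed rounds once max_rounds is reached;
    with max_rounds=None it never returns.
    """
    rounds = 0
    failures = 0
    while max_rounds is None or rounds < max_rounds:
        rounds += 1
        try:
            step()
        except Exception:
            failures += 1
            # Log the full traceback so the real cause becomes visible.
            logging.exception("round %d failed, continuing", rounds)
            time.sleep(pause)
    return failures
```

    The body of the while loop in the script above would go into a function that is passed as `step`; the logged traceback should then show what actually stops the script.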

    pyspeech.PNG

  • Jona

    @Balwyn your solution is much easier :)

