Hello all,
I hereby present to you a package to implement dialogs and speech recognition at the same time!
I've been working on this as part of a graduation project (University of Applied Sciences Rotterdam) and I'd love for you guys to give me some feedback. I hope it'll make speech and dialog implementations easier for everyone.
Feature list:
- Dialog editor, for normal dialogs or dialogs with user options.
- Dialogs to implement cutscenes or cutscenes with user options
- Option to use different grammar files for each dialog (Sphinx 4)
- Export to Json & Load Json
- Typewriter animations
- Delay before a next dialog shows!
- Out-of-the-box support for 3 speech recognition systems: Google Speech, Wit.ai and Sphinx 4.
- Automatic selection of dialog answers on speech result. Let's calculate that accuracy!
- Audio input analyzer. When did the user start talking and when did he stop? Let's cut that audio out, and recognize it!
- Voice Activation Volume adjuster
- Callbacks for timers, automatic answer selection and current state of audio input (no audio, listening & analyzing speech)
- Includes a working Sphinx 4 server and a text-to-language-model tool (out-of-the-box support for English, Dutch and German)
- Automatic conversion of grammar files to a dictionary to decrease server load
- Docs available in source and here: https://hespen.net/Portfolio/UnityDialogEditor/annotated.html
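To give an idea of how the automatic answer selection from the feature list can work, here is a minimal Python sketch. It is not the package's actual code: the function names, the scoring rule (fraction of an answer's keywords heard in the transcript) and the `min_score` cutoff are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[\w']+", text.lower()))


def score_answer(transcript, keywords):
    """Fraction of an answer's keywords that appear in the transcript."""
    if not keywords:
        return 0.0
    spoken = tokenize(transcript)
    hits = sum(1 for kw in keywords if kw.lower() in spoken)
    return hits / len(keywords)


def select_answer(transcript, answers, min_score=0.5):
    """Return (answer_id, score) for the best match.

    If no answer reaches min_score (a hypothetical cutoff), the id is None
    so the caller can ask the user to repeat.
    """
    best_id, best_score = None, 0.0
    for answer_id, keywords in answers.items():
        score = score_answer(transcript, keywords)
        if score > best_score:
            best_id, best_score = answer_id, score
    if best_score < min_score:
        return None, best_score
    return best_id, best_score


# Hypothetical dialog node with two user options and their keywords.
answers = {
    "accept": ["yes", "sure"],
    "decline": ["no", "never"],
}
print(select_answer("Yes sure, I will help you", answers))
```

With a scheme like this, the keywords you pick per answer directly decide how reliably the right option gets selected, so short and distinctive keywords tend to work best.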
The project can be found here:
It contains a Unity Package for easy implementation, the Unity project, a Sphinx 4 Server and a Sphinx 4 Text to Language Model Tool.
To make things easier for you to test, I've added keys for the Google Speech API and Wit.ai API. (The Wit.ai key is for English; to change the language for Google, edit it in the code.) Select the speech system you'd like to use on the Main Camera object.
I advise you to take a look at Sphinx though. Just import it into your IDE, build it with Gradle, and run the Base object.
Remember this is not a finished product: there are still some bugs in the dialog editor, and I haven't had the time to make it beautiful yet. I did implement this in a VR game, but as that belongs to a company, I can't share that one.
I'd love to know what you guys think of it!
Screenshots:
Demo video (crappy quality, no audio):
How to use:
Enable the Microphone Setup object, run the game, and toggle the button on for about 5 seconds while staying silent. Toggle it off and speak. When you speak, the square should turn green. (The setting is saved in prefs automatically.)
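For the curious, the idea behind this setup step can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the package's code: audio arrives as frames of float samples between -1 and 1, the activation volume is set a bit above the loudest silent frame, and the names `calibrate_threshold`, `find_speech`, `margin` and `hangover` are made up for the example.

```python
import math


def rms(frame):
    """Root-mean-square volume of one frame of samples (floats in -1..1)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def calibrate_threshold(silent_frames, margin=2.0, min_threshold=0.01):
    """Pick an activation volume above the loudest frame of background noise."""
    noise_floor = max((rms(f) for f in silent_frames), default=0.0)
    return max(noise_floor * margin, min_threshold)


def find_speech(frames, threshold, hangover=3):
    """Return (start, end) frame indices of the first spoken segment, or None.

    Speech starts at the first frame louder than the threshold and ends once
    `hangover` quiet frames in a row follow. `end` is exclusive.
    """
    start = None
    quiet = 0
    for i, frame in enumerate(frames):
        loud = rms(frame) > threshold
        if start is None:
            if loud:
                start = i
        elif loud:
            quiet = 0
        else:
            quiet += 1
            if quiet >= hangover:
                return start, i - hangover + 1
    if start is None:
        return None
    # The recording ended while the user was (almost) still talking.
    return start, len(frames) - quiet


# Made-up data: quiet room noise, then three loud frames, then quiet again.
quiet_frame = [0.01, -0.01] * 4
loud_frame = [0.5, -0.5] * 4
threshold = calibrate_threshold([quiet_frame] * 5)
recording = [quiet_frame] * 2 + [loud_frame] * 3 + [quiet_frame] * 4
print(find_speech(recording, threshold))
```

The frames between `start` and `end` are the part you would cut out and send to the recognizer. The `hangover` count keeps short pauses between words from ending the segment too early.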
Stop the game and disable the Microphone Setup object. Select the speech system you'd like to use on the Main Camera object. Press enter to start the dialog.
Remember: Google and Wit.ai are really slow, use Sphinx 4 for the fastest result! I did research on the implemented speech recognition systems and their accuracy and speed. Sphinx is the fastest, with an average recognition time of 200 ms (external server), whereas Google and Wit.ai need at least 2-5 seconds. (Tested with 3200 audio files in 2 languages.)
The editor:
Open it via Windows -> Nodes Editor. Right click to create new nodes or load a json file (a demo Json is in the Resources folder). Middle click to drag, scroll to zoom. Right click to export to json. You can attach the json to the main camera!
- Keywords are used for speech recognition! They determine the accuracy.
- Delay in Seconds before dialog is shown
- Time until next node is a delay before the next node is shown. This one starts counting after the first delay has passed.
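Since the two delays are easy to mix up, here is a small Python sketch of how I'd read them for a simple chain of nodes without user options. The field names `delay` and `time_until_next` are made up for the example and are not the names used in the exported json.

```python
def schedule(nodes, start=0.0):
    """Return a list of (name, shown_at) pairs for a linear chain of nodes.

    Each node waits `delay` seconds before it is shown. Only after that does
    `time_until_next` start counting down to the next node.
    """
    timeline = []
    entered_at = start
    for node in nodes:
        shown_at = entered_at + node["delay"]
        timeline.append((node["name"], shown_at))
        entered_at = shown_at + node["time_until_next"]
    return timeline


# Hypothetical two-node dialog.
nodes = [
    {"name": "greeting", "delay": 1.0, "time_until_next": 3.0},
    {"name": "question", "delay": 0.5, "time_until_next": 2.0},
]
print(schedule(nodes))  # [('greeting', 1.0), ('question', 4.5)]
```

So in this example the second node appears 4.5 seconds in: 1 second of delay, 3 seconds until the next node, and then the second node's own half-second delay.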
Last thing: the Google key used in this project is attached to a trial account. Once I've spent the credit, it won't work anymore. In that case you can, however, set up your own trial account for free on the Google website.