I intend to use separate language models to allow for speech-to-text, and also to capture some user details from speech/voice patterns. This project is like a crash course on speech for me. Projects like Sphinx appear to be language-independent speech engines, and one layers the desired language models on top to suit the application's purpose.
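That layering is visible in how the standard `pocketsphinx_continuous` CLI from CMU Sphinx is invoked: the decoder itself is generic, and the acoustic model, pronunciation dictionary, and statistical language model are passed in as separate inputs. A minimal sketch (the paths are illustrative placeholders, not real install locations):

```shell
# Decode a WAV file with pocketsphinx_continuous (CMU Sphinx).
# The engine is language-agnostic; the three model arguments are what
# make this an English recognizer. Paths below are placeholders.
pocketsphinx_continuous \
    -hmm /path/to/en-us/acoustic-model \
    -dict /path/to/cmudict-en-us.dict \
    -lm /path/to/en-us.lm.bin \
    -infile recording.wav
# -hmm:  acoustic model (maps audio features to phones)
# -dict: pronunciation dictionary (maps words to phone sequences)
# -lm:   language model (word-sequence probabilities)
```

Swapping the `-hmm`, `-dict`, and `-lm` files for another language's models retargets the same engine without changing any code, which is the independence the post describes.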
I'm trying to figure out the necessary modules to build an open-source prototype. Please let me know if you have any recommendations. The closest I have worked in this domain was when I interned for a telecommunications company in the early '90s. That application was more text-to-speech synthesis for a personalized message hub, using Tcl/Tk, Dragon Dictate, and some proprietary algorithms.