0 - The Context comes with ILA Beta v3.8

posted on 16.06.2015, 13:43 by Florian Quirin   [ updated: 16.06.2015, 14:50 ]

Hello everybody! It's been a while since the last update but finally here it is: ILA Beta v3.8! :-)

The most obvious change first: ILA got a fresh new look ^^. Hope you like it! (If not, don't worry, you can always go back to classic.) The more subtle changes are various improvements to the add-ons, and finally ILA got some context ... I mean commands that depend on context :-) Here is a more detailed patch list:
  • New looks for ILA: 4 new skins to choose from plus the classic look.
  • Context-dependent commands that can be reused. An example: imagine you say "open the third" ... that could be useful in many situations, right? So now you can define it multiple times. See the tutorial for a detailed description of contexts.
  • Besides the normal context, there is also a super-context. Commands defined with this parameter only work in a certain environment that you activate beforehand (imagine switching from your smart home to your smart car: the same commands will probably trigger different actions).
  • Add-ons have been improved and support 'open parameter' commands like every other command of ILA (see teaching tutorial if you don't know what I mean :-) ) and it is possible now to load answers from an external file including support for languages and parameters.
  • 'Open parameters' work with custom answer/conversation commands now as well, you just have to add 2 brackets at the end, e.g. use 'do you like ()' and define a couple of answers and ILA will answer everything like "do you like movies", "do you like ice cream" ... with these random answers.
  • The word and letter error thresholds used for the auto-correction/adaptation of unknown commands can now be set in the config file. Increasing the values makes unknown commands more likely to be accepted when compared to known ones, but be careful: sometimes a little word like 'not' can change the meaning of a sentence completely ;-)
  • As usual there are many bugs fixed and some tweaks done to the UI (e.g.: skipping and aborting stuff is more reliable now).
  • oh and before I forget, I've added the first (very rough and basic) version of an XBMC/Kodi voice remote add-on, feel free to improve it ^^
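
To make the context idea a bit more concrete, here is a minimal Java sketch of how one phrase could map to different actions depending on the active context and super-context. This is not ILA's actual code: the class `ContextCommands`, its methods, and the context and action names are all made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not ILA's code: all class, method and action names are made up.
class ContextCommands {
    // Maps "superContext|context|phrase" to the name of an action.
    private final Map<String, String> actions = new HashMap<>();
    private String superContext = "";
    private String context = "";

    // Registers a phrase for one specific super-context and context.
    void define(String superCtx, String ctx, String phrase, String action) {
        actions.put(key(superCtx, ctx, phrase), action);
    }

    void setSuperContext(String superCtx) { superContext = superCtx; }

    void setContext(String ctx) { context = ctx; }

    // Returns the action defined for the phrase in the active
    // super-context and context, or null if there is none.
    String resolve(String phrase) {
        return actions.get(key(superContext, context, phrase));
    }

    private static String key(String superCtx, String ctx, String phrase) {
        return superCtx + "|" + ctx + "|" + phrase.toLowerCase();
    }

    public static void main(String[] args) {
        ContextCommands ila = new ContextCommands();
        // The same phrase is defined three times, once per situation.
        ila.define("home", "news", "open the third", "openThirdHeadline");
        ila.define("home", "files", "open the third", "openThirdFile");
        ila.define("car", "news", "open the third", "readThirdHeadlineAloud");

        ila.setSuperContext("home");
        ila.setContext("news");
        System.out.println(ila.resolve("open the third"));
        ila.setContext("files");
        System.out.println(ila.resolve("open the third"));
        ila.setSuperContext("car");
        ila.setContext("news");
        System.out.println(ila.resolve("open the third"));
    }
}
```

The lookup key simply combines all three parts, so switching the super-context changes the meaning of every phrase at once, much like the smart home vs. smart car example.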

Hope you like it! (context: the update) ;-)

As you might know by now, you can post questions and comments here, and there is also a Facebook page ^^.

have fun!


1 - More freedom in ILA beta v3.7

posted on 25.04.2015, 10:54 by Florian Quirin   [ updated: 16.06.2015, 12:50 ]

Good day and welcome to ILA Beta v3.7! :-)

For some time now ILA has worked pretty reliably using the Sphinx-4 (offline) engine to recognize speech. This works mainly by restricting the vocabulary to obtain good recognition results. Restriction means a working system in this case ... but let's face it: we want FREEDOM! Because fun comes with a large vocabulary :-) Up to now Google was able to give us this freedom, but it required an API key (complicated to get) and came with usage restrictions :-( ... so no real freedom. But thanks to Google's wonderful web API and a nice technique called WebSockets, we are finally free of restrictions now! Of course that also means Google is free to gather more information on us while we use their services! =)

Besides freedom, ILA beta 3.7 comes with a handful of other neat features. Here is the (approx.) complete list:

  • Google is back! Full implementation of a new recognizer using Google's web speech API. To get this thing running please see the tutorial (basically you just need to install Chrome or Chromium browser)
  • The add-on (plug-in) system of ILA has been reworked introducing more freedom (again ^^) and accessibility in designing add-ons. Add-ons are now completely independent classes and the commands they supply are auto-loaded into ILA's teachGUI. You only have to make one basic class implementing the ILA_addon_interface and the rest is up to you. For more details watch out for a new tutorial and have a look at the new developer version 3.7.
  • Freedom is the word of the day ^^ so why not write your own speech recognizer or synthesizer for ILA using one of the two new classes 'STT_CustomRecognizer' and 'TTS_CustomSynthesizer'.
  • Handling processes has been greatly improved especially for Linux systems giving you a) the freedom(!) to choose a program to open apps/files and b) keeping track of multiple processes in preparation of advanced opening and closing commands (updates will follow e.g.: "ILA, open my music player", "ILA, close my music player"). See the 'Apps' folder for more details!
  • Approximate matching of input (speech and commands) has been improved to check not only for wrong words ('approxSearchErrorRateThresh') but also for wrong letters. Watch out for the new 'approxSearchLetterErrorsThresh=3' in the file (3 means that an input with fewer than 4 wrong letters compared to a saved command is still recognized).
  • Individual browser calls can be deactivated now by removing (or commenting out) links from 'ILA/Data/linkList_xy.txt' file. Why do we want to do this? Because we want to have the freedom(^^) to decide if the browser opens when asked for the weather for example.
  • In case you notice a feedback of the confirmation blib-sound in your recordings (rec.wav) you can add a delay to the recording. Check '/Data/' for 'recordingDelay=0' (given in milliseconds).
  • Removed some bugs like broken UTF-8 encoding in ILA's memory files, failed overwriting of double commands while teaching and minor stuff regarding weather, processes, dates/reminders and more.
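
As a rough illustration of the letter threshold mentioned in the list above, here is a small Java sketch that counts wrong letters with a plain Levenshtein distance and accepts an input if the count does not exceed the threshold. That ILA counts letters exactly this way is an assumption; the class `LetterMatch` and its methods are hypothetical, and only the setting name 'approxSearchLetterErrorsThresh' and its value 3 come from the notes.

```java
// Hypothetical sketch: ILA's real letter matching may differ from plain Levenshtein.
class LetterMatch {
    // Number of single-letter insertions, deletions and substitutions
    // needed to turn a into b (classic Levenshtein distance).
    static int letterErrors(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // With a threshold of 3, up to 3 wrong letters ("fewer than 4") are tolerated.
    static boolean accepted(String input, String savedCommand, int letterErrorsThresh) {
        return letterErrors(input, savedCommand) <= letterErrorsThresh;
    }

    public static void main(String[] args) {
        int thresh = 3; // mirrors approxSearchLetterErrorsThresh=3
        System.out.println(accepted("open the musik player", "open the music player", thresh));
        System.out.println(accepted("abcd", "wxyz", thresh));
    }
}
```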

Hope you like the new freedom ;-)

As usual you can post your questions and comments in the discussion forum or on facebook.

have fun!


2 - Major updates in Beta v3.6

posted on 01.04.2015, 17:38 by Florian Quirin   [ updated: 16.06.2015, 12:49 ]

Welcome to Beta v3.6 the next major update of ILA!

This time the emphasis lies on more content and improved Pocketsphinx support. Pocketsphinx is the lightweight speech recognition engine of CMU that is optimized for mobile devices and single-board computers like the Raspberry Pi(2). It also comes with a "keyword search" mode giving the user an alternative to Sphinx-4 when using "Hey ILA" to activate the assistant.
Here is a more detailed list of what's new:

- Introduction of the new 'reminders' ability, enabling ILA to understand commands like "remind me tomorrow at 2 pm to attend the meeting", "set a reminder for the game at 20:30 o'clock", "remind me in 12 minutes to check the oven" or "remind me on the next start to teach ILA new stuff" :-)
- Together with the new reminders ILA got a kind of personality upgrade called ILA-Intentions that controls the activation of reminders by starting a conversation. Missed reminders will be indicated as a clickable blinking spot next to ILA.
- New commands to manage reminders: "show all reminders", "remove a reminder", "show missed reminders", "remove all missed reminders"
- The 'Contacts' list of ILA has been restructured to support UTF-8 and new contact-based commands. The first implementation is 'search contacts for a birthday' (available in the teach GUI).
- Extended support for Pocketsphinx including a tutorial on how to install it (unfortunately it is not platform-independent and needs to be compiled on your system). Windows libraries for 32bit and 64bit are already included and should work right away. Some Linux libraries are also included; if you are lucky, you might be able to skip the installation :-)
- Included the optimized Pocketsphinx language model (English). Make sure to set 'pocket-en-us' in the settings when using Pocketsphinx in English. As this model is incompatible with Sphinx-4 there is also a new option called 'defaultAcousticModel' in the file (ILA/Data/).
- The responsiveness/speed of Sphinx-4 has been further improved by a smarter resource allocation.
- Most of the hard-coded language-specific methods and lines of code have been moved to ILA_answers and ILA_interpreter to make it easier for programmers to translate ILA to languages other than English and German.
- Animations in power-saver mode have been improved. More fancy + same performance :-)
- Removed some dependencies from other libraries.
- Fixed a bug in the Google API that prevented the recognizer from automatically stopping on empty input (user said nothing).
- Any sentence containing the keyword "computer" can now be set in the 'heyILA' grammar or 'pocketsphinx_key' file in addition to "hey ILA" so you can activate ILA for example with an "oh mighty computer".
- Fixed the INSTALL scripts to support installation paths that contain spaces (yep that was crashing the old versions).

As usual you can post your questions and comments in the discussion forum or on facebook.

Hope you like all the new stuff! :-)
- Florian

Bug fixing and Pocketsphinx in Beta v3.5

posted on 07.03.2015, 04:59 by Florian Quirin   [ updated: 16.06.2015, 12:49 ]

ILA BETA v3.5 - fixes for Linux and Pocketsphinx support

I noticed some nasty bugs in the Linux (Ubuntu 14.04) version of ILA, mainly coming from conflicts between Minim, Java and PulseAudio. In the worst case ILA could not speak anymore or was crashing a lot. I fixed what I could, but if you still have problems I strongly recommend installing the latest Oracle Java 8. Together with the bug fixing I've also included some new features :-)

- Integration of the Pocketsphinx command line tool to better support low-performance systems. To use it you need to have Pocketsphinx installed (Linux) or put a precompiled version (Win) in the subfolder SpeechData/Pocketsphinx (a Win8 version is included). Unfortunately I couldn't get the keyphrase spotter running as a Java process yet. There is a config file for Pocketsphinx too in case you want to add parameters (SpeechData/default.pocketsphinx.config) (see the Pocketsphinx tutorial).
- Added grammar/non-grammar switching for pocketsphinx too and improved it to work better in combination with 'hey ILA' (if you've deactivated grammar restrictions completely)
- I've completely rewritten the 3rd layer of input analysis (1st is check of teachIt memory-file, 2nd is keyword isolation). If the 1st and 2nd layer fail ILA will try to do an approximate match to the language-knowledge-base. The approximate match is done with (kind of) an edit-distance. The threshold for an approximate match can be set in the config file (Data/ -> approxSearchErrorRateThresh (basically WER))
- For people experimenting with 'addons' there is a new method "ILA_speechControl.askDirectQuestion("whaaaat?");" that can be used to let ILA ask direct questions from anywhere inside the code. The answer is obtained by checking ILA.lastInput. Check the 'batchaccuracytest' addon for an example.
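
The notes above only say that the approximate match uses "(kind of) an edit-distance" and that the threshold is "basically WER", so the following Java sketch is a guess at the general idea, not ILA's implementation: a word-level edit distance divided by the number of words in the saved command. The class `WordMatch`, its methods and the threshold value 0.3 are assumptions for illustration.

```java
// Hypothetical sketch of a word-error-rate check; ILA's real 3rd layer may work differently.
class WordMatch {
    // Word-level edit distance between input and reference,
    // divided by the number of reference words.
    static double wordErrorRate(String input, String reference) {
        String[] in = input.trim().toLowerCase().split("\\s+");
        String[] ref = reference.trim().toLowerCase().split("\\s+");
        int[][] d = new int[in.length + 1][ref.length + 1];
        for (int i = 0; i <= in.length; i++) {
            d[i][0] = i;
        }
        for (int j = 0; j <= ref.length; j++) {
            d[0][j] = j;
        }
        for (int i = 1; i <= in.length; i++) {
            for (int j = 1; j <= ref.length; j++) {
                int cost = in[i - 1].equals(ref[j - 1]) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1), d[i - 1][j - 1] + cost);
            }
        }
        return (double) d[in.length][ref.length] / ref.length;
    }

    // Accepts the input as an approximate match if the error rate stays within the threshold.
    static boolean approxMatch(String input, String reference, double errorRateThresh) {
        return wordErrorRate(input, reference) <= errorRateThresh;
    }

    public static void main(String[] args) {
        double thresh = 0.3; // made-up value standing in for approxSearchErrorRateThresh
        System.out.println(wordErrorRate("open my musik player", "open my music player"));
        System.out.println(approxMatch("open my musik player", "open my music player", thresh));
        System.out.println(approxMatch("do not open it", "do open it", thresh));
    }
}
```

In this sketch one wrong word out of four gives an error rate of 0.25 and is accepted, while the extra "not" in a three-word command gives roughly 0.33 and is rejected.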

If you find bugs please report them to me! For example in the discussion forum.

That's it for now. Hope you enjoy :-)
- Florian

Shiny and new ILA Beta v3.4 :-)

posted on 28.02.2015, 10:35 by Florian Quirin   [ updated: 01.04.2015, 15:47 ]

Hello everybody,

just a few days after the release of Beta v3.3 I'm happy to present v3.4 to you already :-) I needed to fix some bugs and took the chance to include a bunch of improvements too!
Here is the (almost) complete list of changes:

- fixed the timeout bug (you might have seen the v3.3 quick fix) in the system 'test' command and made it a bit more fancy :-)
- added the contacts list to the automatic creation of the dynamic language model (dlm) (yes! there is a contacts.txt list ^^)
- removed any numbers from 'App'-names during auto loading into the dlm, ILA is not very flexible with number-to-string conversion yet :-(
- added the possibility to correct (delete) what you have said by saying "I repeat" (de: ich wiederhole) or "I said" (de: ich sagte) followed by a short pause. So when you know you messed up an input just say "-pause- 'I repeat' -pause-" :-) This works especially nice in the Live-mode! (only Sphinx-4)
- added some more ILA comments when the program needs to reload stuff so you know now that you have to wait a bit ;-)
- added some tooltips to settings (especially for the selection of the default recognizer)
- finally fixed saving and loading of the speaker adaptation data for good (it works reliably now with all tested models)
- added the PTM 8kHz acoustic model to the default set of models. I recommend trying this one if your accuracy is rather low. To use it please adjust the 'acoustic model' in settings.
- auto-loading the sample rate of acoustic models by placing a ''-file inside the folder of the AM (see included models)
- added a 'test accuracy' command to re-test the accuracy on the recorded speech in Data\test.wav (created during speaker adaptation)
- added a 'batch test' command as an addon to test a bunch of .wav-files recorded and saved with transcription. You can use the 'amt' (acoustic model training) command to record these files
- ILA now saves the speech recorded during the system test ('test') and uses it to initialize the recognizer (usually the first sentences were always crap somehow oO ^^)
- included updates in Sphinx-4 (LiveCMN and BatchCMN improves the recognizer? small case dictionary)
- completely rewrote the Google speech recognition part to get rid of old bugs and dependencies and removed the old API
- more bug fixing

I've also uploaded a 'developer version' of ILA that comes with less 'extras' like different acoustic models but with some source code to let you develop add-ons or adapt the language files right away so check it out :-) I'll add more code over time.

For more info and tutorials check out ILA's homepage
If you have any questions or want to share your project please post here

Have fun!

Let me introduce to you: ILA Beta v3.3

posted on 20.02.2015, 12:17 by Florian Quirin   [ updated: 07.03.2015, 05:13 ]

This is ILA Beta v3.3 the next big step forward :-)

ILA has just become even more customizable, more reliable, faster and smarter!
Here is a list of what's new:

- ILA has been updated to support the open source Text-to-Speech system MaryTTS. This basically means 2 things: ILA is completely free from any cloud service now (if you want) aaaand you can add new voices, yeah! :-)

- Beta v3.3 introduces a new dynamic language model, something I really enjoy! It means even the grammar-free mode can now learn everything you teach ILA, and the program gets better at recognizing these commands every time. One major step away from grammar restrictions towards natural language recognition.

- in addition all the links/program names inside the 'Apps' folder are automatically part of the dynamic model (if the dictionary knows the words - you can check that in the START_debug mode)

- thanks to tweaks done by the sphinx-4 team and some pre-loading of stuff ILA is more responsive now and works a bit faster

- ILA will warn you now when you want to teach her a new command that includes unknown words and asks you to add these words to the dictionary yourself

- there is a welcome screen now that'll give you some basic info and tell you how to adapt ILA better to your hardware

- speaker adaptation can greatly improve speech recognition accuracy. Unfortunately there was a bug in ILA where the previously adapted model was not loaded correctly and in the end there might have been no improvement at all after restart. Speaker adaptation has been fixed and greatly improved so you will get more feedback about the result. In addition adaptation works now for the en-us acoustic PTM model as well

- many people asked me for the source code of ILA. As this is still a beta version and many things are changing quickly I don't think it helps the project right now to release it completely buuut there is an 'Addons' folder now with smaller parts of ILA available as code. The focus right now lies on "expansion" and "localization". I'll write more about that soon for now please check the files "Addons/ILA_answers.jar","Addons/ILA_addons.jar" and "Addons/ILA_welcome.jar". You can simply extract these 3 files with a ZIP program to get the source code (along with the Java Class) :-)

- all text coding has been changed to UTF-8 standard. I hope that fixes all the problems with special characters and opens up new possibilities for other languages

- as usual there is also a lot of tweaking, dictionary updates, command updates and bug fixes

I hope you like the new version as much as I do :-D and I'd be happy to hear about your experiences with ILA!
Don't forget to visit ILA's discussion forum or Facebook page ;-)


3 - ILA feature @CMU

posted on 03.02.2015, 08:18 by Florian Quirin   [ updated: 07.03.2015, 05:13 ]

Check out the article about ILA on the CMU Sphinx homepage :-D

Hello and welcome to ILA Beta v3.2!

posted on 28.01.2015, 17:43 by Florian Quirin   [ updated: 28.02.2015, 10:19 ]

I'm happy to announce the release of ILA Beta v3.2! :-)

The main focus lies on improving grammar-free speech recognition accuracy with Sphinx-4, bringing ILA one step closer to becoming independent of Google's Speech API. Here are the recent patch notes:

- updated to the most recent version of Sphinx-4 with support for the new PTM acoustic models
- included the new CMU Sphinx en-us acoustic model (non-PTM) with greatly improved accuracy
- added MLLR unsupervised speaker adaptation (type 'ussa' in ILA's input field) and auto-loading of MLLR_matrix files when added to the acoustic model folder
- added a GUI to help you train your acoustic models (type 'amt' in ILA's input field) (tutorial soon)
- the pre-rec recognizer (settings->ILA speech engine->Sphinx-4 offline (rec)) works much better now in the grammar-free mode (settings->use grammar->red(off)). The LiveSpeechRecognizer (Sphinx-4 offline (live)) only works reliably with grammar turned on, I'm trying to find the problem!
- to be able to use grammar-free mode I've added two simple language models for 'en' and 'de'
- new setting that allows grammar + non-grammar mixing (settings->use grammar on ILA question) that means ILA will switch back to grammar mode when you specifically told her to in e.g. an 'open parameter' command
- 'open parameter' commands have been improved to filter user-specified words to prevent things like "play some musik of of Jimi Hendrix" (tutorial available soon)
- added a button for the audio sample rate to the settings (if you want to use 8kHz acoustic models) and fixed a bug that actually prevented switching to anything other than 16kHz
- when adding new commands to the grammar and ILA's memory (teachit_xy.txt) ILA checks now if this command already exists and replaces the old one instead of adding a 'dead' command to the end of the files
- added a bunch of Icons to the windows shown in the taskbar (windows) and an updated manifest file (for windows start screen)
- many more or less visible tweaks to the UI and as usual bugfixing (yes it's still a beta ;-) ) e.g. fixing problems with unsupported translucency and buttons in the Mac version
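
The notes don't describe how the word filter for 'open parameter' commands works internally, so here is just one way it could look in Java: take whatever follows the fixed part of the command and drop leading filter words before using the rest as the parameter. The class `OpenParameter`, the method `extract` and the filter word list are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: ILA's real open-parameter handling is not shown in these notes.
class OpenParameter {
    // Returns the text spoken after the command prefix with leading filter
    // words removed, or null if the input does not start with the prefix.
    static String extract(String input, String prefix, Set<String> filterWords) {
        String text = input.trim().toLowerCase();
        String pre = prefix.trim().toLowerCase();
        if (!text.startsWith(pre)) {
            return null;
        }
        String[] words = text.substring(pre.length()).trim().split("\\s+");
        int start = 0;
        while (start < words.length && filterWords.contains(words[start])) {
            start++;
        }
        return String.join(" ", Arrays.copyOfRange(words, start, words.length));
    }

    public static void main(String[] args) {
        Set<String> filter = new HashSet<>(Arrays.asList("of", "by", "from"));
        // Without the filter the parameter would be "of jimi hendrix".
        System.out.println(extract("play some music of Jimi Hendrix", "play some music", filter));
    }
}
```

Note that the sketch lowercases the input, so the returned parameter is lowercase as well.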

Have fun and visit ILA's discussion forum or Facebook page if you like :-)


posted on 06.01.2015, 08:14 by Florian Quirin   [ updated: 06.01.2015, 18:35 ]

Welcome to 2015! and welcome to ILA Beta 3.0 :-D

A lot of new features have been added with the main focus on extending possibilities for user-defined commands giving ILA the ability to ask for parameters with self-defined questions and on-the-fly grammar switching (if you are using sphinx4). Here are the patch notes:

- on-the-fly grammar change for sphinx-4 with the ability to load your own grammar-file (tutorial)
- extended open parameters (custom question and auto-asking)
- RSS Feed reading (e.g.: ILA will read the first 3 headlines and you can ask her to open the corresponding link), be sure to check-out the new 'load personal news feed'-command :-)
- test-Command Button in teach-interface, you can test a custom command now before saving it ^^
- you don't know what to ask ILA? Then have a look at the suggestions popping up in the input text field :-)
- INSTALL-scripts for Windows, Linux and MAC (the MAC one could use some love ^^)
- a lot of tweaking and bug-fixing
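
For the RSS feature, a minimal Java sketch of "read the first 3 headlines and keep their links" could look like the following. It uses only the standard Java XML parser and assumes a plain RSS 2.0 layout with `item`, `title` and `link` elements; the class `RssHeadlines` and the sample feed are made up, and ILA's own feed reader may work differently.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch, not ILA's feed reader.
class RssHeadlines {
    // Returns up to max entries, each as {title, link}, in feed order.
    static List<String[]> firstItems(String rssXml, int max) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(rssXml.getBytes(StandardCharsets.UTF_8)));
            NodeList items = doc.getElementsByTagName("item");
            List<String[]> result = new ArrayList<>();
            for (int i = 0; i < items.getLength() && i < max; i++) {
                Element item = (Element) items.item(i);
                String title = item.getElementsByTagName("title").item(0).getTextContent();
                String link = item.getElementsByTagName("link").item(0).getTextContent();
                result.add(new String[] { title, link });
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse feed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample feed with four items; only the first three are read.
        StringBuilder feed = new StringBuilder("<rss><channel><title>Sample</title>");
        for (int i = 1; i <= 4; i++) {
            feed.append("<item><title>Headline ").append(i)
                .append("</title><link>http://example.com/").append(i).append("</link></item>");
        }
        feed.append("</channel></rss>");
        for (String[] entry : firstItems(feed.toString(), 3)) {
            System.out.println(entry[0] + " -> " + entry[1]);
        }
    }
}
```

Keeping the link next to each headline is what makes a follow-up like "open the corresponding link" possible.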

Have fun and enjoy! :-)
- Florian

comment on ILA in the CMU Sphinx-4 forum or like ILA on facebook :-)

ILA Beta V2.90 out now!

posted on 16.12.2014, 15:49 by Florian Quirin   [ updated: 16.12.2014, 16:18 ]

A new version of ILA is out and we jump directly to v2.90 because there is a lot of new stuff :-) Mainly it's about performance enhancements on weaker systems, localization and tweaking:

- global hotkey support: let ILA run in the background and call it by holding a mouse-button for 1s (or whatever hotkey you want), especially nice when operating with wireless mouse or keyboard ;-)
- performance enhancements: ILA is automatically in power saver mode while minimized. In addition there is now a second sphinx-4 mode (settings: sphinx (live), sphinx (rec)) that works by first recording a wav and then transcribing it. That gives the GUI a slightly faster response time and seems to work more reliably on older systems.
- new "installation"-script that automatically creates a desktop shortcut of ILA and assigns the Java VM more memory, because I've seen a lot of systems that were behaving weirdly because of memory issues. ILA with sphinx needs around 400MB I feel
- convenience upgrades: fewer restarts necessary! No need for a restart anymore after teaching new commands (instant grammar update) and possibility of changing language on-the-fly
- localization updates: teach-keywords in English and German and rearranged order, linkList updates for Germany and USA, improved map search
- better support for new languages and switching made easier with new "set language to ..." command (in case you want to make a bunch of custom commands in Spanish, French, Turkish, Dutch ... Google has them all and sphinx just needs a good acoustic model ^^)
- many improvements to the command-interpreter (especially for timers), new words in the dictionaries, more reliable ip-location service, improved test-command
- bugfixes, tweaks in the interface

as usual you can discuss ILA in the CMU Sphinx-4 forum or like ILA on facebook :-)

