LipTalk User Manual - Audio Converter

Overview

Requirements

Installation for Windows

Installation for macOS

1. Method

2. Method

Installation Win/macOS (Source Code Files)

Getting Started - Audio Converter Operation Manual

Overview

3Deoskill's LipTalkAudioConverter is free software written in Python that converts audio speech files into special *.json files.
These *.json files can be loaded into the LipTalk Plugin in Cinema 4D to create the lip syncs.
It supports up to 2500 languages.

The LipTalkAudioConverter is licensed under the GNU General Public License 3.0.
You can find the link to the license and the source code here.

It uses Python, which is licensed under the PSF License. You can find the official source code for Python here and the license for Python here.
It uses the allosaurus module, which is licensed under the GNU GPL 3.0 License.
You can find allosaurus here and its license here.

Requirements

Wave Files (Mono and Stereo)

Installation for Windows

For Windows users it is pretty simple.

Download the zip-file for Windows
Unpack it
Start the LipTalkAudioConverter.exe

Video: Installation for Windows

Installation for macOS

1. Method

Download the zip-file for macOS
Unpack it
It will contain a Unix executable
Start the LipTalkAudioConverter.exec

2. Method

Download the zip-file package for Windows, which contains the exe file, and unpack it
Download the free software Wine (Wine can run Windows exe files directly on macOS)
Try to run the LipTalkAudioConverter.exe via Wine
Convert an audio file
If you end up with the transcript, everything is fine

Download Wine Software: www.winehq.org/

Video: 1. Method

Video: 2. Method

Installation Win/macOS (Source Code Files)

If none of the methods above works, you can follow the instructions below.

Python is a high-level interpreted language, so Python must be installed on your macOS or Windows system before you can run Python scripts or develop Python software. Python comes with its standard library, but the LipTalkAudioConverter needs some additional modules to work properly. You can install these modules by typing a simple command into the Windows CMD or the macOS Terminal. The command installs all necessary packages into the site-packages folder of your Python installation. After the installation is complete, you can run the LipAudioConverter.py, which is included in the LipTalk zip-file for macOS, by double-clicking it.

Download the LipTalk zip-file for macOS
Unpack it (3 Files - LipTalkAudioConverter.pyc, requirements.txt, bash.sc)
Download Python from python.org (Minimum Version Python 3.9).
The converter was developed with version 3.11.5
Install Python (enable the option that adds Python to the system PATH)
Windows users open CMD, macOS users open Terminal
- Type in the following command
- Win: pip install allosaurus
- macOS: python3 -m pip install allosaurus
This installs all packages that LipTalk needs to work fully
Run the LipAudioConverter.py by double-clicking it
Convert Audio
If you end up with the transcript - everything is fine
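
If the converter does not start after these steps, a quick check of the Python setup can help. The following is a minimal sketch, not part of the LipTalk package: the function name check_environment is hypothetical, and it only tests the minimum Python version named above and whether the allosaurus module can be found.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 9)  # minimum version named in this manual


def check_environment(required_module="allosaurus"):
    """Return (python_ok, module_ok) for the interpreter running this script."""
    # Compare only major and minor version against the stated minimum.
    python_ok = sys.version_info[:2] >= MIN_PYTHON
    # find_spec returns None when a top-level module is not installed.
    module_ok = importlib.util.find_spec(required_module) is not None
    return python_ok, module_ok


if __name__ == "__main__":
    python_ok, module_ok = check_environment()
    print("Python", sys.version.split()[0], "ok" if python_ok else "too old")
    print("allosaurus installed:", module_ok)
```

Run it with the same interpreter you used for the pip command (python on Windows, python3 on macOS), because packages are installed per Python installation.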

Video: Installation with source code files

Getting Started - Audio Converter Operation Manual

Language
Phoneme Emitter Strength
Advanced Mode (for speech files with longer than average vowels)
Load an audio file (wave-file, mono or stereo)
The path where you want to save the generated transcript file
Convert button to convert the audio file
Status / progress bar

From this menu you can choose the language of the audio file.
International is the standard setting and can convert up to 2500 languages.
The menu also contains a small selection of languages that offer special phonetic features to improve recognition accuracy. It is best to test which setting gives you the best results.
The Phoneme Emitter value is an important parameter for generating phonemes from speech signals. It sets the sensitivity for detecting phonemes.
The default value is 1.0. If you are not satisfied with the phoneme recognition, you can increase the value. If the value is too high, too many phonemes are detected and the lip animation can look unnatural, irregular, or shaky; in that case, reduce the value. Many tests have shown that values below 0.7 are not recommended. The default value is a good starting point.
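
For readers who work with the source code files, the sketch below shows how a comparable setting can be tried directly with the allosaurus module. It assumes that the Phoneme Emitter value corresponds to the emit argument of allosaurus, which this manual does not confirm, and that timestamped output consists of lines with start time, duration, and phone. The function names parse_timestamped and recognize_phones are hypothetical.

```python
def parse_timestamped(output):
    """Parse lines of 'start duration phone' into (start, duration, phone) tuples."""
    phones = []
    for line in output.strip().splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip anything that does not match the assumed layout
        start, duration, phone = parts
        phones.append((float(start), float(duration), phone))
    return phones


def recognize_phones(wav_path, emit=1.0, lang_id="ipa"):
    """Run allosaurus on a wave file (assumed mapping: Phoneme Emitter -> emit)."""
    # Imported here so the parser above works without allosaurus installed.
    from allosaurus.app import read_recognizer

    model = read_recognizer()  # downloads the pretrained model on first use
    text = model.recognize(wav_path, lang_id, emit=emit, timestamp=True)
    return parse_timestamped(text)
```

Calling recognize_phones with emit values such as 0.8, 1.0, and 1.2 on the same file and comparing the number of returned phones is one way to see how sensitive a recording is to this setting.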
The Advanced Mode checkbox is an option you can use when working with speech files that contain long vowels. It relies on an internal algorithm in the LipTalk Plugin called the "Frame-Range-Method", which was developed by 3Deoskill. In Normal Mode, the lips are closed immediately after a phoneme is detected, using the Release parameter. In Advanced Mode, the lips remain in this position as long as the internal algorithm still associates the amplitude with the phoneme. If the signal falls below a certain level, the Release parameter kicks in and closes the lips. The Threshold parameter can be used to influence this level, similar to a noise gate. The parameter works dynamically and automatically adjusts to the detected original level. Internally, the audio signal is smoothed to avoid irregularities. If you enable this option, the amplitude values of the audio file are included in the transcript file, resulting in a larger file size. For example, the transcript for a 10-second audio file can be 20 MB in size.
In return, you can switch between Normal and Advanced Mode in the LipTalk Plugin. If you do not convert in Advanced Mode, only Normal Mode is available.
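
The noise-gate idea behind Advanced Mode can be illustrated with a small sketch. This is not the Frame-Range-Method itself, whose details are internal to the plugin. It is a simplified stand-in that assumes a per-frame amplitude envelope, a moving-average smoothing, and a gate level relative to the peak. The names smooth and release_frame and the default values are hypothetical.

```python
def smooth(envelope, window=3):
    """Moving average over the current frame and up to window-1 previous frames."""
    smoothed = []
    for i in range(len(envelope)):
        start = max(0, i - window + 1)
        chunk = envelope[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def release_frame(envelope, onset, threshold=0.2, window=3):
    """Return the first frame at or after onset where the level drops below the gate."""
    if not envelope:
        return onset
    levels = smooth(envelope, window)
    gate = threshold * max(levels)  # gate follows the detected peak level
    for frame in range(onset, len(levels)):
        if levels[frame] < gate:
            return frame  # the lips would start closing here
    return len(levels)  # the level never dropped below the gate


# A vowel held for three frames, followed by silence.
envelope = [0.0, 0.9, 0.9, 0.9, 0.0, 0.0, 0.0, 0.0]
print(release_frame(envelope, onset=1, window=1))  # 4
print(release_frame(envelope, onset=1, window=3))  # 6
```

In this sketch, a higher threshold closes the lips earlier and stronger smoothing closes them later.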
Select the audio wave file you want to convert here. The file can be stereo or mono, but the quality should be high. If the recording level is too low, noise becomes more prominent. In addition, there should be no disturbing background noise that affects the pronunciation of the phonemes. The audio signal should be normalized, but this is not mandatory as the AudioConverter adjusts the amplitudes internally.
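
If you are unsure whether a recording is suitable, you can inspect it before converting. The sketch below uses only the Python standard library and is not part of the converter. It assumes a 16-bit PCM wave file, and the function name inspect_wav is hypothetical. It reports the channel count, the sample rate, and the peak level on a scale from 0.0 to 1.0.

```python
import array
import sys
import wave


def inspect_wav(path):
    """Return channels, sample rate, and peak level of a 16-bit PCM wave file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM files")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        samples = array.array("h", wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # wave data is stored little-endian
    # Largest absolute sample value, scaled so that full scale equals 1.0.
    peak = max((abs(s) for s in samples), default=0) / 32768.0
    return {"channels": channels, "sample_rate": rate, "peak": peak}
```

A peak far below 1.0 suggests the recording was made at a low level, so it may be worth normalizing it in an audio editor first.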
If you already have an audio file selected, the program automatically proposes a transcript file with the same name and location as the audio file. You can also browse your computer and define a custom name for the transcript file.
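
The default naming rule can be written down in one line. This is a sketch of the described behavior, not the converter's own code. It assumes the transcript uses the .json extension mentioned in the overview, and the function name default_transcript_path is hypothetical.

```python
from pathlib import Path


def default_transcript_path(audio_path):
    """Same folder and file name as the audio file, with a .json extension."""
    return Path(audio_path).with_suffix(".json")


print(default_transcript_path("takes/hello.wav").name)  # hello.json
```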
To begin the conversion process, click the Convert button. Make sure you have an active internet connection, as the program downloads a pretrained model from a remote server.
The program displays a progress bar while it performs the conversion. After completing the calculation, it indicates the status Finished.

Tip: To speed up the conversion process, keep the converter running. The pretrained model is then already downloaded and does not need to be fetched again. This is very convenient if you want to convert more audio files.

==> Next LipTalk Plugin
