
IC Wordmaker


Published June 1981, Electronics and Music Maker
by Raj Gunawardana, Texas Instruments Ltd.
Constructional details by Glenn Rogers and Peter Kershaw

  • E and MM brings you the first major Solid State Speech project for under 100 UK pounds
  • Promises to have a dramatic impact on state-of-the-art electronics - now, and for generations to come
  • Complex talking library of over 200 words with further expansion space
  • Easy interfacing to a microcomputer through a few lines of BASIC
  • Pitch control has exciting electronic music applications


  • Wordmaker

  • For some ten years Texas Instruments have been developing solid state speech technology with the result that speech can now be produced which faithfully preserves the character of the spoken voice including intonation, accent, dialect, and pitch. Linked to a microcomputer, words can be strung together to make complete phrases and sentences so that voice communication between 'computer' and human becomes possible.


  • The uses of this project are far reaching and will be of benefit to almost anyone who uses it. The carefully selected word library has many applications in the home and industry, for telephone, burglar alarms, conversations, messages, games, electronic terms, studio control, speaking clock, temperature indication, calendar, business coding, factory announcements, and accountancy.
  • This month we shall present the complete building project which can be purchased as a kit and explain how to interface the WORDMAKER board to a microcomputer. Possible interface circuits are included and BASIC programs are also given. It has already been fully tested on the Sharp MZ-80K and Tangerine systems. Further details are provided later for other popular micros and we shall be following this article with additional information on the processes of speech synthesis employed, and readers' ideas for interfacing and use will be welcomed.
  • The E and MM WORDMAKER Speech Synthesiser is based on the Texas Instruments Voice Synthesis Processor (VSP). This card can be interfaced to any computer system or used as an independent unit. The card comprises the Texas TMS 5100 Voice Synthesis Processor, a memory bank containing the vocabulary, and an onboard amplifier.

VSP card

  • The synthesis method used is called Linear Predictive Coding (LPC). This is a technique developed by Texas which minimises the amount of storage needed for each word. Human speech, like most communication signals, contains a large proportion of redundant information. LPC involves looking at the complete word as a binary data string and removing any redundant data. The coding is then tested to check that the word is spoken satisfactorily. The TMS 5100 contains a 10-pole digital filter which synthesises the voice; the filter is controlled by the LPC data. For each word sample, the length of the data string written to the TMS 5100 may vary from 4 to 49 bits. The device, therefore, requires quite a high level of 'intelligence'.

  • The Heart of the System: TMS 5100
  • The TMS 5100 has five control lines. The command is set up on the CTL lines and executed by toggling the command clock line, PDC. Table 1 shows the complete list of commands and Figure 1 gives the pin configuration of the IC.

Figure 1: TMS 5100 pin details
Table 1: The TMS 5100 VSP Command Summary

  • Load Address: This command causes the VSP to accept a subsequent nibble (4 bits) of data set up on the CTL lines as a speech address segment, which is transferred to the voice synthesis (V/S) ROM address registers.

  • Read and Branch: This instructs the VSP to set up appropriate control signals to the V/S ROM, causing it to update its address registers with the contents of the currently addressed pair of bytes.

  • Speak: On receiving this command, the VSP takes over control of the V/S ROM, generates pulses on its I/O line to fetch bit-serial data from the ROM, and commences speech. Pulses on the I/O line occur in bursts, one burst per frame interval of twenty-five milliseconds. The number of pulses in any one frame varies from 4 to 49, depending on the data. The timing of I/O pulses for a maximum-length frame of 49 bits is shown in Figure 2. Details of the data structure will be discussed in a future article.
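  • The frame figures quoted for the Speak command fix the range of serial data rates the speech memory has to sustain. The following Python sketch simply works that arithmetic through; the names are illustrative and nothing here is taken from the TMS 5100 data sheet beyond the numbers given in the text.

```python
FRAME_INTERVAL_MS = 25          # frame interval stated in the text
MIN_BITS_PER_FRAME = 4          # shortest frame stated in the text
MAX_BITS_PER_FRAME = 49         # longest frame stated in the text

FRAMES_PER_SECOND = 1000 // FRAME_INTERVAL_MS   # 40 frames each second


def data_rate_bps(bits_per_frame):
    """Serial data rate in bits per second for a given frame length."""
    if not MIN_BITS_PER_FRAME <= bits_per_frame <= MAX_BITS_PER_FRAME:
        raise ValueError("frame length must be between 4 and 49 bits")
    return bits_per_frame * FRAMES_PER_SECOND


print(data_rate_bps(MIN_BITS_PER_FRAME), data_rate_bps(MAX_BITS_PER_FRAME))
```

Under these figures the data rate lies between 160 and 1960 bits per second, depending on how many bits each frame needs.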

Figure 2: IO Pulse and Data input to TMS 5100 in Talk mode

  • Test Busy: This command permits the controller to access the TALK STATUS LATCH of the VSP. In operation, the command is first set up on the CTL lines and the PDC line toggled once. A subsequent toggle of the PDC line enables the Talk Status to be output on the CTL1 line. The Talk Status will be high during speech generation and will be set low when an END OF PHRASE code is encountered. A third toggle of the PDC line is required to return the VSP to a state of accepting new commands.

  • Read Bit: This command causes the VSP to generate a single pulse on the I/O line and thus read a single bit of data. Each bit read is input via the ADD8 line to a 4-bit shift register in the VSP. Hence four consecutive read operations are required to completely update the shift register contents.

  • Output: On receiving this command the VSP is initialised into outputting its buffer contents to the CTL lines. A second PDC toggle enables the CTL output buffers and a third is required to return the VSP to the command mode. The Output command coupled with Read Bit thus allows the controller to access auxiliary data stored in the V/S ROM.

  • Reset: This command is used to establish known initial conditions in the internal circuitry of the VSP in readiness for a following sequence of commands. Since the CTL lines convey data as well as commands to the VSP, when previous conditions are not known it is possible that a command will be interpreted as data. Hence, it is necessary to toggle PDC at least three times whilst maintaining the Reset command on the CTL lines, to ensure correct synchronisation of subsequent commands. Reset can be used in the middle of speech to stop VSP execution. In the circuit design discussed in this article only the Reset, Test Busy and Speak commands are used.

  • Interfacing the VSP: Design Considerations

  • The following requirements have been considered in interfacing the TMS 5100 to a microcomputer to operate as a speech peripheral:
    • (1) SPEECH DATA MEMORY should have means of serial data output and an autoincrementing address register for sequential data access.
    • (2) SPEECH DATA ADDRESS should be presettable from the host processor to define the current enunciation required.
    • (3) THE CONTROL INTERFACE should be consistent with device specifications (of TMS 5100)
    • (4) SIGNAL LEVELS to and from the controller should be TTL compatible.
  • As far as speech data memory is concerned, two approaches can be made in implementation:
    • (1) Speech data can be stored externally to the processor in nonvolatile memory for stand alone operation.
    • (2) Speech data can be supplied from within the processor with synchronisation to suit the TMS 5100 timing (see Figure 2).

Figure 3: VSP interface

  • The circuit discussed in this article takes the first approach, to achieve stand alone operation. Figure 3 shows how the VSP could be interfaced to a microcomputer by implementing a direct data path between the address counter and the controller, instead of via the TMS 5100 CTL lines. This feature avoids the need to decode various commands (e.g. Load Address), to maintain a record of command sequences and to build up the contents of the address registers one nibble at a time.
  • Speech data memory can, in theory, either be non-volatile or Random Access Memory (RAM). If memory comprises RAM, it would be possible to 'overlay' speech code read out from a slow bulk storage peripheral such as floppy disc or cassette tape. The circuit discussed in this article, however, uses a choice of EPROM types for speech data storage.

  • Practical Implementation
  • Figure 4 shows a practical circuit designed in accordance with the architecture discussed. The circuit is designed to be driven from a byte oriented bus and requires a number of control bits to clock data and to monitor VSP busy conditions. Once commanded to TALK, the circuit will operate independently of the processor to generate a single utterance. Concatenation of such utterances has to be carried out by the host processor.

Figure 4: Wordmaker complete circuit diagram

  • The Control Interface. The control interface comprises four lines named C0, C1, CCLK and BUSY. C0 and C1 are used to set up three commands on the CTL2, CTL4 and CTL8 lines, as shown in Table 2.

    Table 2: C0 and C1 command settings
    Command     C0  C1
    Reset       1   1
    Talk        0   1
    Test Busy   0   0
    Invalid     1   0

    Transistors TR1, TR2 and TR3 are used to convert TTL levels to drive voltages suited to the TMS 5100. CCLK is used to clock commands set up on the C0 and C1 lines into the VSP. The VSP clock line, PDC, has to change synchronously with the VSP ROM clock line. This is achieved by the use of IC2b as a synchroniser. The CCLK line should therefore be held high for a minimum duration of 6.25 microseconds to guarantee that a command is accepted by the VSP. The BUSY line can be used in one of two ways to monitor the end of an utterance. During speech, and when the CTL1 line is in a disabled state, the BUSY line will be low, producing a high level only when CTL1 is enabled and an END OF PHRASE code has been encountered. Hence, the host processor can be made to monitor the BUSY line until a high level is detected. Alternatively, more efficient use of the host processor can be achieved by using the positive-going edge of the BUSY signal to generate an interrupt.
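  • The command encoding of Table 2 and the CCLK pulse can be sketched in a few lines of Python. This is only an illustration: the bit positions chosen for C0, C1 and CCLK within a control byte are assumptions made for the example, and the real wiring is whatever the interface circuit in use provides.

```python
# Table 2 from the article: command name -> (C0, C1)
COMMANDS = {
    "RESET": (1, 1),
    "TALK": (0, 1),
    "TEST_BUSY": (0, 0),
}

# Hypothetical bit positions within one control byte (an assumption,
# not taken from the article's interface circuits).
C0_BIT, C1_BIT, CCLK_BIT = 0, 1, 2


def control_byte(command, cclk):
    """Pack C0, C1 and the CCLK level into a single byte."""
    c0, c1 = COMMANDS[command]      # KeyError for anything not in Table 2
    return (c0 << C0_BIT) | (c1 << C1_BIT) | (cclk << CCLK_BIT)


def clock_command(command):
    """Bytes to write for one CCLK toggle: CCLK low, high, then low again.

    On real hardware the high state must last at least 6.25 microseconds.
    """
    return [control_byte(command, level) for level in (0, 1, 0)]


print(clock_command("TALK"))
```

Keeping the table in one dictionary means the invalid combination (C0 = 1, C1 = 0) can never be sent by accident, since it has no entry.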

  • Speech Address Buffer/Counter. The address counter comprises four 74LS193 ICs, which are 4-bit binary counters with parallel loading capability (IC4-IC7). The starting address is loaded from the data input lines D0-D7 in two stages. Applying a low logic level to LDA1 causes the less significant byte of the counter (IC6 & IC7) to be loaded with data set up on input lines D0-D7. Applying a low logic level to LDA0 loads the more significant byte of the counter. Byte address incrementing pulses are derived from IC8, which is programmed as a modulo 8 counter. IC3b is clocked with pulses generated by the VSP on the I/O line. IC3b is used to invert the I/O line and as a buffer to provide greater fanout capability. This results in IC8 incrementing its contents on the negative edge of the I/O pulse and consequently keeping track of the bit count, at a byte level, for accessing bit-serial speech data. At the commencement of speech, speech data is output starting with the least significant bit of the first speech data byte. Hence, IC8 is cleared every time a new address byte is loaded into the less significant byte of the address counter. The 16-bit address counter permits a maximum speech memory capacity of 64K bytes. The total capacity of the memory can be expanded by using extra counter stages, if required. A 64K byte memory will store approximately 600 spoken words.

  • Speech Data Memory. In the circuit shown, speech data can be stored in TMS 2516 (16K-bit), TMS 2532 (32K-bit) or TMS 2564 (64K-bit) EPROMs, by wiring an appropriate set of links. Tables 3, 4 and 5 show the links required for each EPROM type and the resulting memory maps. Serial data is derived by the use of a 74LS151 (IC10), an eight-to-one line multiplexer. The IC10 data inputs are fed from the data outputs of the EPROMs. The select input of IC10 is obtained from IC8, which maintains a modulo 8 count that is incremented once each time a single data bit is accessed by the VSP. The output of the multiplexer is conveyed through IC2a, which is used as a single-bit shift register clocked by the I/O pulse. The purpose of IC2a is to synchronise serial speech data such that data requested by a particular I/O pulse (see Figure 2) is stored unchanged despite the bit count and the memory address changing as a result of address incrementation.
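  • The combined action of the 16-bit address counter, the modulo 8 bit counter and the eight-to-one multiplexer can be modelled in software. The Python sketch below is a behavioural model only, with illustrative names; it ignores the latch timing provided by IC2a and simply shows the order in which bits reach the VSP.

```python
def serial_bits(memory, start_address, bit_count):
    """Yield speech data bits in the order the VSP would receive them."""
    address = start_address & 0xFFFF   # 16-bit address counter
    bit_index = 0                      # modulo 8 counter, cleared on address load
    for _ in range(bit_count):
        # The multiplexer selects one bit of the addressed byte,
        # least significant bit first.
        yield (memory[address] >> bit_index) & 1
        bit_index = (bit_index + 1) % 8
        if bit_index == 0:
            # After eight bits the byte address moves on by one.
            address = (address + 1) & 0xFFFF


# Two made-up data bytes, purely for demonstration.
demo_memory = bytes([0b00000101, 0b10000000])
print(list(serial_bits(demo_memory, 0, 16)))
```

Each byte is read out least significant bit first, and the byte address advances only after all eight bits have been delivered.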

Table 3: Speech memory address mapping for TMS 2516
Table 4: Speech memory address mapping for TMS 2532
Table 5: Speech memory mapping for TMS 2564

  • Audio Signal Conditioning. IC11, a quad operational amplifier, is used to condition the differential audio output of the VSP (SP1 and SP2) into a form suitable for driving a general purpose 8-ohm speaker. IC11a converts the differential push-pull output current into a single-ended voltage output. This signal is then low-pass filtered by the active filter comprising IC11b to remove harmonic distortion generated by the 8kHz sampled output from the D to A converter. The third stage of the op-amp is used, along with transistors TR6-9, to provide power amplification. The amplifier is capable of producing up to 4.5 Watts of audio power into an 8-ohm speaker. At this power rating, it will be necessary to mount TR8 and TR9 on heatsinks to keep the devices within their operating temperature. At reduced power levels, the heat sinking area etched on the PCB should be adequate for normal operation.
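  • To put the quoted maximum output in perspective, the voltage and current it implies at the speaker can be worked out from P = V²/R. The short Python sketch below does this; it assumes a sine-wave output for the peak figure, which is a simplification for real speech signals.

```python
import math

POWER_W = 4.5      # maximum audio power quoted in the text
LOAD_OHMS = 8.0    # speaker impedance quoted in the text

v_rms = math.sqrt(POWER_W * LOAD_OHMS)   # from P = V^2 / R
i_rms = v_rms / LOAD_OHMS                # Ohm's law
v_peak = v_rms * math.sqrt(2)            # assumes a sine-wave output

print(f"{v_rms:.2f} V rms, {i_rms:.2f} A rms, {v_peak:.2f} V peak")
```

Full output therefore corresponds to roughly 6 V rms and 0.75 A rms at the speaker, which gives some idea of why the output transistors need heatsinking at this level.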

  • Power Supply Requirements. Figure 6 shows the distribution of power supplies in the circuit. The negative 5 volt supply is generated on the PCB by REG1 (a voltage regulator) fed from the negative 12 volt supply. Typical power requirements (for a board fully populated with TMS 2532 EPROMs) are +5V @ 300mA, +12V @ 50mA and -12V @ 50mA, without any audio output.

Figure 6: Power supply distribution

  • Speech Data EPROMs. Table 7 gives the speech starting addresses for data in the EPROMs provided as part of the kit. EMM1, EMM2, EMM3 and EMM4 should be plugged into IC sockets IC12, IC13, IC14 and IC15 respectively. The links should be connected according to Table 4 (i.e. the same as for TMS 2532 EPROMs). In the kit, SPST DIL switches are provided for this purpose.

Table 7: E&MM speech data EPROM listing


  • Construction and Setting Up

  • Figure 5 shows the component overlay for the circuit. The first step is to fit all the necessary links between the two sides of the PCB (using Track pins or small lengths of wire). Care should be taken when soldering on this board as the tracks are fine and often very close together. The resistors and capacitors can then be fitted, followed by the diodes, soldering both sides where necessary. Next, make all the IC sockets using Soldercon connectors and again solder to both sides of the PCB where necessary. Having completed these stages you can fit the transistors, the voltage regulator (REG1) and IC11. The power transistors (TR8, TR9) and the negative 5 volt regulator should be positioned flat as shown in Figure 5 and bolted on to the PCB to achieve good thermal dissipation.

Figure 5: Component Overlay

  • Before you plug in any more ICs, connect a low impedance speaker and power up the board (connection details are shown in Table 6). Check the supply currents and voltages (the current should be approximately 50mA on the +12V lines and negligible on the +5V line). Next, check that the amplifier is operating. If all is well, you can proceed and fit the rest of the components. The pin numbers given in the circuit diagram are correct for the TMS 2564 only. TMS 2532 and TMS 2516 ICs have 24 pins compared with 28 pins for the 2564. The signal lines match when the lower 24 pins of the 28 pin configuration are used (i.e. pin numbers 1, 2, 27 & 28 are not used).

Table 6: Edge connector details

  • Note: When using 24-pin packages you must link pin 28 to pin 26 on ICs 12-19 (see photo). For correct speed of operation the TMS 5100 internal clock frequency should be adjusted with RV1 to obtain a square wave of period 6.25us (a frequency of 160kHz) at ROM CLK (pin 3) of IC1. The correct adjustment is nominally midway on RV1. If instruments for adjustment are not available, good results can be obtained by listening to the speech output and making the adjustment such that the output sounds 'normal'.
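  • When setting RV1 with a frequency counter or oscilloscope, it may help to have the target figures and the error of a measured period to hand. The Python sketch below works them out from the 6.25us period given above and the 640kHz oscillator frequency quoted later in this article; the function name and the example reading of 6.5us are illustrative only.

```python
ROM_CLK_PERIOD_NS = 6250        # 6.25 us target period at ROM CLK (pin 3 of IC1)
OSCILLATOR_HZ = 640_000         # oscillator frequency quoted later in the article

rom_clk_hz = 1_000_000_000 // ROM_CLK_PERIOD_NS   # target ROM CLK frequency
divide_ratio = OSCILLATOR_HZ // rom_clk_hz        # ratio implied by the two figures


def period_error_percent(measured_period_ns):
    """Percentage by which a measured ROM CLK period differs from the target."""
    return 100 * (measured_period_ns - ROM_CLK_PERIOD_NS) / ROM_CLK_PERIOD_NS


print(rom_clk_hz, divide_ratio, period_error_percent(6500))
```

A positive error means the clock is running slow, so speech will be slower and lower in pitch than intended.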

Photo: Wire links in place for EPROMs supplied

  • Care should be taken in handling the TMS 5100 which can be damaged by static discharges. The kit of parts does not contain an edge connector. The RS467-425 20-way, double edge connector is suitable and instead of Veropins for soldering the speaker connections, the screw connector socket (RS423-762) can be used (both available from Radio Spares). A suitable Power Supply circuit diagram for the WORDMAKER is shown in Figure 11.

Figure 11: Suggested power supply

  • Now you know all about the E&MM WORDMAKER but is it any use to you? The all-important question is 'Will it interface to my microcomputer?' Well, here is a simple guide to give you some idea. List 1 contains all the popular systems which can be used with available modules. List 2 contains all the popular microcomputers which will drive the WORDMAKER if a simple dedicated interface is used such as the one shown.

LIST 1
Sharp MZ-80K with parallel I/O card and expansion unit
Nascom 1 & 2 as standard
Apple/ITT 2020 with parallel I/O card
Commodore Pet with parallel I/O expansion
Atari 400 & 800 with parallel I/O expansion
Tangerine Micron as standard
Acorn as standard
Video Genie with parallel I/O expansion

LIST 2
Microcomputer        Addressing mode
Sharp MZ-80K         I/O mapped
Tandy TRS 80         I/O mapped
Sinclair ZX80/81     I/O mapped
Apple/ITT 2020       memory mapped
Commodore Pet        memory mapped
Atari 400 & 800      memory mapped
UK 101               memory mapped
OHIO Superboard      memory mapped

  • Using the Wordmaker

  • Communication with the VSP card is carried out through two ports: one to supply the address of the word-defining data in the V/S ROM, and the other to set the various control functions. There are two preset potentiometers on the card: RV1 controls the speed and pitch of the voice, and RV2 controls the volume of the onboard amplifier. All the connections on the board are TTL compatible for easy interfacing (see Figures 9 and 10).

Figure 9: Connection to a standard PIO/PIA port

Figure 10: Purpose-built interface (memory mapped or I/O addressed)

  • The VSP card is very simple to use and the flowcharts in Figure 7 show the sequence of operations. Figure 8 shows the sequence of commands and the relevant timing. On 'power up' the card must be initialised by setting C0 and C1 to 'RESET' (see Table 2), toggling CCLK 3 times, then setting C0 and C1 to 'TEST BUSY' and toggling CCLK a further 3 times. The card is then ready to talk to you. The flowchart in Figure 7(a) also shows that a 'dummy test talk' command can be executed in order to avoid an audible click that may be generated prior to commencement of speech. To make it speak, the address of the word is written to the card.
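  • The power-up sequence just described can be sketched in Python as follows. The LoggingCard class is a hypothetical stand-in for the real port hardware and only records what would be sent; it is not the BASIC 'Initialise' subroutine from the program listings, and the optional 'dummy test talk' step is left out.

```python
RESET = (1, 1)        # C0, C1 values from Table 2
TEST_BUSY = (0, 0)


class LoggingCard:
    """Hypothetical stand-in for the port hardware: records each operation."""

    def __init__(self):
        self.log = []

    def set_command(self, c0, c1):
        self.log.append(("CMD", c0, c1))

    def pulse_cclk(self):
        # On real hardware: take CCLK high for at least 6.25 us, then low.
        self.log.append(("CCLK",))


def initialise(card):
    """RESET with three CCLK toggles, then TEST BUSY with three more."""
    for command in (RESET, TEST_BUSY):
        card.set_command(*command)
        for _ in range(3):
            card.pulse_cclk()


card = LoggingCard()
initialise(card)
print(card.log)
```

Replacing LoggingCard with a class that writes to a real port would leave the initialise routine itself unchanged.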

Figure 7: Flowchart


Figure 8: VSP interface control signal timing

  • The two address bytes are latched into the VSP card by taking LDA1 (for the LS byte) and LDA0 (for the MS byte) low for at least 6.25us (at an oscillator frequency of 640kHz). It is important to note that the LS byte must be loaded first. Having set up the address, all we need to do now is send the 'TALK' command on C0 and C1 and toggle CCLK once. Then: Hey Presto, it speaks! If any problems are encountered at this point, a logic probe will be useful for checking that the control and data input lines are providing the correct 'high'/'low' signals via the board connector to the EPROMs and associated logic ICs. Resistor values for R40, R41 and R42 may need changing in order to get the right 'pullup'.
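  • The order of operations for speaking one word can be sketched in Python as follows. As before, LoggingCard is a hypothetical stand-in that only records the operations, and the method names are invented for the example; the address used at the end is an arbitrary value, not an entry from Table 7.

```python
TALK = (0, 1)   # C0, C1 values from Table 2


class LoggingCard:
    """Hypothetical stand-in for the port hardware: records each operation."""

    def __init__(self):
        self.log = []

    def load_ls_byte(self, value):
        # Real hardware: put value on D0-D7, take LDA1 low for >= 6.25 us.
        self.log.append(("LDA1", value))

    def load_ms_byte(self, value):
        # Real hardware: put value on D0-D7, take LDA0 low for >= 6.25 us.
        self.log.append(("LDA0", value))

    def set_command(self, c0, c1):
        self.log.append(("CMD", c0, c1))

    def pulse_cclk(self):
        self.log.append(("CCLK",))


def speak(card, address):
    """Load a 16-bit word start address, LS byte first, then issue TALK."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    card.load_ls_byte(address & 0xFF)   # the LS byte must be loaded first
    card.load_ms_byte(address >> 8)
    card.set_command(*TALK)
    card.pulse_cclk()                   # a single toggle starts the utterance


card = LoggingCard()
speak(card, 0x1A2B)   # arbitrary example address
print(card.log)
```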

Wordmaker circuit board

  • If you are using the VSP card with a computer system it will probably become necessary at some stage to be able to test when one word has finished so you can start another. If you try to start a word while the VSP is speaking, it will miss the end of the first word and say the next, or it might just stop altogether. Using the 'TEST BUSY' command it is possible to monitor the BUSY line. This is done by setting the 'TEST BUSY' command on C0 and C1 and toggling CCLK twice, then reading the BUSY line. When BUSY goes high you toggle CCLK once more and then initiate the next talk cycle. The BUSY line output (connector pin 38) need not be connected when first testing the board for correct speech operation (e.g. using Test Program 1).
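  • The polling procedure can be sketched in Python as follows. FakeCard is a hypothetical simulation in which BUSY reads low for a fixed number of polls and then goes high; the max_polls limit is an addition for the example so that a disconnected BUSY line cannot hang the program.

```python
TEST_BUSY = (0, 0)   # C0, C1 values from Table 2


class FakeCard:
    """Hypothetical simulation: BUSY stays low for a set number of reads."""

    def __init__(self, polls_until_done):
        self.remaining = polls_until_done
        self.cclk_pulses = 0
        self.command = None

    def set_command(self, c0, c1):
        self.command = (c0, c1)

    def pulse_cclk(self):
        self.cclk_pulses += 1

    def read_busy(self):
        if self.remaining > 0:
            self.remaining -= 1
            return 0
        return 1


def wait_until_finished(card, max_polls=100000):
    """Return True once BUSY goes high, or False if max_polls is exhausted."""
    card.set_command(*TEST_BUSY)
    card.pulse_cclk()
    card.pulse_cclk()              # second toggle makes the status readable
    for _ in range(max_polls):
        if card.read_busy():       # high level: the utterance has ended
            card.pulse_cclk()      # third toggle before the next talk cycle
            return True
    return False


card = FakeCard(polls_until_done=3)
print(wait_until_finished(card), card.cclk_pulses)
```

If the routine times out, the third CCLK toggle has not been sent, so a real program would probably want to reset the card before trying again.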

  • Some Simple Programs

  • These programs are written in BASIC to run on the Sharp MZ-80K. The port is assumed to be I/O addressed. If you wish to use a memory-mapped system, replace all output statements with 'pokes' and input statements with 'peeks'. The programs are written as subroutines to allow them to be easily incorporated into existing BASIC programs (see Subroutines). During the 'Initialise' subroutine, you will need to specify the port address. On the Sharp this is simply two numbers, say 2 and 3.

  • Test Program 1: By entering the word start address in decimal when prompted, the VSP card will say the word. WL=LS byte, WH=MS byte. This program is a continual loop; to stop it, use the 'escape', 'break' or 'control C' command.

  • Test Program 2: By entering a string of word start addresses in the DATA line as follows: WL1, WH1, WL2, WH2 ...., the VSP card can be made to speak the entered sentence or phrase. If you use the data list (lines 35-62) the WORDMAKER speaks the whole word library available in correct EPROM order (see Table 7). Note that the 'decimal' Address has the correct numbers for operation instead of a straightforward Hex conversion.

  • Test Program 3: This program, based on Test Programs 1 and 2, gives some sample sentences and tones which are recorded on our demonstration cassette No.2. Pauses of varying lengths are easily made by inserting a FOR / NEXT loop at line 28 as shown. Some idea of the musical potential, using varying pitch/clock rates by adjusting RV1 (this can be increased to 100k for greater range), is also given. Exciting possibilities are evident here.
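  • The WL/WH byte pairs used by the test programs can be illustrated with a small Python sketch. This is not a translation of the BASIC listings; the helper names are invented for the example and the addresses shown are arbitrary values, not entries from Table 7.

```python
def split_address(address):
    """Split a 16-bit start address into (WL, WH): LS byte, then MS byte."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    return address & 0xFF, address >> 8


def join_address(wl, wh):
    """Rebuild the 16-bit start address from its WL and WH bytes."""
    return (wh << 8) | wl


def pairs_from_data(data):
    """Group a flat WL1, WH1, WL2, WH2, ... list into (WL, WH) pairs."""
    if len(data) % 2:
        raise ValueError("DATA list must hold an even number of values")
    return [(data[i], data[i + 1]) for i in range(0, len(data), 2)]


# Arbitrary example values in the WL1, WH1, WL2, WH2 layout.
data_line = [64, 12, 0, 2]
for wl, wh in pairs_from_data(data_line):
    print(wl, wh, join_address(wl, wh))
```

Each pair would be loaded into the card LS byte first, with a wait for the BUSY line between words so that one utterance finishes before the next begins.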

Program listings 1


Program listings 2


Parts list

  • We hope you will find the simple programs helpful in your investigation into the world of talking computers and that you won't spend too many hours talking to your computer as opposed to your family or friends! E&MM
  • RDjan2001