http://web.archive.org/web/20191125091344/http://www.wizzardsoftware.com/docs/tts.pdf
IBM Text-to-Speech
API Reference
Version 6.4.0 Printed in the USA
Note:
Before using this information and the product it supports, be sure to read the general information under
Appendix A, "Notices."
Twelfth Edition (March 2002)
The following paragraph does not apply to the United Kingdom or any country where such provisions are
inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS
PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED,
INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY OR
FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied
warranties in certain transactions, therefore, this statement may not apply to you. This publication could include
technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these
changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in
the product(s) and/or the program(s) described in this publication at any time.
It is possible that this publication may contain reference to, or information about, IBM products (machines and
programs), programming, or services that are not announced in your country. Such references or information
must not be construed to mean that IBM intends to announce such IBM products, programming, or services in
your country. Requests for technical information about IBM products should be made to your IBM reseller or
IBM marketing representative.
©Copyright International Business Machines Corporation 1994-2002. All Rights Reserved.
Note to U.S. Government Users—Documentation related to restricted rights— Use, duplication or disclosure is
subject to restrictions set forth in GSA ADP Schedule Contract with IBM Corp.

Copyright License
This information contains sample application programs in source language, which illustrate
programming techniques. You may copy, modify, and distribute these sample programs
in any form without payment to IBM, for the purposes of developing, using, marketing
or distributing application programs conforming to the application programming interface
for the operating platform for which the sample programs are written. These examples
have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee
or imply reliability, serviceability, or functionality of these programs. You may
also copy, modify, and distribute these sample programs in any form without payment
to IBM for the purposes of developing, using, marketing, or distributing application
programs conforming to IBM's application programming interfaces.
Each copy or any portion of these sample programs, or any derivative work, must include
a copyright notice as follows:
© (your company name) (year). Portions of this code are derived from IBM Corp. Sample
Programs. © Copyright IBM Corp.
_enter the year or years_. All rights reserved.
Contents

About This Book
    Who Should Read This Book?
    Organization of This Book
    Typographical Conventions
The IBM Text-to-Speech Software Developer’s Kit
    Overview
    Eloquence Command Interface (ECI)
The ECI Application Programming Interface
    Overview
    Structuring an ECI Program
    Threading
    Callbacks
    User Dictionaries
ECI Reference
    Data Types
    Synthesis State Parameters
    Voice Parameters
    Preset Voice Definitions
    Table of Functions
    Alphabetical Index of Functions
Annotations
    ECI Annotations
    Selecting a Language and Dialect
    Selecting a Voice or Voice Characteristics
    Selecting a Speaking Style
    Modifying Word Emphasis and Tone
    Modifying Phrase-Final Intonation
    Adding Pauses
    Filters
    Specifying Alternative Pronunciations
Custom Filters
    Implementing a Custom Filter
    Dynamic Filters
    Static Filters
Symbolic Phonetic Representations
    SPR Form
    SPR Tables
    American English SPRs
    British English SPRs
    German SPRs
    Canadian French SPRs
    French SPRs
    Standard Italian SPRs
    Mexican Spanish SPRs
    Castilian Spanish SPRs
    Brazilian Portuguese SPRs
    Finnish SPRs
    Chinese SPRs
    Japanese SPRs
Code Samples
    Hello world!
    Specifying a language
    Specifying a voice
    Specifying a sample rate
    Specifying voice parameters
    Using annotations
    Concatenative TTS
    Inserting indices
    Catching indices – the callback function
    User dictionaries – main volume
    User dictionaries – roots volume
    User dictionaries – abbreviations volume
    User dictionaries – extended volume
Appendix A. Notices
    Trademarks
Index
About This Book
This book provides information on incorporating IBM Text-to-Speech technology into other
applications. It describes the programming interfaces available for developers to take advantage of
these features within their applications. This book is prepared in Portable Document Format (PDF) to
provide the advantages of text search and cross-reference hyperlinking and is viewable with the Adobe
Acrobat Reader v.3.x or higher. We recommend that you print all or part of this guide for quick
reference.
Who Should Read This Book?
Read this book if you are a software developer interested in writing applications that use IBM
Text-to-Speech technology. This document describes the use of IBM Text-to-Speech technology for
beginning to advanced software engineers.
Organization of This Book
This document is organized in the following manner:
• “The IBM Text-to-Speech Software Developer’s Kit” contains general information about
the structure and organization of the IBM Text-to-Speech SDK, including an overview of the
API interfaces and a description of the SDK-provided tools.
• “The ECI Application Programming Interface” contains information about using IBM
Text-to-Speech with its proprietary “Eloquence Command Interface” API.
• “ECI Reference” contains detailed information about the data types and functions available
for use with the Eloquence Command Interface.
• “Annotations” includes a description of the use of special codes that can be inserted into the
input text to customize the behavior of IBM Text-to-Speech.
• “Symbolic Phonetic Representations” describes the use of special phonetic symbols to
customize pronunciations in IBM Text-to-Speech.
• “Glossary of Linguistic Terms” contains definitions of linguistic terms used in this manual.
Typographical Conventions
The following typographical conventions are used throughout this document to facilitate reading and
comprehension.
Text Format       Applies to
Monospace font    Code samples, file and directory names.
Bold              Function and callback names; data types (including structures and enumerations).
Italics           Parameter and structure member names; sample text; the introduction of a new term.
UPPERCASE         Property, enumerator, mode, and state names.

The IBM Text-to-Speech Software Developer’s Kit
Overview
The IBM TTS SDK allows you to incorporate high-quality text-to-speech functionality into your
applications. This SDK offers developers the application programming interfaces (APIs) for the
proprietary, platform-independent Eloquence Command Interface (ECI). The typical installation of
the IBM Text-to-Speech SDK, which includes this document, along with the IBM TTS RunTime,
provides all the necessary software and support files for these APIs.
The following sections include a brief description of each of the available APIs and directory structure
of this SDK.
Eloquence Command Interface (ECI)
The Eloquence Command Interface (ECI) is a proprietary, platform independent API that allows direct
access to all the functionality and power of the IBM Text-to-Speech. This API:
• Is supported on a variety of operating systems.
• Allows customization of speech output both through function calls and textual annotations.
• Does not use the Windows Registry to find components, allowing developers to include a private
copy of the text-to-speech engine with their application that is less likely to be accidentally
modified by later installations or by other applications.
See the sections The ECI Application Programming Interface and ECI Reference for details on how to
use this API. See the section Annotations for details on the use of ECI annotations to customize
speech output.

The ECI Application Programming Interface
Overview
The Eloquence Command Interface (ECI) is a library that provides an interface between applications
and the IBM Text-to-Speech system. Version 6.2 of ECI has been re-architected to provide support for
multiple concurrent speech synthesis threads, and a consistent interface on all supported platforms.
As in prior versions of ECI, text is appended to the input buffer. Each word takes its voice definition
from the active voice. Speech is synthesized from the input buffer according to the associated voice
parameters, placed in the output audio buffer, and sent to the appropriate destination. The active voice
can be set from a number of built-in voices or from a user-defined voice. Language, dialect, and voice
parameters can be modified individually using either ECI function calls or annotations inserted into the
input buffer with the input text. As text is added to the input buffer, the active voice definition is stored
with it, so that changes to the active voice do not affect text already in the input buffer.
Indices can be used to determine when the delimited text fragment has been synthesized. A message
will be received when all text inserted before the index has been synthesized.
Output can be sent to one of three types of destinations: a callback function, a file, or an audio device.
These destination types are mutually exclusive, so sending output to one of them turns off output to the
previous destination. The default destination is an available audio device.
Structuring an ECI Program

Using eciSpeakText for Simple Programs

The simplest way to incorporate text-to-speech into your application is by using the high-level ECI
function eciSpeakText, which speaks the given text to the default audio device. This first sample C
program speaks a short phrase and then exits:

#include <eci.h>

int main(int argc, char *argv[])
{
    eciSpeakText("Hello World!", false);
    return 0;
}

Managing an ECI Instance

In order to use the more powerful features of the ECI API, you will have to manage ECI instances
directly. An ECI instance, in accordance with standard object-oriented practice, originates with a
call to eciNew and ends with a call to eciDelete.

One basic strategy for managing an ECI instance is outlined below:

• Create a new ECI instance by invoking eciNew.
• If you want ECI to notify you of certain events, register a callback function with a call to
  eciRegisterCallback.
• Interact with the ECI instance. You may, for example:
  • Add text to the ECI instance’s input buffer with one or more calls to eciAddText.
  • To synthesize annotated text, call eciSetParam(eciInstance, eciInputType, 1) before
    calling eciAddText. This lets ECI know that the text may contain annotations.
  • To use one of the preset voices, call eciCopyVoice before calling eciAddText. The active
    voice (voice 0) specifies values for a set of voice characteristics, such as pitch baseline and
    pitch fluctuation, which are applied to all new text added to the input buffer. See Voice
    Parameters for a more detailed discussion.
  • Change the state of the active voice with calls to eciSetVoiceParam.
• Call eciSynthesize when all text has been added to the input buffer. To synthesize text in
  line-oriented format, such as a table or list, call eciAddText and eciSynthesize for each line,
  to ensure that each line is spoken as a separate sentence.
• If the thread that is managing this ECI instance does not contain a Windows message loop, you
  must ask your instance of ECI to report that synthesis is complete. This step will also allow
  your registered callback to be called by ECI. You can do this in more than one way:
  • Call eciSynchronize, which waits in an efficient state, allowing callbacks to be called,
    until synthesis is finished. When synthesis is complete, the function will return control to
    the calling thread. Do not call eciSynchronize from a thread that has a Windows message
    loop.
  • Call eciSpeaking until it returns false. Each call to eciSpeaking will allow your
    callback to be called.
  If the thread that is managing this ECI instance contains a Windows message loop, this step is
  not necessary.
• Use eciDelete to free the resources dedicated to your instance.
The following example speaks a phrase in English, then a phrase in French, then exits:

#include <stdio.h>
#include <eci.h>

int main(int argc, char *argv[])
{
    ECIHand eciHandle;

    eciHandle = eciNew();               // Create a new ECI instance
    if (eciHandle != NULL_ECI_HAND)     // Success?
    {
        // Give some text to the instance
        if (!eciAddText(eciHandle, "Hello World!"))
        {
            // We failed to add text; print an error message
            printf("eciAddText failed\n");
        }
        // Change the language to Standard French, if available
        if (eciSetParam(eciHandle, eciLanguageDialect, eciStandardFrench) == -1)
        {
            // Error changing to French
            printf("Could not change to French\n");
        }
        else
        {
            // Give some text in French
            if (!eciAddText(eciHandle, "Un. Deux. Trois."))
            {
                // We failed to add text; print an error message
                printf("eciAddText failed\n");
            }
            // Start ECI speaking
            if (!eciSynthesize(eciHandle))
            {
                // We failed to synthesize; print an error message
                printf("eciSynthesize failed\n");
            }
        }
        // Wait until ECI finishes speaking
        if (!eciSynchronize(eciHandle))
        {
            // We failed to synchronize; print an error message
            printf("eciSynchronize failed\n");
        }
        // Delete our ECI instance; deallocates memory
        eciDelete(eciHandle);
    }
    else
    {
        // We failed to create a new ECI instance; print an error message
        printf("eciNew failed\n");
    }
    return 0;
}
Threading
The ECI API is structured on a principle called the "Single-Threaded Apartment Model": each
individual instance may be called only on the thread that created it, and it is unaffected by the
existence of other instances or threads. All callbacks are called by the thread that created the
instance.
The eciSpeakText function is a blocking function that creates, manages, and destroys its own private
ECI instance. The application thread of execution is blocked until the function returns. eciSpeakText
requires no special thread handling, since it does not return control to the main thread until it has
completed all synthesis.
Other ECI functions are non-blocking: the application thread of execution remains available during
their execution. Applications using animated mouths, multiple voices, multiple conversations or
requiring the highest possible performance depend on these non-blocking functions, which are only
accessible through the handle created by eciNew. See also eciNewEx.
Callbacks
A callback is a mechanism for temporarily passing control of execution out of an instance of ECI to a
function provided by the developer when certain events take place. The ECI API provides for four
callback events:
• eciIndexReply: Sends notification when a particular point in the input text is reached. To set
these points in the text, call eciInsertIndex after calls to eciAddText.
• eciPhonemeBuffer: Sends notification when the Symbolic Phonetic Representations buffer is
full. Call eciGeneratePhonemes after a call to eciAddText to enable this event.
• eciPhonemeIndexReply: Sends notification when a particular phoneme is spoken, including
mouth animation data for that phoneme. Set eciWantPhonemeIndices to 1 with eciSetParam
to enable this event.
• eciWaveformBuffer: Sends notification when a sample-capture buffer is full (so, e.g., the
developer can send the samples to a custom audio destination). Call eciSetOutputBuffer to
enable this event.
Only one callback function may be registered for each instance of ECI. This function will receive all
four types of callback events. No events are set by default.
Callback functions must return promptly, returning a flag indicating completion of processing.
Callbacks may not call ECI functions.
Register your callback with eciRegisterCallback immediately after calling eciNew. For any given
ECI instance, your callback will be called from the same thread on which your application calls ECI.
See eciRegisterCallback for more details on the use of callbacks.
User Dictionaries
IBM TTS allows you to explicitly specify pronunciations for words, abbreviations, acronyms, and
other sequences, preventing the normal pronunciation rules from applying. One way you can do this is
to enter a Symbolic Phonetic Representation (SPR) annotation directly into the input text (see
Symbolic Phonetic Representations). A more permanent way is to enter the word (the input string or
key) and the pronunciation you want (the output or translation value) in one of the user dictionaries.
A dictionary set consists of four volumes. The volumes differ in the kinds of keys and translation
values they accept:
Main Dictionary (eciMainDict)
Main Extension Dictionary (eciMainDictExt)
Roots Dictionary (eciRootDict)
Abbreviations Dictionary (eciAbbvDict)
A dictionary file consists of ASCII text with one dictionary entry per line. Each input line contains a
key and a translation value, separated by a tab character. An invalid key or translation will cause the
dictionary look-up to fail, and the pronunciation of the word will be generated by the normal
pronunciation rules. Valid entries for each dictionary are discussed in the subsections below.
To add, modify, or delete an entry in any of the dictionaries, use the eciUpdateDict function of the
API.
For Asian languages, such as Chinese and Japanese, the client application should use the dictionary
maintenance functions whose names end in A in place of the same-named functions. For example, use
eciDictFindFirstA instead of eciDictFindFirst.
For Chinese, Roots Dictionary (eciRootDict) functionality is not supported.
Main Dictionary (eciMainDict)
The Main Dictionary is distinguished from the other user dictionaries in two ways: a valid translation
consists of any valid input string, and the key of a Main Dictionary entry may contain any characters
other than white space, except that the final character of the key may not be a punctuation symbol.
You can thus use the Main Dictionary for:
• Strings that translate into more than one word
• Keys that require translations which include annotations or SPRs
• URLs and email addresses
• Keys containing digits or other non-letter symbols
• Acronyms with special pronunciations
The Main Dictionary is case-sensitive. For example, if you enter the key "WHO" with the translation
"World Health Organization", lower case who will still be pronounced as expected (`[hu]).
Note: The Main Dictionary translations may include ECI annotations.
Valid Main Dictionary Entries
The following summarizes the valid Main Dictionary keys and translations:

Key:
· letters, both upper and lower case
· digits
· non-alphanumeric characters like @, #, $, %, &, *, +
· apostrophes, quotation marks, parentheses, brackets, etc.
· punctuation, except as the final character
· NO: white space

Translation:
Anything that is legal input to the text-to-speech engine, including white space, punctuation,
SPRs, and annotations.
Main Dictionary Examples

The following table shows examples of Main Dictionary entries:

Key             Translation
AWSA            American Woman Suffrage `0 Association
jeb@notreal.org j e b at not real dot o r g
ECSU            `[1i] `[1si] `[1Es] `[1yu]
UConn           `[2yu1kan]
WYSIWYG         `[1wI0zi0wIg]
Win32           win thirty two
486DX           4 86 dee ecks

See Also
Abbreviations Dictionary (eciAbbvDict), Roots Dictionary (eciRootDict).
Main Extension Dictionary (eciMainDictExt)
The Main Extension Dictionary is used for Asian languages, providing support for Chinese,
Japanese, and Korean.
You can use the Main Extension Dictionary for:
• Strings for DBCS languages (other than white space)
• Strings that translate into more than one word
• Keys that require translations which include annotations or SPRs
• Keys containing digits or other non-letter symbols
• Acronyms with special pronunciation
Translation is language dependent. For example, in Japanese, Katakana Yomi strings are valid
translations. Any other SBCS/DBCS characters except the accent mark (^) will cause an error.
Each Main Extension Dictionary entry requires a part of speech which specifies the grammatical
category. The possible values are:

Language    Part of Speech (POS)
Chinese     eciUndefinedPOS, eciMingCi
Japanese    eciUndefinedPOS, eciFutsuuMeishi, eciKoyuuMeishi, eciSahenMeishi
Korean      eciUndefinedPOS

Note: The Main Extension Dictionary can be accessed with eciUpdateDictA, eciDictFindFirstA,
eciDictFindNextA, and eciDictLookupA.
Roots Dictionary (eciRootDict)
The Roots Dictionary is used for ordinary words, like nouns, verbs, or adjectives, and for proper
names. The distinctive feature of the Roots Dictionary is that you only have to enter the root form
of a word; all other forms of the word will automatically be pronounced in the same way. For
example, the letter-to-sound rules normally pronounce roof as [ruf] (which has the vowel of boot).
You can use the Roots Dictionary to specify the alternate pronunciation [rUf] (which has the vowel
of book). Then, all words with this root, such as roofer and roofing, will also be pronounced this
way; there is no need to list the other words separately in the dictionary.
• The Roots Dictionary is not case-sensitive. So, for example, when you enter a root in lowercase, it
will still be found and pronounced as specified even when it begins with an uppercase (capital)
letter (for example, as the first word in a sentence).
• The Roots Dictionary is designed to provide alternate pronunciations of existing roots, and may not
work properly in the case of unknown roots. For example, the entry prego occurring in the
hypothetical word pregoness will not be accessed from the user roots dictionary because the
linguistic analysis rules assume that the word contains the root go rather than the root prego.
• The roots dictionary cannot be used to specify an alternate pronunciation of a function word, such
as the or to.
Valid Roots Dictionary Entries

The following summarizes valid Roots Dictionary keys and translations:

Key:
· A single word in ordinary spelling, all lowercase letters
· NO: digits, punctuation, white space, or other non-letter characters

Translation:
· A single word in ordinary spelling
· A valid SPR
· NO: digits, punctuation, or other non-letter characters, white space, tags, or annotations

Roots Dictionary Examples

The following table shows examples of Roots Dictionary entries:

Key          Translation        Would apply to:
figure       `[.1fI.0gR]        figures, figuring, figured, refigure
tomato       `[.0tx.1ma.0to]    tomatoes, tomato’s
wash         `[.1warS]          wash, washing, washed, washes
wilhelmina   wilma              Wilhelmina, Wilhelmina’s

See Also
Main Dictionary (eciMainDict), Abbreviations Dictionary (eciAbbvDict)
Abbreviations Dictionary (eciAbbvDict)
The Abbreviations Dictionary is used for abbreviations (both with and without periods) which do not
require the use of annotations in their translation.
The Abbreviations Dictionary is case-sensitive. So, for example, if you entered the key Mar with the
translation "march," lower-case "mar" would not match the entry and would still be pronounced as expected (`[mar]).
When you enter a key in the Abbreviations Dictionary, it is not necessary to include the "trailing"
period (as in the final period of "etc."). However, if you want an abbreviation to be pronounced as
specified in the translation only when it is followed by a period in the text, then you must enter the
trailing period in the key. The following table summarizes the use of trailing periods:
An Abbreviations Dictionary entry invokes different assumptions about how to interpret the trailing
period in the text than does a Main Dictionary entry. Since the period cannot be part of a Main
Dictionary entry key, it is automatically interpreted as end-of-sentence punctuation. A period
following an Abbreviations Dictionary entry, on the other hand, is ambiguous. It will only be
interpreted as end-of-sentence punctuation if other appropriate conditions obtain (e.g., if it is followed
by two spaces and an upper-case letter). For example, input (a) will be interpreted as one sentence,
while (b) will be interpreted as two sentences.
(a) It rained 2 cm. on Monday.
(b) On Sunday it rained 2 cm. On Monday, it was sunny.
Key entry:    Will match:
inv           inv. or inv
sid.          sid. (not sid)
Valid Abbreviations Dictionary Entries
The following table summarizes valid Abbreviations Dictionary keys and translations:

Keys:
• Sequences of one or more letters separated by periods (x.x.x. or xx.xx.xx)
• Sequences of letters, with or without the trailing period that may be considered part of the
abbreviation (xxx. or xxx)
• Upper or lower case letters
• Internal apostrophes (not the first or last character in the sequence)
• NO: digits, non-letter symbols, white space, or punctuation other than periods

Translations:
• One or more valid words in ordinary spelling, including both upper and lower case letters,
separated by white space or hyphen
• NO: digits, punctuation, SPRs, tags, or annotations

Abbreviations Dictionary Examples
The following table shows examples of Abbreviations Dictionary entries:

Key      Translation
Is.D.    eye ess dee
punct    punctuation
para     paragraph
Ltjg     lieutenant junior-grade
Fr       Friar
int'l    international

You can temporarily override the use of both internal and user-defined abbreviations with an
annotation; see Dictionary Processing of Abbreviations.
See Also
Main Dictionary (eciMainDict), Roots Dictionary (eciRootDict).
ECI Reference
This section contains the following reference information:
• Data Types
• Synthesis State Parameters
• Voice Parameters
• Table of Functions
• Alphabetical Index of Functions
Data Types
ECI defines the following data types in the header file eci.h which should be included in any source
file that uses ECI functions.
Boolean
typedef int Boolean;
Many ECI functions return Boolean values.
ECICallbackReturn
typedef enum {
eciDataNotProcessed,
eciDataProcessed,
eciDataAbort
} ECICallbackReturn;
If you register a callback function, it must return one of these enumerated values.
ECIDictError
typedef enum{
DictNoError,
The call executed properly.
DictNoEntry,
The dictionary is empty, or there are no more entries.
DictFileNotFound,
The specified file could not be found.
DictOutOfMemory,
Ran out of heap space when creating internal data structures.
DictInternalError,
An error occurred in the internal synthesis engine.
DictAccessError,
An error occurred when claiming operating-system specific resources for dictionary access.
DictErrLookUpKey,
An error occurred when looking up the key.
DictInvalidVolume
The dictionary volume is not supported by the current language.
} ECIDictError;
Most dictionary volume access functions return a value of this type to report errors.
ECIDictHand
typedef void* ECIDictHand;
A handle to an ECI dictionary set.
ECIDictVolume
typedef enum {
eciMainDict,
eciRootDict,
eciAbbvDict,
eciMainDictExt
}ECIDictVolume;
Identifies dictionary set volumes. See User Dictionaries.
ECIFilterError
typedef enum {
FilterNoError,
The call executed properly.
FilterFileNotFound,
The specified filter could not be found.
FilterOutOfMemory,
Ran out of heap space when creating internal data structures.
FilterInternalError,
An error occurred in the internal synthesis engine.
FilterAccessError,
An error occurred when claiming operating-system specific resources for filter access.
} ECIFilterError;
ECIHand
typedef void* ECIHand;
A handle to an instance of ECI.
ECIInputText
typedef const void* ECIInputText;
Contains a null-terminated string using a system-dependent character set (currently ANSI for all
platforms).
ECILanguageDialect
typedef enum {
eciGeneralAmericanEnglish,
eciBritishEnglish,
eciCastilianSpanish,
eciMexicanSpanish,
eciStandardFrench,
eciCanadianFrench,
eciStandardGerman,
eciStandardItalian,
eciMandarinChinese,
eciTaiwaneseMandarin,
eciBrazilianPortuguese,
eciStandardJapanese,
eciStandardFinnish,
eciStandardNorwegian,
eciStandardSwedish,
eciStandardDanish
} ECILanguageDialect;
Identifies a language and dialect.
ECIMessage
typedef enum{
eciWaveformBuffer,
eciPhonemeBuffer,
eciIndexReply,
eciPhonemeIndexReply
} ECIMessage;
Indicates why a callback has been called.
ECIParam
typedef enum{
eciSynthMode,
eciInputType,
eciTextMode,
eciDictionary,
eciSampleRate,
eciWantPhonemeIndices,
eciRealWorldUnits,
eciLanguageDialect,
eciNumberMode,
eciPhrasePrediction,
eciNumParams
} ECIParam;
Specifies a synthesis state parameter for function calls that get and set synthesis state attributes.
ECIVoiceParam
typedef enum{
eciGender,
eciHeadSize,
eciPitchBaseline,
eciPitchFluctuation,
eciRoughness,
eciBreathiness,
eciSpeed,
eciVolume,
eciNumVoiceParams
} ECIVoiceParam;
Specifies a voice parameter for function calls that get and set voice attributes.
ECIMouthData
Consists of a phoneme, language and dialect of the phoneme, and mouth position data for the
phoneme. Returned by callbacks with the eciPhonemeIndexReply message. See
eciRegisterCallback for more details.
In addition to the phoneme symbols defined for SPR input, the symbol ¤ (0xA4) is also used to
indicate end of utterance, and is sent with a set of neutral mouth position parameters.
typedef struct {
char szPhoneme[eciPhonemeLength+1];
ECILanguageDialect eciLanguageDialect;
unsigned char mouthHeight;
unsigned char mouthWidth;
unsigned char mouthUpturn;
unsigned char jawOpen;
unsigned char teethUpperVisible;
unsigned char teethLowerVisible;
unsigned char tonguePosn;
unsigned char lipTension;
} ECIMouthData;
Members
szPhoneme
Null-terminated (ASCIIZ) string containing the name of a phoneme, or ¤ (0xA4) for end-of-utterance.
eciLanguageDialect
Language and dialect of this phoneme.
mouthHeight
Height of the mouth and lips. This is a linear range from 0-255, where 0 = minimum height (that is,
mouth and lips are closed) and 255 = maximum possible height for the mouth.
mouthWidth
Width of the mouth and lips. This is a linear range from 0-255, where 0 = minimum width (that is,
the mouth and lips are puckered) and 255 = maximum possible width for the mouth.
mouthUpturn
Extent to which the mouth turns up at the corners, that is, how much it smiles. This is a linear
range from 0-255, where 0 = mouth corners turning down, 128 = neutral, and 255 = mouth is fully
upturned.
jawOpen
Angle to which the jaw is open. This is a linear range from 0-255, where 0 = fully closed, and 255
= completely open.
teethUpperVisible
Extent to which the upper teeth are visible. This is a linear range from 0-255, where 0 = upper
teeth are completely hidden, 128 = only the teeth are visible, and 255 = upper teeth and gums are
completely exposed.
teethLowerVisible
Extent to which the lower teeth are visible. This is a linear range from 0-255, where 0 = lower
teeth are completely hidden, 128 = only the teeth are visible, and 255 = lower teeth & gums are
completely exposed.
tonguePosn
Tongue position. This is a linear range from 0-255, where 0 = tongue is completely relaxed, and
255 = tongue is against the upper teeth.
lipTension
Lip tension. This is a linear range from 0-255, where 0 = lips are completely relaxed, and 255 =
lips are very tense.
Remarks
The inventory of phoneme symbols used as the values of szPhoneme is similar but not necessarily
identical to the inventory of Symbolic Phonetic Representations (SPR) phoneme symbols. The values
of szPhoneme are taken directly from the phonemic representation generated by the IBM Text-to-Speech
engine, whereas the symbols used in SPRs are normalized versions of these phonemes.
In addition to the phoneme symbols used in each language, the symbol ¤ (0xA4) is used to indicate the
end of a sentence and is sent with a set of neutral mouth position parameters.
Synthesis State Parameters
When you create a new ECI instance, it is given a default synthesis state. As you interact with the
instance, its state changes. You can:
• Get the current synthesis state using eciGetParam.
• Set the synthesis state directly, through eciSetParam, or indirectly, by sending annotated text in
calls to eciAddText.
This section describes the synthesis state parameters that can be passed to eciGetParam and
eciSetParam.
eciDictionary
0: Abbreviations dictionaries (both internal and user) are used (default).
1: Abbreviations dictionaries (both internal and user) are not used.
Enables or disables the internal and user abbreviations dictionaries. You can also turn abbreviations
dictionary lookups on and off using the ‘daN annotation (see Dictionary Processing of
Abbreviations).
eciInputType
0: Plain: input consists of unannotated text. Any annotations will be spelled out (e.g., `v2 will be
pronounced "backquote vee two") (default).
1: Annotated: input text includes annotations. See Annotations for more details.
eciLanguageDialect
enum
{
eciGeneralAmericanEnglish,
eciBritishEnglish,
eciCastilianSpanish,
eciMexicanSpanish,
eciStandardFrench,
eciStandardGerman,
eciStandardItalian,
eciMandarinChinese,
eciTaiwaneseMandarin,
eciBrazilianPortuguese,
eciStandardJapanese,
eciStandardFinnish,
eciStandardKorean
} ECILanguageDialect
A value specifying the language and dialect. These should be of type ECILanguageDialect. Not all
languages are available with all installations. The language defaults to the “lowest-numbered”
language installed on the system. Languages are numbered in the order specified by the
ECILanguageDialect enum.
This parameter can be set by the ‘lN annotation; see Selecting a Language and Dialect for more detail.
eciNumberMode
0: Pronounce 4-digit numbers as “nonyears” (e.g., “1984” would be pronounced “one thousand nine
hundred eighty four”).
1: Pronounce 4-digit numbers as “years” (e.g., “1984” would be pronounced “nineteen eighty four”)
(default)
This parameter can be set by the ‘tyN annotation; see Specifying Alternative Pronunciations for more
detail.
eciNumParams
Total number of ECIParams. Passing eciNumParams to eciGetParam will cause a -1 (error) return,
which is expected behavior.
eciRealWorldUnits
0: Use ECI values (default).
1: Use Real World units.
Selects the units for the values of the voice parameters eciPitchBaseline, eciSpeed, and eciVolume as
either ECI units or Real World units.
eciSampleRate
0: 8000 samples per second.
1: 11,025 samples per second (default).
2: 22,050 samples per second.
eciSynthMode
0: Sentence: The input buffer is synthesized and cleared at the end of each sentence (default).
1: Manual: Synthesis and input clearing is controlled by commands only.
eciTextMode
0: Default: no special interpretation (default).
1: AlphaSpell: letters and digits are spelled out, punctuation is treated normally to identify ends of
phrases and sentences, and other symbols are ignored.
2: AllSpell: all symbols are spelled out. Note that sentence ends are not recognized in this mode.
3: IRCSpell: like AlphaSpell, except that letters are spelled out using the International Radio Code
(“alpha, bravo, charlie”) rather than their conventional names.
This corresponds to the annotation ‘tsN, described in Specifying Alternative Pronunciations.
eciWantPhonemeIndices
0: Phoneme indices are not generated. (default)
1: If a callback has been registered (see eciRegisterCallback below), phoneme indices will be sent to
the callback as each phoneme is being spoken. See also the eciPhonemeIndexReply message and the
ECIMouthData type.
Synthesis State Parameter Defaults
The following table provides a summary of the synthesis state parameters and their default behavior.
Parameter               Default value     Default behavior
eciDictionary           0                 User dictionaries are used.
eciInputType            0                 Annotations in input will be spelled out.
eciLanguageDialect      lowest number     The lowest-numbered language/dialect on
                        installed         the system is used.
eciNumberMode           1                 Four-digit numbers are pronounced as "years".
eciNumParams            0                 Total number of ECIParams.
eciRealWorldUnits       0                 ECI units are used for all voice definition parameters.
eciSampleRate           1                 The sample rate is 11,025 samples per second.
eciSynthMode            0                 The input buffer is synthesized and cleared
                                          at the end of each sentence.
eciTextMode             0                 No special spelling interpretation is
                                          performed on the text.
eciWantPhonemeIndices   0                 Phoneme indices are not generated.
Voice Parameters
Voice parameters are commands used to define and adjust individual voice characteristics. A set of
voice parameters makes a voice definition. You can create custom voices by selecting unique
combinations of voice parameters. In addition, there are five predefined voice definitions, as discussed
in the next section.
When you create a new ECI instance, it is given the default voice parameters. You can:
• Get the current voice parameters using eciGetVoiceParam.
• Set the voice parameters directly through eciSetVoiceParam, or indirectly by sending annotated
text in calls to eciAddText.
This section describes the voice parameters that can be passed to eciGetVoiceParam and
eciSetVoiceParam.
eciBreathiness
Range: 0-100
This parameter controls the amount of breathiness in the voice. The higher the value, the more
breathiness the voice has. A value of 100 produces a whisper.
This voice parameter can be changed using the annotation ‘vyN (see Selecting a Voice or Voice
Characteristics).
eciGender
0: male
1: female
Male and female vocal tracts have physical differences that affect the voice, some of which are
reflected in the vocal tract setting. Other differences between male and female voices, namely pitch
and head size, are controlled independently.
This voice parameter can be changed using the annotation ‘vgN (see Selecting a Voice or Voice
Characteristics).
eciHeadSize
Range: 0-100
This parameter controls the size of the head for the speaker, changing the perceived pitch and other
acoustic characteristics of the voice. A large number indicates a large head and a deeper voice.
This voice parameter can be changed using the annotation ‘vhN (see Selecting a Voice or Voice
Characteristics).
eciNumVoiceParams
Total number of ECIVoiceParams. Passing eciNumVoiceParams to eciGetVoiceParam will cause a -1
(error) return, which is expected behavior.
eciPitchBaseline
Range: 0-100 (ECI units); 40-422 (Real World Units = cycles per second)
Changing the pitch baseline will affect the overall pitch of the voice. The larger the pitch value, the
higher the pitch of the voice.
This voice parameter can be changed using the annotation ‘vbN (see Selecting a Voice or Voice
Characteristics).
eciPitchFluctuation
Range: 0-100
This parameter controls the degree of pitch fluctuation in the voice. A value of zero produces a voice
with no pitch fluctuation, resulting in monotone speech. A high value produces a voice with large pitch
fluctuations, typical of excited speech.
This voice parameter can be changed using the annotation ‘vfN (see Selecting a Voice or Voice
Characteristics).
eciRoughness
Range: 0-100
This parameter adds roughness or "creakiness" to the voice. A low value produces a smooth voice,
while a high value is rough or scratchy.
This voice parameter can be changed using the annotation ‘vrN (see Selecting a Voice or Voice
Characteristics).
eciSpeed
Range: 0-250 (ECI Units); 70-1297 (Real World Units = words per minute)
Speed controls the number of words spoken per minute.
This voice parameter can be changed using the annotation ‘vsN (see Selecting a Voice or Voice
Characteristics).
eciVolume
Range: 0-100 (ECI Units); 1-65535 (Real World Units)
The smaller the value, the lower the volume. Louder settings may cause distortion when combined
with other attribute changes.
This voice parameter can be changed using the annotation ‘vvN (see Selecting a Voice or Voice
Characteristics).
Preset Voice Definitions
Voice definitions are sets of parameter values that make an individual voice. There are five preset
voice definitions for each dialect of each language (three more are reserved for future use).
Each voice definition contains a set of parameter values that control the attributes of the voice.
The preset voices in each language are:
1. Adult Male 1
2. Adult Female 1
3. Child 1
4. Adult Male 2
5. Adult Male 3
6. Adult Female 2
7. Elderly Female 1
8. Elderly Male 1
Voice Parameter Defaults
The following chart shows the voice definition parameters for all languages, except as noted:

Voice                  1        2         3        4       5       6         7         8
Parameter              Adult    Adult     Child 1  Adult   Adult   Adult     Elderly   Elderly
                       Male 1   Female 1           Male 2  Male 3  Female 2  Female 1  Male 1
Breathiness            0**      50        0        0       0       40        40        20
Gender                 0        1         1        0       0       1         1         0
Head size              50       50        22       86      50      56        45        30
Pitch baseline (ECI)   65*      81        93       56      69      89        68        61
Pitch fluctuation      30       30        35       47      34      35        30        44
Roughness              0        0         0        0       0       0         3         18
Speed (ECI units)      50       50        50       50      70      70        50        50
Volume (ECI units)     92       100       90       93      92      95        90        90

* In French, the Pitch Baseline parameter is 69.
** In Taiwanese Mandarin, the Breathiness parameter is 34.
Table of Functions
This table outlines the available ECI functions. Detailed information about each function can be found
in the Alphabetical Index of Functions.
System Control
Use the following functions for system control:
Synthesis Control
Use the following functions for Synthesis Control:
Function Description
eciDeactivateFilter Disables the specified filter for the ECI instance.
eciNew Creates a new ECI instance and returns a handle to it.
eciNewEx Creates a new instance of ECI and returns a handle to it.
The client indicates the language, dialect, and character set
for the new engine instance.
eciReset Resets the ECI instance to the default state.
eciSpeakText Synthesizes text to the default audio device.
eciSpeakTextEx Synthesizes text to the default audio device, allowing
selection of the language, dialect, and character set of the text.
Function Description
eciAddText Appends new text to the input buffer.
eciClearInput Clears the input buffer.
eciGeneratePhonemes Converts text to phonemes.
eciGetIndex Returns the last index reached in an output buffer.
eciInsertIndex Inserts an index into an input buffer.
eciPause Pauses or unpauses speech synthesis and playback.
eciSpeaking Determines whether synthesis is in progress.
eciStop Stops synthesis.
eciSynchronize Waits for an ECI instance to finish processing its output and then
synchronizes it with a device.
eciSynthesize Starts synthesis of text in an input buffer.
eciSynthesizeFile Synthesizes the contents of a file.

Output Control
Use the following functions for output control:

Function Description
eciSetOutputBuffer Sets an output buffer as the synthesis destination.
eciSetOutputDevice Sets an audio output hardware device as the synthesis destination.
eciSetOutputFilename Sets an output file as the synthesis destination.

Speech Environment Parameter Selection
Use the following functions for speech environment parameter selection:

Function Description
eciGetDefaultParam Returns the default value of an environment speech parameter.
eciGetParam Returns the value of an environment parameter.
eciSetDefaultParam Sets the default value of an environment speech parameter.
eciSetParam Sets an environment parameter.
Voice Parameter Control
Use the following functions for voice parameter control:
Dynamic Dictionary Maintenance
Use the following functions for dictionary maintenance (for Asian languages such as Chinese and
Japanese, use the functions that end with the letter A):
Function Description
eciCopyVoice Makes a copy of a set of voice parameters.
eciGetVoiceName Returns the voice name and then copies it to a name