One of the objectives of Crest is to bring together people from diverse disciplines and to help them to work together. One of the challenges in such a situation is that of simply being able to understand each other's language. This page can be used to develop a glossary whereby members can document what they mean by different terms. There may be differences in understanding and they can be captured here. So, if at any time you find yourself using a word or phrase that someone else in the network does not understand, then please come to some mutual agreement as to what you mean and record that definition here. Or, perhaps, you realize that you do not really understand a term in common use and find it helpful to work out and record what you really mean.

Of course, this is effectively a wiki for members - so if you disagree with a definition or think it can be refined, then please go ahead and update the entry, or start up a Wall discussion.


Alternative and augmentative communication. Generally this refers to communication using some device in addition to the body. Examples are the devices made by Toby Churchill Ltd [1].


The way we speak a phoneme is affected by the phonemes that precede and follow it - simply because of the physical movements we have to make with our articulation organs (tongue, lips etc). 'For instance, the words "construe" and "constraint" begin with the same phonemes, but they may be pronounced differently because in the former case the speaker is likely to round his or her lips in anticipation of the ending u sound, while in saying "constraint" the lips will end quite open.' (Edwards, 1991, p.16).


Rather than using phonemes to construct synthetic speech, it is possible to use transitions between phonemes. This helps to improve the voice because it (partly) captures coarticulation effects. Essentially a diphone consists of the sound from the middle of one phoneme to the middle of the next one. The number of diphones is much larger than the number of phonemes - but not impractically so, because many pairs of phonemes simply do not co-occur in the language.
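As a rough illustration of the idea, the sketch below splits a phoneme sequence into diphone units and computes the upper bound on the number of distinct diphones. The function names, the phoneme symbols and the "_" silence marker are all assumptions made for this example; they are not taken from any particular synthesiser.

```python
def to_diphones(phonemes, silence="_"):
    """Return the diphone units for a phoneme sequence.

    Each unit runs from the middle of one phoneme to the middle of the
    next, so n phonemes padded with silence at both ends give n + 1 units.
    """
    padded = [silence] + list(phonemes) + [silence]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


def max_diphone_count(n_phonemes):
    """Upper bound on distinct diphones: every ordered pair of phonemes."""
    return n_phonemes * n_phonemes


# Illustrative symbols for the word "cat" (an assumption, not a standard transcription).
print(to_diphones(["k", "a", "t"]))
# With around forty phonemes the bound is 40 * 40 ordered pairs.
print(max_diphone_count(40))
```

In practice the inventory is smaller than this bound, because pairs that never co-occur in the language need not be recorded.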


Interactive voice response. This refers to those telephone-based systems which use voice recognition - to book cinema seats or flights or whatever. Although often the butt of jokes and frustration, they are a very serious commercial proposition. Their design can make a serious difference to revenues.


'The phoneme is the smallest segment of sound, such that if one phoneme in a word is substituted for another, the meaning of the word is changed. For example, substituting the first phoneme in "coffee" could change the word to "toffee". Because the definition is somewhat subjective, it is not possible to say precisely how many phonemes there are in the English language, but there are around forty.' (Edwards, 1991, p.13). Some synthetic speech is generated by stringing together phonemes.
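The substitution test in this definition can be sketched in code: two words form a contrasting pair if their phoneme sequences are the same length and differ in exactly one position. The function name and the phoneme symbols below are hypothetical and chosen only for illustration.

```python
def differs_by_one_phoneme(a, b):
    """True if the sequences have equal length and differ in exactly one position."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1


# Illustrative symbols for "cat" and "bat" (an assumption, not a standard transcription).
cat = ["k", "a", "t"]
bat = ["b", "a", "t"]
print(differs_by_one_phoneme(cat, bat))
```

The check only compares symbols. Whether the substitution changes the meaning of the word is still a judgement for a speaker of the language, which is why the definition is somewhat subjective.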


'"Voice" synthesis, as distinct from "speech" synthesis, may be incomputable, at least with current technology. "Voice", in this thesis, will always be presented as a higher-level feature than "speech", requiring more complex rendering techniques.' (Newell, 2009, p.88)


Edwards, A. D. N. (1991), Speech Synthesis: Technology for Disabled People, London: Paul Chapman Publishing Ltd.

Newell, C. H. (2009), 'Place, authenticity, and time: a framework for liveness in synthetic speech', unpublished PhD Thesis, University of York.