Advances in several fundamental technologies are making possible mobile computing platforms of unprecedented power. In the speech and voice technology industry, SALT has been introduced as a new tool. SALT supplies a critical missing component, facilitating intuitive speech-based interfaces that anyone can master. Verizon Wireless has joined the SALT Forum to make speech applications more accessible to wireless customers. The SALT specification defines a set of lightweight tags as extensions to commonly used Web-based programming languages, strengthened by incorporating existing standards from the World Wide Web Consortium (W3C) and the Internet Engineering Task Force (IETF). In multimodal applications, the tags can be added to support speech input and output either as standalone events or jointly with other interface options, such as speaking while pointing to the screen with a stylus. In telephony applications, the tags provide a programming interface to manage the speech recognition and text-to-speech resources needed to conduct interactive dialogs with the caller through a speech-only interface.
SALT (Speech Application Language Tags) is a speech interface markup language. It extends HTML and other markup languages (XHTML, WML) to add a speech interface to Web pages while maintaining and leveraging the advantages of the Web application model. The tags are designed for both voice-only browsers (for example, a browser accessed over the telephone) and multimodal browsers. SALT consists of a small set of XML elements, with associated attributes and DOM object properties, events, and methods, which may be used in conjunction with a source markup document to apply a speech interface to the source page. The SALT formalism and semantics are independent of the nature of the source document, so SALT can be used equally effectively within HTML and all its flavors, with WML, or with any other SGML-derived markup. SALT targets speech applications across a wide range of devices, including telephones, PDAs, tablet computers, and desktop PCs. Because these devices offer different methods of inputting data, SALT is designed to accommodate them.
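To make the idea of a small set of XML elements layered on a source page more concrete, here is a minimal sketch in Python. It embeds a hypothetical XHTML page carrying SALT tags and uses the standard library XML parser to list the SALT elements and the page fields they fill. The element names (prompt, listen, grammar, bind), the bind attributes, and the namespace URI follow the SALT 1.0 specification as best recalled and should be treated as assumptions; the page itself, its field names, and the grammar file are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URI as recalled from the SALT 1.0 specification (assumption).
SALT_NS = "http://www.saltforum.org/2002/SALT"

# Hypothetical XHTML page: an ordinary text box, plus SALT tags that let the
# user fill it by voice. Field names and the grammar file are made up.
PAGE = """\
<html xmlns:salt="http://www.saltforum.org/2002/SALT">
  <body>
    <input name="txtBoxCity" type="text" />
    <input name="buttonCityListen" type="button" onclick="listenCity.Start();" />
    <salt:prompt id="askCity">Which city are you flying from?</salt:prompt>
    <salt:listen id="listenCity">
      <salt:grammar src="./city.grxml" />
      <salt:bind targetelement="txtBoxCity" value="//city" />
    </salt:listen>
  </body>
</html>
"""


def salt_elements(page: str) -> list[str]:
    """Return local names of all SALT-namespaced elements, in document order."""
    root = ET.fromstring(page)
    prefix = "{" + SALT_NS + "}"
    return [el.tag[len(prefix):] for el in root.iter() if el.tag.startswith(prefix)]


def bind_targets(page: str) -> dict[str, str]:
    """Map each bound page element to the path that selects its value
    from the recognition result."""
    root = ET.fromstring(page)
    return {
        b.get("targetelement"): b.get("value")
        for b in root.iter("{" + SALT_NS + "}bind")
    }


if __name__ == "__main__":
    print(salt_elements(PAGE))
    print(bind_targets(PAGE))
```

The source page stays plain XHTML: removing the four SALT elements leaves a working visual form, which is the sense in which the speech interface is applied on top of the page instead of replacing it.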
SALT provides multimodal access, in which users can interact with an application in a variety of ways: input with speech, a keyboard, keypad, mouse, and/or stylus; and output as synthesized speech, audio, plain text, motion video, and/or graphics. Each of these modes can be used independently or concurrently. For example, a user might click on a flight info icon on a device and say "Show me the flights from San Francisco to Boston after 7 p.m. on Saturday" and have the browser display a Web page with the corresponding flights.
SALT helps address three major challenges.
1. Input on wireless devices:
Wireless devices are becoming pervasive, but the lack of a natural input mechanism hinders both their adoption and application development for them.
2. Speech-enabled application development:
By enabling integration between existing Web browser software, server and network infrastructure, and speech technology, SALT will allow many more Web sites to be reachable through telephones.
3. Telephony applications:
There are 1.6 billion telephones in the world, but only a relatively small fraction of Web applications and services are reachable by phone.