Information Retrieval from Mixed-Language Spoken Documents (混合语言音频档案中的信息检索)

 

Principal Investigator:  Tan Lee, Dept. of EE, CUHK  

Co-Investigator:         P.C. Ching, Dept. of EE, CUHK

Research Students:   Houwei Cao, David Yeung, Dept. of EE, CUHK

 

 

As Internet use grows and the cost of electronic storage keeps dropping, we now enjoy easy access to abundant digitized information, including text documents and audio and video recordings. To make such resources useful, we need effective and efficient methods of retrieving information in response to user queries. For example, a financial analyst may wish to find all radio broadcast news recordings that concern the activities of a specific company; a police investigator may need to locate certain keywords in many hours of recorded interviews with suspects or witnesses.

 

Hong Kong is an international city where many people, especially the younger generation, are bilingual in Cantonese and English. Speakers frequently embed English words in spoken Cantonese sentences, e.g. “能夠同佢哋work together 我覺得好exciting.” ("I find it very exciting to be able to work together with them.") Mixed-language speech is common not only in casual conversation but also in formal business communication, e.g. “我哋concern呢個investment會唔會太risky.” ("We are concerned about whether this investment would be too risky.")

 

For information retrieval from audio archives that contain mixed-language content, conventional methods based on monolingual speech recognition systems are not applicable. First, there is no prior knowledge of when the language switches, so we cannot determine which of the two monolingual recognizers should be used for a particular speech segment. Second, the English words embedded in a Cantonese utterance are very often spoken with a strong Cantonese accent, which a monolingual ASR system for standard English is unable to handle. Third, mixed-language speech follows its own grammar, which cannot be inferred from monolingual speech, so the language models need to be rebuilt from mixed-language data.

 

In the proposed project, we aim to develop an information retrieval system that can handle audio documents with mixed-language content. We expect our research results to extend to general information retrieval applications that support mixed-language spoken or text queries.

 

