Mandarin Topic-oriented Conversation Corpus (Academia Sinica)
Type: Collection
Collection Identifier: c0518
Description: The Mandarin Topic-oriented Conversation Corpus (MTCC) was recorded from January to March 2002. The conversations are natural discussions between two people who know each other. Each conversation centers on one chosen event that happened in 2001. The corpus comprises 30 conversations with 60 speakers in total, aged 14 to 63. The total length of the data is 11 hours, and the average length of each conversation is 22 minutes. The annotation system is designed to mark discourse functions in natural conversations. Opening, main discussion, and closing are the three main parts of a natural, topic-oriented conversation. The main discussion contains discourse functions intended to start a discussion, to negotiate a topic, to introduce a topic, to talk about a topic, and to end the discussion. In order to build a multimodal database together with the metadata of the transcription texts, all sound files are segmented and stored as stereo files. The total size is 6.78 GB. With the help of Translist, 29 conversations (185,000 characters) have been fully transcribed and annotated.
Language: English; Chinese
Rights:
Subject Language:
Temporal Coverage: 2001
Dates: Collection Accumulated 2002
Owner: Institute of Linguistics, Academia Sinica
Is Located At: Institute of Linguistics, Academia Sinica
Is Accessed Via: link: http://mmc.sinica.edu.tw/mtcc_e.htm
Super-Collection: Language
Associated collection: Academia Sinica Tagged Corpus of Early Mandarin Chinese (Institute of Linguistics, Academia Sinica); Formosan Language Archive (Institute of Linguistics, Academia Sinica); Southern-Min Archive: A Database of Historical Change in Language Distribution (Institute of Linguistics, Academia Sinica)