Analiza diskurza kot podpora sistemom strojnega simultanega prevajanja govora : [doktorska disertacija]

University of Maribor Library (UKM)

Verdonik, Darinka

The aim of this work is to examine telephone conversations in the tourist domain using concepts from discourse analysis that could be applied in speech-to-speech translation to better handle ... spontaneous speech phenomena. Speech-to-speech translation systems have to manage three different tasks. First, speech in the input language must be recognized. The text obtained through speech recognition usually contains errors and is not structured into clauses and sentences. The recognized text is then translated into the output language in a process called speech-centred translation. Translating spontaneous spoken text differs from translating written text because spoken text includes disfluencies, repairs, false starts, hesitations, filled pauses, silences, etc.; repetitions are much more frequent, information is more often implicit, prosody is lost when speech is transformed into text... These and similar phenomena of spontaneous speech have been identified as problematic in speech-to-speech translation: the C-STAR consortium (http://www.c-star.org/main/english/cstar2/) therefore suggests that simply combining machine translation techniques developed for written text with speech recognition and speech synthesis cannot achieve satisfactory quality, and that special approaches to speech-centred translation are needed. Similar conclusions were reached in Verbmobil (http://verbmobil.dtki.de/verbmobilNM.English.Mai1.30 .1 0.96.html) and other projects in which speech-to-speech translation systems were built. The last step of a speech-to-speech translation system is speech synthesis of the translated text in the output language. The system has to be reciprocal. An overview of machine translation and speech-to-speech translation shows that different approaches to the problem have been developed; the most promising recently are statistical corpus techniques.
Machine translation and other language technologies can perform better when they use certain parts of traditional linguistic knowledge; part-of-speech categories and other morpho-syntactic attributes, for example, are widely used. But spontaneous speech shows many phenomena that go beyond traditional linguistic knowledge, since that knowledge was gained mostly by studying written language. Spontaneous speech has been studied more thoroughly in fields such as pragmalinguistics, conversation analysis and others that can be classified as discourse analysis. I therefore suggest using parts of the linguistic knowledge of discourse analysis to deal with the phenomena of spontaneous speech in speech-to-speech translation. The research was both theoretical and empirical. It was limited to the tourist domain: telephone conversations in a tourist agency, a tourist office and a hotel. The Turdis-1 corpus, comprising 30 conversations, was used as research material. Discourse analysis was examined for concepts that could be implemented as easily as possible in speech corpora as tagging attributes. In this work I suggest structuring the spoken text into smaller units: opening and closing sections, turns and utterances. The utterance is precisely defined. Hearer's signals (words such as mhm, aha, ja) are treated as special discourse events, not as turn-taking. I further suggest that the concept of discourse markers could be used. The empirical study shows that at least 15 expressions in the Turdis-1 corpus (ja, mhm, aha, aja, ne?, no, eee, dobro/v redu/okej/prav, glejte/poglejte, veste, mislim, zdaj) can be classified as discourse markers. In the function of discourse marker, these expressions represent almost 14% of all words in the 15,000-word corpus.
Their distinguishing feature is that they contribute little to the representational meaning of an utterance and are used mainly as pragmatic expressions: they help connect discourse, express the speaker's attitude towards the discourse content, maintain the hearer's attention, organize discourse, etc. The structure of a spontaneous spoken utterance can be fuzzy and disfluent. I suggest using the concept of repair to eliminate a special, retrograde part of the utterance, which can disturb further processing because it is cut off. Repair was used in 8% of all utterances in the corpus. Further research on the analyzed phenomena, and on phenomena that are mentioned but not analyzed, such as repetitions, the topic structure of conversation and adjacency pairs, could continue the present work. From a linguistic perspective, this work examines language use in a domain (spontaneous telephone conversations) and from perspectives (conversation structure, discourse markers, repair) that are all more or less new in Slovenian linguistics.
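
As an illustration only, the following Python sketch shows how candidate discourse markers might be flagged in a tokenized utterance and how their share of all tokens could be computed. It is not the tagging procedure used in the dissertation. The marker list is copied from the expressions named in the abstract; the function names, the tokenization (including keeping "ne?" as a single token) and the example utterance are assumptions made for this sketch. A pure lexical lookup also overcounts, because the abstract counts these expressions only when they function as discourse markers, which requires a contextual decision.

```python
# Candidate expressions taken from the abstract; "v redu" is a two-token marker.
# Assumption: tokens are plain strings and "ne?" is kept as a single token.
DISCOURSE_MARKERS = {
    ("ja",), ("mhm",), ("aha",), ("aja",), ("ne?",), ("no",), ("eee",),
    ("dobro",), ("v", "redu"), ("okej",), ("prav",),
    ("glejte",), ("poglejte",), ("veste",), ("mislim",), ("zdaj",),
}
MAX_MARKER_LEN = max(len(marker) for marker in DISCOURSE_MARKERS)


def tag_markers(tokens):
    """Return (token, is_candidate_marker) pairs using greedy longest match."""
    tagged = []
    i = 0
    while i < len(tokens):
        for n in range(MAX_MARKER_LEN, 0, -1):
            window = tokens[i:i + n]
            candidate = tuple(t.lower() for t in window)
            if len(candidate) == n and candidate in DISCOURSE_MARKERS:
                tagged.extend((t, True) for t in window)
                i += n
                break
        else:
            # No marker starts at this position.
            tagged.append((tokens[i], False))
            i += 1
    return tagged


def marker_share(tagged):
    """Fraction of tokens flagged as candidate discourse markers."""
    if not tagged:
        return 0.0
    return sum(1 for _, is_marker in tagged if is_marker) / len(tagged)


# Hypothetical utterance, not taken from the Turdis-1 corpus.
example = ["ja", "v", "redu", "imamo", "sobo"]
tagged_example = tag_markers(example)
share = marker_share(tagged_example)
```

The lookup only produces candidates; a real annotation would still need a human or a trained classifier to decide whether each occurrence actually functions as a discourse marker.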

Type of material - dissertation ; adult, serious

Publication and manufacture - Ljubljana : [D. Verdonik], 2006

Language - Slovenian

COBISS.SI-ID - 227346944

Author
Verdonik, Darinka

Other authors
Stabej, Marko | Kačič, Zdravko

Holdings

Call number – location, accession no. ...	Copy status	Reservation
Skladišče II 0000065124/k Skladišče II 65124/k	available - reading room
Skladišče CD 0000005980/cd Skladišče CD 5980/cd	available - reading room

Links to authors' personal bibliographies	Links to information on researchers in the SICRIS system
Verdonik, Darinka	23838
Stabej, Marko	11651
Kačič, Zdravko	06821

Source: Personal bibliographies and: SICRIS
