Bilingual lexicon extraction from comparable corpora for closely related languages [Electronic resource]

Fišer, Darja, 1978- ; Ljubešić, Nikola, 1979-

In this paper we present a knowledge-light approach to extracting a bilingual lexicon for closely related languages from comparable corpora. While in most related work an existing dictionary is used to ... translate context vectors, we instead take advantage of the similarities between the languages: we build a seed lexicon from words that are identical in both languages and then extend it with context-based cognates and translations of the most frequent words. We also use cognates to rerank translation candidates obtained via context similarity, and we extract translation equivalents for all content words, not just nouns as in most related work. The results are very encouraging, suggesting that other similar languages could benefit from the same approach. By enlarging the seed lexicon with cognates and translations of the most frequent words, and by cognate-based reranking of translation candidates, we were able to improve the average baseline precision from 0.592 to 0.797, measured as the mean reciprocal rank over the ten top-ranking translation candidates, for nouns, verbs and adjectives, with 46% recall on a gold standard of 1000 random entries from a traditional dictionary.
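
Two of the steps named in the abstract can be illustrated with a minimal sketch: seeding the lexicon with words that are identical in both languages, and scoring ranked translation candidates by mean reciprocal rank over the top ten. This is not the authors' implementation. The function names, the data structures and the toy words are assumptions made for illustration, and the abstract does not say how ties, frequency thresholds or missing gold entries were handled.

```python
from typing import Dict, Iterable, List, Set


def identical_word_seed(source_vocab: Iterable[str],
                        target_vocab: Iterable[str]) -> Dict[str, str]:
    """Hypothetical helper: map each word found in both vocabularies to itself."""
    shared = set(source_vocab) & set(target_vocab)
    return {word: word for word in shared}


def mean_reciprocal_rank(ranked: Dict[str, List[str]],
                         gold: Dict[str, Set[str]],
                         top_n: int = 10) -> float:
    """Average 1/rank of the first correct candidate within the top_n.

    A source word with no correct candidate in the top_n contributes 0
    (an assumption; the abstract does not spell this out).
    """
    if not ranked:
        return 0.0
    total = 0.0
    for source_word, candidates in ranked.items():
        correct = gold.get(source_word, set())
        for rank, candidate in enumerate(candidates[:top_n], start=1):
            if candidate in correct:
                total += 1.0 / rank
                break
    return total / len(ranked)


# Toy data, invented for illustration only.
seed = identical_word_seed(["voda", "kruh", "dan"], ["voda", "dan", "hiša"])

ranked_candidates = {
    "a": ["x", "y"],  # correct candidate at rank 1
    "b": ["z", "w"],  # correct candidate at rank 2
    "c": ["q"],       # no correct candidate
}
gold_standard = {"a": {"x"}, "b": {"w"}, "c": {"r"}}
mrr = mean_reciprocal_rank(ranked_candidates, gold_standard)
```

With the toy data above, the three source words contribute 1, 1/2 and 0, so the score is 0.5. The figures 0.592 and 0.797 reported in the abstract are values of this kind of measure on the authors' own gold standard.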

Source: Proceedings [Electronic resource] ([7 pp.])

Type of material - conference contribution ; adult, serious

Publish date - 2011

Language - English

COBISS.SI-ID - 46844258

Link(s):
http://lml.bas.bg/~iva/ranlp2011/RANLR2011_Proceedings.PDF


Links to authors' personal bibliographies	Links to information on researchers in the SICRIS system
Fišer, Darja, 1978-	26294
Ljubešić, Nikola, 1979-	36871

Source: Personal bibliographies and: SICRIS
