COMPUTATIONAL LINGUISTICS
ERCIM News No.26 - July 1996 - GMD

Twelve Languages - one Translation Method


by Maria Theresia Rolland

ERCIM News No. 22 of July 1995 (page 21) described a new method called logotechnique (word processing). This method enables fully automated language processing, both for a query system in a single language and for machine translation. In the following we illustrate machine translation according to this method by explaining how one sentence could be treated in all twelve languages of the thirteen ERCIM institutions. The fundamentals of the method are described in simplified form and the translation specifications are given. Finally, the translations are presented.

Initially, the structure of each language itself must be identified [details: Rolland: Sprachverarbeitung durch Logotechnik. Bonn: Dümmler 1994]. In this way we obtain the relational structure of the language, i.e. the connection between the initial word and the dependent words in their specific relationships (e.g.: to buy -> what? -> computers, not: *to buy -> what? -> earthquakes). All relationships of the initial word together make up its construction plan. Each word has such a construction plan. The plan thus comprises the initial word and the relationships involved, including the specific dependent words in their correct inflections. The content of a word consists of the special content and the general content, differentiated into inflection and construction. The dependent words are grouped into classes of meaning according to their semantic similarity (e.g. furniture: table, chair, ...; device: computer, printer, ...). Within the meaning classes each word is marked with its possible relationships; each relationship has a reference to the respective inflection. Therefore, construction plans, meaning classes and inflection groups have to be determined only once. Each sentence is a selection from the possibilities given by the construction plan, including the references. All construction plans taken together are the explicit relational structure of the language. The construction plans labelled for natural language processing represent the relation base.
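The idea of a construction plan can be illustrated with a small Python sketch. It is only an illustration under simplifying assumptions, not the data model of the logotechnique itself: the names ConstructionPlan, Relationship, MEANING_CLASSES and accepts are hypothetical, words are stored in their base form, the class "natural_event" is invented for the counter-example, and the inflection reference is reduced to a plain label. Here each relationship lists the meaning classes it admits, which is one possible way of recording which words may fill it.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical meaning classes: words grouped by semantic similarity.
# "furniture" and "device" follow the examples in the text;
# "natural_event" is an assumption added for the counter-example.
MEANING_CLASSES = {
    "furniture": {"table", "chair"},
    "device": {"computer", "printer"},
    "natural_event": {"earthquake"},
}


@dataclass
class Relationship:
    """One relationship of an initial word, identified by its question."""
    question: str               # e.g. "what?"
    allowed_classes: set[str]   # meaning classes whose words may fill it
    inflection: str             # label standing in for an inflection group


@dataclass
class ConstructionPlan:
    """The initial word together with all of its relationships."""
    initial_word: str
    relationships: dict[str, Relationship] = field(default_factory=dict)

    def add(self, rel: Relationship) -> None:
        self.relationships[rel.question] = rel

    def accepts(self, question: str, word: str) -> bool:
        """True if `word` belongs to a meaning class admitted by `question`."""
        rel = self.relationships.get(question)
        if rel is None:
            return False
        return any(word in MEANING_CLASSES.get(cls, set())
                   for cls in rel.allowed_classes)


# Simplified construction plan for "to buy".
buy = ConstructionPlan("to buy")
buy.add(Relationship("what?", {"furniture", "device"}, "object form"))

print(buy.accepts("what?", "computer"))    # to buy -> what? -> computer
print(buy.accepts("what?", "earthquake"))  # *to buy -> what? -> earthquake
```

In this sketch the meaning classes and the plan are defined once and then reused for every sentence, which mirrors the point that construction plans, meaning classes and inflection groups have to be determined only once.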

Additionally, it is necessary to consider specific features of the language. For example, a relationship may be realized not only by a single word but also by a subordinate clause, with or without an introductory conjunction or the like, but always with a predicate which has its own construction plan (e.g. to walk on -> in spite of what concession? -> although -> to be exhausted: They walked on although they were exhausted). Predicates include verbs, auxiliary verbs, modal verbs etc. Articles and possessive pronouns are dependent on the corresponding noun [R.e. pp. 83; 203]. In addition, each language has particular characteristics to be handled by rules. Rules also govern the word order. When establishing a translation system it is necessary to create the links between the relation bases. Equivalents may be, e.g.: a) a single word - a subordinate clause; b) active form - passive form; c) reflexive verb - phrase; d) article + noun - noun only, etc.

If we assume that the relation bases of the twelve languages of the ERCIM institutions were explicitly available and had been linked by translation experts, we could have the following twelve language units: German - English; English - Finnish; Finnish - French; French - Italian; Italian - Greek; Greek - Dutch; Dutch - Norwegian; Norwegian - Portuguese; Portuguese - Swedish; Swedish - Spanish; Spanish - Hungarian; Hungarian - German. The sentence to be translated reads as follows: "Mag sich auch die ganze Welt gegen die Wahrheit rüsten, so wird man doch ihren Sieg nicht verhindern" (Let the whole world rise up in arms against the truth, still its victory cannot be prevented). The construction plans of the verbs verhindern (to prevent) and sich rüsten (to rise up in arms) would be included in the linked relation bases in all of the above languages. It is then possible to identify the structure of the sentence according to the relation base in one language and look up the linked units in the other language.

If a translation from German into Italian is needed, but equivalents exist only in the sequence German - English - Finnish - French - Italian, a computer system can identify the corresponding sentence structure, beginning with German and continuing through English, Finnish, French and Italian. The system needs only the Italian word order rules, which it applies in the final step to present the correct Italian translation. Thus any of the languages can serve as source or target language. The twelve sentences are presented in tables available in rtf and ps format.
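The chaining of language units described above can be sketched as a search for a route through the linked pairs. The following Python fragment is a minimal illustration, not part of the described system: the names LANGUAGE_UNITS, build_links and find_route are hypothetical, and it assumes that every link between two relation bases can be followed in both directions. It finds only the sequence of languages; the lookup of linked sentence structures and the word order rules of the target language are not modelled.

```python
from collections import deque

# The twelve language units named in the text, in the order given there.
LANGUAGE_UNITS = [
    ("German", "English"), ("English", "Finnish"), ("Finnish", "French"),
    ("French", "Italian"), ("Italian", "Greek"), ("Greek", "Dutch"),
    ("Dutch", "Norwegian"), ("Norwegian", "Portuguese"),
    ("Portuguese", "Swedish"), ("Swedish", "Spanish"),
    ("Spanish", "Hungarian"), ("Hungarian", "German"),
]


def build_links(units):
    """Map each language to the languages it is directly linked with.

    Assumption: a link can be used in either direction.
    """
    links = {}
    for first, second in units:
        links.setdefault(first, []).append(second)
        links.setdefault(second, []).append(first)
    return links


def find_route(source, target, units=LANGUAGE_UNITS):
    """Return the shortest chain of languages from source to target.

    Uses breadth-first search; returns None if either language is
    unknown or no chain of linked units connects them.
    """
    links = build_links(units)
    if source not in links or target not in links:
        return None
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for neighbour in links[path[-1]]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


print(" - ".join(find_route("German", "Italian")))
# German - English - Finnish - French - Italian
```

Because the twelve units form a closed ring, every language can be reached from every other one under this assumption, which corresponds to the statement that any of the languages can be the source or the target.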


Please contact:
Maria Theresia Rolland - GMD
Tel: +49 2241 14 2087
E-mail: rolland@gmd.de

