Hungarian frequency dictionaries
Könyvtári Figyelő (Library Review) vol. 16. (52.) 2006. no. 1. pp. 45-58.
Automatic abstracting relies on identifying the most important sentences of a text, using statistical methods that build on the results of linguistics. The first step is to compile a word frequency list in order to determine the most significant words of the text. The first Hungarian frequency dictionaries were published between the two world wars. They were used extensively in shorthand writing (e.g. for the language of Hungarian parliamentary debates or of newspaper articles). Frequency dictionaries also came to play an important role in compiling the vocabularies of language-learning textbooks. Several vocabularies of the poetic language of classic authors (Sándor Petőfi, Gyula Juhász, Mihály Csokonai Vitéz, Miklós Zrínyi) and vocabularies of classic Hungarian literary works were published. Other works dealt with children's vocabulary and the language of newspapers, while glossaries of scientific disciplines were published only later. One of the most important ongoing Hungarian dictionary projects is the Szószablya (WordSword) project, carried out at the Budapest University of Technology and Economics. It aims to create a web dictionary based on the Hungarian Webcorpus, a collection of more than 18 million Hungarian web pages. Another large-scale project is the Hungarian National Corpus, carried out at the Research Institute for Linguistics of the Hungarian Academy of Sciences. It is based on a corpus of 150 million words intended to represent written Hungarian.
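The frequency-based approach to abstracting outlined at the start of the abstract can be illustrated with a minimal sketch. It is not the method of any project named above: the stop-word list, the function names and the scoring rule (average frequency of a sentence's content words) are all assumptions chosen for illustration, and real Hungarian text would also need stemming or lemmatisation, which is omitted here.

```python
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "az", "és", "is", "the", "of", "and"}


def tokenize(text):
    """Lowercase the text and return its word tokens (accented letters included)."""
    return re.findall(r"\w+", text.lower())


def word_frequencies(text):
    """Build the word frequency list, ignoring stop words."""
    return Counter(w for w in tokenize(text) if w not in STOP_WORDS)


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def top_sentences(text, n=1):
    """Return the n highest-scoring sentences, kept in their original order."""
    freq = word_frequencies(text)
    sentences = split_sentences(text)

    def score(sentence):
        # Assumed scoring rule: mean frequency of the sentence's content words.
        words = [w for w in tokenize(sentence) if w not in STOP_WORDS]
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    chosen = set(sorted(sentences, key=score, reverse=True)[:n])
    return [s for s in sentences if s in chosen]


sample = ("A szótár szavakat tartalmaz. "
          "A szótár gyakori szavakat rangsorol. "
          "Esik az eső.")
print(word_frequencies(sample).most_common(2))
print(top_sentences(sample, n=1))
```

Averaging over the number of content words keeps long sentences from winning merely because they contain more words; other weightings are equally plausible.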