PARADIGM OF CORPUS TYPOLOGICAL CHARACTERISTICS BY THE TYPE OF TEXT DATA
Keywords:
general corpus, specialized corpus, text data, corpus type, typological characteristics
Abstract
The article analyzes the typological characteristics of text corpora. The author proposes classifying corpora according to several aspects of this modern linguistic notion: the nature of the text data included in the corpus, the design and structural features of the corpus, the method of recording and indexing text data in the corpus, and the ways in which the corpus can be used. Particular attention is paid to classifying corpora by the text data used in compiling them, in particular by the degree of specialization of the text data, its formal nature, and the language parameter. The paradigm of the «degree of specialization of text data» is shown to comprise general and specialized corpora. Within the group of specialized corpora, the type of text data, which gives the corpus its name and serves as a selection parameter, can in turn be determined by the genre, stylistic, temporal, anthropocentric, professional, communicative, geographical, or social nature of linguistic diversity. Examples of these corpus types are presented, together with terminological equivalents of corpus names by type of text data in Ukrainian and English.