List of nouns in the Russian language. Frequency dictionary of the Russian language

The meaning of the word NOUN in the Ozhegov Russian Dictionary

NOUN

Noun: in grammar, a part of speech that denotes an object and expresses the meaning of objecthood in the forms of gender, number and case. Usage examples: declension of nouns; concrete and abstract nouns; proper and common nouns; animate and inanimate nouns.

Ozhegov. Ozhegov's Dictionary of the Russian Language. 2012

See also interpretations, synonyms, meanings of words and what a NOUN is in Russian in dictionaries, encyclopedias and reference books:

  • NOUN in big encyclopedic dictionary:
  • NOUN
    part of speech, a class of full-valued words (lexemes), which includes the names of objects and animate beings and can appear in a sentence...
  • NOUN in the Modern Encyclopedic Dictionary:
  • NOUN in the Encyclopedic Dictionary:
    part of speech denoting objects (things, substances, people, animals), properties abstracted from their bearer (“kindness”), actions and states abstracted from...
  • NOUN in the Encyclopedic Dictionary:
    -ogo, neut., or a noun - in grammar: a part of speech that denotes an object and expresses the meaning of the object in forms of gender, ...
  • NOUN in the Big Russian Encyclopedic Dictionary:
    NOUN, a class of full-meaning words (part of speech), which includes the names of objects and animate beings and can appear in a sentence...
  • NOUN
    - a class of full-meaning words (part of speech), which includes the names of objects and animate beings and can appear in a sentence according to ...
  • NOUN
    see noun...
  • NOUN in the dictionary of Synonyms of the Russian language.
  • NOUN in the New Explanatory Dictionary of the Russian Language by Efremova:
  • NOUN in the Complete Spelling Dictionary of the Russian Language:
    noun, …
  • NOUN in the Spelling Dictionary:
    noun...
  • NOUN in Modern explanatory dictionary, TSB:
    a class of full-meaning words (part of speech), which includes the names of objects and animate beings and can appear in a sentence in ...
  • NOUN in Efremova's Explanatory Dictionary:
    noun, neut. A part of speech that denotes an object and usually varies by case and number; noun (in...
  • NOUN in the New Dictionary of the Russian Language by Efremova:
    neut. A part of speech that denotes an object and usually varies by case and number; noun (in...
  • NOUN in the Large Modern Explanatory Dictionary of the Russian Language:
    neut. A part of speech that denotes an object and usually varies by case and number; noun (in linguistics) ...
  • ENGLISH LANGUAGE in the Literary Encyclopedia:
    a mixed language. By origin it belongs to the western branch of the Germanic group of languages (see). It is customary to divide the history of the English language into the …
  • PARTS OF SPEECH in the Great Soviet Encyclopedia (TSB):
    speech, the main classes of words of a language, distinguished on the basis of the similarity of their syntactic (see Syntax), morphological (see Morphology) and logical-semantic (see ...
  • PARTS OF SPEECH in the Linguistic Encyclopedic Dictionary:
    - classes of words in a language, distinguished on the basis of the commonality of their syntactic (see Syntax), morphological (see Morphology) and semantic (see Semantics) properties. ...
  • WORD ORDER in the Linguistic Encyclopedic Dictionary:
    - a certain arrangement of words in a sentence or syntactic group. Structural types of word order differ in the following oppositions: progressive, or consistent...
  • PERIPHRASE in the Linguistic Encyclopedic Dictionary:
    (periphrasis) (from Greek, periphrasis - descriptive expression, allegory) - a stylistic device consisting in an indirect, descriptive designation of objects and phenomena of reality...
  • NAME CLASSES in the Linguistic Encyclopedic Dictionary:
    - a lexical-grammatical category of a noun, consisting in the distribution of names into groups (classes) in accordance with ancient semantic features with the obligatory formal...
  • BALTIC LANGUAGES in the Linguistic Encyclopedic Dictionary:
    - a group of Indo-European languages. The Baltic languages preserve the ancient Indo-European language system more completely than other modern groups of the Indo-European family of languages. There is a point...
  • ADAMUA-EAST LANGUAGES in the Linguistic Encyclopedic Dictionary.
  • PARTS OF SPEECH in the Dictionary of Linguistic Terms:
    The main lexical and grammatical categories into which words of a language are distributed based on the following characteristics: a) semantic (generalized meaning of an object, action or state, quality...
  • AGREEMENT ON MEANING in the Dictionary of Linguistic Terms:
    The choice of the number form or gender of the predicate is based not on grammatical similarity to the form of expression of the subject, but on the semantic relationship between both...
  • CASE in the Dictionary of Linguistic Terms:
    1 (case category). A grammatical category of a noun that expresses the relationship of the object it denotes to other objects, actions, and characteristics. Dying out in the Romance...
  • CHARACTER in the Popular Explanatory Encyclopedic Dictionary of the Russian Language:
    -a, masc. A character in a work of fiction or drama, or in a genre painting. Comic character. Negative character. Chekhov's characters. Characters of Russian folk...

Second version of the frequency list

On this page you can get lists of the most common words in the Russian language. Until now, the Frequency Dictionary of the Russian Language edited by L.N. Zasorina (1977) has most often been used as the source of information about the frequency of Russian words. However, the corpus from which word frequencies in that dictionary were calculated is very small by modern standards (about a million words). In addition, the list is significantly outdated: it reflects word usage from the 1920s to the 1960s. The corpus also includes a large number of ideological sources, for example the works of Lenin and Kalinin, the materials of the 22nd and 23rd Congresses of the CPSU, and Soviet newspapers. As a result, the words Soviet and comrade fall within the first hundred Russian words, alongside function words (they occur more often than the words where, here, your), and the words party, revolution, communist occur more often than back, around, better, etc. Finally, the word list from Zasorina's dictionary does not exist in electronic form.

The list of words available from this page contains approximately 35,000 words with a frequency greater than 1 ipm (instances per million words). There is also a shorter list of the 5000 most common Russian words. The lists are encoded in UTF-8 (Cyrillic) and compressed with the WinZip utility (Linux or Mac users can use StuffIt to unzip).

The structure of the lists follows the format of the lemmatized lists from the British National Corpus (BNC) created by Adam Kilgarriff, as follows:
ordinal number, frequency (ipm), lemma, part of speech (BNC classification).
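
A minimal sketch of reading records in this four-field layout. The whitespace delimiter, the `FreqEntry` name, and the sample rows (including their frequency values and tags) are assumptions made for illustration; the actual files may use a different separator or tag set.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class FreqEntry:
    rank: int     # ordinal number
    ipm: float    # frequency in instances per million words
    lemma: str
    pos: str      # part-of-speech tag


def parse_freq_lines(lines: Iterable[str]) -> List[FreqEntry]:
    """Parse 'rank ipm lemma pos' records, skipping blank or malformed lines."""
    entries = []
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue
        rank, ipm, lemma, pos = parts
        try:
            entries.append(FreqEntry(int(rank), float(ipm), lemma, pos))
        except ValueError:
            continue
    return entries


# Illustrative rows only: the numbers and tags are invented, not taken from the list.
sample = ["1 1000.0 и conj", "2 900.5 в prep", "", "not a record"]
entries = parse_freq_lines(sample)
```

Malformed lines are dropped silently here; a stricter reader might prefer to raise an error instead.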

Words with frequency greater than 1 ipm

  • word forms sorted by frequency

List of 5000 most common words

  • lemmas sorted in alphabetical order
  • lemmas sorted by frequency

Some statistics on the use of Russian words

  • The average word length is 5.28 characters.
  • The average sentence length is 10.38 words.
  • The 1000 most frequent lemmas cover 64.0708% of the text.
  • The 2000 most frequent lemmas cover 71.9521% of the text.
  • The 3000 most frequent lemmas cover 76.5104% of the text.
  • The 5000 most frequent lemmas cover 82.0604% of the text.
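
Coverage figures of this kind can be computed as the cumulative share of the top-ranked lemmas. The sketch below assumes the total is the sum of the supplied frequencies; the `coverage` function and the toy frequency profile are hypothetical and do not reproduce the percentages above.

```python
from typing import Sequence


def coverage(frequencies: Sequence[float], top_n: int) -> float:
    """Percentage of the total frequency mass held by the top_n most frequent lemmas."""
    ordered = sorted(frequencies, reverse=True)
    total = sum(ordered)
    if total == 0:
        return 0.0
    return 100.0 * sum(ordered[:top_n]) / total


# Toy profile (invented): frequencies fall off as 1/rank.
toy = [1000.0 / rank for rank in range(1, 5001)]
top_1000 = coverage(toy, 1000)
top_5000 = coverage(toy, 5000)
```

With a real list, the total should be the size of the whole corpus rather than the sum over the listed lemmas, otherwise the coverage of the full list is trivially 100%.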

More complete information on the correspondence between word frequency and corpus coverage is also available.

The list is based on a representative corpus of the modern Russian language. It includes a selection of contemporary prose, political memoirs, contemporary newspapers and non-fiction (about 40 million words, with prose making up a little more than half of the volume). All texts in the corpus were written in Russian between 1970 and 2002, the majority between 1980 and 1995; the newspaper corpus dates from 1997-1999 (the corpus is based on texts from the Moshkov Library and the corpus of modern journalism by A.V. Baranov).

It is well known that large texts pose a problem for compiling frequency lists, since a relatively long text may contain many occurrences of some rare word, which significantly inflates its frequency in the final list. For example, the corpus used to compile this list contains a variation on the theme of Tolkien's "Lord of the Rings" (by Nick Perumov). Although this novel is 250 thousand words long, less than one percent of the entire corpus, the frequency of the word hobbit in it would place that word among the first thousand Russian words if frequency were counted across all texts without any restriction on their length. For this reason, the frequency lists were compiled on the condition that the sample from any large text is limited to 10 thousand words and the sample from texts by one author to less than 100 thousand words. As a result, the subset of the full corpus used in the frequency calculations is approximately 16 million words.
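
The two sampling limits can be sketched as follows. This is an illustration under assumptions, not the procedure actually used: it takes the first tokens of each text as the sample, treats the author limit as an inclusive cap, and the function name and input layout (pairs of author and token list) are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

MAX_PER_TEXT = 10_000      # cap on the sample taken from one text
MAX_PER_AUTHOR = 100_000   # cap on the total sample from one author


def capped_counts(texts: Iterable[Tuple[str, List[str]]],
                  max_per_text: int = MAX_PER_TEXT,
                  max_per_author: int = MAX_PER_AUTHOR) -> Counter:
    """Count words after truncating each text and each author's total contribution."""
    counts: Counter = Counter()
    used_by_author: Dict[str, int] = defaultdict(int)
    for author, tokens in texts:
        budget = min(max_per_text, max_per_author - used_by_author[author])
        if budget <= 0:
            continue
        sample = tokens[:budget]   # assumption: sample the beginning of the text
        counts.update(sample)
        used_by_author[author] += len(sample)
    return counts


# Tiny demonstration with much smaller caps than the real ones.
demo = capped_counts(
    [("a", ["hobbit"] * 50), ("a", ["ring"] * 50), ("b", ["the"] * 5)],
    max_per_text=10,
    max_per_author=15,
)
```

A random sample from each text would avoid bias toward openings, at the cost of reproducibility unless the random seed is fixed.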

The distribution of words across texts is far from uniform. Some words (for example, prepositions) appear in many texts with quite predictable frequency. The frequency of others (for example, pronouns or mental verbs) depends significantly on the author or genre of the text. Many words are also “contagious”: if such a word (for example, a proper name, a designation of a person by rank or position, or a technical term) occurs once in a text, it is very likely to be repeated there many more times, which significantly increases its frequency in that document. There are different ways of measuring such variation (Church, K. and Gale, W. (1995) Poisson Mixtures, Journal of Natural Language Engineering, 1:2). The simplest way to assess the behavior of a word is to calculate its coefficient of variation: the standard deviation divided by the mean. The standard deviation gives the absolute amount of variation in a data set (it increases for words with a higher mean frequency), while the coefficient of variation makes it possible to compare the distributions of words with unequal mean frequencies. The deviation values for the 5000 most frequent words can be viewed in a file with the following structure:
lemma, average frequency (ipm), number of texts in which this word occurs, standard deviation of frequency for all texts, coefficient of variation, dispersion.
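
As a small worked illustration of the coefficient of variation, the sketch below compares two invented per-text frequency profiles with the same mean. It assumes the population standard deviation and assumes that texts where the word is absent count as zero; the source does not say which conventions were used.

```python
from statistics import mean, pstdev
from typing import Sequence


def coefficient_of_variation(freqs: Sequence[float]) -> float:
    """Standard deviation divided by the mean (0.0 when the mean is zero)."""
    m = mean(freqs)
    if m == 0:
        return 0.0
    return pstdev(freqs) / m


# Invented per-text frequencies (ipm); both profiles have a mean of 100.
steady = [98.0, 100.0, 102.0, 100.0]   # behaves like a function word
bursty = [0.0, 0.0, 400.0, 0.0]        # behaves like a "contagious" word

cv_steady = coefficient_of_variation(steady)
cv_bursty = coefficient_of_variation(bursty)
```

The two words have the same average frequency, but the second one has a far larger coefficient of variation because all of its occurrences sit in a single text.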

The corpus, tools for working with it, as well as the parallel English-Russian corpus (sentence-based alignment) are described, in particular, in the following publication by the author:

Sharoff, Serge, (2002). Meaning as use: exploitation of aligned corpora for the contrastive study of lexical semantics. Proc. of Language Resources and Evaluation Conference (LREC02). May, 2002, Las Palmas, Spain.

There are also separate frequency lists for the following classes of words:

The creation of the corpus and the development of the associated software and frequency lists were supported by a grant awarded to the author by the Humboldt Foundation, Germany. Lemmatization of the word forms in the corpus was carried out with the Dialing morphological analyzer. Since many word forms are ambiguous (for example, the forms glossed here as dear, were, steel, for, three, already), the frequency of some words is not entirely reliable. For example, for was treated as a verb only if it was not followed by a noun, adjective or pronoun; become was always treated as a noun; and for spouses the lemma spouse was always chosen when both spouse and spouses (plural) were possible. The criteria for choosing a lemma were:

  1. frequency of the corresponding lemma (reading took or I'll give as a noun is extremely unlikely, so in these cases the verb is chosen);
  2. comparative frequency of the particular form (both lemmas for become are quite frequent, but the noun, unlike the verb, is very often used in this form; the form it's time is counted in its predicative use, while the noun appears in all its other forms).
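
One way to combine the two criteria is sketched below. The scoring rule (lemma frequency multiplied by the share of that lemma's uses that take the given form) is an assumption chosen for illustration, not the rule actually applied to the corpus; all names and numbers are hypothetical placeholders.

```python
from typing import Dict, List, Optional, Tuple

Reading = Tuple[str, str]  # (lemma, part of speech)


def choose_reading(form: str,
                   readings: List[Reading],
                   lemma_ipm: Dict[Reading, float],
                   form_share: Optional[Dict[Tuple[str, Reading], float]] = None) -> Reading:
    """Pick one reading for an ambiguous word form.

    Score = lemma frequency * share of the lemma's uses that take this form.
    When no share is known, 1.0 is used, so only lemma frequency decides.
    """
    shares = form_share or {}

    def score(reading: Reading) -> float:
        return lemma_ipm.get(reading, 0.0) * shares.get((form, reading), 1.0)

    return max(readings, key=score)


# Placeholder data: two readings of one ambiguous form, with invented frequencies.
verb, noun = ("lemma_v", "verb"), ("lemma_n", "noun")
ipm = {verb: 500.0, noun: 400.0}
share = {("FORM", verb): 0.05, ("FORM", noun): 0.6}

by_lemma_only = choose_reading("FORM", [verb, noun], ipm)
with_form_share = choose_reading("FORM", [verb, noun], ipm, share)
```

With lemma frequency alone the verb wins; once the form-specific share is taken into account the noun wins, which mirrors the second criterion.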
As in Zasorina's dictionary, surnames, first names and patronymics were filtered out of the lemmatized frequency lists, but geographical names were kept, because it is difficult to justify why Zasorina's dictionary keeps the adjectives Moscow or American but not the nouns Moscow and America. The frequency list of word forms was not filtered.

A noun is an independent, notional part of speech that combines words which

1) have a generalized meaning of objecthood and answer the questions who? or what?;

2) are proper or common, animate or inanimate, and have a constant feature of gender and non-constant (for most nouns) features of number and case;

3) most often act in a sentence as subjects or objects, but can serve as any other part of the sentence.

The noun is a part of speech in whose identification the grammatical features of words come to the fore. As for meaning, the noun is the only part of speech that can denote anything: an object (table), a person (boy), an animal (cow), a property (depth), an abstract concept (conscience), an action (singing), a relation (equality). In terms of meaning, these words are united by the fact that the question who? or what? can be asked of them; this, in fact, is what their objecthood consists in.

Common nouns designate objects without distinguishing them from the class of the same type (city, river, girl, newspaper).

Proper nouns designate objects by distinguishing them from the class of homogeneous objects and individualizing them (Moscow, Volga, Masha, Izvestia). Proper nouns must be distinguished from proper names: multi-word names of individualized objects (“Evening Moscow”). A proper name does not necessarily contain a proper noun (Moscow State University).

Animate and inanimate nouns

Nouns have the constant morphological feature of animacy.

The animacy of nouns is closely related to the concept of living / non-living. Nevertheless, animacy is not a category of meaning but a morphological feature proper.

As a morphological feature, animacy also has formal means of expression. Firstly, animacy / inanimacy is expressed by the endings of the noun itself:

1) animate nouns have identical endings in the accusative (V. p.) and genitive (R. p.) plural, and for masculine nouns this also holds in the singular;

2) inanimate nouns have identical endings in the accusative (V. p.) and nominative (I. p.) plural, and for masculine nouns this also holds in the singular.
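
The formal test can be sketched as a comparison of three plural forms. The function name is hypothetical, and the forms are supplied by the caller rather than generated; the example words кукла (doll) and стол (table) are used only for illustration.

```python
def animacy_from_forms(nom_pl: str, gen_pl: str, acc_pl: str) -> str:
    """Classify a noun by which plural form its accusative plural coincides with."""
    if acc_pl == gen_pl and acc_pl != nom_pl:
        return "animate"
    if acc_pl == nom_pl and acc_pl != gen_pl:
        return "inanimate"
    # For example, indeclinable nouns, where all three forms coincide.
    return "undetermined"


doll = animacy_from_forms("куклы", "кукол", "кукол")     # (I see) dolls
table = animacy_from_forms("столы", "столов", "столы")   # (I see) tables
coat = animacy_from_forms("пальто", "пальто", "пальто")  # indeclinable
```

Note that the doll comes out as animate: the test reflects grammar, not whether the object is alive.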

The animacy of most nouns reflects a certain state of affairs in extra-linguistic reality: animate nouns mainly name living beings, and inanimate nouns name non-living objects, but there are exceptions to this pattern:


Fluctuation in animacy

An object cannot be both living and non-living, yet grammatical animacy does not always match:

Living but grammatically inanimate:

1) collections of living beings: (I see) armies, crowds, peoples;

2) plants and mushrooms: (to gather) chanterelles.

Non-living but grammatically animate:

1) toys in human form: (I see) dolls, nesting dolls, tumblers;

2) pieces in some games: (to play) kings, queens;

3) the deceased: (I see) dead men, drowned men, but dead body (inanimate);

4) fictional creatures: (I see) mermaids, goblins, brownies.

Nouns have the constant morphological feature of gender and belong to the masculine, feminine or neuter gender.

Masculine, feminine and neuter gender include words with the following compatibility:

Some nouns with the ending -a that denote characteristics or properties of persons have, in the nominative (I. p.), a double gender characterization depending on the sex of the person denoted:

your ignoramus has come (masculine agreement, of a man),

your ignoramus has come (feminine agreement, of a woman).

Such nouns belong to the common gender.

Plural-only nouns (cream, scissors) do not belong to any gender, since in the plural the formal differences between nouns of different genders are not expressed (cf.: desks - tables).

Nouns change according to number and case. Most nouns have singular and plural forms (city - cities, village - villages). However, some nouns have only a singular form (for example, peasantry, asphalt, combustion) or only a plural form (for example, scissors, railings, everyday life, Luzhniki).

Case as a morphological feature of nouns

Nouns change by case, that is, they have the non-constant morphological feature of case.

There are 6 cases in the Russian language: nominative (I. p.), genitive (R. p.), dative (D. p.), accusative (V. p.), instrumental (T. p.), prepositional (P. p.). These case forms are diagnosed in the following contexts:

I. p.: this is who? what?

R. p.: there is no whom? what?

D. p.: glad for whom? what?

V. p.: I see whom? what?

T. p.: proud of whom? what?

P. p.: I am thinking about whom? about what?

The endings of different cases are different depending on which declension the noun belongs to.

Declension of nouns

Changing nouns by case is called declension.

The I declension includes masculine and feminine nouns with the nominative singular ending -а (-я), including words ending in -ия: мам-а (mom), пап-а (dad), земл-я (earth), лекци-я (lecture; лекциj-а). Words with a stem ending in a hard consonant (hard variant), in a soft consonant (soft variant) and in -иj differ somewhat in their endings, for example:

Singular forms of страна (country), земля (earth), армия (army):

Case    Hard variant     Soft variant     In -ия
I. p.   стран-а          земл-я           арми-я
R. p.   стран-ы          земл-и           арми-и
D. p.   стран-е          земл-е           арми-и
V. p.   стран-у          земл-ю           арми-ю
T. p.   стран-ой (-ою)   земл-ей (-ёю)    арми-ей (-ею)
P. p.   стран-е          земл-е           арми-и
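
The singular endings of the three I declension variants can be turned into a small lookup. This is a sketch under assumptions: the function and table names are hypothetical, only the main instrumental ending is used, and spelling rules after particular consonants, stress shifts and the letter ё are ignored.

```python
from typing import Dict

CASES = ["I. p.", "R. p.", "D. p.", "V. p.", "T. p.", "P. p."]

# Singular endings of the I declension, one list per variant, in CASES order.
FIRST_DECLENSION_SG = {
    "hard": ["а", "ы", "е", "у", "ой", "е"],
    "soft": ["я", "и", "е", "ю", "ей", "е"],
    "iya":  ["я", "и", "и", "ю", "ей", "и"],
}


def decline_singular(stem: str, variant: str) -> Dict[str, str]:
    """Attach the singular endings of the chosen variant to the stem."""
    endings = FIRST_DECLENSION_SG[variant]
    return {case: stem + ending for case, ending in zip(CASES, endings)}


country = decline_singular("стран", "hard")
earth = decline_singular("земл", "soft")
army = decline_singular("арми", "iya")
```

The only systematic difference of the -ия variant is the ending -и in the dative and prepositional, where the other two variants have -е.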

The II declension includes masculine nouns with a zero ending in the nominative singular, including words ending in -ий, and masculine and neuter nouns ending in -о (-е), including words ending in -ие: стол-∅ (table), гений-∅ (genius), городишк-о (little town), окн-о (window), пол-е (field), пени-е (singing; пениj-е).

The III declension includes feminine nouns with a zero ending in the nominative singular: пыль-∅ (dust), ночь-∅ (night).

In addition to nouns whose endings all come from one of these declensions, there are words that take some endings from one declension and some from another. They are called heteroclitic. These are the 10 words ending in -мя (burden, time, stirrup, tribe, seed, name, flame, banner, udder, crown) and the word path.

The Russian language also has so-called indeclinable nouns. These include many borrowed common and proper nouns (coat, Tokyo) and Russian surnames in -ых, -их, -во (Petrovykh, Dolgikh, Durnovo). They are usually described as words without endings.
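
The classification rules above can be sketched as a function of the initial form and the gender. This is a simplification under assumptions: the function name is hypothetical, the indeclinable set is only a two-word sample, gender must be supplied by the caller, and nouns declined like adjectives are not handled.

```python
# The ten nouns in -мя plus путь (path).
HETEROCLITIC = {
    "бремя", "время", "вымя", "знамя", "имя",
    "пламя", "племя", "семя", "стремя", "темя", "путь",
}

# Sample only; real coverage would need a full word list.
INDECLINABLE_SAMPLE = {"пальто", "токио"}


def declension(initial_form: str, gender: str) -> str:
    """Return 'I', 'II', 'III', 'heteroclitic' or 'indeclinable'.

    gender is 'm', 'f' or 'n'.
    """
    word = initial_form.lower()
    if word in INDECLINABLE_SAMPLE:
        return "indeclinable"
    if word in HETEROCLITIC:
        return "heteroclitic"
    if word.endswith(("а", "я")):
        return "I"        # masculine and feminine nouns in -а (-я)
    if gender == "f":
        return "III"      # feminine nouns with a zero ending
    return "II"           # masculine zero ending; masculine and neuter in -о (-е)


examples = {
    word: declension(word, gender)
    for word, gender in [("страна", "f"), ("папа", "m"), ("стол", "m"),
                         ("окно", "n"), ("ночь", "f"), ("время", "n"),
                         ("путь", "m"), ("пальто", "n")]
}
```

The lookups come before the ending checks because the heteroclitic words in -мя would otherwise be misread as I declension nouns in -я.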

Morphological analysis of a noun

The noun is parsed according to the following plan:

I. Part of speech. General meaning. Initial form (nominative singular).

II. Morphological features:

1. Constant features: a) proper or common noun, b) animate or inanimate, c) gender (masculine, feminine, neuter, common), d) declension.
2. Non-constant features: a) case, b) number.

III. Syntactic role.
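
The three-part plan maps naturally onto a record type. The sketch below is one possible layout, not a standard format: the class name, field names and rendering are hypothetical, and the sample values describe the word dust used as a direct object.

```python
from dataclasses import dataclass


@dataclass
class NounAnalysis:
    word_form: str        # the form as it appears in the sentence
    initial_form: str     # nominative singular
    proper: bool          # constant features
    animate: bool
    gender: str
    declension: str
    case: str             # non-constant features
    number: str
    syntactic_role: str

    def render(self) -> str:
        """Format the analysis in the order of the plan: I, II, III."""
        kind = "proper" if self.proper else "common"
        anim = "animate" if self.animate else "inanimate"
        return "\n".join([
            f"I. {self.word_form}: noun; initial form: {self.initial_form}.",
            f"II. Constant features: {kind}, {anim}, {self.gender}, "
            f"{self.declension} declension; "
            f"non-constant features: {self.number}, {self.case}.",
            f"III. Syntactic role: {self.syntactic_role}.",
        ])


dust = NounAnalysis("dust", "dust", False, False, "feminine", "III",
                    "V. p.", "singular", "object")
report = dust.render()
```

Keeping constant and non-constant features in separate groups of fields follows the plan and makes it harder to omit one of them.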

Sample morphological analysis of a noun

Two ladies ran up to Luzhin and helped him get up; he began to knock the dust off his coat with his palm (according to V. Nabokov).

I. Ladies - noun;

initial form - lady.

II. Constant features: common noun, animate, feminine gender, I declension;

non-constant features: plural, I. p.

III. Ran up (who?) ladies (part of the subject).

I. (To) Luzhin - noun;

initial form - Luzhin;

II. Constant features: proper noun, animate, masculine gender, II declension;

non-constant features: singular, D. p.;

III. Ran up (to whom?) to Luzhin (object).

I. Palm - noun;

initial form - palm;

II. Constant features: common noun, inanimate, feminine gender, III declension;

non-constant features: singular, T. p.;

III. Began to knock off (with what?) with his palm (object).

I. Dust - noun;

initial form - dust;

II. Constant features: common noun, inanimate, feminine gender, III declension;

non-constant features: singular, V. p.;

III. Began to knock off (what?) dust (object).

I. Coat - noun;

initial form - coat;

II. Constant features: common noun, inanimate, neuter gender, indeclinable;

non-constant features: number cannot be determined from the context, R. p.;

III. Began to knock off (from what?) from his coat (object).
