ShiftRy

About ShiftRy

ShiftRy is a web service for analyzing diachronic changes in the usage of words in news texts from Russian mass media. To do this, we employ diachronic word embedding models trained on large Russian news corpora from 2010 to 2020. The service is named after the Pokémon Shiftry, which was inspired by the Japanese tengu.

You can explore the history of semantic shifts for any query word, or browse lists of words ranked by the degree of their semantic drift between any pair of years. The service visualizes each word's trajectory through time and provides corpus examples of the query word before and after the semantic shift.
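As a rough illustration of how words could be ranked by semantic drift between two years, the sketch below scores each shared word by the cosine distance between its vectors in two already aligned yearly models. The text does not state which measure ShiftRy actually uses, so cosine distance, the function names, and the toy vectors are all assumptions made for this example.

```python
import numpy as np


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two vectors."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def rank_by_drift(vectors_a, vectors_b):
    """Rank words shared by two aligned models, most changed first.

    vectors_a and vectors_b map words to NumPy vectors (hypothetical format).
    """
    shared = vectors_a.keys() & vectors_b.keys()
    scores = {w: cosine_distance(vectors_a[w], vectors_b[w]) for w in shared}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy two-dimensional vectors, invented for illustration only.
year_a = {"crown": np.array([1.0, 0.0]), "newspaper": np.array([0.0, 1.0])}
year_b = {"crown": np.array([0.0, 1.0]), "newspaper": np.array([0.0, 1.0])}

ranking = rank_by_drift(year_a, year_b)
```

In this toy data the vector for "crown" rotates by 90 degrees while "newspaper" stays put, so "crown" comes first in `ranking`.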

We plan to update ShiftRy yearly.

ShiftRy talk at "Dialogue-2020"

Sources of news texts

  1. Fontanka.ru (until 2020)

  2. Gazeta.ru (until 2020)

  3. Interfax (until 2020)

  4. Izvestia (until 2020)

  5. Komsomolskaya Pravda

  6. Lenta.ru

  7. Novaya Gazeta

  8. N + 1

  9. RBC

  10. The Village

Diachronic word embeddings

The word embedding models used on this site were trained on news texts published by Russian media between 2010 and 2020. The full 11-year corpus contains about 185 million word tokens; the yearly sub-corpora range from 9 million (2014) to 29 million (2020) word tokens. All models were aligned using the Procrustes transformation, so their vector spaces are directly comparable across years.
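For readers unfamiliar with Procrustes alignment, here is a minimal sketch of the orthogonal variant using NumPy. It assumes the two matrices hold vectors for the same words in the same row order; the function name and the random toy data are hypothetical, and the actual ShiftRy training pipeline may differ in its details.

```python
import numpy as np


def procrustes_align(base, other):
    """Rotate `other` into the vector space of `base`.

    Solves the orthogonal Procrustes problem: find the orthogonal matrix R
    that minimizes the Frobenius norm of (other @ R - base).
    """
    u, _, vt = np.linalg.svd(other.T @ base)
    rotation = u @ vt
    return other @ rotation, rotation


# Toy data: `other` is an exactly rotated copy of `base`.
rng = np.random.default_rng(0)
base = rng.normal(size=(50, 8))
q, _ = np.linalg.qr(rng.normal(size=(8, 8)))  # a random orthogonal matrix
other = base @ q

aligned, rotation = procrustes_align(base, other)
```

Because the rotation is orthogonal, distances and angles inside each yearly model are preserved; only the orientation of the space changes, which is what makes vectors from different years comparable.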

Download diachronic embeddings:


ShiftRy creators

Creating ShiftRy was a part of a Master's program in Computational Linguistics at the Higher School of Economics. Project members:

Our papers on tracing semantic changes: