ShiftRy is a web service for analyzing diachronic changes in the usage of words in news texts from Russian mass media. To do so, we employ diachronic word embedding models trained on large Russian news corpora from 2010 to 2020. The service is named after the Pokémon Shiftry, which was inspired by the Japanese Tengu.
You can explore the history of semantic shifts for any query word, or browse lists of words ranked by the degree of their semantic drift between any pair of years. Visualizations of the words' trajectories through time are provided, and users can obtain corpus examples of the query word before and after the semantic shift.
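To illustrate how words could be ranked by semantic drift between two years, here is a minimal sketch. It assumes that each year's model is available as a mapping from words to vectors in a shared space, and that drift is scored by cosine distance. Both are assumptions made for illustration: the exact measure used by ShiftRy is not stated here, and the names `cosine_distance`, `rank_by_drift`, `emb_2010` and `emb_2020` are hypothetical, as are the toy vectors.

```python
import numpy as np


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two vectors."""
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def rank_by_drift(emb_a, emb_b):
    """Rank words present in both years by cosine distance, largest first.

    emb_a, emb_b: dicts mapping word -> vector, assumed to be already
    aligned to a common space (hypothetical input format).
    """
    shared = set(emb_a) & set(emb_b)
    scores = {w: cosine_distance(emb_a[w], emb_b[w]) for w in shared}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy two-dimensional vectors, invented purely for illustration.
emb_2010 = {
    "w1": np.array([1.0, 0.0]),
    "w2": np.array([0.0, 1.0]),
    "w3": np.array([1.0, 1.0]),
}
emb_2020 = {
    "w1": np.array([0.0, 1.0]),  # direction changed completely
    "w2": np.array([0.0, 2.0]),  # same direction, different length
    "w3": np.array([1.0, 0.0]),  # moderate change
}

ranking = rank_by_drift(emb_2010, emb_2020)
for word, score in ranking:
    print(word, round(score, 3))
```

Cosine distance ignores vector length, so "w2" gets a drift score of zero here even though its norm doubled.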
We plan to update ShiftRy yearly.
Fontanka.ru (until 2020)
Gazeta.ru (until 2020)
Interfax (until 2020)
Izvestia (until 2020)
The word embedding models used on this site were trained on news texts published by Russian media between 2010 and 2020. The full 11-year corpus contains about 185 million word tokens, with yearly sub-corpus sizes varying from 9 million (2014) to 29 million (2020) word tokens. All the models were aligned using the Procrustes transformation, so their vectors are directly comparable across years.
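The alignment step can be sketched as follows. This is a minimal illustration of orthogonal Procrustes alignment with NumPy, not the actual ShiftRy pipeline: it assumes that two yearly models have been reduced to matrices whose rows correspond to the same shared vocabulary in the same order. The function name `procrustes_align` and the toy data are hypothetical.

```python
import numpy as np


def procrustes_align(base, other):
    """Rotate `other` onto `base` with an orthogonal transformation.

    base, other: arrays of shape (n_words, dim); row i in both matrices
    is assumed to be the vector of the same word.
    Returns the rotated copy of `other` and the rotation matrix R that
    minimizes the Frobenius norm of (other @ R - base).
    """
    u, _, vt = np.linalg.svd(other.T @ base)
    rotation = u @ vt
    return other @ rotation, rotation


# Toy check: build `other` as a rotated copy of `base`, then recover it.
rng = np.random.default_rng(0)
base = rng.normal(size=(5, 3))
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random orthogonal matrix
other = base @ q.T

aligned, rotation = procrustes_align(base, other)
print(np.allclose(aligned, base))
```

Because the transformation is orthogonal, it preserves distances and angles within each yearly model and only changes the coordinate system, which is what makes vectors from different years comparable.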
Download diachronic embeddings:
ShiftRy was created as part of a Master's program in Computational Linguistics at the Higher School of Economics. Project members:
Andrey Kutuzov (University of Oslo)
Vadim Fomin (Higher School of Economics)
Vladislav Mikhailov (Higher School of Economics, Sberbank)
Julia Rodina (Higher School of Economics)
Our papers on tracing semantic changes: