ru-RU: Антон Белый / zh-CN: 安东·贝里 / IPA: [ɐn̪’t̪o̞n̪ ‘bʲeɫɨç] / Scholar / Github / LinkedIn

natural language processing / machine learning / algorithms

Hi, I’m Anton!

I am a second-year PhD student at the Center for Language and Speech Processing (CLSP) at Johns Hopkins University. I am fortunate to be advised by Benjamin Van Durme and Vladimir Braverman.

I develop efficient algorithms for large-scale information retrieval and structured, semantically aware text generation. I also design interfaces that allow humans and machines to collaborate on generating structured objects such as scripts or schemas.

My background is in Algorithms, Machine Learning and Software Development: prior to my PhD, I interned at JetBrains, worked for a year as a Backend Developer at VK and for two years as a Senior Data Scientist at Tochka Bank. I received a B.Sc. in Applied Maths and Computer Science from ITMO University, where I collaborated with Andrey Filchenkov and Konstantin Vorontsov on building exploratory search engines and plagiarism detection tools.

My papers

Script Induction As Association Rule Mining.
Anton Belyy and Benjamin Van Durme. ACL Workshop on Narrative Understanding, Storylines, and Events 2020.
[paper] [slides] [code]
Improved Evaluation Framework for Complex Plagiarism Detection.
Anton Belyy, Marina Dubova, and Dmitry Nekrasov. ACL 2018.
[paper] [poster] [code]
Framework for Russian Plagiarism Detection Using Sentence Embedding Similarity and Negative Sampling.
Anton Belyy and Marina Dubova. Dialogue 2018.
[paper] [slides] [code]
Quality Evaluation and Improvement for Hierarchical Topic Modeling.
Anton Belyy, Mariia Seleznova, Alexey Sholokhov, and Konstantin Vorontsov. Dialogue 2018.
[paper] [slides] [demo]

Fun projects

My web-dev projects, written on the rare occasions when I have spare time, mostly to play with a fancy new web framework :)

SchemaBlocks (2019 – present)
A web-based UI for “programming” KAIROS schemas, specifying events, relations, and entity types. Based on Google Blockly.
[code] [demo]
Rysearch (2017 – 2019)
An exploratory search engine that uses topic models to organize popular science literature into a hierarchical map and perform inexact document queries over this map. Based on FoamTree and hARTM.
[code] [demo]
Esenin (2013)
A karaoke music player featuring songs based on the lyrics of the great 20th-century Russian poet Sergey Esenin. Written overnight from scratch in CoffeeScript.
[code] [demo]
Lonelord (2013)
An online strategy game where you can build castles, mine resources, wage wars and trade with neighbors… by sending MongoDB database queries! Inspired by mysqlgame.
[code] [blog]

Page design by Ankit Sultana