ru-RU: Антон Белый / zh-CN: 安东·别里 / IPA: [ɐn̪’t̪o̞n̪ ‘bʲeɫɨç]

LinkedIn / GitHub / Scholar

natural language processing / machine learning / algorithms

Hi, I’m Anton!

For the past 7+ years, I have taken pride in designing and building scalable, reliable, and accountable AI systems.

I am currently a Senior Machine Learning Engineer at Stripe, where I build ML and LLM-based models for credit risk detection. Prior to Stripe, I worked at:

I completed my master's degree at the Center for Language and Speech Processing at Johns Hopkins University, and my bachelor's degree at the Computer Technologies Department at ITMO University.

Research papers

MM1: Methods, Analysis and Insights from Multimodal LLM Pre-training. Brandon McKinzie, Zhe Gan, Jean-Philippe Fauconnier, Sam Dodge, Bowen Zhang, Philipp Dufter, Dhruti Shah, Xianzhi Du, Futang Peng, Anton Belyi, Haotian Zhang, Karanjeet Singh, Doug Kang, Hongyu Hè, Max Schwarzer, Tom Gunter, Xiang Kong, Aonan Zhang, Jianyu Wang, Chong Wang, Nan Du, Tao Lei, Sam Wiseman, Mark Lee, Zirui Wang, Ruoming Pang, Peter Grasch, Alexander Toshev, Yinfei Yang. ECCV 2024. [paper]
FLEEK: Factual Error Detection and Correction with Evidence Retrieved from External Knowledge. Farima Fatahi Bayat, Kun Qian, Benjamin Han, Yisi Sang, Anton Belyi, Samira Khorshidi, Fei Wu, Ihab F. Ilyas, Yunyao Li. EMNLP Demo Track 2023. [paper] [demo]
Logical Satisfiability of Counterfactuals for Faithful Explanations in NLI. Suzanna Sia, Anton Belyy, Amjad Almahairi, Madian Khabsa, Luke Zettlemoyer, and Lambert Mathias. AAAI 2023. [paper]
Human Schema Curation via Causal Association Rule Mining. Noah Weber, Anton Belyy, Nils Holzenberger, Rachel Rudinger, and Benjamin Van Durme. LREC Linguistic Annotation Workshop 2022. [paper] [code] [demo] [data]
Guided K-best Selection for Semantic Parsing Annotation. Anton Belyy, Chieh-Yang Huang, Jacob Andreas, Emmanouil Antonios Platanios, Sam Thomson, Richard Shin, Subhro Roy, Aleksandr Nisnevich, Charles Chen, and Benjamin Van Durme. ACL Demo Track 2022. [paper] [poster] [slides] [talk]
InFillmore: Frame-Guided Language Generation with Bidirectional Context. Jiefu Ou, Nathaniel Weir, Anton Belyy, Felix Yu, and Benjamin Van Durme. STARSEM 2021. [paper] [poster] [slides] [talk] [demo]
Script Induction As Association Rule Mining. Anton Belyy and Benjamin Van Durme. ACL Workshop on Narrative Understanding, Storylines, and Events 2020. [paper] [slides] [talk] [code]
Improved Evaluation Framework for Complex Plagiarism Detection. Anton Belyy, Marina Dubova, and Dmitry Nekrasov. ACL 2018. [paper] [poster] [slides] [code] [blog]
Framework for Russian Plagiarism Detection Using Sentence Embedding Similarity and Negative Sampling. Anton Belyy and Marina Dubova. Dialogue 2018. [paper] [slides] [code]
Quality Evaluation and Improvement for Hierarchical Topic Modeling. Anton Belyy, Mariia Seleznova, Alexey Sholokhov, and Konstantin Vorontsov. Dialogue 2018. [paper] [slides] [demo]

Fun projects

My toy projects from high school and undergrad, written (mostly) to have fun and play with a shiny new (at the time) framework or language:

Rysearch (2017 – 2019)
An exploratory search engine that uses interpretable topic models to organize popular science literature into a hierarchical map and to support document search. Built using Node.js, MongoDB, ZeroMQ, FoamTree, and BigARTM.
[code] [demo]
Esenin (2013)
A karaoke music player featuring songs based on the verses of the great 20th-century Russian poet Sergey Esenin. Written overnight using Node.js, Express, and CoffeeScript.
[code] [demo]
Lonelord (2013)
A real-time strategy game where you can build castles, mine resources, wage wars, and trade with neighbors… all by sending MongoDB database queries! Inspired by mysqlgame.
[code] [blog]

Page design by Ankit Sultana