Alexey Skrynnik
Senior Research Scientist & Team Lead, AIRI
Moscow, Russia
I am a Senior Research Scientist and Team Lead at AIRI, working on reinforcement learning, multi-agent systems, and LLM/multimodal agents. My recent work spans RL-style tree search for LLM test-time reasoning, foundation-model and learnable approaches to multi-agent pathfinding, and embodied instruction-following agents.
Research Experience
Senior Research Scientist & Team Lead
AIRI, Cognitive AI Systems Laboratory
- Leading an 8-person research team focused on RL, LLM/multimodal agents, and multi-agent systems
- Supervised ReSCALE, an RL-style tree-search method for LLM test-time reasoning that restores monotonic scaling with larger search budgets without retraining (ICAPS 2026)
- Led MAPF-GPT, a foundation model for multi-agent pathfinding with zero-shot generalization on unseen maps, outperforming state-of-the-art learnable solvers (AAAI 2025, Oral)
Research Scientist
AIRI, Cognitive AI Systems Laboratory
- Learn to Follow (AAAI 2024, Oral): combined RL and decentralized planning for lifelong multi-agent pathfinding, improving generalization with a 10x speedup over a state-of-the-art search-based solver
- Decentralized MCTS for partially observable MAPF: first MCTS approach for this setting (AAAI 2024)
- Built and open-sourced POGEMA, a benchmark platform for multi-agent pathfinding, later published at ICLR 2025
- RL Track Lead for the IGLU Competition at NeurIPS 2021 and 2022; co-developed benchmarks for embodied agents that follow grounded instructions in Minecraft, targeting human-AI collaboration
Junior Research Scientist
Federal Research Center for Computer Science and Control, Russian Academy of Sciences
- 1st place at the NeurIPS 2019 MineRL Diamond Competition with a hierarchical RL approach leveraging demonstrations as human priors for long-horizon decision-making in Minecraft; first author and presenter at NeurIPS
- Researched multi-agent pathfinding, model-based RL, and visual navigation, including hybrid policy learning with classical search; published in IEEE TNNLS, Knowledge-Based Systems, and Cognitive Systems Research
Education
PhD in AI & Machine Learning, FRC CSC RAS (defended at MIPT), Moscow, Russia, 2023
MS in Computer Science, Rybinsk State Aviation Technical University, 2015 – 2017
BS in Computer Science, Rybinsk State Aviation Technical University, 2011 – 2015
Selected Publications
ICAPS 2026
Revisiting Tree Search for LLMs: Gumbel and Sequential Halving for Budget-Scalable Reasoning
AAAI 2026 (Best Poster Award)
CAMAR: Continuous Actions Multi-Agent Routing
AAAI 2025 (Oral)
MAPF-GPT: Imitation Learning for Multi-Agent Pathfinding at Scale
ACM SIGIR 2025
IDAT: A Multi-Modal Dataset and Toolkit for Building and Evaluating Interactive Task-Solving Agents
ECAI 2024
Instruction Following with Goal-Conditioned Reinforcement Learning in Virtual Environments
AAAI 2024 (Oral)
Learn to Follow: Decentralized Lifelong Multi-Agent Pathfinding via Planning and Learning
NeurIPS 2022 Competition Track
Interactive Grounded Language Understanding in a Collaborative Environment: Retrospective on IGLU 2022 Competition
Technical Skills
Core Stack: Python, PyTorch, JAX, C++
Research Areas: Reinforcement Learning, Multi-Agent Systems, Planning/Search, Imitation Learning
LLM/Agent Systems: RL for LLMs, LLM/Multimodal Agents, Test-Time Reasoning, Inference/Evaluation
Training/Infrastructure: Distributed Training, FSDP/DDP, Slurm, Ray, vLLM, VERL, HF Transformers, LoRA/PEFT
Languages: English (advanced), Russian (native)
Achievements & Service
- Yandex ML-Prize 2024: awarded for research in Reinforcement Learning, Multi-Agent RL, and Multi-Agent Systems
- NeurIPS 2019 MineRL Diamond Competition: 1st place in long-horizon RL from demonstrations
- Co-organizer & RL Track Lead: IGLU Competition at NeurIPS 2021-2022 on grounded instruction following
- Reviewing: NeurIPS, ICML, AAAI, ICLR, ACL Rolling Review, JAIR, Nature Communications
Teaching
MSU AI Masters – Advanced Reinforcement Learning Lecturer (2025 – present)
MIPT – Reinforcement Learning, Software Tools for AI Assistant Lecturer (2020 – 2022)
HSE – Applied Problems of Data Analysis Lecturer (2018 – 2020)