Research

I'm excited about creating new methods and establishing connections between different areas of mathematics and AI. My professional experience spans research in generative artificial intelligence, deep learning, numerical optimization, and approximation theory, so naturally most of my active interests lie at their intersections.

At a high level, my main interests fall into the categories described below.

Below are brief descriptions of every direction that I'm actively pursuing or have worked on in the past. If any of them interest you, please reach out; I'd be happy to have a deeper discussion, provide detailed information, or collaborate with you if my schedule allows!

Check out my research statement for a more technical description of my interests.

research_statement.pdf

Large Language Models and Generative AI: Can ChatGPT solve my life?

Generative AI has recently become one of my primary areas of interest, and certainly the one I'm most excited about. This groundbreaking technology has opened doors to interdisciplinary collaborations that were previously unimaginable, and I see Generative AI as a bridge for individuals from non-technical backgrounds to dive into the world of AI and play new roles in developing innovative applications. While ChatGPT introduced the public at large to the concept of large language models, it's just the tip of the iceberg, and generative AI holds a wealth of untapped potential just waiting to be explored. In addition to the projects below, I am pursuing several smaller ideas related to transformers, tokenization, and reinforcement learning from human feedback. Reach out if any of these are of interest to you, or if you have a totally new project you'd like an extra pair of eyes on; I'm always on the lookout for new opportunities to explore the area of generative AI!

Generative AI for Behavioral Sentiment Analysis

Content tailoring for individual user preferences is a common objective in many human--system applications, especially where the aim is to encourage positive behavioral change in the user. Traditionally, this task falls upon content writers and behavioral scientists, who infuse messages with behavioral economics principles like Behavior Change Techniques (BCTs); however, the implementation can vary drastically depending on the style, tone, and context chosen by each author. An exciting challenge arises in developing NLP systems that can autonomously detect and quantify the presence and salience of these BCTs within a given message. Think of it as sentiment analysis taken to the next level, where we move beyond simple classification of positive, neutral, or negative sentiment and instead determine the presence of the 93 BCTs defined within the established taxonomy.
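
To make the framing concrete, here is a minimal sketch of BCT detection posed as multi-label classification; the messages, the four labels, and the bag-of-words model are all invented stand-ins for the 93-technique taxonomy and a fine-tuned language model.

```python
# A minimal multi-label framing of BCT detection: each message can exhibit
# several techniques at once, so we train one binary detector per BCT.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier
from sklearn.preprocessing import MultiLabelBinarizer

# Toy data: both the messages and the BCT labels are illustrative.
messages = [
    "Track your steps every day and review them each Sunday.",
    "Most people your age already exercise three times a week.",
    "Set a goal to walk 8,000 steps tomorrow.",
]
labels = [
    {"self-monitoring", "action planning"},
    {"social comparison"},
    {"goal setting"},
]

binarizer = MultiLabelBinarizer()            # one indicator column per BCT
y = binarizer.fit_transform(labels)
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(messages)

# A fine-tuned transformer would replace this in practice; the multi-label
# framing (one detector per technique) stays the same.
model = MultiOutputClassifier(LogisticRegression(max_iter=1000)).fit(X, y)

pred = model.predict(vectorizer.transform(["Plan when and where you will run."]))
print(binarizer.classes_[pred[0].astype(bool)])   # detected BCTs
```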

This project is a continuous collaboration with behavioral scientists and content designers, combining their expertise with the power of AI. Together, we're crafting a tool that holds the potential to revolutionize the way messages are tailored and coded for behavioral change. Check out our progress in the following repositories: BCT Dataset and BCT Comprehension.


Pre-Trained LLMs for Personalization Applications

Another captivating application of Large Language Models (LLMs) lies in harnessing the vast knowledge they've accumulated for content matching tasks like recommendations, personalization, nudging, and more. In this context, we view LLMs as pre-trained networks providing relevant embeddings for the content items, which becomes especially useful in tackling cross-modal tasks that are well-known to be challenging due to the vastly different natures and representations of the inputs. While traditional approaches typically necessitate individual systems for each modality, an innovative alternative emerges: obtaining a text description for each input and employing an LLM to embed these standardized representations into a shared latent space, where matching and comparison take place.
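
As a rough sketch of this pipeline (the sentence-transformers model name and the toy item descriptions are arbitrary stand-ins, not the setup from the paper), every item is first reduced to a text description, and matching happens by cosine similarity in the shared embedding space:

```python
# Cross-modal matching through a shared text embedding space: items of any
# modality are first described in text, then embedded by a pre-trained model.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")   # an illustrative choice

user_profile = "Enjoys short outdoor workouts and healthy breakfast recipes."
item_descriptions = [
    "A 10-minute morning jog routine for beginners.",   # describes a video
    "Photo of a quinoa bowl with fresh vegetables.",    # describes an image
    "Long-form article on quarterly tax planning.",     # describes a text
]

# Unit-norm embeddings make cosine similarity a plain dot product.
emb = model.encode([user_profile] + item_descriptions, normalize_embeddings=True)
scores = emb[1:] @ emb[0]
print(item_descriptions[int(np.argmax(scores))])   # best-matching item
```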

For a demonstration of this idea in more detail, check out the paper Zero-Shot Recommendations with Pre-Trained Large Language Models for Multimodal Nudging, where we explore how this concept can be used for content matching across diverse domains.

Reinforcement and Supervised Learning: Can't we just use a bigger network?

The main focus of my professional research is deep learning, with particular emphasis on mathematical rigor and real-world tasks. This includes finding solutions to the most challenging aspects of practical problems, analyzing methods that are employed in applications, and developing novel techniques that are better suited for a particular use case. In reinforcement learning, I strive to develop innovative approaches that enhance the efficiency and effectiveness of recommendation systems, ultimately leading to more personalized and engaging user experiences. In supervised learning, my research is primarily centered on network training and architecture design. This involves the development of robust and scalable neural network models, as well as the creation of efficient training algorithms. My overarching goal is to bridge the gap between theory and practical application. I am committed to advancing the state-of-the-art in deep learning by developing mathematically sound solutions that can be effectively deployed in real-world scenarios.

Policy Entropy and Action Selection

In practical applications, the focus often extends beyond the agent's capacity to learn and encompasses the behavior it exhibits during the learning process. This becomes critical in contexts featuring a multitude of available actions, where exhaustive exploration of the state--action space is infeasible. Moreover, when combined with practical challenges, such as non-stationarity of the action space or non-determinism of the feedback, this issue poses a significant obstacle, particularly in recommendation domains such as healthcare, nudging, and behavioral change, where the number of agent--user interactions is fundamentally limited.

Remarkably, one of the primary factors shaping the agent's behavior is the choice of the learning algorithm. Specifically, Policy Optimization and Q-Learning yield fundamentally distinct behaviors, even though both ultimately converge to the same optimal policy. This intriguing and somewhat unexpected observation appears to remain largely overlooked by the broader research community. I am investigating this phenomenon in an ongoing project; further insights can be found in the paper Examining Policy Entropy of Reinforcement Learning Agents for Personalization Tasks.
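
To make the quantity concrete, here is a toy computation of policy entropy, H(pi(.|s)) = -sum_a pi(a|s) log pi(a|s), with invented Q-values: a softmax-style policy (typical of policy optimization) and a greedy policy (typical of Q-learning) can rank the actions identically yet behave very differently.

```python
# Policy entropy measures how spread out action selection is in a state.
import numpy as np

def policy_entropy(probs, eps=1e-12):
    probs = np.asarray(probs)
    return float(-np.sum(probs * np.log(probs + eps)))

q_values = np.array([1.0, 0.9, 0.5, 0.1])   # toy action-value estimates

# Policy optimization typically maintains a stochastic (softmax-like) policy,
# while Q-learning acts near-greedily on its value estimates.
softmax_policy = np.exp(q_values) / np.exp(q_values).sum()
greedy_policy = np.eye(len(q_values))[np.argmax(q_values)]

print(policy_entropy(softmax_policy))   # high entropy: varied actions
print(policy_entropy(greedy_policy))    # near zero: one action dominates
```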


Reinforcement Learning for Hyperpersonalization

Reinforcement learning for hyperpersonalization is an emerging area gaining prominence in such critical applications as individualized healthcare and behavioral intervention, where conventional personalization methods (e.g., A/B testing, collaborative filtering, recommender systems) are insufficient due to their inability to learn the latent features and appropriately differentiate the input data. While there exist approaches better suited to hyperpersonalization tasks, they are typically benchmarked on simple simulated environments and offer no generalization guarantees for real-life applications. To complicate matters further, applications dealing with human feedback are notoriously messy: human interactions are restricted and behavioral feedback is non-deterministic by nature, which drastically complicates assessing the agent's performance.

To address such settings, I design synthetic environments that showcase the underlying complexity of hyperpersonalization tasks and develop novel approaches that can recognize and overcome these challenges. This is an actively developing direction; you can read about it in the paper On the Unreasonable Efficiency of State Space Clustering in Personalization Tasks or check out my Reinforcement Learning Personalization Challenge.
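
The snippet below is a generic toy of such an environment, not one of the environments from the paper: each user carries a latent preference type the agent never observes, and the behavioral feedback is binary and stochastic.

```python
# Toy hyperpersonalization environment: latent user types, noisy observations,
# and non-deterministic binary feedback. Purely illustrative.
import numpy as np

class ToyPersonalizationEnv:
    def __init__(self, n_user_types=3, n_actions=5, noise=0.2, seed=0):
        self.rng = np.random.default_rng(seed)
        self.noise = noise
        # Hidden per-type probability that each action elicits a response.
        self.pref = self.rng.uniform(size=(n_user_types, n_actions))

    def reset(self):
        # A new user arrives; the agent sees only a noisy feature vector,
        # never the underlying type.
        self.user_type = self.rng.integers(len(self.pref))
        features = self.pref[self.user_type]
        return features + self.rng.normal(scale=self.noise, size=features.size)

    def step(self, action):
        # Stochastic behavioral feedback: even a good action may fail.
        return int(self.rng.random() < self.pref[self.user_type, action])

env = ToyPersonalizationEnv()
obs = env.reset()
reward = env.step(int(np.argmax(obs)))   # act on the noisy observation
```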


Policy Evaluation and Ranking

In real-life reinforcement learning one often has to estimate the agent's performance to assess the state of the training process. This is usually achieved through one of the policy evaluation methods---techniques that utilize the logged agent-environment interactions to estimate the value of the policy. The field has developed rapidly in recent years, with a variety of methods that take into account the distribution of the historical data, the structure of the reward signal, and the available information about the environment. Despite these advances, the existing methods require large amounts of data to provide a reasonable estimate, which renders them unreliable in critical real-life applications where agent-environment interactions are limited due to ethical, practical, or security considerations. In such cases, one may instead utilize a policy ranking approach---a technique that orders a set of agents with respect to the values of their policies. While typically simpler than evaluation, policy ranking is still a nontrivial task that requires careful consideration.
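
As a point of reference, the simplest such technique is the importance-sampling estimator, sketched below on hypothetical logged data; its variance grows multiplicatively with trajectory length, which is precisely what makes it unreliable when interactions are scarce.

```python
# Classic importance-sampling off-policy evaluation: logged returns are
# reweighted by the likelihood ratio between target and behavior policies.
import numpy as np

def importance_sampling_value(trajectories, target_prob, behavior_prob):
    """Each trajectory is a list of (state, action, reward) triples;
    target_prob(s, a) and behavior_prob(s, a) return pi(a|s)."""
    estimates = []
    for traj in trajectories:
        ratio, ret = 1.0, 0.0
        for s, a, r in traj:
            ratio *= target_prob(s, a) / behavior_prob(s, a)  # likelihood ratio
            ret += r
        estimates.append(ratio * ret)
    return float(np.mean(estimates))

# Hypothetical logged data from a uniform behavior policy over two actions.
logged = [[("s0", 0, 1.0), ("s1", 1, 0.0)], [("s0", 1, 0.0), ("s1", 1, 1.0)]]
value = importance_sampling_value(
    logged,
    target_prob=lambda s, a: 0.8 if a == 0 else 0.2,
    behavior_prob=lambda s, a: 0.5,
)
```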

I've designed a policy ranking method intended for situations where the historical data is restricted in both volume and distribution. Read more about it in the paper Offline Policy Comparison under Limited Historical Agent-Environment Interactions.


Neural Architecture Design and Approximation Theory

The choice of architecture for a neural network is nontrivial and typically relies on a set of heuristic rules empirically established by early practitioners. However, in novel real-world applications, the existing conventional rules, established on standard benchmark datasets, don't always apply. This often leaves the user in the undesirable position of relying on trial and error to find an appropriate architecture. Such an approach is not only infeasible in critical applications but also mathematically unsatisfactory to any sensible scientist. After all, the architecture determines the parameterization of the learning target, and thus architecture selection can be approached from the perspective of approximation theory.

This is how I was introduced to deep learning, and I have since established several promising connections between these fields, including a technique for constructing efficient and compact shallow networks based on the provided training data. For more detail, refer to my papers Neural Network Integral Representations with the ReLU Activation Function and Greedy Shallow Networks: An Approach for Constructing and Training Neural Networks and check out my Shallow Network Approximation Challenge.
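
To convey the flavor of constructing a network from the training data (a simplified generic illustration, not the GSN algorithm from the papers), the sketch below grows a shallow ReLU network one neuron at a time, each fitted to the current residual:

```python
# Greedy construction of a shallow ReLU network: at each step, sample candidate
# neurons, keep the one most correlated with the residual, and fit its weight.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 200)
target = np.sin(3 * x)                       # toy 1-D regression target
residual = target.copy()
neurons = []                                 # (w, b, c) triples

for _ in range(10):
    w = rng.normal(size=50)
    b = rng.uniform(-1, 1, 50)
    feats = np.maximum(np.outer(w, x) + b[:, None], 0)   # candidate ReLUs
    norms = np.linalg.norm(feats, axis=1) + 1e-12
    best = int(np.argmax(np.abs(feats @ residual) / norms))
    phi = feats[best]
    c = (phi @ residual) / (phi @ phi + 1e-12)           # optimal 1-D coefficient
    neurons.append((w[best], b[best], c))
    residual -= c * phi

print(np.linalg.norm(residual) / np.linalg.norm(target))  # relative error
```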

Nonconvex Blackbox Optimization: Did you try changing the learning rate?

Ever since I started working in deep learning, I have been amazed at how little we know about nonconvex optimization and how unprepared we are to reliably train neural networks. While I am still amazed and we are still unprepared, I am now more aware of the overwhelming complexity that nonconvexity presents, as well as the tremendous gap between the theoretical guarantees and the practical performance of neural network approximation. It is my persistent desire to reduce this gap, and I'm constantly thinking of novel approaches that might help in this direction.

Smoothing-Based Optimization

Optimization is an essential and intricate component of training a neural network. Even though there are plenty of heuristics for choosing an optimization algorithm and its hyperparameters, the training process is far from straightforward. Most notably, the loss minimization problem is typically plagued by the nonconvexity of the loss landscape, which often causes the optimizer to get stuck in a local minimum. If the attained error value is not satisfactory, the conventional approach is to fine-tune the hyperparameters and restart the minimization, a tedious procedure with no guaranteed outcome. An alternative is to employ an optimization technique that can, to some extent, alleviate the challenge of nonconvexity. I'm really excited about the idea of smoothing-based optimization, where the minimization is performed on a smoothed surrogate rather than on the original target function. As a result, the optimizer can navigate the smoothed loss landscape more reliably and converge faster, both of which are vital advantages in practical applications.
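
A minimal sketch of the idea (a generic zeroth-order version, not the adaptive method from the papers below): the Gaussian-smoothed surrogate f_sigma(x) = E_u[f(x + sigma*u)], with u standard normal, has a gradient that can be estimated from function values alone.

```python
# Gaussian-smoothing descent: minimize the smoothed surrogate f_sigma using a
# Monte Carlo gradient estimator that needs only blackbox evaluations of f.
import numpy as np

def smoothed_grad(f, x, sigma, n_samples, rng):
    u = rng.standard_normal((n_samples, x.size))
    fvals = np.array([f(x + sigma * ui) for ui in u])
    # grad f_sigma(x) = E[(f(x + sigma*u) - f(x)) * u] / sigma;
    # subtracting f(x) is a variance-reducing control variate.
    return ((fvals - f(x))[:, None] * u).mean(axis=0) / sigma

def rastrigin(x):   # a standard highly nonconvex test function
    return 10 * x.size + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

rng = np.random.default_rng(0)
x = np.full(10, 3.0)
for _ in range(500):
    x -= 0.01 * smoothed_grad(rastrigin, x, sigma=0.5, n_samples=64, rng=rng)
print(rastrigin(x))   # far below the starting value of 90.0
```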

I have designed a blackbox smoothing-based optimizer that demonstrates outstanding performance on a wide range of nonconvex functions. This direction is in active development in terms of both theoretical analysis and practical implementation. Find more information in the papers An Adaptive Stochastic Gradient-Free Approach for High-Dimensional Blackbox Optimization and Gaussian Smoothing Gradient Descent for Minimizing High-Dimensional Non-Convex Functions.


Quasi-Random Low-Discrepancy Sampling

Quasi-random sampling is an essential component of many stochastic computational methods---from the simplicity of numerical integration to the complexity of a reinforcement learning agent's behavior---with Monte Carlo sampling being the most conventional and widely employed technique. However, in critical real-life applications where the sampling space is high-dimensional and the user is limited in the number of samples that can be drawn, the distribution of the Monte Carlo points might not be uniform enough, which, in turn, would distort the predicted outcome. In such situations, one would want to utilize a technique that provides samples with low discrepancy, thus guaranteeing an appropriate distribution over the sampling space. Despite a seemingly simple and flexible formulation, generating quasi-random low-discrepancy samples is an extremely challenging problem in the general case. For instance, a particular version of this problem---picking uniformly distributed points on a 2-sphere---is closely related to Smale's seventh problem and is still unresolved. While this direction is currently on hold due to shifted priorities, I'm still quite interested in the problem. See the project repository for our progress so far.
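
For a quick numerical illustration of the gap, the snippet below compares i.i.d. Monte Carlo points with a scrambled Sobol sequence using the discrepancy measure from scipy; the dimension and the sample budget are arbitrary choices.

```python
# Same sample budget, very different uniformity: quasi-random (Sobol) points
# cover the unit cube more evenly than i.i.d. Monte Carlo points.
import numpy as np
from scipy.stats import qmc

dim, n = 8, 256   # n is a power of two, as Sobol sequences prefer
mc_points = np.random.default_rng(0).uniform(size=(n, dim))
sobol_points = qmc.Sobol(d=dim, scramble=True, seed=0).random(n)

print("Monte Carlo discrepancy:", qmc.discrepancy(mc_points))     # larger
print("Sobol discrepancy:      ", qmc.discrepancy(sobol_points))  # smaller
```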

Nonlinear Constructive Approximation: Does it have any real applications?

My mathematical background is in approximation theory with specialization in greedy algorithms. Greedy-type decomposition is often employed in the areas of signal processing, compressed sensing, sparse nonlinear approximation, convex optimization, and more recently in such fields as reduced order modeling and neural network approximation. Even though I'm not very active in this area anymore, I'm still excited about and follow the development of greedy techniques and their applications in other areas.

Greedy Algorithms for Reduced Order Modeling

Reduced order modeling is an umbrella term for reducing the computational complexity of mathematical models in numerical simulations. It is crucial in such applications as aero- and hydrodynamics, structural mechanics, and design optimization. Examples of reduced order modeling methods include proper orthogonal decomposition, nonlinear manifold algorithms, and reduced basis methods, with the latter being of most interest to me. Reduced basis methods iteratively construct a sequence of nested subspaces that approximate a given manifold, with the most prominent methods being the Empirical Interpolation Method and the Orthogonal Greedy Algorithm.

During my brief stay at the Oden Institute, I came to appreciate the engineering need for more flexible reduced basis methods. To address this challenge, I designed and analyzed the Natural Greedy Algorithm---a reduced basis method that iteratively constructs an approximating subspace to the given manifold with respect to any chosen norm by utilizing the norming functionals of the space. In addition to being computationally simple, the NGA coincides with the OGA in the case of the 2-norm and with the EIM in the case of the infinity-norm, and in general interpolates between these two methods depending on the selected norm. Read more in my paper The Natural Greedy Algorithm for Reduced Bases in Banach Spaces.
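
For intuition, here is a generic strong-greedy reduced-basis sketch in the 2-norm (the OGA case); the NGA replaces the Euclidean selection and projection below with the norming functionals of whichever norm is chosen. The snapshot family is invented for illustration.

```python
# Strong-greedy reduced basis in the 2-norm: repeatedly add the snapshot that
# the current subspace approximates worst, orthonormalizing as we go.
import numpy as np

def greedy_reduced_basis(snapshots, n_basis):
    """snapshots: (n_snapshots, dim) array of solutions sampled from the manifold."""
    basis = []
    for _ in range(n_basis):
        # Residual of every snapshot after projecting onto the current basis.
        res = snapshots - sum(np.outer(snapshots @ b, b) for b in basis)
        worst = res[np.argmax(np.linalg.norm(res, axis=1))]
        basis.append(worst / np.linalg.norm(worst))
    return np.array(basis)

# A one-parameter family of toy "solutions" sampled from the manifold.
t = np.linspace(0, 1, 100)
snapshots = np.array([np.sin(mu * t) for mu in np.linspace(1, 10, 50)])
reduced_basis = greedy_reduced_basis(snapshots, n_basis=5)
```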


Greedy Algorithms with Adjustable Computational Complexity

While greedy algorithms are widely used in a variety of applications, the choice of which algorithm to employ often boils down to the available computational budget. Specifically, one of two algorithms is most often used: the Pure Greedy Algorithm (also known as Matching Pursuit) or the Orthogonal Greedy Algorithm (also known as Orthogonal Matching Pursuit). The PGA is a computationally simple iterative approach that on each iteration calculates a projection onto a one-dimensional space, while the OGA projects onto a space of ever-increasing dimensionality, thus becoming more computationally expensive with each iteration. Due to practical demand, it is of interest to develop a class of greedy algorithms that adapts to the available computational budget of a particular setting and thus provides freedom of choice to the practitioner. This is an ongoing project with my collaborators from the High-Dimensional Approximation and Applications Laboratory and Lomonosov Moscow State University.
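
The sketch below contrasts the two update rules on a toy dictionary of unit-norm atoms: the greedy selection step is identical, but the PGA performs a cheap one-dimensional projection while the OGA solves an ever-growing least-squares problem.

```python
# PGA (matching pursuit) vs OGA (orthogonal matching pursuit) in one function;
# dictionary rows are assumed to be unit-norm atoms.
import numpy as np

def greedy_approximation(target, dictionary, n_iter, orthogonal):
    residual = target.copy()
    selected = []
    for _ in range(n_iter):
        # Both algorithms pick the atom most correlated with the residual.
        selected.append(int(np.argmax(np.abs(dictionary @ residual))))
        if orthogonal:
            # OGA: project onto the span of ALL selected atoms (least squares).
            A = dictionary[selected].T
            approx = A @ np.linalg.lstsq(A, target, rcond=None)[0]
        else:
            # PGA: one-dimensional projection onto the newest atom only.
            g = dictionary[selected[-1]]
            approx = (target - residual) + (g @ residual) * g
        residual = target - approx
    return approx

rng = np.random.default_rng(0)
D = rng.standard_normal((50, 20))
D /= np.linalg.norm(D, axis=1, keepdims=True)   # normalize the atoms
y = rng.standard_normal(20)
mp, omp = (greedy_approximation(y, D, 5, o) for o in (False, True))
```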


Greedy Algorithms with Imprecise Calculations

In practice, the realization of many numerical algorithms is plagued by numerical instability due to their dependence on precise calculations. A conventional workaround is to employ algorithms that are robust to various forms of computational errors. I've researched this direction in the context of greedy algorithms, with a particular focus on the Chebyshev Greedy Algorithm---a generalization of the Orthogonal Greedy Algorithm to the Banach space setting---in the areas of sparse approximation and convex optimization. This topic was the main direction of my PhD research, and I published three papers on it: Biorthogonal Greedy Algorithms in Convex Optimization, On the Generalized Approximate Weak Chebyshev Greedy Algorithm, and On the Approximate Weak Chebyshev Greedy Algorithm in Uniformly Smooth Banach Spaces.

More Things I Want to Look Into (If I Ever Have Free Time)

A constant source of personal frustration is knowing that there are so many interesting concepts I haven't had the time to explore yet. In particular, I keep a little list of projects that I would really like to investigate further: