Kumar Abhishek received his MS Dual Degree in Computer Science and Engineering (CSE). His research work was supervised by Dr. Sujit Prakash Gujar. Here’s a summary of Kumar Abhishek’s thesis Design and Analysis of Algorithms for Combinatorial Multi-arm Bandit Problems under Complex Environments.
Online learning problems are among the most abundant machine learning problems, such as advertisement selection, movie recommendation, and node or link prediction in evolving networks. In online learning, one must make online, real-time decisions and continuously improve performance with the sequential arrival of data. The multi-armed bandit (MAB) problem is one of the fundamental online learning problems that captures the classic exploration vs. exploitation dilemma. As per the applications’ requirements, new variants of MAB have been developed to tackle the problem. For example, strategic bidding in the online learning environment (MAB mechanism); availability of extra information to make the decision (Contextual MAB); selecting multiple arms instead of the single-arm (Combinatorial MAB); a varying set of available arms (Sleeping MAB), etc. We identify some of the problems and appropriately address them by proposing efficient algorithms. We provide theoretical guarantees for the performance of the proposed algorithms and showcase their efficacy through simulations. Following are the problems we address in this thesis:
- Designing truthful contextual multi-armed bandits with the backdrop of sponsored search auction. We tackle the drawbacks of the current state-of-the-art.
- Propose an efficient algorithm for sleeping combinatorial multi-arm bandit problem when we consider general reward function (satisfying mild smoothness conditions).
- We take a multi-arm bandit approach to subset selection under constraints in an unknown environment