Skip to main content


eCommons@Cornell

eCommons@Cornell >
Cornell University Graduate School >
Theses and Dissertations (CLOSED) >

Please use this identifier to cite or link to this item: http://hdl.handle.net/1813/17704
Title: Expressive Models In Online Learning
Authors: Sharma, Yogeshwer
Issue Date: 20-Oct-2010
Abstract: We study the online learning model: a widely applicable model for making repeated choices in an interactive environment. In standard online learning model (or an online learning problem), the decision-maker is provided with a set of alternatives, and selects one alternative in each of the T sequential trials, deriving a reward for each selection. After T trials, the total reward of the decision-maker is compared with the best "single-arm" strategy which has the maximum reward in hindsight. The difference between the reward of the best single-arm strategy and that of the algorithm is called the regret , and one seeks decision-making algorithms whose regret is sublinear in T and running time is polynomial in the problem size. In this thesis, we extend the basic online learning model in two important ways. In the first extension, we model sponsored search auctions as a multi-armed bandit problem (a type of online learning problem), and allow the alternatives (or advertisers in this case) to be strategic which can report possibly wrong rewards (in order to make personal gains). We seek to provide incentives to advertisers so as to get good solutions (socially efficient solutions). We prove that any socially efficient solution that provides right incentives to advertisers (being dominant strategy truthful) must suffer much higher regret than the regret suffered by algorithms for multi-armed problem without incentive issues. In the second extension, also motivated by sponsored search and resource selection in distributed systems, we allow the set of available alternatives to vary over time, provide a natural way to define the regret, and give policies for the decision-maker that suffer low regret. We also prove that the regret suffered by our policies is information-theoretically the lowest possible.
No Access Until: 2015-10-20
URI: http://hdl.handle.net/1813/17704
Appears in Collections:Theses and Dissertations (CLOSED)

Files in This Item:

File Description SizeFormat
Sharma, Yogeshwer.pdf1.18 MBAdobe PDFView/Open

Refworks Export

Items in eCommons are protected by copyright, with all rights reserved, unless otherwise indicated.

 

© 2014 Cornell University Library Contact Us