# Paper Digest: COLT 2013 Highlights

Readers can also choose to read this highlight article on our console, which allows users to filter papers by keywords and find related papers.

The Annual Conference on Learning Theory (COLT) focuses on theoretical aspects of machine learning and related topics.

To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights / summaries to quickly get the main idea of each paper.

If you do not want to miss any interesting academic paper, you are welcome to **sign up for our free daily paper digest service** to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to receive new conference digests.

Paper Digest Team

team@paperdigest.org

#### TABLE 1: COLT 2013 Papers

Title | Authors | Highlight | |
---|---|---|---|

1 | Preface | Shai Shalev-Shwartz, Ingo Steinwart | Preface |

2 | Open Problem: Adversarial Multiarmed Bandits with Limited Advice | Yevgeny Seldin, Koby Crammer, Peter Bartlett | It is known that if we observe the advice of all experts on every round we can achieve O\left(\sqrtKT \ln N\right) regret, where K is the number of arms, T is the number of game rounds, and N is the number of experts. |

3 | Open Problem: Fast Stochastic Exp-Concave Optimization | Tomer Koren | The question we pose is whether it is possible to obtain fast rates for exp-concave functions using more computationally-efficient algorithms. |

4 | Open Problem: Lower bounds for Boosting with Hadamard Matrices | Jiazhong Nie, Manfred K. Warmuth, S.V.N. Vishwanathan, Xinhua Zhang | We conjecture that with Hadamard matrices we can build a certain game matrix for which the game value grows at the slowest possible rate for t up to a fraction of n. |

5 | On the Complexity of Bandit and Derivative-Free Stochastic Convex Optimization | Ohad Shamir | In this paper, we investigate the attainable error/regret in the bandit and derivative-free settings, as a function of the dimension d and the available number of queries T. |

6 | A Theoretical Analysis of NDCG Type Ranking Measures | Yining Wang, Liwei Wang, Yuanzhi Li, Di He, Tie-Yan Liu | In this paper we study, from a theoretical perspective, the widely used NDCG type ranking measures. |

7 | Excess risk bounds for multitask learning with trace norm regularization | Massimiliano Pontil, Andreas Maurer | We give excess risk bounds with explicit dependence on the number of tasks, the number of examples per task and properties of the data distribution. |

8 | Honest Compressions and Their Application to Compression Schemes | Roi Livni, Pierre Simon | This means that we describe the reconstruction function explicitly. |

9 | The price of bandit information in multiclass online classification | Amit Daniely, Tom Helbertal | We consider two scenarios of multiclass online learning of a hypothesis class H⊆Y^X. |

10 | Estimation of Extreme Values and Associated Level Sets of a Regression Function via Selective Sampling | Stanislav Minsker | We propose a new method for estimating the locations and the value of an absolute maximum (minimum) of a function from the observations contaminated by random noise. |

11 | Bounded regret in stochastic multi-armed bandits | S�bastien Bubeck, Vianney Perchet, Philippe Rigollet | We propose a new randomized policy that attains a regret uniformly bounded over time in this setting. |

12 | Recovering the Optimal Solution by Dual Random Projection | Lijun Zhang, Mehrdad Mahdavi, Rong Jin, Tianbao Yang, Shenghuo Zhu | We present a simple algorithm, termed Dual Random Projection, that uses the dual solution of the low-dimensional optimization problem to recover the optimal solution to the original problem. |

13 | Opportunistic Strategies for Generalized No-Regret Problems | Andrey Bernstein, Shie Mannor, Nahum Shimkin | In this paper, we focus on a generalized no-regret problem with vector-valued rewards, defined in terms of a desired reward set of the agent. |

14 | Online Learning for Time Series Prediction | Oren Anava, Elad Hazan, Shie Mannor, Ohad Shamir | In this paper, we address the problem of predicting a time series using the ARMA (autoregressive moving average) model, under minimal assumptions on the noise terms. |

15 | Sharp analysis of low-rank kernel matrix approximations | Francis Bach | In this paper, we show that for approximations based on a random subset of columns of the original kernel matrix, the rank p may be chosen to be linear in the \emphdegrees of freedom associated with the problem, a quantity which is classically used in the statistical analysis of such methods, and is often seen as the implicit number of parameters of non-parametric estimators. |

16 | Beating Bandits in Gradually Evolving Worlds | Chao-Kai Chiang, Chia-Jung Lee, Chi-Jen Lu | To aim for smaller regrets, we adopt a relaxed two-point bandit setting in which the player can play two actions in each round and observe the loss values of those two actions. |

17 | Information Complexity in Bandit Subset Selection | Emilie Kaufmann, Shivaram Kalyanakrishnan | We consider the problem of efficiently exploring the arms of a stochastic bandit to identify the best subset. |

18 | Passive Learning with Target Risk | Mehrdad Mahdavi, Rong Jin | In this paper we consider learning in passive setting but with a slight modification. |

19 | Blind Signal Separation in the Presence of Gaussian Noise | Mikhail Belkin, Luis Rademacher, James Voss | In this paper we propose a new algorithm for solving the blind signal separation problem in the presence of additive Gaussian noise, when we are given samples from X=AS+η, where ηis drawn from an unknown, not necessarily spherical n-dimensional Gaussian distribution. |

20 | Active and passive learning of linear separators under log-concave distributions | Maria-Florina Balcan, Phil Long | Building on this, we provide a computationally efficient PAC algorithm with optimal (up to a constant factor) sample complexity for such problems. |

21 | Randomized partition trees for exact nearest neighbor search | Sanjoy Dasgupta, Kaushik Sinha | We analyze three such schemes. |

22 | Surrogate Regret Bounds for the Area Under the ROC Curve via Strongly Proper Losses | Shivani Agarwal | In this paper, we obtain such (non-pairwise) surrogate regret bounds for the AUC in terms of a broad class of proper (composite) losses that we term \emphstrongly proper. |

23 | Algorithms and Hardness for Robust Subspace Recovery | Moritz Hardt, Ankur Moitra | We consider a fundamental problem in unsupervised learning called subspace recovery: given a collection of m points in R^n, if many but not necessarily all of these points are contained in a d-dimensional subspace T can we find it? |

24 | PLAL: Cluster-based active learning | Ruth Urner, Sharon Wulff, Shai Ben-David | We investigate the label complexity of active learning under some smoothness assumptions on the data-generating process.We propose a procedure, PLAL, for “activising” passive, sample-based learners. |

25 | Learning Using Local Membership Queries | Pranjal Awasthi, Vitaly Feldman, Varun Kanade | We introduce a new model of membership query (MQ) learning, where the learning algorithm is restricted to query points that are close to random examples drawn from the underlying distribution. |

26 | Sparse Adaptive Dirichlet-Multinomial-like Processes | Marcus Hutter | I derive an optimal adaptive choice for the main parameter via tight, data-dependent redundancy bounds for a related model. |

27 | Prediction by random-walk perturbation | Luc Devroye, G�bor Lugosi, Gergely Neu | We propose a version of the follow-the-perturbed-leader online prediction algorithm in which the cumulative losses are perturbed by independent symmetric random walks. |

28 | Approachability, fast and slow | Vianney Perchet, Shie Mannor | In this paper we provide a characterization for the convergence rates of approachability and show that in some cases a set can be approached with a 1/n rate. |

29 | Classification with Asymmetric Label Noise: Consistency and Maximal Denoising | Clayton Scott, Gilles Blanchard, Gregory Handy | We introduce a general framework for classification with label noise that eliminates these assumptions. |

30 | General Oracle Inequalities for Gibbs Posterior with Application to Ranking | Cheng Li, Wenxin Jiang, Martin Tanner | In this paper, we summarize some recent results in Li et al. (2012), which can be used to extend an important PAC-Bayesian approach, namely the Gibbs posterior, to study the nonadditive ranking risk. |

31 | Learning Halfspaces Under Log-Concave Densities: Polynomial Approximations and Moment Matching | Daniel Kane, Adam Klivans, Raghu Meka | We give the first polynomial-time algorithm for agnostically learning any function of a constant number of halfspaces with respect to any log-concave distribution (for any constant accuracy parameter). |

32 | Subspace Embeddings and \ell_p-Regression Using Exponential Random Variables | David Woodruff, Qin Zhang | If one is just interested in a \textpoly(d) rather than a (1+ε)-approximation to \ell_p-regression, a corollary of our results is that for all p ∈[1, ∞) we can solve the \ell_p-regression problem without using general convex programming, that is, since our subspace embeds into \ell_∞ it suffices to solve a linear programming problem. |

33 | Consistency of Robust Kernel Density Estimators | Robert Vandermeulen, Clayton Scott | In this paper we establish asymptotic L^1 consistency of the RKDE for a class of losses and show that the RKDE converges with the same rate on bandwidth required for the traditional KDE. |

34 | Divide and Conquer Kernel Ridge Regression | Yuchen Zhang, John Duchi, Martin Wainwright | We study a decomposition-based scalable approach to performing kernel ridge regression. |

35 | Regret Minimization for Branching Experts | Eyal Gofer, Nicol� Cesa-Bianchi, Claudio Gentile, Yishay Mansour | For this setting of branching experts, we give algorithms and analysis that cover both the full information and the bandit scenarios. |

36 | Horizon-Independent Optimal Prediction with Log-Loss in Exponential Families | Peter Bartlett, Peter Gr�nwald, Peter Harremo�s, Fares Hedayati, Wojciech Kotlowski | We study online learning under logarithmic loss with regular parametric models. |

37 | Online Similarity Prediction of Networked Data from Known and Unknown Graphs | Claudio Gentile, Mark Herbster, Stephen Pasteris | We consider online similarity prediction problems over networked data. |

38 | A near-optimal algorithm for finite partial-monitoring games against adversarial opponents | G�bor Bart�k | In this paper we present and analyze a new algorithm for locally observable partial monitoring games. |

39 | Representation, Approximation and Learning of Submodular Functions Using Low-rank Decision Trees | Vitaly Feldman, Pravesh Kothari, Jan Vondr�k | We study the complexity of approximate representation and learning of submodular functions over the uniform distribution on the Boolean hypercube {0,1}^n. |

40 | A Tale of Two Metrics: Simultaneous Bounds on Competitiveness and Regret | Lachlan Andrew, Siddharth Barman, Katrina Ligett, Minghong Lin, Adam Meyerson, Alan Roytman, Adam Wierman | We consider algorithms for “smoothed online convex optimization” problems, a variant of the class of online convex optimization problems that is strongly related to metrical task systems. |

41 | Optimal Probability Estimation with Applications to Prediction and Classification | Jayadev Acharya, Ashkan Jafarpour, Alon Orlitsky, Ananda Theertha Suresh | Via a unified viewpoint of probability estimation, classification,and prediction, we derive a uniformly-optimal combined-probability estimator, construct a classifier that uniformly approaches the error of the best possible label-invariant classifier, and improve existing results on pattern prediction and compression. |

42 | Polynomial Time Optimal Query Algorithms for Finding Graphs with Arbitrary Real Weights | Sung-Soon Choi | In this paper, we achieve an ultimate goal of recent years for graph finding with the two types of queries, by constructing the first polynomial time algorithms with optimal query complexity for the general class of graphs with n vertices and at most m edges in which the weights of edges are arbitrary real numbers. |

43 | Differentially Private Feature Selection via Stability Arguments, and the Robustness of the Lasso | Abhradeep Guha Thakurta, Adam Smith | The algorithms we describe are efficient and in some cases match the optimal \emphnon-private asymptotic sample complexity. |

44 | Learning a set of directions | Wouter M. Koolen, Jiazhong Nie, Manfred Warmuth | We develop online algorithms for this type of problem. |

45 | A Tensor Spectral Approach to Learning Mixed Membership Community Models | Animashree Anandkumar, Rong Ge, Daniel Hsu, Sham Kakade | We propose a unified approach to learning these models via a tensor spectral decomposition method. |

46 | Adaptive Crowdsourcing Algorithms for the Bandit Survey Problem | Ittai Abraham, Omar Alonso, Vasilis Kandylas, Aleksandrs Slivkins | We present several algorithms for this problem, and support them with analysis and simulations.Our approach is based in our experience conducting relevance evaluation for a large commercial search engine. |

47 | Boosting with the Logistic Loss is Consistent | Matus Telgarsky | This manuscript provides optimization guarantees, generalization bounds, and statistical consistency results for AdaBoost variants which replace the exponential loss with the logistic and similar losses (specifically, twice differentiable convex losses which are Lipschitz and tend to zero on one side). |

48 | Competing With Strategies | Wei Han, Alexander Rakhlin, Karthik Sridharan | We study the problem of online learning with a notion of regret defined with respect to a set of strategies. |

49 | Online Learning with Predictable Sequences | Alexander Rakhlin, Karthik Sridharan | We present methods for online linear optimization that take advantage of benign (as opposed to worst-case) sequences. |

50 | Efficient Learning of Simplices | Joseph Anderson, Navin Goyal, Luis Rademacher | We show an efficient algorithm for the following problem: Given uniformly random points from an arbitrary n-dimensional simplex, estimate the simplex. |

51 | Complexity Theoretic Lower Bounds for Sparse Principal Component Detection | Quentin Berthet, Philippe Rigollet | We measure the performance of a test by the smallest signal strength that it can detect and we propose a computationally efficient method based on semidefinite programming. |