Paper Digest: COLT 2013 Highlights

June 24, 2013 | June 18, 2020 | admin

Readers can also choose to read this highlight article on our console, which allows users to filter out papers using keywords and find related papers.

The Annual Conference on Learning Theory (COLT) focuses on theoretical aspects of machine learning and related topics.

To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights and summaries to quickly get the main idea of each paper.

If you do not want to miss any interesting academic paper, you are welcome to sign up for our free daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to stay updated on new conference digests.

Paper Digest Team
team@paperdigest.org

TABLE 1: COLT 2013 Papers

No.	Title	Authors	Highlight
1	Preface	Shai Shalev-Shwartz, Ingo Steinwart	Preface
2	Open Problem: Adversarial Multiarmed Bandits with Limited Advice	Yevgeny Seldin, Koby Crammer, Peter Bartlett	It is known that if we observe the advice of all experts on every round we can achieve O\left(\sqrt{KT \ln N}\right) regret, where K is the number of arms, T is the number of game rounds, and N is the number of experts.
3	Open Problem: Fast Stochastic Exp-Concave Optimization	Tomer Koren	The question we pose is whether it is possible to obtain fast rates for exp-concave functions using more computationally-efficient algorithms.
4	Open Problem: Lower bounds for Boosting with Hadamard Matrices	Jiazhong Nie, Manfred K. Warmuth, S.V.N. Vishwanathan, Xinhua Zhang	We conjecture that with Hadamard matrices we can build a certain game matrix for which the game value grows at the slowest possible rate for t up to a fraction of n.
5	On the Complexity of Bandit and Derivative-Free Stochastic Convex Optimization	Ohad Shamir	In this paper, we investigate the attainable error/regret in the bandit and derivative-free settings, as a function of the dimension d and the available number of queries T.
6	A Theoretical Analysis of NDCG Type Ranking Measures	Yining Wang, Liwei Wang, Yuanzhi Li, Di He, Tie-Yan Liu	In this paper we study, from a theoretical perspective, the widely used NDCG type ranking measures.
7	Excess risk bounds for multitask learning with trace norm regularization	Massimiliano Pontil, Andreas Maurer	We give excess risk bounds with explicit dependence on the number of tasks, the number of examples per task and properties of the data distribution.
8	Honest Compressions and Their Application to Compression Schemes	Roi Livni, Pierre Simon	This means that we describe the reconstruction function explicitly.
9	The price of bandit information in multiclass online classification	Amit Daniely, Tom Helbertal	We consider two scenarios of multiclass online learning of a hypothesis class H⊆Y^X.
10	Estimation of Extreme Values and Associated Level Sets of a Regression Function via Selective Sampling	Stanislav Minsker	We propose a new method for estimating the locations and the value of an absolute maximum (minimum) of a function from the observations contaminated by random noise.
11	Bounded regret in stochastic multi-armed bandits	Sébastien Bubeck, Vianney Perchet, Philippe Rigollet	We propose a new randomized policy that attains a regret uniformly bounded over time in this setting.
12	Recovering the Optimal Solution by Dual Random Projection	Lijun Zhang, Mehrdad Mahdavi, Rong Jin, Tianbao Yang, Shenghuo Zhu	We present a simple algorithm, termed Dual Random Projection, that uses the dual solution of the low-dimensional optimization problem to recover the optimal solution to the original problem.
13	Opportunistic Strategies for Generalized No-Regret Problems	Andrey Bernstein, Shie Mannor, Nahum Shimkin	In this paper, we focus on a generalized no-regret problem with vector-valued rewards, defined in terms of a desired reward set of the agent.
14	Online Learning for Time Series Prediction	Oren Anava, Elad Hazan, Shie Mannor, Ohad Shamir	In this paper, we address the problem of predicting a time series using the ARMA (autoregressive moving average) model, under minimal assumptions on the noise terms.
15	Sharp analysis of low-rank kernel matrix approximations	Francis Bach	In this paper, we show that for approximations based on a random subset of columns of the original kernel matrix, the rank p may be chosen to be linear in the degrees of freedom associated with the problem, a quantity which is classically used in the statistical analysis of such methods, and is often seen as the implicit number of parameters of non-parametric estimators.
16	Beating Bandits in Gradually Evolving Worlds	Chao-Kai Chiang, Chia-Jung Lee, Chi-Jen Lu	To aim for smaller regrets, we adopt a relaxed two-point bandit setting in which the player can play two actions in each round and observe the loss values of those two actions.
17	Information Complexity in Bandit Subset Selection	Emilie Kaufmann, Shivaram Kalyanakrishnan	We consider the problem of efficiently exploring the arms of a stochastic bandit to identify the best subset.
18	Passive Learning with Target Risk	Mehrdad Mahdavi, Rong Jin	In this paper we consider learning in the passive setting but with a slight modification.
19	Blind Signal Separation in the Presence of Gaussian Noise	Mikhail Belkin, Luis Rademacher, James Voss	In this paper we propose a new algorithm for solving the blind signal separation problem in the presence of additive Gaussian noise, when we are given samples from X=AS+η, where η is drawn from an unknown, not necessarily spherical n-dimensional Gaussian distribution.
20	Active and passive learning of linear separators under log-concave distributions	Maria-Florina Balcan, Phil Long	Building on this, we provide a computationally efficient PAC algorithm with optimal (up to a constant factor) sample complexity for such problems.
21	Randomized partition trees for exact nearest neighbor search	Sanjoy Dasgupta, Kaushik Sinha	We analyze three such schemes.
22	Surrogate Regret Bounds for the Area Under the ROC Curve via Strongly Proper Losses	Shivani Agarwal	In this paper, we obtain such (non-pairwise) surrogate regret bounds for the AUC in terms of a broad class of proper (composite) losses that we term strongly proper.
23	Algorithms and Hardness for Robust Subspace Recovery	Moritz Hardt, Ankur Moitra	We consider a fundamental problem in unsupervised learning called subspace recovery: given a collection of m points in R^n, if many but not necessarily all of these points are contained in a d-dimensional subspace T, can we find it?
24	PLAL: Cluster-based active learning	Ruth Urner, Sharon Wulff, Shai Ben-David	We investigate the label complexity of active learning under some smoothness assumptions on the data-generating process. We propose a procedure, PLAL, for “activising” passive, sample-based learners.
25	Learning Using Local Membership Queries	Pranjal Awasthi, Vitaly Feldman, Varun Kanade	We introduce a new model of membership query (MQ) learning, where the learning algorithm is restricted to query points that are close to random examples drawn from the underlying distribution.
26	Sparse Adaptive Dirichlet-Multinomial-like Processes	Marcus Hutter	I derive an optimal adaptive choice for the main parameter via tight, data-dependent redundancy bounds for a related model.
27	Prediction by random-walk perturbation	Luc Devroye, Gábor Lugosi, Gergely Neu	We propose a version of the follow-the-perturbed-leader online prediction algorithm in which the cumulative losses are perturbed by independent symmetric random walks.
28	Approachability, fast and slow	Vianney Perchet, Shie Mannor	In this paper we provide a characterization for the convergence rates of approachability and show that in some cases a set can be approached with a 1/n rate.
29	Classification with Asymmetric Label Noise: Consistency and Maximal Denoising	Clayton Scott, Gilles Blanchard, Gregory Handy	We introduce a general framework for classification with label noise that eliminates these assumptions.
30	General Oracle Inequalities for Gibbs Posterior with Application to Ranking	Cheng Li, Wenxin Jiang, Martin Tanner	In this paper, we summarize some recent results in Li et al. (2012), which can be used to extend an important PAC-Bayesian approach, namely the Gibbs posterior, to study the nonadditive ranking risk.
31	Learning Halfspaces Under Log-Concave Densities: Polynomial Approximations and Moment Matching	Daniel Kane, Adam Klivans, Raghu Meka	We give the first polynomial-time algorithm for agnostically learning any function of a constant number of halfspaces with respect to any log-concave distribution (for any constant accuracy parameter).
32	Subspace Embeddings and \ell_p-Regression Using Exponential Random Variables	David Woodruff, Qin Zhang	If one is just interested in a poly(d) rather than a (1+ε)-approximation to \ell_p-regression, a corollary of our results is that for all p ∈ [1, ∞) we can solve the \ell_p-regression problem without using general convex programming, that is, since our subspace embeds into \ell_∞ it suffices to solve a linear programming problem.
33	Consistency of Robust Kernel Density Estimators	Robert Vandermeulen, Clayton Scott	In this paper we establish asymptotic L^1 consistency of the RKDE for a class of losses and show that the RKDE converges with the same rate on bandwidth required for the traditional KDE.
34	Divide and Conquer Kernel Ridge Regression	Yuchen Zhang, John Duchi, Martin Wainwright	We study a decomposition-based scalable approach to performing kernel ridge regression.
35	Regret Minimization for Branching Experts	Eyal Gofer, Nicolò Cesa-Bianchi, Claudio Gentile, Yishay Mansour	For this setting of branching experts, we give algorithms and analysis that cover both the full information and the bandit scenarios.
36	Horizon-Independent Optimal Prediction with Log-Loss in Exponential Families	Peter Bartlett, Peter Grünwald, Peter Harremoës, Fares Hedayati, Wojciech Kotlowski	We study online learning under logarithmic loss with regular parametric models.
37	Online Similarity Prediction of Networked Data from Known and Unknown Graphs	Claudio Gentile, Mark Herbster, Stephen Pasteris	We consider online similarity prediction problems over networked data.
38	A near-optimal algorithm for finite partial-monitoring games against adversarial opponents	Gábor Bartók	In this paper we present and analyze a new algorithm for locally observable partial monitoring games.
39	Representation, Approximation and Learning of Submodular Functions Using Low-rank Decision Trees	Vitaly Feldman, Pravesh Kothari, Jan Vondrák	We study the complexity of approximate representation and learning of submodular functions over the uniform distribution on the Boolean hypercube {0,1}^n.
40	A Tale of Two Metrics: Simultaneous Bounds on Competitiveness and Regret	Lachlan Andrew, Siddharth Barman, Katrina Ligett, Minghong Lin, Adam Meyerson, Alan Roytman, Adam Wierman	We consider algorithms for “smoothed online convex optimization” problems, a variant of the class of online convex optimization problems that is strongly related to metrical task systems.
41	Optimal Probability Estimation with Applications to Prediction and Classification	Jayadev Acharya, Ashkan Jafarpour, Alon Orlitsky, Ananda Theertha Suresh	Via a unified viewpoint of probability estimation, classification, and prediction, we derive a uniformly-optimal combined-probability estimator, construct a classifier that uniformly approaches the error of the best possible label-invariant classifier, and improve existing results on pattern prediction and compression.
42	Polynomial Time Optimal Query Algorithms for Finding Graphs with Arbitrary Real Weights	Sung-Soon Choi	In this paper, we achieve an ultimate goal of recent years for graph finding with the two types of queries, by constructing the first polynomial time algorithms with optimal query complexity for the general class of graphs with n vertices and at most m edges in which the weights of edges are arbitrary real numbers.
43	Differentially Private Feature Selection via Stability Arguments, and the Robustness of the Lasso	Abhradeep Guha Thakurta, Adam Smith	The algorithms we describe are efficient and in some cases match the optimal non-private asymptotic sample complexity.
44	Learning a set of directions	Wouter M. Koolen, Jiazhong Nie, Manfred Warmuth	We develop online algorithms for this type of problem.
45	A Tensor Spectral Approach to Learning Mixed Membership Community Models	Animashree Anandkumar, Rong Ge, Daniel Hsu, Sham Kakade	We propose a unified approach to learning these models via a tensor spectral decomposition method.
46	Adaptive Crowdsourcing Algorithms for the Bandit Survey Problem	Ittai Abraham, Omar Alonso, Vasilis Kandylas, Aleksandrs Slivkins	We present several algorithms for this problem, and support them with analysis and simulations. Our approach is based on our experience conducting relevance evaluation for a large commercial search engine.
47	Boosting with the Logistic Loss is Consistent	Matus Telgarsky	This manuscript provides optimization guarantees, generalization bounds, and statistical consistency results for AdaBoost variants which replace the exponential loss with the logistic and similar losses (specifically, twice differentiable convex losses which are Lipschitz and tend to zero on one side).
48	Competing With Strategies	Wei Han, Alexander Rakhlin, Karthik Sridharan	We study the problem of online learning with a notion of regret defined with respect to a set of strategies.
49	Online Learning with Predictable Sequences	Alexander Rakhlin, Karthik Sridharan	We present methods for online linear optimization that take advantage of benign (as opposed to worst-case) sequences.
50	Efficient Learning of Simplices	Joseph Anderson, Navin Goyal, Luis Rademacher	We show an efficient algorithm for the following problem: Given uniformly random points from an arbitrary n-dimensional simplex, estimate the simplex.
51	Complexity Theoretic Lower Bounds for Sparse Principal Component Detection	Quentin Berthet, Philippe Rigollet	We measure the performance of a test by the smallest signal strength that it can detect and we propose a computationally efficient method based on semidefinite programming.