Paper Digest: AISTATS 2014 Highlights

June 17, 2014 | June 18, 2020 | admin

Readers can also choose to read this highlight article on our console, which allows users to filter papers by keyword and find related papers.

The International Conference on Artificial Intelligence and Statistics (AISTATS) is an interdisciplinary gathering of researchers at the intersection of computer science, artificial intelligence, machine learning, statistics, and related areas.

To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights / summaries to quickly get the main idea of each paper.

If you do not want to miss any interesting academic paper, you are welcome to sign up for our free daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to stay updated on new conference digests.

Paper Digest Team
team@paperdigest.org

TABLE 1: AISTATS 2014 Papers

	Title	Authors	Highlight
1	Preface	Samuel Kaski, Jukka Corander	Preface
2	Decontamination of Mutually Contaminated Models	Gilles Blanchard, Clayton Scott	This work focuses on the problem of classification with multiclass label noise, in a general setting where the noise proportions are unknown and the true class distributions are nonseparable and potentially quite complex.
3	Distributed optimization of deeply nested systems	Miguel Carreira-Perpinan, Weiran Wang	We describe a general strategy to learn the parameters and, to some extent, the architecture of nested systems, which we call the method of auxiliary coordinates (MAC).
4	Analysis of Empirical MAP and Empirical Partially Bayes: Can They be Alternatives to Variational Bayes?	Shinichi Nakajima, Masashi Sugiyama	In this paper, we theoretically investigate the behavior of the MAP and the PB solutions of matrix factorization.
5	Improved Bounds for Online Learning Over the Permutahedron and Other Ranking Polytopes	Nir Ailon	We present an algorithm of expected regret O(n\sqrt{OPT} + n^2), where OPT is the loss of the best (single) ranking in hindsight.
6	Information-Theoretic Characterization of Sparse Recovery	Cem Aksoylar, Venkatesh Saligrama	We formulate sparse support recovery as a salient set identification problem and use information-theoretic analyses to characterize the recovery performance and sample complexity.
7	Hybrid Discriminative-Generative Approach with Gaussian Processes	Ricardo Andrade Pacheco, James Hensman, Max Zwiessele, Neil D. Lawrence	Here, we present a model based on a hybrid approach that breaks down some of the barriers between the discriminative and generative points of view, allowing continuous dimensionality reduction of hybrid discrete-continuous data, discriminative classification with missing inputs and manifold learning informed by class labels.
8	Average Case Analysis of High-Dimensional Block-Sparse Recovery and Regression for Arbitrary Designs	Waheed Bajwa, Marco Duarte, Robert Calderbank	Average Case Analysis of High-Dimensional Block-Sparse Recovery and Regression for Arbitrary Designs
9	A New Perspective on Learning Linear Separators with Large L_qL_p Margins	Maria-Florina Balcan, Christopher Berlind	We give theoretical and empirical results that provide new insights into large margin learning.
10	A Non-parametric Conditional Factor Regression Model for Multi-Dimensional Input and Response	Ava Bargi, Richard Yi Xu, Zoubin Ghahramani, Massimo Piccardi	In this paper, we propose a non-parametric conditional factor regression (NCFR) model for domains with multi-dimensional input and response.
11	Learning Optimal Bounded Treewidth Bayesian Networks via Maximum Satisfiability	Jeremias Berg, Matti Järvisalo, Brandon Malone	In this work, we develop a novel score-based approach to BTW-BNSL, based on casting BTW-BNSL as weighted partial Maximum satisfiability.
12	Online Passive-Aggressive Algorithms for Non-Negative Matrix Factorization and Completion	Mathieu Blondel, Yotaro Kubo, Ueda Naonori	In this paper, we present non-negative passive-aggressive (NN-PA), a family of online algorithms for non-negative matrix factorization (NMF).
13	PAC-Bayesian Theory for Transductive Learning	Luc Bégin, Pascal Germain, François Laviolette, Jean-Francis Roy	We propose a PAC-Bayesian analysis of the transductive learning setting, introduced by Vapnik [2008], by proposing a family of new bounds on the generalization error.
14	Random Bayesian networks with bounded indegree	Eunice Yuh-Jie Chen, Judea Pearl	In this paper, we propose a simple model for large random BNs with bounded indegree, that is, large directed acyclic graphs (DAG) where the edges appear at random and each node has at most a given number of parents.
15	Efficient Low-Rank Stochastic Gradient Descent Methods for Solving Semidefinite Programs	Jianhui Chen, Tianbao Yang, Shenghuo Zhu	We propose a low-rank stochastic gradient descent (LR-SGD) method for solving a class of semidefinite programming (SDP) problems.
16	Characterizing EVOI-Sufficient k-Response Query Sets in Decision Problems	Robert Cohn, Satinder Singh, Edmund Durfee	When the only constraint on what queries can be asked is that they have exactly k possible responses (with k \ge 2), we show that the set of k-response decision queries (which ask the user to select his/her preferred decision given a choice of k decisions) is EVOI-Sufficient, meaning that no single k-response query can have higher EVOI than the best single k-response decision query for any decision problem.
17	Doubly Aggressive Selective Sampling Algorithms for Classification	Koby Crammer	We introduce two stochastic linear algorithms and analyze them in the worst-case mistake-bound framework.
18	Sparse Bayesian Variable Selection for the Identification of Antigenic Variability in the Foot-and-Mouth Disease Virus	Vinny Davies, Richard Reeve, William Harvey, Francois Maree, Dirk Husmeier	Here we describe a novel sparse Bayesian variable selection model using spike and slab priors which is able to predict antigenic variability and identify sites which are important for the neutralisation of the virus.
19	Sparsity and the Truncated $l^2$-norm	Lee Dicker	In this paper, we study an alternative measure of sparsity, the truncated $l^2$-norm, which is related to other $l^p$-norms, but appears to have some unique and useful properties.
20	Efficient Distributed Topic Modeling with Provable Guarantees	Weicong Ding, Mohammad Rohban, Prakash Ishwar, Venkatesh Saligrama	We consider topic modeling under the separability assumption and develop novel computationally efficient methods that provably achieve the statistical performance of the state-of-the-art centralized approaches while requiring insignificant communication between the distributed document collections.
21	Pan-sharpening with a Bayesian nonparametric dictionary learning model	Xinghao Ding, Yiyong Jiang, Yue Huang, John Paisley	We present a new pan-sharpening algorithm that uses a Bayesian nonparametric dictionary learning model to give an underlying sparse representation for image reconstruction.
22	Approximate Slice Sampling for Bayesian Posterior Inference	Christopher DuBois, Anoop Korattikara, Max Welling, Padhraic Smyth	In this paper, we advance the theory of large scale Bayesian posterior inference by introducing a new approximate slice sampler that uses only small mini-batches of data in every iteration.
23	Bayesian Logistic Gaussian Process Models for Dynamic Networks	Daniele Durante, David Dunson	Motivated by an application to studying dynamic networks among sports teams, we propose a Bayesian nonparametric model.
24	Avoiding pathologies in very deep networks	David Duvenaud, Oren Rippel, Ryan Adams, Zoubin Ghahramani	We propose an alternate network architecture which does not suffer from this pathology.
25	Efficient Inference for Complex Queries on Complex Distributions	Lili Dworkin, Michael Kearns, Lirong Xia	We consider problems of approximate inference in which the query of interest is given by a complex formula (such as a formula in disjunctive normal form (DNF)) over a joint distribution given by a graphical model.
26	Bayesian Switching Interaction Analysis Under Uncertainty	Zoran Dzunic, John Fisher III	We introduce a Bayesian discrete-time framework for switching-interaction analysis under uncertainty, in which latent interactions, switching pattern and signal states and dynamics are inferred from noisy (and possibly missing) observations of these signals.
27	Robust learning of inhomogeneous PMMs	Ralf Eggeling, Teemu Roos, Petri Myllymäki, Ivo Grosse	In this work, we empirically investigate the performance of robust alternatives for structure and parameter learning that extend the practical applicability of inhomogeneous parsimonious Markov models to more complex settings than before.
28	Fully-Automatic Bayesian Piecewise Sparse Linear Models	Riki Eto, Ryohei Fujimaki, Satoshi Morinaga, Hiroshi Tamano	Our contributions are mainly three-fold.
29	Learning with Maximum A-Posteriori Perturbation Models	Andreea Gane, Tamir Hazan, Tommi Jaakkola	In this paper, we analyze, extend and seek to estimate such dependencies from data.
30	Sketching the Support of a Probability Measure	Joachim Giesen, Soeren Laue, Lars Kuehne	Here we propose to sketch the support of the probability measure (that does not need to be a manifold) by some gradient flow complex, or more precisely by its Hasse diagram.
31	Robust Stochastic Principal Component Analysis	John Goes, Teng Zhang, Raman Arora, Gilad Lerman	We introduce three novel stochastic approximation algorithms for robust PCA that are extensions of standard algorithms for PCA – the stochastic power method, incremental PCA and online PCA using matrix-exponentiated-gradient (MEG) updates.
32	Bayesian Nonparametric Poisson Factorization for Recommendation Systems	Prem Gopalan, Francisco J. Ruiz, Rajesh Ranganath, David Blei	We develop a Bayesian nonparametric Poisson factorization model for recommendation systems.
33	Efficiently Enforcing Diversity in Multi-Output Structured Prediction	Abner Guzman-Rivera, Pushmeet Kohli, Dhruv Batra, Rob Rutenbar	This paper proposes a novel method for efficiently generating multiple diverse predictions for structured prediction problems.
34	Learning and Evaluation in Presence of Non-i.i.d. Label Noise	Nico Görnitz, Anne Porbadnigk, Alexander Binder, Claudia Sannelli, Mikio Braun, Klaus-Robert Mueller, Marius Kloft	In this paper, we present a novel methodology for learning and evaluation in presence of systematic label noise.
35	Analytic Long-Term Forecasting with Periodic Gaussian Processes	Nooshin HajiGhassemi, Marc Deisenroth	Gaussian processes are a state-of-the-art method for learning models from data.
36	On Estimating Causal Effects based on Supplemental Variables	Takahiro Hayashi, Manabu Kuroki	In this paper, we consider the situation where a treatment is associated with a response through a set of supplementary variables in both linear and discrete models.
37	Non-Asymptotic Analysis of Relational Learning with One Network	Peng He, Changshui Zhang	We propose a novel combinational approach to analyze complex dependencies of relational data, which is crucial to our non-asymptotic analysis.
38	Exploiting the Limits of Structure Learning via Inherent Symmetry	Peng He, Changshui Zhang	This theoretical paper is concerned with the structure learning limit for Gaussian Markov random fields from i.i.d. samples.
39	A Statistical Model for Event Sequence Data	Kevin Heins, Hal Stern	In this paper, we consider a general probabilistic framework for identifying such patterns, by distinguishing between events that belong to a pattern and events that occur as part of background processes.
40	Probabilistic Solutions to Differential Equations and their Application to Riemannian Statistics	Philipp Hennig, Søren Hauberg	We study a probabilistic numerical method for the solution of both boundary and initial value problems that returns a joint Gaussian process posterior over the solution.
41	Tilted Variational Bayes	James Hensman, Max Zwiessele, Neil Lawrence	We present a novel method for approximate inference.
42	On correlation and budget constraints in model-based bandit optimization with application to automatic machine learning	Matthew Hoffman, Bobak Shahriari, Nando de Freitas	We introduce a Bayesian approach for this problem and show that it empirically outperforms both the existing frequentist counterpart and other Bayesian optimization methods.
43	Optimality of Thompson Sampling for Gaussian Bandits Depends on Priors	Junya Honda, Akimichi Takemura	In this paper we discuss the optimality of TS for the model of normal distributions with unknown means and variances as one of the most fundamental examples of multiparameter models.
44	Tight Bounds for the Expected Risk of Linear Classifiers and PAC-Bayes Finite-Sample Guarantees	Jean Honorio, Tommi Jaakkola	We analyze the expected risk of linear classifiers for a fixed weight vector in the “minimax” setting.
45	Latent Gaussian Models for Topic Modeling	Changwei Hu, Eunsu Ryu, David Carlson, Yingjian Wang, Lawrence Carin	A new approach is proposed for topic modeling, in which the latent matrix factorization employs Gaussian priors, rather than the Dirichlet-class priors widely used in such models.
46	A Finite-Sample Generalization Bound for Semiparametric Regression: Partially Linear Models	Ruitong Huang, Csaba Szepesvari	In this paper we provide generalization bounds for semiparametric regression with the so-called partially linear models where the regression function is written as the sum of a linear parametric and a nonlinear, nonparametric function, the latter taken from some set \mathcal{H} with finite entropy-integral.
47	Global Optimization Methods for Extended Fisher Discriminant Analysis	Satoru Iwata, Yuji Nakatsukasa, Akiko Takeda	A parametrized extension, which we call the extended FDA, has been introduced from the viewpoint of robust optimization.
48	High-Dimensional Density Ratio Estimation with Extensions to Approximate Likelihood Computation	Rafael Izbicki, Ann Lee, Chad Schafer	Here we propose a simple-to-implement, fully nonparametric density ratio estimator that expands the ratio in terms of the eigenfunctions of a kernel-based operator; these functions reflect the underlying geometry of the data (e.g., submanifold structure), often leading to better estimates without an explicit dimension reduction step.
49	Near Optimal Bayesian Active Learning for Decision Making	Shervin Javdani, Yuxin Chen, Amin Karbasi, Andreas Krause, Drew Bagnell, Siddhartha Srinivasa	Our goal is to drive uncertainty into a single decision region as quickly as possible.
50	A Level-set Hit-and-run Sampler for Quasi-Concave Distributions	Shane Jensen, Dean Foster	We develop a new sampling strategy that uses the hit-and-run algorithm within level sets of a target density.
51	New Bounds on Compressive Linear Least Squares Regression	Ata Kaban	In this paper we provide a new analysis of compressive least squares regression that removes a spurious log N factor from previous bounds, where N is the number of training points.
52	Recovering Distributions from Gaussian RKHS Embeddings	Motonobu Kanagawa, Kenji Fukumizu	In this paper, we consider the recovery of the information of a distribution from an estimate of the kernel mean, when a Gaussian kernel is used.
53	Collaborative Ranking for Local Preferences	Berk Kapicioglu, David Rosenberg, Robert Schapire, Tony Jebara	To address this, we introduce a matrix factorization framework called Collaborative Local Ranking (CLR).
54	Scalable Collaborative Bayesian Preference Learning	Mohammad Emtiyaz Khan, Young Jun Ko, Matthias Seeger	To simplify the difficulty, we present a novel expectation maximization algorithm, driven by expectation propagation approximate inference, which scales to very large datasets without requiring strong factorization assumptions.
55	A Gaussian Latent Variable Model for Large Margin Classification of Labeled and Unlabeled Data	Do-kyum Kim, Matthew Der, Lawrence Saul	We investigate a Gaussian latent variable model for semi-supervised learning of linear large margin classifiers.
56	Scalable Variational Bayesian Matrix Factorization with Side Information	Yong-Deok Kim, Seungjin Choi	In this paper, we present a scalable inference for VBMF with side information, the complexity of which is linear in the rank K of factor matrices.
57	Algebraic Reconstruction Bounds and Explicit Inversion for Phase Retrieval at the Identifiability Threshold	Franz Király, Martin Ehler	We study phase retrieval from magnitude measurements of an unknown signal as an algebraic estimation problem.
58	Visual Boundary Prediction: A Deep Neural Prediction Network and Quality Dissection	Jyri Kivinen, Chris Williams, Nicolas Heess	This paper investigates visual boundary detection, i.e. prediction of the presence of a boundary at a given image location.
59	Low-Rank Spectral Learning	Alex Kulesza, N. Raj Rao, Satinder Singh	Spectral learning methods have recently been proposed as alternatives to slow, non-convex optimization algorithms like EM for a variety of probabilistic models in which hidden information must be inferred by the learner.
60	Fugue: Slow-Worker-Agnostic Distributed Learning for Big Models on Big Data	Abhimanu Kumar, Alex Beutel, Qirong Ho, Eric Xing	We present a scheme for fast, distributed learning on big (i.e. high-dimensional) models applied to big datasets.
61	Computational Education using Latent Structured Prediction	Tanja Käser, Alexander Schwing, Tamir Hazan, Markus Gross	For interpretability we propose to constrain the parameter space a-priori by leveraging domain knowledge.
62	Towards building a Crowd-Sourced Sky Map	Dustin Lang, David Hogg, Bernhard Schölkopf	We describe a system that builds a high dynamic-range and wide-angle image of the night sky by combining a large set of input images.
63	Incremental Tree-Based Inference with Dependent Normalized Random Measures	Juho Lee, Seungjin Choi	In this paper, we present a tree-based inference method for MNRM mixture models, extending Bayesian hierarchical clustering (BHC) which was originally developed as a deterministic approximate inference for Dirichlet process mixture (DPM) models.
64	Jointly Informative Feature Selection	Leonidas Lefakis, Francois Fleuret	We propose several novel criteria for the selection of groups of jointly informative continuous features in the context of classification.
65	Learning Heterogeneous Hidden Markov Random Fields	Jie Liu, Chunming Zhang, Elizabeth Burnside, David Page	We formally define heterogeneous HMRFs and propose an EM algorithm whose M-step combines a contrastive divergence learner with a kernel smoothing step to incorporate the background knowledge.
66	PAC-Bayesian Collective Stability	Ben London, Bert Huang, Ben Taskar, Lise Getoor	We investigate whether weaker definitions of collective stability suffice.
67	Active Area Search via Bayesian Quadrature	Yifei Ma, Roman Garnett, Jeff Schneider	In this paper we combine these two concepts and consider the problem of choosing data collection locations when the goal is to identify regions whose set of collected data would be labeled positively by a set classifier.
68	Active Boundary Annotation using Random MAP Perturbations	Subhransu Maji, Tamir Hazan, Tommi Jaakkola	As an example of our framework we propose a boundary refinement task which can be used to obtain pixel-accurate image boundaries much faster than traditional tools by focussing on parts of the image for refinement in a multi-scale manner.
69	Interpretable Sparse High-Order Boltzmann Machines	Martin Renqiang Min, Xia Ning, Chao Cheng, Mark Gerstein	In this paper, we propose an efficient approach for learning a fully observable high-order Boltzmann Machine based on sparse learning and contrastive divergence, resulting in an interpretable Sparse High-order Boltzmann Machine, denoted as SHBM.
70	Efficient Lifting of MAP LP Relaxations Using k-Locality	Martin Mladenov, Kristian Kersting, Amir Globerson	Such models often exhibit considerable symmetry, and it is a challenge to devise algorithms that exploit this symmetry to speed up inference.
71	A Geometric Algorithm for Scalable Multiple Kernel Learning	John Moeller, Parasaran Raman, Suresh Venkatasubramanian, Avishek Saha	We present a geometric formulation of the Multiple Kernel Learning (MKL) problem.
72	On the Testability of Models with Missing Data	Karthika Mohan, Judea Pearl	We present sufficient conditions for testability in missing data applications and note the impediments for testability when data are contaminated by missing entries.
73	Selective Sampling with Drift	Edward Moroshko, Koby Crammer	We develop a novel selective sampling algorithm for the drifting setting, analyze it under no assumptions on the mechanism generating the sequence of instances, and derive new mistake bounds that depend on the amount of drift in the problem.
74	The Dependent Dirichlet Process Mixture of Objects for Detection-free Tracking and Object Modeling	Willie Neiswanger, Frank Wood, Eric Xing	We present a model that localizes objects via unsupervised tracking while learning a representation of each object, avoiding the need for pre-built detectors.
75	Bias Reduction and Metric Learning for Nearest-Neighbor Estimation of Kullback-Leibler Divergence	Yung-Kyun Noh, Masashi Sugiyama, Song Liu, Marthinus C. Plessis, Frank Chongwoo Park, Daniel D. Lee	In this paper, we show that this non-local bias can be mitigated by changing the distance metric, and we propose a method for learning an optimal Mahalanobis-type metric based on global information provided by approximate parametric models of the underlying densities.
76	Robust Forward Algorithms via PAC-Bayes and Laplace Distributions	Asaf Noy, Koby Crammer	We introduce new learning algorithms that minimize objectives derived directly from PAC-Bayes bounds, incorporating Laplace distributions.
77	Joint Structure Learning of Multiple Non-Exchangeable Networks	Chris Oates, Sach Mukherjee	Here we present a novel Bayesian formulation that generalises joint structure learning beyond the exchangeable case.
78	Scaling Nonparametric Bayesian Inference via Subsample-Annealing	Fritz Obermeyer, Jonathan Glidden, Eric Jonas	We describe an adaptation of the simulated annealing algorithm to nonparametric clustering and related probabilistic models.
79	Fast Distribution To Real Regression	Junier Oliva, Willie Neiswanger, Barnabas Poczos, Jeff Schneider, Eric Xing	We study the problem of distribution to real regression, where one aims to regress a mapping f that takes in a distribution input covariate P∈\mathcal{I} (for a non-parametric family of distributions \mathcal{I}) and outputs a real-valued response Y=f(P) + ε.
80	FuSSO: Functional Shrinkage and Selection Operator	Junier Oliva, Barnabas Poczos, Timothy Verstynen, Aarti Singh, Jeff Schneider, Fang-Cheng Yeh, Wen-Yih Tseng	We present the FuSSO, a functional analogue to the LASSO, that efficiently finds a sparse set of functional input covariates to regress a real-valued response against.
81	To go deep or wide in learning?	Gaurav Pandey, Ambedkar Dukkipati	In this paper, we propose an approach called wide learning based on arc-cosine kernels, that learns a single layer of infinite width.
82	LAMORE: A Stable, Scalable Approach to Latent Vector Autoregressive Modeling of Categorical Time Series	Yubin Park, Carlos Carvalho, Joydeep Ghosh	This paper proposes two auxiliary techniques that help stabilize and calibrate the estimated parameters.
83	Spoofing Large Probability Mass Functions to Improve Sampling Times and Reduce Memory Costs	Jon Parker, Hans Engler	This paper presents a novel lossy compression method intended for large (O(10^5)) dense PMFs that speeds up the sampling process and guarantees high fidelity sampling.
84	Learning Bounded Tree-width Bayesian Networks using Integer Linear Programming	Pekka Parviainen, Hossein Shahrabi Farahani, Jens Lagergren	Since the inference problem is common in many application areas, we provide a practical algorithm for learning bounded tree-width Bayesian networks.
85	An Efficient Algorithm for Large Scale Compressive Feature Learning	Hristo Paskov, John Mitchell, Trevor Hastie	This paper focuses on large-scale unsupervised feature selection from text.
86	Expectation Propagation for Likelihoods Depending on an Inner Product of Two Multivariate Random Variables	Tomi Peltola, Pasi Jylänki, Aki Vehtari	We describe how a deterministic Gaussian posterior approximation can be constructed using expectation propagation (EP) for models, where the likelihood function depends on an inner product of two multivariate random variables.
87	An inclusion optimal algorithm for chain graph structure learning	Jose Peña, Dag Sonntag, Jens Nielsen	This paper presents and proves an extension of Meek’s conjecture to chain graphs under the Lauritzen-Wermuth-Frydenberg interpretation.
88	A Stepwise uncertainty reduction approach to constrained global optimization	Victor Picheny	We propose here a new optimization strategy based on the stepwise uncertainty reduction paradigm, which offers an efficient trade-off between exploration and local search near the boundaries.
89	Connected Sub-graph Detection	Jing Qian, Venkatesh Saligrama, Yuting Chen	For concreteness we consider the connected sub-graph detection problem that arises in a number of applications including network intrusion, disease outbreaks, and video surveillance.
90	An Analysis of Active Learning with Uniform Feature Noise	Aaditya Ramdas, Barnabas Poczos, Aarti Singh, Larry Wasserman	In this paper, we consider the effect of feature noise in active learning, which could arise either because X itself is being measured, or it is corrupted in transmission to the oracle, or the oracle returns the label of a noisy version of the query point.
91	Black Box Variational Inference	Rajesh Ranganath, Sean Gerrish, David Blei	In this paper, we present a “black box” variational inference algorithm, one that can be quickly applied to many models with little additional derivation.
92	Cluster Canonical Correlation Analysis	Nikhil Rasiwasia, Dhruv Mahajan, Vijay Mahadevan, Gaurav Aggarwal	In this paper we present cluster canonical correlation analysis (cluster-CCA) for joint dimensionality reduction of two sets of data points.
93	Sequential crowdsourced labeling as an epsilon-greedy exploration in a Markov Decision Process	Vikas Raykar, Priyanka Agrawal	With the goal of reducing the labeling cost, we introduce the notion of sequential crowdsourced labeling, where instead of asking for all the labels in one shot we acquire labels from annotators sequentially one at a time.
94	Learning Structured Models with the AUC Loss and Its Generalizations	Nir Rosenfeld, Ofer Meshi, Danny Tarlow, Amir Globerson	In this work, we propose a representation and learning formulation for optimizing structured models over the AUC loss, show how our approach generalizes the unstructured case, and provide algorithms for solving the resulting inference and learning problems.
95	Class Proportion Estimation with Application to Multiclass Anomaly Rejection	Tyler Sanderson, Clayton Scott	This work addresses two classification problems that fall under the heading of domain adaptation, wherein the distributions of training and testing examples differ.
96	Lifted MAP Inference for Markov Logic Networks	Somdeb Sarkhel, Deepak Venugopal, Parag Singla, Vibhav Gogate	In this paper, we present a new approach for lifted MAP inference in Markov Logic Networks (MLNs).
97	Estimating Dependency Structures for non-Gaussian Components with Linear and Energy Correlations	Hiroaki Sasaki, Michael Gutmann, Hayaru Shouno, Aapo Hyvarinen	In this paper, we propose a probabilistic model of non-Gaussian components which are allowed to have both linear and energy correlations.
98	Student-t Processes as Alternatives to Gaussian Processes	Amar Shah, Andrew Wilson, Zoubin Ghahramani	We investigate the Student-t process as an alternative to the Gaussian process as a nonparametric prior over functions.
99	In Defense of Minhash over Simhash	Anshumali Shrivastava, Ping Li	In this study, we provide a theoretical answer (validated by experiments) that MinHash virtually always outperforms SimHash when the data are binary, as common in practice such as search.
100	Loopy Belief Propagation in the Presence of Determinism	David Smith, Vibhav Gogate	In this paper, we propose a new method for remedying this problem.
101	Explicit Link Between Periodic Covariance Functions and State Space Models	Arno Solin, Simo Särkkä	This paper shows how periodic covariance functions in Gaussian process regression can be reformulated as state space models, which can be solved with classical Kalman filtering theory.
102	Bat Call Identification with Gaussian Process Multinomial Probit Regression and a Dynamic Time Warping Kernel	Vassilios Stathopoulos, Veronica Zamora-Gutierrez, Kate Jones, Mark Girolami	We study the problem of identifying bat species from echolocation calls in order to build automated bioacoustic monitoring algorithms.
103	SMERED: A Bayesian Approach to Graphical Record Linkage and De-duplication	Rebecca Steorts, Rob Hall, Stephen Fienberg	We propose a novel unsupervised approach for linking records across arbitrarily many files, while simultaneously detecting duplicate records within files.
104	Adaptive Variable Clustering in Gaussian Graphical Models	Siqi Sun, Yuancheng Zhu, Jinbo Xu	We present a novel nonparametric Bayesian generative model for such a block-structured GGM and an efficient inference algorithm to find the clustering of variables in this GGM by combining a Gibbs sampler and a split-merge Metropolis-Hastings algorithm.
105	Scaling Graph-based Semi Supervised Learning to Large Number of Labels Using Count-Min Sketch	Partha Talukdar, William Cohen	In this paper, we propose MAD-Sketch, a novel graph-based SSL algorithm which compactly stores label distribution on each node using Count-min Sketch, a randomized data structure.
106	Path Thresholding: Asymptotically Tuning-Free High-Dimensional Sparse Regression	Divyanshu Vats, Richard Baraniuk	In this paper, we address the challenging problem of selecting tuning parameters for high-dimensional sparse regression.
107	Active Learning for Undirected Graphical Model Selection	Divyanshu Vats, Robert Nowak, Richard Baraniuk	We propose an active learning algorithm that uses junction tree representations to adapt future measurements based on the information gathered from prior measurements.
108	Linear-time training of nonlinear low-dimensional embeddings	Max Vladymyrov, Miguel Carreira-Perpinan	We address this bottleneck by formulating the optimization as an N-body problem and using fast multipole methods (FMMs) to approximate the gradient in linear time.
109	Gaussian Copula Precision Estimation with Missing Values	Huahua Wang, Farideh Fazayeli, Soumyadeep Chatterjee, Arindam Banerjee	In this paper, we propose double plugin Gaussian (DoPinG) copula estimators to estimate the sparse precision matrix corresponding to non-paranormal distributions.
110	An LP for Sequential Learning Under Budgets	Joseph Wang, Kirill Trapeznikov, Venkatesh Saligrama	We present a convex framework to learn sequential decisions and apply this to the problem of learning under a budget.
111	Efficient Algorithms and Error Analysis for the Modified Nyström Method	Shusen Wang, Zhihua Zhang	In this paper, we propose two algorithms that make the modified Nyström method practical.
112	Bayesian Multi-Scale Optimistic Optimization	Ziyu Wang, Babak Shakibi, Lin Jin, Nando de Freitas	In this paper, we introduce a new technique for efficient global optimization that combines Gaussian process confidence bounds and treed simultaneous optimistic optimization to eliminate the need for auxiliary optimization of acquisition functions.
113	Accelerating ABC methods using Gaussian processes	Richard Wilkinson	We introduce Gaussian process (GP) accelerated ABC, which we show can significantly reduce the number of simulations required.
114	A New Approach to Probabilistic Programming Inference	Frank Wood, Jan Willem Meent, Vikash Mansinghka	We introduce and demonstrate a new approach to inference in expressive probabilistic programming languages based on particle Markov chain Monte Carlo.
115	Dynamic Resource Allocation for Optimizing Population Diffusion	Shan Xue, Alan Fern, Daniel Sheldon	The main contribution of this paper is to design and evaluate an online planner for this problem based on Hindsight Optimization (HOP), a technique that has shown promise in other stochastic planning problems.
116	Mixed Graphical Models via Exponential Families	Eunho Yang, Yulia Baker, Pradeep Ravikumar, Genevera Allen, Zhandong Liu	We study several instances of our model, and propose scalable M-estimators for recovering the underlying network structure.
117	Context Aware Group Nearest Shrunken Centroids in Large-Scale Genomic Studies	Juemin Yang, Fang Han, Rafael Irizarry, Han Liu	We have devised an approach to phenotype classification from gene expression profiling.
118	Nonparametric estimation and testing of exchangeable graph models	Justin Yang, Christina Han, Edoardo Airoldi	We propose a 3-step procedure to estimate the canonical graphon of any ExGM that satisfies these conditions.
119	Generating Efficient MCMC Kernels from Probabilistic Programs	Lingfeng Yang, Patrick Hanrahan, Noah Goodman	We present a technique that recovers hand-coded levels of performance from a universal probabilistic language, for the Metropolis-Hastings (MH) MCMC inference algorithm.
120	Efficient Transfer Learning Method for Automatic Hyperparameter Tuning	Dani Yogatama, Gideon Mann	We propose a fast and effective algorithm for automatic hyperparameter tuning that can generalize across datasets.
121	Accelerated Stochastic Gradient Method for Composite Regularization	Wenliang Zhong, James Kwok	In this paper, we propose a novel extension with accelerated gradient method for stochastic optimization.
122	Heterogeneous Domain Adaptation for Multiple Classes	Joey Tianyi Zhou, Ivor W.Tsang, Sinno Jialin Pan, Mingkui Tan	In this paper, we present an efficient Multi-class Heterogeneous Domain Adaptation (HDA) method, where data from the source and target domains are represented by heterogeneous features with different dimensions.