Paper Digest: ICML 2022 Highlights
The International Conference on Machine Learning (ICML) is one of the top machine learning conferences in the world. In 2022, it is being held in Baltimore, US.
To help the community quickly catch up on the work presented at this conference, the Paper Digest team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights / summaries to quickly get the main idea of each paper.
Based in New York, Paper Digest is dedicated to helping people generate content and reason over unstructured data. Different from black-box approaches, we build deep models on semantics, which allows results to be produced with explanations. Such models power this website, and are behind our services including "search engine", "summarization", "question answering", and "literature review".
If you do not want to miss interesting academic papers, you are welcome to sign up for our daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to stay updated with new conference digests.
Paper Digest Team
New York City, New York, 10017
team@paperdigest.org
TABLE 1: Paper Digest: ICML 2022 Highlights
# | Paper | Author(s)
---|---|---
1 | PAC-Bayesian Bounds on Rate-Efficient Classifiers. Highlight: We derive analytic bounds on the noise invariance of majority vote classifiers operating on compressed inputs. | Alhabib Abbas; Yiannis Andreopoulos
2 | Sharp-MAML: Sharpness-Aware Model-Agnostic Meta Learning. Highlight: Specifically, the loss landscape of MAML is much more complex, with possibly many more saddle points and local minima than its empirical risk minimization counterpart. To address this challenge, we leverage the recently invented sharpness-aware minimization and develop a sharpness-aware MAML approach that we term Sharp-MAML. | Momin Abbas; Quan Xiao; Lisha Chen; Pin-Yu Chen; Tianyi Chen
3 | An Initial Alignment Between Neural Network and Target Is Needed for Gradient Descent to Learn. Highlight: This paper introduces the notion of "Initial Alignment" (INAL) between a neural network at initialization and a target function. | Emmanuel Abbe; Elisabetta Cornacchia; Jan Hazla; Christopher Marquis
4 | Active Sampling for Min-Max Fairness. Highlight: We propose simple active sampling and reweighting strategies for optimizing min-max fairness that can be applied to any classification or regression model learned via loss minimization. | Jacob D Abernethy; Pranjal Awasthi; Matthäus Kleindessner; Jamie Morgenstern; Chris Russell; Jie Zhang
5 | Meaningfully Debugging Model Mistakes Using Conceptual Counterfactual Explanations. Highlight: In this paper, we propose a systematic approach, conceptual counterfactual explanations (CCE), that explains why a classifier makes a mistake on a particular test sample(s) in terms of human-understandable concepts (e.g. this zebra is misclassified as a dog because of faint stripes). | Abubakar Abid; Mert Yuksekgonul; James Zou
6 | Batched Dueling Bandits. Highlight: We study the batched K-armed dueling bandit problem under two standard settings: (i) existence of a Condorcet winner, and (ii) strong stochastic transitivity and stochastic triangle inequality. | Arpit Agarwal; Rohan Ghuge; Viswanath Nagarajan
7 | Hierarchical Shrinkage: Improving The Accuracy and Interpretability of Tree-based Models. Highlight: We introduce Hierarchical Shrinkage (HS), a post-hoc algorithm which regularizes the tree not by altering its structure, but by shrinking the prediction over each leaf toward the sample means over each of its ancestors, with weights depending on a single regularization parameter and the number of samples in each ancestor. | Abhineet Agarwal; Yan Shuo Tan; Omer Ronen; Chandan Singh; Bin Yu
8 | Deep Equilibrium Networks Are Sensitive to Initialization Statistics. Highlight: We show that DEQs are sensitive to the higher order statistics of the matrix families from which they are initialized. | Atish Agarwala; Samuel S Schoenholz
9 | Learning of Cluster-based Feature Importance for Electronic Health Record Time-series. Highlight: We propose a supervised deep learning model to cluster EHR data based on the identification of clinically understandable phenotypes with regard to both outcome prediction and patient trajectory. | Henrique Aguiar; Mauro Santos; Peter Watkinson; Tingting Zhu
10 | On The Convergence of The Shapley Value in Parametric Bayesian Learning Games. Highlight: In this paper, we establish the convergence property of the Shapley value in parametric Bayesian learning games where players perform a Bayesian inference using their combined data, and the posterior-prior KL divergence is used as the characteristic function. | Lucas Agussurja; Xinyi Xu; Bryan Kian Hsiang Low
11 | Individual Preference Stability for Clustering. Highlight: In this paper, we propose a natural notion of individual preference (IP) stability for clustering, which asks that every data point, on average, is closer to the points in its own cluster than to the points in any other cluster. | Saba Ahmadi; Pranjal Awasthi; Samir Khuller; Matthäus Kleindessner; Jamie Morgenstern; Pattara Sukprasert; Ali Vakilian
12 | Understanding The Unstable Convergence of Gradient Descent. Highlight: However, many works have observed that in machine learning applications step sizes often do not fulfill this condition, yet (stochastic) gradient descent still converges, albeit in an unstable manner. We investigate this unstable convergence phenomenon from first principles, and discuss key causes behind it. | Kwangjun Ahn; Jingzhao Zhang; Suvrit Sra
13 | Minimum Cost Intervention Design for Causal Effect Identification. Highlight: In this work, we consider the problem of designing the collection of interventions with the minimum cost to identify the desired effect. | Sina Akbari; Jalal Etesami; Negar Kiyavash
14 | How Faithful Is Your Synthetic Data? Sample-level Metrics for Evaluating and Auditing Generative Models. Highlight: In this paper, we introduce a 3-dimensional evaluation metric, ($\alpha$-Precision, $\beta$-Recall, Authenticity), that characterizes the fidelity, diversity and generalization performance of any generative model in a domain-agnostic fashion. | Ahmed Alaa; Boris Van Breugel; Evgeny S. Saveliev; Mihaela van der Schaar
15 | A Natural Actor-Critic Framework for Zero-Sum Markov Games. Highlight: We introduce algorithms based on natural actor-critic and analyze their sample complexity for solving two player zero-sum Markov games in the tabular case. | Ahmet Alacaoglu; Luca Viano; Niao He; Volkan Cevher
16 | Deploying Convolutional Networks on Untrusted Platforms Using 2D Holographic Reduced Representations. Highlight: By leveraging Holographic Reduced Representations (HRRs), we create a neural network with a pseudo-encryption style defense that empirically shows robustness to attack, even under threat models that unrealistically favor the adversary. | Mohammad Mahmudul Alam; Edward Raff; Tim Oates; James Holt
17 | Optimistic Linear Support and Successor Features As A Basis for Optimal Policy Transfer. Highlight: However, the identified solutions are not guaranteed to be optimal. We introduce a novel algorithm that addresses this limitation. | Lucas Nunes Alegre; Ana Bazzan; Bruno C. Da Silva
18 | Structured Stochastic Gradient MCMC. Highlight: Unfortunately, VI makes strong assumptions on both the factorization and functional form of the posterior. To relax these assumptions, this work proposes a new non-parametric variational inference scheme that combines ideas from both SGMCMC and coordinate-ascent VI. | Antonios Alexos; Alex J Boyd; Stephan Mandt
19 | XAI for Transformers: Better Explanations Through Conservative Propagation. Highlight: We show that the gradient in a Transformer reflects the function only locally, and thus fails to reliably identify the contribution of input features to the prediction. We identify Attention Heads and LayerNorm as main reasons for such unreliable explanations and propose a more stable way for propagation through these layers. | Ameen Ali; Thomas Schnake; Oliver Eberle; Grégoire Montavon; Klaus-Robert Müller; Lior Wolf
20 | RUMs from Head-to-Head Contests. Highlight: In this paper, we focus on slates of size two representing head-to-head contests. | Matteo Almanza; Flavio Chierichetti; Ravi Kumar; Alessandro Panconesi; Andrew Tomkins
21 | Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval. Highlight: In this paper, we present RetoMaton – retrieval automaton – which approximates the datastore search, based on (1) saving pointers between consecutive datastore entries, and (2) clustering of entries into "states". | Uri Alon; Frank Xu; Junxian He; Sudipta Sengupta; Dan Roth; Graham Neubig
22 | Minimax Classification Under Concept Drift with Multidimensional Adaptation and Performance Guarantees. Highlight: This paper presents adaptive minimax risk classifiers (AMRCs) that account for multidimensional time changes by means of a multivariate and high-order tracking of the time-varying underlying distribution. | Verónica Álvarez; Santiago Mazuelas; Jose A Lozano
23 | Scalable First-Order Bayesian Optimization Via Structured Automatic Differentiation. Highlight: Here, we observe that a wide range of kernels gives rise to structured matrices, enabling an exact $O(n^2d)$ matrix-vector multiply for gradient observations and $O(n^2d^2)$ for Hessian observations. Beyond canonical kernel classes, we derive a programmatic approach to leveraging this type of structure for transformations and combinations of the discussed kernel classes, which constitutes a structure-aware automatic differentiation algorithm. | Sebastian E Ament; Carla P Gomes
24 | Public Data-Assisted Mirror Descent for Private Model Training. Highlight: In this paper, we revisit the problem of using in-distribution public data to improve the privacy/utility trade-offs for differentially private (DP) model training. | Ehsan Amid; Arun Ganesh; Rajiv Mathews; Swaroop Ramaswamy; Shuang Song; Thomas Steinke; Vinith M Suriyakumar; Om Thakkar; Abhradeep Thakurta
25 | On Last-Iterate Convergence Beyond Zero-Sum Games. Highlight: In this paper we provide new results and techniques that apply to broader families of games and learning dynamics. | Ioannis Anagnostides; Ioannis Panageas; Gabriele Farina; Tuomas Sandholm
26 | Online Algorithms with Multiple Predictions. Highlight: We give a generic algorithmic framework for online covering problems with multiple predictions that obtains an online solution that is competitive against the performance of the best solution obtained from the predictions. | Keerti Anand; Rong Ge; Amit Kumar; Debmalya Panigrahi
27 | Learning to Hash Robustly, Guaranteed. Highlight: In this paper, we design an NNS algorithm for the Hamming space that has worst-case guarantees essentially matching that of theoretical algorithms, while optimizing the hashing to the structure of the dataset (think instance-optimal algorithms) for performance on the minimum-performing query. | Alexandr Andoni; Daniel Beaglehole
28 | Set Based Stochastic Subsampling. Highlight: Deep models are designed to operate on huge volumes of high dimensional data such as images. In order to reduce the volume of data these models must process, we propose a set-based two-stage end-to-end neural subsampling model that is jointly optimized with an arbitrary downstream task network (e.g. classifier). | Bruno Andreis; Seanie Lee; A. Tuan Nguyen; Juho Lee; Eunho Yang; Sung Ju Hwang
29 | Towards Understanding Sharpness-Aware Minimization. Highlight: We argue that the existing justifications for the success of SAM which are based on a PAC-Bayes generalization bound and the idea of convergence to flat minima are incomplete. | Maksym Andriushchenko; Nicolas Flammarion
30 | Fair and Fast K-Center Clustering for Data Summarization. Highlight: We consider two key issues faced by many clustering methods when used for data summarization, namely (a) an unfair representation of "demographic groups" and (b) distorted summarizations, where data points in the summary represent subsets of the original data of vastly different sizes. | Haris Angelidakis; Adam Kurpisz; Leon Sering; Rico Zenklusen
31 | Interactive Correlation Clustering with Existential Cluster Constraints. Highlight: In this paper, we introduce existential cluster constraints: a new form of feedback where users indicate the features of desired clusters. | Rico Angell; Nicholas Monath; Nishant Yadav; Andrew Mccallum
32 | Image-to-Image Regression with Distribution-Free Uncertainty Quantification and Applications in Imaging. Highlight: Current algorithms, however, do not generally offer statistical guarantees that protect against a model's mistakes and hallucinations. To address this, we develop uncertainty quantification techniques with rigorous statistical guarantees for image-to-image regression problems. | Anastasios N Angelopoulos; Amit Pal Kohli; Stephen Bates; Michael Jordan; Jitendra Malik; Thayer Alshaabi; Srigokul Upadhyayula; Yaniv Romano
33 | AdaGrad Avoids Saddle Points. Highlight: In this paper, we focus on the AdaGrad family of algorithms – from scalar to full-matrix preconditioning – and we examine the question of whether the method's trajectories avoid saddle points. | Kimon Antonakopoulos; Panayotis Mertikopoulos; Georgios Piliouras; Xiao Wang
34 | UnderGrad: A Universal Black-Box Optimization Method with Almost Dimension-Free Convergence Rate Guarantees. Highlight: Our paper aims to bridge this gap by providing a scalable universal method – dubbed UnDERGrad – which enjoys an almost dimension-free oracle complexity in problems with a favorable geometry (like the simplex, $\ell_1$-ball or trace-constraints), while retaining the order-optimal dependence on T described above. | Kimon Antonakopoulos; Dong Quan Vu; Volkan Cevher; Kfir Levy; Panayotis Mertikopoulos
35 | Adapting The Linearised Laplace Model Evidence for Modern Deep Learning. Highlight: In this work, we examine the assumptions behind this method, particularly in conjunction with model selection. | Javier Antoran; David Janz; James U Allingham; Erik Daxberger; Riccardo Rb Barbano; Eric Nalisnick; Jose Miguel Hernandez-Lobato
36 | EAT-C: Environment-Adversarial Sub-Task Curriculum for Efficient Reinforcement Learning. Highlight: Reinforcement learning (RL) is inefficient on long-horizon tasks due to sparse rewards and its policy can be fragile to slightly perturbed environments. We address these challenges via a curriculum of tasks with coupled environments, generated by two policies trained jointly with RL: (1) a co-operative planning policy recursively decomposing a hard task into a coarse-to-fine sub-task tree; and (2) an adversarial policy modifying the environment in each sub-task. | Shuang Ao; Tianyi Zhou; Jing Jiang; Guodong Long; Xuan Song; Chengqi Zhang
37 | Online Balanced Experimental Design. Highlight: In this work, we present algorithms that build on recent advances in online discrepancy minimization which accommodate both arbitrary treatment probabilities and multiple treatments. | David Arbour; Drew Dimmery; Tung Mai; Anup Rao
38 | VariGrow: Variational Architecture Growing for Task-Agnostic Continual Learning Based on Bayesian Novelty. Highlight: This paper proposes a variational architecture growing framework dubbed VariGrow. | Randy Ardywibowo; Zepeng Huo; Zhangyang Wang; Bobak J Mortazavi; Shuai Huang; Xiaoning Qian
39 | Thresholded Lasso Bandit. Highlight: In this paper, we revisit the regret minimization problem in sparse stochastic contextual linear bandits, where feature vectors may be of large dimension $d$, but where the reward function depends on a few, say $s_0 \ll d$, of these features only. | Kaito Ariu; Kenshi Abe; Alexandre Proutiere
40 | Gradient Based Clustering. Highlight: We propose a general approach for distance based clustering, using the gradient of the cost function that measures clustering quality with respect to cluster assignments and cluster center positions. | Aleksandar Armacki; Dragana Bajovic; Dusan Jakovetic; Soummya Kar
41 | Understanding Gradient Descent on The Edge of Stability in Deep Learning. Highlight: The current paper mathematically analyzes a new mechanism of implicit regularization in the EoS phase, whereby GD updates due to non-smooth loss landscape turn out to evolve along some deterministic flow on the manifold of minimum loss. | Sanjeev Arora; Zhiyuan Li; Abhishek Panigrahi
42 | Private Optimization in The Interpolation Regime: Faster Rates and Hardness Results. Highlight: In this paper, we investigate differentially private stochastic optimization in the interpolation regime. | Hilal Asi; Karan Chadha; Gary Cheng; John Duchi
43 | Optimal Algorithms for Mean Estimation Under Local Differential Privacy. Highlight: In this work, we investigate the question of designing the randomizer with the smallest variance. | Hilal Asi; Vitaly Feldman; Kunal Talwar
44 | Asymptotically-Optimal Gaussian Bandits with Side Observations. Highlight: The LP optimizes the cost (regret) required to reliably estimate the suboptimality gap of each arm. This LP lower bound motivates our main contribution: the first known asymptotically optimal algorithm for this general setting. | Alexia Atsidakou; Orestis Papadigenopoulos; Constantine Caramanis; Sujay Sanghavi; Sanjay Shakkottai
45 | Congested Bandits: Optimal Routing Via Short-term Resets. Highlight: Motivated by this, we introduce the problem of Congested Bandits, where each arm's reward is allowed to depend on the number of times it was played in the past $\Delta$ timesteps. For the multi-armed setup, we propose a UCB style algorithm and show that its policy regret scales as $\tilde{O}(\sqrt{K \Delta T})$. | Pranjal Awasthi; Kush Bhatia; Sreenivas Gollapudi; Kostas Kollias
46 | Do More Negative Samples Necessarily Hurt In Contrastive Learning? Highlight: We show in a simple theoretical setting, where positive pairs are generated by sampling from the underlying latent class (introduced by Saunshi et al. (ICML 2019)), that the downstream performance of the representation optimizing the (population) contrastive loss in fact does not degrade with the number of negative samples. | Pranjal Awasthi; Nishanth Dikkala; Pritish Kamath
47 | H-Consistency Bounds for Surrogate Loss Minimizers. Highlight: We present a detailed study of estimation errors in terms of surrogate loss estimation errors. | Pranjal Awasthi; Anqi Mao; Mehryar Mohri; Yutao Zhong
48 | Iterative Hard Thresholding with Adaptive Regularization: Sparser Solutions Without Sacrificing Runtime. Highlight: We propose a simple modification to the iterative hard thresholding (IHT) algorithm, which recovers asymptotically sparser solutions as a function of the condition number. | Kyriakos Axiotis; Maxim Sviridenko
49 | Proving Theorems Using Incremental Learning and Hindsight Experience Replay. Highlight: In this paper, we adapt the idea of hindsight experience replay from reinforcement learning to the automated theorem proving domain, so as to use the intermediate data generated during unsuccessful proof attempts. | Eser Aygün; Ankit Anand; Laurent Orseau; Xavier Glorot; Stephen M McAleer; Vlad Firoiu; Lei M Zhang; Doina Precup; Shibl Mourad
50 | Near-optimal Rate of Consistency for Linear Models with Missing Values. Highlight: In this paper, we focus on the extensively-studied linear models, but in presence of missing values, which turns out to be quite a challenging task. | Alexis Ayme; Claire Boyer; Aymeric Dieuleveut; Erwan Scornet
51 | How Tempering Fixes Data Augmentation in Bayesian Neural Networks. Highlight: In this work we identify two interlaced factors concurrently influencing the strength of the cold posterior effect, namely the correlated nature of augmentations and the degree of invariance of the employed model to such transformations. | Gregor Bachmann; Lorenzo Noci; Thomas Hofmann
52 | ASAP.SGD: Instance-based Adaptiveness to Staleness in Asynchronous SGD. Highlight: We introduce (i) $\mathtt{ASAP.SGD}$, an analytical framework capturing necessary and desired properties of staleness-adaptive step size functions and (ii) TAIL-$\tau$, a method for utilizing key properties of the execution instance, generating a tailored strategy that not only dampens the impact of stale updates, but also leverages fresh ones. | Karl Bäckström; Marina Papatriantafilou; Philippas Tsigas
53 | From Noisy Prediction to True Label: Noisy Prediction Calibration Via Generative Model. Highlight: We suggest a new branch of methods, Noisy Prediction Calibration (NPC), in learning with noisy labels. | Heesun Bae; Seungjae Shin; Byeonghu Na; Joonho Jang; Kyungwoo Song; Il-Chul Moon
54 | Data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language. Highlight: To get us closer to general self-supervised learning, we present data2vec, a framework that uses the same learning method for either speech, NLP or computer vision. | Alexei Baevski; Wei-Ning Hsu; Qiantong Xu; Arun Babu; Jiatao Gu; Michael Auli
55 | End-to-End Balancing for Causal Continuous Treatment-Effect Estimation. Highlight: We propose a new theory for consistency of entropy balancing for continuous treatments. | Taha Bahadori; Eric Tchetgen Tchetgen; David Heckerman
56 | A Hierarchical Transitive-Aligned Graph Kernel for Un-attributed Graphs. Highlight: In this paper, we develop a new graph kernel, namely the Hierarchical Transitive-Aligned Kernel, by transitively aligning the vertices between graphs through a family of hierarchical prototype graphs. | Lu Bai; Lixin Cui; Edwin Hancock
57 | Near-Optimal Learning of Extensive-Form Games with Imperfect Information. Highlight: We present the first line of algorithms that require only $\widetilde{\mathcal{O}}((XA+YB)/\varepsilon^2)$ episodes of play to find an $\varepsilon$-approximate Nash equilibrium in two-player zero-sum games, where $X,Y$ are the number of information sets and $A,B$ are the number of actions for the two players. | Yu Bai; Chi Jin; Song Mei; Tiancheng Yu
58 | Gaussian Mixture Variational Autoencoder with Contrastive Learning for Multi-Label Classification. Highlight: We propose a novel contrastive learning boosted multi-label prediction model based on a Gaussian mixture variational autoencoder (C-GMVAE), which learns a multimodal prior space and employs a contrastive loss. | Junwen Bai; Shufeng Kong; Carla P Gomes
59 | A$^3$T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing. Highlight: However, all the above tasks are in the direction of speech understanding, but for the inverse direction, speech synthesis, the potential of representation learning is yet to be realized, due to the challenging nature of generating high-quality speech. To address this problem, we propose our framework, Alignment-Aware Acoustic-Text Pretraining (A$^3$T), which reconstructs masked acoustic signals with text input and acoustic-text alignment during training. | He Bai; Renjie Zheng; Junkun Chen; Mingbo Ma; Xintong Li; Liang Huang
60 | Stability Based Generalization Bounds for Exponential Family Langevin Dynamics. Highlight: In this paper, we unify and substantially generalize stability based generalization bounds and make three technical contributions. | Arindam Banerjee; Tiancong Chen; Xinyan Li; Yingxue Zhou
61 | Certified Neural Network Watermarks with Randomized Smoothing. Highlight: In this paper, we propose the first certifiable watermarking method. | Arpit Bansal; Ping-Yeh Chiang; Michael J Curry; Rajiv Jain; Curtis Wigington; Varun Manjunatha; John P Dickerson; Tom Goldstein
62 | Data Scaling Laws in NMT: The Effect of Noise and Architecture. Highlight: In this work, we study the effect of varying the architecture and training data quality on the data scaling properties of Neural Machine Translation (NMT). | Yamini Bansal; Behrooz Ghorbani; Ankush Garg; Biao Zhang; Colin Cherry; Behnam Neyshabur; Orhan Firat
63 | Learning Stable Classifiers By Transferring Unstable Features. Highlight: In this work, we explicitly inform the target classifier about unstable features in the source tasks. | Yujia Bao; Shiyu Chang; Regina Barzilay
64 | Fast Composite Optimization and Statistical Recovery in Federated Learning. Highlight: On the optimization front, we propose a new algorithm named Fast Federated Dual Averaging for strongly convex and smooth loss and establish state-of-the-art iteration and communication complexity in the composite setting. | Yajie Bao; Michael Crawshaw; Shan Luo; Mingrui Liu
65 | Generative Modeling for Multi-task Visual Learning. Highlight: In this paper, motivated by multi-task learning of shareable feature representations, we consider a novel problem of learning a shared generative model that is useful across various visual perception tasks. | Zhipeng Bao; Martial Hebert; Yu-Xiong Wang
66 | Estimating The Optimal Covariance with Imperfect Mean in Diffusion Probabilistic Models. Highlight: In this work, we consider diagonal and full covariances to improve the expressive power of DPMs. | Fan Bao; Chongxuan Li; Jiacheng Sun; Jun Zhu; Bo Zhang
67 | On The Surrogate Gap Between Contrastive and Supervised Losses. Highlight: Following the simplified setting where positive pairs are drawn from the true distribution (not generated by data augmentation; as supposed in previous studies), this study establishes surrogate upper and lower bounds for the downstream classification loss for all negative sample sizes that best explain the empirical observations on the negative sample size in the earlier studies. | Han Bao; Yoshihiro Nagano; Kento Nozawa
68 | Representation Topology Divergence: A Method for Comparing Neural Network Representations. Highlight: We propose a method for comparing two data representations. | Serguei Barannikov; Ilya Trofimov; Nikita Balabin; Evgeny Burnaev
69 | Sparse Mixed Linear Regression with Guarantees: Taming An Intractable Problem with Invex Relaxation. Highlight: In this paper, we study the problem of sparse mixed linear regression on an unlabeled dataset that is generated from linear measurements from two different regression parameter vectors. | Adarsh Barik; Jean Honorio
70 | Neural Fisher Discriminant Analysis: Optimal Neural Network Embeddings in Polynomial Time. Highlight: We introduce a natural extension of FLDA that employs neural networks, called Neural Fisher Discriminant Analysis (NFDA). | Burak Bartan; Mert Pilanci
71 | Fictitious Play and Best-Response Dynamics in Identical Interest and Zero-Sum Stochastic Games. Highlight: This paper proposes an extension of a popular decentralized discrete-time learning procedure when repeating a static game called fictitious play (FP) (Brown, 1951; Robinson, 1951) to a dynamic model called discounted stochastic game (Shapley, 1953). | Lucas Baudin; Rida Laraki
72 | Information Discrepancy in Strategic Learning. Highlight: We initiate the study of the effects of non-transparency in decision rules on individuals' ability to improve in strategic learning settings. | Yahav Bechavod; Chara Podimata; Steven Wu; Juba Ziani
73 | On The Hidden Biases of Policy Mirror Ascent in Continuous Action Spaces. Highlight: To mitigate this hidden bias, heavy-tailed policy parameterizations may be used, which exhibit a bounded score function, but doing so can cause instability in algorithmic updates. To address these issues, in this work, we study the convergence of policy gradient algorithms under heavy-tailed parameterizations, which we propose to stabilize with a combination of mirror ascent-type updates and gradient tracking. | Amrit Singh Bedi; Souradip Chakraborty; Anjaly Parayil; Brian M Sadler; Pratap Tokekar; Alec Koppel
74 | Imitation Learning By Estimating Expertise of Demonstrators. Highlight: In this work, we show that unsupervised learning over demonstrator expertise can lead to a consistent boost in the performance of imitation learning algorithms. | Mark Beliaev; Andy Shih; Stefano Ermon; Dorsa Sadigh; Ramtin Pedarsani
75 | Matching Normalizing Flows and Probability Paths on Manifolds. Highlight: We propose to train CNFs on manifolds by minimizing probability path divergence (PPD), a novel family of divergences between the probability density path generated by the CNF and a target probability density path. | Heli Ben-Hamu; Samuel Cohen; Joey Bose; Brandon Amos; Maximillian Nickel; Aditya Grover; Ricky T. Q. Chen; Yaron Lipman
76 | Stochastic Contextual Dueling Bandits Under Linear Stochastic Transitivity Models. Highlight: We propose a computationally efficient algorithm, CoLSTIM, which makes its choice based on imitating the feedback process using perturbed context-dependent utility estimates of the underlying CoLST model. | Viktor Bengs; Aadirupa Saha; Eyke Hüllermeier
77 | Neural Inverse Kinematic. Highlight: In this work, we propose a neural IK method that employs the hierarchical structure of the problem to sequentially sample valid joint angles conditioned on the desired position and on the preceding joints along the chain. | Raphael Bensadoun; Shir Gur; Nitsan Blau; Lior Wolf
78 | Volatility Based Kernels and Moving Average Means for Accurate Forecasting with Gaussian Processes. Highlight: To address this fundamental limitation, we show how to re-cast a class of stochastic volatility models as a hierarchical Gaussian process (GP) model with specialized covariance functions. | Gregory Benton; Wesley Maddox; Andrew Gordon Wilson
79 | Gradient Descent on Neurons and Its Link to Approximate Second-order Optimization. Highlight: This challenges widely held beliefs and immediately raises the question why KFAC performs so well. Towards answering this question we present evidence strongly suggesting that KFAC approximates a first-order algorithm, which performs gradient descent on neurons rather than weights. | Frederik Benzing
80 | Safe Learning in Tree-Form Sequential Decision Making: Handling Hard and Soft Constraints. Highlight: We study the hard-threshold problem of achieving sublinear regret while guaranteeing that the threshold constraint is satisfied at every iteration with high probability. | Martino Bernasconi; Federico Cacciamani; Matteo Castiglioni; Alberto Marchesi; Nicola Gatti; Francesco Trovò
81 | Skin Deep Unlearning: Artefact and Instrument Debiasing in The Context of Melanoma Classification. Highlight: In this work, we robustly remove bias and spurious variation from an automated melanoma classification pipeline using two leading bias unlearning techniques. | Peter Bevan; Amir Atapour-Abarghouei
82 | Approximate Bayesian Computation with Domain Expert in The Loop. Highlight: In this work, we introduce an active learning method for ABC statistics selection which reduces the domain expert's work considerably. | Ayush Bharti; Louis Filstroff; Samuel Kaski
83 | Minimax M-estimation Under Adversarial Contamination. Highlight: To illustrate the usefulness of the derived robust M-estimator in an online setting, we present a bandit algorithm for the partially identifiable best arm identification problem that improves upon the sample complexity of the state of the art algorithms. | Sujay Bhatt; Guanhua Fang; Ping Li; Gennady Samorodnitsky
84 | Nearly Optimal Catoni's M-estimator for Infinite Variance. Highlight: In this paper, we extend the remarkable M-estimator of Catoni (2012) to situations where the variance is infinite. | Sujay Bhatt; Guanhua Fang; Ping Li; Gennady Samorodnitsky
85 | Personalization Improves Privacy-Accuracy Tradeoffs in Federated Learning. Highlight: In this paper, we study stochastic optimization algorithms for a personalized federated learning setting involving local and global models subject to user-level (joint) differential privacy. | Alberto Bietti; Chen-Yu Wei; Miroslav Dudik; John Langford; Steven Wu
86 | Non-Vacuous Generalisation Bounds for Shallow Neural Networks. Highlight: We focus on a specific class of shallow neural networks with a single hidden layer, namely those with $L_2$-normalised data and either a sigmoid-shaped Gaussian error function ("erf") activation or a Gaussian Error Linear Unit (GELU) activation. For these networks, we derive new generalisation bounds through the PAC-Bayesian theory; unlike most existing such bounds they apply to neural networks with deterministic rather than randomised parameters. | Felix Biggs; Benjamin Guedj
87 | Structure-preserving GANs. Highlight: We introduce structure-preserving GANs as a data-efficient framework for learning distributions with additional structure such as group symmetry, by developing new variational representations for divergences. | Jeremiah Birrell; Markos Katsoulakis; Luc Rey-Bellet; Wei Zhu
88 | Scalable Spike-and-Slab. Highlight: In this article, we propose Scalable Spike-and-Slab ($S^3$), a scalable Gibbs sampling implementation for high-dimensional Bayesian regression with the continuous spike-and-slab prior of George & McCulloch (1993). | Niloy Biswas; Lester Mackey; Xiao-Li Meng
89 | Breaking Down Out-of-Distribution Detection: Many Methods Based on OOD Training Data Estimate A Combination of The Same Core Quantities. Highlight: The goal of this paper is to recognize common objectives as well as to identify the implicit scoring functions of different OOD detection methods. | Julian Bitterwolf; Alexander Meinke; Maximilian Augustin; Matthias Hein
90 | A Query-optimal Algorithm for Finding Counterfactuals. Highlight: We design an algorithm for finding counterfactuals with strong theoretical guarantees on its performance. | Guy Blanc; Caleb Koch; Jane Lange; Li-Yang Tan
91 | Popular Decision Tree Algorithms Are Provably Noise Tolerant. Highlight: Using the framework of boosting, we prove that all impurity-based decision tree learning algorithms, including the classic ID3, C4.5, and CART, are highly noise tolerant. | Guy Blanc; Jane Lange; Ali Malik; Li-Yang Tan
92 | Optimizing Sequential Experimental Design with Deep Reinforcement Learning. Highlight: However, these methods may not sufficiently explore the design space, require access to a differentiable probabilistic model and can only optimize over continuous design spaces. Here, we address these limitations by showing that the problem of optimizing policies can be reduced to solving a Markov decision process (MDP). | Tom Blau; Edwin V. Bonilla; Iadine Chades; Amir Dezfouli
93 | Lagrangian Method for Q-Function Learning (with Applications to Machine Translation). Highlight: This paper discusses a new approach to the fundamental problem of learning optimal Q-functions. | Huang Bojun
94 | Generalized Results for The Existence and Consistency of The MLE in The Bradley-Terry-Luce Model. Highlight: In this paper, we study the performance of the Bradley-Terry-Luce model for ranking from pairwise comparison data under more realistic settings than those considered in the literature so far. | Heejong Bong; Alessandro Rinaldo
95 | How to Train Your Wide Neural Network Without Backprop: An Input-Weight Alignment Perspective. Highlight: Leveraging NTK theory, we show theoretically that gradient descent drives layerwise weight updates that are aligned with their input activity correlations weighted by error, and demonstrate empirically that the result also holds in finite-width wide networks. | Akhilan Boopathy; Ila Fiete
96 | Improving Language Models By Retrieving from Trillions of Tokens. Highlight: We enhance auto-regressive language models by conditioning on document chunks retrieved from a large corpus, based on local similarity with preceding tokens. | Sebastian Borgeaud; Arthur Mensch; Jordan Hoffmann; Trevor Cai; Eliza Rutherford; Katie Millican; George Bm Van Den Driessche; Jean-Baptiste Lespiau; Bogdan Damoc; Aidan Clark; Diego De Las Casas; Aurelia Guy; Jacob Menick; Roman Ring; Tom Hennigan; Saffron Huang; Loren Maggiore; Chris Jones; Albin Cassirer; Andy Brock; Michela Paganini; Geoffrey Irving; Oriol Vinyals; Simon Osindero; Karen Simonyan; Jack Rae; Erich Elsen; Laurent Sifre
97 | Lie Point Symmetry Data Augmentation for Neural PDE Solvers. Highlight: Thus, we are presented with a proverbial chicken-and-egg problem. In this paper, we present a method, which can partially alleviate this problem, by improving neural PDE solver sample complexity: Lie point symmetry data augmentation (LPSDA). | Johannes Brandstetter; Max Welling; Daniel E Worrall
98 | An Iterative Clustering Algorithm for The Contextual Stochastic Block Model with Optimality Guarantees. Highlight: We propose a new iterative algorithm to cluster networks with side information for nodes (in the form of covariates) and show that our algorithm is optimal under the Contextual Symmetric Stochastic Block Model. | Guillaume Braun; Hemant Tyagi; Christophe Biernacki
99 | Tractable Dendritic RNNs for Reconstructing Nonlinear Dynamical Systems. Highlight: Motivated by the emerging principles of dendritic computation, we augment a dynamically interpretable and mathematically tractable piecewise-linear (PL) recurrent neural network (RNN) by a linear spline basis expansion. | Manuel Brenner; Florian Hess; Jonas M Mikhaeil; Leonard F Bereska; Zahra Monfared; Po-Chen Kuo; Daniel Durstewitz
100 | Learning to Predict Graphs with Fused Gromov-Wasserstein Barycenters. Highlight: This paper introduces a novel and generic framework to solve the flagship task of supervised labeled graph prediction by leveraging Optimal Transport tools. | Luc Brogat-Motte; Rémi Flamary; Celine Brouard; Juho Rousu; Florence d'Alché-Buc
101 | Efficient Learning of CNNs Using Patch Based Features. Highlight: Recent work has demonstrated the effectiveness of using patch based representations when learning from image data. Here we provide theoretical support for this observation, by showing that a simple semi-supervised algorithm that uses patch statistics can efficiently learn labels produced by a one-hidden-layer Convolutional Neural Network (CNN). | Alon Brutzkus; Amir Globerson; Eran Malach; Alon Regev Netser; Shai Shalev-Schwartz
102 | Causal Structure-based Root Cause Analysis of Outliers. Highlight: We present a formal method to identify "root causes" of outliers, amongst variables. | Kailash Budhathoki; Lenon Minorics; Patrick Bloebaum; Dominik Janzing
103 | IGLUE: A Benchmark for Transfer Learning Across Modalities, Tasks, and Languages. Highlight: Due to the lack of a multilingual benchmark, however, vision-and-language research has mostly focused on English language tasks. To fill this gap, we introduce the Image-Grounded Language Understanding Evaluation benchmark. | Emanuele Bugliarello; Fangyu Liu; Jonas Pfeiffer; Siva Reddy; Desmond Elliott; Edoardo Maria Ponti; Ivan Vulic
104 | Interactive Inverse Reinforcement Learning for Cooperative Games. Highlight: We study the problem of designing autonomous agents that can learn to cooperate effectively with a potentially suboptimal partner while having no access to the joint reward function. | Thomas Kleine Büning; Anne-Marie George; Christos Dimitrakakis
105 | Convolutional and Residual Networks Provably Contain Lottery Tickets. Highlight: We prove that modern architectures consisting of convolutional and residual layers, which can be equipped with almost arbitrary activation functions, also contain lottery tickets with high probability. | Rebekka Burkholz
106 | Near-Optimal Algorithms for Autonomous Exploration and Multi-Goal Stochastic Shortest Path. Highlight: We introduce a new algorithm with stronger sample complexity bounds than existing ones. | Haoyuan Cai; Tengyu Ma; Simon Du
107 | Convergence of Invariant Graph Networks. Highlight: In this paper, we investigate the convergence of one powerful GNN, Invariant Graph Network (IGN), over graphs sampled from graphons. | Chen Cai; Yusu Wang
108 | Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency. Highlight: In detail, we propose a reinforcement learning algorithm (Optimistic Exploration via Adversarial Integral Equation, or OP-TENET) that attains an $\epsilon$-optimal policy within $O(1/\epsilon^2)$ episodes. | Qi Cai; Zhuoran Yang; Zhaoran Wang
109 | Scaling Gaussian Process Optimization By Evaluating A Few Unique Candidates Multiple Times. Highlight: We show that sequential black-box optimization based on GPs (GP-Opt) can be made efficient by sticking to a candidate solution for multiple evaluation steps and switching only when necessary. | Daniele Calandriello; Luigi Carratino; Alessandro Lazaric; Michal Valko; Lorenzo Rosasco
110 | Adaptive Gaussian Process Change Point Detection. Highlight: Detecting change points in time series, i.e., points in time at which some observed process suddenly changes, is a fundamental task that arises in many real-world applications, with consequences for safety and reliability. In this work, we propose ADAGA, a novel Gaussian process-based solution to this problem, that leverages a powerful heuristic we developed based on statistical hypothesis testing. | Edoardo Caldarelli; Philippe Wenk; Stefan Bauer; Andreas Krause
111 | Measuring Dissimilarity with Diffeomorphism Invariance. Highlight: We introduce DID, a pairwise dissimilarity measure applicable to a wide range of data spaces, which leverages the data's internal structure to be invariant to diffeomorphisms. | Théophile Cantelobre; Carlo Ciliberto; Benjamin Guedj; Alessandro Rudi
112 | A Model-Agnostic Randomized Learning Framework Based on Random Hypothesis Subspace Sampling. Highlight: We propose a model-agnostic randomized learning framework based on Random Hypothesis Subspace Sampling (RHSS). | Yiting Cao; Chao Lan
113 | Gaussian Process Uniform Error Bounds with Unknown Hyperparameters for Safety-Critical Applications. Highlight: However, state-of-the-art techniques for safety-critical settings hinge on the assumption that the kernel hyperparameters are known, which does not apply in general. To mitigate this, we introduce robust Gaussian process uniform error bounds in settings with unknown hyperparameters. | Alexandre Capone; Armin Lederer; Sandra Hirche
114 | Burst-Dependent Plasticity and Dendritic Amplification Support Target-Based Learning and Hierarchical Imitation Learning. Highlight: We propose a multi-compartment model of a pyramidal neuron, in which bursts and dendritic input segregation make it possible to plausibly support biological target-based learning. | Cristiano Capone; Cosimo Lupo; Paolo Muratore; Pier Stanislao Paolucci
115 | A Marriage Between Adversarial Team Games and 2-player Games: Enabling Abstractions, No-regret Learning, and Subgame Solving. Highlight: In particular, we propose a new, suitable game representation that we call team-public-information, in which a team is represented as a single coordinator who only knows information common to the whole team and prescribes to each member an action for any possible private state. | Luca Carminati; Federico Cacciamani; Marco Ciccone; Nicola Gatti
116 | RECAPP: Crafting A More Efficient Catalyst for Convex Optimization. Highlight: In this work, we propose a novel Relaxed Error Criterion for Accelerated Proximal Point (RECAPP) that eliminates the need for high accuracy subproblem solutions. | Yair Carmon; Arun Jambulapati; Yujia Jin; Aaron Sidford
117 | Estimating and Penalizing Induced Preference Shifts in Recommender Systems. Highlight: We focus on induced preference shifts in users. | Micah D Carroll; Anca Dragan; Stuart Russell; Dylan Hadfield-Menell
118 | YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for Everyone. Highlight: Our method builds upon the VITS model and adds several novel modifications for zero-shot multi-speaker and multilingual training. | Edresson Casanova; Julian Weber; Christopher D Shulby; Arnaldo Candido Junior; Eren Gölge; Moacir A Ponti
119 | The Infinite Contextual Graph Markov Model. Highlight: As with most Deep Graph Networks, an inherent limitation is the need to perform an extensive model selection to choose the proper size of each layer's latent representation. In this paper, we address this problem by introducing the Infinite Contextual Graph Markov Model (iCGMM), the first deep Bayesian nonparametric model for graph learning. | Daniele Castellana; Federico Errica; Davide Bacciu; Alessio Micheli
120 | Compressed-VFL: Communication-Efficient Learning with Vertically Partitioned Data. Highlight: We propose Compressed Vertical Federated Learning (C-VFL) for communication-efficient training on vertically partitioned data. | Timothy J Castiglia; Anirban Das; Shiqiang Wang; Stacy Patterson
121 | Online Learning with Knapsacks: The Best of Both Worlds. Highlight: We study online learning problems in which a decision maker wants to maximize their expected reward without violating a finite set of $m$ resource constraints. | Matteo Castiglioni; Andrea Celli; Christian Kroer
122 | Stabilizing Off-Policy Deep Reinforcement Learning from Pixels. Highlight: As a result, many successful algorithms must combine different domain-specific practices and auxiliary losses to learn meaningful behaviors in complex environments. In this work, we provide novel analysis demonstrating that these instabilities arise from performing temporal-difference learning with a convolutional encoder and low-magnitude rewards. | Edoardo Cetin; Philip J Ball; Stephen Roberts; Oya Celiktutan
123 | Accelerated, Optimal and Parallel: Some Results on Model-based Stochastic Optimization. Highlight: In this paper, we propose an acceleration scheme for the APROX family and provide non-asymptotic convergence guarantees, which are order-optimal in all problem-dependent constants and provide even larger minibatching speedups. | Karan Chadha; Gary Cheng; John Duchi
124 | Robust Imitation Learning Against Variations in Environment Dynamics. Highlight: In this paper, we propose a robust imitation learning (IL) framework that improves the robustness of IL when environment dynamics are perturbed. | Jongseong Chae; Seungyul Han; Whiyoung Jung; Myungsik Cho; Sungho Choi; Youngchul Sung
125 | Fairness with Adaptive Weights. Highlight: In this paper, we propose a novel adaptive reweighing method to address representation bias. | Junyi Chai; Xiaoqian Wang
126 | UNIREX: A Unified Learning Framework for Language Model Rationale Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Although attribution algorithms and select-predict pipelines are commonly used in rationale extraction, they both rely on certain heuristics that hinder them from satisfying all three desiderata. In light of this, we propose UNIREX, a flexible learning framework which generalizes rationale extractor optimization as follows: (1) specify the architecture for a learned rationale extractor; (2) select explainability objectives (i.e., faithfulness and plausibility criteria); and (3) jointly train the task model and rationale extractor on the task using selected objectives. |
Aaron Chan; Maziar Sanjabi; Lambert Mathias; Liang Tan; Shaoliang Nie; Xiaochang Peng; Xiang Ren; Hamed Firooz; |
127 | Revisiting Label Smoothing and Knowledge Distillation Compatibility: What Was Missing? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The main contributions of our work are the discovery, analysis and validation of systematic diffusion as the missing concept which is instrumental in understanding and resolving these contradictory findings. |
Keshigeyan Chandrasegaran; Ngoc-Trung Tran; Yunqing Zhao; Ngai-Man Cheung; |
128 | Style Equalization: Unsupervised Learning of Controllable Generative Sequence Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we tackle the training-inference mismatch encountered during unsupervised learning of controllable generative sequence models. |
Jen-Hao Rick Chang; Ashish Shrivastava; Hema Koppula; Xiaoshuai Zhang; Oncel Tuzel; |
129 | Learning Bellman Complete Representations for Offline Policy Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose BCRL, which directly learns from data an approximately linear Bellman complete representation with good coverage. |
Jonathan Chang; Kaiwen Wang; Nathan Kallus; Wen Sun; |
130 | Sample Efficient Learning of Predictors That Complement Humans Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we provide the first theoretical analysis of the benefit of learning complementary predictors in expert deferral. |
Mohammad-Amin Charusaie; Hussein Mozannar; David Sontag; Samira Samadi; |
131 | Nyström Kernel Mean Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an efficient approximation procedure based on the Nyström method, which exploits a small random subset of the dataset. |
Antoine Chatalic; Nicolas Schreuder; Lorenzo Rosasco; Alessandro Rudi; |
132 | Coarsening The Granularity: Towards Structurally Sparse Lottery Tickets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we demonstrate the first positive result that a structurally sparse winning ticket can be effectively found in general. |
Tianlong Chen; Xuxi Chen; Xiaolong Ma; Yanzhi Wang; Zhangyang Wang; |
133 | Learning Domain Adaptive Object Detection with Probabilistic Teacher Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present a simple yet effective framework, termed as Probabilistic Teacher (PT), which aims to capture the uncertainty of unlabeled target data from a gradually evolving teacher and guides the learning of a student in a mutually beneficial manner. |
Meilin Chen; Weijie Chen; Shicai Yang; Jie Song; Xinchao Wang; Lei Zhang; Yunfeng Yan; Donglian Qi; Yueting Zhuang; Di Xie; Shiliang Pu; |
134 | The Fundamental Price of Secure Aggregation in Differentially Private Federated Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the problem of training a $d$ dimensional model with distributed differential privacy (DP) where secure aggregation (SecAgg) is used to ensure that the server only sees the noisy sum of $n$ model updates in every training round. |
Wei-Ning Chen; Christopher A Choquette Choo; Peter Kairouz; Ananda Theertha Suresh; |
135 | Perfectly Balanced: Improving Transfer and Robustness of Supervised Contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We argue that creating spread alone is insufficient for better representations, since spread is invariant to permutations within classes. |
Mayee Chen; Daniel Y Fu; Avanika Narayan; Michael Zhang; Zhao Song; Kayvon Fatahalian; Christopher Re; |
136 | Strategies for Safe Multi-Armed Bandits with Logarithmic Regret and Risk Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We investigate a natural but surprisingly unstudied approach to the multi-armed bandit problem under safety risk constraints. |
Tianrui Chen; Aditya Gangrade; Venkatesh Saligrama; |
137 | On The Sample Complexity of Learning Infinite-horizon Discounted Linear Kernel MDPs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we extend the uniform-PAC sample complexity from the episodic setting to the infinite-horizon discounted setting, and propose a novel algorithm dubbed UPAC-UCLK that achieves an $\tilde{O}\big(d^2/((1-\gamma)^4\epsilon^2)+1/((1-\gamma)^6\epsilon^2)\big)$ uniform-PAC sample complexity, where $d$ is the dimension of the feature mapping, $\gamma \in(0,1)$ is the discount factor of the MDP and $\epsilon$ is the accuracy parameter. |
Yuanzhou Chen; Jiafan He; Quanquan Gu; |
138 | Streaming Algorithms for Support-Aware Histograms Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As a result, even relatively simple distributions cannot be approximated by succinct histograms without incurring large error. In this paper, we address this issue by adapting the definition of approximation so that only the errors of the items that belong to the support of the distribution are considered. |
Justin Chen; Piotr Indyk; Tal Wagner; |
139 | Improved No-Regret Algorithms for Stochastic Shortest Path with Linear MDP Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce two new no-regret algorithms for the stochastic shortest path (SSP) problem with a linear MDP that significantly improve over the only existing results of (Vial et al., 2021). |
Liyu Chen; Rahul Jain; Haipeng Luo; |
140 | Learning Infinite-horizon Average-reward Markov Decision Process with Constraints Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study regret minimization for infinite-horizon average-reward Markov Decision Processes (MDPs) under cost constraints. |
Liyu Chen; Rahul Jain; Haipeng Luo; |
141 | Active Multi-Task Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an algorithm that iteratively estimates the relevance of each source task to the target task and samples from each source task based on the estimated relevance. |
Yifang Chen; Kevin Jamieson; Simon Du; |
142 | On Collective Robustness of Bagging Against Data Poisoning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Based on this analysis, we propose hash bagging to improve the robustness of vanilla bagging almost for free. |
Ruoxin Chen; Zenan Li; Jie Li; Junchi Yan; Chentao Wu; |
143 | Online Active Regression Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The goal is to efficiently maintain the regression of received data points with a small budget of label queries. We propose novel algorithms for this problem under $\ell_p$ loss where $p\in[1,2]$. |
Cheng Chen; Yi Li; Yiming Sun; |
144 | Selling Data To A Machine Learner: Pricing Via Costly Signaling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider a new problem of selling data to a machine learner who looks to purchase data to train his machine learning model. |
Junjie Chen; Minming Li; Haifeng Xu; |
145 | ME-GAN: Learning Panoptic Electrocardio Representations for Multi-view ECG Synthesis Conditioned on Heart Diseases Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel disease-aware generative adversarial network for multi-view ECG synthesis called ME-GAN, which attains panoptic electrocardio representations conditioned on heart diseases and projects the representations onto multiple standard views to yield ECG signals. |
Jintai Chen; Kuanlun Liao; Kun Wei; Haochao Ying; Danny Z Chen; Jian Wu; |
146 | Weisfeiler-Lehman Meets Gromov-Wasserstein Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose the Weisfeiler-Lehman (WL) distance, a notion of distance between labeled measure Markov chains (LMMCs), of which labeled graphs are special cases. |
Samantha Chen; Sunhyuk Lim; Facundo Memoli; Zhengchao Wan; Yusu Wang; |
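Since the entry above leans on Weisfeiler-Lehman machinery, a minimal sketch of one round of classical 1-WL color refinement may help ground the terminology. The paper's actual contribution, a WL distance between labeled measure Markov chains, is a metric construction that this background snippet does not implement; all names here are illustrative.

```python
def wl_refine(labels, adj):
    """One round of Weisfeiler-Lehman color refinement.

    labels: dict node -> current label; adj: dict node -> list of neighbors.
    Each node's new label hashes its own label together with the multiset
    (here a sorted tuple) of its neighbors' labels."""
    return {v: hash((labels[v], tuple(sorted(labels[u] for u in adj[v]))))
            for v in labels}

# Path graph 0-1-2 with identical starting labels: after one round the two
# endpoints share a label while the middle node gets a different one.
adj = {0: [1], 1: [0, 2], 2: [1]}
print(wl_refine({0: 0, 1: 0, 2: 0}, adj))
```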
147 | On Non-local Convergence Analysis of Deep Linear Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study the non-local convergence properties of deep linear networks. |
Kun Chen; Dachao Lin; Zhihua Zhang; |
148 | Flow-based Recurrent Belief State Learning for POMDPs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce the FlOw-based Recurrent BElief State model (FORBES), which incorporates normalizing flows into the variational inference to learn general continuous belief states for POMDPs. |
Xiaoyu Chen; Yao Mark Mu; Ping Luo; Shengbo Li; Jianyu Chen; |
149 | Structure-Aware Transformer for Graph Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose several methods for automatically generating the subgraph representation and show theoretically that the resulting representations are at least as expressive as the subgraph representations. |
Dexiong Chen; Leslie O'Bray; Karsten Borgwardt; |
150 | The Poisson Binomial Mechanism for Unbiased Federated Learning with Secure Aggregation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce the Poisson Binomial mechanism (PBM), a discrete differential privacy mechanism for distributed mean estimation (DME) with applications to federated learning and analytics. |
Wei-Ning Chen; Ayfer Ozgur; Peter Kairouz; |
151 | Learning Mixtures of Linear Dynamical Systems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the problem of learning a mixture of multiple linear dynamical systems (LDSs) from unlabeled short sample trajectories, each generated by one of the LDS models. |
Yanxi Chen; H. Vincent Poor; |
152 | On Well-posedness and Minimax Optimal Rates of Nonparametric Q-function Estimation in Off-policy Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the off-policy evaluation (OPE) problem in an infinite-horizon Markov decision process with continuous states and actions. |
Xiaohong Chen; Zhengling Qi; |
153 | Faster Fundamental Graph Algorithms Via Learned Predictions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the question of speeding up classic graph algorithms with machine-learned predictions. |
Justin Chen; Sandeep Silwal; Ali Vakilian; Fred Zhang; |
154 | Improve Single-Point Zeroth-Order Optimization Using High-Pass and Low-Pass Filters Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we borrow the idea of high-pass and low-pass filters from extremum seeking control (continuous-time version of SZO) and develop a novel SZO method called HLF-SZO by integrating these filters. |
Xin Chen; Yujie Tang; Na Li; |
155 | Deep Variational Graph Convolutional Recurrent Network for Multivariate Time Series Anomaly Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we model sensor dependency and stochasticity within MTS by developing an embedding-guided probabilistic generative network. |
Wenchao Chen; Long Tian; Bo Chen; Liang Dai; Zhibin Duan; Mingyuan Zhou; |
156 | Auxiliary Learning with Joint Task and Data Scheduling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose to learn a joint task and data schedule for auxiliary learning, which captures the importance of different data samples in each auxiliary task to the target task. |
Hong Chen; Xin Wang; Chaoyu Guan; Yue Liu; Wenwu Zhu; |
157 | Optimization-Induced Graph Implicit Nonlinear Diffusion Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Due to the over-smoothing issue, most existing graph neural networks can only capture limited dependencies with their inherently finite aggregation layers. To overcome this limitation, we propose a new kind of graph convolution, called Graph Implicit Nonlinear Diffusion (GIND), which implicitly has access to infinite hops of neighbors while adaptively aggregating features with nonlinear diffusion to prevent over-smoothing. |
Qi Chen; Yifei Wang; Yisen Wang; Jiansheng Yang; Zhouchen Lin; |
158 | Robust Meta-learning with Sampling Noise and Label Noise Via Eigen-Reptile Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Besides, when handling the data with noisy labels, the meta-learner could be extremely sensitive to label noise on a corrupted dataset. To address these two challenges, we present Eigen-Reptile (ER) that updates the meta-parameters with the main direction of historical task-specific parameters. |
Dong Chen; Lingfei Wu; Siliang Tang; Xiao Yun; Bo Long; Yueting Zhuang; |
159 | Adaptive Model Design for Markov Decision Process Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Hence, appropriate regulations are often required if we hope to take the external costs/benefits of its actions into consideration. In this paper, we study how to regulate such an agent by redesigning model parameters that can affect the rewards and/or the transition kernels. |
Siyu Chen; Donglin Yang; Jiayang Li; Senmiao Wang; Zhuoran Yang; Zhaoran Wang; |
160 | State Transition of Dendritic Spines Improves Learning of Sparse Spiking Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Inspired by the state transition of dendritic spines in the filopodial model of spinogenesis, we model different states of SNN weights, facilitating weight optimization for pruning. |
Yanqi Chen; Zhaofei Yu; Wei Fang; Zhengyu Ma; Tiejun Huang; Yonghong Tian; |
161 | Efficient Online ML API Selection for Multi-Label Classification Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose FrugalMCT, a principled framework that adaptively selects the APIs to use for different data in an online fashion while respecting the user’s budget. |
Lingjiao Chen; Matei Zaharia; James Zou; |
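To make the budget-constrained selection problem above concrete, here is a deliberately naive greedy sketch: score each candidate API with a cheap quality estimator and pick the cheapest affordable one that clears a quality floor. The interface (`predict_quality`, the cost pairs, the floor) is a hypothetical stand-in, not FrugalMCT's actual, more principled strategy.

```python
from typing import Callable, List, Tuple

def select_api(apis: List[Tuple[str, float]],           # (name, per-call cost)
               predict_quality: Callable[[str, object], float],
               x: object,
               remaining_budget: float,
               quality_floor: float) -> str:
    """Pick the cheapest affordable API whose predicted quality clears the floor.

    Assumes at least one API is affordable; falls back to the best affordable
    API when none clears the floor."""
    affordable = sorted((a for a in apis if a[1] <= remaining_budget),
                        key=lambda a: a[1])
    for name, _ in affordable:
        if predict_quality(name, x) >= quality_floor:
            return name
    return max(affordable, key=lambda a: predict_quality(a[0], x))[0]
```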
162 | Data-Efficient Double-Win Lottery Tickets from Robust Pre-training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we formulate a more rigorous concept, Double-Win Lottery Tickets, in which a located subnetwork from a pre-trained model can be independently transferred on diverse downstream tasks, to reach BOTH the same standard and robust generalization, under BOTH standard and adversarial training regimes, as the full pre-trained model can do. |
Tianlong Chen; Zhenyu Zhang; Sijia Liu; Yang Zhang; Shiyu Chang; Zhangyang Wang; |
163 | Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To trade off the DNN expressiveness (which calls for more non-linearity) and robustness certification scalability (which prefers more linearity), we propose a novel solution to strategically manipulate neurons, by "grafting" appropriate levels of linearity. |
Tianlong Chen; Huan Zhang; Zhenyu Zhang; Shiyu Chang; Sijia Liu; Pin-Yu Chen; Zhangyang Wang; |
164 | Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose the first optimistic model-based algorithm for PbRL with general function approximation, which estimates the model using value-targeted regression and calculates the exploratory policies by solving an optimistic planning problem. |
Xiaoyu Chen; Han Zhong; Zhuoran Yang; Zhaoran Wang; Liwei Wang; |
165 | Sample and Communication-Efficient Decentralized Actor-Critic Algorithms with Finite-Time Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we develop decentralized AC and natural AC (NAC) algorithms that avoid sharing agents’ local information and are sample and communication-efficient. |
Ziyi Chen; Yi Zhou; Rong-Rong Chen; Shaofeng Zou; |
166 | Task-aware Privacy Preservation for Multi-dimensional Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we address how to significantly improve the ultimate task performance with multi-dimensional user data by considering a task-aware privacy preservation problem. |
Jiangnan Cheng; Ao Tang; Sandeep Chinchali; |
167 | Adversarially Trained Actor Critic for Offline Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose Adversarially Trained Actor Critic (ATAC), a new model-free algorithm for offline reinforcement learning (RL) under insufficient data coverage, based on the concept of relative pessimism. |
Ching-An Cheng; Tengyang Xie; Nan Jiang; Alekh Agarwal; |
168 | Quantum-Inspired Algorithms from Randomized Numerical Linear Algebra Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We create classical (non-quantum) dynamic data structures supporting queries for recommender systems and least-squares regression that are comparable to their quantum analogues. |
Nadiia Chepurko; Kenneth Clarkson; Lior Horesh; Honghao Lin; David Woodruff; |
169 | RieszNet and ForestRiesz: Automatic Debiased Machine Learning with Neural Nets and Random Forests Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a multitasking Neural Net debiasing method with stochastic gradient descent minimization of a combined Riesz representer and regression loss, while sharing representation layers for the two functions. |
Victor Chernozhukov; Whitney Newey; Víctor M Quintas-Martínez; Vasilis Syrgkanis; |
170 | Self-supervised Learning with Random-projection Quantizer for Speech Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a simple and effective self-supervised learning approach for speech recognition. |
Chung-Cheng Chiu; James Qin; Yu Zhang; Jiahui Yu; Yonghui Wu; |
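The random-projection quantizer described above lends itself to a compact sketch: project each speech frame with a frozen random matrix and assign it to the nearest entry of a frozen random codebook, producing discrete targets for self-supervised prediction. The dimensions and names below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
feature_dim, proj_dim, codebook_size = 80, 16, 512
projection = rng.normal(size=(feature_dim, proj_dim))   # frozen, never trained
codebook = rng.normal(size=(codebook_size, proj_dim))   # frozen, never trained

def quantize(features: np.ndarray) -> np.ndarray:
    """Map (T, feature_dim) frames to (T,) discrete target indices."""
    projected = features @ projection                    # (T, proj_dim)
    # Nearest codebook entry under squared Euclidean distance.
    dists = ((projected[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return dists.argmin(axis=1)

# Pseudo-labels for a self-supervised prediction task over 100 random frames.
targets = quantize(rng.normal(size=(100, feature_dim)))
```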
171 | Discrete Probabilistic Inverse Optimal Transport Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We formalize and systematically analyze the properties of IOT using tools from the study of entropy-regularized OT. |
Wei-Ting Chiu; Pei Wang; Patrick Shafto; |
172 | Selective Network Linearization for Efficient Private Inference Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To reduce PI latency we propose a gradient-based algorithm that selectively linearizes ReLUs while maintaining prediction accuracy. |
Minsu Cho; Ameya Joshi; Brandon Reagen; Siddharth Garg; Chinmay Hegde; |
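One plausible way to make ReLU selection amenable to gradient-based search, consistent with the highlight above, is to gate each ReLU between the nonlinearity and the identity and penalize the gate toward linearity; whether this matches the paper's exact parameterization is an assumption.

```python
import numpy as np

def gated_relu(x: np.ndarray, alpha: float) -> np.ndarray:
    """Interpolate between a ReLU (alpha = 1) and the identity (alpha = 0).

    Training alpha with a sparsity penalty pushing it toward 0 selectively
    linearizes units, reducing the number of true ReLUs paid for at
    private-inference time."""
    return alpha * np.maximum(x, 0.0) + (1.0 - alpha) * x
```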
173 | From Block-Toeplitz Matrices to Differential Equations on Graphs: Towards A General Theory for Scalable Masked Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper we provide, to the best of our knowledge, the first comprehensive approach for incorporating various masking mechanisms into Transformers architectures in a scalable way. |
Krzysztof Choromanski; Han Lin; Haoxian Chen; Tianyi Zhang; Arijit Sehanobish; Valerii Likhosherstov; Jack Parker-Holder; Tamas Sarlos; Adrian Weller; Thomas Weingarten; |
174 | Shuffle Private Linear Contextual Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a general algorithmic framework for linear contextual bandits under the shuffle trust model, where there exists a trusted shuffler – in between users and the central server– that randomly permutes a batch of users data before sending those to the server. |
Sayak Ray Chowdhury; Xingyu Zhou; |
175 | DNA: Domain Generalization with Diversified Neural Averaging Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Methodologically, we propose a diversified neural averaging (DNA) method for DG, which optimizes the proposed PAC-Bayes bound approximately. |
Xu Chu; Yujie Jin; Wenwu Zhu; Yasha Wang; Xin Wang; Shanghang Zhang; Hong Mei; |
176 | TPC: Transformation-Specific Smoothing for Point Cloud Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a transformation-specific smoothing framework TPC, which provides tight and scalable robustness guarantees for point cloud models against semantic transformation attacks. |
Wenda Chu; Linyi Li; Bo Li; |
177 | Unified Scaling Laws for Routed Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: For these models, parameter count and computational requirement form two independent axes along which an increase leads to better performance. In this work we derive and justify scaling laws defined on these two variables which generalize those known for standard language models and describe the performance of a wide range of routing architectures trained via three different techniques. |
Aidan Clark; Diego De Las Casas; Aurelia Guy; Arthur Mensch; Michela Paganini; Jordan Hoffmann; Bogdan Damoc; Blake Hechtman; Trevor Cai; Sebastian Borgeaud; George Bm Van Den Driessche; Eliza Rutherford; Tom Hennigan; Matthew J Johnson; Albin Cassirer; Chris Jones; Elena Buchatskaya; David Budden; Laurent Sifre; Simon Osindero; Oriol Vinyals; Marc'Aurelio Ranzato; Jack Rae; Erich Elsen; Koray Kavukcuoglu; Karen Simonyan; |
178 | Context-Aware Drift Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Instead we may wish to test for differences in the distributions conditional on context that is permitted to change. To facilitate this we borrow machinery from the causal inference domain to develop a more general drift detection framework built upon a foundation of two-sample tests for conditional distributional treatment effects. |
Oliver Cobb; Arnaud Van Looveren; |
179 | On The Robustness of CountSketch to Adaptive Inputs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a robust estimator (for a slightly modified sketch) that allows for quadratic number of queries in the sketch size, which is an improvement factor of $\sqrt{k}$ (for $k$ heavy hitters) over prior "blackbox" approaches. |
Edith Cohen; Xin Lyu; Jelani Nelson; Tamas Sarlos; Moshe Shechner; Uri Stemmer; |
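For background, the snippet below implements the textbook (non-robust) CountSketch that the highlight above builds on. The paper's contribution, a robust estimator for adaptively chosen queries on a slightly modified sketch, is not reproduced here.

```python
import numpy as np

class CountSketch:
    """Textbook CountSketch for frequency estimation (background only)."""

    def __init__(self, rows: int = 5, width: int = 256, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.table = np.zeros((rows, width))
        # Per-row salts standing in for pairwise-independent hash functions.
        self.salt = rng.integers(0, 2**31, size=(rows, 2))
        self.width = width

    def _bucket_sign(self, item, r):
        h = hash((item, int(self.salt[r, 0]))) % self.width          # bucket
        s = 1 if hash((item, int(self.salt[r, 1]))) % 2 == 0 else -1  # sign
        return h, s

    def add(self, item, count: float = 1.0):
        for r in range(self.table.shape[0]):
            h, s = self._bucket_sign(item, r)
            self.table[r, h] += s * count

    def estimate(self, item) -> float:
        ests = []
        for r in range(self.table.shape[0]):
            h, s = self._bucket_sign(item, r)
            ests.append(s * self.table[r, h])
        return float(np.median(ests))
```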
180 | Diffusion Bridges Vector Quantized Variational Autoencoders Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a new model to train the prior and the encoder/decoder networks simultaneously. |
Max Cohen; Guillaume Quispe; Sylvain Le Corff; Charles Ollion; Eric Moulines; |
181 | Online and Consistent Correlation Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we study the problem in the classic online setting with recourse: the vertices of the graph arrive in an online manner and the goal is to maintain an approximate clustering while minimizing the number of times each vertex changes cluster. |
Vincent Cohen-Addad; Silvio Lattanzi; Andreas Maggiori; Nikos Parotsidis; |
182 | Massively Parallel $k$-Means Clustering for Perturbation Resilient Instances Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider $k$-means clustering of $n$ data points in Euclidean space in the Massively Parallel Computation (MPC) model, a computational model which is an abstraction of modern massively parallel computing systems such as MapReduce. |
Vincent Cohen-Addad; Vahab Mirrokni; Peilin Zhong; |
183 | One-Pass Diversified Sampling with Application to Terabyte-Scale Genomic Sequence Streams Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an efficient sampling routine that uses an online representation of the data distribution as a prefilter to retain elements from rare groups. |
Benjamin Coleman; Benito Geordie; Li Chou; R. A. Leo Elworth; Todd Treangen; Anshumali Shrivastava; |
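A generic way to bias a one-pass sample toward rare groups is to keep each element with probability inversely proportional to how often its group has been seen so far. The sketch below is only an illustration of the prefilter idea; the paper's routine maintains an online representation of the data distribution rather than the exact counts used here.

```python
import random
from collections import defaultdict

def diversified_stream_sample(stream, group_of, capacity=1000, seed=0):
    """One-pass sample that down-weights frequently seen groups.

    group_of maps each item to its group key (e.g., a sequence hash)."""
    rng = random.Random(seed)
    counts = defaultdict(int)
    sample = []
    for item in stream:
        g = group_of(item)
        counts[g] += 1
        # Rare groups are kept with high probability, common ones rarely.
        if len(sample) < capacity and rng.random() < 1.0 / counts[g]:
            sample.append(item)
    return sample
```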
184 | Transfer and Marginalize: Explaining Away Label Noise with Privileged Information Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop a simple and efficient method for supervised learning with neural networks: it transfers via weight sharing the knowledge learned with privileged information and approximately marginalizes over privileged information at test time. |
Mark Collier; Rodolphe Jenatton; Effrosyni Kokiopoulou; Jesse Berent; |
185 | MAML and ANIL Provably Learn Representations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we prove that two well-known GBML methods, MAML and ANIL, as well as their first-order approximations, are capable of learning common representation among a set of given tasks. |
Liam Collins; Aryan Mokhtari; Sewoong Oh; Sanjay Shakkottai; |
186 | Entropic Causal Inference: Graph Identifiability Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In our work, we first extend the causal graph identifiability result in the two-variable setting under relaxed assumptions. We then show the first identifiability result using the entropic approach for learning causal graphs with more than two nodes. |
Spencer Compton; Kristjan Greenewald; Dmitriy A Katz; Murat Kocaoglu; |
187 | Mitigating Gender Bias in Face Recognition Using The Von Mises-Fisher Mixture Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we investigate the gender bias of deep Face Recognition networks. |
Jean-Rémy Conti; Nathan Noiry; Stephan Clemencon; Vincent Despiegel; Stéphane Gentric; |
188 | Counterfactual Transportability: A Formal Approach Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we investigate the transportability of counterfactuals from an arbitrary combination of observational and experimental distributions coming from disparate domains. |
Juan D Correa; Sanghack Lee; Elias Bareinboim; |
189 | Label-Free Explainability for Unsupervised Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Hence, choosing which component(s) to interpret in a label-free unsupervised/self-supervised setting is an important, yet unsolved problem. To bridge this gap in the literature, we introduce two crucial extensions of post-hoc explanation techniques: (1) label-free feature importance and (2) label-free example importance that respectively highlight influential features and training examples for a black-box to construct representations at inference time. |
Jonathan Crabbé; Mihaela van der Schaar; |
190 | Evaluating The Adversarial Robustness of Adaptive Test-time Defenses Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: While these results are disappointing, we still believe that adaptive test-time defenses are a promising avenue of research and, as such, we provide recommendations for their thorough evaluation. |
Francesco Croce; Sven Gowal; Thomas Brunner; Evan Shelhamer; Matthias Hein; Taylan Cemgil; |
191 | Adversarial Robustness Against Multiple and Single $l_p$-Threat Models Via Quick Fine-Tuning of Robust Classifiers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper we propose Extreme norm Adversarial Training (E-AT) for multiple-norm robustness which is based on geometric properties of $l_p$-balls. |
Francesco Croce; Matthias Hein; |
192 | Self-conditioning Pre-Trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we aim to investigate the mechanisms that guide text generation with pre-trained Transformer-based Language Models (TLMs). |
Xavier Suau Cuadros; Luca Zappella; Nicholas Apostoloff; |
193 | Only Tails Matter: Average-Case Universality and Robustness in The Convex Regime Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work shows that the concentration of eigenvalues near the edges of the ESD determines a problem’s asymptotic average complexity. |
Leonardo Cunha; Gauthier Gidel; Fabian Pedregosa; Damien Scieur; Courtney Paquette; |
194 | Principal Component Flows Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we characterize the geometric structure of flows using principal manifolds and understand the relationship between latent variables and samples using contours. |
Edmond Cunningham; Adam D Cobb; Susmit Jha; |
195 | Deep Symbolic Regression for Recurrence Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we train Transformers to infer the function or recurrence relation underlying sequences of integers or floats, a typical task in human IQ tests which has hardly been tackled in the machine learning literature. |
Stéphane d'Ascoli; Pierre-Alexandre Kamienny; Guillaume Lample; Francois Charton; |
196 | Continuous Control with Action Quantization from Demonstrations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel Reinforcement Learning (RL) framework for problems with continuous action spaces: Action Quantization from Demonstrations (AQuaDem). |
Robert Dadashi; Léonard Hussenot; Damien Vincent; Sertan Girgin; Anton Raichuk; Matthieu Geist; Olivier Pietquin; |
197 | Dialog Inpainting: Turning Documents Into Dialogs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, conversational question answering (ConvQA) systems have long been stymied by scarce training data that is expensive to collect. To address this problem, we propose a new technique for synthetically generating diverse and high-quality dialog data: dialog inpainting. |
Zhuyun Dai; Arun Tejasvi Chaganty; Vincent Y Zhao; Aida Amini; Qazi Mamunur Rashid; Mike Green; Kelvin Guu; |
198 | DisPFL: Towards Communication-Efficient Personalized Federated Learning Via Decentralized Sparse Training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a novel personalized federated learning framework in a decentralized (peer-to-peer) communication protocol named DisPFL, which employs personalized sparse masks to customize sparse local models on the edge. |
Rong Dai; Li Shen; Fengxiang He; Xinmei Tian; Dacheng Tao; |
199 | Marginal Distribution Adaptation for Discrete Sets Via Module-Oriented Divergence Minimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop a general framework to adapt a generative model subject to a (possibly counterfactual) target data distribution with both sampling and computation efficiency. |
Hanjun Dai; Mengjiao Yang; Yuan Xue; Dale Schuurmans; Bo Dai; |
200 | Balancing Sample Efficiency and Suboptimality in Inverse Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel formulation for the Inverse Reinforcement Learning (IRL) problem, which jointly accounts for the compatibility with the expert behavior of the identified reward and its effectiveness for the subsequent forward learning phase. |
Angelo Damiani; Giorgio Manganini; Alberto Maria Metelli; Marcello Restelli; |
201 | Understanding Robust Generalization in Learning Regular Languages Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We hypothesize that standard end-to-end modeling strategies cannot generalize well to systematic distribution shifts and propose a compositional strategy to address this. |
Soham Dan; Osbert Bastani; Dan Roth; |
202 | Unsupervised Image Representation Learning with Deep Latent Particles Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a new representation of visual data that disentangles object position from appearance. |
Tal Daniel; Aviv Tamar; |
203 | Guarantees for Epsilon-Greedy Reinforcement Learning with Function Approximation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: These crucial questions have been scarcely investigated, despite the prominent practical importance of these policies. This paper presents a theoretical analysis of such policies and provides the first regret and sample-complexity bounds for reinforcement learning with myopic exploration. |
Chris Dann; Yishay Mansour; Mehryar Mohri; Ayush Sekhari; Karthik Sridharan; |
204 | Monarch: Expressive Structured Matrices for Efficient and Accurate Training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: These methods have not seen widespread adoption (1) in end-to-end training due to unfavorable efficiency–quality tradeoffs, and (2) in dense-to-sparse fine-tuning due to lack of tractable algorithms to approximate a given dense weight matrix. To address these issues, we propose a class of matrices (Monarch) that is hardware-efficient (they are parameterized as products of two block-diagonal matrices for better hardware utilization) and expressive (they can represent many commonly used transforms). |
Tri Dao; Beidi Chen; Nimit S Sohoni; Arjun Desai; Michael Poli; Jessica Grogan; Alexander Liu; Aniruddh Rao; Atri Rudra; Christopher Re; |
205 | Score-Guided Intermediate Level Optimization: Fast Langevin Mixing for Inverse Problems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In practice, to allow for increased expressivity, we propose to do posterior sampling in the latent space of a pre-trained generative model. |
Giannis Daras; Yuval Dagan; Alex Dimakis; Constantinos Daskalakis; |
206 | Test-Time Training Can Close The Natural Distribution Shift Performance Gap in Deep Learning Based Compressed Sensing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a domain adaptation method for deep learning based compressive sensing that relies on self-supervision during training paired with test-time training at inference. |
Mohammad Zalbagi Darestani; Jiayu Liu; Reinhard Heckel; |
207 | Knowledge Base Question Answering By Case-based Reasoning Over Subgraphs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Leveraging this structural similarity between local neighborhoods of different subgraphs, we introduce a semiparametric model (CBR-SUBG) with (i) a nonparametric component that for each query, dynamically retrieves other similar $k$-nearest neighbor (KNN) training queries along with query-specific subgraphs and (ii) a parametric component that is trained to identify the (latent) reasoning patterns from the subgraphs of KNN queries and then apply them to the subgraph of the target query. |
Rajarshi Das; Ameya Godbole; Ankita Naik; Elliot Tower; Manzil Zaheer; Hannaneh Hajishirzi; Robin Jia; Andrew Mccallum; |
208 | Framework for Evaluating Faithfulness of Local Explanations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the faithfulness of an explanation system to the underlying prediction model. |
Sanjoy Dasgupta; Nave Frost; Michal Moshkovitz; |
209 | Distinguishing Rule and Exemplar-based Generalization in Learning Systems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The trade-off between exemplar- and rule-based generalization has been studied extensively in cognitive psychology; in this work, we present a protocol inspired by these experimental approaches to probe the inductive biases that control this trade-off in category-learning systems such as artificial neural networks. |
Ishita Dasgupta; Erin Grant; Tom Griffiths; |
210 | Robust Multi-Objective Bayesian Optimization Under Input Noise Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Since directly optimizing MVaR is computationally infeasible in many settings, we propose a scalable, theoretically-grounded approach for optimizing MVaR using random scalarizations. |
Samuel Daulton; Sait Cakmak; Maximilian Balandat; Michael A. Osborne; Enlu Zhou; Eytan Bakshy; |
211 | Attentional Meta-learners for Few-shot Polythetic Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, we find that in the presence of task-irrelevant features, inherent to meta-learning problems, attentional models are susceptible to misclassification. To address this challenge, we propose a self-attention feature-selection mechanism that adaptively dilutes non-discriminative features. |
Ben J Day; Ramon Viñas Torné; Nikola Simidjievski; Pietro Liò; |
212 | Adversarial Vulnerability of Randomized Ensembles Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, this impressive performance raises the question: Are these robustness gains provided by randomized ensembles real? In this work we address this question both theoretically and empirically. |
Hassan Dbouk; Naresh Shanbhag; |
213 | Born-Infeld (BI) for AI: Energy-Conserving Descent (ECD) for Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a novel framework for optimization based on energy-conserving Hamiltonian dynamics in a strongly mixing (chaotic) regime and establish its key properties analytically and numerically. |
Giuseppe Bruno De Luca; Eva Silverstein; |
214 | Error-driven Input Modulation: Solving The Credit Assignment Problem Without A Backward Pass Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here, we propose to replace the backward pass with a second forward pass in which the input signal is modulated based on the error of the network. |
Giorgia Dellaferrera; Gabriel Kreiman; |
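The two-forward-pass idea above can be sketched on a tiny two-layer network: run a standard forward pass, project the output error onto the input through a fixed random matrix, run a second forward pass on the modulated input, and update each layer from the difference between the two passes. The projection matrix F and the local update rules below reflect our loose reading of the highlight, not a faithful reproduction of the paper's method.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hid, n_out, lr = 8, 16, 4, 0.01
W1 = rng.normal(scale=0.1, size=(n_in, n_hid))
W2 = rng.normal(scale=0.1, size=(n_hid, n_out))
F = rng.normal(scale=0.1, size=(n_out, n_in))  # fixed random error projection

def relu(z):
    return np.maximum(z, 0.0)

def train_step(x, y):
    """One weight update using two forward passes and no backward pass."""
    global W1, W2
    h = relu(x @ W1)                  # first (standard) forward pass
    err = h @ W2 - y                  # output error
    x_mod = x - err @ F               # error-modulated input
    h_mod = relu(x_mod @ W1)          # second forward pass
    W1 -= lr * np.outer(x_mod, h - h_mod)       # local layer-wise updates
    W2 -= lr * np.outer(h_mod, h_mod @ W2 - y)

# Example usage on a random input with a one-hot target.
train_step(rng.normal(size=n_in), np.eye(n_out)[0])
```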
215 | DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a reconstruction-free MBRL agent, called DreamerPro, that can enhance robustness to distractions. |
Fei Deng; Ingook Jang; Sungjin Ahn; |
216 | NeuralEF: Deconstructing Kernels By Deep Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, the existing method relies on an expensive orthogonalization step and is difficult to implement. We show that these problems can be fixed by using a new series of objective functions that generalizes the EigenGame to function space. |
Zhijie Deng; Jiaxin Shi; Jun Zhu; |
217 | Deep Causal Metric Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, this can lead the model to recklessly learn all the correlated distances found in training data including the spurious distance (e.g., background differences) that is not the distance of interest and can harm the generalization of the learned metric. To address this issue, we study metric learning from a causality perspective and accordingly propose deep causal metric learning (DCML) that pursues the true causality of the distance between samples. |
Xiang Deng; Zhongfei Zhang; |
218 | On The Convergence of Inexact Predictor-Corrector Methods for Linear Programming Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To remedy this, we theoretically and empirically analyze (slightly modified) predictor-corrector IPMs when using approximate linear solvers: our approach guarantees that, when certain conditions are satisfied, the number of IPM iterations does not increase and that the final solution remains feasible. |
Gregory Dexter; Agniva Chowdhury; Haim Avron; Petros Drineas; |
219 | Analysis of Stochastic Processes Through Replay Buffers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we analyze a system where a stochastic process X is pushed into a replay buffer and then randomly sampled to generate a stochastic process Y from the replay buffer. |
Shirli Di-Castro; Shie Mannor; Dotan Di Castro; |
220 | Streaming Algorithms for High-Dimensional Robust Statistics Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we develop the first efficient streaming algorithms for high-dimensional robust statistics with near-optimal memory requirements (up to logarithmic factors). |
Ilias Diakonikolas; Daniel M. Kane; Ankit Pensia; Thanasis Pittas; |
221 | Learning General Halfspaces with Adversarial Label Noise Via Online Gradient Descent Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we show that the problem can be solved directly via online gradient descent applied to a sequence of natural non-convex surrogates. |
Ilias Diakonikolas; Vasilis Kontonis; Christos Tzamos; Nikos Zarifis; |
222 | Variational Feature Pyramid Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we opt to learn a dataset-specific architecture for Feature Pyramid Networks. |
Panagiotis Dimitrakopoulos; Giorgos Sfikas; Christophoros Nikou; |
223 | Understanding Doubly Stochastic Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, the analysis of why this projection improves clustering has been limited. In this paper we present theoretical conditions on the given affinity matrix under which its doubly stochastic projection is an ideal affinity matrix (i.e., it has no false connections between clusters, and is well-connected within each cluster). |
Tianjiao Ding; Derek Lim; Rene Vidal; Benjamin D Haeffele; |
224 | Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To learn a Nash equilibrium of an MPG in which the size of state space and/or the number of players can be very large, we propose new independent policy gradient algorithms that are run by all players in tandem. |
Dongsheng Ding; Chen-Yu Wei; Kaiqing Zhang; Mihailo Jovanovic; |
225 | Generalization and Robustness Implications in Object-Centric Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we train state-of-the-art unsupervised models on five common multi-object datasets and evaluate segmentation metrics and downstream object property prediction. |
Andrea Dittadi; Samuele S Papa; Michele De Vita; Bernhard Schölkopf; Ole Winther; Francesco Locatello; |
226 | Fair Generalized Linear Models with A Convex Penalty Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper we introduce two fairness criteria for GLMs based on equalizing expected outcomes or log-likelihoods. |
Hyungrok Do; Preston Putzel; Axel S Martin; Padhraic Smyth; Judy Zhong; |
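In the spirit of the first criterion above, a convex fairness penalty can compare group-wise mean outcomes; the sketch below is a heavily simplified stand-in (two groups, squared gap on the linear predictor), not the paper's exact penalty.

```python
import numpy as np

def outcome_gap_penalty(linear_pred: np.ndarray, group: np.ndarray) -> float:
    """Squared gap between group-wise mean linear predictors.

    Each group mean is linear in the model parameters, so the squared gap is
    convex in those parameters and can be added to the GLM's negative
    log-likelihood while keeping the overall objective convex."""
    gap = linear_pred[group == 0].mean() - linear_pred[group == 1].mean()
    return float(gap ** 2)
```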
227 | Bayesian Learning with Information Gain Provably Bounds Risk for A Robust Adversarial Defense Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a new algorithm to learn a deep neural network model robust against adversarial attacks. |
Bao Gia Doan; Ehsan M Abbasnejad; Javen Qinfeng Shi; Damith Ranashinghe; |
228 | On The Adversarial Robustness of Causal Algorithmic Recourse Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we formulate the adversarially robust recourse problem and show that recourse methods that offer minimally costly recourse fail to be robust. |
Ricardo Dominguez-Olmedo; Amir H Karimi; Bernhard Schölkopf; |
229 | Finding The Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present an adaptive-mapping quantization method to learn an optimal latent sub-distribution that is inherent within models and smoothly approximated with a concrete Gaussian Mixture (GM). |
Runpei Dong; Zhanhong Tan; Mengdi Wu; Linfeng Zhang; Kaisheng Ma; |
230 | PACE: A Parallelizable Computation Encoder for Directed Acyclic Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a Parallelizable Attention-based Computation structure Encoder (PACE) that processes nodes simultaneously and encodes DAGs in parallel. |
Zehao Dong; Muhan Zhang; Fuhai Li; Yixin Chen; |
231 | Privacy for Free: How Does Dataset Condensation Help Privacy? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we identify for the first time that dataset condensation (DC), originally designed to improve training efficiency, is also a better solution to replace traditional data generators for private data generation, thus providing privacy for free. |
Tian Dong; Bo Zhao; Lingjuan Lyu; |
232 | Fast Rates for Noisy Interpolation Require Rethinking The Effect of Inductive Bias Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Even though this intuition is valid for regularized models, in this paper we caution against a strong inductive bias for interpolation in the presence of noise: While a stronger inductive bias encourages a simpler structure that is more aligned with the ground truth, it also increases the detrimental effect of noise. |
Konstantin Donhauser; Nicolò Ruggeri; Stefan Stojanovic; Fanny Yang; |
233 | Adapting to Mixing Time in Stochastic Optimization with Markovian Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose the first optimization method that does not require the knowledge of the mixing time, yet obtains the optimal asymptotic convergence rate when applied to convex problems. |
Ron Dorfman; Kfir Yehuda Levy; |
234 | TACTiS: Transformer-Attentional Copulas for Time Series Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we address the problem of estimating the joint predictive distribution of high-dimensional multivariate time series. |
Alexandre Drouin; Étienne Marcotte; Nicolas Chapados; |
235 | Branching Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel Branching Reinforcement Learning (Branching RL) model, and investigate both Regret Minimization (RM) and Reward-Free Exploration (RFE) metrics for this model. |
Yihan Du; Wei Chen; |
236 | Bayesian Imitation Learning for End-to-End Mobile Manipulation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work we investigate and demonstrate benefits of a Bayesian approach to imitation learning from multiple sensor inputs, as applied to the task of opening office doors with a mobile manipulator. |
Yuqing Du; Daniel Ho; Alex Alemi; Eric Jang; Mohi Khansari; |
237 | GLaM: Efficient Scaling of Language Models with Mixture-of-Experts Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose and develop a family of language models named GLaM (Generalist Language Model), which uses a sparsely activated mixture-of-experts architecture to scale the model capacity while also incurring substantially less training cost compared to dense variants. |
Nan Du; Yanping Huang; Andrew M Dai; Simon Tong; Dmitry Lepikhin; Yuanzhong Xu; Maxim Krikun; Yanqi Zhou; Adams Wei Yu; Orhan Firat; Barret Zoph; Liam Fedus; Maarten P Bosma; Zongwei Zhou; Tao Wang; Emma Wang; Kellie Webster; Marie Pellat; Kevin Robinson; Kathleen Meier-Hellstern; Toju Duke; Lucas Dixon; Kun Zhang; Quoc Le; Yonghui Wu; Zhifeng Chen; Claire Cui; |
238 | Learning Iterative Reasoning Through Energy Minimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present a new framework for iterative reasoning with neural networks. |
Yilun Du; Shuang Li; Joshua Tenenbaum; Igor Mordatch; |
239 | SE(3) Equivariant Graph Neural Networks with Complete Local Frames Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a framework to construct SE(3) equivariant graph neural networks that can approximate the geometric quantities efficiently. |
Weitao Du; He Zhang; Yuanqi Du; Qi Meng; Wei Chen; Nanning Zheng; Bin Shao; Tie-Yan Liu; |
240 | A Context-Integrated Transformer-Based Neural Network for Auction Design Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, these works either focus on a fixed set of bidders and items, or restrict the auction to be symmetric. In this work, we overcome such limitations by factoring public contextual information of bidders and items into the auction learning framework. |
Zhijian Duan; Jingwu Tang; Yutong Yin; Zhe Feng; Xiang Yan; Manzil Zaheer; Xiaotie Deng; |
241 | Augment with Care: Contrastive Learning for Combinatorial Problems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We find that label-preserving augmentations are critical for the success of contrastive pre-training. |
Haonan Duan; Pashootan Vaezipoor; Max B Paulus; Yangjun Ruan; Chris Maddison; |
242 | Parametric Visual Program Induction with Function Modularization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose the concept of parametric visual program induction. |
Xuguang Duan; Xin Wang; Ziwei Zhang; Wenwu Zhu; |
243 | Bayesian Deep Embedding Topic Meta-Learner Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel framework that efficiently solves the problem of topic modeling under the small data regime. |
Zhibin Duan; Yishi Xu; Jianqiao Sun; Bo Chen; Wenchao Chen; Chaojie Wang; Mingyuan Zhou; |
244 | Deletion Robust Submodular Maximization Over Matroids Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we study the deletion robust version of the problem under the classic matroids constraint. |
Paul Duetting; Federico Fusco; Silvio Lattanzi; Ashkan Norouzi-Fard; Morteza Zadimoghaddam; |
245 | From Data to Functa: Your Data Point Is A Function and You Can Treat It Like One Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: A powerful continuous alternative is then to represent these measurements using an implicit neural representation, a neural function trained to output the appropriate measurement value for any input spatial location. In this paper, we take this idea to its next level: what would it take to perform deep learning on these functions instead, treating them as data? |
Emilien Dupont; Hyunjik Kim; S. M. Ali Eslami; Danilo Jimenez Rezende; Dan Rosenbaum; |
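The "data point as a function" view above can be made concrete by fitting a tiny coordinate network to a single signal, so the network itself becomes the representation. The architecture, signal, and training loop below are illustrative assumptions; the paper's meta-learning of such functa is not shown.

```python
import numpy as np

# Fit a tiny MLP f(coordinate) -> value to one signal, so the signal
# is represented by the network's weights rather than by raw samples.
rng = np.random.default_rng(2)
xs = np.linspace(-1, 1, 64)[:, None]      # spatial coordinates
ys = np.sin(3 * np.pi * xs)               # the "data point" to represent

W1, b1 = rng.normal(scale=0.5, size=(1, 32)), np.zeros(32)
W2, b2 = rng.normal(scale=0.5, size=(32, 1)), np.zeros(1)

for _ in range(2000):                     # plain full-batch gradient descent
    h = np.tanh(xs @ W1 + b1)
    pred = h @ W2 + b2
    g = 2 * (pred - ys) / len(xs)         # gradient of MSE w.r.t. pred
    gW2, gb2 = h.T @ g, g.sum(0)
    gh = g @ W2.T * (1 - h ** 2)          # backprop through tanh
    gW1, gb1 = xs.T @ gh, gh.sum(0)
    for p, gp in ((W1, gW1), (b1, gb1), (W2, gW2), (b2, gb2)):
        p -= 0.1 * gp                     # in-place parameter update
```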
246 | Efficient Low Rank Convex Bounds for Pairwise Discrete Graphical Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we extend a Burer-Monteiro style method to compute low rank Semi-Definite Programming (SDP) bounds for the MAP problem on discrete graphical models with an arbitrary number of states and arbitrary pairwise potentials. |
Valentin Durante; George Katsirelos; Thomas Schiex; |
247 | Robust Counterfactual Explanations for Tree-Based Ensembles Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a novel strategy – that we call RobX – to generate robust counterfactuals for tree-based ensembles, e.g., XGBoost. |
Sanghamitra Dutta; Jason Long; Saumitra Mishra; Cecilia Tilli; Daniele Magazzeni; |
248 | On The Difficulty of Defending Self-Supervised Learning Against Model Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We thus explore model stealing attacks against SSL. |
Adam Dziedzic; Nikita Dhawan; Muhammad Ahmad Kaleem; Jonas Guan; Nicolas Papernot; |
249 | LIMO: Latent Inceptionism for Targeted Molecule Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present Latent Inceptionism on Molecules (LIMO), which significantly accelerates molecule generation with an inceptionism-like technique. |
Peter Eckmann; Kunyang Sun; Bo Zhao; Mudong Feng; Michael Gilson; Rose Yu; |
250 | Inductive Biases and Variable Creation in Self-Attention Mechanisms Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To support our analysis, we present synthetic experiments to probe the sample complexity of learning sparse Boolean functions with Transformers. |
Benjamin L Edelman; Surbhi Goel; Sham Kakade; Cyril Zhang; |
251 | Provable Reinforcement Learning with A Short-Term Memory Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Motivated by the problem structure in several physical applications, as well as a commonly used technique known as "frame stacking", this paper proposes to study a new subclass of POMDPs, whose latent states can be decoded by the most recent history of a short length $m$. |
Yonathan Efroni; Chi Jin; Akshay Krishnamurthy; Sobhan Miryoosefi; |
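A minimal sketch of the "frame stacking" technique referenced in the highlight above, assuming NumPy observation vectors; this is illustrative only, not the paper's construction:

```python
from collections import deque

import numpy as np

class FrameStack:
    """Condition the agent on the m most recent observations.

    Illustrative only: the paper studies POMDPs whose latent state is
    decodable from a length-m history; stacking frames turns that
    history into a single proxy state.
    """

    def __init__(self, m, obs_dim):
        self.m = m
        self.obs_dim = obs_dim
        self.frames = deque(maxlen=m)

    def reset(self, obs):
        # Pad with zeros so the stacked state always has length m * obs_dim.
        self.frames.extend(np.zeros(self.obs_dim) for _ in range(self.m))
        return self.push(obs)

    def push(self, obs):
        self.frames.append(np.asarray(obs, dtype=float))
        return np.concatenate(self.frames)

# Usage: a 3-frame stack over 2-dimensional observations.
stack = FrameStack(m=3, obs_dim=2)
state = stack.reset(np.array([0.1, 0.2]))
state = stack.push(np.array([0.3, 0.4]))
print(state.shape)  # (6,)
```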
252 | Sparsity in Partially Controllable Linear Systems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In particular, our structural results characterize those state variables which are irrelevant for optimal control, an analysis which departs from classical control techniques. |
Yonathan Efroni; Sham Kakade; Akshay Krishnamurthy; Cyril Zhang; |
253 | FedNew: A Communication-Efficient and Privacy-Preserving Newton-Type Method for Federated Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce a novel framework called FedNew in which there is no need to transmit Hessian information from the clients to the parameter server (PS), hence resolving the bottleneck and improving communication efficiency. |
Anis Elgabli; Chaouki Ben Issaid; Amrit Singh Bedi; Ketan Rajawat; Mehdi Bennis; Vaneet Aggarwal; |
254 | PathGCN: Learning General Graph Spatial Operators from Paths Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we propose pathGCN, a novel approach to learn the spatial operator from random paths on the graph. |
Moshe Eliasof; Eldad Haber; Eran Treister; |
255 | Discrete Tree Flows Via Tree-Structured Permutations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our approach seeks to reduce computational burden and remove the need for pseudo-gradients by developing a discrete flow based on decision trees—building upon the success of efficient tree-based methods for classification and regression for discrete data. |
Mai Elkady; Hyung Zin Lim; David I Inouye; |
256 | For Learning in Symmetric Teams, Local Optima Are Global Nash Equilibria Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we show that any locally optimal symmetric strategy profile is also a (global) Nash equilibrium. |
Scott Emmons; Caspar Oesterheld; Andrew Critch; Vincent Conitzer; Stuart Russell; |
257 | Streaming Algorithm for Monotone K-Submodular Maximization with Cardinality Constraints Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we develop a new streaming algorithm for maximizing a monotone k-submodular function subject to a per-coordinate cardinality constraint attaining an approximation guarantee close to the state of the art guarantee in the offline setting. |
Alina Ene; Huy Nguyen; |
258 | Towards Scaling Difference Target Propagation By Learning Backprop Targets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel feedback weight training scheme that ensures both that DTP approximates BP and that layer-wise feedback weight training can be restored without sacrificing any theoretical guarantees. |
Maxence M Ernoult; Fabrice Normandin; Abhinav Moudgil; Sean Spinney; Eugene Belilovsky; Irina Rish; Blake Richards; Yoshua Bengio; |
259 | Understanding Dataset Difficulty with $\mathcal{V}$-Usable Information Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To address these questions, we frame dataset difficulty, w.r.t. a model $\mathcal{V}$, as the lack of $\mathcal{V}$-usable information (Xu et al., 2019), where a lower value indicates a more difficult dataset for $\mathcal{V}$. |
Kawin Ethayarajh; Yejin Choi; Swabha Swayamdipta; |
260 | Head2Toe: Utilizing Intermediate Representations for Better Transfer Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a method, Head-to-Toe probing (Head2Toe), that selects features from all layers of the source model to train a classification head for the target-domain. |
Utku Evci; Vincent Dumoulin; Hugo Larochelle; Michael C Mozer; |
261 | Variational Sparse Coding with Learned Thresholding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a new approach to variational sparse coding that allows us to learn sparse distributions by thresholding samples, avoiding the use of problematic relaxations. |
Kion Fallah; Christopher J Rozell; |
262 | Training Discrete Deep Generative Models Via Gapped Straight-Through Estimator Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a Gapped Straight-Through (GST) estimator to reduce the variance without incurring resampling overhead. |
Ting-Han Fan; Ta-Chung Chi; Alexander I. Rudnicky; Peter J Ramadge; |
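For context on #262, a minimal sketch of the vanilla straight-through estimator that the Gapped Straight-Through (GST) estimator builds on, assuming PyTorch; the gapped variant itself is not shown:

```python
import torch

def straight_through_sample(logits):
    # Forward pass: a discrete one-hot sample.
    # Backward pass: gradients flow through the softmax probabilities as
    # if sampling were differentiable (the classic straight-through trick).
    probs = torch.softmax(logits, dim=-1)
    index = torch.multinomial(probs, num_samples=1)
    one_hot = torch.zeros_like(probs).scatter_(-1, index, 1.0)
    return one_hot + probs - probs.detach()

logits = torch.randn(4, 8, requires_grad=True)
sample = straight_through_sample(logits)
loss = (sample * torch.arange(8.0)).sum()
loss.backward()
print(logits.grad is not None)  # True: gradients reach the logits
```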
263 | DRIBO: Robust Deep Reinforcement Learning Via Multi-View Information Bottleneck Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, we introduce a novel contrastive version of the Multi-View Information Bottleneck (MIB) objective for temporal data. |
Jiameng Fan; Wenchao Li; |
264 | Generalized Data Distribution Iteration Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Obtaining higher sample efficiency and superior final performance simultaneously has been one of the major challenges for deep reinforcement learning (DRL). Previous work could handle one of these challenges but typically failed to address both concurrently. In this paper, we tackle the two challenges simultaneously. |
Jiajun Fan; Changnan Xiao; |
265 | Variational Wasserstein Gradient Flow Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper builds on the recent works with a slight but crucial difference: we propose to utilize a variational formulation of the objective function formulated as maximization over a parametric class of functions. |
Jiaojiao Fan; Qinsheng Zhang; Amirhossein Taghvaei; Yongxin Chen; |
266 | Data Determines Distributional Robustness in Contrastive Language Image Pre-training (CLIP) Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Since these language-image models differ from previous training approaches in several ways, an important question is what causes the large robustness gains. We answer this question via a systematic experimental investigation. |
Alex Fang; Gabriel Ilharco; Mitchell Wortsman; Yuhao Wan; Vaishaal Shankar; Achal Dave; Ludwig Schmidt; |
267 | Bayesian Continuous-Time Tucker Decomposition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: They either drop the timestamps or bin them into crude steps (thereby ignoring the temporal dynamics within each step), or use simple parametric time coefficients. To overcome these limitations, we propose Bayesian Continuous-Time Tucker Decomposition. |
Shikai Fang; Akil Narayan; Robert Kirby; Shandian Zhe; |
268 | Byzantine Machine Learning Made Easy By Resilient Averaging of Momentums Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present RESAM (RESilient Averaging of Momentums), a unified framework that makes it simple to establish optimal Byzantine resilience, relying only on standard machine learning assumptions. |
Sadegh Farhadkhani; Rachid Guerraoui; Nirupam Gupta; Rafael Pinot; John Stephan; |
269 | An Equivalence Between Data Poisoning and Byzantine Gradient Attacks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we show a surprising equivalence between this model and data poisoning, a threat considered much more realistic. |
Sadegh Farhadkhani; Rachid Guerraoui; Lê Nguyên Hoang; Oscar Villemaud;
270 | Investigating Generalization By Controlling Normalized Margin Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The paper finds that yes: in a standard training setup, test performance closely tracks normalized margin. The paper suggests a Gaussian process model as a promising explanation for this behavior. |
Alexander R Farhang; Jeremy D Bernstein; Kushal Tirumala; Yang Liu; Yisong Yue; |
271 | Kernelized Multiplicative Weights for 0/1-Polyhedral Games: Bridging The Gap Between Learning in Extensive-Form and Normal-Form Games Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we show that the Optimistic Multiplicative Weights Update (OMWU) algorithm—the premier learning algorithm for NFGs—can be simulated on the normal-form equivalent of an EFG in linear time per iteration in the game tree size using a kernel trick. |
Gabriele Farina; Chung-Wei Lee; Haipeng Luo; Christian Kroer; |
272 | Local Linear Convergence of Douglas-Rachford for Linear Programming: A Probabilistic Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we analyze the local linear convergence rate $r$ of the DRS method for random linear programs, and give explicit and tight bounds on $r$. |
Oisin Faust; Hamza Fawzi; |
273 | Matching Structure for Dual Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose to further enhance dual learning with structure matching that explicitly builds structural connections in between. |
Hao Fei; Shengqiong Wu; Yafeng Ren; Meishan Zhang; |
274 | Cascaded Gaps: Towards Logarithmic Regret for Risk-Sensitive Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study gap-dependent regret guarantees for risk-sensitive reinforcement learning based on the entropic risk measure. |
Yingjie Fei; Ruitu Xu; |
275 | Private Frequency Estimation Via Projective Geometry Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a new algorithm ProjectiveGeometryResponse (PGR) for locally differentially private (LDP) frequency estimation. |
Vitaly Feldman; Jelani Nelson; Huy Nguyen; Kunal Talwar; |
276 | An Intriguing Property of Geophysics Inversion Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To alleviate those issues, recent studies leverage deep neural networks to learn the inversion mappings from measurements to the property directly. In this paper, we show that such a mapping can be well modeled by a very shallow (but not wide) network with only five layers. |
Yinan Feng; Yinpeng Chen; Shihang Feng; Peng Jin; Zicheng Liu; Youzuo Lin; |
277 | Principled Knowledge Extrapolation with GANs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose to study counterfactual synthesis from a new perspective of knowledge extrapolation, where a given knowledge dimension of the data distribution is extrapolated, but the remaining knowledge is kept indistinguishable from the original distribution. |
Ruili Feng; Jie Xiao; Kecheng Zheng; Deli Zhao; Jingren Zhou; Qibin Sun; Zheng-Jun Zha; |
278 | A Resilient Distributed Boosting Algorithm Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a distributed boosting algorithm which is resilient to a limited amount of noise. |
Yuval Filmus; Idan Mehalel; Shay Moran; |
279 | Model-Value Inconsistency As A Signal for Epistemic Uncertainty Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Using a model of the environment and a value function, an agent can construct many estimates of a state’s value, by unrolling the model for different lengths and bootstrapping with its value function. Our key insight is that one can treat this set of value estimates as a type of ensemble, which we call an implicit value ensemble (IVE). |
Angelos Filos; Eszter Vértes; Zita Marinho; Gregory Farquhar; Diana Borsa; Abram Friesen; Feryal Behbahani; Tom Schaul; Andre Barreto; Simon Osindero;
280 | Coordinated Double Machine Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: While this methodology is flexible and can accommodate arbitrary predictive models, typically trained independently of one another, this paper argues that a carefully coordinated learning algorithm for deep neural networks may reduce the estimation bias. |
Nitai Fingerhut; Matteo Sesia; Yaniv Romano; |
281 | Conformal Prediction Sets with Limited False Positives Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We develop a new approach to multi-label conformal prediction in which we aim to output a precise set of promising prediction candidates with a bounded number of incorrect answers. |
Adam Fisch; Tal Schuster; Tommi Jaakkola; Regina Barzilay;
282 | Fast Population-Based Reinforcement Learning on A Single Machine Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we compare implementations and revisit previous studies to show that the judicious use of compilation and vectorization allows population-based training to be performed on a single machine with one accelerator with minimal overhead compared to training a single agent. |
Arthur Flajolet; Claire Bizon Monroc; Karim Beguir; Thomas Pierrot; |
283 | Fast Relative Entropy Coding with A* Coding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce AS* and AD* coding, two REC algorithms based on A* sampling. |
Gergely Flamich; Stratis Markou; Jose Miguel Hernandez-Lobato; |
284 | Contrastive Mixture of Posteriors for Counterfactual Inference, Data Integration and Fairness Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Adopting a Conditional VAE framework, we show that marginal independence between the representation and a condition variable plays a key role in both of these challenges. We propose the Contrastive Mixture of Posteriors (CoMP) method that uses a novel misalignment penalty defined in terms of mixtures of the variational posteriors to enforce this independence in latent space. |
Adam Foster; Arpi Vezer; Craig A. Glastonbury; Paidi Creed; Samer Abujudeh; Aaron Sim; |
285 | Label Ranking Through Nonparametric Regression Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a generative model for Label Ranking, in noiseless and noisy nonparametric regression settings, and provide sample complexity bounds for learning algorithms in both cases. |
Dimitris Fotakis; Alkis Kalavasis; Eleni Psaroudaki; |
286 | A Neural Tangent Kernel Perspective of GANs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a novel theoretical framework of analysis for Generative Adversarial Networks (GANs). |
Jean-Yves Franceschi; Emmanuel De Bézenac; Ibrahim Ayed; Mickael Chen; Sylvain Lamprier; Patrick Gallinari;
287 | Extracting Latent State Representations with Linear Dynamics from Rich Observations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider a setting where there is a hidden linear subspace of the high-dimensional feature space in which the dynamics are linear. |
Abraham Frandsen; Rong Ge; Holden Lee; |
288 | SPDY: Accurate Pruning with Speedup Guarantees Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Yet, most existing pruning methods minimize just the number of remaining weights, i.e. the size of the model, rather than optimizing for inference time. We address this gap by introducing SPDY, a new compression method which automatically determines layer-wise sparsity targets achieving a desired inference speedup on a given system, while minimizing accuracy loss. |
Elias Frantar; Dan Alistarh; |
289 | Revisiting The Effects of Stochasticity for Hamiltonian Samplers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We revisit the theoretical properties of Hamiltonian stochastic differential equations (SDEs) for Bayesian posterior sampling, and we study the two types of errors that arise from numerical SDE simulation: the discretization error and the error due to noisy gradient estimates in the context of data subsampling. |
Giulio Franzese; Dimitrios Milios; Maurizio Filippone; Pietro Michiardi; |
290 | Bregman Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a framework based on bilevel optimization for learning multilayer, deep data representations. |
Jordan Frecon; Gilles Gasso; Massimiliano Pontil; Saverio Salzo; |
291 | (Non-)Convergence Results for Predictive Coding Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: One major open problem around PCNs is their convergence behavior. In this paper, we use dynamical systems theory to formally investigate the convergence of PCNs as they are used in machine learning. |
Simon Frieder; Thomas Lukasiewicz; |
292 | Scaling Structured Inference with Randomization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here, we propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states. |
Yao Fu; John Cunningham; Mirella Lapata; |
293 | Greedy When Sure and Conservative When Uncertain About The Opponents Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop a new approach, named Greedy when Sure and Conservative when Uncertain (GSCU), to competing online against unknown and nonstationary opponents. |
Haobo Fu; Ye Tian; Hongxiang Yu; Weiming Liu; Shuang Wu; Jiechao Xiong; Ying Wen; Kai Li; Junliang Xing; Qiang Fu; Wei Yang; |
294 | DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we open up a new compression paradigm for developing real-hardware efficient DNNs, leading to boosted hardware efficiency while maintaining model accuracy. |
Yonggan Fu; Haichuan Yang; Jiayi Yuan; Meng Li; Cheng Wan; Raghuraman Krishnamoorthi; Vikas Chandra; Yingyan Lin; |
295 | Revisiting Some Common Practices in Cooperative Multi-Agent Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Inspired by our theoretical analysis, we present practical suggestions on implementing multi-agent PG algorithms for either high rewards or diverse emergent behaviors and empirically validate our findings on a variety of domains, ranging from the simplified matrix and grid-world games to complex benchmarks such as StarCraft Multi-Agent Challenge and Google Research Football. |
Wei Fu; Chao Yu; Zelai Xu; Jiaqi Yang; Yi Wu; |
296 | $p$-Laplacian Based Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Moreover, when the topology is non-informative for label prediction, ordinary GNNs may perform significantly worse than simply applying multi-layer perceptrons (MLPs) on each node. To tackle the above problem, we propose a new $p$-Laplacian based GNN model, termed $^p$GNN, whose message passing mechanism is derived from a discrete regularization framework and can be theoretically explained as an approximation of a polynomial graph filter defined on the spectral domain of $p$-Laplacians. |
Guoji Fu; Peilin Zhao; Yatao Bian; |
297 | Why Should I Trust You, Bellman? The Bellman Error Is A Poor Replacement for Value Error Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we study the use of the Bellman equation as a surrogate objective for value prediction accuracy. |
Scott Fujimoto; David Meger; Doina Precup; Ofir Nachum; Shixiang Shane Gu; |
298 | Robin Hood and Matthew Effects: Differential Privacy Has Disparate Impact on Synthetic Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We analyze the impact of DP on these models vis-a-vis underrepresented classes/subgroups of data, specifically, studying: 1) the size of classes/subgroups in the synthetic data, and 2) the accuracy of classification tasks run on them. |
Georgi Ganev; Bristena Oprisanu; Emiliano De Cristofaro; |
299 | The Complexity of K-Means Clustering When Little Is Known Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here, we study the complexity of k-means clustering in settings where most of the data is not known or simply irrelevant. |
Robert Ganian; Thekla Hamm; Viktoriia Korchemna; Karolina Okrasa; Kirill Simonov; |
300 | IDYNO: Learning Nonparametric DAGs from Interventional Dynamic Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new algorithm, IDYNO, to learn the DAG structure from potentially nonlinear time series data by using a continuous optimization framework that includes a recent formulation of the continuous acyclicity constraint. |
Tian Gao; Debarun Bhattacharjya; Elliot Nelson; Miao Liu; Yue Yu; |
301 | Loss Function Learning for Domain Generalization By Implicit Gradient Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In particular, we introduce a novel meta-learning approach to loss function search based on implicit gradient. |
Boyan Gao; Henry Gouk; Yongxin Yang; Timothy Hospedales; |
302 | On The Convergence of Local Stochastic Compositional Gradient Descent with Momentum Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we develop a novel local stochastic compositional gradient descent with momentum method, which facilitates Federated Learning for the stochastic compositional problem. |
Hongchang Gao; Junyi Li; Heng Huang; |
303 | Deep Reference Priors: What Is The Best Way to Pretrain A Model? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper presents the first demonstration of reference priors for medium-scale deep networks and image-based data. |
Yansong Gao; Rahul Ramesh; Pratik Chaudhari; |
304 | On The Equivalence Between Temporal and Static Equivariant Graph Representations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work formalizes the associational task of predicting node attribute evolution in temporal graphs from the perspective of learning equivariant representations. |
Jianfei Gao; Bruno Ribeiro; |
305 | Generalizing Gaussian Smoothing for Random Search Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Based on an analysis of DFO for non-convex functions, we propose to choose a distribution for perturbations that minimizes the mean squared error (MSE) of the gradient estimate. |
Katelyn Gao; Ozan Sener; |
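To ground #305, here is the standard two-point Gaussian-smoothing gradient estimator used in random search; the paper's contribution is choosing a perturbation distribution that minimizes this estimator's MSE, which this baseline sketch does not do:

```python
import numpy as np

def smoothed_grad(f, x, sigma=0.1, n_samples=64, rng=None):
    # Antithetic (two-point) Gaussian-smoothing gradient estimator:
    # average finite differences along random Gaussian directions.
    rng = rng or np.random.default_rng(0)
    g = np.zeros_like(x, dtype=float)
    for _ in range(n_samples):
        u = rng.standard_normal(x.shape)
        g += (f(x + sigma * u) - f(x - sigma * u)) / (2.0 * sigma) * u
    return g / n_samples

# Usage: the estimate approaches the true gradient 2x for f(x) = ||x||^2.
x = np.ones(5)
print(smoothed_grad(lambda v: float(v @ v), x, n_samples=2000))
```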
306 | Rethinking Image-Scaling Attacks: The Interplay Between Vulnerabilities in Machine Learning Systems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we investigate the interplay between vulnerabilities of the image scaling procedure and machine learning models in the decision-based black-box setting. |
Yue Gao; Ilia Shumailov; Kassem Fawaz; |
307 | Lazy Estimation of Variable Importance for Large Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a fast and flexible method for approximating the reduced model with important inferential guarantees. |
Yue Gao; Abby Stevens; Garvesh Raskutti; Rebecca Willett; |
308 | Fast and Reliable Evaluation of Adversarial Robustness with Minimum-Margin Attack Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel method, the minimum-margin (MM) attack, for fast and reliable evaluation of adversarial robustness. |
Ruize Gao; Jiongxiao Wang; Kaiwen Zhou; Feng Liu; Binghui Xie; Gang Niu; Bo Han; James Cheng; |
309 | Value Function Based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we develop a sequentially convergent Value Function based Difference-of-Convex Algorithm with inexactness (VF-iDCA). |
Lucy L Gao; Jane Ye; Haian Yin; Shangzhi Zeng; Jin Zhang; |
310 | Learning to Incorporate Texture Saliency Adaptive Attention to Image Cartoonization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, a novel cartoon-texture-saliency-sampler (CTSS) module is proposed to adaptively sample cartoon-texture-salient patches from training data. |
Xiang Gao; Yuqi Zhang; Yingjie Tian; |
311 | Stochastic Smoothing of The Top-K Calibrated Hinge Loss for Deep Imbalanced Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper we introduce a stochastic top-K hinge loss inspired by recent developments on top-K calibrated losses. |
Camille Garcin; Maximilien Servajean; Alexis Joly; Joseph Salmon; |
312 | PAGE-PG: A Simple and Loopless Variance-Reduced Policy Gradient Method with Probabilistic Gradient Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: After a compact survey on some of the main variance-reduced REINFORCE-type methods, we propose ProbAbilistic Gradient Estimation for Policy Gradient (PAGE-PG), a novel loopless variance-reduced policy gradient method based on a probabilistic switch between two types of update. |
Matilde Gargiani; Andrea Zanelli; Andrea Martinelli; Tyler Summers; John Lygeros; |
313 | The Power of First-order Smooth Optimization for Black-box Non-smooth Problems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, besides the oracle complexity, we focus also on iteration complexity, and propose a generic approach that, based on optimal first-order methods, allows to obtain in a black-box fashion new zeroth-order algorithms for non-smooth convex optimization problems. |
Alexander Gasnikov; Anton Novitskii; Vasilii Novitskii; Farshed Abdukhakimov; Dmitry Kamzolov; Aleksandr Beznosikov; Martin Takac; Pavel Dvurechensky; Bin Gu; |
314 | A Functional Information Perspective on Model Interpretation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work suggests a theoretical framework for model interpretability by measuring the contribution of relevant features to the functional entropy of the network with respect to the input. |
Itai Gat; Nitay Calderon; Roi Reichart; Tamir Hazan; |
315 | UniRank: Unimodal Bandit Algorithms for Online Ranking Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a generic algorithm, UniRank, that tackles state-of-the-art click models. |
Camille-Sovanneary Gauthier; Romaric Gaudel; Elisa Fromont; |
316 | Variational Inference with Locally Enhanced Bounds for Hierarchical Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new family of variational bounds for hierarchical models, based on the application of tightening methods (e.g. importance weighting) separately for each group of local random variables. |
Tomas Geffner; Justin Domke; |
317 | Inducing Causal Structure for Interpretable Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In many areas, we have well-founded insights about causal structure that would be useful to bring into our trained models while still allowing them to learn in a data-driven fashion. To achieve this, we present the new method of interchange intervention training (IIT). |
Atticus Geiger; Zhengxuan Wu; Hanson Lu; Josh Rozner; Elisa Kreiss; Thomas Icard; Noah Goodman; Christopher Potts; |
318 | Achieving Minimax Rates in Pool-Based Batch Active Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we propose a solution which requires a careful trade-off between the informativeness of the queried points and their diversity. |
Claudio Gentile; Zhilei Wang; Tong Zhang; |
319 | Near-Exact Recovery for Tomographic Inverse Problems Via Deep Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work is concerned with the following fundamental question in scientific machine learning: Can deep-learning-based methods solve noise-free inverse problems to near-perfect accuracy? |
Martin Genzel; Ingo Gühring; Jan Macdonald; Maximilian März;
320 | Online Learning for Min Sum Set Cover and Pandora’s Box Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a computationally efficient algorithm that is constant-competitive against the cost of the optimal search order. |
Evangelia Gergatsouli; Christos Tzamos; |
321 | Equivariance Versus Augmentation for Spherical Images Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We analyze the role of rotational equivariance in convolutional neural networks (CNNs) applied to spherical images. |
Jan Gerken; Oscar Carlsson; Hampus Linander; Fredrik Ohlsson; Christoffer Petersson; Daniel Persson; |
322 | A Regret Minimization Approach to Multi-Agent Control Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the problem of multi-agent control of a dynamical system with known dynamics and adversarial disturbances. |
Udaya Ghai; Udari Madhushani; Naomi Leonard; Elad Hazan; |
323 | Blocks Assemble! Learning to Assemble with Large-Scale Structured Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a naturalistic physics-based environment with a set of connectable magnet blocks inspired by children’s toy kits. |
Seyed Kamyar Seyed Ghasemipour; Satoshi Kataoka; Byron David; Daniel Freeman; Shixiang Shane Gu; Igor Mordatch; |
324 | Faster Privacy Accounting Via Evolving Discretization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a new algorithm for numerical composition of privacy random variables, useful for computing the accurate differential privacy parameters for compositions of mechanisms. |
Badih Ghazi; Pritish Kamath; Ravi Kumar; Pasin Manurangsi; |
325 | Plug-In Inversion: Model-Agnostic Inversion for Vision with Data Augmentations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce Plug-In Inversion, which relies on a simple set of augmentations and does not require excessive hyper-parameter tuning. |
Amin Ghiasi; Hamid Kazemi; Steven Reich; Chen Zhu; Micah Goldblum; Tom Goldstein; |
326 | Offline RL Policies Should Be Trained to Be Adaptive Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose that offline RL methods should instead be adaptive in the presence of uncertainty. |
Dibya Ghosh; Anurag Ajay; Pulkit Agrawal; Sergey Levine; |
327 | Breaking The $\sqrt{T}$ Barrier: Instance-Independent Logarithmic Regret in Stochastic Contextual Linear Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we show that stochastic contexts indeed help to reduce the regret from $\sqrt{T}$ to $\mathrm{polylog}(T)$. |
Avishek Ghosh; Abishek Sankararaman; |
328 | SCHA-VAE: Hierarchical Context Aggregation for Few-Shot Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We extend current latent variable models for sets to a fully hierarchical approach with an attention-based point to set-level aggregation and call our method SCHA-VAE for Set-Context-Hierarchical-Aggregation Variational Autoencoder. |
Giorgio Giannone; Ole Winther; |
329 | A Joint Exponential Mechanism For Differentially Private Top-$k$ Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a differentially private algorithm for releasing the sequence of $k$ elements with the highest counts from a data domain of $d$ elements. |
Jennifer Gillenwater; Matthew Joseph; Andres Munoz; Monica Ribero Diaz; |
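For #329, a sketch of the standard "peeling" baseline (k sequential exponential-mechanism draws with a split budget) that the paper's joint mechanism improves on; it assumes count queries with sensitivity 1:

```python
import numpy as np

def peeling_top_k(counts, k, eps, rng=None):
    # Baseline: k sequential exponential-mechanism draws, each with
    # budget eps/k, removing the winner after every round. The paper's
    # joint mechanism samples the whole k-sequence at once instead.
    rng = rng or np.random.default_rng(0)
    counts = np.asarray(counts, dtype=float)
    alive = list(range(len(counts)))
    released = []
    for _ in range(k):
        scores = counts[alive]
        # Exponential mechanism with utility = count (sensitivity 1).
        weights = np.exp((eps / (2.0 * k)) * (scores - scores.max()))
        pick = rng.choice(len(alive), p=weights / weights.sum())
        released.append(alive.pop(pick))
    return released

print(peeling_top_k([50, 3, 47, 12, 45], k=3, eps=2.0))
```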
330 | Neuro-Symbolic Hierarchical Rule Induction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose Neuro-Symbolic Hierarchical Rule Induction, an efficient interpretable neuro-symbolic model, to solve Inductive Logic Programming (ILP) problems. |
Claire Glanois; Zhaohui Jiang; Xuening Feng; Paul Weng; Matthieu Zimmer; Dong Li; Wulong Liu; Jianye Hao; |
331 | It’s Raw! Audio Generation with State-Space Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose SaShiMi, a new multi-scale architecture for waveform modeling built around the recently introduced S4 model for long sequence modeling. |
Karan Goel; Albert Gu; Chris Donahue; Christopher Re; |
332 | RankSim: Ranking Similarity Regularization for Deep Imbalanced Regression Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper presents the RankSim (ranking similarity) regularizer for deep imbalanced regression, which encodes an inductive bias that samples that are closer in label space should also be closer in feature space. |
Yu Gong; Greg Mori; Fred Tung; |
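A smooth stand-in for the RankSim idea in #332, assuming PyTorch; the actual regularizer compares rankings via a differentiable ranking operator, whereas this sketch correlates raw similarities:

```python
import torch
import torch.nn.functional as F

def ranksim_surrogate(features, labels):
    # Encourage the per-row ordering of feature-space similarities to
    # agree with the ordering of label-space similarities. (RankSim
    # itself compares *rankings* through a differentiable ranking
    # operator; raw similarities are a smooth stand-in here.)
    f = F.normalize(features, dim=1)
    feat_sim = f @ f.t()                                    # (B, B)
    label_sim = -(labels[:, None] - labels[None, :]).abs().float()
    fs = feat_sim - feat_sim.mean(dim=1, keepdim=True)
    ls = label_sim - label_sim.mean(dim=1, keepdim=True)
    corr = (fs * ls).sum(1) / (fs.norm(dim=1) * ls.norm(dim=1) + 1e-8)
    return (1.0 - corr).mean()  # small when the orderings agree

features = torch.randn(16, 32, requires_grad=True)
labels = torch.randint(0, 100, (16,))
ranksim_surrogate(features, labels).backward()
```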
333 | How to Fill The Optimum Set? Population Gradient Descent with Harmless Diversity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Therefore, it is useful to consider the problem of finding a set of diverse points in the optimum set of an objective function. In this work, we frame this problem as a bi-level optimization problem of maximizing a diversity score inside the optimum set of the main loss function, and solve it with a simple population gradient descent framework that iteratively updates the points to maximize the diversity score in a fashion that does not hurt the optimization of the main loss. |
Chengyue Gong; Lemeng Wu; Qiang Liu; |
334 | Partial Label Learning Via Label Influence Function Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, inspired by influence function, we develop a novel PLL framework called Partial Label Learning via Label Influence Function (PLL-IF). |
Xiuwen Gong; Dong Yuan; Wei Bao; |
335 | Secure Distributed Training at Scale Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a novel protocol for secure (Byzantine-tolerant) decentralized training that emphasizes communication efficiency. |
Eduard Gorbunov; Alexander Borzunov; Michael Diskin; Max Ryabinin; |
336 | Retrieval-Augmented Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we explore an alternative paradigm in which we train a network to map a dataset of past experiences to optimal behavior. |
Anirudh Goyal; Abram Friesen; Andrea Banino; Theophane Weber; Nan Rosemary Ke; Adrià Puigdomènech Badia; Arthur Guez; Mehdi Mirza; Peter C Humphreys; Ksenia Konyushova; Michal Valko; Simon Osindero; Timothy Lillicrap; Nicolas Heess; Charles Blundell;
337 | The State of Sparse Training in Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work we perform a systematic investigation into applying a number of existing sparse training techniques on a variety of DRL agents and environments. |
Laura Graesser; Utku Evci; Erich Elsen; Pablo Samuel Castro; |
338 | Causal Inference Through The Structural Causal Marginal Problem Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce an approach to counterfactual inference based on merging information from multiple datasets. |
Luigi Gresele; Julius von Kügelgen; Jonas Kübler; Elke Kirschbaum; Bernhard Schölkopf; Dominik Janzing;
339 | Mirror Learning: A Unifying Framework of Policy Optimisation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In contrast, in this paper, we introduce a novel theoretical framework, named Mirror Learning, which provides theoretical guarantees to a large class of algorithms, including TRPO and PPO. |
Jakub Grudzien; Christian A Schroeder De Witt; Jakob Foerster; |
340 | Adapting K-means Algorithms for Outliers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we build on their ideas and show how to adapt several sequential and distributed k-means algorithms to the setting with outliers, but with substantially stronger theoretical guarantees: our algorithms output $(1+\epsilon)z$ outliers while achieving an $O(1/\epsilon)$-approximation to the objective function. |
Christoph Grunau; Václav Rozhoň;
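The core idea behind #340 can be sketched with a small "k-means--"-style adaptation of Lloyd's algorithm; this toy version discards exactly z points and lacks the paper's $(1+\epsilon)z$ / $O(1/\epsilon)$ guarantees:

```python
import numpy as np

def kmeans_with_outliers(X, k, z, iters=25, seed=0):
    # In every Lloyd iteration, the z points farthest from their
    # nearest center are treated as outliers and excluded from the
    # center update. Sketch of the basic idea only.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(1)
        nearest = dists.min(1)
        inliers = np.argsort(nearest)[: len(X) - z]   # drop z farthest
        for c in range(k):
            members = X[inliers][assign[inliers] == c]
            if len(members):
                centers[c] = members.mean(0)
    outliers = np.argsort(nearest)[len(X) - z:]
    return centers, outliers
```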
341 | Variational Mixtures of ODEs for Inferring Cellular Gene Expression Dynamics Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Additionally, a single progenitor cell type often bifurcates into multiple child cell types, further complicating the problem of modeling the dynamics. To address this problem, we develop an approach called variational mixtures of ordinary differential equations. |
Yichen Gu; David T Blaauw; Joshua Welch; |
342 | Learning Pseudometric-based Action Representations for Offline Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes an action representation learning framework for offline RL based on a pseudometric, which measures both the behavioral relation and the data-distributional relation between actions. |
Pengjie Gu; Mengchen Zhao; Chen Chen; Dong Li; Jianye Hao; Bo An; |
343 | NeuroFluid: Fluid Dynamics Grounding with Particle-Driven Neural Radiance Fields Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we consider a partially observable scenario known as fluid dynamics grounding, that is, inferring the state transitions and interactions within the fluid particle systems from sequential visual observations of the fluid surface. |
Shanyan Guan; Huayu Deng; Yunbo Wang; Xiaokang Yang; |
344 | Fast-Rate PAC-Bayesian Generalization Bounds for Meta-Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a general PAC-Bayesian framework to cope with single-task learning and meta-learning uniformly. |
Jiechao Guan; Zhiwu Lu; |
345 | Leveraging Approximate Symbolic Models for Reinforcement Learning Via Skill Diversity Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Symbolic models of real-world tasks are, however, often incomplete. To address this, we introduce Approximate Symbolic-Model Guided Reinforcement Learning, wherein we formalize the relationship between the symbolic model and the underlying MDP, allowing us to characterize the incompleteness of the symbolic model. |
Lin Guan; Sarath Sreedharan; Subbarao Kambhampati; |
346 | Large-Scale Graph Neural Architecture Search Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, existing approaches fail to handle large-scale graphs because current performance estimation strategies in GNAS are computationally expensive for large-scale graphs and suffer from consistency collapse issues. To tackle these problems, we propose the Graph ArchitectUre Search at Scale (GAUSS) method that can handle large-scale graphs by designing an efficient light-weight supernet and the joint architecture-graph sampling. |
Chaoyu Guan; Xin Wang; Hong Chen; Ziwei Zhang; Wenwu Zhu; |
347 | Identifiability Conditions for Domain Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Unfortunately, it is unclear under what conditions this identifiability assumption holds, even when restricting ourselves to the case where a correct bijective map between domains exists. We study this bijective domain mapping problem and provide several new sufficient conditions for the identifiability of linear domain maps. |
Ishaan Gulrajani; Tatsunori Hashimoto; |
348 | A Parametric Class of Approximate Gradient Updates for Policy Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To better capture the commonalities and identify key differences between policy optimization methods, we develop a unified perspective that re-expresses the underlying updates in terms of a limited choice of gradient form and scaling function. |
Ramki Gummadi; Saurabh Kumar; Junfeng Wen; Dale Schuurmans; |
349 | Provably Efficient Offline Reinforcement Learning for Partially Observable Markov Decision Processes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In the offline setting, estimating these operators directly is challenging due to (i) the large observation space and (ii) insufficient coverage of the offline dataset. To tackle these challenges, we propose a novel algorithm that constructs confidence regions for these Bellman operators via offline estimation of their RKHS embeddings, and returns the final policy via pessimistic planning within the confidence regions. |
Hongyi Guo; Qi Cai; Yufeng Zhang; Zhuoran Yang; Zhaoran Wang; |
350 | No-Regret Learning in Partially-Informed Auctions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Auctions with partially-revealed information about items are broadly employed in real-world applications, but the underlying mechanisms have limited theoretical support. In this work, we study a machine learning formulation of these types of mechanisms, presenting algorithms that are no-regret from the buyer’s perspective. |
Wenshuo Guo; Michael Jordan; Ellen Vitercik; |
351 | Bounding Training Data Reconstruction in Private (Deep) Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we derive the first semantic guarantees for DP mechanisms against training data reconstruction attacks under a formal threat model. |
Chuan Guo; Brian Karrer; Kamalika Chaudhuri; Laurens van der Maaten; |
352 | Adversarially Trained Neural Representations Are Already As Robust As Biological Neural Representations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we develop a method for performing adversarial visual attacks directly on primate brain activity. |
Chong Guo; Michael Lee; Guillaume Leclerc; Joel Dapello; Yug Rao; Aleksander Madry; James Dicarlo; |
353 | Class-Imbalanced Semi-Supervised Learning with Adaptive Thresholding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we develop a simple yet effective framework, which only involves adaptive thresholding for different classes in SSL algorithms, and achieves remarkable performance improvement on more than twenty imbalance ratios. |
Lan-Zhe Guo; Yu-Feng Li; |
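For #353, a hypothetical per-class thresholding rule for pseudo-label selection; the square-root scaling below is an assumption for illustration, not the paper's exact rule:

```python
import numpy as np

def select_pseudo_labels(probs, class_freq, base_tau=0.95):
    # Per-class thresholds: rare classes get a lower confidence
    # threshold so they still receive pseudo-labels under imbalance.
    tau = base_tau * np.sqrt(class_freq / class_freq.max())
    preds = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    keep = conf >= tau[preds]
    return preds[keep], np.flatnonzero(keep)

# Usage on toy predicted class probabilities for 5 imbalanced classes.
probs = np.random.default_rng(0).dirichlet(np.ones(5), size=100)
class_freq = np.array([0.5, 0.3, 0.1, 0.07, 0.03])
pseudo, idx = select_pseudo_labels(probs, class_freq)
```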
354 | Deep Squared Euclidean Approximation to The Levenshtein Distance for DNA Storage Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a novel deep squared Euclidean embedding for DNA sequences using Siamese neural network, squared Euclidean embedding, and chi-squared regression. |
Alan J.X. Guo; Cong Liang; Qing-Hu Hou; |
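A sketch of the Siamese regression setup described in #354, with a toy encoder; the chi-squared-style normalization is an assumption for illustration, not necessarily the paper's exact loss:

```python
import torch

def siamese_edit_loss(encoder, seq_a, seq_b, lev_dist):
    # Train an encoder so the *squared Euclidean* distance between the
    # two embeddings regresses the Levenshtein distance between the
    # underlying sequences.
    za, zb = encoder(seq_a), encoder(seq_b)
    d2 = ((za - zb) ** 2).sum(dim=1)
    # Chi-squared-style relative error rather than plain MSE (assumed).
    return (((d2 - lev_dist) ** 2) / (lev_dist + 1.0)).mean()

# Usage with a toy encoder over one-hot DNA sequences (hypothetical).
encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(4 * 20, 16))
seq_a = torch.randn(8, 20, 4)   # batch of length-20, 4-letter sequences
seq_b = torch.randn(8, 20, 4)
lev = torch.randint(0, 10, (8,)).float()
siamese_edit_loss(encoder, seq_a, seq_b, lev).backward()
```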
355 | Online Continual Learning Through Mutual Information Maximization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes a new online continual learning approach called OCMM based on mutual information (MI) maximization. |
Yiduo Guo; Bing Liu; Dongyan Zhao; |
356 | Fast Provably Robust Decision Trees and Boosting Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work proposes the Fast Provably Robust Decision Tree (FPRDT) with the smallest computational complexity $O(n \log n)$, a trade-off between global and local optimizations over the adversarial 0/1 loss. |
Jun-Qi Guo; Ming-Zhuo Teng; Wei Gao; Zhi-Hua Zhou; |
357 | Understanding and Improving Knowledge Graph Embedding for Entity Alignment Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To fill the research gap, we define a typical paradigm abstracted from existing embedding-based entity alignment (EEA) methods and analyze how the embedding discrepancy between two potentially aligned entities is implicitly bounded by a predefined margin in the score function. |
Lingbing Guo; Qiang Zhang; Zequn Sun; Mingyang Chen; Wei Hu; Huajun Chen; |
358 | NISPA: Neuro-Inspired Stability-Plasticity Adaptation for Continual Learning in Sparse Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The main desiderata associated with CL are to maintain performance on older tasks, leverage the latter to improve learning of future tasks, and to introduce minimal overhead in the training process (for instance, to not require a growing model or retraining). We propose the Neuro-Inspired Stability-Plasticity Adaptation (NISPA) architecture that addresses these desiderata through a sparse neural network with fixed density. |
Mustafa B Gurbuz; Constantine Dovrolis; |
359 | Active Learning on A Budget: Opposite Strategies Suit High and Low Budgets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Accordingly, we propose TypiClust – a deep active learning strategy suited for low budgets. |
Guy Hacohen; Avihu Dekel; Daphna Weinshall; |
360 | You Only Cut Once: Boosting Data Augmentation with A Single Cut Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present You Only Cut Once (YOCO) for performing data augmentations. |
Junlin Han; Pengfei Fang; Weihao Li; Jie Hong; Mohammad Ali Armin; Ian Reid; Lars Petersson; Hongdong Li; |
361 | Scalable MCMC Sampling for Nonsymmetric Determinantal Point Processes Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we develop a scalable MCMC sampling algorithm for $k$-NDPPs with low-rank kernels, thus enabling runtime that is sublinear in $n$. |
Insu Han; Mike Gartrell; Elvis Dohmatob; Amin Karbasi; |
362 | G-Mixup: Graph Data Augmentation for Graph Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, it is challenging to directly adopt Mixup to augment graph data because different graphs typically: 1) have different numbers of nodes; 2) are not readily aligned; and 3) have unique topologies in non-Euclidean space. To this end, we propose G-Mixup to augment graphs for graph classification by interpolating the generator (i.e., graphon) of different classes of graphs. |
Xiaotian Han; Zhimeng Jiang; Ninghao Liu; Xia Hu; |
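The graphon-interpolation step of G-Mixup (#362) can be sketched as follows, assuming per-class graphon estimates are already available (e.g., from aligning and averaging adjacency matrices, which is omitted here):

```python
import numpy as np

def g_mixup_sample(graphon_a, graphon_b, lam, n_nodes, rng=None):
    # Interpolate two class-level graphon estimates and sample a new
    # training graph: edge (i, j) is Bernoulli(W[u_i, u_j]) under the
    # mixed graphon W.
    rng = rng or np.random.default_rng(0)
    W = lam * graphon_a + (1.0 - lam) * graphon_b     # same grid shape
    u = (rng.uniform(size=n_nodes) * W.shape[0]).astype(int)
    probs = W[np.ix_(u, u)]
    upper = np.triu(rng.uniform(size=probs.shape) < probs, k=1)
    return (upper | upper.T).astype(int)              # undirected graph

ga = np.full((50, 50), 0.8)   # dense class graphon (toy)
gb = np.full((50, 50), 0.1)   # sparse class graphon (toy)
adj = g_mixup_sample(ga, gb, lam=0.5, n_nodes=30)
```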
363 | Private Streaming SCO in $\ell_p$ Geometry with Applications in High Dimensional Online Decision Making Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a private variant of the Frank-Wolfe algorithm with recursive gradients for variance reduction to update and reveal the parameters upon each data. |
Yuxuan Han; Zhicong Liang; Zhipeng Liang; Yang Wang; Yuan Yao; Jiheng Zhang; |
364 | Off-Policy Reinforcement Learning with Delayed Rewards Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study deep reinforcement learning (RL) algorithms with delayed rewards. |
Beining Han; Zhizhou Ren; Zuofan Wu; Yuan Zhou; Jian Peng; |
365 | Adversarial Attacks on Gaussian Process Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Our goal is to understand adversarial attacks on GP bandits from theoretical and practical perspectives. |
Eric Han; Jonathan Scarlett; |
366 | Random Gegenbauer Features for Scalable Kernel Methods Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose efficient random features for approximating a new and rich class of kernel functions that we refer to as Generalized Zonal Kernels (GZK). |
Insu Han; Amir Zandieh; Haim Avron; |
367 | Stochastic Reweighted Gradient Descent Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose stochastic reweighted gradient descent (SRG), a stochastic gradient method based solely on importance sampling that can reduce the variance of the gradient estimator and improve on the asymptotic error of stochastic gradient descent (SGD) in the strongly convex and smooth case. |
Ayoub El Hanchi; David Stephens; Chris Maddison; |
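For #367, a sketch of the generic importance-sampling step behind SRG; how the sampling probabilities are adapted over time (SRG's actual contribution) is omitted:

```python
import numpy as np

def importance_sampled_sgd_step(w, grad_i, probs, lr, rng=None):
    # Draw component i with probability p_i, then reweight its gradient
    # by 1/(n * p_i) so the gradient estimator stays unbiased while
    # favoring informative components.
    rng = rng or np.random.default_rng(0)
    n = len(probs)
    i = rng.choice(n, p=probs)
    g = grad_i(w, i) / (n * probs[i])
    return w - lr * g

# Usage on a toy least-squares objective f(w) = mean_i (x_i . w - y_i)^2.
X, y = np.random.default_rng(1).normal(size=(100, 5)), np.zeros(100)
grad_i = lambda w, i: 2.0 * (X[i] @ w - y[i]) * X[i]
probs = np.full(100, 0.01)   # uniform sampling as a starting point
w = importance_sampled_sgd_step(np.ones(5), grad_i, probs, lr=0.1)
```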
368 | Dual Perspective of Label-Specific Feature Learning for Multi-Label Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a dual perspective for label-specific feature learning, where label-specific discriminative properties are considered by identifying each label’s own non-informative features and making the discrimination process immutable to variations of these features. |
Jun-Yi Hang; Min-Ling Zhang; |
369 | Temporal Difference Learning for Model Predictive Control Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we combine the strengths of model-free and model-based methods. |
Nicklas A Hansen; Hao Su; Xiaolong Wang; |
370 | Bisimulation Makes Analogies in Goal-Conditioned Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new form of state abstraction called goal-conditioned bisimulation that captures functional equivariance, allowing for the reuse of skills to achieve new goals. |
Philippe Hansen-Estruch; Amy Zhang; Ashvin Nair; Patrick Yin; Sergey Levine; |
371 | TURF: Two-Factor, Universal, Robust, Fast Distribution Learning Algorithm Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We derive a near-linear-time and essentially sample-optimal estimator that establishes $c_{t,d}=2$ for all $(t,d)\neq(1,0)$. |
Yi Hao; Ayush Jain; Alon Orlitsky; Vaishakh Ravindrakumar; |
372 | Contextual Information-Directed Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We investigate the IDS design through two contextual bandit problems: contextual bandits with graph feedback and sparse linear contextual bandits. |
Botao Hao; Tor Lattimore; Chao Qin; |
373 | GSmooth: Certified Robustness Against Semantic Transformations Via Generalized Randomized Smoothing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, existing methods are insufficient or unable to provably defend against semantic transformations, especially those without closed-form expressions (such as defocus blur and pixelate), which are more common in practice and often unrestricted. To fill up this gap, we propose generalized randomized smoothing (GSmooth), a unified theoretical framework for certifying robustness against general semantic transformations via a novel dimension augmentation strategy. |
Zhongkai Hao; Chengyang Ying; Yinpeng Dong; Hang Su; Jian Song; Jun Zhu; |
374 | Implicit Regularization with Polynomial Growth in Deep Tensor Factorization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the implicit regularization effects of deep learning in tensor factorization. |
Kais Hariz; Hachem Kadri; Stephane Ayache; Maher Moakher; Thierry Artieres; |
375 | Strategic Instrumental Variable Regression: Recovering Causal Relationships From Strategic Responses Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, our work establishes a novel connection between strategic responses to ML models and instrumental variable (IV) regression by observing that the sequence of deployed models can be viewed as an instrument that affects agents’ observable features but does not directly influence their outcomes. |
Keegan Harris; Dung Daniel T Ngo; Logan Stapleton; Hoda Heidari; Steven Wu; |
376 | C*-algebra Net: A New Approach Generalizing Neural Network Parameters to C*-algebra Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new framework that generalizes the parameters of neural network models to $C^*$-algebra-valued ones. |
Yuka Hashimoto; Zhao Wang; Tomoko Matsui; |
377 | General-purpose, Long-context Autoregressive Modeling with Perceiver AR Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We develop Perceiver AR, an autoregressive, modality-agnostic architecture which uses cross-attention to map long-range inputs to a small number of latents while also maintaining end-to-end causal masking. |
Curtis Hawthorne; Andrew Jaegle; Catalina Cangea; Sebastian Borgeaud; Charlie Nash; Mateusz Malinowski; Sander Dieleman; Oriol Vinyals; Matthew Botvinick; Ian Simon; Hannah Sheahan; Neil Zeghidour; Jean-Baptiste Alayrac; Joao Carreira; Jesse Engel; |
378 | On Distribution Shift in Learning-based Bug Detectors Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we argue that this massive performance difference is caused by a distribution shift, i.e., a fundamental mismatch between the real bug distribution and the synthetic bug distribution used to train and evaluate the detectors. |
Jingxuan He; Luca Beurer-Kellner; Martin Vechev; |
379 | GNNRank: Learning Global Rankings from Pairwise Comparisons Via Directed Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce neural networks into the ranking recovery problem by proposing the so-called GNNRank, a trainable GNN-based framework with digraph embedding. |
Yixuan He; Quan Gan; David Wipf; Gesine D Reinert; Junchi Yan; Mihai Cucuringu; |
380 | Exploring The Gap Between Collapsed & Whitened Features in Self-Supervised Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We identify power law behaviour in eigenvalue decay, parameterised by exponent $\beta \geq 0$, as a spectrum that bridges between the collapsed & whitened feature extremes. |
Bobby He; Mete Ozay; |
381 | Sparse Double Descent: Where Network Pruning Aggravates Overfitting Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we have three main contributions. First, we report the novel sparse double descent phenomenon through extensive experiments. |
Zheng He; Zeke Xie; Quanzhi Zhu; Zengchang Qin; |
382 | A Reduction from Linear Contextual Bandit Lower Bounds to Estimation Lower Bounds Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we complete the reverse direction by establishing the necessity. |
Jiahao He; Jiheng Zhang; Rachel Zhang; |
383 | HyperPrompt: Prompt-based Task-Conditioning of Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here, we explore the use of HyperNetworks to generate hyper-prompts: we propose HyperPrompt, a novel architecture for prompt-based task-conditioning of self-attention in Transformers. |
Yun He; Steven Zheng; Yi Tay; Jai Gupta; Yu Du; Vamsi Aribandi; Zhe Zhao; Yaguang Li; Zhao Chen; Donald Metzler; Heng-Tze Cheng; Ed H. Chi; |
384 | Label-Descriptive Patterns and Their Application to Characterizing Classification Errors Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose to discover those feature-value combinations (i.e., patterns) that strongly correlate with correct resp. erroneous predictions. |
Michael A. Hedderich; Jonas Fischer; Dietrich Klakow; Jilles Vreeken; |
385 | NOMU: Neural Optimization-based Model Uncertainty Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, we find that established benchmarks often fail to reliably capture some of these desiderata, even those that are required by Bayesian theory. To address this, we introduce a new approach for capturing model uncertainty for NNs, which we call Neural Optimization-based Model Uncertainty (NOMU). |
Jakob M Heiss; Jakob Weissteiner; Hanna S Wutte; Sven Seuken; Josef Teichmann; |
386 | Scaling Out-of-Distribution Detection for Real-World Settings Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To set the stage for more realistic out-of-distribution detection, we depart from small-scale settings and explore large-scale multiclass and multi-label settings with high-resolution images and thousands of classes. |
Dan Hendrycks; Steven Basart; Mantas Mazeika; Andy Zou; Joseph Kwon; Mohammadreza Mostajabi; Jacob Steinhardt; Dawn Song; |
387 | Generalization Bounds Using Lower Tail Exponents in Stochastic Optimizers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: While recent work has revealed connections between generalization and heavy-tailed behavior in stochastic optimization, they mainly relied on continuous-time approximations; and a rigorous treatment for the original discrete-time iterations is yet to be performed. To bridge this gap, we present novel bounds linking generalization to the lower tail exponent of the transition kernel associated with the optimizer around a local minimum, in both discrete- and continuous-time settings. |
Liam Hodgkinson; Umut Simsekli; Rajiv Khanna; Michael Mahoney; |
388 | Unsupervised Detection of Contextualized Embedding Bias with Application to Ideology Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a fully unsupervised method to detect bias in contextualized embeddings. |
Valentin Hofmann; Janet Pierrehumbert; Hinrich Schütze; |
389 | Neural Laplace: Learning Diverse Classes of Differential Equations in The Laplace Domain Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose Neural Laplace, a unifying framework for learning diverse classes of DEs including all the aforementioned ones. |
Samuel I Holt; Zhaozhi Qian; Mihaela van der Schaar; |
390 | Deep Hierarchy in Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Since the hierarchy can have multiple layers, we call it deep. We propose a hierarchical Thompson sampling algorithm (HierTS) for this problem and show how to implement it efficiently for Gaussian hierarchies. |
Joey Hong; Branislav Kveton; Sumeet Katariya; Manzil Zaheer; Mohammad Ghavamzadeh; |
391 | DAdaQuant: Doubly-adaptive Quantization for Communication-efficient Federated Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce DAdaQuant as a doubly-adaptive quantization algorithm that dynamically changes the quantization level across time and different clients. |
Robert Hönig; Yiren Zhao; Robert Mullins; |
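For context on the quantization primitive involved, a minimal QSGD-style stochastic fixed-point quantizer is sketched below; it is not DAdaQuant itself, whose contribution is adapting the number of levels across rounds and clients, and the fixed `levels` parameter is an illustrative assumption.

```python
import numpy as np

def stochastic_quantize(x, levels=16, rng=np.random.default_rng(0)):
    """Unbiased stochastic uniform quantization onto `levels` steps per
    max-norm unit (QSGD-style sketch; the adaptive level schedule that
    DAdaQuant contributes is deliberately omitted)."""
    norm = np.abs(x).max()
    if norm == 0:
        return x.copy()
    y = np.abs(x) / norm * levels
    low = np.floor(y)
    q = low + (rng.random(x.shape) < (y - low))   # round up w.p. frac(y)
    return np.sign(x) * q * norm / levels         # E[output] = x
```

Because rounding up happens with probability equal to the fractional part, the quantizer is unbiased, which is what makes such schemes usable inside SGD-style federated updates.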
392 | Equivariant Diffusion for Molecule Generation in 3D Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work introduces a diffusion model for molecule generation in 3D that is equivariant to Euclidean transformations. |
Emiel Hoogeboom; Víctor Garcia Satorras; Clément Vignac; Max Welling; |
393 | Conditional GANs with Auxiliary Discriminative Classifier Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The fundamental reason pointed out in this paper is that the classifier of AC-GAN is generator-agnostic, which therefore cannot provide informative guidance for the generator to approach the joint distribution, resulting in a minimization of the conditional entropy that decreases the intra-class diversity. |
Liang Hou; Qi Cao; Huawei Shen; Siyuan Pan; Xiaoshuang Li; Xueqi Cheng; |
394 | AdAUC: End-to-end Adversarial AUC Optimization Against Long-tail Problems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Under such a scenario, AUC is a much more reasonable metric than accuracy since it is insensitive toward class distribution. Motivated by this, we present an early trial to explore adversarial training methods to optimize AUC. |
Wenzheng Hou; Qianqian Xu; Zhiyong Yang; Shilong Bao; Yuan He; Qingming Huang; |
395 | Wide Bayesian Neural Networks Have A Simple Weight Posterior: Theory and Accelerated Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce repriorisation, a data-dependent reparameterisation which transforms a Bayesian neural network (BNN) posterior to a distribution whose KL divergence to the BNN prior vanishes as layer widths grow. |
Jiri Hron; Roman Novak; Jeffrey Pennington; Jascha Sohl-Dickstein; |
396 | Learning Inverse Folding from Millions of Predicted Structures Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the problem of predicting a protein sequence from its backbone atom coordinates. |
Chloe Hsu; Robert Verkuil; Jason Liu; Zeming Lin; Brian Hie; Tom Sercu; Adam Lerer; Alexander Rives; |
397 | Nearly Minimax Optimal Reinforcement Learning with Linear Function Approximation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Specifically, we consider the episodic inhomogeneous linear Markov Decision Process (MDP), and propose a novel computation-efficient algorithm, LSVI-UCB$^+$, which achieves an $\widetilde{O}(Hd\sqrt{T})$ regret bound where $H$ is the episode length, $d$ is the feature dimension, and $T$ is the number of steps. |
Pihe Hu; Yu Chen; Longbo Huang; |
398 | Neuron Dependency Graphs: A Causal Abstraction of Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We discover that neural networks exhibit approximate logical dependencies among neurons, and we introduce Neuron Dependency Graphs (NDG) that extract and present them as directed graphs. |
Yaojie Hu; Jin Tian; |
399 | Policy Diagnosis Via Measuring Role Diversity in Cooperative Multi-agent RL Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this study, we quantify the agent’s behavior difference and build its relationship with the policy performance via Role Diversity, a metric to measure the characteristics of MARL tasks. |
Siyi Hu; Chuanlong Xie; Xiaodan Liang; Xiaojun Chang; |
400 | On The Role of Discount Factor in Offline Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper examines two distinct effects of $\gamma$ in offline RL with theoretical analysis, namely the regularization effect and the pessimism effect. |
Hao Hu; Yiqin Yang; Qianchuan Zhao; Chongjie Zhang; |
401 | Transformer Quality in Linear Time Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We revisit the design choices in Transformers, and propose methods to address their weaknesses in handling long sequences. |
Weizhe Hua; Zihang Dai; Hanxiao Liu; Quoc Le; |
402 | Language Models As Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we investigate the possibility of grounding high-level tasks, expressed in natural language (e.g. “make breakfast”), to a chosen set of actionable steps (e.g. “open fridge”). |
Wenlong Huang; Pieter Abbeel; Deepak Pathak; Igor Mordatch; |
403 | Forward Operator Estimation in Generative Models with Kernel Transfer Operators Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a substantially cheaper (and simpler) forward operator estimation strategy based on adapting known results on kernel transfer operators. |
Zhichun Huang; Rudrasis Chakraborty; Vikas Singh; |
404 | Adaptive Best-of-Both-Worlds Algorithm for Heavy-Tailed Multi-Armed Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we generalize the concept of heavy-tailed multi-armed bandits to adversarial environments, and develop robust best-of-both-worlds algorithms for heavy-tailed multi-armed bandits (MAB), where losses have $\alpha$-th ($1<\alpha\le 2$) moments bounded by $\sigma^\alpha$, while the variances may not exist. |
Jiatai Huang; Yan Dai; Longbo Huang; |
405 | Frustratingly Easy Transferability Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Existing estimation algorithms either require intensive training on target tasks or have difficulties in evaluating the transferability between layers. To this end, we propose a simple, efficient, and effective transferability measure named TransRate. |
Long-Kai Huang; Junzhou Huang; Yu Rong; Qiang Yang; Ying Wei; |
406 | Modality Competition: What Makes Joint Training of Multi-modal Network Fail in Deep Learning? (Provably) Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Recently, it has been observed that the best uni-modal network outperforms the jointly trained multi-modal network across different combinations of modalities on various tasks, which is counter-intuitive since multiple signals would bring more information (Wang et al., 2020). This work provides a theoretical explanation for the emergence of such a performance gap in neural networks for the prevalent joint training framework. |
Yu Huang; Junyang Lin; Chang Zhou; Hongxia Yang; Longbo Huang; |
407 | Action-Sufficient State Representation Learning for Control with Structural Constraints Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we focus on partially observable environments and propose to learn a minimal set of state representations that capture sufficient information for decision-making, termed Action-Sufficient state Representations (ASRs). |
Biwei Huang; Chaochao Lu; Liu Leqi; Jose Miguel Hernandez-Lobato; Clark Glymour; Bernhard Schölkopf; Kun Zhang; |
408 | 3DLinker: An E(3) Equivariant Variational Autoencoder for Molecular Linker Design Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we focus on a new type of drug design problem — generating a small “linker” to physically attach two independent molecules with their distinct functions. |
Yinan Huang; Xingang Peng; Jianzhu Ma; Muhan Zhang; |
409 | SDQ: Stochastic Differentiable Quantization with Mixed Precision Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we present a novel Stochastic Differentiable Quantization (SDQ) method that can automatically learn the MPQ strategy in a more flexible and globally-optimized space with a smoother gradient approximation. |
Xijie Huang; Zhiqiang Shen; Shichao Li; Zechun Liu; Hu Xianghong; Jeffry Wicaksana; Eric Xing; Kwang-Ting Cheng; |
410 | Tackling Data Heterogeneity: A New Unified Framework for Decentralized SGD with Sample-induced Topology Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop a general framework unifying several gradient-based stochastic optimization methods for empirical risk minimization problems both in centralized and distributed scenarios. |
Yan Huang; Ying Sun; Zehan Zhu; Changzhi Yan; Jinming Xu; |
411 | Efficient Representation Learning Via Adaptive Context Pooling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In doing so, they assume a fixed attention granularity defined by the individual tokens (e.g., text characters or image pixels), which may not be optimal for modeling complex dependencies at higher levels. In this paper, we propose ContextPool to address this problem by adapting the attention granularity for each token. |
Chen Huang; Walter Talbott; Navdeep Jaitly; Joshua M Susskind; |
412 | On The Learning of Non-Autoregressive Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present theoretical and empirical analyses to reveal the challenges of NAT learning and propose a unified perspective to understand existing successes. |
Fei Huang; Tianhua Tao; Hao Zhou; Lei Li; Minlie Huang; |
413 | Going Deeper Into Permutation-Sensitive Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we devise an efficient permutation-sensitive aggregation mechanism via permutation groups, capturing pairwise correlations between neighboring nodes. |
Zhongyu Huang; Yingheng Wang; Chaozhuo Li; Huiguang He; |
414 | Directed Acyclic Transformer for Non-Autoregressive Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose Directed Acyclic Transformer (DA-Transformer), which represents the hidden states in a Directed Acyclic Graph (DAG), where each path of the DAG corresponds to a specific translation. |
Fei Huang; Hao Zhou; Yang Liu; Hang Li; Minlie Huang; |
415 | Unsupervised Ground Metric Learning Using Wasserstein Singular Vectors Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose for the first time a canonical answer by simultaneously computing an OT distance between samples and between features of a dataset. |
Geert-Jan Huizing; Laura Cantini; Gabriel Peyré; |
416 | Robust Kernel Density Estimation with Median-of-Means Principle Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce a robust non-parametric density estimator combining the popular Kernel Density Estimation method and the Median-of-Means principle (MoM-KDE). |
Pierre Humbert; Batiste Le Bars; Ludovic Minvielle; |
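The Median-of-Means principle behind MoM-KDE reduces to a short recipe: shuffle the sample into disjoint blocks, fit an ordinary KDE on each block, and report the pointwise median across blocks. A minimal 1-D sketch under assumed bandwidth, block count, and Gaussian kernel (not the paper's tuned choices):

```python
import numpy as np

def mom_kde(x_query, data, bandwidth=0.3, n_blocks=10,
            rng=np.random.default_rng(0)):
    """Median-of-Means KDE in 1-D: a Gaussian KDE per disjoint block of
    the shuffled sample, then the pointwise median across blocks, which
    damps the influence of outlying observations."""
    blocks = np.array_split(rng.permutation(data), n_blocks)
    estimates = []
    for block in blocks:
        z = (x_query[:, None] - block[None, :]) / bandwidth
        estimates.append(np.exp(-0.5 * z**2).sum(axis=1)
                         / (block.size * bandwidth * np.sqrt(2 * np.pi)))
    return np.median(np.stack(estimates), axis=0)

# e.g. f_hat = mom_kde(np.linspace(-3, 3, 200), samples)
```

Taking the median rather than the mean across blocks is what buys robustness: a block contaminated by outliers distorts only its own estimate, not the reported density.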
417 | A Data-driven Approach for Learning to Control Computers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here we investigate the setting of computer control using keyboard and mouse, with goals specified via natural language. |
Peter C Humphreys; David Raposo; Tobias Pohlen; Gregory Thornton; Rachita Chhaparia; Alistair Muldal; Josh Abramson; Petko Georgiev; Adam Santoro; Timothy Lillicrap; |
418 | Proximal Denoiser for Convergent Plug-and-Play Optimization with Nonconvex Regularization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Using such a denoiser guarantees the convergence of the PnP version of the Half-Quadratic-Splitting (PnP-HQS) iterative algorithm. In this paper, we show that this gradient denoiser can actually correspond to the proximal operator of another scalar function. |
Samuel Hurault; Arthur Leclaire; Nicolas Papadakis; |
419 | Inverse Contextual Bandits: Learning How Behavior Evolves Over Time Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To give an answer, we desire a policy learning method that provides interpretable representations of decision-making, in particular capturing an agent’s non-stationary knowledge of the world, as well as operating in an offline manner. |
Alihan Hüyük; Daniel Jarrett; Mihaela van der Schaar; |
420 | Datamodels: Understanding Predictions with Data and Data with Predictions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a conceptual framework, datamodeling, for analyzing the behavior of a model class in terms of the training data. |
Andrew Ilyas; Sung Min Park; Logan Engstrom; Guillaume Leclerc; Aleksander Madry; |
421 | Parsimonious Learning-Augmented Caching Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we introduce and study the setting in which the learning-augmented algorithm can utilize the predictions parsimoniously. |
Sungjin Im; Ravi Kumar; Aditya Petety; Manish Purohit; |
422 | Bayesian Optimization for Distributionally Robust Chance-constrained Problem Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this study, we consider the distributionally robust CC (DRCC) problem and propose a novel DRCC Bayesian optimization method for the case where the distribution of the environmental variables cannot be precisely specified. |
Yu Inatsu; Shion Takeno; Masayuki Karasuyama; Ichiro Takeuchi; |
423 | LeNSE: Learning To Navigate Subgraph Embeddings for Large-Scale Combinatorial Optimisation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a low-complexity approach for identifying a (possibly much smaller) subgraph of the original graph where the heuristics can be run in reasonable time and with a high likelihood of finding a global near-optimal solution. |
David Ireland; Giovanni Montana; |
424 | The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns Via Spotlights of Attention Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, this dual formulation offers a possibility of directly visualising how an NN makes use of training patterns at test time, by examining the corresponding attention weights. We conduct experiments on small scale supervised image classification tasks in single-task, multi-task, and continual learning settings, as well as language modelling, and discuss potentials and limits of this view for better understanding and interpreting how NNs exploit training patterns. |
Kazuki Irie; Róbert Csordás; Jürgen Schmidhuber; |
425 | A Modern Self-Referential Weight Matrix That Learns to Modify Itself Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a scalable self-referential WM (SRWM) that learns to use outer products and the delta update rule to modify itself. |
Kazuki Irie; Imanol Schlag; Róbert Csordás; Jürgen Schmidhuber; |
426 | Revisiting Online Submodular Minimization: Gap-Dependent Regret Bounds, Best of Both Worlds and Adversarial Robustness Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we consider online decision problems with submodular loss functions. |
Shinji Ito; |
427 | Modeling Strong and Human-Like Gameplay with KL-Regularized Search Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the task of accurately modeling strong human policies in multi-agent decision-making problems, given examples of human behavior. |
Athul Paul Jacob; David J Wu; Gabriele Farina; Adam Lerer; Hengyuan Hu; Anton Bakhtin; Jacob Andreas; Noam Brown; |
428 | A Deep Convolutional Neural Network That Is Invariant to Time Rescaling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper presents a deep CNN (SITHCon) that uses a logarithmically compressed temporal representation at each level. |
Brandon G Jacques; Zoran Tiganj; Aakash Sarkar; Marc Howard; Per Sederberg; |
429 | Input Dependent Sparse Gaussian Processes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: A limitation is, however, that in some tasks a large number of inducing points may be required to obtain good results. To alleviate this, we propose here to amortize the computation of the inducing points locations, as well as the parameters of $q$. |
Bahram Jafrasteh; Carlos Villacampa-Calvo; Daniel Hernandez-Lobato; |
430 | Regret Minimization with Performative Feedback Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our main contribution is regret bounds that scale only with the complexity of the distribution shifts and not that of the reward function. |
Meena Jagadeesan; Tijana Zrnic; Celestine Mendler-Dünner; |
431 | Biological Sequence Design with GFlowNets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose an active learning algorithm leveraging epistemic uncertainty estimation and the recently proposed GFlowNets as a generator of diverse candidate solutions, with the objective to obtain a diverse batch of useful (as defined by some utility function, for example, the predicted anti-microbial activity of a peptide) and informative candidates after each round. |
Moksh Jain; Emmanuel Bengio; Alex Hernandez-Garcia; Jarrid Rector-Brooks; Bonaventure F. P. Dossou; Chanakya Ajit Ekbote; Jie Fu; Tianyu Zhang; Michael Kilgour; Dinghuai Zhang; Lena Simine; Payel Das; Yoshua Bengio; |
432 | Combining Diverse Feature Priors Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To improve model generalization, model designers often restrict the features that their models use, either implicitly or explicitly. In this work, we explore the design space of leveraging such feature priors by viewing them as distinct perspectives on the data. |
Saachi Jain; Dimitris Tsipras; Aleksander Madry; |
433 | Training Your Sparse Neural Network Better with Any Mask Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Apart from the popular belief that only the quality of sparse masks matters for sparse training, in this paper we demonstrate an alternative opportunity: one can carefully customize the sparse training techniques to deviate from the default dense network training protocols, consisting of introducing “ghost” neurons and skip connections at the early stage of training, and strategically modifying the initialization as well as labels. |
Ajay Kumar Jaiswal; Haoyu Ma; Tianlong Chen; Ying Ding; Zhangyang Wang; |
434 | Sequential Covariate Shift Detection Using Classifier Two-Sample Tests Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the problem of detecting covariate shift, where the covariate distribution shifts but the conditional distribution of labels given covariates remains the same. |
Sooyong Jang; Sangdon Park; Insup Lee; Osbert Bastani; |
435 | Surrogate Likelihoods for Variational Annealed Importance Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, supporting data subsampling in these hybrid methods can be a challenge, a shortcoming that we address by introducing a surrogate likelihood that can be learned jointly with other variational parameters. |
Martin Jankowiak; Du Phan; |
436 | Planning with Diffusion for Flexible Behavior Synthesis Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we consider what it would look like to fold as much of the trajectory optimization pipeline as possible into the modeling problem, such that sampling from the model and planning with it become nearly identical. |
Michael Janner; Yilun Du; Joshua Tenenbaum; Sergey Levine; |
437 | HyperImpute: Generalized Iterative Imputation with Automatic Model Selection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we study an approach that marries the advantages of both: We propose *HyperImpute*, a generalized iterative imputation framework for adaptively and automatically configuring column-wise models and their hyperparameters. |
Daniel Jarrett; Bogdan C Cebere; Tennison Liu; Alicia Curth; Mihaela van der Schaar; |
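As a rough illustration of generalized iterative imputation with automatic per-column model selection (a hand-rolled sketch, not the authors' HyperImpute package; the candidate list, cv=3, and round count are assumptions):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

def iterative_impute(X, n_rounds=3):
    """Column-wise iterative imputation with naive per-column model
    selection: missing cells start at column means; each round re-fits,
    for every incomplete column, whichever candidate model
    cross-validates best on the observed rows, then overwrites the
    missing cells with its predictions."""
    X = X.astype(float).copy()
    miss = np.isnan(X)
    X[miss] = np.take(np.nanmean(X, axis=0), np.where(miss)[1])
    candidates = [LinearRegression(), RandomForestRegressor(n_estimators=50)]
    for _ in range(n_rounds):
        for j in np.where(miss.any(axis=0))[0]:
            obs = ~miss[:, j]
            Xo = np.delete(X, j, axis=1)      # all other columns as features
            best = max(candidates, key=lambda m: cross_val_score(
                m, Xo[obs], X[obs, j], cv=3).mean())
            best.fit(Xo[obs], X[obs, j])
            X[miss[:, j], j] = best.predict(Xo[miss[:, j]])
    return X
```

The point of re-selecting the model per column and per round is that different columns (and different stages of convergence) can favor different model classes, which is the adaptivity the paper formalizes.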
438 | Mitigating Modality Collapse in Multimodal VAEs Via Impartial Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We refer to this limitation as modality collapse. In this work, we argue that this effect is a consequence of conflicting gradients during multimodal VAE training. |
Adrian Javaloy; Maryam Meghdadi; Isabel Valera; |
439 | Towards Understanding How Momentum Improves Generalization in Deep Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we adopt another perspective and first empirically show that gradient descent with momentum (GD+M) significantly improves generalization compared to gradient descent (GD) in some deep learning problems. From this observation, we formally study how momentum improves generalization. |
Samy Jelassi; Yuanzhi Li; |
440 | MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we consider cooperative multi-agent reinforcement learning (MARL) with sparse reward. |
Jeewon Jeon; Woojun Kim; Whiyoung Jung; Youngchul Sung; |
441 | An Exact Symbolic Reduction of Linear Smart Predict+Optimize to Mixed Integer Linear Programming Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we cast the SPO problem as a bi-level program and apply Symbolic Variable Elimination (SVE) to analytically solve the lower optimization. |
Jihwan Jeong; Parth Jaggi; Andrew Butler; Scott Sanner; |
442 | Agnostic Learnability of Halfspaces Via Logistic Loss Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Previously, for a certain broad class of “well-behaved” distributions on the examples, Diakonikolas et al. (2020) proved an $\tilde{\Omega}(\mathrm{OPT})$ lower bound, while Frei et al. (2021) proved an $\tilde{O}(\sqrt{\mathrm{OPT}})$ upper bound, where $\mathrm{OPT}$ denotes the best zero-one/misclassification risk of a homogeneous halfspace. In this paper, we close this gap by constructing a well-behaved distribution such that the global minimizer of the logistic risk over this distribution only achieves $\Omega(\sqrt{\mathrm{OPT}})$ misclassification risk, matching the upper bound in (Frei et al., 2021). |
Ziwei Ji; Kwangjun Ahn; Pranjal Awasthi; Satyen Kale; Stefani Karp; |
443 | Improving Policy Optimization with Generalist-Specialist Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To have the best of both worlds, we propose a novel generalist-specialist training framework. |
Zhiwei Jia; Xuanlin Li; Zhan Ling; Shuang Liu; Yiran Wu; Hao Su; |
444 | Translatotron 2: High-quality Direct Speech-to-speech Translation with Voice Preservation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present Translatotron 2, a neural direct speech-to-speech translation model that can be trained end-to-end. |
Ye Jia; Michelle Tadmor Ramanovich; Tal Remez; Roi Pomerantz; |
445 | Online Learning and Pricing with Reusable Resources: Linear Bandits with Sub-Exponential Rewards Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a rate-optimal online learning and pricing algorithm, termed Batch Linear Confidence Bound (BLinUCB), and prove that the cumulative regret is $\tilde{O}(d_f \sqrt{T})$. |
Huiwen Jia; Cong Shi; Siqian Shen; |
446 | The Role of Deconfounding in Meta-learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we offer a novel causal perspective of meta-learning. |
Yinjie Jiang; Zhengyu Chen; Kun Kuang; Luotian Yuan; Xinhai Ye; Zhihua Wang; Fei Wu; Ying Wei; |
447 | Subspace Learning for Effective Meta-Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an algorithm to learn the meta-parameters (i.e., subspace bases). |
Weisen Jiang; James Kwok; Yu Zhang; |
448 | Optimal Algorithms for Stochastic Multi-Level Compositional Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we investigate the problem of stochastic multi-level compositional optimization, where the objective function is a composition of multiple smooth but possibly non-convex functions. |
Wei Jiang; Bokun Wang; Yibo Wang; Lijun Zhang; Tianbao Yang; |
449 | Antibody-Antigen Docking and Design Via Hierarchical Structure Refinement Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a new model called Hierarchical Structure Refinement Network (HSRN) for paratope docking and design. |
Wengong Jin; Regina Barzilay; Tommi Jaakkola; |
450 | Sharpened Quasi-Newton Methods: Faster Superlinear Rate and Larger Local Convergence Neighborhood Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This is due to the fact that in Greedy-BFGS the Hessian is directly approximated and the Newton direction approximation may not be as accurate as the one for BFGS. In this paper, we close this gap and present a novel BFGS method that has the best of two worlds. |
Qiujiang Jin; Alec Koppel; Ketan Rajawat; Aryan Mokhtari; |
451 | The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper considers two-player zero-sum Markov Games (MGs). We propose a new algorithm that can provably find the Nash equilibrium policy using a polynomial number of samples, for any MG with low multi-agent Bellman-Eluder dimension—a new complexity measure adapted from its single-agent version (Jin et al., 2021). |
Chi Jin; Qinghua Liu; Tiancheng Yu; |
452 | Domain Adaptation for Time Series Forecasting Via Attention Sharing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This poses a challenge for typical forecasting problems in practice, where there is a limited number of time series or observations per time series, or both. To cope with this data scarcity issue, we propose a novel domain adaptation framework, Domain Adaptation Forecaster (DAF). |
Xiaoyong Jin; Youngsuk Park; Danielle Maddix; Hao Wang; Yuyang Wang; |
453 | Accelerated Federated Learning with Decoupled Adaptive Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work aims to develop novel adaptive optimization methods for FL from the perspective of dynamics of ordinary differential equations (ODEs). |
Jiayin Jin; Jiaxiang Ren; Yang Zhou; Lingjuan Lyu; Ji Liu; Dejing Dou; |
454 | Supervised Off-Policy Ranking Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a method to solve SOPR, which learns a policy scoring model by minimizing a ranking loss of the training policies rather than estimating the precise policy performance. |
Yue Jin; Yue Zhang; Tao Qin; Xudong Zhang; Jian Yuan; Houqiang Li; Tie-Yan Liu; |
455 | Input-agnostic Certified Group Fairness Via Gaussian Parameter Smoothing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes an input-agnostic certified group fairness algorithm, FairSmooth, for improving the fairness of classification models while maintaining the remarkable prediction accuracy. |
Jiayin Jin; Zeru Zhang; Yang Zhou; Lingfei Wu; |
456 | Score-based Generative Modeling of Graphs Via The System of Stochastic Differential Equations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Yet, this is a challenging problem, and the previous graph generative methods either fail to capture the permutation-invariance property of graphs or cannot sufficiently model the complex dependency between nodes and edges, which is crucial for generating real-world graphs such as molecules. To overcome such limitations, we propose a novel score-based generative model for graphs with a continuous-time framework. |
Jaehyeong Jo; Seul Lee; Sung Ju Hwang; |
457 | Choosing Answers in Epsilon-Best-Answer Identification for Linear Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We demonstrate that picking the answer with highest mean does not allow an algorithm to reach asymptotic optimality in terms of expected sample complexity. Instead, a furthest answer should be identified. |
Marc Jourdan; Rémy Degenne; |
458 | Robust Fine-Tuning of Deep Neural Networks with Hessian-based Generalization Guarantees Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We design an algorithm that incorporates consistent losses and distance-based regularization for fine-tuning. |
Haotian Ju; Dongyue Li; Hongyang R Zhang; |
459 | Robust Alignment of Cross-session Recordings of Neural Population Activity By Behaviour Via Unsupervised Domain Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: On the other hand, evidence suggests that the latent dynamics underlying behaviour may be stable even over months and years. Based on this idea, we introduce a model capable of inferring behaviourally relevant latent dynamics from previously unseen data recorded from the same animal, without any need for decoder recalibration. |
Justin Jude; Matthew Perich; Lee Miller; Matthias Hennig; |
460 | On Measuring Causal Contributions Via Do-interventions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we develop a principled method for quantifying causal contributions. First, we provide desiderata of properties (axioms) that causal contribution measures should satisfy and propose the do-Shapley values (inspired by do-interventions [Pearl, 2000]) as a unique method satisfying these properties. |
Yonghan Jung; Shiva Kasiviswanathan; Jin Tian; Dominik Janzing; Patrick Bloebaum; Elias Bareinboim; |
461 | Efficient Approximate Inference for Stationary Kernel on Frequency Domain Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, despite its expressive power, training this kernel is typically difficult because scalability and overfitting issues often arise due to a large number of training parameters. To resolve these issues, we propose an approximate inference method for estimating the Spectral mixture kernel hyperparameters. |
Yohan Jung; Kyungwoo Song; Jinkyoo Park; |
462 | Sketching Algorithms and Lower Bounds for Ridge Regression Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We give a sketching-based iterative algorithm that computes a $1+\varepsilon$ approximate solution for the ridge regression problem $\min_x \|Ax-b\|_2^2 + \lambda\|x\|_2^2$ where $A \in \mathbb{R}^{n \times d}$ with $d \ge n$. |
Praneeth Kacham; David Woodruff; |
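For intuition, the one-shot sketch-and-solve baseline for this objective in the $d \ge n$ regime uses the dual identity $x^* = A^\top (AA^\top + \lambda I)^{-1} b$ together with a Gaussian column sketch; the paper's iterative algorithm with $(1+\varepsilon)$ guarantees and matching lower bounds goes well beyond this sketch, and the sketch size `m` below is an assumption.

```python
import numpy as np

def sketched_ridge_dual(A, b, lam, m, rng=np.random.default_rng(0)):
    """Sketch-and-solve ridge regression for A in R^{n x d} with d >= n:
    solve via the dual form x* = A^T (A A^T + lam I)^{-1} b, with A A^T
    approximated by (A S)(A S)^T for a Gaussian column sketch S (d x m),
    since E[S S^T] = I_d."""
    n, d = A.shape
    S = rng.standard_normal((d, m)) / np.sqrt(m)
    AS = A @ S                                    # n x m compressed data
    alpha = np.linalg.solve(AS @ AS.T + lam * np.eye(n), b)
    return A.T @ alpha                            # approximate primal solution

# usage: x_hat = sketched_ridge_dual(A, b, lam=1e-2, m=4 * A.shape[0])
```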
463 | Flashlight: Enabling Innovation in Tools for Machine Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce Flashlight, an open-source library built to spur innovation in machine learning tools and systems by prioritizing open, modular, customizable internals and state-of-the-art, research-ready models and training setups across a variety of domains. |
Jacob D Kahn; Vineel Pratap; Tatiana Likhomanenko; Qiantong Xu; Awni Hannun; Jeff Cai; Paden Tomasello; Ann Lee; Edouard Grave; Gilad Avidov; Benoit Steiner; Vitaliy Liptchinsky; Gabriel Synnaeve; Ronan Collobert; |
464 | Learning-based Optimisation of Particle Accelerators Under Partial Observability Without Real-World Training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this contribution, we demonstrate how to successfully apply RL to the optimisation of a highly complex real-world machine, specifically a linear particle accelerator, in an only partially observable setting and without requiring training on the real machine. |
Jan Kaiser; Oliver Stein; Annika Eichler; |
465 | Stochastic Deep Networks with Linear Competing Units for Model-Agnostic Meta-Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Abstract: This work addresses meta-learning (ML) by considering deep networks with stochastic local winner-takes-all (LWTA) activations. This type of network unit results in sparse … |
Konstantinos Kalais; Sotirios Chatzis; |
466 | Doubly Robust Distributionally Robust Off-Policy Evaluation and Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose the first DR algorithms for DROPE/L with KL-divergence uncertainty sets. |
Nathan Kallus; Xiaojie Mao; Kaiwen Wang; Zhengyuan Zhou; |
467 | Improved Rates for Differentially Private Stochastic Convex Optimization with Heavy-Tailed Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study stochastic convex optimization with heavy-tailed data under the constraint of differential privacy (DP). |
Gautam Kamath; Xingtu Liu; Huanyu Zhang; |
468 | Comprehensive Analysis of Negative Sampling in Knowledge Graph Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Currently, empirical hyperparameter tuning addresses this problem at the cost of computational time. To solve this problem, we theoretically analyzed the NS loss to assist hyperparameter tuning and to better understand the use of the NS loss in KGE learning. |
Hidetaka Kamigaito; Katsuhiko Hayashi; |
469 | Matching Learned Causal Effects of Neural Networks with Domain Priors Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Therefore, we propose a regularization method that aligns the learned causal effects of a neural network with domain priors, including both direct and total causal effects. |
Sai Srinivas Kancheti; Abbavaram Gowtham Reddy; Vineeth N Balasubramanian; Amit Sharma; |
470 | Deduplicating Training Data Mitigates Privacy Risks in Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Past work has shown that large language models are susceptible to privacy attacks, where adversaries generate sequences from a trained model and detect which sequences are memorized from the training set. In this work, we show that the success of these attacks is largely due to duplication in commonly used web-scraped training sets. |
Nikhil Kandpal; Eric Wallace; Colin Raffel; |
471 | Lyapunov Density Models: Constraining Distribution Shift in Learning-Based Control Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Can we combine these two concepts, producing learning-based control algorithms that constrain the system to in-distribution states using only in-distribution actions? In this paper, we propose to do this by combining concepts from Lyapunov stability and density estimation, introducing Lyapunov density models: a generalization of control Lyapunov functions and density models that provides guarantees about an agent’s ability to stay in-distribution over its entire trajectory. |
Katie Kang; Paula Gradu; Jason J Choi; Michael Janner; Claire Tomlin; Sergey Levine; |
472 | Forget-free Continual Learning with Winning Subnetworks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Inspired by Lottery Ticket Hypothesis that competitive subnetworks exist within a dense network, we propose a continual learning method referred to as Winning SubNetworks (WSN), which sequentially learns and selects an optimal subnetwork for each task. |
Haeyong Kang; Rusty John Lloyd Mina; Sultan Rizky Hikmawan Madjid; Jaehong Yoon; Mark Hasegawa-Johnson; Sung Ju Hwang; Chang D. Yoo; |
473 | Differentially Private Approximate Quantiles Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work we study the problem of differentially private (DP) quantiles, in which, given a dataset $X$ and quantiles $q_1, \ldots, q_m \in [0,1]$, we want to output $m$ quantile estimations which are as close as possible to the true quantiles and preserve DP. |
Haim Kaplan; Shachar Schnapp; Uri Stemmer; |
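The standard single-quantile building block that approximate-quantile algorithms of this kind invoke is the exponential mechanism over the gaps of the sorted data. In the sketch below, the data bounds `lo`, `hi` and the split of the privacy budget across the $m$ quantiles are assumptions, and the paper's recursive algorithm is not reproduced.

```python
import numpy as np

def dp_quantile(x, q, eps, lo, hi, rng=np.random.default_rng(0)):
    """One eps-DP quantile via the exponential mechanism: each gap
    between consecutive sorted points is a candidate interval, scored by
    the (sensitivity-1) rank error and weighted by its width; a uniform
    point is drawn inside the sampled interval."""
    x = np.clip(np.sort(np.asarray(x, float)), lo, hi)
    n = x.size
    edges = np.concatenate(([lo], x, [hi]))          # n+1 candidate intervals
    widths = np.diff(edges)
    ranks = np.arange(n + 1)                         # points below each interval
    utility = -np.abs(ranks - q * n)                 # rank error
    logw = (eps / 2) * utility + np.log(np.maximum(widths, 1e-12))
    probs = np.exp(logw - logw.max())
    probs /= probs.sum()
    i = rng.choice(n + 1, p=probs)
    return rng.uniform(edges[i], edges[i + 1])

# usage (naive composition over m quantiles): dp_quantile(x, 0.5, eps / m, 0, 1)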
474 | Simultaneous Graph Signal Clustering and Graph Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we address the problem of learning multiple graphs from heterogeneous data by formulating an optimization problem for joint graph signal clustering and graph topology inference. |
Abdullah Karaaslanli; Selin Aviyente; |
475 | Composing Partial Differential Equations with Physics-Aware Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a compositional physics-aware FInite volume Neural Network (FINN) for learning spatiotemporal advection-diffusion processes. |
Matthias Karlbauer; Timothy Praditia; Sebastian Otte; Sergey Oladyshkin; Wolfgang Nowak; Martin V. Butz; |
476 | Meta-Learning Hypothesis Spaces for Sequential Decision-making Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose to meta-learn a kernel from offline data (Meta-KeL). |
Parnian Kassraie; Jonas Rothfuss; Andreas Krause; |
477 | FOCUS: Familiar Objects in Common and Uncommon Settings Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce FOCUS (Familiar Objects in Common and Uncommon Settings), a dataset for stress-testing the generalization power of deep image classifiers. |
Priyatham Kattakinda; Soheil Feizi; |
478 | Training OOD Detectors in Their Natural Habitats Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel framework that leverages wild mixture data, which naturally consists of both ID and OOD samples. |
Julian Katz-Samuels; Julia B Nakhleh; Robert Nowak; Yixuan Li; |
479 | Robustness Implies Generalization Via Data-Dependent Generalization Bounds Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present several examples, including ones for lasso and deep learning, in which our bounds are provably preferable. |
Kenji Kawaguchi; Zhun Deng; Kyle Luh; Jiaoyang Huang; |
480 | Generating Distributional Adversarial Examples to Evade Statistical Detectors Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Due to the difficulties in designing adaptive attacks, however, recent work suggests that most detectors have incomplete evaluation. We aim to fill this gap by designing a generic adaptive attack against detectors: the ‘statistical indistinguishability attack’ (SIA). |
Yigitcan Kaya; Muhammad Bilal Zafar; Sergul Aydore; Nathalie Rauschmayr; Krishnaram Kenthapadi; |
481 | Secure Quantized Training for Deep Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We implement training of neural networks in secure multi-party computation (MPC) using quantization commonly used in said setting. |
Marcel Keller; Ke Sun; |
482 | A Convergent and Dimension-Independent Min-Max Optimization Algorithm Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study a variant of a recently introduced min-max optimization framework where the max-player is constrained to update its parameters in a greedy manner until it reaches a first-order stationary point. |
Vijay Keswani; Oren Mangoubi; Sushant Sachdeva; Nisheeth K. Vishnoi; |
483 | Neural Network Poisson Models for Behavioural and Neural Spike Train Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Predominant modeling methods apply rather disjoint techniques to these scales; by contrast, we suggest an end-to-end model which exploits recent developments of flexible, but tractable, neural network point-process models to characterize dependencies between stimuli, actions, and neural data. |
Moein Khajehnejad; Forough Habibollahi; Richard Nock; Ehsan Arabzadeh; Peter Dayan; Amir Dezfouli; |
484 | Federated Reinforcement Learning: Linear Speedup Under Markovian Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we consider a federated reinforcement learning framework where multiple agents collaboratively learn a global model, without sharing their individual data and policies. |
Sajad Khodadadian; Pranay Sharma; Gauri Joshi; Siva Theja Maguluri; |
485 | Multi-Level Branched Regularization for Federated Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To alleviate the limitations, we propose a novel architectural regularization technique that constructs multiple auxiliary branches in each local model by grafting local and global subnetworks at several different levels and that learns the representations of the main pathway in the local model congruent to the auxiliary hybrid pathways via online knowledge distillation. |
Jinkyu Kim; Geeho Kim; Bohyung Han; |
486 | Learning Fair Representation with A Parametric Integral Probability Metric Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a new adversarial training scheme for LFR, where the integral probability metric (IPM) with a specific parametric family of discriminators is used. |
Dongha Kim; Kunwoong Kim; Insung Kong; Ilsang Ohn; Yongdai Kim; |
487 | Dataset Condensation Via Efficient Synthetic-Data Parameterization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we propose a novel condensation framework that generates multiple synthetic data with a limited storage budget via efficient parameterization considering data regularity. |
Jang-Hyun Kim; Jinuk Kim; Seong Joon Oh; Sangdoo Yun; Hwanjun Song; Joonhyun Jeong; Jung-Woo Ha; Hyun Oh Song; |
488 | Guided-TTS: A Diffusion Model for Text-to-Speech Via Classifier Guidance Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose Guided-TTS, a high-quality text-to-speech (TTS) model that does not require any transcript of the target speaker, using classifier guidance. |
Heeseung Kim; Sungwon Kim; Sungroh Yoon; |
489 | Variational On-the-Fly Personalization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel personalization method, Variational On-the-Fly Personalization. |
Jangho Kim; Jun-Tae Lee; Simyung Chang; Nojun Kwak; |
490 | Fisher SAM: Information Geometry and Sharpness Aware Minimisation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we consider the information geometry of the model parameter space when defining the neighborhood, namely replacing SAM’s Euclidean balls with ellipsoids induced by the Fisher information. |
Minyoung Kim; Da Li; Shell X Hu; Timothy Hospedales; |
491 | ViT-NeT: Interpretable Vision Transformers with Neural Tree Decoder Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this study, we propose a new ViT neural tree decoder (ViT-NeT). |
Sangwon Kim; Jaeyeal Nam; Byoung Chul Ko; |
492 | Sanity Simulations for Saliency Methods Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we design a synthetic benchmarking framework, SMERF, that allows us to perform ground-truth-based evaluation while controlling the complexity of the model’s reasoning. |
Joon Sik Kim; Gregory Plumb; Ameet Talwalkar; |
493 | Soft Truncation: A Universal Training Technique of Score-based Diffusion Model for High Precision Score Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: For successful training, therefore, we introduce Soft Truncation, a universally applicable training technique for diffusion models, that softens the fixed and static truncation hyperparameter into a random variable. |
Dongjun Kim; Seungjae Shin; Kyungwoo Song; Wanmo Kang; Il-Chul Moon; |
494 | Rotting Infinitely Many-Armed Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We consider the infinitely many-armed bandit problem with rotting rewards, where the mean reward of an arm decreases at each pull of the arm according to an arbitrary trend with maximum rotting rate $\varrho = o(1)$. |
Jung-Hun Kim; Milan Vojnovic; Se-Young Yun; |
495 | Accelerated Gradient Methods for Geodesically Convex Optimization: Tractable Algorithms and Convergence Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose computationally tractable accelerated first-order methods for Riemannian optimization, extending the Nesterov accelerated gradient (NAG) method. |
Jungbin Kim; Insoon Yang; |
496 | Generalizing to New Physical Systems Via Context-Informed Dynamics Model Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Data-driven approaches to modeling physical systems fail to generalize to unseen systems that share the same general dynamics with the learning domain, but correspond to different physical contexts. We propose a new framework for this key problem, context-informed dynamics adaptation (CoDA), which takes into account the distributional shift across systems for fast and efficient adaptation to new dynamics. |
Matthieu Kirchmeyer; Yuan Yin; Jeremie Dona; Nicolas Baskiotis; Alain Rakotomamonjy; Patrick Gallinari; |
497 | SoQal: Selective Oracle Questioning for Consistency Based Active Learning of Cardiac Signals Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: One way to mitigate this burden is via active learning (AL) which involves the (a) acquisition and (b) annotation of informative unlabelled instances. Whereas previous work addresses either one of these elements independently, we propose an AL framework that addresses both. |
Dani Kiyasseh; Tingting Zhu; David A Clifton; |
498 | Curriculum Reinforcement Learning Via Constrained Optimal Transport Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we focus on the idea of framing curricula as interpolations between task distributions, which has previously been shown to be a viable approach to CRL. |
Pascal Klink; Haoyi Yang; Carlo D'Eramo; Jan Peters; Joni Pajarinen; |
499 | Exploiting Redundancy: Separable Group Convolutional Networks on Lie Groups Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we investigate the properties of representations learned by regular G-CNNs, and show considerable parameter redundancy in group convolution kernels. |
David M. Knigge; David W Romero; Erik J Bekkers; |
500 | Revisiting Contrastive Learning Through The Lens of Neighborhood Component Analysis: An Integrated Framework Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: By investigating the connection between contrastive learning and neighborhood component analysis (NCA), we provide a novel stochastic nearest neighbor viewpoint of contrastive learning and subsequently propose a series of contrastive losses that outperform the existing ones. |
Ching-Yun Ko; Jeet Mohapatra; Sijia Liu; Pin-Yu Chen; Luca Daniel; Lily Weng; |
501 | Transfer Learning In Differential Privacy’s Hybrid-Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here we study the problem of machine learning in the hybrid-model where the $n$ individuals in the curator’s dataset are drawn from a different distribution than the one of the general population (the local-agents). |
Refael Kohen; Or Sheffet; |
502 | Markov Chain Monte Carlo for Continuous-Time Switching Dynamical Systems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a novel inference algorithm utilizing a Markov Chain Monte Carlo approach. |
Lukas Köhs; Bastian Alt; Heinz Koeppl; |
503 | Partial Disentanglement for Domain Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Given the theoretical insights, we propose a practical domain adaptation framework, called iMSDA. |
Lingjing Kong; Shaoan Xie; Weiran Yao; Yujia Zheng; Guangyi Chen; Petar Stojanov; Victor Akinwande; Kun Zhang; |
504 | Simultaneously Learning Stochastic and Adversarial Bandits with General Graph Feedback Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: With a general feedback graph, the observation of an arm may not be available when this arm is pulled, which makes the exploration more expensive and the algorithms more challenging to perform optimally in both environments. In this work, we overcome this difficulty by a new trade-off mechanism with a carefully-designed proportion for exploration and exploitation. |
Fang Kong; Yichi Zhou; Shuai Li; |
505 | Adaptive Data Analysis with Correlated Observations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We embark on a systematic study of the possibilities of adaptive data analysis with correlated observations. |
Aryeh Kontorovich; Menachem Sadigurschi; Uri Stemmer; |
506 | Controlling Conditional Language Models Without Catastrophic Forgetting Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we extend DPG to conditional tasks by proposing Conditional DPG (CDPG). |
Tomasz Korbak; Hady Elsahar; German Kruszewski; Marc Dymetman; |
507 | Batch Greenkhorn Algorithm for Entropic-Regularized Multimarginal Optimal Transport: Linear Rate of Convergence and Iteration Complexity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work we propose a batch multimarginal version of the Greenkhorn algorithm for the entropic-regularized optimal transport problem. |
Vladimir R. Kostic; Saverio Salzo; Massimiliano Pontil; |
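To convey the batched greedy idea in the familiar two-marginal special case (the paper's multimarginal formulation and linear-rate analysis go further), here is a minimal sketch with illustrative eta, batch size, and iteration count:

```python
import numpy as np

def batch_greenkhorn(C, r, c, eta=0.05, batch=8, iters=2000):
    """Batched greedy Sinkhorn scaling for entropic OT with two
    marginals: on each pass, exactly rescale the `batch` rows (then
    columns) whose current marginals deviate most from the targets r, c."""
    K = np.exp(-C / eta)                      # Gibbs kernel
    u, v = np.ones(C.shape[0]), np.ones(C.shape[1])
    for _ in range(iters):
        Kv = K @ v
        i = np.argsort(-np.abs(u * Kv - r))[:batch]
        u[i] = r[i] / Kv[i]                   # fix worst row marginals
        Ku = K.T @ u
        j = np.argsort(-np.abs(v * Ku - c))[:batch]
        v[j] = c[j] / Ku[j]                   # fix worst column marginals
    return u[:, None] * K * v[None, :]        # approximate transport plan
```

With `batch=1` this degenerates to classic Greenkhorn, and with `batch` equal to the full dimension it recovers a Sinkhorn sweep; the batched regime interpolates between the two.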
508 | Certified Adversarial Robustness Under The Bounded Support Set Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we generalize the $f$-divergence-based framework to a Wasserstein-distance-based and total-variation-distance-based framework that is first able to analyze robustness properties of bounded support set smoothing measures both theoretically and experimentally. |
Yiwen Kou; Qinyuan Zheng; Yisen Wang; |
509 | Exact Learning of Preference Structure: Single-peaked Preferences and Beyond Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the setting where the members of a society (voters) have preferences over candidates, and the candidates can be ordered on an axis so that the voters’ preferences are single-peaked on this axis. |
Sonja Kraiczy; Edith Elkind; |
510 | Reconstructing Nonlinear Dynamical Systems from Multi-Modal Time Series Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here we propose a general framework for multi-modal data integration for the purpose of nonlinear DS reconstruction and the analysis of cross-modal relations. |
Daniel Kramer; Philine L Bommer; Daniel Durstewitz; Carlo Tombolini; Georgia Koppe; |
511 | Probabilistic ODE Solutions in Millions of Dimensions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we explain the mathematical assumptions and detailed implementation schemes behind solving high-dimensional ODEs with a probabilistic numerical algorithm. |
Nicholas Krämer; Nathanael Bosch; Jonathan Schmidt; Philipp Hennig; |
512 | Active Nearest Neighbor Regression Through Delaunay Refinement Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce an algorithm for active function approximation based on nearest neighbor regression. |
Alexander Kravberg; Giovanni Luca Marchetti; Vladislav Polianskii; Anastasiia Varava; Florian T. Pokorny; Danica Kragic; |
513 | Functional Generalized Empirical Likelihood Estimation for Conditional Moment Restrictions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To benefit from recent developments in machine learning, we provide a functional reformulation of GEL in which arbitrary models can be leveraged. Motivated by a dual formulation of the resulting infinite dimensional optimization problem, we devise a practical method and explore its asymptotic properties. |
Heiner Kremer; Jia-Jie Zhu; Krikamol Muandet; Bernhard Schölkopf; |
514 | Calibrated and Sharp Uncertainties in Deep Learning Via Density Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a simple training procedure based on recalibration that yields calibrated models without sacrificing overall performance; unlike previous approaches, ours ensures the most general property of distribution calibration and applies to any model, including neural networks. |
Volodymyr Kuleshov; Shachi Deshpande; |
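The highlight names recalibration without spelling out the estimator; the sketch below is a generic quantile-recalibration step (fit a monotone map on held-out probability integral transform values), which only approximates the distribution calibration the paper targets. `pit_values` is a hypothetical array of predicted CDF values $F_i(y_i)$ on a calibration set.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def recalibrate(pit_values):
    """Fit a monotone map R so that R(F(y)) is empirically uniform."""
    n = len(pit_values)
    # Empirical CDF of the PIT values serves as the regression target.
    targets = np.searchsorted(np.sort(pit_values), pit_values, side="right") / n
    return IsotonicRegression(y_min=0.0, y_max=1.0).fit(pit_values, targets)

# Usage sketch: R = recalibrate(pit); R.predict(new_cdf_values) returns
# recalibrated CDF values for downstream quantile estimates.
```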
515 | ActiveHedge: Hedge Meets Active Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the classical problem of multi-class prediction with expert advice, but with an active learning twist. |
Bhuvesh Kumar; Jacob D Abernethy; Venkatesh Saligrama; |
516 | Balancing Discriminability and Transferability for Source-Free Domain Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Upon analyzing the hurdles from both theoretical and empirical standpoints, we derive novel insights to show that a mixup between original and corresponding translated generic samples enhances the discriminability-transferability trade-off while duly respecting the privacy-oriented source-free setting. |
Jogendra Nath Kundu; Akshay R Kulkarni; Suvaansh Bhambri; Deepesh Mehta; Shreyas Anand Kulkarni; Varun Jampani; Venkatesh Babu Radhakrishnan; |
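The central operation named in the highlight above is a mixup between each original sample and its translated generic counterpart. A minimal sketch of that single step, assuming two aligned batches and the standard Beta-sampled coefficient of mixup (Zhang et al., 2018); the paper’s actual mixing schedule is not given here:

```python
import torch

def mix_batches(x_original, x_translated, alpha=0.2):
    """Convexly combine aligned original and translated batches."""
    # Standard mixup coefficient drawn from a Beta distribution.
    lam = torch.distributions.Beta(alpha, alpha).sample()
    return lam * x_original + (1.0 - lam) * x_translated
```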
517 | Showing Your Offline Reinforcement Learning Work: Online Evaluation Budget Matters Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we argue for the importance of an online evaluation budget for a reliable comparison of deep offline RL algorithms. |
Vladislav Kurenkov; Sergey Kolesnikov; |
518 | Equivariant Priors for Compressed Sensing with Unknown Orientation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Additionally, in many scenarios, the signal has an unknown orientation prior to measurements. To address such recovery problems, we propose using equivariant generative models as a prior, which encapsulate orientation information in their latent space. |
Anna Kuzina; Kumar Pratik; Fabio Valerio Massoli; Arash Behboodi; |
519 | Coordinated Attacks Against Contextual Bandits: Fundamental Limits and Defense Mechanisms Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Motivated by online recommendation systems, we propose the problem of finding the optimal policy in multitask contextual bandits when a small fraction $\alpha < 1/2$ of tasks (users) are arbitrary and adversarial. |
Jeongyeol Kwon; Yonathan Efroni; Constantine Caramanis; Shie Mannor; |
520 | Large Batch Experience Replay Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we cast the replay buffer sampling problem as an importance sampling one for estimating the gradient. |
Thibault Lahire; Matthieu Geist; Emmanuel Rachelson; |
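The highlight casts replay-buffer sampling as importance sampling for the gradient estimate. A generic sketch of that view, not the authors’ exact estimator: sample transitions with probability proportional to a priority (here a stand-in for per-transition gradient magnitude) and reweight by $1/(N p_i)$ to keep the estimate unbiased.

```python
import numpy as np

def sample_replay(priorities, batch_size):
    """Priority-proportional sampling with importance-weight correction."""
    p = priorities / priorities.sum()
    idx = np.random.choice(len(p), size=batch_size, p=p)
    weights = 1.0 / (len(p) * p[idx])     # importance weights
    return idx, weights / weights.max()   # normalized for stability
```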
521 | FedScale: Benchmarking Model and System Performance of Federated Learning at Scale Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present FedScale, a federated learning (FL) benchmarking suite with realistic datasets and a scalable runtime to enable reproducible FL research. |
Fan Lai; Yinwei Dai; Sanjay Singapuram; Jiachen Liu; Xiangfeng Zhu; Harsha Madhyastha; Mosharaf Chowdhury; |
522 | Smoothed Adaptive Weighting for Imbalanced Semi-Supervised Learning: Improve Reliability Against Unknown Distribution Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Based on this study, we propose a self-adaptive algorithm, named Smoothed Adaptive Weighting (SAW). |
Zhengfeng Lai; Chao Wang; Henrry Gunawan; Sen-Ching S Cheung; Chen-Nee Chuah; |
523 | Functional Output Regression with Infimal Convolution: Exploring The Huber and $\epsilon$-insensitive Losses Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We derive computationally tractable algorithms relying on duality to tackle the resulting tasks in the context of vector-valued reproducing kernel Hilbert spaces. |
Alex Lambert; Dimitri Bouche; Zoltan Szabo; Florence d’Alché-Buc; |
524 | Tell Me Why! Explanations Support Learning Relational and Causal Structure Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here, we show that language can play a similar role for deep RL agents in complex environments. |
Andrew K Lampinen; Nicholas Roy; Ishita Dasgupta; Stephanie Cy Chan; Allison Tam; James Mcclelland; Chen Yan; Adam Santoro; Neil C Rabinowitz; Jane Wang; Felix Hill; |
525 | Generative Cooperative Networks for Natural Language Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce Generative Cooperative Networks, in which the discriminator architecture is cooperatively used along with the generation policy to output samples of realistic texts for the task at hand. |
Sylvain Lamprier; Thomas Scialom; Antoine Chaffin; Vincent Claveau; Ewa Kijak; Jacopo Staiano; Benjamin Piwowarski; |
526 | DSTAGNN: Dynamic Spatial-Temporal Aware Graph Neural Network for Traffic Flow Forecasting Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes a novel Dynamic Spatial-Temporal Aware Graph Neural Network (DSTAGNN) to model the complex spatial-temporal interactions in road networks. |
Shiyong Lan; Yitong Ma; Weikang Huang; Wenwu Wang; Hongyu Yang; Pyang Li; |
527 | Cooperative Online Learning in Stochastic and Adversarial MDPs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study cooperative online learning in stochastic and adversarial Markov decision processes (MDPs). |
Tal Lancewicki; Aviv Rosenberg; Yishay Mansour; |
528 | PINs: Progressive Implicit Networks for Multi-Scale Neural Representations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, scenes with a wide frequency spectrum remain a challenge: choosing high frequencies for positional encoding introduces noise in low-structure areas, while low frequencies result in poor fitting of detailed regions. To address this, we propose a progressive positional encoding, exposing a hierarchical MLP structure to incremental sets of frequency encodings. |
Zoe Landgraf; Alexander Sorkine Hornung; Ricardo S Cabral; |
529 | Co-training Improves Prompt-based Learning for Large Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We demonstrate that co-training (Blum & Mitchell, 1998) can improve the performance of prompt-based learning by using unlabeled data. |
Hunter Lang; Monica N Agrawal; Yoon Kim; David Sontag; |
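Co-training here refers to the classical Blum & Mitchell (1998) loop, which the paper adapts to prompted language models. A schematic, simplified sketch with linear models and a shared label pool standing in for the paper’s prompt-based views:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def co_train(Xa, Xb, y, Xa_u, Xb_u, rounds=6, k=20):
    """Alternating co-training on two feature views of the same data."""
    for t in range(rounds):
        ma = LogisticRegression(max_iter=1000).fit(Xa, y)
        mb = LogisticRegression(max_iter=1000).fit(Xb, y)
        if len(Xa_u) < k:
            break
        # One view's model pseudo-labels its k most confident unlabeled
        # points each round; both views absorb them into the labeled pool.
        teacher, Xu = (ma, Xa_u) if t % 2 == 0 else (mb, Xb_u)
        take = np.argsort(-teacher.predict_proba(Xu).max(axis=1))[:k]
        y = np.concatenate([y, teacher.predict(Xu[take])])
        Xa, Xb = np.vstack([Xa, Xa_u[take]]), np.vstack([Xb, Xb_u[take]])
        Xa_u = np.delete(Xa_u, take, axis=0)
        Xb_u = np.delete(Xb_u, take, axis=0)
    return ma, mb
```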
530 | Goal Misgeneralization in Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study goal misgeneralization, a type of out-of-distribution robustness failure in reinforcement learning (RL). |
Lauro Langosco Di Langosco; Jack Koch; Lee D Sharkey; Jacob Pfau; David Krueger; |
531 | Marginal Tail-Adaptive Normalizing Flows Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we focus on improving the ability of normalizing flows to correctly capture the tail behavior and, thus, form more accurate models. |
Mike Laszkiewicz; Johannes Lederer; Asja Fischer; |
532 | Bregman Proximal Langevin Monte Carlo Via Bregman-Moreau Envelopes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose efficient Langevin Monte Carlo algorithms for sampling distributions with nonsmooth convex composite potentials, i.e., sums of a continuously differentiable function and a possibly nonsmooth function. |
Tim Tsz-Kit Lau; Han Liu; |
533 | Scalable Deep Reinforcement Learning Algorithms for Mean Field Games Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This is far from trivial for nonlinear function approximators that enjoy good generalization properties, e.g., neural networks. We propose two methods to address this shortcoming. |
Mathieu Lauriere; Sarah Perrin; Sertan Girgin; Paul Muller; Ayush Jain; Theophile Cabannes; Georgios Piliouras; Julien Perolat; Romuald Elie; Olivier Pietquin; Matthieu Geist; |
534 | Implicit Bias of Linear Equivariant Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this context, we show that L-layer full-width linear G-CNNs trained via gradient descent for binary classification converge to solutions with low-rank Fourier matrix coefficients, regularized by the 2/L-Schatten matrix norm. |
Hannah Lawrence; Bobak Kiani; Kristian G Georgiev; Andrew K Dienes; |
535 | Differentially Private Maximal Information Coefficients Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: As a solution, we present algorithms to approximate MIC in a way that provides differential privacy. |
John Lazarsfeld; Aaron Johnson; Emmanuel Adeniran; |
536 | Entropic Gromov-Wasserstein Between Gaussian Distributions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the entropic Gromov-Wasserstein and its unbalanced version between (unbalanced) Gaussian distributions with different dimensions. |
Khang Le; Dung Q Le; Huy Nguyen; Dat Do; Tung Pham; Nhat Ho; |
537 | Neurocoder: General-Purpose Computation Using Stored Neural Programs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here we design Neurocoder, a new class of general-purpose neural networks in which the neural network “codes” itself in a data-responsive way by composing relevant programs from a set of shareable, modular programs stored in external memory. |
Hung Le; Svetha Venkatesh; |
538 | Convergence of Policy Gradient for Entropy Regularized MDPs with Neural Network Approximation in The Mean-Field Regime Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the global convergence of policy gradient for infinite-horizon, continuous state and action space, and entropy-regularized Markov decision processes (MDPs). |
James-Michael Leahy; Bekzhan Kerimkulov; David Siska; Lukasz Szpruch; |
539 | A Random Matrix Analysis of Data Stream Clustering: Coping With Limited Memory Resources Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This article introduces a random matrix framework for the analysis of clustering on high-dimensional data streams, a particularly relevant setting for a more sober processing of large amounts of data with limited memory and energy resources. |
Hugo Lebeau; Romain Couillet; Florent Chatelain; |
540 | Neural Tangent Kernel Analysis of Deep Narrow Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present the first trainability guarantee of infinitely deep but narrow neural networks. |
Jongmin Lee; Joo Young Choi; Ernest K Ryu; Albert No; |
541 | Dataset Condensation with Contrastive Signals Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We attribute this to the absence of contrastive signals between classes, which results from the class-wise gradient matching strategy. To address this problem, we propose Dataset Condensation with Contrastive signals (DCC) by modifying the loss function to enable DC methods to effectively capture the differences between classes. |
Saehyung Lee; Sanghyuk Chun; Sangwon Jung; Sangdoo Yun; Sungroh Yoon; |
542 | Confidence Score for Source-Free Unsupervised Domain Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To differentiate samples by importance, in this study we propose a novel sample-wise confidence score, the Joint Model-Data Structure (JMDS) score, for SFUDA. |
Jonghyun Lee; Dahuin Jung; Junho Yim; Sungroh Yoon; |
543 | A Statistical Manifold Framework for Point Cloud Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: A growing number of applications require a means of measuring not only distances between point clouds, but also angles, volumes, derivatives, and other more advanced concepts. To formulate and quantify these concepts in a coordinate-invariant way, we develop a Riemannian geometric framework for point cloud data. |
Yonghyeon Lee; Seungyeon Kim; Jinwon Choi; Frank Park; |
544 | Low-Complexity Deep Convolutional Neural Networks on Fully Homomorphic Encryption Using Multiplexed Parallel Convolutions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To improve the performance, we first minimize total bootstrapping runtime using multiplexed parallel convolution that collects sparse output data for multiple channels compactly. We also propose the imaginary-removing bootstrapping to prevent the deep neural networks from catastrophic divergence during approximate ReLU operations. |
Eunsang Lee; Joon-Woo Lee; Junghyun Lee; Young-Sik Kim; Yongjune Kim; Jong-Seon No; Woosuk Choi; |
545 | Statistical Inference with Implicit SGD: Proximal Robbins-Monro Vs. Polyak-Ruppert Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we conduct an in-depth analysis of the two modes of ISGD for smooth convex functions, namely the proximal Robbins-Monro (proxRM) and proximal Polyak-Ruppert (proxPR) procedures, for their use in statistical inference on model parameters. |
Yoonhyung Lee; Sungdong Lee; Joong-Ho Won; |
546 | Maslow’s Hammer in Catastrophic Forgetting: Node Re-Use Vs. Node Activation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Surprisingly, the amount of forgetting does not increase with the dissimilarity between the learned tasks, but appears to be worst in an intermediate similarity regime. In this paper we theoretically analyse both a synthetic teacher-student framework and a real data setup to provide an explanation of this phenomenon that we name Maslow’s Hammer hypothesis. |
Sebastian Lee; Stefano Sarao Mannelli; Claudia Clopath; Sebastian Goldt; Andrew Saxe; |
547 | Query-Efficient and Scalable Black-Box Adversarial Attacks on Discrete Sequential Data Via Bayesian Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce block decomposition and history subsampling techniques to improve the scalability of Bayesian optimization when an input sequence becomes long. |
Deokjae Lee; Seungyong Moon; Junhyeok Lee; Hyun Oh Song; |
548 | Least Squares Estimation Using Sketched Data with Heteroskedastic Errors Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper considers the case when the regression errors do not have constant variance and heteroskedasticity robust standard errors would normally be needed for test statistics to provide accurate inference. |
Sokbae Lee; Serena Ng; |
549 | Why The Rich Get Richer? On The Balancedness of Random Partition Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a principled way to compare the balancedness of random partition models, which gives a better understanding of what model works better and what doesn’t for different applications. |
Changwoo J Lee; Huiyan Sang; |
550 | Model Selection in Batch Policy Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the problem of model selection in batch policy optimization: given a fixed, partial-feedback dataset and M model classes, learn a policy with performance that is competitive with the policy derived from the best model class. |
Jonathan Lee; George Tucker; Ofir Nachum; Bo Dai; |
551 | Supervised Learning with General Risk Functionals Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We establish the first uniform convergence results for estimating the CDF of the loss distribution, which yield uniform convergence guarantees that hold simultaneously both over a class of Hölder risk functionals and over a hypothesis class. |
Liu Leqi; Audrey Huang; Zachary Lipton; Kamyar Azizzadenesheli; |
552 | Generalized Strategic Classification and The Case of Aligned Incentives Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work we argue for a broader perspective on what accounts for strategic user behavior, and propose and study a flexible model of generalized strategic classification. |
Sagi Levanon; Nir Rosenfeld; |
553 | A Simple Unified Framework for High Dimensional Bandit Problems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Stochastic high dimensional bandit problems with low dimensional structures are useful in different applications such as online advertising and drug discovery. In this work, we propose a simple unified algorithm for such problems and present a general analysis framework for the regret upper bound of our algorithm. |
Wenjie Li; Adarsh Barik; Jean Honorio; |
554 | Robust Training of Neural Networks Using Scale Invariant Architectures Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, the use of adaptivity not only comes at the cost of extra memory but also raises the fundamental question: can non-adaptive methods like SGD enjoy similar benefits? In this paper, we provide an affirmative answer to this question by proposing to achieve both robust and memory-efficient training via the following general recipe: (1) modify the architecture and make it scale invariant, (2) train with SGD and weight decay, and optionally (3) clip the global gradient norm proportional to weight norm multiplied by $\sqrt{\frac{2\lambda}{\eta}}$, where $\eta$ is the learning rate and $\lambda$ is the weight decay coefficient. |
Zhiyuan Li; Srinadh Bhojanapalli; Manzil Zaheer; Sashank Reddi; Sanjiv Kumar; |
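Step (3) of the recipe is concrete enough to sketch straight from the highlight: clip the global gradient norm at $\|w\| \cdot \sqrt{2\lambda/\eta}$ before a plain SGD-with-weight-decay update. A minimal sketch; the function and argument names are illustrative, not from the paper:

```python
import math
import torch

def clipped_sgd_step(model, optimizer, lr, weight_decay):
    """One update of recipe steps (2)-(3): clip, then SGD + weight decay."""
    params = [p for p in model.parameters() if p.grad is not None]
    # Global weight norm across all parameters.
    weight_norm = torch.norm(torch.stack([p.detach().norm() for p in params]))
    max_norm = weight_norm * math.sqrt(2.0 * weight_decay / lr)
    torch.nn.utils.clip_grad_norm_(params, max_norm.item())
    optimizer.step()   # assumed to be torch.optim.SGD with weight_decay set
    optimizer.zero_grad()
```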
555 | Spatial-Channel Token Distillation for Vision MLPs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work solves the problem from a novel knowledge distillation perspective. We propose a novel Spatial-channel Token Distillation (STD) method, which improves the information mixing in the two dimensions by introducing distillation tokens to each of them. |
Yanxi Li; Xinghao Chen; Minjing Dong; Yehui Tang; Yunhe Wang; Chang Xu; |
556 | An Analytical Update Rule for General Policy Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present an analytical policy update rule that is independent of parametric function approximators. |
Hepeng Li; Nicholas Clavette; Haibo He; |
557 | On Convergence of Gradient Descent Ascent: A Tight Local Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: While this stepsize ratio suggests a slow training of the min player, practical GAN algorithms typically adopt similar stepsizes for both variables, indicating a wide gap between theoretical and empirical results. In this paper, we aim to bridge this gap by analyzing the local convergence of general nonconvex-nonconcave minimax problems. |
Haochuan Li; Farzan Farnia; Subhro Das; Ali Jadbabaie; |
558 | On The Finite-Time Performance of The Knowledge Gradient Algorithm Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this research, we present new theoretical results about the finite-time performance of the KG algorithm. |
Yanwen Li; Siyang Gao; |
559 | Phasic Self-Imitative Reduction for Sparse-Reward Goal-Conditioned Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel phasic solution that alternates online RL and offline SL to tackle sparse-reward goal-conditioned problems. In the online phase, we perform RL training and collect rollout data, while in the offline phase, we perform SL on the successful trajectories from the dataset. |
Yunfei Li; Tian Gao; Jiaqi Yang; Huazhe Xu; Yi Wu; |
560 | G$^2$CN: Graph Gaussian Convolution Networks with Concentrated Graph Filters Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Nevertheless, we notice that existing spectral analysis fails to explain why existing graph propagations with the same global tendency, such as low-pass or high-pass, still yield very different results. Motivated by this situation, we develop a new framework for spectral analysis in this paper called concentration analysis. |
Mingjie Li; Xiaojun Guo; Yifei Wang; Yisen Wang; Zhouchen Lin; |
561 | Decomposing Temporal High-Order Interactions Via Latent ODEs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As a result, these methods may be unable to capture complex, fine-grained temporal dynamics or to make accurate predictions for long-term interaction results. To overcome these limitations, we propose a novel Temporal High-order Interaction decompoSition model based on Ordinary Differential Equations (THIS-ODE). |
Shibo Li; Robert Kirby; Shandian Zhe; |
562 | Neural Inverse Transform Sampler Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we show that when modeling one-dimensional conditional densities with a neural network, $Z$ can be exactly and efficiently computed by letting the network represent the cumulative distribution function of a target density, and applying a generalized fundamental theorem of calculus. |
Henry Li; Yuval Kluger; |
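The identity in the highlight is easy to make concrete: if a monotone network $F$ represents an unnormalized CDF on $[a, b]$, then $Z = F(b) - F(a)$ exactly by the fundamental theorem of calculus, and the density is $F'(x)/Z$ via autograd. A minimal sketch, assuming a hypothetical monotone `cdf_net` applied pointwise to inputs of shape (N, 1); monotonicity is assumed here, not enforced:

```python
import torch

def density_from_cdf(cdf_net, x, a, b):
    """Normalized density from a network representing an unnormalized CDF."""
    x = x.requires_grad_(True)
    F = cdf_net(x)
    # dF/dx for each input, since each output depends on its own input row.
    dFdx, = torch.autograd.grad(F.sum(), x, create_graph=True)
    Z = cdf_net(b) - cdf_net(a)   # exact normalizing constant
    return dFdx / Z
```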
563 | PLATINUM: Semi-Supervised Model Agnostic Meta-Learning Using Submodular Mutual Information Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose PLATINUM (semi-suPervised modeL Agnostic meTa learnIng usiNg sUbmodular Mutual information), a novel semi-supervised model-agnostic meta-learning framework that uses submodular mutual information (SMI) functions to boost the performance of FSC. |
Changbin Li; Suraj Kothawade; Feng Chen; Rishabh Iyer; |
564 | Deconfounded Value Decomposition for Multi-Agent Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we investigate VD from a novel perspective of causal inference. |
Jiahui Li; Kun Kuang; Baoxiang Wang; Furui Liu; Long Chen; Changjie Fan; Fei Wu; Jun Xiao; |
565 | C-MinHash: Improving Minwise Hashing with Circulant Permutation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose Circulant MinHash (C-MinHash) and provide the surprising theoretical results that using only two independent random permutations in a circulant manner leads to uniformly smaller Jaccard estimation variance than that of the classical MinHash with K independent permutations. |
Xiaoyun Li; Ping Li; |
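The circulant trick in the highlight replaces K independent permutations with two: σ, applied once to break structure, and π, reused under circular shifts for all K hashes. A rough sketch of that idea; the paper’s exact index conventions may differ:

```python
import numpy as np

def c_minhash(nonzero_ids, sigma, pi, K):
    """K MinHash values from two permutations via circulant shifts.
    sigma, pi: length-D arrays, each a permutation of range(D)."""
    D = len(pi)
    shuffled = sigma[np.asarray(nonzero_ids)]   # apply sigma once
    return np.array([pi[(shuffled + k) % D].min() for k in range(K)])

def jaccard_estimate(h1, h2):
    # The collision rate of the minima estimates Jaccard similarity.
    return float(np.mean(h1 == h2))
```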
566 | BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose BLIP, a new VLP framework which transfers flexibly to both vision-language understanding and generation tasks. |
Junnan Li; Dongxu Li; Caiming Xiong; Steven Hoi; |
567 | Restarted Nonconvex Accelerated Gradient Descent: No More Polylogarithmic Factor in The $O(\epsilon^{-7/4})$ Complexity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper studies accelerated gradient descent for general nonconvex problems under the gradient Lipschitz and Hessian Lipschitz assumptions. |
Huan Li; Zhouchen Lin; |
568 | Achieving Fairness at No Utility Cost Via Data Reweighing with Influence Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we focus on the pre-processing aspect for achieving fairness, and propose a data reweighing approach that only adjusts the weight for samples in the training phase. |
Peizhao Li; Hongfu Liu; |
569 | High Probability Guarantees for Nonconvex Stochastic Gradient Descent with Heavy Tails Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we develop high probability bounds for nonconvex SGD with a joint perspective of optimization and generalization performance. |
Shaojie Li; Yong Liu; |
570 | MetAug: Contrastive Learning Via Meta Feature Augmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In response, we propose to directly augment the features in latent space, thereby learning discriminative representations without a large amount of input data. |
Jiangmeng Li; Wenwen Qiang; Changwen Zheng; Bing Su; Hui Xiong; |
571 | PMIC: Improving Multi-Agent Reinforcement Learning with Progressive Mutual Information Collaboration Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, we reveal that sub-optimal collaborative behaviors also emerge with strong correlations, and that simply maximizing the MI can, surprisingly, hinder learning towards better collaboration. To address this issue, we propose a novel MARL framework, called Progressive Mutual Information Collaboration (PMIC), for more effective MI-driven collaboration. |
Pengyi Li; Hongyao Tang; Tianpei Yang; Xiaotian Hao; Tong Sang; Yan Zheng; Jianye Hao; Matthew E. Taylor; Wenyuan Tao; Zhen Wang; |
572 | CerDEQ: Certifiable Deep Equilibrium Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we aim to tackle the problem of DEQ’s certified training. |
Mingjie Li; Yisen Wang; Zhouchen Lin; |
573 | Generalization Guarantee of Training Graph Convolutional Networks with Graph Topology Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To the best of our knowledge, this paper provides the first theoretical justification of graph topology sampling in training (up to) three-layer GCNs for semi-supervised node classification. |
Hongkang Li; Meng Wang; Sijia Liu; Pin-Yu Chen; Jinjun Xiong; |
574 | Let Invariant Rationale Discovery Inspire Graph Contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Taking an invariance look at GCL, we argue that a high-performing augmentation should preserve the salient semantics of anchor graphs regarding instance-discrimination. To this end, we relate GCL with invariant rationale discovery, and propose a new framework, Rationale-aware Graph Contrastive Learning (RGCL). |
Sihang Li; Xiang Wang; An Zhang; Yingxin Wu; Xiangnan He; Tat-Seng Chua; |
575 | Difference Advantage Estimation for Multi-Agent Policy Gradients Related Papers Related Patents |