Paper Digest: ICASSP 2023 Highlights
The IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) is one of the top signal processing conferences in the world. In 2023, it is to be held on Rhodes Island, Greece.
To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights/summaries to quickly get the main idea of each paper.
Based in New York, Paper Digest is dedicated to helping people generate content and reason over unstructured data. Unlike black-box approaches, we build deep models on semantics, which allows results to be produced with explanations. Such models power this website and are behind our services, including “search engine”, “summarization”, “question answering”, and “literature review”.
If you do not want to miss interesting academic papers, you are welcome to sign up for our daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to stay updated on new conference digests.
Paper Digest Team
New York City, New York, 10017
team@paperdigest.org
TABLE 1: Paper Digest: ICASSP 2023 Highlights
# | Paper | Author(s) |
---|---|---|
1 | 2DSBG: A 2D Semi Bi-Gaussian Filter Adapted for Adjacent and Multi-Scale Line Feature Detection. Highlight: In this paper, a new filter composed of a bi-Gaussian and a semi-Gaussian kernel is proposed, capable of highlighting complex linear structures such as ridges and valleys of different widths, with noise robustness. | B. Magnier; G. S. Shokouh; L. Berthier; M. Pie; A. Ruggiero |
2 | 3D Audio Signal Processing Systems for Speech Enhancement and Sound Localization and Detection. Highlight: In this paper, we propose a two-stage system based on DPRNN and UNet for the SE task and a Conformer-based system for the SELD task. | J. Bai; S. Huang; H. Yin; Y. Jia; M. Wang; J. Chen |
3 | 3D Point Cloud Completion Based on Multi-Scale Degradation. Highlight: To explore unsupervised 3D point cloud completion methods that give attention to both, we propose a multi resolution completion net (MRC-Net) which introduces a multi-scale degradation (KM-mask) and multi-discriminator into GAN inversion paradigm. | J. Long; Q. Zhu; H. He; Z. Yu; Q. Zhang; Z. Zhang |
4 | 6G Integrated Sensing and Communication – Sensing Assisted Environmental Reconstruction and Communication. Highlight: We propose a multi transmission reception points (TRP) sensing architecture based on scatter polygon assumption to improve environment sensing accuracy. | Z. Zhou; X. Li; J. He; X. Bi; Y. Chen; G. Wang; P. Zhu |
5 | A2S-NAS: Asymmetric Spectral-Spatial Neural Architecture Search for Hyperspectral Image Classification. Highlight: Meanwhile, plenty of previous works ignore asymmetric spectral-spatial dimensions in HSI. To address the above issues, we propose a multi-stage search architecture in order to overcome asymmetric spectral-spatial dimensions and capture significant features. | L. Zhan; J. Fan; P. Ye; J. Cao |
6 | A 3D-Assisted Framework to Evaluate The Quality of Head Motion Replication By Reenactment Deepfake Generators. Highlight: In this article, we focus on the quality of head motion replication by deepfake generators that use a pilot video of a particular person to animate a single source image of another person. | S. Husseini; J.-L. Dugelay; F. Aili; E. Nars |
7 | A3S: Adversarial Learning of Semantic Representations for Scene-Text Spotting. Highlight: Texts in natural scene images tend to not be a random string of characters but a meaningful string of characters, a word. Therefore, we propose adversarial learning of semantic representations for scene text spotting (A3S) to improve end-to-end accuracy, including text recognition. | M. Fujitake |
8 | A Bandit Online Convex Optimization Approach To Distributed Energy Management In Networked Systems. Highlight: In this work, we study the energy-sharing problem in a system consisting of several DERs. | I. Tsetis; X. Cheng; S. Maghsudi |
9 | A Bayesian Perspective for Determinant Minimization Based Robust Structured Matrix Factorization. Highlight: We show that the corresponding maximum a posteriori estimation problem boils down to the robust determinant minimization approach for structured matrix factorization, providing insights about parameter selections and potential algorithmic extensions. | G. Tatli; A. T. Erdogan |
10 | A Bayesian Perspective on Noise2Noise: Theory and Extensions. Highlight: This paper presents a Bayesian counter-piece to the original Noise2Noise formulation, with a fully stochastic treatment of the latent variable. | S. Miller; C. Karam; A. Idoughi; K. Kikuchi; K. Hirakawa |
11 | A Benchmark for Evaluating Robustness of Spoken Language Understanding Models in Slot Filling. Highlight: Our experiments and analysis reveal that all of the six SLU models have a significant performance degradation on NASE. | M. Peng; X. Jia; M. Peng |
12 | A Bidirectional Joint Model for Spoken Language Understanding. Highlight: In this paper, we propose a bidirectional joint model for SLU that explicitly incorporates intent information into slot filling and slot information into intent detection. | N. A. Tu; D. Xuan Hieu; T. M. Phuong; N. Xuan Bach |
13 | Absolute Decision Corrupts Absolutely: Conservative Online Speaker Diarisation. Highlight: Our focus lies in developing an online speaker diarisation framework which demonstrates robust performance across diverse domains. | Y. Kwon; H.-S. Heo; B.-J. Lee; Y. J. Kim; J.-W. Jung |
14 | Abstract Representation for Multi-Intent Spoken Language Understanding. Highlight: We propose in this study a new way to project annotation in an abstract structure with more compositional expressive power and a model to directly generate this abstract structure. | R. Abrougui; G. Damnati; J. Heinecke; F. Béchet |
15 | Abusive Activity Detection with Multi-Modality Based on Convolutional Neural Network. Highlight: However, it is difficult to detect because it has various forms and is not easy to define. Therefore, in this study, we try to detect using the Convolutional Neural Network (CNN). | J. Kim; H. Ahn; B. Yoo |
16 | A Causal Convolutional Approach for Packet Loss Concealment in Low Powered Devices. Highlight: This paper presents a deep learning model for audio Packet Loss Concealment (PLC) for real time communications that is accurate, lightweight, with a low inference time suitable for low powered mobile handsets. | S. Davy; N. Belton; J. Tobin; O. B. Zuber; L. Dong; Y. Xuewen |
17 | Accelerated Distributed Stochastic Non-Convex Optimization Over Time-Varying Directed Networks. Highlight: The network nodes, which can access only their local objectives and query a stochastic first-order oracle for the gradient estimates, collaborate by exchanging messages with their neighbors to minimize a global objective function. We propose an algorithm for non-convex optimization problems in such settings that leverages stochastic gradient descent with momentum and gradient tracking. | Y. Chen; A. Hashemi; H. Vikalo |
18 | Accelerated Massive MIMO Detector Based on Annealed Underdamped Langevin Dynamics. Highlight: We propose a multiple-input multiple-output (MIMO) detector based on an annealed version of the underdamped Langevin (stochastic) dynamic. | N. Zilberstein; C. Dick; R. Doost-Mohammady; A. Sabharwal; S. Segarra |
19 | Accelerating Matrix Trace Estimation By Aitken’s Δ² Process. Highlight: We present an algorithm to estimate the trace of symmetric matrices that are available only via Matrix-Vector multiplication. | V. Kalantzis; G. Kollias; S. Ubaru; T. Salonidis |
20 | Accelerating RNN-T Training and Inference Using CTC Guidance. Highlight: We propose a novel method to accelerate training and inference process of recurrent neural network transducer (RNN-T) based on the guidance from a co-trained connectionist temporal classification (CTC) model. | Y. Wang; Z. Chen; C. Zheng; Y. Zhang; W. Han; P. Haghani |
21 | Accidental Learners: Spoken Language Identification in Multilingual Self-Supervised Models. Highlight: In this paper, we extend previous self-supervised approaches for language identification by experimenting with Conformer based architecture in a multilingual pre-training paradigm. | T. M. Bartley; F. Jia; K. C. Puvvada; S. Kriman; B. Ginsburg |
22 | ACE-VC: Adaptive and Controllable Voice Conversion Using Explicitly Disentangled Self-Supervised Speech Representations. Highlight: In this work, we propose a zero-shot voice conversion method using speech representations trained with self-supervised learning. | S. Hussain; P. Neekhara; J. Huang; J. Li; B. Ginsburg |
23 | ACF: Aligned Contrastive Finetuning For Language and Vision Tasks. Highlight: We propose a novel aligned contrastive finetuning (ACF) approach in this work. | W. Zhu; P. Wang; X. Wang; Y. Ni; G. Xie |
24 | Achievable Error Exponents for Almost Fixed-Length M-Ary Hypothesis Testing. Highlight: We revisit multiple hypothesis testing and propose a two-phase test, where each phase is a fixed-length test and the second-phase proceeds only if a reject option is decided in the first phase. | J. Diao; L. Zhou; L. Bai |
25 | Achieving Fair Speech Emotion Recognition Via Perceptual Fairness. Highlight: In this work, we proposed a two-stage framework, which produces debiased representations by using a fairness constraint adversarial framework in the first stage. | W.-S. Chien; C.-C. Lee |
26 | A Closer Look At Scoring Functions And Generalization Prediction. Highlight: However, GEPs often utilize disparate mechanisms (e.g., regressors, thresholding functions, calibration datasets, etc), to derive such error estimates, which can obfuscate the benefits of a particular scoring function. Therefore, in this work, we rigorously study the effectiveness of popular scoring functions (confidence, local manifold smoothness, model agreement), independent of mechanism choice. | P. Trivedi; D. Koutra; J. J. Thiagarajan |
27 | A Comparison of Semi-Supervised Learning Techniques for Streaming ASR at Scale. Highlight: In this work, we compare three state-of-the-art semi-supervised methods encompassing both unpaired text and audio as well as several of their combinations in a controlled setting using joint training. | C. Peyser; M. Picheny; K. Cho; R. Prabhavalkar; W. R. Huang; T. N. Sainath |
28 | A Compensated Shrinkage Affine Projection Algorithm for Debiased Sparse Adaptive Filtering. Highlight: In this paper, we propose a novel sparse adaptive filtering algorithm termed compensated shrinkage affine projection algorithm (CS-APA). | Y. Zhang; I. Yamada |
29 | A Comprehensive Comparison of Projections in Omnidirectional Super-Resolution. Highlight: In this paper, we find that different projection methods have great impact on the performance of DNNs. | H. Pi; S. Tian; M. Lu; J. Liu; Y. Guo; S. Zhang |
30 | A Computationally Efficient Algorithm for Distributed Adaptive Signal Fusion Based on Fractional Programs. Highlight: In this paper, we focus on Dinkelbach’s iterative procedure to solve fractional programs, i.e., problems of which the objective function is a ratio of two continuous functions. | C. A. Musluoglu; A. Bertrand |
31 | A Content Adaptive Learnable Time-Frequency Representation for Audio Signal Processing. Highlight: In this work, we propose a way of computing a content-adaptive learnable time-frequency representation. | P. Verma; C. Chafe |
32 | A Content-Based Multi-Scale Network for Single Image Super-Resolution. Highlight: A novel content-based multi-scale network (CMNet) is proposed in this paper for conducting single image super-resolution (SISR). | J. Ji; B. Zhong; K.-K. Mu |
33 | A Context-Aware Computational Approach for Measuring Vocal Entrainment in Dyadic Conversations. Highlight: We propose a context-aware approach for measuring vocal entrainment in dyadic conversations. | R. Lahiri; M. Nasir; C. Lord; S. H. Kim; S. Narayanan |
34 | A Contrastive Embedding-Based Domain Adaptation Method for Lung Sound Recognition in Children Community-Acquired Pneumonia. Highlight: The data scarcity will further exacerbate this problem. Therefore, we propose a contrastive embedding-based domain adaptation network (CEDANN) to eliminate individual differences and alleviate data scarcity for improving the generalization ability. | D. Huang; L. Wang; H. Lu; W. Wang |
35 | A Contrastive Framework to Enhance Unsupervised Sentence Representation Learning. Highlight: However, these models often suffer from semantic monotonicity, sampling bias, and training effect dependent on batch size. In order to solve these problems, this paper proposes a contrastive framework (CEUR) to enhance unsupervised sentence representation learning. | H. Ma; Z. Li; H. Guo |
36 | A Contrastive Knowledge Transfer Framework for Model Compression and Transfer Learning. Highlight: However, these works overlook the high-dimension structural knowledge from the intermediate representations of the teacher, which leads to limited effectiveness, and they are motivated by various heuristic intuitions, which makes it difficult to generalize. This paper proposes a novel Contrastive Knowledge Transfer Framework (CKTF), which enables the transfer of sufficient structural knowledge from the teacher to the student by optimizing multiple contrastive objectives across the intermediate representations between them. | K. Zhao; Y. Chen; M. Zhao |
37 | A Controllable Lifestyle Simulator for Use in Deep Reinforcement Learning Algorithms. Highlight: In this paper, we present a novel and highly generalizable simulation system based on state machines associated with probabilistic transitions to simulate the user’s lifestyle. | L. G. Braz; A. Susaiyah |
38 | Acoustically-Driven Phoneme Removal That Preserves Vocal Affect Cues. Highlight: In this paper, we propose a method for removing linguistic information from speech for the purpose of isolating paralinguistic indicators of affect. | C. Noufi; J. Berger; K. J. Parker; D. L. Bowling |
39 | Acoustic Source Localization in The Spherical Harmonics Domain Exploiting Low-Rank Approximations. Highlight: In this paper, we propose a simple yet effective method to localize prominent acoustic sources in adverse acoustic scenarios. | M. Cobos; M. Pezzoli; F. Antonacci; A. Sarti |
40 | A Critical Look at Recent Trends in Compression of Channel State Information. Highlight: In this paper, we challenge the current view on state-of-the-art deep learning-based methods for compressing wireless channel state information and show that traditional methods can be highly competitive on commonly used open-source benchmarks. | M. V. Örnhag; S. Adalbjörnsson; P. Güler; M. Mahdavi |
41 | Active Beam Tracking with Reconfigurable Intelligent Surface. Highlight: This is an active sensing problem which is analytically intractable. This paper proposes a deep learning framework to solve this problem. | H. Han; T. Jiang; W. Yu |
42 | Active IRS-Assisted MIMO Channel Estimation and Prediction. Highlight: Our objective is to estimate and predict the user-IRS channels by exploiting a small number of sparsely distributed active elements with a low pilot overhead. | M. A. Haider; S. R. Pavel; Y. D. Zhang; E. Aboutanios |
43 | Active Learning for Efficient Few-Shot Classification. Highlight: We introduce the problem of Active Few-Shot Classification (AFSC) where the objective is to classify a small, initially unlabeled, dataset given a very restrained labeling budget. | A. Abdali; V. Gripon; L. Drumetz; B. Boguslawski |
44 | Active Learning of Non-Semantic Speech Tasks with Pretrained Models. Highlight: In this work, we propose ALOE, a novel system for improving the data- and label-efficiency of non-semantic speech tasks with active learning (AL). | H. Lee; A. Saeed; A. L. Bertozzi |
45 | Active Noise Control Over 3D Space: A Realistic Error Microphone Geometry Design. Highlight: In this paper, we optimize the aforementioned system in terms of the error microphone geometry. | H. Sun; P. Samarasinghe; T. Abhayapala |
46 | Active Perception System for Enhanced Visual Signal Recovery Using Deep Reinforcement Learning. Highlight: However, these models can only recognize and predict segmentation masks with great accuracy when RGB data have sufficient information about the objects of interest. In this paper, we suggest an intelligent, active perception system that can adjust its 3D position to improve signal acquisition. | G. Chaudhary; L. Behera; T. Sandhan |
47 | Active Selection of Source Patients in Transfer Learning for Epileptic Seizure Detection Using Riemannian Manifold. Highlight: Thus, we introduced an active learning based training data selection and modification method with a Riemannian geometry, centroid alignment, tangent space mapping and a support vector machine classifier. | T. Orihara; K. M. Hassan; T. Tanaka |
48 | Active Subsampling Using Deep Generative Models By Maximizing Expected Information Gain. Highlight: We introduce an adaptive, fully probabilistic pipeline for optimized signal subsampling in sampling-budget constrained systems. | K. C. E. van de Camp; H. Joudeh; D. J. Antunes; R. J. G. van Sloun |
49 | Activity-Informed Industrial Audio Anomaly Detection Via Source Separation. Highlight: This is particularly challenging since the interfering sounds are virtually indistinguishable from the target machine without additional information. To overcome these challenges, we fully exploit the information of machine activity or control that is easy to obtain in the industrial environment, and propose a framework of source separation (SS) followed by anomaly detection (AD), so called SSAD. | J. Kim; Y. Lee; H. M. Cho; D. W. Kim; C. H. Song; J. Ok |
50 | AdapITN: A Fast, Reliable, and Dynamic Adaptive Inverse Text Normalization. Highlight: In this study, we introduce a novel end2end model that can handle both semiotic phrases (SEP) and phonetization phrases (PHP), named AdapITN. | T.-B. Nguyen; L. D. M. Nhat; Q. M. Nguyen; Q. T. Do; C. M. Luong; A. Waibel |
51 | Adaptable End-to-End ASR Models Using Replaceable Internal LMs and Residual Softmax. Highlight: However, it still suffers from domain shifts from training to testing, and domain adaptation is still challenging. To alleviate this problem, this paper designs a replaceable internal language model (RILM) method, which makes it feasible to directly replace the internal language model (LM) of E2E ASR models with a target-domain LM in the decoding stage when a domain shift is encountered. | K. Deng; P. C. Woodland |
52 | Adapted Multimodal BERT with Layer-Wise Fusion for Sentiment Analysis. Highlight: In this work, we propose Adapted Multimodal BERT (AMB), a BERT-based architecture for multimodal tasks that uses a combination of adapter modules and intermediate fusion layers. | O. S. Chlapanis; G. Paraskevopoulos; A. Potamianos |
53 | Adapter Tuning With Task-Aware Attention Mechanism. Highlight: Thus, this paper proposes the task-aware attention mechanism (TAM) to enhance adapter tuning. | J. Lu; F. Jin; J. Zhang |
54 | Adapting A Self-Supervised Speech Representation for Noisy Speech Emotion Recognition By Using Contrastive Teacher-Student Learning. Highlight: For adaptation, it is essential to balance between acquiring new knowledge from noisy speech and keeping the previous knowledge acquired during the pre-training and fine-tuning of the model. Therefore, we propose a contrastive teacher-student learning framework to retrain a self-supervised speech representation model for noisy SER. | S.-G. Leem; D. Fulford; J.-P. Onnela; D. Gard; C. Busso |
55 | Adapting Exploratory Behaviour in Active Inference for Autonomous Driving. Highlight: In this paper, we integrate the imitation learning method with active inference to minimize the expected free energy under the supervision of an expert model. | S. Nozari; A. Krayani; P. Marin; L. Marcenaro; D. Martin; C. Regazzoni |
56 | Adapting Self-Supervised Models to Multi-Talker Speech Recognition Using Speaker Embeddings. Highlight: In this paper, we investigate the adaptation of upstream SSL models to the multi-talker automatic speech recognition (ASR) task under two conditions. | Z. Huang; D. Raj; P. García; S. Khudanpur |
57 | Adaptive Axonal Delays in Feedforward Spiking Neural Networks for Accurate Spoken Word Recognition. Highlight: In this work, we consider a learnable axonal delay capped at a maximum value, which can be adapted according to the axonal delay distribution in each network layer. | P. Sun; E. Eqlimi; Y. Chua; P. Devos; D. Botteldooren |
58 | Adaptive CSI Feedback with Hidden Semantic Information Transfer. Highlight: In this paper, we propose a deep-learning-empowered adaptive CSI feedback compression and quantization based on the information-bottleneck principle, where the sensory data transmission is hidden within the CSI feedback to eliminate extra communication cost and preserve the data privacy at the same time. | J. Cao; L. Lian; Y. Mao; B. Clerckx |
59 | Adaptive Data Augmentation for Contrastive Learning. Highlight: In this work, we propose AdDA, which implements a closed-loop feedback structure to a generic contrastive learning network. | Y. Zhang; H. Zhu; S. Yu |
60 | Adaptive ECCM for Mitigating Smart Jammers. Highlight: This paper considers adaptive radar electronic counter-counter measures (ECCM) to mitigate ECM by an adversarial jammer. | S. Jain; K. Pattanayak; V. Krishnamurthy; C. Berry |
61 | Adaptive Endpointing with Deep Contextual Multi-Armed Bandits. Highlight: In this paper, we aim to provide a solution for adaptive endpointing by proposing an efficient method for choosing an optimal endpointing configuration given utterance-level audio features in an online setting, while avoiding hyperparameter grid-search. | D. J. Min; A. Stolcke; A. Raju; C. Vaz; D. He; V. Ravichandran; V. A. Trinh |
62 | Adaptive Filtering Algorithms For Set-Valued Observations-Symmetric Measurement Approach To Unlabeled And Anonymized Data. Highlight: By using symmetric polynomials, we formulate a symmetric measurement equation that maps the observation set to a unique vector. | V. Krishnamurthy |
63 | Adaptive Gaussian Nested Filter for Parameter Estimation and State Tracking in Dynamical Systems. Highlight: We introduce the adaptive Gaussian nested filter (AGNesF), the first nested method that adapts the number of samples to estimate both the static parameters and the dynamical variables of a state-space model. | S. Pérez-Vieites; V. Elvira |
64 | Adaptive Knowledge Distillation Between Text and Speech Pre-Trained Models. Highlight: Since the semantic and granularity gap between text and speech has been omitted in literature, which impairs the distillation, we propose the Prior-informed Adaptive knowledge Distillation (PAD) that adaptively leverages text/speech units of variable granularity and prior distributions to achieve better global and local alignments between text and speech pre-trained models. | J. Ni; Y. Ma; W. Wang; Q. Chen; D. Ng; H. Lei; T. H. Nguyen; C. Zhang; B. Ma; E. Cambria |
65 | Adaptive Large Margin Fine-Tuning For Robust Speaker Verification. Highlight: In our experiments, we also find that LMFT fails in short duration and other verification scenarios. To solve this problem, we propose the duration-based and similarity-based adaptive large margin fine-tuning (ALMFT) strategy. | L. Zhang; Z. Chen; Y. Qian |
66 | Adaptive Mask Co-Optimization for Modal Dependence in Multimodal Learning. Highlight: However, multimodal models may incline to rely on some modalities that are easier to be learned, while under-fit the other modalities and lead to sub-optimal results. To address this problem, we propose a novel plug-in module, Adaptive Mask Co-optimization (AMCo), which could be inserted into advanced models. | Y. Zhou; X. Liang; S. Zheng; H. Xuan; T. Kumada |
67 | Adaptive Multi-Corpora Language Model Training for Speech Recognition. Highlight: In this paper, we introduce a novel adaptive multi-corpora training algorithm that dynamically learns and adjusts the sampling probability of each corpus along the training process. | Y. Ma; Z. Liu; X. Zhang |
68 | Adaptive Noise Canceller Algorithm with SNR-Based Stepsize and Data-Dependent Averaging. Highlight: This paper proposes an adaptive noise canceller algorithm with an SNR-based stepsize and data-dependent averaging. | A. Sugiyama |
69 | Adaptive Non-Local Generative Adversarial Networks for Low-Dose CT Image Denoising. Highlight: For different input images, conventional neural networks always adopt a fixed number of channels which limits the performance of deep networks. To address these problems, we propose a channel-adaptive convolution and patch selection (CAPS) module to enhance the feature extraction of our network. | L. Yang; H. Liu; F. Shang; Y. Liu |
70 | Adaptive Scale and Spatial Aggregation for Real-Time Object Detection. Highlight: The accuracy of detection may be limited by their insufficient capabilities to obtain powerful feature representation, which is a notoriously onerous task in machine vision applications. Aiming at this problem, this study proposes a method of adaptive aggregation of features at both scale and spatial levels in an anchor-free framework: 1) at the scale level, a Multi-scale Point Feature Fusion (MPFF) module has been proposed to fuse point features from multiple scales via a self-adaptive re-weighting manner; 2) at the spatial level, a Restrained Deformable Convolution (R-DCN) has been designed to focus on the most informative features in a pre-defined region while avoiding the remote feature distraction. | W. Chen; Y. He; Z. Liang; Y. Guo |
71 | Adaptive Semantic Fusion Framework for Unsupervised Monocular Depth Estimation. Highlight: Nevertheless, numerous existing methods relying on photometric consistency are excessively susceptible to variations in illumination and suffer in the regions with strong reflection. To overcome this limitation, we propose a novel unsupervised depth estimation framework named ColorDepth, which forces the model to explore object semantic to infer depth. | R. Li; H. Yu; K. Du; Z. Xiao; B. Yan; Z. Yuan |
72 | Adaptive Simulated Annealing Through Alternating Rényi Divergence Minimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose here a new simulated annealing algorithm with adaptive cooling schedule, which draws samples from variational approximations of the Boltzmann distributions. |
T. Guilmeau; E. Chouzenoux; V. Elvira; |
73 | Adaptive Step-Size Methods for Compressed SGD Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In particular, we introduce a scaling technique for the descent step, which we use to establish order-optimal convergence rates for convex-smooth and strong convex-smooth objectives under an interpolation condition, and for non-convex objectives under a strong growth condition. |
A. M. Subramaniam; A. Magesh; V. V. Veeravalli; |
74 | Adaptive Submanifold-Preserving Sparse Regression for Feature Selection And Multiclass Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel embedded feature selection method, which is able to select the informative and discriminative features with the underlying submanifolds of data in intra-class being well preserved so as to improve the classification performance. |
R. Xu; X. Liang; |
75 | Adaptive Time-Scale Modification for Improving Speech Intelligibility Based On Phoneme Clustering For Streaming Services Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes an adaptive time-scale modification algorithm (ATSM) that adaptively varies the speaking rate for each phoneme cluster of speech to improve speech intelligibility. |
S. Jang; J. Kim; Y. -J. Kim; J. -H. Chang; |
76 | A Database for Multi-Modal Short Video Quality Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we establish a novel database dubbed MMSVD-Douyin for assessing multi-modal short video quality under consideration of three evaluation criteria. |
Y. Zhang; C. Wang; S. Zhang; X. Cao; |
77 | A Dataset for Audio-Visual Sound Event Detection in Movies Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present a dataset of audio events called Subtitle-Aligned Movie Sounds (SAM-S). |
R. Hebbar; D. Bose; K. Somandepalli; V. Vijai; S. Narayanan; |
78 | A Deep Disentangled Approach for Interpretable Hyperspectral Unmixing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a physically interpretable deep learning method for hyperspectral unmixing accounting for nonlinearity and the variability of the endmembers. |
R. A. Borsoi; T. Imbiriba; D. Erdoğmuş;
79 | A Deep Fusion Rule for Infrared and Visible Image Fusion: Feature Communication for Importance Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing fusion rules may not extract the most useful information and cannot effectively retain important information. To solve this problem, we propose a novel deep learning-based fusion rule. |
X. Lv; J. Cheng; G. Lv; Z. Wei; |
80 | A Deep Temporal Factor Analysis Method for Large Scale Financial Portfolio Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a neural network temporal factor analysis (NN-TFA) model for dimensionality reduction and it enables us to build a scalable deep reinforcement learning method for large-scale portfolio management. |
Y. Zhou; R. Su; S. Tu; L. Xu; |
81 | ADHD Classification with Biomarker Identification Using A Triplet Loss Attention Auto-Encoding Network Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we proposed an attention auto-encoding network with triplet loss (Tri-Att-AENet) for both ADHD classification and biomarker identification. |
Y. Tang; Y. Chen; Y. Gao; A. Jiang; L. Zhou; |
82 | A Discriminative Multi-Channel Noise Feature Representation Model for Image Manipulation Localization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore the ability of different noise feature modules to localize different manipulation types. |
Y. Zhou; H. Wang; Q. Zeng; R. Zhang; S. Meng; |
83 | A Distributed Adaptive Algorithm for Non-Smooth Spatial Filtering Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the DASF algorithm has only been shown to converge for filtering problems that can be expressed as smooth optimization problems. In this paper, we explore an extension of the DASF algorithm to a family of non-smooth spatial filtering problems, allowing the addition of non-smooth regularizers to the optimization problem, which could for example be used to perform node selection, and eliminate nodes not contributing to the filter objective, therefore further reducing communication costs. |
C. Hovine; A. Bertrand; |
84 | A DNN-Based Hearing-Aid Strategy For Real-Time Processing: One Size Fits All Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we present a deep-neural-network (DNN) HA processing strategy that can provide individualised sound processing for the audiogram of a listener using a single model architecture. |
F. Drakopoulos; A. Van Den Broucke; S. Verhulst; |
85 | A DNN Based Normalized Time-Frequency Weighted Criterion for Robust Wideband DoA Estimation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To improve the robustness against interference, we propose a DNN based normalized time-frequency (T-F) weighted criterion which minimizes the distance between the candidate steering vectors and the filtered snapshots in the T-F domain. |
K. -L. Chen; C. -H. Lee; B. D. Rao; H. Garudadri; |
86 | A Dual-Branch Adaptive Distribution Fusion Framework for Real-World Facial Expression Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we address the FER task via the label distribution learning paradigm, and develop a dual-branch Adaptive Distribution Fusion (AdaDF) framework. |
S. Liu; Y. Xu; T. Wan; X. Kui; |
87 | A Dual-Path Transformer Network for Scene Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose DPTNet (Dual-Path Transformer Network), a simple yet effective network to utilize both global and local information for the scene text detection task. |
J. Lin; Y. Yan; H. Wang; |
88 | Advancing The Dimensionality Reduction of Speaker Embeddings for Speaker Diarisation: Disentangling Noise and Informing Speech Activity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The objective of this work is to train noise-robust speaker embeddings adapted for speaker diarisation. |
Y. J. Kim; H. -S. Heo; J. -W. Jung; Y. Kwon; B. -J. Lee; J. S. Chung; |
89 | Adversarial Attacks on Genotype Sequences Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Such studies commonly include steps involving the analysis of the genomic sequences’ structure using dimensionality reduction techniques and ancestry inference methods. In this paper we show how white-box gradient-based adversarial attacks can be used to corrupt the output of genomic analyses, and we explore different machine learning techniques to detect such manipulations. |
D. M. Montserrat; A. G. Ioannidis; |
90 | Adversarial Contrastive Distillation with Adaptive Denoising Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose a novel structured ARD method called Contrastive Relationship DeNoise Distillation (CRDND). |
Y. Wang; Z. Chen; D. Yang; Y. Liu; S. Liu; W. Zhang; L. Qi; |
91 | Adversarial Data Augmentation Using VAE-GAN for Disordered Speech Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents novel variational auto-encoder generative adversarial network (VAE-GAN) based personalized disordered speech augmentation approaches that simultaneously learn to encode, generate and discriminate synthesized impaired speech. |
Z. Jin; X. Xie; M. Geng; T. Wang; S. Hu; J. Deng; G. Li; X. Liu; |
92 | Adversarial Guitar Amplifier Modelling with Unpaired Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose an audio effects processing framework that learns to emulate a target electric guitar tone from a recording. |
A. Wright; V. Välimäki; L. Juvela; |
93 | Adversarially Robust Fairness-Aware Regression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using a minimax framework, in this paper, we aim to design an adversarially robust fair regression model that achieves optimal performance in the presence of an attacker who is able to perform a rank-one attack on the dataset. |
Y. Jin; L. Lai; |
94 | Adversarial Network Pruning By Filter Robustness Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Previous studies maintain the robustness of the pruned networks by combining adversarial training and network pruning but ignore preserving the robustness at a high sparsity ratio in structured pruning. To address such a problem, we propose an effective filter importance criterion, Filter Robustness Estimation (FRE), to evaluate the importance of filters by estimating their contribution to the adversarial training loss. |
X. Zhuang; Y. Ge; B. Zheng; Q. Wang; |
95 | Adversarial Permutation Invariant Training for Universal Sound Separation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we complement PIT with adversarial losses but find it challenging with the standard formulation used in speech source separation. |
E. Postolache; J. Pons; S. Pascual; J. Serrà; |
96 | A Dynamic Cross-Scale Transformer with Dual-Compound Representation for 3D Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, single-scale attention fails to achieve a balance between feature representation and semantic information. Aiming at the above problems, we propose a window-based dynamic cross-scale cross-attention transformer (DCS-Former) for precise representation of the diversity features. |
R. Zhang; Z. Wang; Z. Wang; J. Xin; |
97 | A Dynamic Graph Interactive Framework with Label-Semantic Injection for Spoken Language Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, we propose a novel approach to construct the interactive graph based on the injection of label semantics, which can automatically update the graph to better alleviate error propagation. |
Z. Zhu; W. Xu; X. Cheng; T. Song; Y. Zou; |
98 | AD-YOLO: You Look Only Once in Training Multiple Sound Event Localization and Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address the issue, we proposed an angular-distance-based multiple SELD (AD-YOLO), which is an adaptation of the You Look Only Once algorithm for SELD. |
J. S. Kim; H. Joon Park; W. Shin; S. W. Han; |
99 | AE-Flow: Autoencoder Normalizing Flow Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce supervision to the training process of normalizing flows, without the need for parallel data. |
J. Mosiński; P. Biliński; T. Merritt; A. Ezzerg; D. Korzekwa; |
100 | AERO: Audio Super Resolution in The Spectral Domain Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present AERO, an audio super-resolution model that processes speech and music signals in the spectral domain. |
M. Mandel; O. Tal; Y. Adi; |
101 | A Fast and Accurate Pitch Estimation Algorithm Based on The Pseudo Wigner-Ville Distribution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we capitalize on the high time and frequency resolution of the pseudo Wigner-Ville distribution (PWVD) and propose a new PWVD-based pitch estimation method. |
Y. Liu; P. Wu; A. W. Black; G. K. Anumanchipalli; |
102 | A Few Shot Learning of Singing Technique Conversion Based on Cycle Consistency Generative Adversarial Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We evaluate the proposed methods on three datasets that were commonly used in pop songs which involve singing techniques in terms of breathy voice, vibrato, and vocal fry. |
P. -W. Chen; V. -W. Soo; |
103 | Affinity Learning With Blind-Spot Self-Supervision for Image Denoising Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we extend the blind-spot based self-supervised denoising by using affinity learning to remove noise from affected pixels. |
Y. Zhou; L. Zhou; I. H. Laradji; T. Lun Lam; Y. Xu; |
104 | A Flow-Guided Non-Local Alignment Network for Video Compressive Sensing Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we develop a flow-guided non-local alignment network (FNLAN), which can build accurate temporal dependencies among adjacent frames to help video recovery. |
C. Zhou; C. Chen; D. Zhang; |
105 | A Framework for Unified Real-Time Personalized and Non-Personalized Speech Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we present an approach to train a single speech enhancement network that can perform both personalized and non-personalized speech enhancement. |
Z. Wang; R. Giri; D. Shah; J. -M. Valin; M. M. Goodwin; P. Smaragdis; |
106 | A Frequency-Domain Recursive Least-Squares Adaptive Filtering Algorithm Based On A Kronecker Product Decomposition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a frequency-domain recursive least-squares (RLS) adaptive filtering algorithm for identifying time-varying acoustic systems in noisy environments. |
H. He; J. Chen; J. Benesty; Y. Yu; |
107 | A Frequency-Weighted Leaky FxLMS Algorithm with Application to Feedback Active Noise Control Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on the traditional leaky filtered-x least mean square (FxLMS) algorithm, a frequency-weighted leaky FxLMS algorithm is proposed in this paper, where the weight factors of the proposed algorithm are characterized in frequency-domain and can be calculated directly by solving a constrained optimization problem. |
Y. Tang; H. Zhang; |
108 | A Fusion-Based and Multi-Layer Method for Low Light Image Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a low light image enhancement algorithm using a fusion-based and multi-layer model. |
X. Zhou; J. Guo; H. Liu; C. Wang; |
109 | A Game of Snakes and GANs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we establish a connection between active contour models (snakes) and GANs. |
S. Asokan; F. S. Mohammed; C. Sekhar Seelamantula; |
110 | A Gaussian Latent Variable Model for Incomplete Mixed Type Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a Gaussian process framework that efficiently captures the information from mixed numerical and categorical data that effectively incorporates missing variables. |
M. Ajirak; P. M. Djurić; |
111 | A Generalized Subspace Distribution Adaptation Framework for Cross-Corpus Speech Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel transfer learning framework, named generalized subspace distribution adaptation (GSDA), to tackle the challenging cross-corpus speech emotion recognition problem. |
S. Li; P. Song; L. Ji; Y. Jin; W. Zheng; |
112 | A Geometric Surrogate for Simulation Calibration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we employ a machine learning-based approach to perform calibration faster and more accurately, with two components: a surrogate model of the simulation that is easy to obtain but not physically interpretable and a bridge model that maps the surrogate to the calibrated parameters. |
L. S. Souza; B. Batalo; K. Yamazaki; |
113 | Agile Radio Map Prediction Using Deep Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a runtime-efficient radio frequency (RF) map prediction method based on UNet convolutional neural networks (CNNs), trained on a large-scale 3D maps dataset. |
E. Krijestorac; H. Sallouha; S. Sarkar; D. Cabric; |
114 | A Graph Neural Network Multi-Task Learning-Based Approach for Detection and Localization of Cyberattacks in Smart Grids Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a multi-task learning-based approach that performs both tasks simultaneously using a graph neural network (GNN) with stacked convolutional Chebyshev graph layers. |
A. Takiddin; R. Atat; M. Ismail; K. Davis; E. Serpedin; |
115 | A Hierarchical Regression Chain Framework for Affective Vocal Burst Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a hierarchical framework, based on chain regression models, for affective recognition from VBs, that explicitly considers multiple relationships: (i) between emotional states and diverse cultures; (ii) between low-dimensional (arousal & valence) and high-dimensional (10 emotion classes) emotion spaces; and (iii) between various emotion classes within the high-dimensional space. |
J. Li; X. Wu; K. Song; D. Li; X. Liu; H. Meng; |
116 | A Highly Interpretable Deep Equilibrium Network for Hyperspectral Image Deconvolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, a novel technique for the hyperspectral image deconvolution problem is developed. |
A. Gkillas; D. Ampeliotis; K. Berberidis; |
117 | A Holistic Cascade System, Benchmark, and Human Evaluation Protocol for Expressive Speech-to-Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a holistic cascade system for expressive S2ST, combining multiple prosody transfer techniques previously considered only in isolation. |
W. -C. Huang; B. Peloquin; J. Kao; C. Wang; H. Gong; E. Salesky; Y. Adi; A. Lee; P. -J. Chen; |
118 | A Hybrid Deep Neural Network for Nonlinear Causality Analysis in Complex Industrial Control System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a novel neural causality analysis network with directed acyclic graph to locate the root cause for complex industrial systems. |
T. Feng; Q. Chen; Y. Shi; X. Lang; L. Xie; H. Su; |
119 | Aiding Speech Harmonic Recovery in DNN-Based Single Channel Noise Reduction Using Cepstral Excitation Manipulation (CEM) Components Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, inspired by previous work on speech harmonic enhancement using statistical methods, we present a loss function component we term cepstral excitation manipulation (CEM) loss, which is constructed based on the fundamental frequency-related cepstral coefficients. |
Y. Song; N. Madhu; |
120 | A Knowledge-Driven Vowel-Based Approach of Depression Classification from Speech Using Data Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel explainable machine learning (ML) model that identifies depression from speech, by modeling the temporal dependencies across utterances and utilizing the spectrotemporal information at the vowel level. |
K. Feng; T. Chaspari; |
121 | A Large-Scale Pretrained Deep Model for Phishing URL Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose PhishBERT, a veritable pretrained deep transformer network model for phishing URL detection. |
Y. Wang; W. Zhu; H. Xu; Z. Qin; K. Ren; W. Ma; |
122 | A Learnable Spatial Mapping for Decoding The Directional Focus of Auditory Attention Using EEG Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a learnable spatial mapping (LSM) mechanism to transform EEG channels into a 2D form, which can be combined with the spatial attention mechanism to better extract the inherent coherence among the electrodes. |
Y. Zhang; H. Ruan; Z. Yuan; H. Du; X. Gao; J. Lu; |
123 | Aleatoric Uncertainty Estimation of Overnight Sleep Statistics Through Posterior Sampling Using Conditional Normalizing Flows Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Instead of factorizing, we propose to jointly model the sequence of sleep stages, by introducing U-Flow, a conditional normalizing flow network. |
H. v. Gorp; M. M. van Gilst; P. Fonseca; S. Overeem; R. J. G. van Sloun; |
124 | Algebraic Convolutional Filters on Lie Group Algebras Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Taking an algebraic signal processing perspective, we propose a novel convolutional filter from the Lie group algebra directly, thereby removing the need to lift altogether. |
H. Kumar; A. Parada-Mayorga; A. Ribeiro; |
125 | A Lightweight Convolutional Neural Network Using Feature Filtering Module Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: At the same time, if a large number of channel connections are used to fuse the feature layer, the parameter quantity will increase dramatically. In this work, we propose a new network architecture with dense connection and feature filtering to tackle this problem. |
N. Jing; Y. Zhang; |
126 | A Lightweight Fourier Convolutional Attention Encoder for Multi-Channel Speech Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The spectral-spatial cues are crucial in beamforming weights estimation, however, many existing works fail to optimally predict the beamforming weights with an absence of adequate spectral-spatial information learning. To tackle this challenge, we propose a Fourier convolutional attention encoder (FCAE) to provide a global receptive field over the frequency axis and boost the learning of spectral contexts and cross-channel features. |
S. Sun; J. Jin; Z. Han; X. Xia; L. Chen; Y. Xiao; P. Ding; S. Song; R. Togneri; H. Zhang; |
127 | Alignment Entropy Regularization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we use entropy to measure a model’s uncertainty, i.e. how it chooses to distribute the probability mass over the set of allowed alignments. |
E. Variani; K. Wu; D. Rybach; C. Allauzen; M. Riley; |
128 | Align, Write, Re-Order: Explainable End-to-End Speech Translation Via Operation Sequence Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The black-box nature of end-to-end speech-to-text translation (E2E ST) makes it difficult to understand how source language inputs are being mapped to the target language. To solve this problem, we propose to simultaneously generate automatic speech recognition (ASR) and ST predictions such that each source language word is explicitly mapped to a target language word. |
M. Omachi; B. Yan; S. Dalmia; Y. Fujita; S. Watanabe; |
129 | A Low-Latency Deep Hierarchical Fusion Network for Fullband Acoustic Echo Cancellation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes our submission to the fourth Acoustic Echo Cancellation (AEC) Challenge, which is part of ICASSP 2023 Signal Processing Grand Challenge. |
H. Zhao; N. Li; R. Han; X. Zheng; C. Zhang; L. Guo; B. Yu; |
130 | A Low-Latency Hybrid Multi-Channel Speech Enhancement System For Hearing Aids Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper summarizes a hybrid multi-channel speech enhancement system for the ICASSP Signal Processing Grand Challenge: Clarity Challenge (Speech Enhancement for Hearing Aids) 2023. |
T. Lei; Z. Hou; Y. Hu; W. Yang; T. Sun; X. Rong; D. Wang; K. Chen; J. Lu; |
131 | Alternating Constrained Minimization Based Approximate Message Passing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we revisit the GAMP algorithm (as e.g. for sparse Bayesian learning (SBL)) by more rigorously applying an alternating constrained minimization strategy to an appropriately reparameterized LSL BFE. |
C. K. Thomas; D. Slock; |
132 | Alternating Phase Langevin Sampling with Implicit Denoiser Priors for Phase Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a way of leveraging the prior implicitly learned by a denoiser to solve phase retrieval problems by incorporating it in a classical alternating minimization framework. |
R. Agrawal; O. Leong; |
133 | A Magnetic Framelet-Based Convolutional Neural Network for Directed Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Framelet-MagNet, a magnetic framelet-based spectral GCNN for directed graphs (digraphs). |
L. Lin; J. Gao; |
134 | A Mathematical Model for Neuronal Activity and Brain Information Processing Capacity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce an information conservation law for regional brain activation, and establish a mathematical model to quantify the relationship between the information processing capacity, input storage capacity, the arrival rate of exogenous information, and the neuronal activity of a brain region—referred to as the brain information processing capacity (IPC) model. |
Y. Zheng; D. Zhu; J. Ren; T. Liu; K. Friston; T. Li; |
135 | AMC-Net: An Effective Network for Automatic Modulation Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the drawback, we propose a novel AMC-Net that improves recognition by denoising the input signal in the frequency domain while performing multi-scale and effective feature extraction. |
J. Zhang; T. Wang; Z. Feng; S. Yang; |
136 | A Memory-Free Evolving Bipolar Neural Network for Efficient Multi-Label Stream Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work proposes an Evolving Bipolar Network architecture called EBN-MSL consisting of two parallel layers trained in a maximum margin framework to learn efficiently in a continual multi-label learning scenario without utilizing any samples stored from previous tasks. |
S. Mishra; S. Sundaram; |
137 | A Meta-GNN Approach to Personalized Seizure Detection and Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a personalized seizure detection and classification framework that quickly adapts to a specific patient from limited seizure samples. |
A. Rahmani; A. Venkitaraman; P. Frossard; |
138 | A Method of Constructing and Automatically Labeling Radio Frequency Signal Training Dataset for UAV Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Since the UAV signal dataset cannot be directly applied to object detection, we propose a method using time-frequency domain filtering and automatic labeling to construct a large-scale time-frequency spectrogram dataset. |
C. Liu; R. Ma; Z. Si; M. Chi; |
139 | Amicable Aid: Perturbing Images to Improve Classification Performance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that by taking the opposite search direction of perturbation, an image can be modified to yield higher classification confidence and even a misclassified image can be made correctly classified. |
J. Kim; J. -H. Choi; S. Jang; J. -S. Lee; |
140 | A Model-Based Hearing Compensation Method Using A Self-Supervised Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a model-based hearing compensation method using a self-supervised framework with a given auditory model. |
Y. Niu; N. Li; X. Wu; J. Chen; |
141 | A Momentum Two-Gradient Direction Algorithm with Variable Step Size Applied to Solve Practical Output Constraint Issue for Active Noise Control Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a two-gradient direction ANC algorithm with a momentum factor to solve the saturation with faster convergence. |
X. Shen; D. Shi; Z. Luo; J. Ji; W. -S. Gan; |
142 | AMPose: Alternately Mixed Global-Local Attention Model for 3D Human Pose Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the single-frame method still needs to model the physically connected relations among joints because the feature representations transformed only by global relations via the Transformer neglect information on the human skeleton. To deal with this problem, we propose a novel method in which the Transformer encoder and GCN blocks are alternately stacked, namely AMPose, to combine the global and physically connected relations among joints towards HPE. |
H. Lin; Y. Chiu; P. Wu; |
143 | A Multi-Channel Aggregation Framework for Object Detection in Large-Scale SAR Image Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, multiple sets of slices of large-scale images are first sliced using slicers of various sizes. |
C. Yang; C. Zhang; Z. Fan; Z. Yu; Q. Sun; M. Dai; |
144 | A Multi-Modal Approach For Context-Aware Network Traffic Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a Multi-Modal Classification method named MTCM to systematically exploit the context for the classification task. |
B. Pang; Y. Fu; S. Ren; S. Shen; Y. Wang; Q. Liao; Y. Jia; |
145 | A Multi-Scale Feature Aggregation Based Lightweight Network for Audio-Visual Speech Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, most existing AVSE models are heavyweight in the sense of parameter count, which is inappropriate for the deployment and practical applications. In this paper, we therefore present a lightweight AVSE approach (called M3Net) by incorporating several multi-modality, multi-scale and multi-branch strategies. |
H. Xu; L. Wei; J. Zhang; J. Yang; Y. Wang; T. Gao; X. Fang; L. Dai; |
146 | A Multi-Signal Perception Network for Textile Composition Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a Multi-Signal Perception Network (MSPNet) for nondestructive textile composition identification, allowing the model to benefit from the advantages of multimodal data. |
B. Peng; L. He; D. Wu; M. Chi; J. Chen; |
147 | A Multi-Stage Hierarchical Relational Graph Neural Network for Multimodal Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a multi-stage hierarchical relational graph neural network (MHRG), catering to intra- and inter-modal dynamics learning with modality calibration. |
P. Gong; J. Liu; X. Zhang; X. Li; |
148 | A Multi-Stage Low-Latency Enhancement System for Hearing Aids Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce four major novelties: (1) a novel multi-stage system in both the magnitude and complex domains to better utilize phase information; (2) an asymmetric window pair to achieve higher frequency resolution with the 5ms latency constraint; (3) the integration of head rotation information and the mixture signals to achieve better enhancement; (4) a post-processing module that achieves higher hearing aid speech perception index (HASPI) scores with the hearing aid amplification stage provided by the baseline system. |
C. Ouyang; K. Fei; H. Zhou; C. Lu; L. Li; |
149 | A Multi-Stage Triple-Path Method For Speech Separation in Noisy and Reverberant Environments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In noisy and reverberant environments, the performance of deep learning-based speech separation methods drops dramatically because previous methods are not designed and optimized for such situations. To address this issue, we propose a multi-stage end-to-end learning method that decouples the difficult speech separation problem in noisy and reverberant environments into three sub-problems: speech denoising, separation, and de-reverberation. |
Z. Mu; X. Yang; X. Yang; W. Zhu; |
150 | A Mutual Implicit Sentiment Analysis Model with Bundle-Aware Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Different from them, the core idea of this paper is to form explicit-implicit bundles to ensure each batch has the two expressions, which does not rely on external resources. |
S. Cai; J. Yuan; L. Li; |
151 | An Adapter Based Multi-Label Pre-Training for Speech Separation and Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on HuBERT, this work investigates improving the SSL model for SS and SE. |
T. Wang; X. Chen; Z. Chen; S. Yu; W. Zhu; |
152 | An Adaptive DFE Using Light-Pattern-Protection Algorithm in 12 nm CMOS Technology Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This article proposes a novel light-pattern-protection (LPP) algorithm to achieve robustness. |
S. Xing; C. Lin; Y. Li; H. Wang; |
153 | An Adaptive Enhancement Method for Gastrointestinal Low-Light Images of Capsule Endoscope Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose an adaptive enhancement method for WCE images. |
P. Liu; Y. Wang; J. Yang; W. Li; |
154 | An Adaptive Plug-and-Play Network for Few-Shot Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The bottleneck is that deep networks and complex metrics tend to induce overfitting in FSL, making it difficult to further improve the performance. Towards this, we propose a plug-and-play model-adaptive resizer (MAR) and an adaptive similarity metric (ASM) without any other losses. |
H. Li; L. Li; Y. Huang; N. Li; Y. Zhang; |
155 | Analysing Diffusion-based Generative Approaches Versus Discriminative Approaches for Speech Restoration Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we systematically compare the performance of generative diffusion models and discriminative approaches on different speech restoration tasks. |
J. -M. Lemercier; J. Richter; S. Welker; T. Gerkmann; |
156 | Analysing Discrete Self Supervised Speech Representation For Spoken Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work profoundly analyzes discrete self-supervised speech representations (units) through the eyes of Generative Spoken Language Modeling (GSLM). Following the findings of such an analysis, we propose practical improvements to the discrete unit for the GSLM. |
A. Sicherman; Y. Adi; |
157 | Analysing The Masked Predictive Coding Training Criterion for Pre-Training A Speech Representation Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the impact of MPC loss on the type of information learnt at various layers in the HuBERT model, using nine probing tasks. |
H. Yadav; S. Sitaram; R. R. Shah; |
158 | Analysis and Re-Synthesis of Natural Cricket Sounds Assessing The Perceptual Relevance of Idiosyncratic Parameters Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes cricket sounds from a parametric point of view, characterizes their main temporal and spectral features, namely jitter, shimmer and frequency sweeps, and explains a re-synthesis process generating modified natural cricket sounds. |
M. Oliveira; V. Almeida; J. Silva; A. Ferreira; |
159 | Analysis and Transformation of Voice Level in Singing Voice Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a neural auto-encoder that transforms the musical dynamic in recordings of singing voice via changes in voice level. |
F. Bous; A. Roebel; |
160 | Analysis Of Noisy-Target Training For DNN-Based Speech Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we conduct various analyses to deepen our understanding of NyTT. |
T. Fujimura; T. Toda; |
161 | Analyzing Acoustic Word Embeddings from Pre-Trained Self-Supervised Speech Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study several pre-trained models and pooling methods for constructing AWEs with self-supervised representations. |
R. Sanabria; H. Tang; S. Goldwater; |
162 | An Analysis of Degenerating Speech Due to Progressive Dysarthria on ASR Performance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The aims of this study were to (1) analyze the change of performance of ASR over time in individuals with degrading speech, and (2) explore mitigation strategies to optimize recognition throughout disease progression. |
K. Tomanek; K. Seaver; P. -P. Jiang; R. Cave; L. Harrell; J. R. Green; |
163 | An Antispoofing Approach in Biometric Authentication System for A Smartcard Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To meet low-power constraints for smartcards, we propose a simple convolutional neural network-based architecture and dedicated hardware to handle the problem. |
H. -S. Lee; M. -K. Song; J. Lee; Y. Seong; D. Kim; K. Bae; S. Song; |
164 | An Application of Quantum Mechanics to Attention Methods in Computer Vision Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work proposes the quantum-state-based mapping (QSM) for machine learning. |
J. Zhang; Y. Luo; P. Cheng; Z. Li; H. Wu; K. Yu; W. An; J. Zhou; |
165 | An Approach to Ontological Learning from Weak Labels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Ontologies encompass a formal representation of knowledge through the definition of concepts or properties of a domain, and the relationships between those concepts. In this work, we seek to investigate whether using this ontological information will improve learning from weakly labeled data, which are easier to collect since they require only the presence or absence of an event to be known. |
A. Shah; L. Tang; P. H. Chou; Y. Y. Zheng; Z. Ge; B. Raj; |
166 | An ASR-Free Fluency Scoring Approach with Self-Supervised Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes a novel ASR-free approach for automatic fluency assessment using self-supervised learning (SSL). |
W. Liu; K. Fu; X. Tian; S. Shi; W. Li; Z. Ma; T. Lee; |
167 | An Asynchronous Updating Reinforcement Learning Framework for Task-Oriented Dialog System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The errors from DST might misguide the dialog policy, and the system action brings extra difficulties for the DST module. To alleviate this problem, we propose an Asynchronous Updating Reinforcement Learning framework (AURL) that updates the DST module and the DP module asynchronously under a cooperative setting. |
S. Zhang; Y. Hu; X. Wang; C. Yuan; |
168 | An Attention-Based Approach to Hierarchical Multi-Label Music Instrument Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For the effective joint training in the multi-label setting, we propose two methods to model the connection between fine- and coarse-level tags, where one uses rule-based grouped max-pooling, the other one uses the attention mechanism obtained in a data-driven manner. |
Z. Zhong; M. Hirano; K. Shimada; K. Tateishi; S. Takahashi; Y. Mitsufuji; |
169 | An Augmented Gaussian Sum Filter Through A Mixture Decomposition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a way of controlling the covariances of the underlying Gaussian mixture. |
K. Tsampourakis; V. Elvira; |
170 | An Auto-Encoder Based Method for Camera Fingerprint Compression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a new method to compress high-dimensional floating-point fingerprints to low-dimensional binary features to save storage while maintaining their representative abilities. |
K. Zhang; Z. Liu; J. Hu; S. Wang; |
171 | An Automotive Radar Dataset For Object Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a novel 77 GHz automotive radar dataset of static and moving objects. |
A. Shyam; K. Komalavally; M. Gautam; V. Kancharla; V. Gudisa; V. Patil; A. Balasubramanian; S. Channappayya; |
172 | Anchored Speech Recognition with Neural Transducers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate anchored speech recognition to make neural transducers robust to background speech. |
D. Raj; J. Jia; J. Mahadeokar; C. Wu; N. Moritz; X. Zhang; O. Kalinli; |
173 | Ancient Chinese Word Segmentation and Part-of-Speech Tagging Using Distant Supervision Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel augmentation method of ancient Chinese WSG and POS tagging data using distant supervision over parallel corpus. |
S. Feng; P. Li; |
174 | An Edge Alignment-Based Orientation Selection Method for Neutron Tomography Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an adaptive orientation selection method in which an MBIR reconstruction on previously-acquired measurements is used to define an objective function on orientations that balances a data-fitting term promoting edge alignment and a regularization term promoting orientation diversity. |
D. Yang; S. Tang; S. V. Venkatakrishnan; M. S. N. Chowdhury; Y. Zhang; H. Z. Bilheux; G. T. Buzzard; C. A. Bouman; |
175 | An Effective Anomalous Sound Detection Method Based on Representation Learning with Simulated Anomalies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an effective anomalous sound detection (ASD) method based on representation learning with simulated anomalies. |
H. Chen; Y. Song; Z. Zhuo; Y. Zhou; Y. -H. Li; H. Xue; I. McLoughlin; |
176 | An Efficient Beam-Sharing Algorithm for RIS-aided Simultaneous Wireless Information and Power Transfer Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an efficient beam-sharing algorithm for RIS-aided SWIPT systems. |
N. M. Tran; M. M. Amri; J. H. Park; D. I. Kim; K. W. Choi; |
177 | An Efficient Relay Selection Scheme for Relay-assisted HARQ Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Different from previous works, whether to participate in the transmission is determined by each RN itself in this work, thus reducing the overhead. |
W. Ding; M. Shikh-Bahaei; |
178 | An Empirical Study and Improvement for Speech Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Prior works mainly focus on exploiting advanced networks to model and fuse different modality information to facilitate performance, while neglecting the effect of different fusion strategies on emotion recognition. In this work, we consider a simple yet important problem: which way of fusing audio and text modality information is most helpful for this multimodal task. |
Z. Wu; Y. Lu; X. Dai; |
179 | An Empirical Study of Backdoor Attacks on Masked Auto Encoders Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, as a representation learning method, the backdoor pitfall of MAE, and its impact on downstream tasks, have not been fully investigated. In this paper, we use several common triggers to perform backdoor attacks on the pre-training phase of MAE and test them on downstream tasks. |
S. Zhuang; P. Xia; B. Li; |
180 | An Empirical Study on Speech Restoration Guided By Self-Supervised Speech Representation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper focuses on exploring the impact of self-supervised speech representation learning on the speech restoration task. |
J. Byun; Y. Ji; S. -W. Chung; S. Choe; M. -S. Choi; |
181 | An End-to-End Framework for Partial View-Aligned Clustering with Graph Structure Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel method to tackle it, termed An End-to-end Framework for Partial View-aligned Clustering with Graph structure (EGPVC). |
L. Zhao; Q. Xie; S. Wu; S. Ma; |
182 | An End-to-End Neural Network for Image-to-Audio Transformation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes an end-to-end (E2E) neural architecture for the audio rendering of small portions of display content on low resource personal computing devices. |
L. Chen; M. Deisher; M. Georges; |
183 | A Nested Ensemble Method to Bilevel Machine Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Such problems involve a nested relation between inner- and outer-level problems, which often have suboptimal solutions with poor generalization ability. To address this issue, this paper proposes an ensemble method tailored to bilevel learning. |
L. Chen; M. Abbas; T. Chen; |
184 | An Evaluation Platform to Scope Performance of Synthetic Environments in Autonomous Ground Vehicles Simulation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we present our Scoping Autonomous Vehicle Simulation (SAVeS) platform for benchmarking the performance of simulated environments for autonomous ground vehicle testing. |
X. Bai; L. Jiang; Y. Luo; A. Gupta; P. Kaveti; H. Singh; S. Ostadabbas; |
185 | A New Approach to Extract Fetal Electrocardiogram Using Affine Combination of Adaptive Filters Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a new method based on affine combinations of adaptive filters to extract FECG signals. |
Y. Xuan; X. Zhang; S. S. Li; Z. Shen; X. Xie; L. P. Garcia; R. Togneri; |
186 | A New Personalized Efficacy Atlas for Pallidal Deep Brain Stimulation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes to create a novel personalized efficacy atlas that warps functional scales and estimates activation volume by modeling electric field to characterize the link between electrode location and neurosurgical performance. |
X. Luo; |
187 | A New Probabilistic Distance Metric with Application in Gaussian Mixture Reduction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a new distance metric to compare two continuous probability density functions. |
A. Sajedi; Y. A. Lawryshyn; K. N. Plataniotis; |
188 | A New Semi-Supervised Classification Method Using A Supervised Autoencoder for Biomedical Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a new approach to solve semi-supervised classification tasks for biomedical applications, involving a supervised autoencoder network. |
C. Gille; F. Guyard; M. Barlaud; |
189 | An Experimental Study on Sound Event Localization and Detection Under Realistic Testing Conditions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study four data augmentation (DA) techniques and two model architectures on realistic data for sound event localization and detection (SELD). |
S. Niu; J. Du; Q. Wang; L. Chai; H. Wu; Z. Nian; L. Sun; Y. Fang; J. Pan; C. -H. Lee; |
190 | Angle-Of-Arrival Target Tracking Using A Mobile UAV In External Signal-Denied Environment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper focuses on the angle-of-arrival (AOA) target tracking problem using a mobile unmanned aerial vehicle (UAV) equipped with an AOA sensor to observe targets in an external signal-denied (no global positioning system or inertial navigation system aid) environment. |
B. Zhu; S. Xu; F. Rice; K. Doğançay; |
191 | Animal Re-Identification Algorithm for Posture Diversity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, a Multi-pose Feature Fusion Network (MPFNet) is proposed to improve the performance of the Re-ID. |
Z. He; J. Qian; D. Yan; C. Wang; Y. Xin; |
192 | An Implicit Gradient Method for Constrained Bilevel Problems Using Barrier Approximation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose algorithms for solving a class of Bilevel Optimization (BLO) problems, with applications in areas such as signal processing, networking and machine learning. |
I. Tsaknakis; P. Khanduri; M. Hong; |
193 | An Improved Optimal Transport Kernel Embedding Method with Gating Mechanism for Singing Voice Separation and Speaker Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Since essential features of a signal can be well reflected on its latent geometric structure of the feature distribution, a natural way to address SVS/SI is to extract the geometry-aware and distribution-related features of the target signal. To do this, this work introduces the concept of optimal transport (OT) to SVS/SI and proposes an improved optimal transport kernel embedding (iOTKE) to extract the target-distribution-related features. |
W. Yuan; Y. Bian; S. Wang; M. Unoki; W. Wang; |
194 | An Interpretable Model Using Evidence Information for Multi-Hop Question Answering Over Long Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To better use evidence information, we propose a loss function considering answer groups, which improves the reasoning ability of the reader in the Retriever-Reader architecture. |
Y. Chen; R. Liu; X. Liu; Y. Shi; G. Bai; |
195 | An Isotropy Analysis for Self-Supervised Acoustic Unit Embeddings on The Zero Resource Speech Challenge 2021 Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we propose using hidden-unit BERT (HuBERT) self-supervised representation learning, and we provide detailed analyses and comparisons of their isotropies of embedding space, which might influence performance. |
J. Chen; S. Sakti; |
196 | Anomalous Signal Detection for Cyber-Physical Systems Using Interpretable Causal Neural Network Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work proposes a novel time series anomalous signal detection model based on neural system identification and causal inference to track the dynamics of CPS in a dynamical state-space and avoid absorbing spurious correlation caused by confounding bias generated by system noise, which improves the stability, security and interpretability in detection of anomalous signals from CPS. |
S. Zhang; J. Liu; |
197 | Anomalous Sound Detection Using Audio Representation with Machine ID Based Contrastive Learning Pretraining Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper uses contrastive learning to refine audio representations for each machine ID, rather than for each audio sample. |
J. Guan; F. Xiao; Y. Liu; Q. Zhu; W. Wang; |
198 | Anomaly Detection in Optical Spectra Via Joint Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a method based on a joint optimization procedure for estimating the major trends that characterize the spectrum, enabling the detection of anomalies even in the presence of few channels and heavy distortions. |
A. M. Rizzo; L. Magri; P. Invernizzi; E. Sozio; S. Piciaccia; A. Tanzi; S. Binetti; C. Alippi; G. Boracchi; |
199 | A Non-contact SpO2 Estimation Using Video Magnification and Infrared Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The Eulerian Video Magnification (EVM) technique was used to enhance the subtle differences in skin pixel intensity in the facial area. |
T. Stogiannopoulos; G. -A. Cheimariotis; N. Mitianoudis; |
200 | An Online Algorithm for Chance Constrained Resource Allocation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To model their uncertainties, we take chance constraints into consideration. |
Y. Chen; Z. Deng; Y. Zhou; Z. Chen; Y. Chen; H. Hu; |
201 | An Online Algorithm for Contrastive Principal Component Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we introduce a modified cPCA method, which we denote cPCA∗, that is more interpretable and less sensitive to the choice of hyper-parameter. |
S. Golkar; D. Lipshutz; T. Tesileanu; D. B. Chklovskii; |
202 | A Novel Approach Based on Voronoï Cells to Classify Spectrogram Zeros of Multicomponent Signals Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel approach to classify the spectrogram zeros (SZs) of multicomponent signals based on the analysis of the Voronoï cells associated with these zeros. |
N. Laurent; S. Meignen; M. A. Colominas; J. M. Miramont; F. Auger; |
203 | A Novel Cross-Component Context Model for End-to-End Wavelet Image Coding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore a promising alternative approach for neural compression, with an autoencoder whose latent space represents a nonlinear wavelet decomposition. |
A. Meyer; A. Kaup; |
204 | A Novel Efficient Multi-View Traffic-Related Object Detection Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Accordingly, we propose a novel traffic-related framework named CEVAS to achieve efficient object detection using multi-view video data. |
K. Yang; J. Liu; D. Yang; H. Wang; P. Sun; Y. Zhang; Y. Liu; L. Song; |
205 | A Novel Extrapolation Technique to Accelerate WMMSE Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a novel extrapolation technique to further accelerate WMMSE. |
K. Zhou; Z. Chen; G. Liu; Z. Chen; |
206 | A Novel Heart Rate Estimation Method Exploiting Heartbeat Second Harmonic Reconstruction Via Millimeter Wave Radar Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: At present, the interference of the second and third harmonics of respiration has become a significant problem that hinders further improvement of heart rate estimation accuracy. To handle this problem, we propose a novel method to estimate heart rate based on reconstructing the heartbeat second harmonic. |
T. Li; H. Shou; Y. Deng; Y. Zhou; C. Shi; P. Chen; |
207 | A Novel Metric For Evaluating Audio Caption Similarity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel metric based on Text-to-Audio Grounding (TAG), to incorporate acoustic semantics. |
S. Bhosale; R. Chakraborty; S. K. Kopparapu; |
208 | A Novel Mode Selection-Based Fast Intra Prediction Algorithm for Spatial SHVC Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we have proposed a novel Mode Selection-Based Fast Intra Prediction algorithm for SSHVC. |
D. Wang; Y. Sun; W. Li; L. Xie; X. Lu; F. Dufaux; C. Zhu; |
209 | A Novel State Connection Strategy for Quantum Computing to Represent and Compress Digital Images Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new SCMFRQI (state connection modification FRQI) approach for further reducing the required bits by modifying the state connection using a reset gate rather than repeating the use of the same Toffoli gate connection as a reset gate. |
M. E. Haque; M. Paul; A. Ulhaq; T. Debnath; |
210 | A Novel Transformer-Based Pipeline for Lung Cytopathological Whole Slide Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel three-stage Transformer-based methodology for entire cytopathological whole slide image (WSI) classification. |
G. Li; Q. Liu; H. Liu; Y. Liang; |
211 | Antenna Impedance Estimation in Correlated Rayleigh Fading Channels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We formulate antenna impedance estimation in a classical estimation framework under correlated Rayleigh fading channels. |
S. Wu; B. L. Hughes; |
212 | Any-to-Any Voice Conversion with F0 and Timbre Disentanglement and Novel Timbre Conditioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a PPG-based VC model that directly decodes waveforms. |
S. Kovela; R. Valle; A. Dantrey; B. Catanzaro; |
213 | A Parallel Attention Mechanism for Image Manipulation Detection and Localization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a parallel attention mechanism based network to localize tampered regions, which is inclined to have better generalization, while it possesses higher model capacity. |
Q. Zeng; H. Wang; Y. Zhou; R. Zhang; S. Meng; |
214 | A Patient Invariant Model Towards The Prediction of Freezing of Gait Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel algorithm to predict the onset of FoG using a single ankle accelerometer sensor. |
N. Ahmed; S. Singhal; A. Sinha; A. Ghose; |
215 | A Perceptual Neural Audio Coder with A Mean-Scale Hyperprior Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes an end-to-end neural audio coder based on a mean-scale hyperprior model together with a perceptual optimization using a psychoacoustic model (PAM)-based loss function. |
J. Byun; S. Shin; Y. Park; J. Sung; S. Beack; |
216 | A Person Identification System for The ICASSP 2023 E-Prevention Challenge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes the SRCB-LUL team’s person identification system submitted to track 1 of the ICASSP 2023 Person Identification and Relapse Detection from Continuous Recordings of Biosignals (e-Prevention) challenge, which aims to identify the wearer of the smartwatch. |
J. Wu; M. Tu; |
217 | A Perturbation-Based Policy Distillation Framework with Generative Adversarial Nets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose deep imitation learning through a guidance-based policy distillation (GIL) algorithm. |
L. Zhang; Q. Liu; X. Zhang; Y. Xu; |
218 | APGP: Accuracy-Preserving Generative Perturbation for Defending Against Model Cloning Attacks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel formulation to defend against model cloning attacks. |
A. Cheng; J. Cheng; |
219 | A Phoneme-Informed Neural Network Model For Note-Level Singing Transcription Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a method of finding note onsets of singing voice more accurately by leveraging the linguistic characteristics of singing, which are not seen in other instruments. |
S. Yong; L. Su; J. Nam; |
220 | A Physically Explainable Framework for Human-Related Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce physically explainable dynamics to enhance visual representations. |
Y. Jiang; H. Li; C. Li; |
221 | A Point Is A Wave: Point-Wave Network for Place Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing methods concentrate on the multi-layer perceptron with intricate architectures, needing lots of parameters to learn with limited gains. Unlike these methods, we propose an innovative method by designing a point-wave module, modeling a point as a wave function to avoid losing the information of origin points. |
G. Li; R. Zhang; |
222 | Applying Independent Vector Analysis on EEG-Based Motor Imagery Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose an original approach of IVA as a feature extraction step for Brain-Computer Interfaces, focused on the Motor Imagery (MI) paradigm. |
C. P. A. Moraes; B. Aristimunha; L. H. Dos Santos; W. H. L. Pinaya; R. Y. de Camargo; D. G. Fantinato; A. Neves; |
223 | Applying Symmetrical Component Transform for Industrial Appliance Classification in Non-Intrusive Load Monitoring Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work presents a load recognition technique for NILM applying the low-complexity Fortescue Transform (FT). |
A. Faustine; L. Pereira; |
224 | Approximation Error Back-Propagation for Q-Function in Scalable Reinforcement Learning with Tree Dependence Structure Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper applies the exponential decay property of scalable RL theory to a specific scenario where the network structure is a tree, and uses KL (Kullback-Leibler) divergence to analyze the propagation of approximation error along the structure over time, in order to quantify its backtracking result. |
Y. Yan; Y. Dong; K. Ma; Y. Shen; |
225 | A Practical Distributed Active Noise Control Algorithm Overcoming Communication Restrictions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, this paper develops a novel DMCANC algorithm that utilizes the compensation filters and neighbour nodes’ information to counterbalance the cross-talk effect between channels while maintaining independent weight updating. |
J. Ji; D. Shi; Z. Luo; X. Shen; W. -S. Gan; |
226 | A Principled Approach to Model Validation in Domain Generalization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, when it comes to model selection, most of these methods rely on traditional validation routines that select models solely based on the lowest classification risk on the validation set. In this paper, we theoretically demonstrate a trade-off between minimizing classification risk and mitigating domain discrepancy, i.e., it is impossible to achieve the minimum of these two objectives simultaneously. |
B. Lyu; T. Nguyen; M. Scheutz; P. Ishwar; S. Aeron; |
227 | A Privacy-Preserving Trajectory Mining Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we consider user privacy issues in location-based social networks (LBSNs). |
Z. Wang; S. X. Wu; J. Zhu; Y. Zhu; |
228 | A Probabilistic Framework for Pruning Transformers Via A Finite Admixture of Keys Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we develop a novel probabilistic framework for pruning attention scores and keys in transformers. |
T. M. Nguyen; T. Nguyen; L. Bui; H. Do; D. K. Nguyen; D. D. Le; H. Tran-The; N. Ho; S. J. Osher; R. G. Baraniuk; |
229 | A Processing Framework to Access Large Quantities of Whispered Speech Found in ASMR Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We describe our processing pipeline and a method for improved whispered activity detection (WAD) in the ASMR data. |
P. P. Zarazaga; G. Eje Henter; Z. Malisz; |
230 | A Progressive Image Dehazing Framework with Inter and Intra Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In addition, it is hard to train end-to-end dehazing networks due to the enormous gap between hazy images and corresponding clear images. In this paper, we propose a novel progressive image dehazing framework with inter and intra contrastive learning to solve the above problems. |
H. Xu; S. Liu; Y. Shu; F. Jiang; |
231 | A Progressive Neural Network for Acoustic Echo Cancellation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we proposed a hybrid signal processing and deep echo cancellation method, where a two-stage neural network is designed to remove residual echo progressively. |
Z. Chen; X. Xia; S. Sun; Z. Wang; C. Chen; G. Xie; P. Zhang; Y. Xiao; |
232 | A Prototypical Semantic Decoupling Method Via Joint Contrastive Learning for Few-Shot Named Entity Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we proposed a Prototypical Semantic Decoupling method via joint Contrastive learning (PSDC) for few-shot NER. |
G. Dong; Z. Wang; L. Wang; D. Guo; D. Fu; Y. Wu; C. Zeng; X. Li; T. Hui; K. He; X. Cui; Q. Gao; W. Xu; |
233 | A Proximal Approach to IVA-G with Convergence Guarantees Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a penalized maximum-likelihood framework for the problem, which enables us to derive a non-convex cost function that depends on the precision matrices of the source component vectors, the main mechanism by which IVA-G leverages correlation across the datasets. |
C. Cosserat; B. Gabrielson; E. Chouzenoux; J. -C. Pesquet; T. Adali; |
234 | A Quantum Approach for Stochastic Constrained Binary Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, this work puts forth a quantum heuristic to cope with stochastic binary quadratically constrained quadratic programs (QCQP). |
S. Gupta; V. Kekatos; |
235 | A Quantum Kernel Learning Approach to Acoustic Modeling for Spoken Command Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a quantum kernel learning (QKL) framework to address the inherent data sparsity issues often encountered in training large-scale acoustic models in low-resource scenarios. |
C. -H. H. Yang; B. Li; Y. Zhang; N. Chen; T. N. Sainath; S. Marco Siniscalchi; C. -H. Lee; |
236 | A Radar-Jammer Zero-Sum Repeated Bayesian Game Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We consider an instance of a radar jamming countermeasure problem, where the radar and the jammer have uncertainty about the radar environment (for instance, about the noise variance and the radar cross section variance), and they have to account for these uncertainties with statistical priors. |
S. Suvorova; A. Pezeshki; R. Kyprianou; B. Moran; |
237 | A Reality Check and A Practical Baseline for Semantic Speech Embedding Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Generating spoken word embeddings that possess semantic information has attracted lots of research interest. Among them, Speech2vec, as one of the most influential works, has … |
G. Chen; Y. Cao; |
238 | A Robust Kalman Filter Based Approach for Indoor Robot Positioning with Multi-Path Contaminated UWB Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, UWB performance suffers from multi-path outliers when signals reflect on surfaces or encounter obstacles. This paper describes an approach to mitigate this issue, based on an M-Estimation Robust Kalman Filter (M-RKF) and leveraging an adaptive empirical variance model for UWB signals. |
J. Cano; Y. Ding; G. Pages; E. Chaumette; J. Le Ny; |
239 | A Role Engineering Approach Based on Spectral Clustering Analysis for Restful Permissions in Cloud Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Generally, encryption methods are used to ensure privacy, which may result in high computation and communication overheads. |
Y. Xia; Y. Luo; W. Luo; Q. Shen; Y. Yang; Z. Wu; |
240 | Articulation GAN: Unsupervised Modeling of Articulatory Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the Articulatory Generator to the Generative Adversarial Network paradigm, a new unsupervised generative model of speech production/synthesis. |
G. Beguš; A. Zhou; P. Wu; G. K. Anumanchipalli; |
241 | Articulatory Representation Learning Via Joint Factor Analysis and Neural Matrix Factorization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel articulatory representation decomposition algorithm that takes advantage of guided factor analysis to derive the articulatory-specific factors and factor scores. |
J. Lian; A. W. Black; Y. Lu; L. Goldstein; S. Watanabe; G. K. Anumanchipalli; |
242 | A Sentiment and Syntactic-Aware Graph Convolutional Network for Aspect-Level Sentiment Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, constructing more accurate syntactic trees by introducing external knowledge has limited improvement on ungrammatical informal texts and has led to over-parameterization of the model. To alleviate this problem, we propose a sentiment and syntactic-aware graph convolutional network (SaS-GCN) that combines syntactic and sentiment relations. |
Y. Yang; X. Sun; Q. Lu; R. Sutcliffe; J. Feng; |
243 | A Sidecar Separator Can Convert A Single-Talker Speech Recognition System to A Multi-Talker One Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Although automatic speech recognition (ASR) can perform well in common non-overlapping environments, sustaining performance in multi-talker overlapping speech recognition remains … |
L. Meng; J. Kang; M. Cui; Y. Wang; X. Wu; H. Meng; |
244 | A Simple Scheme for Coupled Factorization for Hyperspectral Super-Resolution: Exploiting Sparsity in An Easy Way Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we develop a simple scheme for a coupled matrix factorization problem arising in the topic of hyperspectral super-resolution (HSR). |
Y. Li; W. -K. Ma; R. Wu; H. Liu; |
245 | A Simple Yet Effective Approach to Structured Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Structured prediction models aim at solving tasks where the output is a complex structure, rather than a single variable. |
W. Lin; Y. Li; L. Liu; S. Shi; H. -T. Zheng; |
246 | A Simulation-Based Framework for Urban Traffic Accident Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a framework to synthesize traffic videos containing both normal traffic and accident events by simulating the real urban traffic scenarios. |
H. Luo; F. Wang; |
247 | A Slot-Shared Span Prediction-Based Neural Network for Multi-Domain Dialogue State Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Besides, the slot-independent design leads to poor scalability. In this paper, we propose a Slot-shared Span Prediction based Network (SSNet) with a general value extraction module for all slots to tackle these problems. |
A. Atawulla; X. Zhou; Y. Yang; B. Ma; F. Yang; |
248 | A Spatial-Temporal ECG Emotion Recognition Model Based on Dynamic Feature Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel ECG emotion recognition method, which adopts a spatial and temporal ECG emotion recognition model based on dynamic feature fusion (DFF-STM) to learn spatial-temporal representations of different ECG areas. |
S. Xiao; X. Qiu; C. Tang; Z. Huang; |
249 | A Spatio-Temporal Decomposition Network for Compressed Video Quality Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a Spatio-Temporal Decomposition Network (STDN) to reduce the compressed distortion with motion classification and frequency separation. |
K. Wang; F. Chen; Z. Ye; L. Wang; X. Wu; S. Pu; |
250 | A Speech Representation Anonymization Framework Via Selective Noise Perturbation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a speech anonymization framework that achieves privacy via noise perturbation to a selected subset of the high-utility representations extracted using a pre-trained speech encoder. |
M. Tran; M. Soleymani; |
251 | ASSD: Synthetic Speech Detection in The AAC Compressed Domain Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our goal is to study if a small set of coding metadata contained in the AAC compressed bit stream is sufficient to detect synthetic speech. |
A. K. Singh Yadav; Z. Xiang; E. R. Bartusiak; P. Bestagini; S. Tubaro; E. J. Delp; |
252 | Assessing The Robustness of Deep Learning-Assisted Pathological Image Analysis Under Practical Variables of Imaging System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we construct an evaluation pathway to assess the stability and consistency of deep learning models under various customized scanner parameters. |
Y. Sun; C. Zhu; Y. Zhang; H. Li; P. Chen; L. Yang; |
253 | Assisted RTF-Vector-Based Binaural Direction of Arrival Estimation Exploiting A Calibrated External Microphone Array Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we assume the availability of a calibrated array of external microphones, which is characterized by a second database of anechoic prototype RTF vectors. |
D. Fejgin; S. Doclo; |
254 | Associative Learning Network for Coherent Visual Storytelling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a novel Associative Learning Network for Coherent Visual Storytelling to explore the model’s association ability while telling a new story. |
X. Li; C. Liu; Y. Ji; |
255 | A Statistical Interpretation of The Maximum Subarray Problem Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Maximum subarray is a classical problem in computer science that, given an array of numbers, aims to find a contiguous subarray with the largest sum. We focus on its use for a noisy statistical problem of localizing an interval with a mean different from the background. |
D. Wei; D. M. Malioutov; |
256 | AST-SED: An Effective Sound Event Detection Method Based on Audio Spectrogram Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an effective sound event detection (SED) method based on the audio spectrogram transformer (AST) model, pretrained on the large-scale AudioSet for audio tagging (AT) task, termed AST-SED. |
K. Li; Y. Song; L. -R. Dai; I. McLoughlin; X. Fang; L. Liu; |
257 | A Study of Audio Mixing Methods for Piano Transcription in Violin-Piano Ensembles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to analyze the impact of different data augmentation methods on piano transcription performance, specifically focusing on mixing techniques applied to violin-piano ensembles. |
H. Kim; J. Park; T. Kwon; D. Jeong; J. Nam; |
258 | A Study on Bias and Fairness in Deep Speaker Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we study the notion of fairness in recent SR systems based on 3 popular and relevant definitions, namely Statistical Parity, Equalized Odds, and Equal Opportunity. |
A. Hajavi; A. Etemad; |
259 | A Study on The Integration of Pipeline and E2E SLU Systems for Spoken Semantic Parsing Toward STOP Quality Challenge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we describe our proposed spoken semantic parsing system for the quality track (Track 1) in Spoken Language Understanding Grand Challenge which is part of ICASSP Signal Processing Grand Challenge 2023. |
S. Arora; H. Futami; S. -L. Wu; J. Huynh; Y. Peng; Y. Kashiwagi; E. Tsunoo; B. Yan; S. Watanabe; |
260 | A Study on The Invariance in Security Whatever The Dimension of Images for The Steganalysis By Deep-Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we study the performance invariance of convolutional neural networks when confronted with variable image sizes in the context of a more wild steganalysis. |
K. Planolles; M. Chaumont; F. Comby; |
261 | Asymmetric Polynomial Loss for Multi-Label Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Besides, the imbalance between redundant negative samples and rare positive samples could degrade the model performance. In this paper, we propose an effective Asymmetric Polynomial Loss (APL) to mitigate the above issues. |
Y. Huang; J. Qi; X. Wang; Z. Lin; |
262 | Asymptotically Optimal Nonparametric Classification Rules for Spike Train Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we consider the nonparametric classification problem for a class of spike train data characterized by nonparametrically specified intensity functions. |
M. Pawlak; M. Pabian; D. Rzepka; |
263 | Asymptotic Bias and Variance of Kernel Ridge Regression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Kernel ridge regression is widely used but the theory of its performance has never been fully developed. |
V. Solo; |
264 | Asymptotic Distribution of Stochastic Mirror Descent Iterates in Average Ensemble Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore the performance of the SMD on mean-field ensemble models and generalize earlier results obtained for SGD. |
T. Kargin; F. Salehi; B. Hassibi; |
265 | Asynchronous Federated Learning for Real-Time Multiple Licence Plate Recognition Through Semantic Communication Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, a federated learning framework is introduced to simultaneously detect multiple license plates over different network cameras through semantic communication. |
R. Xie; C. Li; X. Zhou; Z. Dong; |
266 | Asynchronous Social Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we analyze belief convergence and steady-state learning performance for both traditional and adaptive formulations of social learning under asynchronous behavior by the agents, where some of the agents may decide to abstain from sharing any information with the network at some time instants. |
M. Cemri; V. Bordignon; M. Kayaalp; V. Shumovskaia; A. H. Sayed; |
267 | A Synthetic Corpus Generation Method for Neural Vocoder Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a synthetic corpus generation method for neural vocoder training, which can easily generate an unlimited amount of synthetic audio at nearly no cost. |
Z. Wang; P. Liu; J. Chen; S. Li; J. Bai; G. He; Z. Wu; H. Meng; |
268 | A Targeted Sampling Strategy for Compressive Cryo Focused Ion Beam Scanning Electron Microscopy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a compressive sensing variant of cryo FIB-SEM capable of reducing the operational electron dose and increasing speed. |
D. Nicholls; J. Wells; A. W. Robinson; A. Moshtaghpour; M. Kobylynska; R. A. Fleck; A. I. Kirkland; N. D. Browning; |
269 | A Template Matching Approach for Reference Picture Padding in Video Coding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The paper shows that it also improves coding performance when applied in the context of reference picture padding. |
N. Horst; P. Das; M. Wien; |
270 | A Token-Level Contrastive Framework for Sign Language Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the publicly available SLT corpus is very limited, which causes the collapse of the token representations and the inaccuracy of the generated tokens. To alleviate this issue, we propose Con-SLT, a novel token-level Contrastive learning framework for Sign Language Translation, which learns effective token representations by incorporating token-level contrastive learning into the SLT decoding process. |
B. Fu; P. Ye; L. Zhang; P. Yu; C. Hu; X. Shi; Y. Chen; |
271 | A Topic-Enhanced Approach for Emotion Distribution Forecasting in Conversations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a new task: Emotion Distribution Forecasting in Conversations (EDFC), which aims to predict the emotion distribution of next utterance. |
X. Lu; W. Zhao; Y. Zhao; B. Qin; Z. Zhang; J. Wen; |
272 | A Transformer-Based E2E SLU Model for Improved Semantic Parsing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper demonstrates our contribution to the Spoken Language Understanding Grand Challenge at ICASSP 2023. |
O. Istaiteh; Y. Kussad; Y. Daqour; M. Habib; M. Habash; D. Gowda; |
273 | Attention Based Relation Network for Facial Action Units Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel Attention Based Relation Network (ABRNet) for AU recognition, which can automatically capture AU relations without unnecessary or even disturbing predefined rules. |
Y. Wei; H. Wang; M. Sun; J. Liu; |
274 | Attention-Guided Deep Learning Framework For Movement Quality Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore an attention-guided transformer-based architecture for MQA. |
A. Kanade; M. Sharma; M. Muniyandi; |
275 | Attention Localness in Shared Encoder-Decoder Model For Text Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose a localness attention network, with simplicity and feasibility in mind, which circles different local regions in the source article as contributors in different decoding steps. |
L. Huang; H. Wu; Q. Gao; G. Liu; |
276 | Attention Mixup: An Accurate Mixup Scheme Based On Interpretable Attention Mechanism for Multi-Label Audio Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes Attention MixUp (AMU), which only selects those segments that contain sound events for mixup, rather than simply mixing the entire sample. |
W. Liu; Y. Ren; J. Wang; |
277 | A Two-Branch Network for Video Anomaly Detection with Spatio-Temporal Feature Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a two-branch network to obtain the global and each local object’s action information of the clip respectively, where the local objects are extracted by a pre-trained object detector. |
G. Li; S. Chen; Y. Yang; Z. Guo; |
278 | A Two-Stage System for Spoken Language Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a Two-Stage system for SLU, which consists of Automatic Speech Recognition (ASR) tasks and Natural Language Understanding (NLU) tasks. |
G. Zhang; S. Miao; L. Tang; P. Qian; |
279 | Audio Barlow Twins: Self-Supervised Audio Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As such, we present Audio Barlow Twins, a novel self-supervised audio representation learning approach, adapting Barlow Twins to the audio domain. |
J. Anton; H. Coppock; P. Shukla; B. W. Schuller; |
280 | Audio Coding With Unified Noise Shaping And Phase Contrast Control Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, a unified noise-shaping (UNS) framework including FDNS and complex LPC-based TNS (CTNS) in the DFT domain is proposed to overcome the aliasing issues. |
B. Jo; S. Beack; T. Lee; |
281 | Audio Cross Verification Using Dual Alignment Likelihood Ratio Test Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a method for cross verifying a short audio query against a reference recording from which it was taken. |
H. Lei; A. Wonghirundacha; I. Bukey; T. J. Tsai; |
282 | AudioDec: An Open-Source Streaming High-Fidelity Neural Audio Codec Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose an open-source, streamable, and real-time neural audio codec that achieves strong performance along all three axes: it can reconstruct highly natural sounding 48 kHz speech signals while operating at only 12 kbps and running with less than 6 ms (GPU)/10 ms (CPU) latency. |
Y. -C. Wu; I. D. Gebru; D. Marković; A. Richard; |
283 | Audio-Driven Facial Landmark Generation in Violin Performance Using 3DCNN Network with Self Attention Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compile a violin soundtrack and facial expression dataset (VSFE) for modeling facial expressions in violin performance. |
T. -W. Lin; C. -L. Liu; L. Su; |
284 | Audio-Driven High Definition and Lip-Synchronized Talking Face Generation Based on Face Reenactment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, a novel audio-driven talking face generation method was proposed, which subtly converts the problem of improving video definition into the problem of face reenactment to produce both lip-synchronized and high-definition face video. |
X. Wang; Y. Zhang; W. He; Y. Wang; M. Li; Y. Wang; J. Zhang; S. Zhou; Z. Zhang; |
285 | Audio-Driven Talking Head Video Generation with Diffusion Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel audio-driven diffusion method for generating high-resolution realistic videos of talking heads with the help of the denoising diffusion model. |
Y. Zhu; C. Zhang; Q. Liu; X. Zhou; |
286 | Audio Quality Assessment of Vinyl Music Collections Using Self-Supervised Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we show that the self-supervised learning (SSL) model wav2vec 2.0 can be successfully used to predict the perceived audio quality of archive music collections. |
A. Ragano; E. Benetos; A. Hines; |
287 | Audio Signal Enhancement with Learning from Positive and Unlabeled Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here we explore SE using non-parallel training data consisting of noisy signals and noise, which can be easily recorded. |
N. Ito; M. Sugiyama; |
288 | Audio-Text Models Do Not Yet Leverage Natural Language Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we show that state-of-the-art audio-text models do not yet really understand natural language, especially contextual concepts such as sequential or concurrent ordering of sound events. |
H. -H. Wu; O. Nieto; J. P. Bello; J. Salamon; |
289 | Audio-to-Intent Using Acoustic-Textual Subword Representations from End-to-End ASR Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we present a novel approach to predict the user intention (whether the user is speaking to the device or not) directly from acoustic and textual information encoded at subword tokens which are obtained via an end-to-end (E2E) ASR model. |
P. Dighe; P. Nayak; O. Rudovic; E. Marchi; X. Niu; A. Tewfik; |
290 | Audio-Visual Inpainting: Reconstructing Missing Visual Information with Sound Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose a multimodal, audio-visual inpainting method (AVIN), and show how to leverage sound to reconstruct semantically consistent images. |
V. Sanguineti; S. Thakur; P. Morerio; A. Del Bue; V. Murino; |
291 | Audio-Visual Speaker Diarization in The Framework of Multi-User Human-Robot Interaction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a temporal audio-visual fusion model for multi-user speaker diarization, with low computing requirements, good robustness, and no training phase. |
T. Dhaussy; B. Jabaian; F. Lefèvre; R. Horaud; |
292 | Audio-Visual Speech Enhancement with A Deep Kalman Filter Generative Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present an audio-visual deep Kalman filter (AV-DKF) generative model which assumes a first-order Markov chain model for the latent variables and effectively fuses audio-visual data. |
A. Golmakani; M. Sadeghi; R. Serizel; |
293 | Augmentation Robust Self-Supervised Learning for Human Activity Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We empirically verify our approaches on three public HAR datasets. |
C. Xu; Y. Li; D. Lee; D. Hoon Park; H. Mao; H. Do; J. Chung; D. Nair; |
294 | Augmenting Transformer-Transducer Based Speaker Change Detection with Token-Level Training Loss Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we propose a novel token-based training strategy that improves Transformer-Transducer (T-T) based speaker change detection (SCD) performance. |
G. Zhao; Q. Wang; H. Lu; Y. Huang; I. L. Moreno; |
295 | AugTarget Data Augmentation for Infrared Small Target Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As one of the main limitations, it hampers further improvement of target detection performance. In this paper, we propose a simple and effective data augmentation scheme, AugTarget, to address this shortage of small target samples. |
S. Chen; J. Zhu; L. Ji; H. Pan; Y. Xu; |
296 | A Unified One-Shot Prosody and Speaker Conversion System with Self-Supervised Discrete Speech Units Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a unified system to realize one-shot voice conversion (VC) on the pitch, rhythm, and speaker attributes. |
L. -W. Chen; S. Watanabe; A. Rudnicky; |
297 | A Unified Uncertainty-Aware Exploration: Combining Epistemic and Aleatory Uncertainty Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an algorithm that clarifies the theoretical connection between aleatory and epistemic uncertainty, unifies aleatory and epistemic uncertainty estimation, and quantifies the combined effect of both uncertainties for a risk-sensitive exploration. |
P. Malekzadeh; M. Hou; K. N. Plataniotis; |
298 | A Unitary Transform Based Generalized Approximate Message Passing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Based on the unitary transform approximate message passing (UAMP) and expectation propagation, a unitary transform based generalized AMP (GUAMP) algorithm is proposed for general measurement matrices, in particular highly correlated matrices. |
J. Zhu; X. Meng; X. Lei; Q. Guo; |
299 | AURA: Privacy-Preserving Augmentation to Improve Test Set Diversity in Speech Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Speech enhancement models running in production environments are commonly trained on publicly available data. |
X. Gitiaux; A. Khant; E. Beyrami; C. Reddy; J. Gupchup; R. Cutler; |
300 | Auto-AVSR: Audio-Visual Speech Recognition with Automatic Labels Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Hence, in this work, we investigate the use of automatically-generated transcriptions of unlabelled datasets to increase the training set size. |
P. Ma; A. Haliassos; A. Fernandez-Lopez; H. Chen; S. Petridis; M. Pantic; |
301 | AutoGCF: Personalized Aggregation on Neural Graph Collaborative Filtering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent studies have shown that the effectiveness of existing NeuGCFs largely relies on the selection of optimal aggregation steps, which makes the performance on various recommendation scenarios unsatisfactory. To tackle this, we for the first time propose a framework to achieve personalized aggregation step assignment on NeuGCF. |
X. You; C. Li; J. Xu; M. Zhang; |
302 | Automatic Camera Pose Estimation By Key-Point Matching of Reference Objects Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we aim to design an automatic camera pose estimation pipeline for clinical spaces such as catheterization laboratories. |
J. Zeng; R. Butler; J. J. van den Dobbelsteen; B. H. W. Hendriks; M. V. der Elst; J. Dauwels; |
303 | Automatic Classification of Vocal Intensity Category from Speech Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the current study, we investigate machine learning- and deep learning-based methods for automatic classification of vocal intensity category when the input speech is expressed using an arbitrary amplitude scale. |
M. Kodali; S. R. Kadiri; L. Laaksonen; P. Alku; |
304 | Automatic Error Detection in Integrated Circuits Image Segmentation: A Data-Driven Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present the first data-driven automatic error detection approach that targets two types of IC segmentation errors: wire and via errors. |
Z. Zhang; B. M. Trindade; M. Green; Z. Yu; C. Pawlowicz; F. Ren; |
305 | Automatic Segmentation of Nasopharyngeal Carcinoma in CT Images Using Dual Attention and Edge Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the features of uneven grayscale values and hazy boundaries of NPC regions make accurate NPC segmentation particularly challenging. To address these problems, we propose an accurate and effective NPC segmentation method using Dual Attention and Edge Detection Convolutional Neural Network (DAED-Net). |
Q. Wang; W. Huang; Y. Zhang; X. Li; X. Ye; K. Hu; |
306 | Automatic Severity Classification of Dysarthric Speech By Using Self-Supervised Model with Multi-Task Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To tackle the problem, we propose a novel automatic severity assessment method for dysarthric speech, using the self-supervised model in conjunction with multi-task learning. |
E. J. Yeo; K. Choi; S. Kim; M. Chung; |
307 | Autonomous Navigation of A Robotic Swarm in Space Exploration Missions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a kinematics-aware information seeking algorithm for swarm navigation. |
S. Zhang; T. Baumgartner; E. Staudinger; R. Pöhlmann; F. Broghammer; A. Dammann; |
308 | Autonomous Soundscape Augmentation with Multimodal Fusion of Visual and Participant-Linked Inputs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, we propose modular modifications to an existing attention-based deep neural network, to allow early, mid-level, and late feature fusion of participant-linked, visual, and acoustic features. |
K. Ooi; K. N. Watcharasupat; B. Lam; Z. -T. Ong; W. -S. Gan; |
309 | AutoTTS: End-to-End Text-to-Speech Synthesis Through Differentiable Duration Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our method is based on a soft-duration mechanism that optimizes a stochastic process in expectation. Using this differentiable duration method, we introduce AutoTTS, a direct text-to-waveform speech synthesis model. |
B. Nguyen; F. Cardinaux; S. Uhlich; |
310 | Autovocoder: Fast Waveform Generation from A Learned Speech Representation Using Differentiable Digital Signal Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We use machine learning to obtain a representation that replaces the mel-spectrogram, and that can be inverted back to a waveform using simple, fast operations including a differentiable implementation of the inverse STFT. The autovocoder generates a waveform 5 times faster than the DSP-based Griffin-Lim algorithm, and 14 times faster than the neural vocoder HiFi-GAN. |
J. J. Webber; C. Valentini-Botinhao; E. Williams; G. E. Henter; S. King; |
311 | Auxiliary Pooling Layer For Spoken Language Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we address the variable granularity in transferring knowledge from texts to speech representation via APLY, an auxiliary pooling layer, that fuses the global information with the adaptively encoded local context. |
Y. Ma; T. H. Nguyen; J. Ni; W. Wang; Q. Chen; C. Zhang; B. Ma; |
312 | A Variational Inequality Model for Learning Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present an alternative approach which foregoes the optimization framework and adopts a variational inequality formalism. |
P. L. Combettes; J. -C. Pesquet; A. Repetti; |
313 | AVES: Animal Vocalization Encoder Based on Self-Supervision Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In order to leverage a large amount of unannotated audio data, we propose AVES (Animal Vocalization Encoder based on Self-Supervision), a self-supervised, transformer-based audio representation model for encoding animal vocalizations. |
M. Hagiwara; |
314 | A Video Anomaly Detection Framework Based on Appearance-Motion Semantics Representation Consistency Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The prevalent methods mainly investigate the reconstruction difference between normal and abnormal patterns but ignore the semantics consistency between appearance and motion information of behavior patterns, making the results highly dependent on the local context of frame sequences and lacking the understanding of behavior semantics. To address this issue, we propose a framework of Appearance-Motion Semantics Representation Consistency that uses the gap of appearance and motion semantic representation consistency between normal and abnormal data. |
X. Huang; C. Zhao; Z. Wu; |
315 | Avoid Overthinking in Self-Supervised Models for Speech Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We then motivate further research in EE by computing an optimal bound for performance versus speed trade-offs. To approach this bound we propose two new strategies for ASR: (1) we adapt the recently proposed patience strategy to ASR; and (2) we design a new EE strategy specific to ASR that performs better than all strategies previously introduced. |
D. Berrebbi; B. Yan; S. Watanabe; |
316 | AV-SepFormer: Cross-Attention SepFormer for Audio-Visual Target Speaker Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose AV-SepFormer, a SepFormer-based attention dual-scale model that utilizes cross- and self-attention to fuse and model features from the audio and visual modalities. |
J. Lin; X. Cai; H. Dinkel; J. Chen; Z. Yan; Y. Wang; J. Zhang; Z. Wu; Y. Wang; H. Meng; |
317 | AV-TAD: Audio-Visual Temporal Action Detection With Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, current works mainly tackle this task with visual information, while neglecting to explore the potential of the audio modality. To address this challenge, in this paper, we propose a simple yet effective AudioVisual Temporal Action Detection Transformer named AV-TAD, which performs early fusion on audio and visual modalities in an end-to-end fashion. |
Y. Li; Z. Yu; S. Xiang; T. Liu; Y. Fu; |
318 | A Wavelet Scattering Approach for Load Identification with Limited Amount of Training Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, a two-dimensional wavelet scattering approach for load identification is presented. |
P. A. Schirmer; I. Mporas; |
319 | Backdoor Attack Against Automatic Speaker Verification Models in Federated Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: During the training process of FL, we make full use of the advantages of FL, and design a two-stage training strategy. Besides, we propose the Global Spectral Cluster (GSC) method to alleviate the insufficient trigger generalization problem, which is caused by the constraint that the attacker can only reach and poison its own data. |
D. Meng; X. Wang; J. Wang; |
320 | Backdoor Defense Via Suppressing Model Shortcuts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore the backdoor mechanism from the angle of the model structure. |
S. Yang; Y. Li; Y. Jiang; S. -T. Xia; |
321 | Background Disturbance Mitigation for Video Captioning Via Entity-Action Relocation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they focus on exploiting foreground semantics, ignoring the potential negative impact of video background disturbance to caption generation, i.e., the entities and the actions are misjudged by a similar video background. To ameliorate this issue, we propose Entity-Action Relocation (EAR) to enhance the adaptability of entities and actions to various backgrounds by giving them the background. |
Z. Li; X. Zhong; S. Chen; W. Liu; W. Huang; L. Li; |
322 | Background-Weakening Consistency Regularization for Semi-Supervised Video Action Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus we propose a Background-Weakening with Calibration Constraint (BWCC) framework, which highlights the negative impact of information in the background of false detection by calculating the consistency of the predictions of the background weakened video and the original video. |
X. Zhong; A. Yi; W. Liu; W. Huang; C. Zou; Z. Wang; |
323 | BadRes: Reveal The Backdoors Through Residual Connection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a simple yet strong backdoor attack method called BadRes, where the residual connections act as a turnstile to be deterministic on clean inputs while unpredictable on poisoned ones. |
M. He; T. Chen; H. Zhou; S. Zhang; J. Li; |
324 | Bagging R-CNN: Ensemble for Object Detection in Complex Traffic Scenes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The existing methods are not robust enough to be extended to new complex traffic scenes. To address this issue, we leverage the idea of ensemble learning for strong robustness and propose a novel Bagging R-CNN framework. |
P. Li; Y. He; D. Yin; F. R. Yu; P. Song; |
325 | Bag of Tricks with Quantized Convolutional Neural Networks for Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We evaluate the effectiveness of our proposed method with two popular models, ResNet50 and MobileNetV2, on the ImageNet dataset. |
J. Hu; M. Zeng; E. Wu; |
326 | Balanced Deep CCA for Bird Vocalization Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The key objective of this work is to learn useful embeddings associated with high performance in downstream event detection tasks when labeled data is scarce and the audio events of interest — songbird vocalizations — are sparse. |
S. Kumar; B. Anshuman; L. Rüttimann; R. H. R. Hahnloser; V. Arora; |
327 | Balanced Mixup Loss for Long-Tailed Visual Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we detail the theoretical analysis of the data imbalance caused by Mixup, and propose a novel Balanced Mixup (BaMix) loss function from the output perspective. |
H. Ye; F. Zhou; X. Li; Q. Zhang; |
328 | BAT: Bi-Alignment Based On Transformation in Multi-Target Domain Adaptation for Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As a result, it is impossible for existing methods to handle the more realistic multi-target domain adaptive semantic segmentation (MT-DASS) tasks. To solve this problem, we propose a Bi-Alignment framework based on Transformation (BAT). |
X. Zhong; W. Li; L. Liao; J. Xiao; W. Liu; W. Huang; Z. Wang; |
329 | Batch-Ensemble Stochastic Neural Networks for Out-of-Distribution Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we propose an uncertainty quantification approach by modeling data distributions in feature spaces. |
X. Chen; Y. Li; Y. Yang; |
330 | Batch Normalization Damages Federated Learning on Non-IID Data: Analysis and Remedy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present the first convergence analysis to show that the mismatched local and global statistical parameters due to non-i.i.d. data cause gradient deviation, which leads the algorithm to converge to a biased solution at a slower rate. |
Y. Wang; Q. Shi; T. -H. Chang; |
331 | BATT: Backdoor Attack with Transformation-Based Triggers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore the previous findings from another side. |
T. Xu; Y. Li; Y. Jiang; S. -T. Xia; |
332 | BAUENet: Boundary-Aware Uncertainty Enhanced Network for Infrared Small Target Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing ISTD methods can discover regularly-shaped and clear objects well, but tend to overlook the tough-to-detect ones, such as targets with irregular shapes or blurry boundaries, causing inaccurate segmentation and missed detection. Considering that boundary areas assemble rich uncertainty information, we propose the Boundary-Aware Uncertainty Enhanced Network (BAUENet), where Uncertainty Enhanced Context Refinement (UECR) and Adaptive Feature Fusion Modules (AFFM) are devised to address this problem. |
T. Chen; Q. Chu; Z. Tan; B. Liu; N. Yu; |
333 | Bayesian Cramér-Rao Bound Estimation With Score-Based Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel data-driven method for Bayesian CRB estimation, leveraging state-of-the-art score estimation and deep generative modeling techniques. |
E. S. Crafts; B. Zhao; |
334 | Bayesian Methods for Optical Flow Estimation Using A Variational Approximation, with Applications to Ultrasound Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We develop a unified Bayesian framework for optical flow (OF) estimation that uses a variational lower bound to obtain a variational approximation of the posterior probability distribution. |
J. Dorazil; B. H. Fleury; F. Hlawatsch; |
335 | Bayesian Network Modeling and Prediction of Transitions Within The Homelessness System Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Administrative data collected by homeless service providers offer a unique opportunity to understand how homeless individuals navigate the homeless system towards securing stable … |
K. S. Rahman; D. -S. Zois; C. Chelmis; |
336 | Bayesian Optimization with Ensemble Learning Models and Adaptive Expected Improvement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Optimizing a black-box function that is expensive to evaluate emerges in a gamut of machine learning and artificial intelligence applications including drug discovery, policy optimization in robotics, and hyperparameter tuning of learning models to list a few. |
K. D. Polyzos; Q. Lu; G. B. Giannakis; |
337 | Beamformer-Guided Target Speaker Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a Beamformer-guided Target Speaker Extraction (BG-TSE) method to extract a target speaker’s voice from a multi-channel recording informed by the direction of arrival of the target. |
M. Elminshawi; S. Raj Chetupalli; E. A. P. Habets; |
338 | Beamforming Optimization in RIS-Aided MIMO Systems Under Multiple-Reflection Effects Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel method to optimize the RIS phase shifters considering the effect of multiple reflections using a more physically-consistent model (exact RIS model). |
D. Wijekoon; A. Mezghani; E. Hossain; |
339 | BEANS: The Benchmark of Animal Sounds Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we propose BEANS (the BEnchmark of ANimal Sounds), a collection of bioacoustics tasks and public datasets, specifically designed to measure the performance of machine learning algorithms in the field of bioacoustics. |
M. Hagiwara; B. Hoffman; J. -Y. Liu; M. Cusimano; F. Effenberger; K. Zacarian; |
340 | BEBERT: Efficient and Robust Binary Ensemble BERT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an efficient and robust binary ensemble BERT (BEBERT) to bridge the accuracy gap. |
J. Tian; C. Fang; H. Wang; Z. Wang; |
341 | BECTRA: Transducer-Based End-To-End ASR with BERT-Enhanced Encoder Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present BERT-CTC-Transducer (BECTRA), a novel end-to-end automatic speech recognition (E2E-ASR) model formulated by the transducer with a BERT-enhanced encoder. |
Y. Higuchi; T. Ogawa; T. Kobayashi; S. Watanabe; |
342 | Benchmarking Convolutional Neural Network Inference on Low-Power Edge Devices Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As a result of the low-complexity AI models and the available low-power embedded systems on the market, this paper provides a comparative study on the inference performance of convolutional neural networks for different edge devices, by exploiting low-power GPUs and dedicated AI hardware. |
O. Ferraz; H. Araujo; V. Silva; G. Falcao; |
343 | Benchmarking Cross-Domain Face Recognition with Avatars, Caricatures and Sketches Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, various relevant domains have hardly been explored and a lack of public databases hampers the development of new algorithms. In this work, we introduce the HDA Cross-Domain (HDA-CD) face image database comprising 1,400 face images from three different domains including avatars, caricatures, and sketches. |
A. Foroughi; C. Rathgeb; M. Ibsen; C. Busch; |
344 | Benchmarking White Blood Cell Classification Under Domain Shift Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we establish a benchmark for WBC recognition. |
S. Tsutsui; Z. Su; B. Wen; |
345 | Benchmark of Physiological Model Based and Deep Learning Based Remote Photoplethysmography in Automotive Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Remote photoplethysmography (rPPG) can be used to monitor a driver’s cardio-respiratory functions in automotive settings to improve driving safety. To understand the challenges of rPPG in this application, we created a benchmark of the latest rPPG algorithms based on the MR-NIRP Car dataset, selecting representative methods from both the physiological model based (PBV and DIS) and deep learning based (Supervised Learning and Contrastive Learning) approaches. |
Z. Wang; X. Yang; H. Lu; C. Shan; W. Wang; |
346 | BER-Aware Dynamic Resource Management for Edge-Assisted Goal-Oriented Communications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Exploiting Lyapunov optimization, we propose a minimum-energy strategy, which trades information rates for BER, under delay and classification accuracy constraints. |
F. Binucci; P. Banelli; |
347 | BERT Is Robust! A Case Against Word Substitution-Based Adversarial Attacks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate the robustness of BERT using four word substitution-based attacks. |
J. Hauser; Z. Meng; D. Pascual; R. Wattenhofer; |
348 | Better Together: Dialogue Separation and Voice Activity Detection for Audio Personalization in TV Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose to combine DS and Voice Activity Detection (VAD), both recently proposed for TV audio. |
M. Torcoli; E. A. P. Habets; |
349 | Beyond Neural-on-Neural Approaches to Speaker Gender Protection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Currently, the common practice for developing and testing gender protection algorithms is neural-on-neural, i.e., perturbations are generated and tested with a neural network. In this paper, we propose to go beyond this practice to strengthen the study of gender protection. |
L. van Bemmel; Z. Liu; N. Vaessen; M. Larson; |
350 | Beyond Rate Coding: Signal Coding and Reconstruction Using Lean Spike Trains Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent years have seen a growing interest in spike based encoding of continuous time signals–a hallmark of biological computation. In this context, we present a mathematical framework for signal representation, leveraging a simple but robust mechanistic model of a biologically plausible spiking neuron. |
A. Chattopadhyay; A. Banerjee; |
351 | BHE-DARTS: Bilevel Optimization Based on Hypergradient Estimation for Differentiable Architecture Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a stochastic bilevel optimization approach based on a hypergradient estimator, called BHE-DARTS, as a remedy for the tendency of the Differentiable Architecture Search (DARTS) bilevel optimization model to find locally optimal structures rather than globally optimal ones. |
Z. Cai; L. Chen; H. -L. Liu; |
352 | Bias Identification with RankPix Saliency Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce RankPix, a novel saliency method for visual bias identification in image classification tasks. |
S. Konate; L. Lebrat; R. S. Cruz; C. Fookes; A. Bradley; O. Salvado; |
353 | Bias Reduced Semidefinite Relaxation Method for Multistatic Localization in The Absence of Transmitter Position And Its Synchronization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using the time delay measurements from the direct and indirect paths, we propose to jointly estimate the object and transmitter positions together with the clock offset. |
J. Pei; G. Wang; K. C. Ho; L. Huang; |
354 | Bilateral Coarse-to-Fine Network for Point Cloud Completion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing methods often directly infer the missing points from the partial shape, but they suffer from limited structural information. To address this, we propose the Bilateral Coarse-to-Fine Network (BCF-Net), which leverages 2D images as guidance to compensate for structural information loss. |
T. T. Phong Nguyen; S. Lam Phung; V. Gopaldasani; J. Whitelaw; |
355 | Bimodal Fusion Network for Basic Taste Sensation Recognition from Electroencephalography and Electromyography Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a bimodal fusion network (Bi-FusionNet) for recognizing basic taste sensations (sour, sweet, bitter, salty, umami, and blank). |
H. Gao; S. Zhao; H. Li; L. Liu; Y. Wang; R. Hu; J. Zhang; G. Li; |
356 | Binary Image Fast Perfect Recovery from Sparse 2D-DFT Coefficients Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We derive a lower bound on the number of coefficients required for perfect image recovery and propose a reconstruction algorithm. |
S. -C. Pei; K. -W. Chang; |
357 | Binary Sequence Set Optimization for CDMA Applications Via Mixed-Integer Quadratic Programming Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we show that the ISL minimization problem may be formulated as a mixed-integer quadratic program (MIQP). |
A. Yang; T. Mina; G. Gao; |
358 | Binauralization Robust To Camera Rotation Using 360° Videos Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel binauralization method that is robust to camera rotation. |
M. Yoshida; R. Togo; T. Ogawa; M. Haseyama; |
359 | Biologically-Inspired Continual Learning of Human Motion Sequences Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work proposes a model for continual learning on tasks involving temporal sequences, specifically, human motions. |
J. Ott; S. -C. Liu; |
360 | Bipartite Graph Convolutional Networks with Adversarial Domain Transfer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel graph convolution operation to propagate in bipartite graphs with lower spatial and temporal complexity, and two mapping functions with adversarial constraints to transfer features between two domains. |
D. Wu; B. Liang; X. Liu; X. Zang; M. Chi; |
361 | BIRD-PCC: Bi-Directional Range Image-Based Deep Lidar Point Cloud Compression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, their handcrafted design of residual coding methods could not fully exploit spatial redundancy. To remedy this, we propose a coding framework BIRD-PCC. |
C. -S. Liu; J. -F. Yeh; H. Hsu; H. -T. Su; M. -S. Lee; W. H. Hsu; |
362 | BISVP: Building Footprint Extraction Via Bidirectional Serialized Vertex Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a new refinement-free and end-to-end building footprint extraction method, which is conceptually intuitive, simple, and effective. |
M. Zhang; Y. Du; Z. Hu; Q. Liu; Y. Wang; |
363 | Bit Error and Block Error Rate Training for ML-Assisted Communication Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose new loss functions targeted at minimizing the block error rate and SNR deweighting, a novel method that trains communication systems for optimal performance over a range of signal-to-noise ratios. |
R. Wiesmayr; G. Marti; C. Dick; H. Song; C. Studer; |
364 | Blind Acoustic Room Parameter Estimation Using Phase Features Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by recent works in speech enhancement, we propose utilizing phase-related features to extend recent approaches to blindly estimate the so-called reverberation fingerprint parameters, namely, volume and RT60. |
C. Ick; A. Mehrabi; W. Jin; |
365 | Blind Estimation of Audio Processing Graph Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We apply our model to singing voice effects and drum mixing estimation tasks. |
S. Lee; J. Park; S. Paik; K. Lee; |
366 | Blind Polynomial Regression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, in many applications, the input may be partially known or not known at all, rendering conventional regression approaches not applicable. In this paper, we formally state the (potentially partial) blind regression problem, illustrate some of its theoretical properties, and propose an algorithmic approach to solve it. |
A. Natali; G. Leus; |
367 | Blind Source Counting and Separation with Relative Harmonic Coefficients Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we transfer a previous relative transfer function based separation method into the wave domain by utilizing a higher-order microphone for the mixture recording. |
H. Sun; P. Samarasinghe; T. Abhayapala; |
368 | Block-Based Color Constancy: The Deviation of Salient Pixels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: During experiments, we observed that some pixels decrease the performance of the method. In this work, the algorithm is modified to eliminate the impact of these pixels. |
O. Ulucan; D. Ulucan; M. Ebner; |
369 | Blood Oxygen Saturation Estimation from Facial Video Via DC and AC Components of Spatio-Temporal Map Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an SpO2 estimation method from facial videos based on convolutional neural networks (CNN). |
Y. Akamatsu; Y. Onishi; H. Imaoka; |
370 | Body Prior Guided Graph Convolutional Neural Network for Skeleton-Based Action Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Via taking full advantage of the body prior knowledge, this paper presents a Body Prior Guided Graph Convolutional Network (BPG-GCN) to jointly meet the demand for large-scale training data and effective model architecture. |
Q. Hu; H. Liu; H. -Q. Wang; M. Liu; |
371 | Boosting BERT Subnets with Neural Grafting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose Neural grafting to boost BERT subnets, especially the larger ones. |
T. Hu; C. Meinel; H. Yang; |
372 | Boosting Face Recognition Performance with Synthetic Data and Limited Real Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we attempt to boost face recognition simultaneously using synthetic data and limited real data. |
W. Wang; L. Zhang; C. -M. Pun; J. -C. Xie; |
373 | Boosting Fine-Grained Sketch-Based Image Retrieval with Self-Supervised Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a better self-supervised pre-trained FG-SBIR model which does not depend on large-scale annotated datasets. |
Z. Zhang; Y. Chen; Y. Zhang; R. Feng; T. Zhang; |
374 | Boosting No-Reference Super-Resolution Image Quality Assessment with Knowledge Distillation and Extension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose a novel Knowledge Extension Super-Resolution Image Quality Assessment (KE-SR-IQA) framework to predict SR image quality by leveraging a semi-supervised knowledge distillation (KD) strategy. |
H. Zhang; S. Su; Y. Zhu; J. Sun; Y. Zhang; |
375 | Boosting Person Re-Identification with Viewpoint Contrastive Learning and Adversarial Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite significant progress in person ReID, viewpoint variation remains an obstacle to extracting discriminative features for retrieval. To address this problem, we propose a Viewpoint-Robust Network (VRN) based on contrastive learning and adversarial training to boost person ReID. |
X. Shi; H. Liu; W. Shi; Z. Zhou; Y. Li; |
376 | Boosting Prompt-Based Few-Shot Learners Through Out-of-Domain Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Boost-Distiller, the first few-shot KD algorithm for prompt-tuned PLMs with the help of the out-of-domain data. |
X. Chen; C. Wang; J. Dong; M. Qiu; L. Feng; J. Huang; |
377 | Boosting Semi-Supervised Federated Learning with Model Personalization and Client-Variance-Reduction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we boost the semi-supervised FL by addressing the issue using model personalization and client-variance-reduction. |
S. Wang; Y. Xu; Y. Yuan; X. Wang; T. Q. S. Quek; |
378 | Boosting Signal Modulation Few-Shot Learning with Pre-Transformation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Modulated Signal Pre-transformation (MSP), a parameterized radio signal transformation framework that encourages the signals having the same semantics to have similar representations. |
P. Sun; J. Su; Z. Wen; Y. Zhou; Z. Hong; S. Yu; H. Zhou; |
379 | Boosting The Accuracy of SRAM-Based In-Memory Architectures Via Maximum Likelihood-Based Error Compensation Method Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a Maximum Likelihood (ML)-based statistical Error Compensation (MLEC) method to enhance the accuracy of binary DPs in a 6T SRAM-based IMC. |
H. Kim; N. Shanbhag; |
380 | Boosting Transferability of Adversarial Example Via An Enhanced Euler’s Method Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we try to develop a better base attack to boost the transferability of adversarial examples. |
A. Peng; Z. Lin; H. Zeng; W. Yu; X. Kang; |
381 | Boundary Cue Guidance and Contextual Feature Mining for Glass Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, we propose a boundary cue guidance and contextual feature mining network (BCNet) to accurately and efficiently segment glass. |
Q. Xiao; Y. Zhang; X. Li; K. Hu; |
382 | BrainNetFormer: Decoding Brain Cognitive States with Spatial-Temporal Cross Attention Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, we propose BrainNetFormer to incorporate both static and dynamic properties for human behavior prediction. |
L. Sheng; W. Wang; Z. Shi; J. Zhan; Y. Kong; |
383 | Brain Network Features Differentiate Intentions from Different Emotional Expressions of The Same Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To identify effective brain features that were most relevant to intent recognition improvement, we compared the event-related spectral perturbation and effective brain connectivity patterns on two intent conditions (praise vs. irony). |
Z. Li; B. Zhao; G. Zhang; J. Dang; |
384 | Breaking The Trade-Off in Personalized Speech Enhancement With Cross-Task Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that existing PSE methods suffer from a trade-off between speech over-suppression and interference leakage by addressing one problem at the expense of the other. We propose a new PSE model training framework using cross-task knowledge distillation to mitigate this trade-off. |
H. Taherian; S. Emre Eskimez; T. Yoshioka; |
385 | BreathIE: Estimating Breathing Inhale Exhale Ratio Using Motion Sensor Data from Consumer Earbuds Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel algorithm, BreathIE, to estimate breathing rate and IE ratio using a low-power motion sensor embedded in consumer-grade earbuds. |
N. Rashid; M. M. Rahman; T. Ahmed; J. Kuang; J. A. Gao; |
386 | Bridging Speech and Textual Pre-Trained Models With Unsupervised ASR Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To be specific, we propose to use unsupervised automatic speech recognition (ASR) as a connector that bridges different modalities used in speech and textual pre-trained models. |
J. Shi; C. -J. Hsu; H. Chung; D. Gao; P. Garcia; S. Watanabe; A. Lee; H. -Y. Lee; |
387 | BTS-E: Audio Deepfake Detection Using Breathing-Talking-Silence Encoder Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Countermeasure (CM) systems have been developed recently to help ASV combat synthetic speech. In this work, we propose BTS-E, a framework to evaluate the correlation between Breathing, Talking (speech), and Silence sounds in an audio clip, then use this information for deepfake detection tasks. |
T. -P. Doan; L. Nguyen-Vu; S. Jung; K. Hong; |
388 | Building Blocks for A Complex-Valued Transformer Architecture Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Additionally the Fourier transform of signals is complex-valued and has numerous applications. We aim to make deep learning directly applicable to these complex-valued signals without using projections into ? |
F. Eilers; X. Jiang; |
389 | Building Change Detection Using Cross-Temporal Feature Interaction Network Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To fill the gap, we propose a cross-temporal feature interaction network to effectively derive the change representations. |
Y. Feng; J. Jiang; H. Xu; J. Zheng; |
390 | Building Keyword Search System from End-To-End ASR Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we describe a general KWS pipeline, applicable to any ASR model that generates N-best lists. |
R. Huang; M. Wiesner; L. P. Garcia-Perera; D. Povey; J. Trmal; S. Khudanpur; |
391 | Burst Perception-Distortion Tradeoff: Analysis and Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we extend the perception-distortion tradeoff theory by introducing multiple-frame information. |
D. Xue; L. Herranz; J. V. Corral; Y. Zhang; |
392 | ByteCover3: Accurate Cover Song Identification On Short Queries Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we upgrade the previous ByteCover systems to ByteCover3 that utilizes local features to further improve the identification performance of short music queries. |
X. Du; Z. Wang; X. Liang; H. Liang; B. Zhu; Z. Ma; |
393 | Byzantine-Robust and Communication-Efficient Personalized Federated Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a projected stochastic block gradient descent method to address the robustness issue. |
X. He; J. Zhang; Q. Ling; |
394 | C2BN: Cross-Modality and Cross-Scale Balance Network for Multi-Modal 3D Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Further, the multi-level features from images also suffer from imbalance problems in receptive fields. To address the above problems, we propose two novel networks: cross-modality balance network (CMN) and cross-scale balance network (CSN). |
B. Ding; J. Xie; J. Nie; |
395 | C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a Cross-Lingual Cross-Modal Knowledge Distillation method to improve multilingual text-video retrieval. |
A. Rouditchenko; Y. -S. Chuang; N. Shvetsova; S. Thomas; R. Feris; B. Kingsbury; L. Karlinsky; D. Harwath; H. Kuehne; J. Glass; |
396 | CADET: Control-Aware Dynamic Edge Computing for Real-Time Target Tracking in UAV Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an innovative approach – CADET – to control where sensor signals are processed in the system. |
L. F. Florenzan Reyes; F. Smarra; A. D’Innocenzo; M. Levorato; |
397 | CAENet: Using Collaborative Attention Transformer and Add-Boost Strategy for Single Image Deraining Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, the rain streaks in an image are usually complex and diverse while few methods fully explore the richness of the information which may improve the network’s feature representation ability. To solve the above issues, we propose a novel Collaborative Attention Enhanced Network (CAENet) for single image deraining. |
S. Qin; S. Zhang; Y. Zhang; H. Gao; |
398 | Calibrating AI Models for Few-Shot Demodulation Via Conformal Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose to leverage the conformal prediction framework to obtain data-driven set predictions whose calibration properties hold irrespective of the data distribution. |
K. M. Cohen; S. Park; O. Simeone; S. Shamai Shitz; |
399 | CAN2V: CAN-Bus Data-Based Seq2seq Model for Vehicle Velocity Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a model named CAN2V, which effectively analyzes the vehicle characteristics and driving patterns in the encoder through multi-task learning. |
J. -H. Cho; J. -H. Chang; |
400 | Cancelling Intermodulation Distortions for Otoacoustic Emission Measurements with Earbuds Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel cancellation method of Intermodulation Distortions (IMDs) for earbud speakers used to measure Distortion Product Otoacoustic Emissions (DPOAE). |
B. U. Demirel; K. Al-Naimi; F. Kawsar; A. Montanari; |
401 | CANDY: Category-Kernelized Dynamic Convolution for Instance Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite the comparable performance between local-based and global-based approaches, the AP results of objects on different scales vary significantly. In this paper, we first point out that the key factor to bridging such a gap lies in the utilization of local RoI information for global mask prediction. |
Y. Lu; Z. Chen; Z. Chen; J. Hu; L. Cao; S. Zhang; |
402 | CANet: Curved Guide Line Network with Adaptive Decoder for Lane Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Lots of solutions were proposed, but cannot deal with corner lanes well. To address this problem, this paper proposes a new top-down deep learning lane detection approach, CANet. |
Z. Yang; C. Shen; W. Shao; T. Xing; R. Hu; P. Xu; H. Chai; R. Xue; |
403 | Can Knowledge of End-to-End Text-to-Speech Models Improve Neural MIDI-to-Audio Synthesis Systems? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we analyze the shortcomings of a TTS-based MIDI-to-audio system and improve it in terms of feature computation, model selection, and training strategy, aiming to synthesize highly natural-sounding audio. |
X. Shi; E. Cooper; X. Wang; J. Yamagishi; S. Narayanan; |
404 | Can Spoofing Countermeasure And Speaker Verification Systems Be Jointly Optimised? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Using only a modest quantity of auxiliary data collected from new speakers, we show that joint optimisation degrades the performance of separate CM and ASV sub-systems, but that it nonetheless improves complementarity, thereby delivering superior SASV performance. |
W. Ge; H. Tak; M. Todisco; N. Evans; |
405 | Capacity Maximization for Active RIS Assisted Outdoor-to-Indoor Communication System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we aim to implement outdoor-to-indoor communication with the aid of an active reconfigurable intelligent surface (active-RIS), where the active-RIS allows the incoming signal from an outdoor base station (BS) to pass through the surface and be received by indoor users (UEs) after shifting phase and magnifying amplitude. |
C. He; W. Gong; Y. Dong; X. Xie; Z. J. Wang; |
406 | Capturing Cross-Scale Disparity for Stereo Image Super-Resolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper focuses on how to effectively exploit the disparity information between stereo viewpoints and proposes a cross-scale parallax-attention network (CSPAN) for stereo image SR. |
K. He; C. Li; D. Zhang; J. Shao; |
407 | Cardiac Disease Diagnosis on Imbalanced Electrocardiography Data Through Optimal Transport Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on a new method of data augmentation to solve the data imbalance problem within imbalanced ECG datasets to improve the robustness and accuracy of heart disease detection. |
J. Qiu; J. Zhu; M. Xu; P. Huang; M. Rosenberg; D. Weber; E. Liu; D. Zhao; |
408 | Cascading and Direct Approaches to Unsupervised Constituency Parsing on Spoken Sentences Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present the first study on unsupervised spoken constituency parsing given unlabeled spoken sentences and unpaired textual data. |
Y. Tseng; C. -I. J. Lai; H. -Y. Lee; |
409 | CAT: Causal Audio Transformer for Audio Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a Causal Audio Transformer (CAT) consisting of a Multi-Resolution Multi-Feature (MRMF) feature extraction with an acoustic attention block for more optimized audio modeling. |
X. Liu; H. Lu; J. Yuan; X. Li; |
410 | Causal Discovery and Causal Inference Based Counterfactual Fairness in Machine Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a framework named Structural Causal Fairness Framework (SCFF) to achieve counterfactual fairness without assumptions like previous works. |
Y. Wang; Z. Luo; |
411 | CB-Conformer: Contextual Biasing Conformer for Biased Word Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose CB-Conformer to improve biased word recognition by introducing the Contextual Biasing Module and the Self-Adaptive Language Model to vanilla Conformer. |
Y. Xu; B. Liu; Q. Huang; X. Song; Z. Wu; S. Kang; H. Meng; |
412 | CC-PoseNet: Towards Human Pose Estimation in Crowded Classrooms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on improving human pose estimation in crowded classrooms from the perspective of crowd detection and pose refinement. |
Z. Yu; Y. Hu; S. Xiang; T. Liu; Y. Fu; |
413 | CD-FSOD: A Benchmark For Cross-Domain Few-Shot Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a study of the cross-domain few-shot object detection (CD-FSOD) benchmark, consisting of image data from a diverse data domain. |
W. Xiong; |
414 | CDHD: Contrastive Dreamer for Hint Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Such homogenous images further hinder the knowledge distillation process when regularising only the deeper layers close to the output, resulting in catastrophic forgetting. To address these issues, we present CDHD: a contrastive dreamer for hint distillation. |
L. Yu; T. Hua; W. Yang; P. Ye; Q. Liao; |
415 | Centralized Cascade Multi-Channel Noise Reduction and Acoustic Feedback Cancellation in A Wireless Acoustic Sensor And Actuator Network Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a centralized cascade multi-channel noise reduction (NR) and acoustic feedback cancellation (AFC) algorithm for speech applications in a wireless acoustic sensor and actuator network (WASAN). |
S. Ruiz; T. van Waterschoot; M. Moonen; |
416 | Central Nodes Detection from Partially Observed Graph Signals Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper focuses on detecting the central nodes in a graph from partially observed graph signals with unknown graph topology. |
Y. He; H. -T. Wai; |
417 | Centroid Distance Distillation for Effective Rehearsal in Continual Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we focus on tackling the continual domain drift problem with centroid distance distillation. |
D. Liu; F. Lyu; L. Li; Z. Xia; F. Hu; |
418 | Certified Robustness of Quantum Classifiers Against Adversarial Examples Through Quantum Noise Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a first theoretical study showing that utilizing added quantum random rotation noise can improve the robustness of quantum classifiers against adversarial attacks. |
J. -C. Huang; Y. -L. Tsai; C. -H. H. Yang; C. -F. Su; C. -M. Yu; P. -Y. Chen; S. -Y. Kuo; |
419 | CFFMixer: Multi-Dimensional Feature Fusion for Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Considering that different modules are applicable to different dimensions, we proposed an object detector named CFFMixer which used hybrid architecture to achieve multi-dimensional feature fusion. |
H. Xie; W. Yuan; B. Kang; S. Du; |
420 | CF-VTON: Multi-Pose Virtual Try-on with Cross-Domain Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Previous works in this field have encountered issues such as unnatural garment alignment and difficulty in preserving the person’s identity, arising from the weak mapping relationship between different feature crosses. To address these challenges, this paper proposes a novel multi-pose virtual try-on network named CF-VTON. |
C. Du; S. Xiong; |
421 | Change Point Detection with Neural Online Density-Ratio Estimator Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a change point detection method using an online approach based on neural networks to directly estimate the density-ratio between current and reference windows of the data stream. |
X. Wang; R. A. Borsoi; C. Richard; J. Chen; |
422 | Channel-Driven Decentralized Bayesian Federated Learning for Trustworthy Decision Making in D2D Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on the observation that DSGLD applies random Gaussian perturbations to the model parameters, we propose to leverage channel noise on the D2D links as a mechanism for MCMC sampling. |
L. Barbieri; O. Simeone; M. Nicoli; |
423 | Channel Estimation in Massive MIMO with Heavy-Tailed Noise: Gaussian-Mixture Versus Cauchy Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compare two types of massive multiple-input multiple-output (MIMO) receivers, namely those based on a Gaussian-mixture assumption and those based on a Cauchy assumption, in terms of channel estimation quality, when the noise is impulsive. |
Z. Gülgün; E. G. Larsson; |
424 | Channel Estimation with Tightly-Coupled Antenna Arrays Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper develops a linear minimum mean-square error (LMMSE) channel estimator that takes advantage of the mutual coupling in antenna arrays. |
B. Tadele; V. Shyianov; F. Bellili; A. Mezghani; |
425 | Channel State Information-Free Artificial Noise-Aided Location-Privacy Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, an artificial noise-aided strategy is presented for location-privacy preservation. |
J. Li; U. Mitra; |
426 | Choice Fusion As Knowledge For Zero-Shot Dialogue State Tracking Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Although prior works have leveraged question-answering (QA) data to reduce the need for in-domain training in DST, they fail to explicitly model knowledge transfer and fusion for tracking dialogue states. To address this issue, we propose CoFunDST, which is trained on domain-agnostic QA datasets and directly uses candidate choices of slot-values as knowledge for zero-shot dialogue-state generation, based on a T5 pre-trained language model. |
R. Su; J. Yang; T. -W. Wu; B. -H. Juang; |
427 | Chord-Conditioned Melody Harmonization With Controllable Harmonicity Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Melody harmonization has long been closely associated with chorales composed by Johann Sebastian Bach. Previous works rarely emphasised chorale generation conditioned on chord … |
S. Wu; X. Li; M. Sun; |
428 | CLAP Learning Audio Concepts from Natural Language Supervision Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Instead, we propose to learn audio concepts from natural language supervision. |
B. Elizalde; S. Deshmukh; M. A. Ismail; H. Wang; |
429 | ClassA Entropy for The Analysis of Structural Complexity of Physiological Signals Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite the recent theoretical boom in Sample Entropy based algorithms for the analysis of physiological and pathological systems, the major issue which prevents their more widespread use remains that of large computational load, particularly in the studies of quantification of structural richness in data. |
H. Xiao; L. Li; D. P. Mandic; |
430 | Class-Aware Contextual Information for Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a CACINet, which consists of a Semantic Affinity Module (SAM) and a Class Association Module (CAM), to generate class-aware contextual information among pixels on a fine-grained level. |
H. Tang; Y. Zhao; Y. Jiang; Z. Gan; Q. Wu; |
431 | Class-Aware Shared Gaussian Process Dynamic Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A new method of Gaussian process dynamic model (GPDM), named class-aware shared GPDM (CSGPDM), is presented in this paper. |
R. Sawata; T. Ogawa; M. Haseyama; |
432 | Class-Guided Triple Head Prediction Network for Long-Tail Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To alleviate this problem, we devise a novel Class-guided Triple Head Prediction Network (CTHNet). Considering the long-tail LVIS dataset contains frequent, common and rare classes, we propose a Triple Box Heads (TBH) to deal with these three classes, enhancing discriminative representations for all classes. |
X. Liu; Y. Zheng; |
433 | Classification-Based Dynamic Network for Efficient Super-Resolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To accelerate network inference under resource constraints, we propose a classification-based dynamic network for efficient super-resolution (CDNSR), which combines the classification and SR networks in a unified framework. |
Q. Wang; W. Fang; M. Wang; Y. Cheng; |
434 | Classification of Synthetic Facial Attributes By Means of Hybrid Classification/Localization Patch-Based Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new architecture whose objective is to identify the altered facial attributes of synthetic face images. |
J. Wang; B. Tondi; M. Barni; |
435 | Classification of The Cervical Vertebrae Maturation (CVM) Stages Using The Tripod Network Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a novel deep learning method for fully automated detection and classification of the Cervical Vertebrae Maturation (CVM) stages. |
S. Atici; H. Pan; M. H. Elnagar; V. Allareddy; O. Suhaym; R. Ansari; A. E. Cetin; |
436 | Classification Via Subspace Learning Machine (SLM): Methodology and Performance Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the decision learning process of multilayer perceptron (MLP) and decision tree (DT), a new classification model, named the subspace learning machine (SLM), is proposed in this work. |
H. Fu; Y. Yang; V. K. Mishra; C. -C. Jay Kuo; |
437 | Classifying Non-Individual Head-Related Transfer Functions with A Computational Auditory Model: Calibration And Metrics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the use of a multi-feature Bayesian auditory sound localisation model to classify non-individual head-related transfer functions (HRTFs). |
R. Daugintis; R. Barumerli; L. Picinali; M. Geronazzo; |
438 | Classifying Pathological Images Based on Multi-Instance Learning and End-to-End Attention Pooling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to address the issue that previous deep learning methods for classifying pathological images cannot adaptively learn features, we propose an end-to-end attention pooling method based on a multi-instance learning patch scoring model. |
Y. Chen; J. Liu; Z. Zuo; P. Jiang; Y. Jin; G. Wu; |
439 | Class-Incremental Learning on Multivariate Time Series Via Shape-Aligned Temporal Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on practical privacy-sensitive circumstances, we propose a novel distillation-based strategy using a single-headed classifier without saving historical samples. |
Z. Qiao; M. Hu; X. Jiang; P. N. Suganthan; R. Savitha; |
440 | Cleanformer: A Multichannel Array Configuration-Invariant Neural Enhancement Frontend for ASR in Smart Speakers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work introduces Cleanformer, a streaming multichannel neural enhancement frontend for automatic speech recognition (ASR). |
J. Caroselli; A. Narayanan; N. Howard; T. O’Malley; |
441 | Clean Sample Guided Self-Knowledge Distillation for Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, for online Self-knowledge Distillation (SD), DA is not always beneficial because of the absence of a trustworthy teacher model. To address this issue, this paper proposes an SD method named Clean sample guided Self-knowledge Distillation (CleanSD), in which the original clean sample is used as a guide when the model is trained with the augmented samples. |
J. Wang; Y. Li; Q. He; W. Xie; |
442 | CLICKER: Attention-Based Cross-Lingual Commonsense Knowledge Transfer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While mPTMs show the potential to encode commonsense knowledge for different languages, transferring commonsense knowledge learned in large-scale English corpus to other languages is challenging. To address this problem, we propose the attention-based Cross-LIngual Commonsense Knowledge transfER (CLICKER) framework, which minimizes the performance gaps between English and non-English languages in commonsense question-answering tasks. |
R. Su; Z. Sun; S. Lu; C. Ma; C. Guo; |
443 | Client Selection for Generalization in Accelerated Federated Learning: A Bandit Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel multi-armed bandit (MAB)-based approach for CS to minimize the training latency without harming the ability of the model to generalize, i.e., to give reliable predictions for new observations. |
D. B. Ami; K. Cohen; Q. Zhao; |
444 | CLIP4VideoCap: Rethinking CLIP for Video Captioning with Multiscale Temporal Fusion and Commonsense Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose CLIP4VideoCap for video captioning based on large-scale pre-trained CLIP image and text encoders together with multi-scale temporal reasoning and commonsense knowledge. |
T. Mahmud; F. Liang; Y. Qing; D. Marculescu; |
445 | CLMAE: A Liter and Faster Masked Autoencoders Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, pre-training on big datasets suffers a lengthy training schedule and large memory consumption. To alleviate these problems, we propose a light-weighted model called Convolutional Lite Masked AutoEncoder (CLMAE). |
Y. Song; L. Ma; |
446 | Clustered Greedy Algorithm For Large-Scale Sensor Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a clustering-based solution called clustered greedy selection (CGS) which not only reduces the problem size, but also achieves a similar performance to GS. |
K. Majumder; S. R. B. Pillai; S. Mulleti; |
447 | Clustering-Based Supervised Contrastive Learning for Identifying Risk Items on Heterogeneous Graph Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel Clustering-based Supervised Contrastive Learning (CSCL) to address the two challenges. |
A. Li; Y. Ji; G. Chu; X. Wang; D. Li; C. Shi; |
448 | CM-CS: Cross-Modal Common-Specific Feature Learning For Audio-Visual Video Parsing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose a novel cross-modal common-specific feature learning method (cm-CS) to map the modal features into modality-common and modality-specific subspaces. |
H. Chen; D. Zhu; G. Zhang; W. Shi; X. Zhang; J. Li; |
449 | CN-CVS: A Mandarin Audio-Visual Dataset for Large Vocabulary Continuous Visual to Speech Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Research on Video to Speech Synthesis (VTS) has surged recently, and the focus is gradually shifting from small-vocabulary short-phrase VTS to large-vocabulary continuous VTS (LVC-VTS). A large-scale dataset with sufficient speakers and utterances is a prerequisite for such research, and the database is certainly language dependent. In this paper, we introduce CN-CVS, a large-scale Mandarin continuous visual-speech dataset, to support LVC-VTS research. |
C. Chen; D. Wang; T. F. Zheng; |
450 | CNEG-VC: Contrastive Learning Using Hard Negative Example In Non-Parallel Voice Conversion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A positive example could not be effectively pushed toward the query examples. We present contrastive learning in non-parallel voice conversion to solve this problem using hard negative examples. |
B. Prihasto; Y. -X. Lin; P. T. Le; C. -L. Huang; J. -C. Wang; |
451 | CNN Filter for RPR-Based SR in VVC with Wavelet Decomposition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a convolutional neural network (CNN) filter for reference picture resampling (RPR)-based super-resolution (SR) with wavelet decomposition. |
H. Lan; C. Jung; Y. Liu; M. Li; |
452 | CNN Filter for Super-Resolution with RPR Functionality in VVC Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a convolutional neural network (CNN) filter for super-resolution (SR) with the RPR functionality in VVC. |
S. Huang; C. Jung; Y. Liu; M. Li; |
453 | Coarse-to-Fine Covid-19 Segmentation Via Vision-Language Alignment Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, there are few relevant studies due to the lack of detailed information and high-quality annotation in the COVID-19 dataset. To solve the above problem, we propose C2FVL, a Coarse-to-Fine segmentation framework via Vision-Language alignment to merge text information containing the number of lesions and specific locations of image information. |
D. Shan; Z. Li; W. Chen; Q. Li; J. Tian; Q. Hong; |
454 | Coarse-To-Fine Knowledge Selection for Document Grounded Dialogs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes Re3G, which aims to optimize both coarse-grained knowledge retrieval and fine-grained knowledge extraction in a unified framework. |
Y. Zhang; H. Fu; C. Fu; H. Yu; Y. Li; C. -T. Nguyen; |
455 | Cochlear Decomposition: A Novel Bio-Inspired Multiscale Analysis Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, issues, such as mode mixing for signals with closely spaced modes, have been identified. To confront such problems, we propose here a novel spatial auditory decomposition framework for non-stationary signals, namely the Cochlear Decomposition (CD). |
H. Alfalahi; A. Khandoker; G. Alhussein; L. Hadjileontiadis; |
456 | Cocktail HuBERT: Generalized Self-Supervised Pre-Training for Mixture and Single-Source Speech Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents Cocktail HuBERT, a self-supervised learning framework that generalizes to mixture speech using a masked pseudo source separation objective. |
M. Fazel-Zarandi; W. -N. Hsu; |
457 | Codebook-Based User Tracking in IRS-Assisted mmWave Communication Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel mobile user tracking (UT) scheme for codebook-based intelligent reflecting surface (IRS)-aided millimeter wave (mmWave) systems. |
M. Garkisch; V. Jamali; R. Schober; |
458 | Coded Matrix Computations for D2D-Enabled Linearized Federated Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel straggler-optimal approach for coded matrix computations which can significantly reduce the communication delay and privacy issues introduced from D2D data transmissions in FL. |
A. B. Das; A. Ramamoorthy; D. J. Love; C. G. Brinton; |
459 | Code-Enhanced Fine-Grained Semantic Matching For Tag Recommendation In Software Information Sites Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Most existing methods usually ignore the semantic information of code snippets in software information sites. To tackle this issue, we regard the code as a semantic enhancement signal, and propose a novel Code-Enhanced fine-grained semantic matching method for Tag Recommendation in software information sites (CETR) to learn the matching score between tags and software objects. |
L. Li; P. Wang; X. Zheng; Q. Xie; |
460 | Codes Correcting Burst and Arbitrary Erasures for Reliable and Low-Latency Communication Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by modern network communication applications which require low latency, we study codes that correct erasures with low decoding delay. |
S. Kas Hanna; Z. Tan; W. Xu; A. Wachter-Zeh; |
461 | Co-Design for MIMO Radar and MIMO Communication Aided By Reconfigurable Intelligent Surface Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To achieve the goal, we develop a cyclic framework based on semi-definite programming (SDP), semi-definite relaxation (SDR), and the alternating direction method of multipliers (ADMM) to jointly optimize the radar transmit waveforms, the receive filters, the communication codebook, and the RIS coefficients. |
D. Li; B. Tang; L. Xue; |
462 | Code-Switching Speech Synthesis Based on Self-Supervised Learning and Domain Adaptive Speaker Encoder Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these methods are still challenging to synthesize highly natural speech. In order to solve the above problems, we introduce self-supervised learning and frame-level domain adversarial training into the speaker encoder based on the speaker verification task, so that the speaker vectors of different languages keep a consistent distribution in the speaker space, and the performance of speech synthesis is improved. |
Y. -X. Lin; C. -H. Pai; P. T. Le; B. Prihasto; C. -L. Huang; J. C. Wang; |
463 | Code-Switching Text Generation and Injection in Mandarin-English ASR Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate text generation and injection for improving the performance of an industry commonly-used streaming model, Transformer-Transducer (T-T), in Mandarin-English code-switching speech recognition. |
H. Yu; Y. Hu; Y. Qian; M. Jin; L. Liu; S. Liu; Y. Shi; Y. Qian; E. Lin; M. Zeng; |
464 | Cold Diffusion for Speech Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The unique mathematical properties of the sampling process from cold diffusion could be utilized to restore high-quality samples from arbitrary degradations. Based on these properties, we propose an improved training algorithm and objective to help the model generalize better during the sampling process. |
H. Yen; F. G. Germain; G. Wichern; J. L. Roux; |
465 | Collaborative Audio-Visual Event Localization Based on Sequential Decision and Cross-Modal Consistency Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, current approaches model the AVE localization task as a sequential classification process, through which event-relevant video segments cannot accurately collaborate with each other. Therefore, we propose the Collaborative Segments Decision (CSD) that can collaborate between event-relevant video segments by modeling the AVE localization task as a sequential decision process. |
Y. Kuang; X. Fan; |
466 | Color Guided Depth Map Super-Resolution with Nonlocal Autoregressive Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a color guided depth map super-resolution method with nonlocal autoregressive modeling. |
W. Xu; N. Qi; Q. Zhu; J. Qi; L. Huang; K. Cao; Y. Bao; Q. Wang; |
467 | Column-Based Matrix Approximation with Quasi-Polynomial Structure Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work designs the first provable matrix approximation algorithm using just column samples. |
J. Chae; P. Narayanamurthy; S. Bac; S. M. Sharada; U. Mitra; |
468 | Combining Dual-Tree Wavelet Analysis and Proximal Optimization for Anisotropic Scale-Free Texture Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To minimize the corresponding functional, a primal-dual proximal convergent algorithm is devised and accelerated by taking advantage of the strong convexity of the data-fidelity term. |
L. Davy; N. Pustelnik; P. Abry; |
469 | Combining Loss Reweighting and Sample Resampling for Long-Tailed Instance Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The recent long-tailed solutions only consider loss reweighting or sample resampling, which still suffers from the gradient imbalance of positive and negative samples and the overfitting risk of the tail classes. To address these problems, we propose a novel reweighting method, named Foreground and Background Separation Loss (FBSL), to alleviate the imbalance problem of the tail classes being suppressed by the overwhelming foreground and background during the learning process of the model. |
Y. Zhao; S. Chen; Q. Chen; Z. Hu; |
470 | Combining The Silhouette and Skeleton Data for Gait Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, appearance-based methods are greatly affected by clothes-changing and carrying conditions, while model-based methods are limited by the accuracy of pose estimation. To tackle this challenge, a simple yet effective two-branch network is proposed in this paper, which contains a CNN-based branch taking silhouettes as input and a GCN-based branch taking skeletons as input. |
L. Wang; R. Han; W. Feng; |
471 | CommDRE: Document-Level Relation Extraction with Self-Supervised Commonsense Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a self-supervised commonsense-enhanced DocRE model, called CommDRE, without external knowledge. |
R. Li; J. Zhong; Z. Xue; Q. Dai; C. Wang; X. Li; |
472 | Communication-Constrained Exchange of Zeroth-Order Information with Application to Collaborative Target Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study a communication-constrained multi-agent zeroth-order online optimization problem within the federated learning (FL) setting with application to target tracking where multiple agents have access only to the knowledge of their current distances to their respective targets. |
E. C. Kaya; M. Berk Sahin; A. Hashemi; |
473 | Community Detection Graph Convolutional Network for Overlap-Aware Speaker Diarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel graph-based clustering approach called Community Detection Graph Convolutional Network (CDGCN) to improve the performance of the speaker diarization system. |
J. Wang; Z. Chen; H. Zhou; L. Li; Q. Hong; |
474 | Comparative Layer-Wise Analysis of Self-Supervised Speech Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we examine the intermediate representations for a variety of recent models. |
A. Pasad; B. Shi; K. Livescu; |
475 | Comparative Study of IRS Assisted Opportunistic Communications Over i.i.d. and LoS Channels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we consider intelligent reflecting surface (IRS) assisted opportunistic communications (OC), and present a comparative analysis of the system throughput over independent and identically distributed (i.i.d.) and line-of-sight (LoS) channels. |
L. Yashvanth; C. R. Murthy; |
476 | Comparing Decentralized Gradient Descent Approaches and Guarantees Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In recent work, we presented constructive convergence guarantees for Dec-AltProjGDmin under simple assumptions. |
S. Moothedath; N. Vaswani; |
477 | Comparison of Soft and Hard Target RNN-T Distillation for Large-Scale ASR Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on knowledge distillation for the RNN-T model, which is widely used in state-of-the-art (SoTA) automatic speech recognition (ASR). |
D. Hwang; K. Chai Sim; Y. Zhang; T. Strohman; |
478 | Compensatory Debiasing For Gender Imbalances In Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It is particularly challenging to detach and remove biased representations in the embedding space because the learned linguistic knowledge entails bias. To address this problem, we propose a compensatory debiasing strategy to reduce gender bias while preserving linguistic knowledge. |
T. -J. Woo; W. -J. Nam; Y. -J. Ju; S. -W. Lee; |
479 | Complementary Learning System Based Intrinsic Reward in Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the fact that humans evaluate curiosity by comparing current observations with historical information, we propose a novel intrinsic reward, namely CLS-IR, which aims to address the problems caused by sparse extrinsic rewards. |
Z. Gao; K. Xu; H. Jia; T. Wan; B. Ding; D. Feng; X. Mao; H. Wang; |
480 | #NAME? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a low-rank approximation algorithm of singular value decomposition (SVD) for large-scale matrices in tensor train format (TT-format). |
J. -C. Chi; C. -E. Chen; Y. -H. Huang; |
481 | Compose & Embellish: Well-Structured Piano Performance Generation Via A Two-Stage Approach Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Observing the above, we devise a two-stage Transformer-based framework that Composes a lead sheet first, and then Embellishes it with accompaniment and expressive touches. |
S. -L. Wu; Y. -H. Yang; |
482 | Composition of Motion from Video Animation Through Learning Local Transformations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we solve the problem of motion representation in videos, according to local transformations applied to specific keypoints extracted from the static images. |
M. Vrigkas; V. Tagka; M. E. Plissiti; C. Nikou; |
483 | Comprehensive Complexity Assessment of Emerging Learned Image Compression on CPU and GPU Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The reported results (1) quantify the complexity of LC methods, (2) fairly compare different methods, and (3) identify and quantify the key factors affecting the complexity, which is a major contribution of the work. |
F. Pakdaman; M. Gabbouj; |
484 | Compressed Distributed Regression Over Adaptive Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We examine the learning performance achievable by a network of agents that solve a distributed regression problem using the recently proposed ACTC (Adapt-Compress-Then-Combine) diffusion strategy. |
M. Carpentiero; V. Matta; A. H. Sayed; |
485 | Compressed-Sensing-Based 3D Localization with Distributed Passive Reconfigurable Intelligent Surfaces Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, the programmable signal propagation paradigm, enabled by Reconfigurable Intelligent Surfaces (RISs), is exploited for high accuracy 3-Dimensional (3D) user localization with a single multi-antenna base station. |
J. He; A. Fakhreddine; H. Wymeersch; G. C. Alexandropoulos; |
486 | Compressing Cross-Domain Representation Via Lifelong Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we address a more challenging scenario in which different tasks are presented sequentially, at different times, and the learning goal is to transfer the generative factors of visual concepts learned by a Teacher module to a compact latent space represented by a Student module. |
F. Ye; A. G. Bors; |
487 | Compressive Channel Estimation for IRS-Aided Millimeter-Wave Systems Via Two-Stage LAMP Network Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By exploiting the low-rank nature of mmWave channels in the virtual angular domain (VAD) and the powerful learned approximate message passing (LAMP) network, we propose a two-stage LAMP network with row compression (RCTS-LAMP). |
W. -C. Tsai; C. -W. Chen; A. -Y. A. Wu; |
488 | Compressive Estimation of Near Field Channels for Ultra Massive-MIMO Wideband THz Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we develop a channel estimation strategy for a terahertz (THz) ultra-massive multiple-input multiple-output (UM-MIMO) system with a sub-connected array-of-subarrays architecture, in which one subarray (SA) is connected to one RF chain exclusively. |
S. Tarboush; A. Ali; T. Y. Al-Naffouri; |
489 | Compressive Sensing with Tensorized Autoencoder Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, our goal is to recover images without access to the ground-truth (clean) images using the articulations as structural prior of the data. |
R. Hyder; M. S. Asif; |
490 | Conditional Conformer: Improving Speaker Modulation For Single And Multi-User Speech Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recently, Feature-wise Linear Modulation (FiLM) has been shown to outperform other approaches to incorporate speaker embedding into speech separation and VoiceFilter models. We propose an improved method of incorporating such embeddings into a VoiceFilter frontend for automatic speech recognition (ASR) and text-independent speaker verification (TI-SV). |
T. O’Malley; S. Ding; A. Narayanan; Q. Wang; R. Rikhye; Q. Liang; Y. He; I. McGraw; |
491 | Conditional LS-GAN Based Skylight Polarization Image Restoration and Application in Meridian Localization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a deep learning-based methodology for restoring SPIs and utilizing the restored images for navigation. |
T. Yang; H. Bo; X. Yang; J. Gao; Z. Shi; |
492 | Conditioning and Sampling in Variational Diffusion Models for Speech Super-Resolution Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel sampling algorithm that communicates the information of the low-resolution audio via the reverse sampling process of DMs. |
C. -Y. Yu; S. -L. Yeh; G. Fazekas; H. Tang; |
493 | CO-NET: Classification-Oriented Point Cloud Sampling Via Informative Feature Learning and Non-Overlapped Local Adjustment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a classification-oriented sampling network named CO-Net, aiming to learn informative sampled points that benefit downstream classification tasks. |
Y. Lin; K. Chen; S. Zhou; Y. Huang; Y. Lei; |
494 | Confidence-Based Event-Centric Online Video Question Answering on A Newly Constructed ATBS Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the challenges of VideoQA on long videos of unknown length, we define a new set of problems called Online Open-ended Video Question Answering (O2VQA). |
W. Kong; S. Ye; C. Yao; J. Ren; |
495 | Conformer-Based Target-Speaker Automatic Speech Recognition For Single-Channel Audio Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose CONF-TSASR, a non-autoregressive end-to-end time-frequency domain architecture for single-channel target-speaker automatic speech recognition (TS-ASR). |
Y. Zhang; K. C. Puvvada; V. Lavrukhin; B. Ginsburg; |
496 | CONSEN: Complementary and Simultaneous Ensemble for Alzheimer’s Disease Detection and MMSE Score Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a novel method for Alzheimer’s disease detection and MMSE prediction using a complementary and simultaneous ensemble (CONSEN) algorithm based on multilingual spontaneous speech. |
L. Jin; Y. Oh; H. Kim; H. Jung; H. J. Jon; J. E. Shin; E. Y. Kim; |
497 | Consistent Estimators of A New Class of Covariance Matrix Distances in The Large Dimensional Regime Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The problem of estimating the distance between two covariance matrices is considered. A general estimator is provided for a class of metrics, the estimator of which has never been … |
R. Pereira; X. Mestre; D. Gregoratti; |
498 | Constrained Dynamical Neural ODE for Time Series Modelling: A Case Study on Continuous Emotion Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose constrained dynamical neural ordinary differential equation (CD-NODE) models, which treat the desired time series as a dynamic process that can be described by an ODE. |
T. Dang; A. Dimitriadis; J. Wu; V. Sethu; E. Ambikairajah; |
499 | Constrained Independent Component Analysis Based on Entropy Bound Minimization for Subgroup Identification from Multi-subject fMRI Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a constrained independent component analysis algorithm based on minimizing the entropy bound (c-EBM) to overcome the computational complexity limitation of IVA. |
H. Yang; F. Ghayem; B. Gabrielson; M. A. B. S. Akhonda; V. D. Calhoun; T. Adali; |
500 | Constrained Non-negative PARAFAC2 for Electromyogram Separation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The objective of this paper is to reduce the crosstalk during the simultaneous extension of the muscles of the index finger and the little finger from a matrix of surface electromyography (sEMG) signals. |
A. Magbonde; F. Quaine; B. Rivet; |
501 | Content-Insensitive Dynamic Lip Feature Extraction for Visual Speaker Authentication Against Deepfake Attacks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, with emerging deepfake technology, attackers can make high fidelity talking videos of a user, thus posing a great threat to these systems. Confronted with this threat, we propose a new deep neural network for lip-based visual speaker authentication against human imposters and deepfake attacks. |
Z. Guo; S. Wang; |
502 | Context-Aware Coherent Speaking Style Prediction with Hierarchical Transformers for Audiobook Speech Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a context-aware coherent speaking style prediction method for audiobook speech synthesis. |
S. Lei; Y. Zhou; L. Chen; Z. Wu; S. Kang; H. Meng; |
503 | Context-Aware End-to-end ASR Using Self-Attentive Embedding and Tensor Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a context-aware end-to-end ASR model that injects the self-attentive context embedding into the decoder of the recurrent neural network transducer (RNN-T). |
S. -Y. Chang; C. Zhang; T. N. Sainath; B. Li; T. Strohman; |
504 | Context-Aware Face Clustering with Graph Convolutional Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a Context-Aware Graph Convolutional Network (CAGCN) to explicitly consider both the global and local information. |
D. Zhang; J. Guo; Z. Jin; |
505 | Context-Aware Fine-Tuning of Self-Supervised Speech Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the use of context, i.e., surrounding segments, during fine-tuning and propose a new approach called context-aware fine-tuning. |
S. Shon; F. Wu; K. Kim; P. Sridhar; K. Livescu; S. Watanabe; |
506 | Contextually-Rich Human Affect Perception Using Multimodal Scene Information Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we leverage pretrained vision-language (VLN) models to extract descriptions of foreground context from images. |
D. Bose; R. Hebbar; K. Somandepalli; S. Narayanan; |
507 | Contextual Similarity Is More Valuable Than Character Similarity: An Empirical Study for Chinese Spell Checking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To make better use of contextual information, we propose a simple yet effective Curriculum Learning (CL) framework for the CSC task. |
D. Zhang; Y. Li; Q. Zhou; S. Ma; Y. Li; Y. Cao; H. -T. Zheng; |
508 | ContiNILM: A Continual Learning Scheme for Non-Intrusive Load Monitoring Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work alleviates the aforementioned limitation by introducing ContiNILM, a continual learning scheme for NILM to build robust models that track environmental/seasonal alterations with direct impact on several appliances’ operation. |
S. Sykiotis; M. Kaselimi; A. Doulamis; N. Doulamis; |
509 | Continual Cell Instance Segmentation of Microscopy Images Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In addition, as acquiring annotations is label-intensive, cell images can be partially labeled. In this paper, we present iMRCNN, which extends Mask R-CNN with knowledge distillation and pseudo labeling, to address these challenges. |
T. -T. Chuang; T. -Y. Wei; Y. -H. Hsieh; C. -S. Chen; H. -F. Yang; |
510 | Continual Learning for On-Device Speech Recognition Using Disentangled Conformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This algorithm produces ASR models consisting of a frozen ‘core’ network for general-purpose use and several tunable ‘augment’ networks for speaker-specific tuning. Using such models, we propose a novel compute-efficient continual learning algorithm called DisentangledCL. |
A. Diwan; C. -F. Yeh; W. -N. Hsu; P. Tomasello; E. Choi; D. Harwath; A. Mohamed; |
511 | Continuous Action Space-Based Spoken Language Acquisition Agent Using Residual Sentence Embedding and Transformer Decoder Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Studies on spoken language acquisition agents aim to understand the mechanism of human language learning and to realize it on computers. |
R. Komatsu; Y. Kimura; T. Okamoto; T. Shinozaki; |
512 | Continuous Descriptor-Based Control for Deep Audio Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We assess the performance of our method on a wide variety of sounds including instrumental, percussive and speech recordings while providing both timbre and attributes transfer, allowing new ways of generating sounds. |
N. Devis; N. Demerlé; S. Nabi; D. Genova; P. Esling; |
513 | Continuous Interaction with A Smart Speaker Via Low-Dimensional Embeddings of Dynamic Hand Pose Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a new continuous interaction strategy with visual feedback of hand pose and mid-air gesture recognition and control for a smart music speaker, which utilizes only 2 video frames to recognize gestures. |
S. Xu; C. Kaul; X. Ge; R. Murray-Smith; |
514 | Continuous Learning for Blind Image Quality Assessment with Contrastive Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a Transformer-based BIQA contrastive continual learning approach to improve model transfer performance. |
J. Yang; Z. Wang; B. Huang; L. Deng; |
515 | Contrastive Domain Adaptation Via Delimitation Discriminator Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we proposed contrastive domain adaptation via delimitation discriminator (CDVD), which addresses the inconsistency problem of optimizing contrastive learning and classification tasks. |
X. Wei; B. Wen; L. Chen; Y. Liu; C. Zhao; Y. Lu; |
516 | Contrastive Learning at The Relation and Event Level for Rumor Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, manual data annotation in realistic cases is very expensive and time-consuming. In this paper, we propose a novel self-supervised Relation-Event based Contrastive Learning (RECL) framework for rumor detection to address the above issue. |
Y. Xu; J. Hu; J. Ge; Y. Wu; T. Li; H. Li; |
517 | Contrastive Learning-Based Audio to Lyrics Alignment for Multiple Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we use instead a contrastive learning procedure that derives cross-modal embeddings linking the audio and text domains. |
S. Durand; D. Stoller; S. Ewert; |
518 | Contrastive Learning of Functionality-Aware Code Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present Functionality-aware Code Embeddings (FaCE) in terms of contrastive learning. |
Y. Li; H. Wu; H. Zhao; |
519 | Contrastive Learning of Sentence Embeddings in Product Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose WS-SimCSE, a weak supervision approach based on graph neural networks, which utilizes user behavior data to model relevance relationship between queries and items in a heterogeneous graph. |
B. -W. Zhang; Y. Yan; J. Yu; |
520 | Contrastive Learning with Dialogue Attributes for Neural Dialogue Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes to guide the response generation with attribute-aware contrastive learning to improve the overall quality of the generated responses, where contrastive learning samples are generated according to various important dialogue attributes each specializing in a different principle of conversation. |
J. Tan; H. Cai; H. Chen; H. Cheng; H. Meng; Z. Ding; |
521 | Contrastive Representation Learning for Acoustic Parameter Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A study is presented in which a contrastive learning approach is used to extract low-dimensional representations of the acoustic environment from single-channel, reverberant speech signals. |
P. Götz; C. Tuna; A. Walther; E. A. P. Habets; |
522 | Contrastive Self-Supervised Learning for Automated Multi-Modal Dance Performance Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A fundamental challenge of analyzing human motion is to effectively represent human movements both spatially and temporally. We propose a contrastive self-supervised strategy to tackle this challenge. |
Y. Zhong; F. Zhang; Y. Demiris; |
523 | Contrastive Speech Mixup for Low-Resource Keyword Spotting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, with the rising demand for smart devices to become more personalized, KWS models need to adapt quickly to smaller user samples. To tackle this challenge, we propose a contrastive speech mixup (CosMix) learning algorithm for low-resource KWS. |
D. Ng; R. Zhang; J. Q. Yip; C. Zhang; Y. Ma; T. H. Nguyen; C. Ni; E. S. Chng; B. Ma; |
524 | Contrast-PLC: Contrastive Learning for Packet Loss Concealment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose to use contrastive learning to learn a loss-robust semantic representation for PLC. |
H. Xue; X. Peng; Y. Lu; |
525 | Controllable Music Inpainting with Mixed-Level and Disentangled Representation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we contribute a controllable inpainting model by combining the high expressivity of mixed-level, disentangled music representations and the strong predictive power of masked language modeling. |
S. Wei; Z. Wang; W. Gao; G. Xia; |
526 | Convergence Analysis of Graphical Game-Based Nash Q-Learning Using The Interaction Detection Signal of N-Step Return Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we adopt the ${\mathcal{N}}$-step return signal to detect interactions between agents and build the Markov graphical game based on it. |
Y. Zhuang; S. Yang; W. Li; Y. Gao; |
527 | Convergence of Stochastic PDMM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on the primal-dual method of multipliers (PDMM), which is a promising distributed optimisation algorithm that is suitable for distributed optimisation in heterogeneous networks. |
S. O. Jordan; T. W. Sherson; R. Heusdens; |
528 | Conversational Text-to-SQL: An Odyssey Into State-of-the-Art and Challenges Ahead Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With multi-tasking (MT) over coherent tasks with discrete prompts during training, we improve over specialized text-to-SQL T5-family models. |
S. Hari Krishnan Parthasarathi; L. Zeng; D. Hakkani-Tür; |
529 | Conversation-Oriented ASR with Multi-Look-Ahead CBS Architecture Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In streaming ASR, high accuracy is assured by attending to look-ahead frames, which leads to delay increments. To tackle this trade-off issue, we propose a multiple latency streaming ASR to achieve high accuracy with zero look-ahead. |
H. Zhao; S. Fujie; T. Ogawa; J. Sakuma; Y. Kida; T. Kobayashi; |
530 | Convex Optimization of Deep Polynomial and ReLU Activation Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We consider training multi-layer neural networks with polynomial and ReLU activation functions. |
B. Bartan; M. Pilanci; |
531 | Convolutional Filtering on Sampled Manifolds Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Effective linear information processing on the manifold requires quantifying the error incurred when approximating manifold convolutions with graph convolutions. In this paper, we derive a non-asymptotic error bound for this approximation, showing that convolutional filtering on the sampled manifold converges to continuous manifold filtering. |
Z. Wang; L. Ruiz; A. Ribeiro; |
532 | Convolutional Recurrent MetriCGAN With Spectral Dimension Compression For Full-Band Speech Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we expand it to full-band enhancement by combining our recently proposed learnable spectral dimension compression mapping strategy. |
Z. Hou; Q. Hu; T. Sun; Y. Hu; C. Zhu; K. Chen; |
533 | Convolutional Recurrent Neural Networks for The Classification of Cetacean Bioacoustic Patterns Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we focus on the development of a convolutional recurrent neural network (CRNN) to categorize biosignals collected in the Hellenic Trench, generated by two cetacean species, sperm whales (Physeter macrocephalus) and striped dolphins (Stenella coeruleoalba). |
D. N. Makropoulos; A. Tsiami; A. Prospathopoulos; D. Kassis; A. Frantzis; E. Skarsoulis; G. Piperakis; P. Maragos; |
534 | Convolution-Based Channel-Frequency Attention for Text-Independent Speaker Verification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents an efficient two-dimensional convolution-based attention module, namely C2D-Att. |
J. Li; Y. Tian; T. Lee; |
535 | Convolutive NTF for Ambisonic Source Separation Under Reverberant Conditions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a Non-negative Tensor Factorization (NTF) based sound source separation method with a novel convolutive Spatial Covariance Matrix (SCM) model, that is suitable for use with reverberant Ambisonic signals. |
M. Guzik; K. Kowalczyk; |
536 | Co-Operative CNN for Visual Saliency Prediction on WCE Images Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we present a novel and robust gaze estimation methodology based on physicians’ eye fixations, using convolutional neural networks (CNNs) trained according to a novel co-operative scheme, on medical images acquired during Wireless Capsule Endoscopy (WCE). |
G. Dimas; A. Koulaouzidis; D. K. Iakovidis; |
537 | Cooperative Five Degrees Of Freedom Motion Estimation For A Swarm Of Autonomous Vehicles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel cooperative-based system that facilitates each autonomous vehicle of the swarm to be fully aware of its 5 degrees of freedom (DOF) motion, i.e., 3D translation and 2D rotation, a very important task for autonomous navigation, known also as simultaneous localization and mapping (SLAM). |
N. Piperigkos; A. S. Lalos; K. Berberidis; C. Anagnostopoulos; |
538 | Core: Transferable Long-Range Time Series Forecasting Enhanced By Covariates-Guided Representation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Additionally, existing methods only take a window of the near past as input, which prevents the models from learning persistent historical patterns. To tackle these problems, we propose CoRe, a novel transferable long-term forecasting method enhanced by Covariates-guided Representation. |
X. -Y. Li; P. -N. Zhong; D. Chen; Y. -B. Yang; |
539 | CORSD: Class-Oriented Relational Self Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel training framework named Class-Oriented Relational Self Distillation (CORSD) to address the limitations. |
M. Yu; S. H. Tan; K. Wu; R. Dong; L. Zhang; K. Ma; |
540 | Cosmopolite Sound Monitoring (CoSMo): A Study of Urban Sound Event Detection Systems Generalizing to Multiple Cities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents an in-depth study of the behaviour of state-of-the-art SED systems well suited to our problem, combining three far-field real recordings datasets which can be used jointly during training. |
F. Angulo; S. Essid; G. Peeters; C. Mietlicki; |
541 | Cough Detection Using Millimeter-Wave FMCW Radar Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a signal processing method to detect human cough signals with a millimeter-wave frequency-modulated continuous-wave (FMCW) radar. |
K. Han; S. Hong; |
542 | Could The BubbleView Metaphor Be Used to Infer Visual Attention on 3D Graphical Content? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we question the adequacy of this method to provide a reliable proxy for visual attention in the context of 3D graphical objects. |
A. Bruckert; M. Abid; M. P. Da Silva; P. Le Callet; |
543 | Counterfactual Explanation for Multivariate Times Series Using A Contrastive Variational Autoencoder Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a model to understand abnormal class features on multivariate time series. |
W. Todo; M. Selmani; B. Laurent; J. -M. Loubes; |
544 | Counterfactual Two-Stage Debiasing For Video Corpus Moment Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we present a Counterfactual Two-stage Debiasing Learning (CTDL), which incorporates a counterfactual bias network that intentionally learns the retrieval bias by providing a shortcut to learn the spurious correlation between keyword and scene, and performs two-stage debiasing learning that mitigates the bias via contrasting factual retrievals with counterfactually biased retrievals. |
S. Yoon; J. W. Hong; S. Eom; H. S. Yoon; E. Yoon; D. Kim; J. Kim; C. Kim; C. D. Yoo; |
545 | Coupled CP Tensor Decomposition with Shared and Distinct Components for Multi-Task fMRI Data Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a tensor-based framework for multi-task fMRI data fusion, using a partially constrained canonical polyadic (CP) decomposition model. |
R. A. Borsoi; I. Lehmann; M. A. B. S. Akhonda; V. D. Calhoun; K. Usevich; D. Brie; T. Adali; |
546 | Covariance Regularization for Probabilistic Linear Discriminant Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores two alternative covariance regularization approaches, namely, interpolated PLDA and sparse PLDA, to tackle the problem. |
Z. Peng; M. Shao; X. He; X. Li; T. Lee; K. Ding; G. Wan; |
547 | COVID-19 Detection from Speech in Noisy Conditions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore the integration of audio enhancement into a speech-based COVID-19 detection system in an attempt to make speech captured in noisy environments from everyday life useful for the detection of the virus. |
S. Liu; A. Mallol-Ragolta; B. W. Schuller; |
548 | Cov Loss: Covariance-Based Loss for Deep Face Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an optimized approach for large-scale face recognition. |
I. Alkanhal; A. Almansour; L. Alsalloom; R. Aljadaany; M. Savvides; |
549 | CPA: Compressed Private Aggregation for Scalable Federated Learning Over Massive Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we present compressed private aggregation (CPA), which allows massive deployments to simultaneously communicate at extremely low bit-rates while achieving privacy, anonymity, and resilience to malicious users. |
N. Lang; E. Sofer; N. Shlezinger; R. G. L. D’Oliveira; S. El Rouayheb; |
550 | CPD-GAN: Cascaded Pyramid Deformation GAN for Pose Transfer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing work often failed to transfer complex textures to generated images well. To solve this problem, we propose a novel network for this task. |
Y. Huang; Y. Tang; X. Zheng; J. Tang; |
551 | Cramér-Rao Bound on Lie Groups with Observations on Lie Groups: Application to SE(2) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this communication, we derive a new intrinsic Cramér-Rao bound for both parameters and observations lying on Lie groups. |
S. Labsir; A. Renaux; J. Vilà-Valls; É. Chaumette; |
552 | CRFAST: CLIP-Based Reference-Guided Facial Image Semantic Transfer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a new task for CLIP-based reference-guided facial image semantic transfer: the source facial image is translated to the output image with the high-level semantic attributes from the reference image while maintaining identity preservation. |
A. Li; L. Zhao; Z. Zuo; Z. Wang; W. Xing; D. Lu; |
553 | Cross-Device Federated Learning for Mobile Health Diagnostics: A First Study on COVID-19 Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose FedLoss, a novel cross-device FL framework for health diagnostics. |
T. Xia; J. Han; A. Ghosh; C. Mascolo; |
554 | Cross-Domain Diffusion Based Speech Enhancement for Very Noisy Speech Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose to incorporate diffusion-based learning into an enhancement model and improve robustness in extremely noisy conditions. |
H. Wang; D. Wang; |
555 | Cross-Domain Learning with Normalizing Flow Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: One big challenge in cross-domain learning is to effectively synergize the knowledge learning between domains. In this paper, we propose a new solution to address this challenge using normalizing flow, named DomainFlow, which works as a learned mapping to establish knowledge sharing between source and target domains. |
C. Wang; J. Gao; Y. Hua; H. Wang; |
556 | Cross-Domain Object Classification Via Successive Subspace Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing SSL-based methods rely heavily on the data-centric subspace representations, leading to potential performance degradation problem in case of the domain shift between the training (a.k.a., source domain) and testing (a.k.a., target domain) data. To address this limitation, we propose an effective successive subspace learning method based on existing SSL-based methods. |
K. Chen; H. Li; H. Yan; |
557 | Cross-Head Supervision for Crowd Counting with Noisy Annotations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: These noisy annotations severely affect the model training, especially for density map-based methods. To alleviate the negative impact of noisy annotations, we propose a novel crowd counting model with one convolution head and one transformer head, in which these two heads can supervise each other in noisy areas, called Cross-Head Supervision. |
M. Dai; Z. Huang; J. Gao; H. Shan; J. Zhang; |
558 | Cross-Lingual Alzheimer’s Disease Detection Based on Paralinguistic and Pre-Trained Features Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present our submission to the ICASSP-SPGC-2023 ADReSS-M Challenge Task, which aims to investigate which acoustic features can be generalized and transferred across languages for Alzheimer’s Disease (AD) prediction. |
X. Chen; Y. Pu; J. Li; W. -Q. Zhang; |
559 | Cross-Lingual Transfer Learning for Alzheimer’s Detection from Spontaneous Speech Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To fill this gap, the ADReSS-M challenge was organized. This paper presents our submission to this ICASSP-2023 Signal Processing Grand Challenge (SPGC). |
B. Tamm; R. Vandenberghe; H. Van Hamme; |
560 | Cross-Modal Adversarial Contrastive Learning for Multi-Modal Rumor Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a Cross-Modal Adversarial Contrastive (CMAC) fusion strategy, in which adversarial learning is used to align the latent feature distribution of text and image, and contrastive learning is used to align the feature distribution among multi-modal samples of the same category. |
T. Zou; Z. Qian; P. Li; Q. Zhu; |
561 | Cross-Modal Audio-Visual Co-Learning for Text-Independent Speaker Verification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Visual speech (i.e., lip motion) is highly related to auditory speech due to the co-occurrence and synchronization in speech production. This paper investigates this correlation and proposes a cross-modal speech co-learning paradigm. |
M. Liu; K. A. Lee; L. Wang; H. Zhang; C. Zeng; J. Dang; |
562 | Cross-Modal Fusion Techniques for Utterance-Level Emotion Recognition from Text and Speech Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Cross-Modal RoBERTa (CM-RoBERTa) model for emotion detection from spoken audio and corresponding transcripts. |
J. Luo; H. Phan; J. Reiss; |
563 | Cross-Modality Depth Estimation Via Unsupervised Stereo RGB-to-infrared Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our core idea is to first develop an unsupervised RGB-to-IR translation (RIT) network with proposed Fourier domain adaptation and multi-space warping regularization to synthesize stereo IR images from their corresponding stereo RGB images. |
S. Tang; X. Ye; F. Xue; R. Xu; |
564 | Cross Modality Knowledge Distillation for Robust Pedestrian Detection in Low Light and Adverse Weather Conditions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new framework that utilizes Cross Modality Knowledge Distillation (CMKD) to improve the performance of RGB-only pedestrian detection in low light and adverse weather conditions. |
M. Hnewa; A. Rahimpour; J. Miller; D. Upadhyay; H. Radha; |
565 | Cross-Modal Matching and Adaptive Graph Attention Network for RGB-D Scene Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, cross-modal features have not been considered in most existing methods. To address these concerns, we propose to integrate the tasks of cross-modal matching and modal-specific recognition, termed as Matching-to-Recognition Network (MRNet). |
Y. Guo; X. Liang; J. T. Kwok; X. Zheng; B. Wu; Y. Ma; |
566 | Cross-Modal Mutual Learning for Cued Speech Recognition Related Papers Related Patents |