Paper Digest: ICASSP 2022 Highlights
To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights / summaries to quickly get the main idea of each paper. Based in New York, Paper Digest is dedicated to producing high-quality text analysis results that people can actually use on a daily basis. Since 2018, we have been serving users across the world with a number of exclusive services on ranking, search, tracking and automatic literature review.
If you do not want to miss interesting academic papers, you are welcome to sign up for our free daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to get updates on new conference digests.
Paper Digest Team
New York City, New York, 10017
team@paperdigest.org
TABLE 1: Paper Digest: ICASSP 2022 Highlights
Paper | Author(s) | |
---|---|---|
1 | CoughTrigger: Earbuds IMU Based Cough Detection Activator Using An Energy-Efficient Sensitivity-Prioritized Time Series Classifier Highlight: We present CoughTrigger, which utilizes a lower-power sensor, inertial measurement unit (IMU), in earbuds as a cough detection activator to trigger a higher-power sensor for audio processing and classification. |
S. Zhang; et al. |
2 | Non-Invasive Blood Pressure Monitoring with Multi-Modal In-Ear Sensing Highlight: We propose a measurement technique based on the vascular transit time which utilises the time difference between the S1 heart sound and the PPG upstroke in one pulse cycle. |
H. Truong; A. Montanari; F. Kawsar; |
3 | Intelligent Wi-Fi Based Child Presence Detection System Highlight: This paper presents the first-of-its-kind intelligent child presence detection (CPD) system using commodity Wi-Fi. |
X. Zeng; B. Wang; C. Wu; S. Deepika Regani; K. J. Ray Liu; |
4 | Real-Time Fall Detection Using mmWave Radar Highlight: In this paper, we propose mmFall, a real time fall detection system using millimeter wave signal which can achieve impressive accuracy with low computation complexity. |
W. Li; et al. |
5 | Hierarchical Deep Learning Model with Inertial and Physiological Sensors Fusion for Wearable-Based Human Activity Recognition Highlight: This paper presents a human activity recognition (HAR) system with wearable devices. |
D. Y. Hwang; et al. |
6 | Speech Recovery For Real-World Self-Powered Intermittent Devices Highlight: This paper presents a novel intermittent speech recovery (ISR) system for real-world self-powered intermittent devices. |
Y. -C. Lin; et al. |
7 | Phase Control of Parametric Array Loudspeaker By Optimizing Sideband Weights Highlight: In this paper, we propose a method for controlling the directivity of parametric array loudspeakers (PAL) by optimizing the weights for the sideband signals. |
A. Okano; Y. Kajikawa; |
8 | Low-Latency Human-Computer Auditory Interface Based on Real-Time Vision Analysis Highlight: This paper proposes a visuo-auditory substitution method to assist visually impaired people in scene understanding. |
F. Scalvini; C. Bordeau; M. Ambard; C. Migniot; J. Dubois; |
9 | Robust Adaptive Noise Canceller Algorithm with SNR-Based Stepsize Control and Noise-Path Gain Compensation Highlight: This paper proposes a robust adaptive noise canceller algorithm with SNR-based stepsize control and noise-path gain compensation. |
A. Sugiyama; |
10 | NearTracker: Acoustic 2-D Target Tracking with Nearby Reflector in SISO System Highlight: In this paper, we propose NearTracker, a contactless acoustic tracking system that achieves 2-D target tracking with only one speaker and one microphone (i.e., Single Input Single Output, SISO). |
C. Liu; L. Gao; R. Jiang; |
11 | An Efficient Method For Generic DSP Implementation Of Dilated Convolution Highlight: In this paper we propose a scheme that allows efficient/generic implementation of 2D dilated convolution and stride on typical DSPs where the instruction sets are well tuned for standard 1D and 2D filtering and convolution operations. |
H. E. V; S. Ghanekar; |
12 | Compression-Aware Projection with Greedy Dimension Reduction for Convolutional Neural Network Activations Highlight: Existing works in activation compression propose to transform feature maps for higher compressibility, thus enabling dimension reduction. |
Y. -S. Tai; C. -F. Teng; C. -Y. Chang; A. -Y. A. Wu; |
13 | Optimizing The Consumption Of Spiking Neural Networks With Activity Regularization Highlight: In this work, we look into different techniques to enforce sparsity on the neural network activation maps and compare the effect of different training regularizers on the efficiency of the optimized DNNs and SNNs. |
S. Narduzzi; S. A. Bigdeli; S. -C. Liu; L. A. Dunbar; |
14 | IMPQ: Reduced Complexity Neural Networks Via Granular Precision Assignment Highlight: However, the problem of granular precision assignment is challenging due to an exponentially large search space and efficient methods for such precision assignment are lacking. To address this problem, we introduce the iterative mixed-precision quantization (IMPQ) framework to allocate precision at variable granularity. |
S. K. Gonugondla; N. R. Shanbhag; |
15 | Rate Coding Or Direct Coding: Which One Is Better For Accurate, Robust, And Energy-Efficient Spiking Neural Networks? Highlight: In this paper, we conduct a comprehensive analysis of the two codings from three perspectives: accuracy, adversarial robustness, and energy-efficiency. |
Y. Kim; H. Park; A. Moitra; A. Bhattacharjee; Y. Venkatesha; P. Panda; |
16 | PYXIS: An Open-Source Performance Dataset Of Sparse Accelerators Highlight: In this work, we present PYXIS, a performance dataset for customized accelerators on sparse data. |
L. Song; Y. Chi; J. Cong; |
17 | Fast Fault Diagnosis Method Of Rolling Bearings In Multi-Sensor Measurement Enviroment Highlight: In this paper, a fast bearing state detection method based on multi-sensor signal fusion and compression feature extraction is proposed. |
Z. Pan; Z. Lin; Y. Zheng; Z. Meng; |
18 | Detecting Anomaly in Chemical Sensors Via Regularized Contrastive Learning Highlight: In this work, we present a method for detecting anomalous chemical sensors using contrastive learning-based framework. |
D. Badawi; I. Bassi; S. Ozev; A. E. Cetin; |
19 | Evolutionary Neural Architecture Design of Liquid State Machine for Image Classification Highlight: Manually defining a neural architecture will be ineffective and laborious in most cases. Therefore, based on a state-of-the-art differential evolution algorithm, an evolutionary neural architecture design methodology is proposed to automatically build suitable model topologies for LSM in this study, without any prior knowledge. |
C. Tang; J. Ji; Q. Lin; Y. Zhou; |
20 | Invisible and Efficient Backdoor Attacks for Compressed Deep Neural Networks Highlight: In this paper, we study the feasibility of practical backdoor attacks for the compressed DNNs. |
H. Phan; Y. Xie; J. Liu; Y. Chen; B. Yuan; |
21 | Tensor-Based Orthogonal Matching Pursuit with Phase Rotation for Channel Estimation In Hybrid Beamforming MIMO-OFDM Systems Highlight: We propose to incorporate phase rotation in factor matrices of tensor-based orthogonal matching pursuit (T-OMP) algorithm to solve the energy leakage problem caused by the grid constraint. |
C. -H. Lo; P. -Y. Tsai; |
22 | Spain-Net: Spatially-Informed Stereophonic Music Source Separation Highlight: While existing deep learning models implicitly absorb the spatial information conveyed by the multi-channel input signals, we argue that a more explicit and active use of spatial information could not only improve the separation process but also provide an entry-point for many user-interaction based tools. To this end, we introduce a control method based on the stereophonic location of the sources of interest, expressed as the panning angle. |
D. Petermann; M. Kim; |
23 | Improved Singing Voice Separation with Chromagram-Based Pitch-Aware Remixing Highlight: We propose a novel data augmentation technique, chromagram-based pitch-aware remixing, where music segments with high pitch alignment are mixed. |
S. Yuan; et al. |
24 | Don't Separate, Learn To Remix: End-To-End Neural Remixing With Joint Optimization Highlight: In this work, we re-purpose Conv-TasNet, a well-known source separation model, into two neural remixing architectures that learn to remix directly rather than just to separate sources. |
H. Yang; S. Firodiya; N. J. Bryan; M. Kim; |
25 | Few-Shot Musical Source Separation Highlight: Deep learning-based approaches to musical source separation are often limited to the instrument classes that the models are trained on and do not generalize to separate unseen instruments. To address this, we propose a few-shot musical source separation paradigm. |
Y. Wang; D. Stoller; R. M. Bittner; J. Pablo Bello; |
26 | Source Separation By Steering Pretrained Music Models Highlight: We use OpenAI's Jukebox as the pretrained generative model, and we couple it with four kinds of pretrained music taggers (two architectures and two tagging datasets). |
E. Manilow; P. O'Reilly; P. Seetharaman; B. Pardo; |
27 | Infant Crying Detection In Real-World Environments Highlight: In this paper, we evaluate several established machine learning approaches including a model leveraging both deep spectrum and acoustic features. |
X. Yao; M. Micheletti; M. Johnson; E. Thomaz; K. de Barbaro; |
28 | Wikitag: Wikipedia-Based Knowledge Embeddings Towards Improved Acoustic Event Classification Highlight: We describe how to extract label embeddings from multiple Wikipedia texts, and formulate the multi-view aligned AEC problem based on VGGish model. |
Q. Zhang; Q. Tang; C. -C. Kao; M. Sun; Y. Liu; C. Wang; |
29 | Urban Sound & Sight: Dataset And Benchmark For Audio-Visual Urban Scene Understanding Highlight: Yet, the lack of well-curated resources for training and evaluating models to research in this area hinders their development. To address this we present a curated audio-visual dataset, Urban Sound & Sight (Urbansas), developed for investigating the detection and localization of sounding vehicles in the wild. |
M. Fuentes; et al. |
30 | Real-World On-Board UAV Audio Data Set For Propeller Anomalies Highlight: In this work, we present a novel real-world audio data set of propeller anomalies, and use several deep learning models to classify the damage. |
S. S. Katta; K. Vuojärvi; S. Nandyala; U. -M. Kovalainen; L. Baddeley; |
31 | VocalSound: A Dataset for Improving Human Vocal Sounds Recognition Highlight: To support research on building robust and accurate vocal sound recognition, we have created a VocalSound dataset consisting of over 21,000 crowdsourced recordings of laughter, sighs, coughs, throat clearing, sneezes, and sniffs from 3,365 unique subjects. |
Y. Gong; J. Yu; J. Glass; |
32 | Wearable SELD Dataset: Dataset For Sound Event Localization And Detection Using Wearable Devices Around Head Highlight: Some applications (including those for pedestrians that perform SELD while walking) require a wearable microphone array whose geometry can be designed to suit the task. In this paper, for development of such a wearable SELD, we propose a dataset named Wearable SELD dataset. |
K. Nagatomo; M. Yasuda; K. Yatabe; S. Saito; Y. Oikawa; |
33 | Tunet: A Block-Online Bandwidth Extension Model Based On Transformers And Self-Supervised Pretraining Highlight: We introduce a block-online variant of the temporal feature-wise linear modulation (TFiLM) model to achieve bandwidth extension. |
V. -A. Nguyen; A. H. T. Nguyen; A. W. H. Khong; |
34 | DRC-NET: Densely Connected Recurrent Convolutional Neural Network for Speech Dereverberation Highlight: In this paper, a massive deployment of RNNs in the time dimension is realized by using a channel-wise long short-term memory neural network. |
J. Liu; X. Zhang; |
35 | Customizable End-To-End Optimization Of Online Neural Network-Supported Dereverberation For Hearing Devices Highlight: In this paper, we propose an end-to-end approach specialized for online processing, that directly optimizes the dereverberated output signal. |
J. -M. Lemercier; J. Thiemann; R. Koning; T. Gerkmann; |
36 | Importance of Switch Optimization Criterion in Switching WPE Dereverberation Highlight: We thus propose a new SwWPE processing flow that enables us to optimize switching parameters based on an arbitrary optimization criterion. |
N. Kamo; R. Ikeshita; K. Kinoshita; T. Nakatani; |
37 | Audio-To-Symbolic Arrangement Via Cross-Modal Music Representation Learning Highlight: Could we automatically derive the score of a piano accompaniment based on the audio of a pop song? This is the audio-to-symbolic arrangement problem we tackle in this paper. |
Z. Wang; D. Xu; G. Xia; Y. Shan; |
38 | Music Phrase Inpainting Using Long-Term Representation and Contrastive Loss Highlight: In this study, we tackle the problem of long-term, phrase-level symbolic melody inpainting by equipping a sequence prediction model with phrase-level representation (as an extra condition) and contrastive loss (as an extra optimization term). |
S. Wei; G. Xia; Y. Zhang; L. Lin; W. Gao; |
39 | MELONS: Generating Melody With Long-Term Structure Using Transformers And Structure Graph Highlight: In this paper, we propose MELONS, a melody generation framework based on a graph representation of music structure which consists of eight types of bar-level relations. |
Y. Zou; P. Zou; Y. Zhao; K. Zhang; R. Zhang; X. Wang; |
40 | Difficulty-Aware Neural Band-to-Piano Score Arrangement Based on Note- and Statistic-Level Criteria Highlight: This paper describes a neural music arrangement method that converts a given band score into a piano score with an elementary or advanced level. |
M. Terao; Y. Hiramatsu; R. Ishizuka; Y. Wu; K. Yoshii; |
41 | Score Difficulty Analysis for Piano Performance Education Based on Fingering Highlight: In this paper, we introduce score difficulty classification as a sub-task of music information retrieval (MIR), which may be used in music education technologies, for personalised curriculum generation, and score retrieval. |
P. Ramoneda; N. C. Tamer; V. Eremenko; X. Serra; M. Miron; |
42 | A Neural Network-based Howling Detection Method for Real-Time Communication Applications Highlight: This paper proposes a convolutional recurrent neural network (CRNN) based method for howling detection in RTC applications, achieving excellent accuracy with low false-alarm rates. |
Z. Chen; Y. Hao; Y. Chen; G. Chen; L. Ruan; |
43 | Alarm Sound Detection Using Topological Signal Processing Highlight: We present a novel approach to alarm sound detection using topological data analysis. |
T. Fireaizen; S. Ron; O. Bobrowski; |
44 | A Method For Estimating The Grouping Of Participants In Classroom Group Work Using Only Audio Information Highlight: This paper proposes a novel method for estimating which microphones belong to the same group in a situation where there are multiple discussion groups in one room, using only audio information. |
O. Ichikawa; Y. Shima; T. Nakayama; H. Shirouzu; |
45 | Environmental Sound Extraction Using Onomatopoeic Words Highlight: We propose an environmental-sound-extraction method using onomatopoeic words to specify the target sound to be extracted. |
Y. Okamoto; S. Horiguchi; M. Yamamoto; K. Imoto; Y. Kawaguchi; |
46 | Echo-Aware Adaptation of Sound Event Localization and Detection in Unknown Environments Highlight: In this study, we propose echo-aware feature refinement (EAR) for SELD, which suppresses environmental effects at the feature level by using additional spatial cues of the unknown environment obtained through measuring acoustic echoes. |
M. Yasuda; Y. Ohishi; S. Saito; |
47 | On Adversarial Robustness Of Large-Scale Audio Visual Learning Highlight: This work aims to study several key questions related to multi-modal learning through the lens of robustness: 1) Are multi-modal models necessarily more robust than uni-modal models? |
J. B. Li; S. Qu; X. Li; P. -Y. B. Huang; F. Metze; |
48 | Adversarial Sample Detection for Speaker Verification By Neural Vocoders Highlight: In this paper, we adopt neural vocoders to spot adversarial samples for ASV. |
H. Wu; et al. |
49 | Amicable Examples for Informed Source Separation Highlight: In contrast, in this work, we improve the performance of a pre-trained separation model that does not use any side-information. |
N. Takahashi; Y. Mitsufuji; |
50 | Multi-Modal Pre-Training for Automated Speech Recognition Highlight: In this work, we introduce a novel approach that leverages a self-supervised learning technique based on masked language modeling to compute a global, multi-modal encoding of the environment in which the utterance occurs. |
D. M. Chan; S. Ghosh; D. Chakrabarty; B. Hoffmeister; |
51 | Speaker-Targeted Audio-Visual Speech Recognition Using A Hybrid CTC/Attention Model with Interference Loss Highlight: Although AV Align shows an improvement in recognition accuracy in background noise environments, we have observed that the recognition accuracy degrades significantly in interference speaker environments, where a target speech and an interfering speech overlap each other. In order to improve the speech recognition accuracy of the target speaker in such situations, we propose a method that combines the auxiliary loss function that maximizes the recognition accuracy of the interference speaker and the CTC loss function for training the AV-ASR model. |
R. Tsunoda; R. Aihara; R. Takashima; T. Takiguchi; Y. Imai; |
52 | Time-Domain Audio-Visual Speech Separation on Low Quality Videos Highlight: In this paper, we propose a new structure to fuse the audio and visual features, which uses the audio feature to select relevant visual features by utilizing the attention mechanism. |
Y. Wu; C. Li; J. Bai; Z. Wu; Y. Qian; |
53 | Complex-Valued Spatial Autoencoders for Multichannel Speech Enhancement Highlight: In this contribution, we present a novel online approach to multichannel speech enhancement. |
M. M. Halimeh; W. Kellermann; |
54 | Multichannel Noise Reduction Using Dilated Multichannel U-Net and Pre-Trained Single-Channel Network Highlight: We propose a transfer learning approach that leverages existing pre-trained single-channel neural networks for the optimization of multichannel neural networks. |
Z. -W. Tan; A. H. T. Nguyen; Y. Liu; A. W. H. Khong; |
55 | One Model to Enhance Them All: Array Geometry Agnostic Multi-Channel Personalized Speech Enhancement Highlight: Single-channel personalized speech enhancement (PSE) methods show promising results compared with the unconditional speech enhancement (SE) methods in these scenarios due to their ability to remove interfering speech in addition to the environmental noise. In this work, we leverage spatial information afforded by microphone arrays to improve such systems' performance further. |
H. Taherian; S. E. Eskimez; T. Yoshioka; H. Wang; Z. Chen; X. Huang; |
56 | Multi-Channel Speech Denoising for Machine Ears Highlight: We propose an MCSDN-Beamforming-MCSDN framework in the inference stage. |
C. Han; E. M. Kaya; K. Hoefer; M. Slaney; S. Carlile; |
57 | Localization Based Sequential Grouping for Continuous Speech Separation Highlight: Given a block of frames with at most two speakers, we apply a two-speaker separation model to separate (and enhance) the speakers, estimate the DOA of each separated speaker, and group the separation results across blocks based on the DOA estimates. |
Z. -Q. Wang; D. Wang; |
58 | Convolutional Weighted Minimum Mean Square Error Filter for Joint Source Separation and Dereverberation Highlight: In this paper, we derive a convolutional multichannel filter which performs jointly optimum dereverberation and desired source signal extraction. |
M. Fras; M. Witkowski; K. Kowalczyk; |
59 | Improving Source Separation By Explicitly Modeling Dependencies Between Sources Highlight: We propose a new method for training a supervised source separation system that aims to learn the interdependent relationships between all combinations of sources in a mixture. |
E. Manilow; C. Hawthorne; C. -Z. A. Huang; B. Pardo; J. Engel; |
60 | Music Source Separation With Deep Equilibrium Models Highlight: Hence, in this paper we propose an architecture and training scheme for MSS with DEQ. |
Y. Koyama; N. Murata; S. Uhlich; G. Fabbro; S. Takahashi; Y. Mitsufuji; |
61 | Harmonic and Percussive Sound Separation Based on Mixed Partial Derivative of Phase Spectrogram Highlight: In this paper, we propose a novel HPSS method named MipDroP that relies only on phase and does not use information of magnitude spectrograms. |
N. Akaishi; K. Yatabe; Y. Oikawa; |
62 | On Loss Functions and Evaluation Metrics for Music Source Separation Highlight: We investigate which loss functions provide better separations via benchmarking an extensive set of those for music source separation. |
E. Gusó; J. Pons; S. Pascual; J. Serrà; |
63 | Time-Balanced Focal Loss for Audio Event Detection Highlight: This variability results in an inherent disproportional representation of effective training samples. To address this compounded imbalance issue, this work proposes a balanced focal learning function that introduces a novel time-sensitive classwise weight. |
S. Park; M. Elhilali; |
64 | Multi-ACCDOA: Localizing And Detecting Overlapping Sounds From The Same Class With Auxiliary Duplicating Permutation Invariant Training Highlight: However, there is still a challenge in detecting the same event class from multiple locations. To overcome this problem while maintaining the advantages of the class-wise format, we extended ACCDOA to a multi one and proposed auxiliary duplicating permutation invariant training (ADPIT). |
K. Shimada; Y. Koyama; S. Takahashi; N. Takahashi; E. Tsunoo; Y. Mitsufuji; |
65 | Improved Representation Learning For Acoustic Event Classification Using Tree-Structured Ontology Highlight: In this work, we propose a structure-aware semi-supervised learning framework for acoustic event classification (AEC). |
A. Zharmagambetov; et al. |
66 | Temporal Contrastive-Loss for Audio Event Detection Highlight: In this study, we propose coherence-based learning, formulated as a contrastive loss, to train event detection models whereby embeddings driven by acoustic events are coherently constrained to maximize discriminability across events. |
S. Kothinti; M. Elhilali; |
67 | A Frame Loss of Multiple Instance Learning for Weakly Supervised Sound Event Detection Highlight: However, the general MIL method only optimizes the global loss calculated from the aggregated clip-wise predictions and weak clip labels, lacking a direct constraint on the frame-wise predictions, which leads to a large number of unreasonable prediction values. To address this issue, we explore the deterministic information that can be used to constrain the frame-wise predictions, based on which we design a frame loss with two terms. |
X. Wang; X. Zhang; Y. Zi; S. Xiong; |
68 | Pseudo Strong Labels for Large Scale Weakly Supervised Audio Tagging Highlight: This work proposes pseudo strong labels (PSL), a simple label augmentation framework that enhances the supervision quality for large-scale weakly supervised audio tagging. |
H. Dinkel; Z. Yan; Y. Wang; J. Zhang; Y. Wang; |
69 | Individualized Hear-Through For Acoustic Transparency Using PCA-Based Sound Pressure Estimation At The Eardrum Highlight: A particular challenge is the transfer function between the hearing device receiver and the eardrum, which is difficult to obtain in practice as it requires additional probe-tube measurements. In this work, we address this issue by proposing an individualized hear-through equalization filter design that leverages the measurement of the so-called secondary path to predict the sound pressure at the eardrum using a principal component analysis based estimator. |
W. Jin; T. Schoof; H. Schepker; |
70 | On Spectral and Temporal Sparsification of Speech Signals for The Improvement of Speech Perception in CI Listeners Highlight: In this study, two methods inspired by music simplification approaches were developed and evaluated through instrumental measures and in listening tests with adult CI listeners. |
B. Lentz; R. Martin; K. Oberländer; C. Völter; |
71 | A Differentiable Optimisation Framework for The Design of Individualised DNN-based Hearing-Aid Strategies Highlight: Current hearing aids mostly provide sound amplification fittings based on individual hearing thresholds or perceived loudness, even though it is known that sensorineural hearing damage is functionally complex, and requires different treatment strategies. To meet this demand, we propose an optimisation framework for the design of individualised hearing-aid signal processing based on simulated (hearing-impaired) auditory-nerve responses. |
F. Drakopoulos; S. Verhulst; |
72 | Personalized Speech Enhancement: New Models and Comprehensive Evaluation Highlight: In this work, we propose two neural networks for PSE that achieve superior performance to the previously proposed VoiceFilter. |
S. E. Eskimez; T. Yoshioka; H. Wang; X. Wang; Z. Chen; X. Huang; |
73 | Dynamic Sliding Window for Realtime Denoising Networks Highlight: In response, we propose a new sliding window strategy and a lightweight neural network to leverage it. |
J. Xiang; Y. Zhu; R. Wu; R. Xu; Y. Ishiwaka; C. Zheng; |
74 | BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a blockwise optimization method for masking-based networks (BLOOM-Net) for training scalable speech enhancement networks. |
S. Kim; M. Kim; |
75 | HGCN: Harmonic Gated Compensation Network for Speech Enhancement Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, it is hard for most models to handle the situation when harmonics are partially masked by noise. To tackle this challenge, we propose a harmonic gated compensation network (HGCN). |
T. Wang; W. Zhu; Y. Gao; J. Feng; S. Zhang; |
76 | Speech Enhancement with Neural Homomorphic Synthesis Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This work proposes a new speech enhancement method based on neural homomorphic synthesis. |
W. Jiang; Z. Liu; K. Yu; F. Wen; |
77 | A Bayesian Permutation Training Deep Representation Learning Method for Speech Enhancement with Variational Autoencoder Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Based on Bayesian theory, this paper derives a novel variational lower bound for VAE, which ensures that VAE can be trained in supervision, and can disentangle speech and noise latent variables from the observed signal. |
Y. Xiang; J. L. Højvang; M. H. Rasmussen; M. G. Christensen; |
78 | Integrating Statistical Uncertainty Into Neural Network-Based Speech Enhancement Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study the benefits of modeling uncertainty in neural network-based speech enhancement. |
H. Fang; T. Peer; S. Wermter; T. Gerkmann; |
79 | Unsupervised Speech Enhancement with Speech Recognition Embedding and Disentanglement Losses Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Thus, we propose an unsupervised loss function to tackle those two problems. |
V. A. Trinh; S. Braun; |
80 | MusicYOLO: A Sight-Singing Onset/Offset Detection Framework Based on Object Detection Instead of Spectrum Frames Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose MusicYOLO based on object detection to detect the onset and offset in singing for the first time. |
X. Wang; W. Xu; W. Yang; W. Cheng; |
81 | Modeling Beats and Downbeats with A Time-Frequency Transformer Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a novel Transformer-based approach to tackle beat and downbeat tracking. |
Y. -N. Hung; J. -C. Wang; X. Song; W. -T. Lu; M. Won; |
82 | Hierarchical Classification of Singing Activity, Gender, and Type in Complex Music Recordings Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Traditionally, work on singing voice detection has focused on identifying singing activity in music recordings. In this work, our aim is to extend this task towards simultaneously detecting the presence of singing voice as well as determining singer gender and voice type. |
M. Krause; M. Müller; |
83 | DeepChorus: A Hybrid Model of Multi-Scale Convolution And Self-Attention for Chorus Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: To solve the problem, in this paper we propose an end-to-end chorus detection model DeepChorus, reducing the engineering effort and the need for prior knowledge. |
Q. He; X. Sun; Y. Yu; W. Li; |
84 | To Catch A Chorus, Verse, Intro, or Anything Else: Analyzing A Song with Structural Functions Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, explicitly identifying the function of each segment (e.g., "verse" or "chorus") is rarely attempted, but has many applications. We introduce a multi-task deep learning framework to model these structural semantic labels directly from audio by estimating verseness, chorusness, and so forth, as a function of time. |
J. -C. Wang; Y. -N. Hung; J. B. L. Smith; |
85 | A Novel 1D State Space for Efficient Music Rhythmic Analysis Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: This paper proposes a new state space and a semi-Markov model for music time structure analysis. |
M. Heydari; M. McCallum; A. Ehmann; Z. Duan; |
86 | Upmixing Via Style Transfer: A Variational Autoencoder for Disentangling Spatial Images And Musical Content Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a modified variational autoencoder model that learns a latent space to describe the spatial images in multichannel music. |
H. Yang; S. Wager; S. Russell; M. Luo; M. Kim; W. Kim; |
87 | Spatial Mixup: Directional Loudness Modification As Data Augmentation for Sound Event Localization and Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose Spatial Mixup, as an application of parametric spatial audio effects for data augmentation, which modifies the directional properties of a multi-channel spatial audio signal encoded in the ambisonics domain. |
R. Falcón-Pérez; K. Shimada; Y. Koyama; S. Takahashi; Y. Mitsufuji; |
88 | Towards Faster Continuous Multi-Channel HRTF Measurements Based On Learning System Models Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: To cope with faster rotations, we present a novel continuous HRTF measurement method. |
T. Kabzinski; P. Jax; |
89 | Towards Fast And Convenient End-To-End HRTF Personalization Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose and evaluate a system utilizing this model to generate an individualized HRTF using a minimal set of easily obtainable measurements: single photographs of both ears, as well as head and ear scale for matching interaural time difference (ITD). |
B. Zhi; D. N. Zotkin; R. Duraiswami; |
90 | Wishart Localization Prior On Spatial Covariance Matrix In Ambisonic Source Separation Using Non-Negative Tensor Factorization Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents an extension of the existing Non-negative Tensor Factorization (NTF) based method for sound source separation under reverberant conditions, formulated for Ambisonic microphone mixture signals. |
M. Guzik; K. Kowalczyk; |
91 | Improving Lyrics Alignment Through Joint Pitch Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: In this paper, we propose a multi-task learning approach for lyrics alignment that incorporates pitch and thus can make use of a new source of highly accurate temporal information. |
J. Huang; E. Benetos; S. Ewert; |
92 | Learning Music Audio Representations Via Weak Language Supervision Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: In this work, we pose the question of whether it may be possible to exploit weakly aligned text as the only supervisory signal to learn general-purpose music audio representations. |
I. Manco; E. Benetos; E. Quinton; G. Fazekas; |
93 | On The Prediction of The Frequency Response of A Wooden Plate from Its Mechanical Parameters Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Inspired by deep learning applications in structural mechanics, we focus on how to train two predictors to model the relation between the vibrational response of a prescribed point of a wooden plate and its material properties. |
D. G. Badiane; R. Malvermi; S. Gonzalez; F. Antonacci; A. Sarti; |
94 | Automatic DJ Transitions with Differentiable Audio Effects and Generative Adversarial Networks Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we explore a data-driven approach that uses a generative adversarial network to create the song transition by learning from real-world DJ mixes. |
B. -Y. Chen; W. -H. Hsu; W. -H. Liao; M. A. M. Ramírez; Y. Mitsufuji; Y. -H. Yang; |
95 | Self-Supervised Representation Learning for Unsupervised Anomalous Sound Detection Under Domain Shift Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, a self-supervised representation learning method is proposed for anomalous sound detection (ASD). |
H. Chen; Y. Song; L. -R. Dai; I. McLoughlin; L. Liu; |
96 | Federated Self-Training for Data-Efficient Audio Recognition Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose FedSTAR, a self-training approach to exploit large-scale on-device unlabeled data to improve the generalization of audio recognition models. |
V. Tsouvalas; A. Saeed; T. Ozcelebi; |
97 | Federated Self-Supervised Learning for Acoustic Event Classification Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we investigate the feasibility of applying FL to improve AEC performance while no customer data can be directly uploaded to the server. |
M. Feng; et al. |
98 | Temporal Knowledge Distillation for On-device Audio Classification Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a new knowledge distillation method designed to incorporate the temporal knowledge embedded in attention weights of large transformer-based models into on-device models. |
K. Choi; M. Kersner; J. Morton; B. Chang; |
99 | Streaming On-Device Detection of Device Directed Speech from Voice and Touch-Based Invocation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, in many cases, the VA can accidentally be invoked by the keyword-like speech or accidental button press, which may have implications on user experience and privacy. To this end, we propose an acoustic false-trigger-mitigation (FTM) approach for on-device device-directed speech detection that simultaneously handles the voice-trigger and touch-based invocation. |
O. O. Rudovic; A. Bindal; V. Garg; P. Simha; P. Dighe; S. Kajarekar; |
100 | Multi-Frame Full-Rank Spatial Covariance Analysis for Underdetermined BSS in Reverberant Environments Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper proposes a new extension of FCA, aiming to improve BSS performance for mixtures in which the length of reverberation exceeds the analysis frame. |
H. Sawada; R. Ikeshita; K. Kinoshita; T. Nakatani; |
101 | Flow-Based Fast Multichannel Nonnegative Matrix Factorization for Blind Source Separation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper describes a blind source separation method for multichannel audio signals, called NF-FastMNMF, based on the integration of the normalizing flow (NF) into the multichannel nonnegative matrix factorization with jointly-diagonalizable spatial covariance matrices, a.k.a. FastMNMF. |
A. A. Nugraha; K. Sekiguchi; M. Fontaine; Y. Bando; K. Yoshii; |
102 | Harvesting Partially-Disjoint Time-Frequency Information for Improving Degenerate Unmixing Estimation Technique Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, to avoid the erroneous retention, instead of masking, we propose to use multiple linear spatial filters (e.g., the minimum variance distortionless response filter) to extract the desired signals. |
Y. He; H. Wang; Q. Chen; R. H. Y. So; |
103 | Investigation And Comparison of Optimization Methods for Variational Autoencoder-Based Underdetermined Multichannel Source Separation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we investigate two algorithms for variational autoencoder (VAE)-based underdetermined multichannel source separation. |
S. Seki; H. Kameoka; L. Li; |
104 | HBP: An Efficient Block Permutation Solver Using Hungarian Algorithm and Spectrogram Inpainting for Multichannel Audio Source Separation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper proposes a method called Hungarian Block Permutation (HBP) to solve the block permutation problem in frequency-domain multichannel audio source separation. |
L. Li; H. Kameoka; S. Seki; |
105 | EAD-Conformer: A Conformer-Based Encoder-Attention-Decoder-Network for Multi-Task Audio Source Separation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a Conformer-based network to improve the performance of multi-task audio source separation. |
C. Li; Y. Wang; F. Deng; Z. Zhang; X. Wang; Z. Wang; |
106 | The Cocktail Fork Problem: Three-Stem Audio Separation for Real-World Soundtracks Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: However, separating an audio mixture (e.g., movie soundtrack) into the three broad categories of speech, music, and sound effects (understood to include ambient noise and natural sound events) has been left largely unexplored, despite a wide range of potential applications. This paper formalizes this task as the cocktail fork problem, and presents the Divide and Remaster (DnR) dataset to foster research on this topic. |
D. Petermann; G. Wichern; Z. -Q. Wang; J. Le Roux; |
107 | Phase Shifted Bedrosian Filterbank: An Interpretable Audio Front-End for Time-Domain Audio Source Separation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This type of filters also allows a potential reduction of the computational cost since larger encoder filters can be used. In this work, we propose to build a new parameterization of such encoder filter-bank which allows gaining interpretability while keeping flexibility. |
F. Mathieu; T. Courtat; G. Richard; G. Peeters; |
108 | Harmonicity Plays A Critical Role in DNN Based Versus in Biologically-Inspired Monaural Speech Segregation Systems Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We find that performance deteriorates significantly if one source is even slightly harmonically jittered, e.g., an imperceptible 3% harmonic jitter degrades performance of Conv-TasNet from 15.4 dB to 0.70 dB. |
R. Parikh; I. Kavalerov; C. Espy-Wilson; S. Shamma; |
109 | Multi-Channel Narrow-Band Deep Speech Separation with Full-Band Permutation Invariant Training Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: This paper addresses the problem of multi-channel multi-speech separation based on deep learning techniques. In the short time Fourier transform domain, we propose an end-to-end narrow-band network that directly takes as input the multi-channel mixture signals of one frequency, and outputs the separated signals of this frequency. |
C. Quan; X. Li; |
110 | CSENet: Complex Squeeze-and-Excitation Network for Speech Depression Level Prediction Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In order to make full use of speech information, this paper proposes a complex squeeze-and-excitation network (CSENet) for SDLP. |
C. Fan; Z. Lv; S. Pei; M. Niu; |
111 | Ubilung: Multi-Modal Passive-Based Lung Health Assessment Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Lung health assessment is traditionally done mainly through X-ray images and spirometry tests which are time-consuming, cumbersome, and costly. In this paper, we investigate the potential of passively recordable contents such as speech, cough and heart signal for such an assessment. |
E. Nemati; et al. |
112 | The Second DiCOVA Challenge: Dataset and Performance Analysis for Diagnosis of COVID-19 Using Acoustics Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present an overview of the challenge, the rationale for the data collection and the baseline system. |
N. K. Sharma; S. R. Chetupalli; D. Bhattacharya; D. Dutta; P. Mote; S. Ganapathy; |
113 | Supervised and Self-Supervised Pretraining Based COVID-19 Detection Using Acoustic Breathing/Cough/Speech Signals Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a bi-directional long short-term memory (BiLSTM) network based COVID-19 detection method using breath/speech/cough signals. |
X. -Y. Chen; Q. -S. Zhu; J. Zhang; L. -R. Dai; |
114 | Exploring Auditory Acoustic Features for The Diagnosis of COVID-19 Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This work presents the details of the automatic system for COVID-19 detection using breath, cough and speech recordings. |
M. R. Kamble; J. Patino; M. A. Zuluaga; M. Todisco; |
115 | FAST-RIR: Fast Neural Diffuse Room Impulse Response Generator Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: We present a neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment. |
A. Ratnarajah; S. -X. Zhang; M. Yu; Z. Tang; D. Manocha; D. Yu; |
116 | Region-to-Region Kernel Interpolation of Acoustic Transfer Function with Directional Weighting Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: A method of interpolating the acoustic transfer function (ATF) between regions that takes into account both the physical properties of the ATF and the directionality of region configurations is proposed. |
J. G. C. Ribeiro; S. Koyama; H. Saruwatari; |
117 | Blind Reverberation Time Estimation in Dynamic Acoustic Conditions Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Previously proposed methods involving deep neural networks were mostly designed and tested under the assumption of static acoustic conditions. In this work, we show that these approaches can perform poorly in dynamically evolving acoustic environments. |
P. Götz; C. Tuna; A. Walther; E. A. P. Habets; |
118 | Sparse Modeling of The Early Part of Noisy Room Impulse Responses with Sparse Bayesian Learning Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose to reconstruct the sparse model for the early part of RIRs with sparse Bayesian learning (SBL). |
M. Fu; J. R. Jensen; Y. Li; M. G. Christensen; |
119 | Improved Simulation of Realistically-Spatialised Simultaneous Speech Using Multi-Camera Analysis in The CHiME-5 Dataset Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In earlier work, we analysed a 50-hour audio-visual dataset of multiparty recordings made in real homes to estimate typical angular separations between speakers. |
J. Deadman; J. Barker; |
120 | A Data-Driven Approach for Acoustic Parameter Similarity Estimation of Speech Recording Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose two methods to estimate acoustic parameter similarity between a speech recording under analysis and a reference one. |
M. Papa; C. Borrelli; P. Bestagini; F. Antonacci; A. Sarti; S. Tubaro; |
121 | Violinist Identification Using Note-Level Timbre Feature Distributions Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: To verify if timbre features can describe a performer's style adequately, we examine a violinist identification method based on note-level timbre feature distributions. |
Y. Zhao; G. Fazekas; M. Sandler; |
122 | S3T: Self-Supervised Pre-Training with Swin Transformer For Music Classification Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose S3T, a self-supervised pre-training method with Swin Transformer for music classification, aiming to learn meaningful music representations from massive easily accessible unlabeled music data. |
H. Zhao; C. Zhang; B. Zhu; Z. Ma; K. Zhang; |
123 | Ambiguity Modelling with Label Distribution Learning for Music Classification Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we address the issue of ambiguity that can arise in many classification problems. |
M. Buisson; P. Alonso-Jiménez; D. Bogdanov; |
124 | ByteCover2: Towards Dimensionality Reduction of Latent Embedding for Efficient Cover Song Identification Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose an upgraded version of ByteCover, termed ByteCover2, which further improves ByteCover in both identification performance and efficiency. |
X. Du; K. Chen; Z. Wang; B. Zhu; Z. Ma; |
125 | TONet: Tone-Octave Network for Singing Melody Extraction from Polyphonic Music Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: In this paper, we propose TONet, a plug-and-play model that improves both tone and octave perceptions by leveraging a novel input representation and a novel network architecture. |
K. Chen; S. Yu; C. -i. Wang; W. Li; T. Berg-Kirkpatrick; S. Dubnov; |
126 | Hierarchical Graph-Based Neural Network for Singing Melody Extraction Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel hierarchical graph-based network for singing melody extraction. |
S. Yu; X. Chen; W. Li; |
127 | On The Impact of Normalization Strategies in Unsupervised Adversarial Domain Adaptation for Acoustic Scene Classification Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we address a more practical scenario where parallel data are not available. |
M. Olvera; E. Vincent; G. Gasso; |
128 | Improving Bird Classification with Unsupervised Sound Separation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we demonstrate improved separation quality when training a MixIT model specifically for birdsong data, outperforming a general audio separation model by over 5 dB in SI-SNR improvement of reconstructed mixtures. |
T. Denton; S. Wisdom; J. R. Hershey; |
129 | Scalable Neural Architectures for End-to-End Environmental Sound Classification Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose novel neural architectures based on PhiNets for real-time acoustic event detection on microcontroller units. |
F. Paissan; A. Ancilotto; A. Brutti; E. Farella; |
130 | HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: However, existing audio transformers require large GPU memories and long training time, meanwhile relying on pretrained vision models to achieve high performance, which limits the model's scalability in audio tasks. To combat these problems, we introduce HTS-AT: an audio transformer with a hierarchical structure to reduce the model size and training time. |
K. Chen; X. Du; B. Zhu; Z. Ma; T. Berg-Kirkpatrick; S. Dubnov; |
131 | Hybrid Attention-Based Prototypical Networks for Few-Shot Sound Classification Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we introduce a hybrid attention module and combine it with prototypical networks for few-shot sound classification. |
Y. Wang; D. V. Anderson; |
132 | End-to-End Complex-Valued Multidilated Convolutional Neural Network for Joint Acoustic Echo Cancellation and Noise Suppression Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we exploit the offset-compensating property of complex time-frequency masks and propose an end-to-end complex-valued neural network architecture. |
K. N. Watcharasupat; T. N. T. Nguyen; W. -S. Gan; S. Zhao; B. Ma; |
133 | NN3A: Neural Network Supported Acoustic Echo Cancellation, Noise Suppression and Automatic Gain Control for Real-Time Communications Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper proposes a neural network supported algorithm for RTC, namely NN3A, which incorporates an adaptive filter and a multi-task model for residual echo suppression, noise reduction and near-end speech activity detection. |
Z. Wang; Y. Na; B. Tian; Q. Fu; |
134 | Deep Residual Echo Suppression and Noise Reduction: A Multi-Input FCRN Approach in A Hybrid Speech Enhancement System Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Using the fully convolutional recurrent network (FCRN) architecture that is among state of the art topologies for noise reduction, we present a novel deep residual echo suppression and noise reduction with up to four input signals as part of a hybrid speech enhancement system with a linear frequency domain adaptive Kalman filter AEC. |
J. Franzen; T. Fingscheidt; |
135 | Neural Cascade Architecture for Joint Acoustic Echo and Noise Suppression Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a neural cascade architecture for joint acoustic echo and noise suppression. |
H. Zhang; D. Wang; |
136 | Cascade Multi-Channel Noise Reduction and Acoustic Feedback Cancellation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, a cascade noise reduction (NR) and acoustic feedback cancellation (AFC) algorithm is presented for speech applications where a multi-channel Wiener filter (MWF) based NR is applied first followed by a single-channel prediction-error method (PEM) based adaptive feedback cancellation stage. |
S. Ruiz; T. van Waterschoot; M. Moonen; |
137 | SkiM: Skipping Memory LSTM for Low-Latency Real-Time Continuous Speech Separation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We proposed a simple yet efficient model named Skipping Memory (SkiM) for the long sequence modeling. |
C. Li; L. Yang; W. Wang; Y. Qian; |
138 | Adapting Speech Separation to Real-World Meetings Using Mixture Invariant Training Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we investigate using MixIT to adapt a separation model on real far-field overlapping reverberant and noisy speech data from the AMI Corpus. |
A. Sivaraman; S. Wisdom; H. Erdogan; J. R. Hershey; |
139 | Quantifying Discriminability Between NMF Bases Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper introduces a quantitative measure to calculate how discriminative two NMF bases are. |
E. Konno; D. Saito; N. Minematsu; |
140 | Location-Based Training for Multi-Channel Talker-Independent Speaker Separation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Leveraging spatial information afforded by microphone arrays, we propose a new training approach to resolving permutation ambiguities for multi-channel speaker separation. |
H. Taherian; K. Tan; D. Wang; |
141 | SDR - Medium Rare with Fast Computations Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a fast algorithm fixing shortcomings of publicly available implementations. |
R. Scheibler; |
142 | AttentionPIT: Soft Permutation Invariant Training for Audio Source Separation with Attention Mechanism Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: [...] which makes it infeasible as J increases, and the other is that it is prone to getting stuck in bad local optimal solutions due to the hard output-target assignment process. To overcome these problems simultaneously, in this paper, we propose AttentionPIT, which uses an attention mechanism to find soft output-target assignments for separation network training, and can be run in polynomial time in J, as with the recently proposed fast PIT variants such as SinkPIT and HungarianPIT. |
H. Kameoka; S. Seki; L. Li; C. Watanabe; |
143 | Locate This, Not That: Class-Conditioned Sound Event DOA Estimation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose an alternative class-conditioned SELD model for situations where we may not be interested in localizing all classes all of the time. |
O. Slizovskaia; G. Wichern; Z. -Q. Wang; J. Le Roux; |
144 | SALSA-Lite: A Fast and Effective Feature for Polyphonic Sound Event Localization and Detection with Microphone Arrays Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: In this work, we introduce SALSA-Lite, a fast and effective feature for polyphonic SELD using microphone array inputs. |
T. N. Tho Nguyen; D. L. Jones; K. N. Watcharasupat; H. Phan; W. -S. Gan; |
145 | SRP-DNN: Learning Direct-Path Phase Difference for Multiple Moving Sound Source Localization Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose to use deep learning techniques to learn competing and time-varying direct-path phase differences for localizing multiple moving sound sources. |
B. Yang; H. Liu; X. Li; |
146 | Closed-Form Single Source Direction-of-Arrival Estimator Using First-Order Relative Harmonic Coefficients Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In contrast, this paper utilizes the first-order RHC to propose a closed-form DOA estimator by deriving a direction vector, which points towards the desired source direction. |
Y. Hu; S. Gannot; |
147 | A Slide-Save Based Framework for Multi-Source DOA Extraction with Closely Spaced Sources Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a slide-save based framework to address the problem of extracting multi-source DOAs for closely spaced sources. |
J. Geng; S. Wang; X. Lou; |
148 | An End-to-End Deep Learning Framework For Multiple Audio Source Separation And Localization Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present an end-to-end deep learning framework to separate and localize multiple audio sources from the mixture of multi-channels. |
Y. Chen; B. Liu; Z. Zhang; H. -S. Kim; |
149 | Deep Adaptation Control for Acoustic Echo Cancellation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a general framework for adaptation control using deep neural networks (NNs) and apply it to acoustic echo cancellation (AEC). |
A. Ivry; I. Cohen; B. Berdugo; |
150 | Off-the-Shelf Deep Integration For Residual-Echo Suppression Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this study, we fine-tune three pre-trained deep learning-based systems originally designed for RES, SS, and SE, and show that the best performing system for the task of RES varies with respect to the acoustic conditions. |
A. Ivry; I. Cohen; B. Berdugo; |
151 | A Complex Spectral Mapping with Inplace Convolution Recurrent Neural Networks For Acoustic Echo Cancellation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Different from most methods which process the entire frequency band, we propose inplace convolution recurrent neural networks (ICRN) for end-to-end AEC, which utilize inplace convolution and channel-wise temporal modeling to ensure that the near-end signal information is preserved. |
C. Zhang; J. Liu; X. Zhang; |
152 | Deep Adaptive AEC: Hybrid of Deep Learning and Adaptive Acoustic Echo Cancellation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we integrate classic adaptive filtering algorithms with modern deep learning to propose a new approach called deep adaptive AEC. |
H. Zhang; S. Kandadai; H. Rao; M. Kim; T. Pruthi; T. Kristjansson; |
153 | Computationally Efficient Fixed-Filter ANC for Speech Based on Long-Term Prediction for Headphone Applications Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we propose to solve the causality problem in feedforward fixed-filter ANC systems by integrating a long-term linear prediction filter to predict the incoming disturbance, here speech, by the same amount of samples ahead in time, as the non-causal delay. |
Y. Iotov; S. M. Nørholm; V. Belyi; M. Dyrholm; M. G. Christensen; |
154 | End-To-End Deep Learning-Based Adaptation Control for Frequency-Domain Adaptive System Identification Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a novel end-to-end deep learning-based adaptation control algorithm for frequency-domain adaptive system identification. |
T. Haubner; A. Brendel; W. Kellermann; |
155 | A Few-Sample Strategy for Guitar Tablature Transcription Based on Inharmonicity Analysis and Playability Constraints Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: The current work combines the two aforementioned strategies in an explicit manner by employing two discrete components for string-fret classification. |
G. Bastas; S. Koutoupis; M. Kaliakatsos-Papakostas; V. Katsouros; P. Maragos; |
156 | Exploring Transformer's Potential on Automatic Piano Transcription Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Most recent research about automatic music transcription (AMT) uses convolutional neural networks and recurrent neural networks to model the mapping from music signals to symbolic notation. |
L. Ou; Z. Guo; E. Benetos; J. Han; Y. Wang; |
157 | A Lightweight Instrument-Agnostic Model for Polyphonic Note Transcription and Multipitch Estimation Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: In this paper, we propose a lightweight neural network for musical instrument transcription, which supports polyphonic outputs and generalizes to a wide variety of instruments (including vocals). |
R. M. Bittner; J. J. Bosch; D. Rubinstein; G. Meseguer-Brocal; S. Ewert; |
158 | Towards Automatic Transcription of Polyphonic Electric Guitar Music: A New Dataset and A Multi-Loss Transformer Model Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a new dataset named EGDB, that contains transcriptions of the electric guitar performance of 240 tablatures rendered with different tones. |
Y. -H. Chen; W. -Y. Hsiao; T. -K. Hsieh; J. -S. R. Jang; Y. -H. Yang; |
159 | Genre-Conditioned Acoustic Models for Automatic Lyrics Transcription of Polyphonic Music Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose to transcribe the lyrics of polyphonic music using a novel genre-conditioned network. |
X. Gao; C. Gupta; H. Li; |
160 | Pseudo-Label Transfer from Frame-Level to Note-Level in A Teacher-Student Framework for Singing Transcription from Polyphonic Music Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: We address the issue by using pseudo labels from vocal pitch estimation models given unlabeled data. |
S. Kum; J. Lee; K. L. Kim; T. Kim; J. Nam; |
161 | Sound Event Detection Guided By Semantic Contexts of Scenes Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This is because one-hot representations of pre-defined scenes are exploited as prior contexts for such conventional methods. To alleviate this problem, we propose scene-informed SED where pre-defined scene-agnostic contexts are available for more accurate SED. |
N. Tonami; K. Imoto; R. Nagase; Y. Okamoto; T. Fukumori; Y. Yamashita; |
162 | CNN-Transformer with Self-Attention Network for Sound Event Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: To construct a model with high prediction accuracy while capturing the properties of acoustic signals well, we propose an architecture called a CNN-SAN-Transformer, which retains CNN in the blocks close to the input and uses SAN in all remaining blocks. |
K. Wakayama; S. Saito; |
163 | A Mutual Learning Framework for Few-Shot Sound Event Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: Secondly, the feature extractor is task-agnostic (or class-agnostic): the feature extractor is trained with base-class data and directly applied to unseen-class data. To address these issues, we present a novel mutual learning framework with transductive learning, which aims at iteratively updating the class prototypes and feature extractor. |
D. Yang; H. Wang; Y. Zou; Z. Ye; W. Wang; |
164 | Anomalous Sound Detection Using Spectral-Temporal Information Fusion Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: This paper proposes a spectral-temporal fusion based self-supervised method to model the feature of the normal sound, which improves the stability and performance consistency in detection of anomalous sounds from individual machines, even of the same type. |
Y. Liu; J. Guan; Q. Zhu; W. Wang; |
165 | Sparse Self-Attention for Semi-Supervised Sound Event Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a sparse self-attention mechanism to alleviate the impact. |
Y. Guan; J. Xue; G. Zheng; J. Han; |
166 | Peer Collaborative Learning for Polyphonic Sound Event Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper describes how semi-supervised learning, called peer collaborative learning (PCL), can be applied to the polyphonic sound event detection (PSED) task, which is one of the tasks in the Detection and Classification of Acoustic Scenes and Events (DCASE) challenge. |
H. Endo; H. Nishizaki; |
167 | PostGAN: A GAN-Based Post-Processor to Enhance The Quality of Coded Speech Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose PostGAN, a GAN-based neural post-processor that operates in the sub-band domain and relies on the U-Net architecture and a learned affine transform. |
S. Korse; N. Pia; K. Gupta; G. Fuchs; |
168 | A DNN Based Post-Filter to Enhance The Quality of Coded Speech in MDCT Domain Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a mask-based post-filter operating directly in MDCT domain of the codec, inducing no extra delay. |
K. Gupta; S. Korse; B. Edler; G. Fuchs; |
169 | A Two-Stage U-Net for High-Fidelity Denoising of Historical Recordings Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents a novel denoising method based on a fully-convolutional deep neural network. |
E. Moliner; V. Välimäki; |
170 | Experts Versus All-Rounders: Target Language Extraction for Multiple Target Languages Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we extend the Single-TLE framework to Multi-TLE. |
M. Borsdorf; K. Scheck; H. Li; T. Schultz; |
171 | Category-Adapted Sound Event Enhancement with Weakly Labeled Data Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a category-adapted system to enable enhancement on any selected sound category, where we first familiarize the model with all common sound classes and then apply a category-specific fine-tuning procedure to enhance the targeted sound class. |
G. Li; X. Xu; H. Dinkel; M. Wu; K. Yu; |
172 | Sequential MCMC Methods for Audio Signal Enhancement Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: With the aim of addressing audio signal restoration as a sequential inference problem, we build upon Gabor regression to propose a state-space model for audio time series. |
R. M. Clavería; S. J. Godsill; |
173 | Architecture for Variable Bitrate Neural Speech Codec with Configurable Computation Complexity Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we present a new neural speech codec that: 1) supports variable bitrates 2) supports packet losses of up to 120 ms and 3) can operate at low-compute and high-compute modes. |
T. Jayashankar; et al. |
174 | End-to-End Neural Speech Coding for Real-Time Communications Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper proposes the TFNet, an end-to-end neural speech codec with low latency for RTC. |
X. Jiang; X. Peng; C. Zheng; H. Xue; Y. Zhang; Y. Lu; |
175 | Deep Neural Network (DNN) Audio Coder Using A Perceptually Improved Training Method Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: The distortion in the perceptual domain is measured using the psychoacoustic model (PAM), and a loss function is obtained through the two-stage compensation approach. |
S. Shin; J. Byun; Y. Park; J. Sung; S. Beack; |
176 | Progressive Multi-Stage Neural Audio Coding with Guided References Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose an effective multi-stage neural audio coding algorithm that encodes full-band audio signals (up to 20 kHz) using an end-to-end training criterion. |
C. Lee; H. Lim; J. Lee; I. Jang; H. -G. Kang; |
177 | VocBench: A Neural Vocoder Benchmark for Speech Synthesis Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: However, it becomes more challenging to assess these new vocoders and compare their performance to previous ones. To address this problem, we present VocBench, a framework that benchmarks the performance of state-of-the-art neural vocoders. |
E. A. AlBadawy; A. Gibiansky; Q. He; J. Wu; M. -C. Chang; S. Lyu; |
178 | DNSMOS P.835: A Non-Intrusive Perceptual Objective Speech Quality Metric to Evaluate Noise Suppressors Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we train an objective metric based on P.835 human ratings that outputs 3 scores: i) speech quality (SIG), ii) background noise quality (BAK), and iii) the overall quality (OVRL) of the audio. |
C. K. A. Reddy; V. Gopal; R. Cutler; |
179 | SQAPP: No-Reference Speech Quality Assessment Via Pairwise Preference Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper proposes a learning framework for estimating the quality of a recording without any reference, and without any human judgments. |
P. Manocha; Z. Jin; A. Finkelstein; |
180 | LDNet: Unified Listener Dependent Modeling in MOS Prediction for Synthetic Speech Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: In this work, we present LDNet, a unified framework for mean opinion score (MOS) prediction that predicts the listener-wise perceived quality given the input speech and the listener identity. |
W. -C. Huang; E. Cooper; J. Yamagishi; T. Toda; |
181 | AECMOS: A Speech Quality Assessment Metric for Echo Impairment Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: More precisely, we develop a neural network model to evaluate call quality degradations in two separate categories: echo and degradations from other sources. |
M. Purin; S. Sootla; M. Sponza; A. Saabas; R. Cutler; |
182 | MOS Predictor for Synthetic Speech with I-Vector Inputs Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a neural-network-based model that splices the deep features extracted by convolutional neural network (CNN) and i-vector on the time axis and uses Transformer encoder as time sequence model. |
M. Liu; J. Wang; S. Li; F. Xiang; Y. Yao; L. Yang; |
183 | Wave-Domain Approach for Cancelling Noise Entering Open Windows Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a wave-domain approach that converges instantaneously, operates with low computational effort and does not require error microphones. |
D. Ratering; W. B. Kleijn; J. Gonzalez Silva; R. M. G. Ferrari; |
184 | On Synchronization of Wireless Acoustic Sensor Networks in The Presence of Time-Varying Sampling Rate Offsets and Speaker Changes Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: A wireless acoustic sensor network records audio signals with sampling time and sampling rate offsets between the audio streams, if the analog-digital converters (ADCs) of the network devices are not synchronized. Here, we introduce a new sampling rate offset model to simulate time-varying sampling frequencies caused, for example, by temperature changes of ADC crystal oscillators, and propose an estimation algorithm to handle this dynamic aspect in combination with changing acoustic source positions. |
T. Gburrek; J. Schmalenstroeer; R. Haeb-Umbach; |
185 | PickNet: Real-Time Channel Selection for Ad Hoc Microphone Arrays Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper proposes PickNet, a neural network model for real-time channel selection using an ad hoc microphone array. |
T. Yoshioka; X. Wang; D. Wang; |
186 | End-To-End Alexa Device Arbitration Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce a variant of the speaker localization problem, which we call device arbitration. |
J. Barber; Y. Fan; T. Zhang; |
187 | Instantaneous Linear Dimensionality Reduction of Multichannel Time-Series Signal for Array Signal Processing Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a frequency-independent, i.e., instantaneous, linear dimensionality reduction method that achieves low computational cost and latency and high restoration accuracy. |
N. Ueno; N. Ono; |
188 | Generalized Time Domain Velocity Vector Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce and analyze Generalized Time Domain Velocity Vector (GTVV), an extension of the previously presented acoustic multipath footprint extracted from the Ambisonic recordings. |
S. Kitic; J. Daniel; |
189 | Differentiable Digital Signal Processing Mixture Model for Synthesis Parameter Extraction from Mixture of Harmonic Sounds Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a model (DDSP mixture model) that represents a mixture as the sum of the outputs of multiple pretrained DDSP autoencoders. |
M. Kawamura; T. Nakamura; D. Kitamura; H. Saruwatari; Y. Takahashi; K. Kondo; |
190 | The MirrorNet: Learning Audio Synthesizer Controls Inspired By Sensorimotor Interaction Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, the MirrorNet is applied to learn, in an unsupervised manner, the controls of a specific audio synthesizer (DIVA) to produce melodies only from their auditory spectrograms. |
Y. M. Siriwardena; G. Marion; S. Shamma; |
191 | Deep Performer: Score-to-Audio Music Performance Synthesis Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Hence, we propose two new techniques for handling polyphonic inputs and providing a fine-grained conditioning in a transformer encoder-decoder model. |
H. -W. Dong; C. Zhou; T. Berg-Kirkpatrick; J. McAuley; |
192 | KaraSinger: Score-Free Singing Voice Synthesis with VQ-VAE Using Mel-Spectrograms Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel neural network model called KaraSinger for a less-studied singing voice synthesis (SVS) task named score-free SVS, in which the prosody and melody are spontaneously decided by machine. |
C. -F. Liao; J. -Y. Liu; Y. -H. Yang; |
193 | Adversarial Audio Synthesis Using A Harmonic-Percussive Discriminator Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a discriminator design scheme for generative adversarial network-based audio signal generation. |
J. Lee; H. Lim; C. Lee; I. Jang; H. -G. Kang; |
194 | SleepGAN: Towards Personalized Sleep Therapy Music Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we take the first step towards generating personalized sleep therapy music. |
J. Yang; C. Min; A. Mathur; F. Kawsar; |
195 | Diversity-Controllable and Accurate Audio Captioning Based on Neural Condition Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a novel neural conditional captioning model to balance the diversity and accuracy trade-off. |
X. Xu; M. Wu; K. Yu; |
196 | AudioCLIP: Extending CLIP to Image, Text and Audio Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: Utilizing the AudioSet dataset, our proposed model incorporates the ESResNeXt audio-model into the CLIP framework, thus enabling it to perform multimodal classification and keeping CLIP's zero-shot capabilities. AudioCLIP achieves new state-of-the-art results in the Environmental Sound Classification (ESC) task and out-performs others by reaching accuracies of 97.15 % on ESC-50 and 90.07 % on UrbanSound8K. |
A. Guzhov; F. Raue; J. Hees; A. Dengel; |
197 | Can Audio Captions Be Evaluated With Image Caption Metrics? Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: To overcome their limitations, we propose a metric named FENSE, where we combine the strength of Sentence-BERT in capturing similarity, and a novel Error Detector to penalize erroneous sentences for robustness. |
Z. Zhou; Z. Zhang; X. Xu; Z. Xie; M. Wu; K. Q. Zhu; |
198 | A Data-Driven Cognitive Salience Model for Objective Perceptual Audio Quality Assessment Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel data-driven salience model that informs the quality mapping stage by explicitly estimating the cognitive/degradation metric interactions using a salience measure. |
P. M. Delgado; J. Herre; |
199 | Improving Character Error Rate Is Not Equal to Having Clean Speech: Speech Enhancement for ASR Systems with Black-Box Acoustic Models Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: A deep neural network (DNN)-based speech enhancement (SE) aiming to maximize the performance of an automatic speech recognition (ASR) system is proposed in this paper. |
R. Sawata; Y. Kashiwagi; S. Takahashi; |
200 | Effect of Noise Suppression Losses on Speech Distortion and ASR Performance Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Furthermore, the introduced speech distortion and artifacts greatly harm speech quality and intelligibility, and often significantly degrade automatic speech recognition (ASR) rates. In this work, we shed light on the success of the spectral complex compressed mean squared error (MSE) loss, and how its magnitude and phase-aware terms are related to the speech distortion vs. noise reduction trade off. |
S. Braun; H. Gamper; |
201 | Increasing Loudness in Audio Signals: A Perceptually Motivated Approach to Preserve Audio Quality Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a method to maintain the subjective perception of volume of audio signals and, at the same time, reduce their absolute peak value. |
A. Jeannerot; N. de Koeijer; P. Martínez-Nuevo; M. B. Møller; J. Dyreby; P. Prandoni; |
202 | Audio Peak Reduction Using A Synced Allpass Filter Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, a new technique for linear peak amplitude reduction is proposed based on a Schroeder allpass filter, whose delay line and gain parameters are synced to match peaks of the signal's auto-correlation function. |
S. J. Schlecht; L. Fierro; V. Välimäki; J. Backman; |
203 | APPLADE: Adjustable Plug-and-Play Audio Declipper Combining DNN with Sparse Optimization Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose an audio declipping method that takes advantage of both sparse optimization and deep learning. |
T. Tanaka; K. Yatabe; M. Yasuda; Y. Oikawa; |
204 | Maximizing Audio Event Detection Model Performance on Small Datasets Through Knowledge Transfer, Data Augmentation, and Pretraining: An Ablation Study Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents an ablation study that analyzes which components contribute to the boost in performance and training time. |
D. Tompkins; K. Kumar; J. Wu; |
205 | Threshold Independent Evaluation of Sound Event Detection Scores Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: Performing an adequate evaluation of sound event detection (SED) systems is far from trivial and is still subject to ongoing research. |
J. Ebbers; R. Haeb-Umbach; R. Serizel; |
206 | Multimodal Evaluation Method for Sound Event Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents a novel multimodal method to evaluate SED systems from multiple perspectives such as detection, total duration, relative duration, and uniformity. |
S. M. R. Modaresi; A. Osmani; M. Razzazi; A. Chibani; |
207 | A Benchmark of State-of-the-Art Sound Event Detection Systems Evaluated on Synthetic Soundscapes Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper proposes a benchmark of submissions to the Detection and Classification of Acoustic Scenes and Events (DCASE) 2021 Challenge Task 4, representing a sampling of the state-of-the-art in the Sound Event Detection task. |
F. Ronchini; R. Serizel; |
208 | Attentive Max Feature Map and Joint Training for Acoustic Scene Classification Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose the attentive max feature map that combines two effective techniques, attention and a max feature map, to further elaborate the attention mechanism and mitigate the above-mentioned phenomenon. |
H. -j. Shim; J. -w. Jung; J. -h. Kim; H. -J. Yu; |
209 | A Variational Bayesian Approach to Learning Latent Variables for Acoustic Knowledge Transfer Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: We propose a variational Bayesian (VB) approach to learning distributions of latent variables in deep neural network (DNN) models for cross-domain knowledge transfer, to address acoustic mismatches between training and testing conditions. |
H. Hu; S. M. Siniscalchi; C. -H. H. Yang; C. -H. Lee; |
210 | ORCA-PARTY: An Automatic Killer Whale Sound Type Separation Toolkit Using Deep Learning Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: The current study is the first introducing a fully-automated deep signal separation approach for overlapping orca vocalizations, addressing all of the previously mentioned challenges, together with one of the largest bioacoustic data archives recorded on killer whales (Orcinus Orca). |
C. Bergler; M. Schmitt; A. Maier; R. X. Cheng; V. Barth; E. Nöth; |
211 | Sparsity-Based Sound Field Separation in The Spherical Harmonics Domain Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Sound field analysis and reconstruction has been a topic of intense research in the last decades for its multiple applications in spatial audio processing tasks. In this context, the identification of the direct and reverberant sound field components is a problem of great interest, where several solutions exploiting spherical harmonics representations have already been proposed. |
M. Pezzoli; M. Cobos; F. Antonacci; A. Sarti; |
212 | Spatial Active Noise Control Based on Individual Kernel Interpolation of Primary and Secondary Sound Fields Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, whereas the sound field to be interpolated is a superposition of primary and secondary sound fields, the directional weight for the primary noise source was applied to the total sound field in previous work; therefore, the performance improvement was limited. We propose a method of individually interpolating the primary and secondary sound fields and formulate a normalized least-mean-square algorithm based on this interpolation method. |
K. Arikawa; S. Koyama; H. Saruwatari; |
213 | Time-Domain Acoustic Contrast Control with A Spatial Uniformity Constraint for Personal Audio Systems Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper proposes a spatial uniformity constraint on time-domain broadband ACC in addition to the frequency response trend estimation constraint with the aim of ensuring a uniform sound field distribution in the bright zone. |
S. Zhao; I. S. Burnett; |
214 | Generation of Personal Sound Fields in Reverberant Environments Using Interframe Correlation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a personal sound field control approach that exploits interframe correlation. |
L. Shi; G. Ping; X. Shen; M. G. Christensen; |
215 | Variable Span Trade-Off Filter for Sound Zone Control with Kernel Interpolation Weighting Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: A sound zone control method is proposed, based on the frequency domain variable span trade-off filter (VAST). |
J. Brunnström; S. Koyama; M. Moonen; |
216 | Time Domain Radial Filter Design for Spherical Waves Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, the time-domain radial functions for spherical waves are realized as FIR filters. |
N. Hahn; F. Schultz; S. Spors; |
217 | Feature Space Message Passing Network for Medical Image Semantic Segmentation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: To solve both problems, we propose a novel feature space message passing network (FSMPN) framework. |
J. Sun; K. Zhang; S. Niu; Y. Zhang; Y. Kong; |
218 | Cross-Domain Few-Shot Learning for Rare-Disease Skin Lesion Segmentation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a cross-domain few-shot segmentation (CD-FSS) framework, which enables the model to leverage the learning ability obtained from the natural domain, to facilitate rare-disease skin lesion segmentation with limited data of common diseases. |
Y. Wang; et al. |
219 | Adaptive Pseudo Labeling for Source-Free Domain Adaptation in Medical Image Segmentation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we combine the dual-classifiers consistency and predictive category-aware confidence to form a novel regularization for pseudo-label denoising. |
C. Li; W. Chen; X. Luo; Y. He; Y. Tan; |
220 | Object Detection and Tracking in Ultrasound Scans Using An Optical Flow and Semantic Segmentation Framework Based on Convolutional Neural Networks Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a framework to autonomously detect, localize and track anatomical structures in ultrasound scans during scanning and therapeutic sessions in real-time. |
A. F. Al-Battal; I. R. Lerman; T. Q. Nguyen; |
221 | Heuristic Dropout: An Efficient Regularization Method for Medical Image Segmentation Models Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This manuscript goes deep into the research of the Dropout algorithm, which is commonly used in neural networks to alleviate the overfitting problem. From the perspective of solving the co-adaptation problem, this manuscript explains the basic principles of the Dropout algorithm and discusses the existing limitations of its derivative methods. |
D. Shi; R. Liu; L. Tao; C. Yuan; |
222 | Superresolution and Segmentation of OCT Scans Using Multi-Stage Adversarial Guided Attention Training Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This work aims to segment the OCT images automatically; however, it is a challenging task due to various issues such as the speckle noise, small target region, and unfavorable imaging conditions. |
P. Jeihouni; O. Dehzangi; A. Amireskandari; A. Dabouei; A. Rezai; N. M. Nasrabadi; |
223 | Heart Rate and Oxygen Saturation Estimation from Facial Video with Multimodal Physiological Data Generation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a method to estimate heart rate and oxygen saturation from facial videos with multimodal physiological data generation. |
Y. Akamatsu; Y. Onishi; H. Imaoka; |
224 | EMGSE: Acoustic/EMG Fusion for Multimodal Speech Enhancement Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel EMGSE framework for multimodal SE, which integrates audio and facial electromyography (EMG) signals. |
K. -C. Wang; K. -C. Liu; H. -M. Wang; Y. Tsao; |
225 | A Dilated Residual Vision Transformer for Atrial Fibrillation Detection from Stacked Time-Frequency ECG Representations Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This work proposes a new vision transformer (ViT) variant, namely, Dilated Residual ViT (DiResViT), by replacing the original patchify stem in ViT with dilated convolutional stem having residual connections for improved AF detection from an ensemble of ECG time-frequency representations. |
S. Pratiher; A. Srivastava; Y. B. Priyatha; N. Ghosh; A. Patra; |
226 | Contrastive Heartbeats: Contrastive Learning for Self-Supervised ECG Representation and Phenotyping Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Hence, we propose a new self-supervised representation learning framework, contrastive heartbeats (CT-HB), which learns general and robust electrocardiogram representations for efficient training on various downstream tasks. |
C. T. Wei; M. -E. Hsieh; C. -L. Liu; V. S. Tseng; |
227 | Ubiquitous Physiological Prediction of SUD Patients' Wellness State Using Memory-Based Convolutional Models Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Using wearable sensors, we aim to evaluate the impact of changes in heart rate (HR) and heart rate variability (HRV) signals on SUD wellness development using long-term and ubiquitous monitoring and machine learning and collected data from 10 subjects over an extended period of time. |
O. Dehzangi; P. Jeihouni; J. Ramadan; V. Finomore; N. M. Nasrabadi; A. Rezai; |
228 | Joint Hypoglycemia Prediction and Glucose Forecasting Via Deep Multi-Task Learning Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a multitask learning approach to the problem of hypoglycemia (HG) prediction in diabetes. |
M. Yang; D. Dave; M. Erraguntla; G. L. Cote; R. Gutierrez-Osuna; |
229 | SegNet-Based Deep Representation Learning for Dysphagia Classification Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This article presents a SegNet-based method for classifying healthy and dysphagic swallow signals by learning mel-spectrogram features. |
S. Subramani; A. R. M. V; A. Roy; P. S. Hegde; P. Kumar Ghosh; |
230 | Robust Collaborative Learning for Sequence Modelling Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: By constructing model-agnostic robustness checks and reusing features obtained from both architectures, we build a collaborative framework that improves performance and stability. |
F. Buet-Golfouse; H. Roggeman; I. Utyagulov; |
231 | A Self-Supervised Pre-Training Framework for Vision-Based Seizure Classification Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a method to classify ES and PNES based on clinical signs in the seizure videos. |
J. -C. Hou; A. McGonigal; F. Bartolomei; M. Thonnat; |
232 | Design of Real-Time System Based on Machine Learning for Snoring and OSA Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this study, we place a microphone under the patient's bed and combined with full-night polysomnography to record audio signals. |
H. Luo; L. Zhang; L. Zhou; X. Lin; Z. Zhang; M. Wang; |
233 | Parametric Modeling of Human Wrist for Bioimpedance-Based Physiological Sensing Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This study provides a parametric model of the human wrist that involves different tissue layers (i.e., skin, fat, artery, muscle, bone) with complex dielectric properties built based on the human wrist anatomy. |
K. Sel; N. Huerta; M. S. Sacks; R. Jafari; |
234 | Preliminary Results on The Generation of Artificial Handwriting Data Using A Decomposition-Recombination Strategy Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This work proposes the use of data augmentation techniques to improve the accuracy of a Long short-term memory system in the diagnosis of essential tremor. |
J. F. Adr�n Otero; O. Sol�ns Caballer; P. Marti-Puig; Z. Sun; T. Tanaka; J. Solé-Casals;
235 | A Style Transfer Mapping and Fine-Tuning Subject Transfer Framework Using Convolutional Neural Networks for Surface Electromyogram Pattern Recognition Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this study, we propose a style transfer mapping (STM) and fine-tuning (FT) subject transfer framework using convolutional neural networks (CNNs). |
S. Kanoga; T. Hoshino; M. Tada; |
236 | Feature-Based Sensing Matrix Design for Analog to Information Converters Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel sensing matrix design for the pulse-width modulation (PWM)-based analog-to-information converter (AIC), which obtains the digital feature of an analog signal rather than its sparse coefficients. |
C. Guo; H. Qian; B. Hong; |
237 | ALSNet: A Dilated 1-D CNN for Identifying ALS from Raw EMG Signal Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, a dilated one dimensional convolutional neural network, named ALSNet, is proposed for identifying ALS from raw EMG signal. |
K. M. Naimul Hassan; et al. |
238 | Joint Model Order Estimation for Multiple Tensors with A Coupled Mode and Applications to The Joint Decomposition of EEG, MEG Magnetometer, and Gradiometer Tensors Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we extend the rank estimation techniques, designed for a single tensor, to noise-corrupted coupled low-rank tensors that share one of their factor matrices. |
B. Ahmad; L. Khamidullina; A. A. Korobkov; A. Manina; J. Haueisen; M. Haardt; |
239 | An Experimental Study on Transferring Data-Driven Image Compressive Sensing to Bioelectric Signals Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we conduct an experimental study on transferring existing data-driven image CS methods to bioelectric signals. |
Z. Zhang; J. Zhao; F. Ren; |
240 | Hand Gesture Recognition Using Temporal Convolutions and Attention Mechanism Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Such data-driven models, however, have been challenged by their need for a large number of trainable parameters and their structural complexity. Here we propose the novel Temporal Convolutions-based Hand Gesture Recognition architecture (TC-HGR) to reduce this computational burden. |
E. Rahimian; S. Zabihi; A. Asif; D. Farina; S. F. Atashzar; A. Mohammadi; |
241 | Combining Multiple Style Transfer Networks and Transfer Learning For LGE-CMR Segmentation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents an algorithm for segmenting late gadolinium enhancement cardiac magnetic resonance (LGE-CMR) in the absence of labeled training data. |
B. Fang; J. Chen; W. Wang; Y. Zhou; |
242 | Multi-Domain Unpaired Ultrasound Image Artifact Removal Using A Single Convolutional Neural Network Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Inspired by the recent success of multi-domain image transfer, herein, we propose a novel unpaired deep learning approach where a single neural network can deal with different types of US artifacts simply by changing a mask vector that switches between different target domains. |
J. Huh; S. Khan; J. C. Ye; |
243 | Improving Ultrasound Image Classification with Local Texture Quantisation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper proposes a novel image classification framework for small-scaled and noisy ultrasound image datasets. |
X. Li; H. Liang; S. Nagala; J. Chen; |
244 | Accelerated Intravascular Ultrasound Imaging Using Deep Reinforcement Learning Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: To efficiently accelerate IVUS imaging, we propose a framework that utilizes deep reinforcement learning for an optimal adaptive acquisition policy on a per-frame basis enabled by actor-critic methods and Gumbel top-K sampling. |
T. S. W. Stevens; N. Chennakeshava; F. J. de Bruijn; M. Pekar; R. J. G. van Sloun; |
245 | Deep Proximal Unfolding For Image Recovery from Under-Sampled Channel Data in Intravascular Ultrasound Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here, we propose a model-based deep learning solution that aims to reconstruct images from data that has been beamformed by under-sampling the number of channels by a factor of 4. |
N. Chennakeshava; et al. |
246 | Multiview Long-Short Spatial Contrastive Learning For 3D Medical Image Analysis Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we extend the contrastive learning framework to 3D volumetric medical imaging. |
G. Cao; Y. Wang; M. Zhang; J. Zhang; G. Kang; X. Xu; |
247 | Composing Graphical Models with Generative Adversarial Networks for EEG Signal Modeling Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here, we propose a generative and inference approach that combines the complementary benefits of probabilistic graphical models and generative adversarial networks (GANs) for EEG signal modeling. |
K. Vo; M. Vishwanath; R. Srinivasan; N. Dutt; H. Cao; |
248 | Domain-Invariant Representation Learning from EEG with Private Encoders Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: To that end, we propose a multi-source learning architecture where we extract domain-invariant representations from dataset-specific private encoders. |
D. Bethge; et al. |
249 | Holistic Semi-Supervised Approaches for EEG Representation Learning Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we adapt three state-of-the-art holistic semi-supervised approaches, namely MixMatch [1], Fix-Match [2], and AdaMatch [3], as well as five classical semi-supervised methods for EEG learning. |
G. Zhang; A. Etemad; |
250 | Music Identification Using Brain Responses to Initial Snippets Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We examine EEG encoding of naturalistic musical patterns employing the NMED-T and MUSIN-G datasets. |
P. Pandey; G. Sharma; K. P. Miyapuram; R. Subramanian; D. Lomas; |
251 | Multi-Level Spatial-Temporal Adaptation Network for Motor Imagery Classification Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: And this variance is more significant across subjects and sessions, which imposes limitations on the cross-domain MI tasks. To address this problem, we propose a Multi-level Spatial-Temporal Adaptation Network (MSTAN), extracting domain-invariant multi-level spatial-temporal features to overcome domain differences. |
W. Xu; J. Wang; Z. Jia; Z. Hong; Y. Li; Y. Lin; |
252 | Learning Subject-Invariant Representations from Speech-Evoked EEG Using Variational Autoencoders Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we adapt factorized hierarchical variational autoencoders to exploit parallel EEG recordings of the same stimuli. |
L. Bollens; T. Francart; H. V. Hamme; |
253 | Unsupervised Hierarchical Translation-Based Model for Multi-Modal Medical Image Registration Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose an unsupervised hierarchical translation-based model to perform a coarse to fine registration of multi-modal medical images. |
X. Dai; T. Ma; H. Cai; Y. Wen; |
254 | FAZ-BV: A Diabetic Macular Ischemia Grading Framework Combining FAZ Attention Network and Blood Vessel Enhancement Filters Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, none of the existing methods can effectively segment the damaged foveal avascular zone (FAZ) and blood vessels (BV) of DMI patients. To avoid this disadvantage, this study proposes a DMI grading framework, i.e. FAZ-BV, combining accurate FAZ and vessel segmentation designed for DMI. |
Z. Chen; H. Lan; Y. Meng; Y. Xiong; J. Luo; H. Shen; |
255 | Fracture Detection and Localization in Chest X-Rays Using Semi-Supervised Learning with Dynamic Sharpening Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we present a low-cost and efficient method for training a rib and clavicle fracture detection model for chest X-ray (CXR) in a semi-supervised setting where only a small portion of the training data has location annotations. |
L. Lu; S. Miao; L. Ye; |
256 | Histokt: Cross Knowledge Transfer in Computational Pathology Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: In this paper, we take a data-centric approach to the transfer learning problem and examine the existence of generalizable knowledge between histopathological datasets. |
R. Zhang; et al. |
257 | Unsupervised Deep Learning Network for Deformable Fundus Image Registration Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Aiming at addressing the retina registration problem from the Deep Learning perspective, in this paper we introduce an end-to-end framework capable of learning the registration task in a fully unsupervised way. |
G. A. Benvenuto; M. Colnago; W. Casaca; |
258 | A Minimally Supervised Approach for Medical Image Quality Assessment in Domain Shift Settings Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce a minimally-supervised image quality assessment (MIQA) approach that can learn effectively with small datasets and limited labels in class-imbalanced domain shift scenarios. |
H. Yang; et al. |
259 | A Channel Attention Based MLP-Mixer Network for Motor Imagery Decoding With EEG Literature Review Related Patents Related Grants Related Orgs Related Experts Details Abstract: Convolutional neural networks (CNNs) and their variants have been successfully applied to the electroencephalogram (EEG) based motor imagery (MI) decoding task. However, these … |
Y. He; Z. Lu; J. Wang; J. Shi; |
260 | Towards Closed-Loop Speech Synthesis from Stereotactic EEG: A Unit Selection Approach Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: The present study aims to address both challenges. |
M. Angrick; et al. |
261 | Enhancing Contextual Encoding With Stage-Confusion and Stage-Transition Estimation for EEG-Based Sleep Staging Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a novel network architecture that takes advantage of two auxiliary classification tasks and exploits their outputs to adapt feature representations, thus effectively discriminating confusing stages. |
J. Phyo; W. Ko; E. Jeon; H. -I. Suk; |
262 | Improving BCI-based Color Vision Assessment Using Gaussian Process Regression Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present metamer identification plus (metaID+), an algorithm that enhances the performance of brain-computer interface (BCI)-based color vision assessment. |
H. Habibzadeh; K. J. Long; A. E. Atkins; D. -S. Zois; J. J. S. Norton; |
263 | Transformer-Based Estimation of Spoken Sentences Using Electrocorticography Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Invasive brain-machine interfaces (BMIs) are a promising neurotechnological venture for achieving direct speech communication from a human brain, but it faces many challenges. In this paper, we measured the invasive electrocorticogram (ECoG) signals from seven participating epilepsy patients as they spoke a sentence consisting of multiple phrases. |
S. Komeiji; et al. |
264 | Boost Ensemble Learning for Classification of CTG Signals Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In practice, we face highly imbalanced data, where the hypoxic fetuses are significantly underrepresented. We propose to address this problem by boost ensemble learning, where for learning, we use the distribution of classification error over the dataset. |
M. Ajirak; C. Heiselman; J. G. Quirk; P. M. Djuric; |
265 | Multi-View Learning Based on Non-Redundant Fusion for ICU Patient Mortality Prediction Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Those predicting from a single perspective cannot fully apply multiple sources of information, while the fusion of multiple perspectives may produce much redundant information. Therefore, this paper proposes a multi-view fusion method based on non-redundant information learning, applying it to ICU patient mortality prediction. |
Y. Wang; Y. Lan; |
266 | Improving Phase-Rectified Signal Averaging for Fetal Heart Rate Analysis Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we examine PRSA-based methods through the lens of dynamical systems theory and reveal the intrinsic connection between state space reconstruction and PRSA. |
T. Chen; G. Feng; C. Heiselman; J. G. Quirk; P. M. Djuric; |
267 | Unsupervised Clustering and Analysis of Contraction-Dependent Fetal Heart Rate Segments Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we provide a complete method for FHR-UC segment clustering and analysis via the Gaussian process latent variable model, and density-based spatial clustering. |
L. Yang; C. Heiselman; J. G. Quirk; P. M. Djuric; |
268 | A Method for Detecting Coronary Artery Disease Using Noisy Ultrashort Electrocardiogram Recordings Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: The current study aims at creating an algorithm able to detect Coronary Artery Disease (CAD), using ultrashort (duration of 30 seconds) one-lead ECG recordings. |
O. Apostolou; V. Charisis; G. Apostolidis; L. J. Hadjileontiadis; |
269 | Multi-Task Gaussian Process Regression for The Detection of Sleep Cycles in Premature Infants Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Studies on neonatal sleep suggest that the pattern of their sleep stages is determined by an endogenous ultradian rhythm, superimposed by other rhythms and external influences. In this article, we propose the use of multi-task Gaussian process regression as a flexible nonparametric approach to analyze this kind of sleep data while incorporating prior knowledge, such as of correlations between signals, signal periodicity, information from manual annotations and certain other signal properties. |
N. S. Brügge; J. Grasshoff; A. Weigenand; P. Rostalski;
270 | Fast Low Rank Column-Wise Compressive Sensing For Accelerated Dynamic MRI Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: Accelerated dynamic MRI is a key application where this problem occurs. In this work, we show the power of our approach (and of its modification for the MRI setting) for four very different highly undersampled dynamic MRI applications. |
S. Babu; S. S. Nayer; S. G. Lingala; N. Vaswani; |
271 | MRI Recovery with A Self-Calibrated Denoiser Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a PnP-inspired recovery method that does not require data beyond the single, incomplete set of measurements. |
S. Liu; P. Schniter; R. Ahmad; |
272 | 3D Cross-Scale Feature Transformer Network for Brain MR Image Super-Resolution Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a 3D cross-scale feature transformer network (CFTN) to utilize the cross-scale priors within MR features. |
W. Zhang; L. Wang; W. Chen; Y. Jia; Z. He; J. Du; |
273 | Data Efficient Support Vector Machine Training Using The Minimum Description Length Principle Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we propose a novel approach to training SVMs which does not suffer from the aforementioned limitation, which is at the same time much more rigorous in nature, being built upon solid information theoretic grounds. |
H. Singh; O. Arandjelovic; |
274 | Multiple Instance Learning with Task-Specific Multi-Level Features for Weakly Annotated Histopathological Image Classification Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, three major challenges including lack of data efficiency because MIL approaches rely on task-agnostic feature extractor, overfitting challenges caused by high data imbalance between tumor and normal tissues, and the similarity between tumor and normal patches, are to be tackled. We proposed a three-stage deep MIL approach to address these challenges. |
Y. Zhou; Y. Lu; |
275 | Self-Knowledge Distillation Based Self-Supervised Learning for COVID-19 Detection from Chest X-Ray Images Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper proposes a novel self-knowledge distillation based self-supervised learning method for COVID-19 detection from chest X-ray images. |
G. Li; R. Togo; T. Ogawa; M. Haseyama; |
276 | Pixel-Level and Affinity-Level Knowledge Distillation for Unsupervised Segmentation of COVID-19 Lesions Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Although an unsupervised method based on anomaly detection has shown promising results in [1], its performance is relatively poor. We address this problem by proposing a pixel-level and affinity-level knowledge distillation method. |
R. Xu; et al. |
277 | Data Shapley Value for Handling Noisy Labels: An Application in Screening COVID-19 Pneumonia from Chest CT Scans Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, effects of utilizing different evaluation metrics for computation of the SV, detecting the noisy labels, and measuring the data points' importance has not yet been thoroughly investigated. In this context, we performed a series of comparative analyses to assess SV's capabilities to detect noisy input labels when measured by different evaluation metrics. |
N. Enshaei; M. J. Rafiee; A. Mohammadi; F. Naderkhani; |
278 | Accurate Multiscale Selective Fusion of CT and Video Images for Real-Time Endoscopic Camera 3D Tracking in Robotic Surgery Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper proposes an accurate multiscale selective fusion framework to register 2D endoscopic video images to 3D pre-operative CT data for endoscope 3D tracking. |
X. Luo; |
279 | Learning Deep Pathological Features for WSI-Level Cervical Cancer Grading Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: As WSIs are in gigapixel resolution, it is impossible to train a deep classification neural network with the entire WSIs as inputs. To bypass this problem, we propose a two-stage learning framework. |
R. Geng; Q. Liu; S. Feng; Y. Liang; |
280 | Selective Scale Cascade Attention Network for Breast Cancer Histopathology Image Classification Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose selective scale cascade attention network (SSCA) to learn discriminative features for breast histopathological image classification. |
B. Xu; W. Zhang; |
281 | Frequency-Specific Non-Linear Granger Causality in A Network of Brain Signals Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel algorithm to extract frequency-band specific and non-linear Granger causality (Spectral NLGC) connections between components of a multivariate time series. |
A. Biswas; H. Ombao; |
282 | Epileptic Spike Detection By Recurrent Neural Networks with Self-Attention Mechanism Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper thus considers a scenario where candidates are not detected; that is, we propose a recurrent neural network (RNN)-based self-attention model that can be fitted from the EEG segments generated without spike candidates being detected. |
K. Fukumori; N. Yoshida; H. Sugano; M. Nakajima; T. Tanaka; |
283 | Topological Correlation of Brain Signals Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this study, we propose a new correlation measure for EEG signals by correlating topological features across multiple directions. |
J. Yin; Y. Wang; |
284 | Online Detection of Scalp-Invisible Mesial-Temporal Brain Interictal Epileptiform Discharges from EEG Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here, we propose a method namely temporal components analysis (TCA) to detect the IEDs from ongoing sEEG and iEEG signals recorded simultaneously. |
B. Abdi-Sargezeh; A. Valentin; G. Alarcon; S. Sanei; |
285 | Leveraging Sparse Coding for EEG Based Emotion Recognition in Shooting Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we collected EEG of novice shooters and high-level shooters in different emotion states, and established two shooting datasets. |
Y. Wang; Y. Sun; L. Fang; C. Zhang; |
286 | A Novel Unsupervised Autoencoder-Based HFOs Detector in Intracranial EEG Signals Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, most of the existing HFOs detectors are based on manual feature extraction and supervised learning, which incur laborious feature selection and time-consuming labeling process. In order to tackle these issues, we propose an automatic unsupervised HFOs detector based on convolutional variational autoencoder (CVAE). |
W. Li; L. Zhong; W. Xiang; T. Kang; D. Lai; |
287 | A Novel Convolutional Neural Network Based on Adaptive Multi-Scale Aggregation and Boundary-Aware for Lateral Ventricle Segmentation on MR Images Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel convolutional neural network based on adaptive multi-scale feature aggregation and boundary-aware for lateral ventricle segmentation (MB-Net), which mainly includes three parts, i.e., an adaptive multi-scale feature aggregation module (AMSFM), an embedded boundary refinement module (EBRM), and a local feature extraction module (LFM). |
F. Ye; Z. Wang; S. Zhu; X. Li; K. Hu; |
288 | Multiscale Attention Aggregation Network for 2D Vessel Segmentation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel multiscale attention aggregation network (MAA-Net) for vessel segmentation. |
W. Liu; H. Yang; T. Tian; X. Pan; W. Xu; |
289 | TCRNet: Make Transformer, CNN and RNN Complement Each Other Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel encoder-decoder network named TCRNet, which makes Transformer, Convolutional neural network (CNN) and Recurrent neural network (RNN) complement each other. |
X. Shan; T. Ma; A. Gu; H. Cai; Y. Wen; |
290 | Double Noise Mean Teacher Self-Ensembling Model for Semi-Supervised Tumor Segmentation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a novel double noise mean teacher self-ensembling model for semi-supervised 2D tumor segmentation. |
K. Zheng; J. Xu; J. Wei; |
291 | Rethinking Computer-Aided Pelvis Segmentation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Some mainstream segmentation algorithms are trained and evaluated on the proposed PCT14K dataset and served as the baselines for future research. |
S. Yuan; Q. Liu; S. Liao; F. Han; H. Wei; Y. Zhang; |
292 | Vision Transformer-Based Retina Vessel Segmentation with Deep Adaptive Gamma Correction Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, the complexity of edge structural information and the changeable intensity distribution depending on retina images reduce the performance of the segmentation tasks. This paper proposes two novel deep learning-based modules, channel attention vision transformer (CAViT) and deep adaptive gamma correction (DAGC), to tackle these issues. |
H. Yu; J. -h. Shim; J. Kwak; J. W. Song; S. -J. Kang; |
293 | Spectral Permutation Test on Persistence Diagrams Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this study, we propose a novel spectral permutation test on PDs by permuting Fourier coefficients from heat kernel estimation of the PDs. |
Y. Wang; M. K. Chung; J. Fridriksson; |
294 | Multi-Task FMRI Data Fusion Using IVA and PARAFAC2 Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Various formulations of coupled matrix factorizations have been proposed, each with its own modeling assumptions. In this paper, we study two such methods, namely Independent Vector Analysis (IVA), i.e., extension of Independent Component Analysis (ICA) to multiple datasets, and PARAFAC2, a tensor factorization approach. |
I. Lehmann; et al. |
295 | Independent Vector Analysis Based Subgroup Identification from Multisubject FMRI Data Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a completely data-driven approach, subgroup identification using independent vector analysis (SI-IVA), which leverages the desirable properties of IVA to uncover the relationship across subjects along with the discovery of subgroup structures revealed by Gershgorin disc theorem. |
H. Yang; M. A. B. S. Akhonda; F. Ghayem; Q. Long; V. D. Calhoun; T. Adali; |
296 | Improving Brain Decoding Methods and Evaluation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose to directly classify an fMRI scan, mapping it to the corresponding word within a fixed vocabulary. |
D. Pascual; B. Egressy; N. Affolter; Y. Cai; O. Richter; R. Wattenhofer; |
297 | Cmri2spec: Cine MRI Sequence to Spectrogram Synthesis Via A Pairwise Heterogeneous Translator Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here, we propose a new synthesis framework to translate from cine MRI sequences to spectrograms with a limited dataset size. |
X. Liu; et al. |
298 | Spatio-Temporal Attention Graph Convolution Network for Functional Connectome Classification Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this study, we propose a novel Spatio-Temporal Attention Graph Convolution Network (STAGCN) for FC classification. |
W. Wang; Y. Kong; Z. Hou; C. Yang; Y. Yuan; |
299 | Bilevel Learning of L1 Regularizers with Closed-Form Gradients (BLORC) Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a method for supervised learning of sparsity-promoting regularizers, which are a key ingredient in many modern signal reconstruction approaches. |
A. Ghosh; M. T. McCann; S. Ravishankar; |
300 | Multiband Image Fusion with Controllable Error Guarantees Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In classical variational techniques, this problem is formulated as the minimization of an objective function consisting of two quadratic data-fidelity terms and an edge-preserving regularizer; the former account for blur, resolution mismatch and additive noise. In this work, we explore a constrained formulation of this problem where the regularization function is minimized subject to hard constraints on the data fidelity. |
U. V. S.; R. G. Gavaskar; K. N. Chaudhury; |
301 | Weighted Graph Embedded Low-Rank Projection Learning for Feature Extraction Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: To solve those problems, in this paper a weighted graph embedded low-rank projection (WGE_LRP) method is proposed. |
Z. Huang; S. Zhao; L. Fei; J. Wu; |
302 | ADMM-DAD Net: A Deep Unfolding Network for Analysis Compressed Sensing Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: In this paper, we propose a new deep unfolding neural network based on the ADMM algorithm for analysis Compressed Sensing. |
V. Kouni; G. Paraskevopoulos; H. Rauhut; G. C. Alexandropoulos; |
303 | High-Dimensional Sparse Bayesian Learning Without Covariance Matrices Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce a new inference scheme that avoids explicit construction of the covariance matrix by solving multiple linear systems in parallel to obtain the posterior moments for SBL. |
A. Lin; A. H. Song; B. Bilgic; D. Ba; |
304 | A Trainable Bounded Denoiser Using Double Tight Frame Network for Snapshot Compressive Imaging Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Recently, the PnP-GAP algorithm has achieved remarkable reconstruction quality for snapshot compressive imaging (SCI), and its convergence has been proven based on the condition of diminishing noise levels and the assumption of bounded denoisers. |
B. Shi; Y. Wang; Q. Lian; |
305 | Progressive Image Super-Resolution Via Neural Differential Equation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a new approach for the image super-resolution (SR) task that progressively restores a high-resolution (HR) image from an input low-resolution (LR) image on the basis of a neural ordinary differential equation. |
S. Park; T. H. Kim; |
306 | High-Quality Self-Supervised Snapshot Hyperspectral Imaging Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper leverages the image priors encoded in untrained neural networks (NNs) to have a self-supervised learning method which is free from training datasets while adaptive to the statistics of a test sample. |
Y. Quan; X. Qin; M. Chen; Y. Huang; |
307 | Robust Bayesian Reconstruction of Multispectral Single-Photon 3D Lidar Data with Non-Uniform Background Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents a new Bayesian algorithm for the robust reconstruction of multispectral single-photon Lidar data acquired in extreme conditions. |
A. Halimi; J. Koo; R. A. Lamb; G. S. Buller; S. McLaughlin; |
308 | Joint Calibration and Mapping of Satellite Altimetry Data Using Trainable Variational Models Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here we show how a data-driven variational data assimilation framework could be used to jointly learn a calibration operator and an interpolator from non-calibrated data. |
Q. Febvre; R. Fablet; J. L. Sommer; C. Ubelmann; |
309 | 4D Convolutional Neural Networks for Multi-Spectral and Multi-Temporal Remote Sensing Data Classification Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose the extension of current fully-convolutional models for multi-temporal remote sensing data classification to their high-dimensional analogs, which can naturally capture multi-dimensional dependencies and correlations. |
M. Giannopoulos; G. Tsagkatakis; P. Tsakalides; |
310 | A New Deep Learning Method for Multispectral Image Time Series Completion Using Hyperspectral Data Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a new deep learning approach to that end. |
C. T. Cissé; et al. |
311 | Image Denoising with Deep Unfolding And Normalizing Flows Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, current proximal mappings based on (predominantly convolutional) neural networks only implicitly learn such image priors. In this paper, we propose to make these image priors fully explicit by embedding deep generative models in the form of normalizing flows within the unfolded proximal gradient algorithm, and training the entire algorithm in an end-to-end fashion. |
X. Wei; H. van Gorp; L. G. Carabarin; D. Freedman; Y. C. Eldar; R. J. G. van Sloun; |
312 | 3D Texture Super Resolution Via The Rendering Loss Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Due to the nature of rendering 3D models, 2D SR methods applied directly to 3D object texture may not be a good approach. In this paper, we propose a rendering loss derived from the rendering of a 3D model and demonstrate its application to the SR task in the context of 3D texturing. |
R. Ranade; Y. Liang; S. Wang; D. Bai; J. Lee; |
313 | Bundle ICP with Virtual Depth for Hand-Held 3D Scanner Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a general-purpose hand-held 3D scan system that combines an iterative closest point (ICP) algorithm based on a large amount of virtual information for accuracy with the advantage of a graph-based reconstruction system for robustness. |
C. Sung; B. Kim; |
314 | Sketched RT3D: How to Reconstruct Billions of Photons Per Second Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: In particular, we propose a sketched version of a recent state-of-the-art algorithm which uses point cloud denoisers to provide spatially regularized reconstructions. |
J. Tachella; M. P. Sheehan; M. E. Davies; |
315 | A Generic Method to Estimate Camera Extrinsic Parameters Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, an approach to self-calibrate an outward-looking camera from camera images is presented. |
N. Kuruba; N. Badadare; V. Narayan; S. Putta; |
316 | Photon-Limited Deblurring Using Algorithm Unrolling Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we present an algorithm unrolling approach that unrolls a Plug-and-Play algorithm using a fixed-iteration network. |
Y. Sanghvi; A. Gnanasambandan; S. H. Chan; |
317 | NEX+: Novel View Synthesis with Neural Regularisation Over Multi-Plane Images Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Overfitting to training data is a common challenge for all learning-based models. We propose a novel solution for resolving this issue in the context of NVS with signal denoising-motivated operations over the alpha coefficients of the MPI, without any additional requirements for supervision. |
W. Xing; J. Chen; |
318 | Compressive Scanning Transmission Electron Microscopy Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a scanning method based on the theory of Compressive Sensing (CS) and subsampling the electron probe locations using a line hop sampling scheme that significantly reduces the electron beam damage. |
D. Nicholls; et al. |
319 | Deep Iterative Phase Retrieval for Ptychography Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work we specifically consider ptychography, a sub-field of diffractive imaging, where objects are reconstructed from multiple overlapping diffraction images. |
S. Welker; T. Peer; H. N. Chapman; T. Gerkmann; |
320 | Compressive Phase Retrieval Based On Sparse Latent Generative Priors Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose to introduce structure on the signal by enforcing sparsity in the latent-space via proximal method while training the generator. |
V. Killedar; C. S. Seelamantula; |
321 | Model-Based Reconstruction for Collimated Beam Ultrasound Systems Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Such systems include a transmitter and multiple receivers to capture reflected signals. Common algorithms for ultrasound reconstruction use delay-and-sum (DAS) approaches; these have low computational complexity but produce inaccurate images in the presence of complex structures and specialized geometries such as collimated beams. In this paper, we propose a multi-layer, ultrasonic, model-based iterative reconstruction algorithm designed for collimated beam systems. |
A. Alanazi; S. Venkatakrishnan; H. Santos-Villalobos; G. Buzzard; C. Bouman; |
322 | Learned Acoustic Reconstruction Using Synthetic Aperture Focusing Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Many algorithmic approaches to 3D acoustic imaging have been devised which rely on a large abundance of receiving elements to produce images with delay-and-sum techniques, but these have found little use in air due to hardware complexity and low accuracy. |
T. Straubinger; R. Xiao; H. Rhodin; |
323 | SDETR: Attention-Guided Salient Object Detection with Transformer Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a two-stage predict-refine SDETR model to leverage both benefits of transformer and CNN layers that can produce results with accurate saliency prediction and fine-grained local details. |
G. Liu; B. Xu; H. Huang; C. Lu; Y. Guo; |
324 | Evaluation of Video Coding for Machines Without Ground Truth Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Thus, current methods have to either evaluate their codecs on still images or on already compressed data. To mitigate this problem, we propose an evaluation method based on pseudo ground-truth data from the field of semantic segmentation to the evaluation of video coding for machines. |
K. Fischer; M. Hofbauer; C. Kuhn; E. Steinbach; A. Kaup; |
325 | Raw Plenoptic Video Coding Under Hexagonal Lattice Resolution of Motion Vectors Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: A study in this paper shows that motion vectors are highly concentrated at hexagonal lattice points, leading to use of the proposed resolution in the context of video compression. |
T. N. Huu; V. Duong Van; J. Yim; B. Jeon; |
326 | Comparison of Boundary Artifact Removal Methods in Coding of Generalized Cubemap Projection Using VVC Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we investigated the effect of different methods in Versatile Video Coding standard and other pre- and post-processing algorithms for removing boundary artifacts by introducing a new objective quality metric for systematic comparison. |
K. Jafari; A. Aminlou; M. M. Hannuksela; |
327 | Low-Complexity Multi-Model CNN In-Loop Filter for AVS3 Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, a low-complexity multi-model CNN in-loop filtering scheme is proposed for AVS3. |
S. Wang; Y. Fu; C. Zhu; L. Song; W. Zhang; |
328 | Unified Matrix Coding for NN Originated MIP in H.266/VVC Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper designs an efficient algorithm to determine the input vector of MIP, with which the range of the matrices can be minimized, and all matrices can be converted to integers with a unified shift and a unified offset. |
J. Huo; Y. Sun; H. Wang; S. Wan; F. Yang; M. Li; |
329 | FOV-Based Coding Optimization for 360-Degree Virtual Reality Videos Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose an FoV-based coding scheme for 360-degree videos, which allocates more bits to tiles of the predicted FoV area than other tiles. |
Y. Xu; T. Yang; Z. Tan; H. Lan; |
330 | Multi-Hierarchy Proxy Structure for Deep Metric Learning Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, these details are meaningful for capturing features of the class. Therefore, we propose a multi-hierarchy proxy (MHP) structure to extract the hierarchical details and regular features hidden in the embedding space. |
J. Wang; X. Li; W. Song; Z. Zhang; W. Guo; |
331 | Exploiting Caption Diversity for Unsupervised Video Summarization Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, a novel DPP-based regularizer is proposed that exploits a pretrained DNN-based image captioner in order to additionally enforce maximal key-frame diversity from the perspective of textual semantic content. |
M. Kaseris; I. Mademlis; I. Pitas; |
332 | Clustering and Separating Similarities for Deep Unsupervised Hashing Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: These fixed features are, however, neither designed originally for retrieval nor updated adaptively during training. In this paper, we propose a novel deep Unsupervised Cluster and Separate Hashing (UCSH) to address these issues. |
W. Zhang; D. Wu; C. Yang; B. Li; W. Wang; |
333 | Enhancing Prototypical Few-Shot Learning By Leveraging The Local-Level Strategy Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: To tackle the problem, this paper returns the perspective to the local-level feature and proposes a series of local-level strategies. |
J. Huang; F. Chen; K. Wang; L. Lin; D. Zhang; |
334 | Blind Unmixing Using A Double Deep Image Prior Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel network structure to solve the blind hyperspectral unmixing problem using a double Deep Image Prior (DIP). |
C. Zhou; M. R. D. Rodrigues; |
335 | A New Framework for Multiple Deep Correlation Filters Based Object Tracking Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: According to this framework, we design each component step by step. |
Y. Liu; Y. Liang; Q. Wu; L. Zhang; H. Wang; |
336 | Adaptive Actor-Critic Bilateral Filter Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, while most prior works analyze the adaptation of the range kernel in one-step manner, in this paper we take a more constructive view towards multi-step framework with the goal of unveiling the vulnerability of bilateral filtering. To this end, we adaptively model the width setting of range kernel as a multi-agent reinforcement learning problem and learn an adaptive actor-critic bilateral filter from local image context during successive bilateral filtering operations. |
B. -H. Chen; H. -Y. Cheng; J. -L. Yin; |
337 | Domain Decomposition Algorithms for Real-Time Homogeneous Diffusion Inpainting in 4K Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This prevents them from being applicable to time-critical scenarios such as real-time inpainting of 4K images. As a remedy, we adapt state-of-the-art numerical algorithms of domain decomposition type to this problem. |
N. Kämper; J. Weickert; |
338 | Deep Temporal Interpolation of Radar-Based Precipitation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study optical flow-based interpolation of globally available weather radar images from satellites. |
M. Tatsubori; et al. |
339 | A Nonlinear Steerable Complex Wavelet Decomposition of Images Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a two-dimensional nonlinear transform that uses only two subbands to achieve rotation invariance property, and enjoys a mirror reconstruction making it similar to a tight frame. |
Z. Sun; T. Blu; |
340 | Kernel Estimation Network for Blind Super-Resolution Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, these methods suffer a severe performance drop when the real degradations deviate from this assumption. To address this issue, this paper proposes a novel kernel estimation network (KENet) for kernel prediction. |
X. Cao; H. Shen; L. Zhang; Y. Luo; T. Wang; |
341 | Terahertz Image Restoration Benchmarking Dataset Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: The paper introduces a new terahertz (THz) image benchmarking dataset for THz imaging. |
Y. Zhang; Z. Su; F. Qi; J. Zhou; X. -P. Zhang; |
342 | Binary Dense Predictors for Human Pose Estimation Based on Dynamic Thresholds and Filtering Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose two approaches to conduct image-aware and pixel-aware dynamic binarization in a model for human pose estimation. |
X. Xing; et al. |
343 | Self-Supervised Learning for Sentiment Analysis Via Image-Text Matching Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: There is often a resemblance in the sentiment expressed in social media posts (text) and their accompanying images. In this paper, we leverage this sentiment congruence for self-supervised representation learning for sentiment analysis. |
H. Zhu; Z. Zheng; M. Soleymani; R. Nevatia; |
344 | Domain-Agnostic Meta-Learning for Cross-Domain Few-Shot Classification Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we tackle the challenging task of cross-domain few-shot classification and propose Domain-Agnostic Meta-Learning (DAML) algorithm. |
W. -Y. Lee; J. -Y. Wang; Y. -C. F. Wang; |
345 | Semantic Association Network for Video Corpus Moment Retrieval Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Extensive ablation studies and qualitative analyses show the effectiveness of the proposed model. |
D. Kim; S. Yoon; J. W. Hong; C. D. Yoo; |
346 | Statistical, Spectral and Graph Representations for Video-Based Facial Expression Recognition in Children Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we propose the first approach that (i) constructs video-level heterogeneous graph representation for facial expression recognition in children, and (ii) predicts children's facial expressions using the automatically detected Action Units (AUs). |
N. I. Abbasi; S. Song; H. Gunes; |
347 | Deriving Explainable Discriminative Attributes Using Confusion About Counterfactual Class Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel counterfactual explanation method, Discriminative Gradients (DiscGrad) that derives explainable discriminative attributes by considering not only the predicted class but also the counterfactual classes. |
N. Yang; T. Kang; K. Jung; |
348 | Realistic Monocular-To-3D Virtual Try-On Via Multi-Scale Characteristics Capture Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In prior methods, the fundamental problems lie in the limitations on texture retention during garment deformation and the lack of feature context capture during depth estimation. To address these problems, we propose a new 3D virtual try-on network via multi-scale characteristic capture (VTON-MC), which can produce an exact 3D model with the generated photo-realistic monocular image. |
C. Du; et al. |
349 | Optimizing Latent Space Directions for GAN-Based Local Image Editing Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: We thus present a novel objective function to evaluate the locality of an image edit. |
E. Pajouheshgar; T. Zhang; S. Süsstrunk; |
350 | Towards Using Clothes Style Transfer for Scenario-Aware Person Video Generation Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: To further improve the generation performance, we propose a novel framework with disentangled multi-branch encoders and a shared decoder. |
J. Xu; et al. |
351 | Multi-Domain Unsupervised Image-to-Image Translation with Appearance Adaptive Convolution Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Despite the impressive results, they mainly focus on the I2I translation between two domains, so the multi-domain I2I translation still remains a challenge. To address this problem, we propose a novel multi-domain unsupervised image-to-image translation (MDUIT) framework that leverages the decomposed content feature and appearance adaptive convolution to translate an image into a target appearance while preserving the given geometric content. |
S. Jeong; J. Lee; K. Sohn; |
352 | VR-FAM: Variance-Reduced Encoder with Nonlinear Transformation for Facial Attribute Manipulation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Existing works suffer from the entanglement of facial attributes, leading to unexpected artifacts and the loss of facial identity information after editing. To alleviate these issues, we propose a novel FAM framework based on StyleGAN, termed VR-FAM, which can meet the requirements of FAM: editing ability, distortion, and fidelity. |
Y. Yuan; S. Ma; J. Zhang; |
353 | Wavelet-Based Unsupervised Label-to-Image Translation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: State-of-the-art conditional Generative Adversarial Networks (GANs) need a huge amount of paired data to accomplish this task while generic un-paired image-to-image translation frameworks underperform in comparison, because they color-code semantic layouts and learn correspondences in appearance instead of semantic content. Starting from the assumption that a high quality generated image should be segmented back to its semantic layout, we propose a new Unsupervised paradigm for SIS (USIS) that makes use of a self-supervised segmentation loss and whole image wavelet based discrimination. |
G. Eskandar; M. Abdelsamad; K. Armanious; S. Zhang; B. Yang; |
354 | Fast Graph Sampling for Short Video Summarization Using Gershgorin Disc Alignment Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study the problem of efficiently summarizing a short video into several keyframes, leveraging recent progress in fast graph sampling. |
S. Sahami; G. Cheung; C. -W. Lin; |
355 | Towards Practical and Efficient Long Video Summary Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: In this paper, we find that the Kernel Temporal Segmentation (KTS) method designed for detecting the shot boundaries in SOTA VS methods is time-consuming while handling long videos. |
X. Ke; B. Chang; H. Wu; F. Xu; S. Zhong; |
356 | Cut And Continuous Paste Towards Real-Time Deep Fall Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a simple and efficient framework to detect falls through a single and small-sized convolutional neural network. |
S. Hwang; M. Ki; S. -H. Lee; S. Park; B. -K. Jeon; |
357 | ManNet: A Large-Scale Manipulated Image Detection Dataset And Baseline Evaluations Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, no large-scale dataset having manipulated images generated using both handcrafted and deep learning algorithms is available. Therefore, in this research, we have proposed a large dataset with more than 5.5 million images, termed the ManNet dataset. |
A. Singh; S. Chhabra; P. Majumdar; R. Singh; M. Vatsa; |
358 | Approaches Toward Physical and General Video Anomaly Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: We introduce the Physical Anomalous Trajectory or Motion (PHANTOM) dataset, which contains six different video classes. |
L. Kart; N. Cohen; |
359 | Considering User Agreement in Learning to Predict The Aesthetic Quality Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: In this paper, we thus propose (1) a re-adapted multi-task attention network to predict both the mean opinion score and the standard deviation in an end-to-end manner; (2) a brand-new confidence interval ranking loss that encourages the model to focus on image-pairs that are less certain about the difference of their aesthetic scores. |
S. Ling; A. Pastor; J. Wang; P. L. Callet; |
360 | No-Reference Quality Assessment of Variable Frame-Rate Videos Using Temporal Bandpass Statistics Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here, we propose a first-of-a-kind blind VQA model for evaluating HFR videos, which we dub the Framerate-Aware Videos Evaluator w/o Reference (FAVER). |
Q. Zheng; Z. Tu; Y. Fan; X. Zeng; A. C. Bovik; |
361 | Towards Joint Frame-Level and MOS Quality Predictions with Low-Complexity Objective Models Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Consequently, an original way to train the models, using jointly the subjective scores and the frame level scores of a full-reference metric, is proposed. |
J. Jung; A. Giraud; M. Song; S. Li; X. Li; S. Liu; |
362 | Teaching CNNs to Mimic Human Visual Cognitive Process & Regularise Texture-Shape Bias Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: We propose CognitiveCNN, a new intuitive architecture, inspired by feature integration theory in psychology, that utilises human-interpretable features such as shape, texture, and edges to reconstruct and classify the image. |
S. Mohla; A. Nasery; B. Banerjee; |
363 | Subjective And Objective Quality Assessment Of Mobile Gaming Video Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Abstract: Nowadays, with the vigorous expansion and development of gaming video streaming techniques and services, the expectation of users, especially the mobile phone users, for higher … |
S. Wen; S. Ling; J. Wang; X. Chen; Y. Jing; P. L. Callet; |
364 | ER-PIQA: A Task-Guided Pedestrian Image Quality Assessment Via Embedding Reconstruction Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, a novel task-guided method is proposed to measure pedestrian image quality based on embedding reconstruction without the involvement of subjective labels. |
Y. Zhong; H. Pan; B. Tang; Z. Liu; Y. Zhu; J. Yin; |
365 | Multiscale Crowd Counting and Localization By Multitask Point Supervision Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: We propose a multitask approach for crowd counting and person localization in a unified framework. |
M. Zand; H. Damirchi; A. Farley; M. Molahasani; M. Greenspan; A. Etemad; |
366 | Super-Resolution of Satellite Images By Two-Dimensional RRDB and Edge-Enhancement Generative Adversarial Network Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We use Kaggle and AID open experimental datasets to test and compare the results among different methods. |
Y. -Z. Chen; T. -J. Liu; K. -H. Liu; |
367 | Leveraging Local Temporal Information for Multimodal Scene Classification Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel self-attention block that leverages both local and global temporal relationships between the video frames to obtain better contextualized representations for the individual frames. |
S. Sahu; P. Goyal; |
368 | Predicting Human Motion Using Key Subsequences Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Usually, human motion tends to repeat itself and follows patterns that are well-represented by a few short key subsequences. Based on the above observations, we propose an attention-based feed-forward network, which is explicitly guided by the key subsequences, for human motion prediction. |
M. Li; M. Pei; W. Liang; |
369 | Dynamic Texture Recognition Using PDV Hashing and Dictionary Learning on Multi-Scale Volume Local Binary Pattern Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: STLBP often encounters the high-dimension problem as its dimension increases exponentially, so that STLBP could only utilize a small neighborhood. To tackle this problem, we propose a method for dynamic texture recognition using PDV hashing and dictionary learning on multi-scale volume local binary pattern (PHD-MVLBP). |
R. Ding; J. Ren; H. Yu; J. Li; |
370 | Do You Live A Healthy Life? Analyzing Lifestyle By Visual Life Logging Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we investigate the problem of lifestyle analysis and build a visual lifelogging dataset for lifestyle analysis (VLDLA). |
Q. Gao; M. Pei; H. Shen; |
371 | Weighted Wavelet-Based Spectral-Spatial Transforms For CFA-Sampled Raw Camera Image Compression Considering Image Features Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This study introduces weighted WSSTs (WWSSTs) that work especially well for CFA-sampled raw images with many edges. |
L. Huang; T. Suzuki; |
372 | JMPNet: Joint Motion Prediction for Learning-Based Video Compression Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, problems such as tail shadow and background distortion in the predicted frame remain unsolved. To tackle these problems, JMPNet is introduced in this paper to provide more accurate motion information by using both optical flow and dynamic local filter as well as an attention map to further fuse this motion information in a smarter way. |
D. Li; et al. |
373 | A Low-Parametric Model for Bit-Rate Estimation of VVC Residual Coding Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a set of four features together with a linear model, which is able to estimate the rate of arbitrary residual blocks which were compressed using the VVC standard. |
F. Brand; C. Herglotz; A. Kaup; |
374 | OPTE: Online Per-Title Encoding for Live Video Streaming Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper introduces an online per-title encoding scheme (OPTE) for live video streaming applications. |
V. V. Menon; H. Amirpour; M. Ghanbari; C. Timmerer; |
375 | SADN: Learned Light Field Image Compression with Spatial-Angular Decorrelation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel end-to-end spatial-angular-decorrelated network (SADN) for high-efficiency light field image compression. |
K. Tong; X. Jin; C. Wang; F. Jiang; |
376 | Hierarchical Feature Aggregation Network for Deep Image Compression Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Existing CNN-based methods for image compression extract features through serially connected high-to-low (encoder) or low-to-high (decoder) resolution stages, leading to insufficient utilization of hierarchical features. To solve this problem, we present a hierarchical feature aggregation network (HFAN) for generating more informative latent representations. |
W. Li; Z. Du; H. He; J. Tang; G. Wu; |
377 | Accurate Instance Segmentation Via Collaborative Learning Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose an instance segmentation model, named CoMask, that effectively alleviates the scale variation issue and addresses the precise localization. |
T. Chen; X. Hu; J. Xiao; G. Zhang; S. Wang; |
378 | Dynamic Binary Neural Network By Learning Channel-Wise Thresholds Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This process limits representation capacity of BNNs since different samples may adapt to unequal thresholds. To address this problem, we propose a dynamic BNN (DyBNN) incorporating dynamic learnable channel-wise thresholds of Sign function and shift parameters of PReLU. |
J. Zhang; Z. Su; Y. Feng; X. Lu; M. Pietikäinen; L. Liu; |
379 | Self-Supervised Learning on A Lightweight Low-Light Image Enhancement Model with Curve Refinement Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Another challenge for paired training networks is the limited generalization capacity caused by the sample bias. To overcome these two challenges, we propose a lightweight self-supervised low-light image enhancement method, that trains with low light images only. |
W. Wu; W. Wang; K. Jiang; X. Xu; R. Hu; |
380 | Semantically Proportional Patchmix for Few-Shot Learning Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Although excelling at distinguishing training data, these models are not well generalized to unseen data, probably due to insufficient feature representations on evaluation. To tackle this issue, we propose Semantically Proportional Patchmix (SePPMix), in which patches are cut and pasted among training images and the ground truth labels are mixed proportionally to the semantic information of the patches. |
J. Wang; J. Xu; Y. Pan; Z. Xu; |
381 | Noise Suppression for Improved Few-Shot Learning Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we identify that noise suppression is important to improve the performance of FSL algorithms. |
Z. Chen; T. Ji; S. Zhang; F. Zhong; |
382 | Online Continual Learning Using Enhanced Random Vector Functional Link Networks Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose an online continual learning algorithm based on an enhanced Random Vector Functional Link Network (OCL-eRVFL), that learns a sequence of tasks continually, where each task is defined by streaming data with each sample arriving once and only once. |
C. S. Yin Wong; G. Yang; A. Ambikapathi; R. Savitha; |
383 | A Generalized Kernel Risk Sensitive Loss for Robust Two-Dimensional Singular Value Decomposition Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, 2DSVD algorithm is based on the squared error loss, which may exaggerate the projection errors with the presence of outliers. To solve this problem, we propose a generalized kernel risk sensitive loss for measuring the projection error in 2DSVD, which automatically eliminates the outlier information during optimization. |
M. Zhang; Y. Gao; J. Zhou; |
384 | Video Frame Interpolation Via Local Lightweight Bidirectional Encoding with Channel Attention Cascade Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, a lightweight-driven video frame interpolation network (L2BEC2) is proposed. |
X. Ding; P. Huang; D. Zhang; X. Zhao; |
385 | SAIN: Similarity-Aware Video Frame Interpolation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Since moving objects usually have similarities in consecutive frames, we propose a similarity-aware video frame interpolation method (SAIN) that searches patches with similar texture in the embedding space from input frames to extract features and capture image details. |
Y. Lv; W. Yang; W. Zuo; Q. Liao; R. Zhu; |
386 | Self-Learned Video Super-Resolution with Augmented Spatial and Temporal Context Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address the issue and get rid of the synthetic paired data, in this paper, we make exploration in utilizing the internal self-similarity redundancy within the video to build a Self-Learned Video Super-Resolution (SLVSR) method, which only needs to be trained on the input testing video itself. |
Z. Fan; J. Liu; W. Yang; W. Xiang; Z. Guo; |
387 | Deformable Convolution Dense Network for Compressed Video Quality Enhancement Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, a Multi-frame Residual Dense Network (MRDN) with deformable convolution is developed to improve the quality of compressed video by utilizing high-quality frames to compensate for low-quality frames. |
J. Liu; M. Zhou; M. Xiao; |
388 | Convolutional ISTA Network with Temporal Consistency Constraints for Video Reconstruction from Event Cameras Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Current deep networks achieve high-quality video reconstruction from events, but most of them are large and difficult to interpret. In this work, we present a solution to this problem by systematically designing a deep network based on sparse representation. |
S. Liu; R. Alexandru; P. L. Dragotti; |
389 | PMP-NET: Rethinking Visual Context for Scene Graph Generation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we revisit the concept of incorporating visual context via a randomly ordered bidirectional Long Short-Term Memory (biLSTM) based baseline, and show that noisy estimation is worse than random. |
X. Tong; R. Wang; C. Wang; S. Zhang; X. Cao; |
390 | Improve Image Captioning Via Relation Modeling Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a novel approach that combines scene graphs with Transformer, which we call SGT, to explicitly encode available visual relationships between detected objects. |
F. Huang; Z. Li; |
391 | Equal Loss: A Simple Loss Function for Noise Robust Learning Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we show that DNN learning with Cross Entropy is not robust to label noise and exhibits imbalance between the gradient of clean and noisy samples. |
L. Cui; H. Peng; Y. Li; C. Li; X. Xing; |
392 | Informative Attention Supervision for Grounded Video Description Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Moreover, the prevailing attention loss functions enforce the GVDMs to focus equally on all sampled regions when the GVDMs generate words, which may make it difficult for the model to attend to informative regions and thus degrade the quality of the generated sentences. To alleviate the above problems, we propose an informative attention supervision method including a novel attention groundtruth sampling method and a group-based weak grounding supervision. |
B. Wan; W. Jiang; Y. Fang; |
393 | Spatial-Context-Aware Deep Neural Network for Multi-Class Image Classification Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Over the past few decades, solutions exploring relationships between semantic labels have made great progress. |
J. Zhang; Q. Zhang; J. Ren; Y. Zhao; J. Liu; |
394 | TranSTL: Spatial-Temporal Localization Transformer for Multi-Label Video Classification Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Generally, there exist many complex action labels in real-world videos, and these actions have inherent dependencies in both the spatial and temporal domains. Motivated by this observation, we propose TranSTL, a spatial-temporal localization Transformer framework for the MLVC task. |
H. Wu; M. Li; Y. Liu; H. Liu; C. Xu; X. Li; |
395 | Deep Video Inpainting Guided By Audio-Visual Self-Supervision Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Humans can easily imagine a scene from auditory information based on their prior knowledge of audio-visual events. In this paper, we mimic this innate human ability in deep learning models to improve the quality of video inpainting. |
K. Kim; J. Jung; W. J. Kim; S. -E. Yoon; |
396 | Navigating Audio-Visual Event Detection Across Mismatched Modalities Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We focus on AV parsing on fully unconstrained data where the audio and visual events do not necessarily co-present. |
G. Li; X. Xu; M. Wu; K. Yu; |
397 | Look, Listen and Pay More Attention: Fusing Multi-Modal Information for Video Violence Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Most existing works focus on single modal data analysis, which is not effective when multi-modality is available. Therefore, we propose a two-stage multi-modal information fusion method for violence detection: 1) the first stage adopts multiple instance learning strategies to refine video-level hard labels into clip-level soft labels, and 2) the next stage uses multi-modal information fused attention module to achieve fusion, and supervised learning is carried out using the soft labels generated at the first stage. |
D. -L. Wei; C. -G. Liu; Y. Liu; J. Liu; X. -G. Zhu; X. -H. Zeng; |
398 | Multi-Modal Learning with Text Merging for TEXTVQA Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a Multi-Modal Learning framework with Text Merging (MML&TM in short) for TextVQA, where we develop a text merging (TM) algorithm, which can effectively merge the word-level text obtained from the text recognition module to construct line-level and paragraph-level texts for enhancing semantic context, which is crucial to visual text understanding. |
C. Xu; Z. Xu; Y. He; S. Zhou; J. Guan; |
399 | A Novel Part Feature Integration and Fusion Method for Fine-Grained Vehicle Recognition Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel light-weight feature integration and fusion method to enhance the discriminative ability of deep convolutional features for the task of fine-grained vehicle recognition. |
P. Wang; Y. Cao; L. Lu; |
400 | Monocular Vehicle 3D Bounding Box Estimation Using Homograhy and Geometry in Traffic Scene Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel vehicle 3D bounding box estimation method making use of the 3D-2D geometry consistency and homography transformation. |
Y. Chen; F. Liu; K. Pei; |
401 | FSM: Feature Sampling Module for Object Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Towards enhancing the quality of the features, we propose a Feature Sampling Module (FSM), which learns multiple two-dimensional Gaussian distributions by the sampling network (SN) and applies those Gaussian masks to extract valid information of the features. |
X. Yi; B. Ma; J. Wu; |
402 | Rethinking Two-B-Real Net for Real-Time Salient Object Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Its backbone, which is borrowed from image classification tasks, may be inefficient for SOD due to the lack of task-specific design. To handle these problems, we propose a novel and efficient structure named short-range concatenate module (SRCM) by removing structure redundancy. |
S. Kuang; S. Meng; B. Xiao; L. Tang; B. Li; |
403 | Balanced Ranking and Sorting For Class Incremental Object Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose balanced ranking and sorting (BRS), to tackle the catastrophic forgetting and data imbalance problems for CIOD. |
B. Cui; H. Qu; X. Huang; S. Yu; |
404 | Multi-Scale Reinforcement Learning Strategy for Object Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a Multi-scale Reinforcement Learning Strategy (MRLS) for balanced multi-scale training. |
Y. Luo; X. Cao; J. Zhang; L. Pan; T. Wang; Q. Feng; |
405 | Deep Object Detection with Example Attribute Based Prediction Modulation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Deep object detectors suffer from the gradient contribution imbalance during training. In this paper, we point out that such imbalance can be ascribed to the imbalance in example attributes, e.g., difficulty and shape variation degree. |
Z. Wu; C. Liu; C. Huang; J. Wen; Y. Xu; |
406 | Universal Efficient Variable-Rate Neural Image Compression Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, computational complexity and rate flexibility are still two major challenges for its practical deployment. To tackle these problems, this paper proposes two universal modules named Energy-based Channel Gating (ECG) and Bit-rate Modulator (BM), which can be directly embedded into existing end-to-end image compression models. |
S. Yin; C. Li; Y. Bao; Y. Liang; F. Meng; W. Liu; |
407 | AdderIC: Towards Low Computation Cost Image Compression Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Although much progress has been made in learned image compression, the computation cost is still high. To address this problem, we propose AdderIC, which utilizes adder neural networks (AdderNet) to construct an image compression framework. |
B. Li; Y. Xin; C. Li; Y. Bao; F. Meng; Y. Liang; |
408 | DCNGAN: A Deformable Convolution-Based GAN with QP Adaptation for Perceptual Quality Enhancement of Compressed Video Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a deformable convolution-based generative adversarial network (DCNGAN) for perceptual quality enhancement of compressed videos. |
S. Zhang; L. Herranz; M. Mrak; M. G. Blanch; S. Wan; F. Yang; |
409 | Specialised Video Quality Model For Enhanced User Generated Content (UGC) With Special Effects Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose to conduct a benchmark on existing full-reference, non-reference, and aesthetic quality metrics for UGC with special effects. |
A. -F. Perrin; Y. Xie; T. Zhang; Y. Liao; J. Li; P. Le Callet; |
410 | Improving Maximum Likelihood Difference Scaling Method To Measure Inter Content Scale Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: The goal of most subjective studies is to place a set of stimuli on a perceptual scale. |
A. Pastor; L. Krasula; X. Zhu; Z. Li; P. Le Callet; |
411 | Texture Information Boosts Video Quality Assessment Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we deeply investigate three elements of HVS, including texture masking, content-dependency, and temporal-memory effects from an experimental perspective. |
A. -X. Zhang; Y. -G. Wang; |
412 | Plug-and-Play and Relay Regularizations on Noisy Low Rank Tensor Completion for Snapshot Multispectral Image Restoration Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: To improve the restoration performance, we introduce two regularizations in a Plug-and-Play (PnP) manner. |
K. Ozawa; |
413 | LERPS: Lighting Estimation and Relighting for Photometric Stereo Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a deep learning framework to perform three tasks jointly: (i) lighting estimation, (ii) image relighting, and (iii) surface normal estimation, all from a single input image of an object with non-Lambertian surface and general reflectance. |
A. Tiwari; S. Raman; |
414 | A Unified Two-Stage Model for Separating Superimposed Images Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a human vision-inspired framework for separating superimposed images. |
H. Duan; X. Min; W. Shen; G. Zhai; |
415 | Parameter-Free Style Projection for Arbitrary Image Style Transfer Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Existing feature transformation algorithms often suffer from loss of content or style details, non-natural stroke patterns, and unstable training. To mitigate these issues, this paper proposes a new feature-level style transformation technique, named Style Projection, for parameter-free, fast, and effective content-style transformation. |
S. Huang; et al. |
416 | Optimization of Compressive Light Field Display in Dual-Guided Learning Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Conventionally, excessive processing time limits its practical value in commercial use, along with severe degradation of display brightness. Therefore, in this paper, we propose a learning-based factorization framework to improve the visual results and expedite the layer decomposition and display adaptation. |
Y. Sun; Z. Li; L. Li; S. Wang; W. Gao; |
417 | ARM 4-BIT PQ: SIMD-Based Acceleration for Approximate Nearest Neighbor Search on ARM Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We then apply shuffle operations for each using the ARM-specific NEON instruction. By making this simple but critical modification, we achieve a dramatic speedup for the 4-bit PQ on an ARM architecture. |
Y. Matsui; Y. Imaizumi; N. Miyamoto; N. Yoshifuji; |
418 | Iterative Learning for Distorted Image Restoration Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study the influence of different learning schemes on fitting capability and tackle the problem by proposing a novel iterative learning scheme. |
C. Wang; et al. |
419 | JE2NET: Joint Exploitation and Exploration in Reinforcement Learning Based Image Restoration Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, we argue that these agents rely on pre-trained RL models with fixed-length paths for restoration, which performs poorly in the case of unknown distortions. To address these issues, we propose a joint exploitation and exploration reinforcement learning network (JE2Net). |
X. Zhang; W. Gao; H. Yuan; G. Li; |
420 | Multiple Patch-Aware Network for Faster Real-World Image Dehazing Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Besides, we propose a novel data enhancement method called Concentration Sampling Enhancement (CSE), which generates new training samples by haze concentration sampling based on hazy images and clear images. |
K. Yang; J. Zhang; X. Lang; |
421 | Learning to Fuse Heterogeneous Features for Low-Light Image Enhancement Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: To break down the limitation, we propose a new classification-driven enhancement method with heterogeneous feature fusion. |
Z. Tang; L. Ma; X. Shang; X. Fan; |
422 | Deep Scale-Aware Image Smoothing Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper proposes a deep-learning-based scale-aware image smoothing method, which is built on a downscaling-upscaling mechanism with attention. |
J. Li; K. Qin; R. Xu; H. Ji; |
423 | A Multiscale Gradient-Backpropagation Optimization Framework for Deformable Convolution Based Compressed Video Enhancement Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, a multiscale gradient-backpropagation optimization framework is proposed for the deformable convolution based compressed video quality enhancement. |
Y. Gao; M. Jia; S. Li; X. Cai; M. Ye; F. Dufaux; |
424 | Downstream Augmentation Generation For Contrastive Learning Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we aim at improving the augmentation process and propose an augmentation generator, a network that learns to augment images for contrastive learning. |
T. Hayase; S. Yasutomi; N. Inoue; |
425 | Few-Shot Learning with Improved Local Representations Via Bias Rectify Module Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a Deep Bias Rectify Network (DBRN) to fully exploit the spatial information that exists in the structure of the feature representations. |
C. Dong; Q. Ye; W. Meng; K. Yang; |
426 | Image-to-Video Re-Identification Via Mutual Discriminative Knowledge Transfer Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a mutual discriminative knowledge distillation framework to transfer a video-based richer representation to an image based representation more effectively. |
P. Wang; F. Wang; H. Li; |
427 | DynSNN: A Dynamic Approach to Reduce Redundancy in Spiking Neural Networks Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, inspired by the topology of neuronal co-activity in the neural system, we propose a dynamic pruning framework (dubbed DynSNN) for SNNs, enabling us to seamlessly optimize network topology on the fly almost without accuracy loss. |
F. Liu; W. Zhao; Y. Chen; Z. Wang; F. Dai; |
428 | MEJIGCLU: More Effective Jigsaw Clustering For Unsupervised Visual Representation Learning Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: To achieve results competitive with contrastive learning at low computational overhead, we propose a new unsupervised representation learning method in which jigsaw clustering and classification serve as pretext tasks that motivate the network to learn discriminative features. |
Y. Zhang; Q. Liu; Y. Zhao; Y. Liang; |
429 | Ganet: Unary Attention Reaches Pairwise Attention Via Implicit Group Clustering in Light-Weight CNNs Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: The two groups of attention, unary and pairwise attention, seem as incompatible as fire and water due to their completely different operations. In this paper, we propose a Group Attention (GA) block to bridge the gap between these two kinds of attention, leveraging only unary attention to reach the effect of pairwise attention at low cost, based on the implicit group clustering of light-weight CNNs. |
C. Zhuang; Y. Sun; |
430 | Find The Way Back: Invertible Kernel Estimator For Blind Image Super-Resolution Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We address the task of zero-shot blind image super-resolution, where it aims to recover the high-resolution details from the low-resolution input image under a challenging problem setting of having no external training data, no prior assumption on the downsampling kernel, and no pre-training components used for estimating the downsampling kernel. |
T. -W. Chang; W. -C. Chiu; C. -C. Huang; |
431 | Fine-Grained Dynamic Loss for Accurate Single-Image Super-Resolution Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Developing a new loss function provides a promising SISR solution, i.e., one that goes beyond the existing regression loss functions, which have trouble reconstructing image texture details. To that end, this paper proposes a dynamic fine-grained loss function. |
H. Wang; G. Zhang; Z. Lei; |
432 | Multi-Frame Super-Resolution With Raw Images Via Modified Deformable Convolution Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we propose a novel model towards multi-frame super-resolution, which leverages multiple RAW images and yields a super-resolved RGB image. |
G. Li; L. Qiu; H. Zhang; F. Xie; Z. Jiang; |
433 | Local-Global Feature Aggregation for Light Field Image Super-Resolution Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Besides, due to the limitations of CNNs, these methods can't fully model the global spatial properties of the whole LF images. In this paper, we propose a network with Local-Global Feature Aggregation (LF-LGFA) to handle these problems for LF image SR. |
Y. Wang; Y. Lu; S. Wang; W. Zhang; Z. Wang; |
434 | Pyramid Fusion Attention Network For Single Image Super-Resolution Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, these methods exclusively consider interdependencies among channels or spatials, leading to equal treatment of channel-wise or spatial-wise features thus hindering the power of AM. In this paper, we propose a pyramid fusion attention network (PFAN) to tackle this problem. |
H. He; Z. Du; W. Li; J. Tang; G. Wu; |
435 | VCD: View-Constraint Disentanglement for Action Recognition Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose the View-Constraint Disentanglement (VCD) framework for cross-view action recognition. |
X. Zhong; et al. |
436 | Privacy-Preserving Action Recognition Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Specifically, we propose to use unified actor score (UAS) to enhance the action recognition accuracy. |
C. Zou; D. Yuan; L. Lan; H. Chi; |
437 | Spatio-Temporal Motion Aggregation Network for Video Action Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose the Spatio-Temporal Motion Aggregation mechanism for integrating the local motion feature from a short term snippet and the longer spatio-temporal information to predict the action category. |
H. Zhang; X. Zhao; |
438 | TP-VIT: A Two-Pathway Vision Transformer for Video Action Recognition Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: How to use multiple pathways and multiple streams with Transformer for action recognition has not been studied. To address this issue, we present a novel structure namely Two-Pathway Vision Transformer (TP-ViT). |
Y. Jing; F. Wang; |
439 | Learning Task-Specific Representation for Video Anomaly Detection with Spatial-Temporal Attention Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a spatial-temporal attention mechanism to learn inter- and intra-correlations of video clips, and the boosted features are encouraged to be task-specific via the mutual cosine embedding loss. |
Y. Liu; J. Liu; X. Zhu; D. Wei; X. Huang; L. Song; |
440 | W-ART: Action Relation Transformer for Weakly-Supervised Temporal Action Localization Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose W-ART, a relation Transformer to explicitly capture the relationships between action segments. |
M. Li; H. Wu; Y. Liu; H. Liu; C. Xu; X. Li; |
441 | MS-ROCANet: Multi-Scale Residual Orthogonal-Channel Attention Network for Scene Text Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, a Multi-scale Residual Orthogonal-Channel Attention Network (MS-ROCANet) is proposed to improve the recall and accuracy of scene text detection. |
J. Liu; S. Wu; D. He; G. Xiao; |
442 | Bi-Directional Normalization and Color Attention-Guided Generative Adversarial Network for Image Enhancement Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper proposes a bi-directional normalization and color attention-guided generative adversarial network (BNCAGAN) for unsupervised image enhancement. |
S. Liu; G. Xiao; X. Xu; S. Wu; |
443 | Dual-Attention Network for Few-Shot Segmentation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address the issue, we propose a Dual-Attention Network (DANet) for few-shot segmentation. |
Z. Chen; H. Wang; S. Zhang; F. Zhong; |
444 | Attention Guided Invariance Selection for Local Feature Descriptors Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Besides, we propose a novel parallel self-attention module to get meta descriptors with the global receptive field, which guides the invariance selection more correctly. |
J. Li; G. Li; T. H. Li; |
445 | Attention Probe: Vision Transformer Distillation in The Wild Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose to effectively compress ViTs using the unlabeled data in the wild, consisting of two stages. |
J. Wang; M. Cao; S. Shi; B. Wu; Y. Yang; |
446 | Stacked Multi-Scale Attention Network for Image Colorization Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a stacked multi-scale attention network (SMSANet) for image colorization. |
B. Jiang; F. Xu; J. Xia; C. Yang; W. Huang; Y. Huang; |
447 | CRPN: Distinguish Novel Categories Via Class-Relevant Region Proposal Network for Few-Shot Object Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a Class-relevant Region Proposal Network (CRPN). |
H. Wang; Y. Li; S. Wang; |
448 | An Efficient Framework for Detection and Recognition of Numerical Traffic Signs Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this letter, we fully explore the relationship between different traffic signs with digital characters and transform the category objects into multi-level classes to alleviate the uneven distribution of samples. |
Z. Li; M. Chen; Y. He; L. Xie; H. Su; |
449 | Divergence-Guided Feature Alignment for Cross-Domain Object Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: To remedy the defects, in this paper, we propose a novel divergence-guided feature alignment method for cross-domain object detection. |
Z. Li; R. Togo; T. Ogawa; M. Haseyama; |
450 | PGTRNET: Two-Phase Weakly Supervised Object Detection with Pseudo Ground Truth Refinement Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Besides, we propose a novel online PGT refinement approach to steadily improve the quality of PGT by fully taking advantage of the power of FSD during the second-phase training, decoupling the first and second-phase models. |
J. Wang; H. Zhou; X. Yu; |
451 | Novel Instance Mining with Pseudo-Margin Evaluation for Few-Shot Object Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Thus, a new instance mining model is proposed in this paper to excavate the novel samples from the base set. |
W. Liu; C. Wang; S. Yu; C. Tao; J. Wang; J. Wu; |
452 | BiP-Net: Bidirectional Perspective Strategy Based Arbitrary-Shaped Text Detection Network Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, to detect arbitrary-shaped text instances with high detection accuracy and speed simultaneously, we propose a Bidirectional Perspective strategy based Network (BiP-Net). |
C. Yang; M. Chen; Y. Yuan; Q. Wang; |
453 | A Novel Lightweight Network for Fast Monocular Depth Estimation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a lightweight network which leverages the advantages of dimension-wise convolutions and depthwise separable convolutions to reduce complexity in the architecture. |
T. Heydrich; Y. Yang; X. Ma; Y. Liu; S. Du; |
454 | A Lightweight Self-Supervised Training Framework for Monocular Depth Estimation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a lightweight self-supervised training framework which utilizes computationally cheap methods to compute ground truth approximations. |
T. Heydrich; Y. Yang; S. Du; |
455 | PU-Refiner: A Geometry Refiner with Adversarial Learning for Point Cloud Upsampling Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present PU-Refiner, a generative adversarial network for point cloud upsampling. |
H. Liu; H. Yuan; R. Hamzaoui; W. Gao; S. Li; |
456 | CF-Net: Complementary Fusion Network for Rotation Invariant Point Cloud Completion Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our CF-Net can achieve competitive results both geometrically and semantically as demonstrated in this paper. |
B. -F. Chen; Y. -M. Yeh; Y. -C. Lu; |
457 | TH-Net: A Method Of Single 3d Object Tracking Based On Transformers And Hausdorff Distance Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a new 3D object tracking method called Transformer-Hausdorff Net (TH-Net). |
Z. Zhang; N. Sang; X. Wang; |
458 | Enrich Features for Few-Shot Point Cloud Classification Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, these methods require a lot of labeled data as support, which is challenging to obtain. To alleviate this problem, we propose a novel few-shot point cloud classification method to classify new categories given a few labeled samples. |
H. Feng; W. Liu; Y. Wang; B. Liu; |
459 | Semi-Supervised 360° Depth Estimation from Multiple Fisheye Cameras with Pixel-Level Selective Loss Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study a practical omnidirectional depth estimation with neural networks that enables effective learning on real world data obtained using wide-baseline multiple fish-eye cameras. |
J. Lee; D. Park; D. Lee; D. Ji; |
460 | Underwater Stereo Matching Via Unsupervised Appearance And Feature Adaptation Networks Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In addition, the domain gap also leads to the failure of directly applying existing models of terrestrial scenes to underwater scenes. Therefore, this paper proposes a novel underwater depth estimation network which can infer depth maps from real underwater stereo images in an unsupervised adaptation manner. |
W. Zhong; Y. Yuan; X. Ye; D. Zheng; R. Xu; |
461 | Domain Adaptation Via Mutual Information Maximization for Handwriting Recognition Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: To improve the model's generalization ability for sequence modeling task, this paper proposes to use domain adaptation with statistical distribution alignment and entropy regularization. |
P. Tang; et al. |
462 | Attribute-Conditioned Face Swapping Network for Low-Resolution Images Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel Attribute-Conditioned Face Swapping Network (AFSNet) to preserve attributes and handle low resolution images. |
A. Li; J. Hu; C. Fu; X. Zhang; J. Zhou; |
463 | Learning Multiple Explainable and Generalizable Cues for Face Anti-Spoofing Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, many other generalizable cues are unexplored for face anti-spoofing, which limits their performance under cross-dataset testing. To this end, we propose a novel framework to learn multiple explainable and generalizable cues (MEGC) for face anti-spoofing. |
Y. Bian; P. Zhang; J. Wang; C. Wang; S. Pu; |
464 | Off-The-Grid Covariance-Based Super-Resolution Fluctuation Microscopy Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we present a gridless problem accounting for the independence of fluctuations. |
B. Laville; L. Blanc-Féraud; G. Aubert; |
465 | Simultaneous Nonlocal Low-Rank And Deep Priors For Poisson Denoising Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel approach using simultaneous nonlocal low-rank and deep priors (SNLDP) for Poisson denoising. |
Z. Zha; B. Wen; X. Yuan; J. Zhou; C. Zhu; |
466 | Double Closed-Loop Network for Image Deblurring Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, a deep learning network with double closed-loop structure is introduced to tackle the image deblurring problem. |
Y. Liu; Y. Zhang; Q. Li; J. Kong; M. Qi; J. Wang; |
467 | Single Image De-Raining with High-Low Frequency Guidance Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a new High-Low-Frequency Guided De-raining (HLFGD) method to remove the rain streaks clearly while reserve the image details. |
Y. Zhang; Y. Xiang; L. Cai; Y. Fu; W. Huo; J. Xia; |
468 | Detail Generation and Fusion Networks for Image Inpainting Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel detail generation and fusion network (DGFNet) to strengthen the generation of texture details for image inpainting, which includes a dual-stream texture generation network and a multi-scale difference perception fusion network. |
W. Yang; W. Shi; |
469 | Adaptive Weighted Network With Edge Enhancement Module For Monocular Self-Supervised Depth Estimation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Besides, factors such as occlusion and texture sparsity can lead to the failure of the photometric consistency, affecting the prediction performance. To overcome these deficiencies, an adaptive weighted monocular self-supervised depth estimation framework that exploits enhanced edge information and texture sparsity based adaptive weights is proposed. |
H. Liu; Y. Zhu; G. Hua; W. Huang; R. Ding; |
470 | Pas-Mef: Multi-Exposure Image Fusion Based On Principal Component Analysis, Adaptive Well-Exposedness And Saliency Map Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: To minimize the information loss and produce high quality HDR-like images for LDR screens, this study proposes an efficient multi-exposure fusion (MEF) approach with a simple yet effective weight extraction method relying on principal component analysis, adaptive well-exposedness and saliency maps. |
D. Karakaya; O. Ulucan; M. Turkan; |
471 | PDD-Net: A Precise Defect Detection Network Based on Point Set Representation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: iii) Extreme imbalance problem between defects and background classes during training. To address these issues, we propose a novel anchor-free defect detection network named PDD-Net. |
M. Ban; R. Ding; J. Zhang; T. Guo; T. Wang; |
472 | Solving The Long-Tailed Problem Via Intra- And Inter-Category Balance Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel gradient harmonized mechanism with category-wise adaptive precision to decouple the difficulty and sample size imbalance in the long-tailed problem, which are correspondingly solved via intra- and inter-category balance strategies. |
R. Zhang; T. Lin; R. Zhang; Y. Xu; |
473 | Extracting and Distilling Direction-Adaptive Knowledge for Lightweight Object Detection in Remote Sensing Images Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Recently, some lightweight convolutional neural network (CNN) models have been proposed for airborne or spaceborne remote sensing object detection (RSOD) tasks. |
Z. Huang; W. Li; R. Tao; |
474 | Pseudo-Interacting Guided Network for Few-Shot Segmentation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a novel network that combines a universal cross-guided branch with a new pseudo-interacting guided branch. |
X. Luo; J. Luo; Z. Duan; J. Tan; T. Zhang; |
475 | Few-Shot Generation By Modeling Stereoscopic Priors Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a few-shot generative network which leverages 3D priors to improve the diversity and quality of generated images. |
Y. Wang; Q. Wang; D. Zhang; |
476 | Relative Viewpoint Estimation Based on Structured 3d Representation Alignment Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a relative viewpoint estimation method using an end-to-end trainable network that learns structured 3D representations. |
K. Matsuzaki; K. Kawamura; |
477 | Deep Markov Clustering for Panoptic Segmentation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we adopt a box-free strategy and incorporate a graph-based clustering method to merge repetitive kernel weights for object instances. |
M. Ye; Y. Zhang; S. Zhu; A. Xie; D. Zhang; |
478 | Multi-Task Learning Improves The Brain Stoke Lesion Segmentation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel multi-task learning framework to achieve enhanced segmentation of stroke lesions. |
L. Liu; C. Huang; C. Cai; X. Zhang; Q. Hu; |
479 | Mixed Transformer U-Net for Medical Image Segmentation Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: Furthermore, SA can only model self-affinities within a single sample, ignoring the potential correlations of the overall dataset. To address these problems, we propose a novel Transformer module named Mixed Transformer Module (MTM) for simultaneous inter- and intra- affinities learning. |
H. Wang; et al. |
480 | Contrastive Translation Learning For Medical Image Segmentation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: This work proposes an advantageous domain translation mechanism to improve the perceptual ability of the network for accurate unlabeled target data segmentation. |
W. Zeng; W. Fan; D. Shen; Y. Chen; X. Luo; |
481 | Fast Video Object Segmentation Via Dynamic YOLACT Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: VOS can be considered an extension of semantic segmentation from a static image to a dynamic image sequence. Following this idea, we propose a fast VOS framework based on YOLACT, a real-time static image segmentation framework. |
T. Meng; W. Zhang; |
482 | Depth Removal Distillation for RGB-D Semantic Segmentation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Therefore, it is extremely challenging to take full advantage of RGB-D semantic segmentation methods for segmenting RGB images without the depth input. To address this challenge, a general depth removal distillation method is proposed to remove depth dependence from RGB-D semantic segmentation model by knowledge distillation, which can be employed to any CNN-based segmentation network structure. |
T. Fang; Z. Liang; X. Shao; Z. Dong; J. Li; |
483 | Mask-Based Attention Parallel Network for In-the-Wild Facial Expression Recognition Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: But most previous attention-based methods are inadequate in locating crucial expression-related regions precisely and capturing useful facial expression features comprehensively. For these reasons, we present a novel mask-based attention parallel network (MAPNet). |
L. Ju; X. Zhao; |
484 | SDNET: Lightweight Facial Expression Recognition For Sample Disequilibrium Literature Review Related Patents Related Grants Related Orgs Related Experts Details Abstract: Facial expression recognition (FER) based on the convolutional neural network (CNN) in the wild have numerous challenges. For instance, the complexity of the network model makes … |
L. Zhou; S. Li; Y. Wang; J. Liu; |
485 | A Novel Micro-Expression Recognition Approach Using Attention-Based Magnification-Adaptive Networks Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, the single fixed magnification strategy, widely used in existing works of MER, is not appropriate for different subjects, because each subject has specific expression intensity corresponding to different MEs. To cope with this issue, we propose a novel Attention-based Magnification-Adaptive Network (AMAN) to learn adaptive magnification levels for the ME representation. |
M. Wei; W. Zheng; Y. Zong; X. Jiang; C. Lu; J. Liu; |
486 | Lipreading Model Based On Whole-Part Collaborative Learning Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we focus on the full use of spatial information in lipreading tasks. |
W. Tian; H. Zhang; C. Peng; Z. -Q. Zhao; |
487 | What Is The Patient Looking At? Robust Gaze-Scene Intersection Under Free-Viewing Conditions Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We demonstrate the utility of the proposed algorithm in regressing the PoR from scenes captured in the Intensive Care Unit (ICU) at Chelsea & Westminster Hospital NHS Foundation Trust. |
A. Al-Hindawi; M. P. Vizcaychipi; Y. Demiris; |
488 | GAZEATTENTIONNET: Gaze Estimation with Attentions Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel structure named GazeAttentionNet. |
H. Huang; L. Ren; Z. Yang; Y. Zhan; Q. Zhang; J. Lv; |
489 | Low-Light Image Enhancement Via Feature Restoration Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Most existing Retinex-based methods deal with the noise and color distortion via some careful designs to denoising and/or color correction. In this paper, we propose a simple yet effective network from the perspective of feature map restoration to mitigate such issues without constructing any explicit modules. |
Y. Yang; Y. Zhang; X. Guo; |
490 | HIRL: Hybrid Image Restoration Based on Hierarchical Deep Reinforcement Learning Via Two-Step Analysis Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, each tool adopted inevitably introduces additional noise and will affect the subsequent recovery results. To address this issue, in this paper, we propose a hierarchical deep reinforcement learning framework (HIRL), which balance both benefits and noises brought by each tool and select the appropriate type and degree tools. |
X. Zhang; W. Gao; |
491 | High-Fidelity Portrait Editing Via Exploring Differentiable Guided Sketches from The Latent Space Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Nonetheless, passing sketch information to the generating model directly is nontrivial. To this end, we present an algorithm that addresses the problem of well controlling the generation process via differentiable guided sketches from latent space. |
C. Wang; C. Cao; Y. Fu; X. Xue; |
492 | Learning Adjustable Image Rescaling with Joint Optimization of Perception and Distortion Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Based on the invertible rescaling net (IRN) which learns image downscaling and upscaling together, we propose a joint optimization method to train just one model that could achieve adjustable trade-off between perception and distortion for upscaling at inference time. |
Z. Pan; |
493 | FSOINET: Feature-Space Optimization-Inspired Network For Image Compressive Sensing Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: In this paper, we propose the idea of achieving information flow phase by phase in feature space and design a Feature-Space Optimization-Inspired Network (dubbed FSOINet) to implement it by mapping both steps of proximal gradient descent algorithm from pixel space to feature space. |
W. Chen; C. Yang; X. Yang; |
494 | Disentangled Feature-Guided Multi-Exposure High Dynamic Range Imaging Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a disentangled feature-guided HDR network (DFGNet) to alleviate the above-stated problems. |
K. Lee; Y. I. Jang; N. I. Cho; |
495 | Defending Against Universal Attack Via Curvature-Aware Category Adversarial Training Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a curvature-aware category adversarial training method to avoid excessive perturbations. |
P. Du; X. Zheng; L. Liu; H. Ma; |
496 | SP Attack: Single-Perspective Attack for Generating Adversarial Omnidirectional Images Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: The safety of Deep Neural Networks (DNNs) processing omnidirectional images (ODIs) is an under-researched topic. In this paper, we propose a novel sparse attack, named Single-Perspective (SP) Attack, towards fooling these models by perturbing only one perspective image (PI) rendered from the target ODI. |
Y. Zhang; Y. Liu; J. Liu; P. Zhan; L. Wang; Z. Xu; |
497 | Few-Shot One-Class Domain Adaptation Based On Frequency For Iris Presentation Attack Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: We thus define a new domain adaptation setting called Few-shot One-class Domain Adaptation (FODA), where adaptation only relies on a limited number of target bonafide samples. To address this problem, we propose a novel FODA framework based on the expressive power of frequency information. |
Y. Li; Y. Lian; J. Wang; Y. Chen; C. Wang; S. Pu; |
498 | Pixinwav: Residual Steganography for Hiding Pixels in Audio Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: While previous works focused on unimodal setups (e.g., hiding images in images, or hiding audio in audio), PixInWav targets the multimodal case of hiding images in audio. To this end, we propose a novel residual architecture operating on top of short-time discrete cosine transform (STDCT) audio spectrograms. |
M. Geleta; C. Puntí; K. McGuinness; J. Pons; C. Canton; X. Giro-i-Nieto; |
499 | A Semi-Handcrafted Keypoint Detector with Discriminative Feature Encoding Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: And yet, the intrinsic relationships of key-points have not been explored actively, which may lead to the ambiguity of feature codes for further analysis. To tackle this problem, in this paper, we introduce a novel semi-handcrafted keypoint detector through a scheme of discriminative feature representations (SDFR). |
Y. Xie; L. Guan; |
500 | Safari from Visual Signals: Recovering Volumetric 3d Shapes Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we propose a convex approach for recovering a detailed 3D volumetric geometry of several objects from visual signals. |
A. Agudo; |
501 | Coupled Feature Learning Via Structured Convolutional Sparse Coding for Multimodal Image Fusion Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: A novel method for learning correlated features in multimodal images based on convolutional sparse coding with applications to image fusion is presented. |
F. G. Veshki; S. A. Vorobyov; |
502 | DOMAINDESC: Learning Local Descriptors With Domain Adaptation Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel descriptor DomainDesc which is invariant as much as possible by learning local Descriptor with Domain adaptation. |
R. Xu; et al. |
503 | Multi-Head Relu Implicit Neural Representation Networks Literature Review Related Patents Related Grants Related Orgs Related Experts Related Code Details Highlight: In this paper, a novel multi-head multi-layer perceptron (MLP) structure is presented for implicit neural representation (INR). |
A. Aftab; A. Morsali; S. Ghaemmaghami; |
504 | An Efficient Method for Model Pruning Using Knowledge Distillation with Few Samples Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present Progressive Feature Distribution Distillation (PFDD) without modifying network structures, which surpasses FSKD. |
Z. Zhou; Y. Zhou; Z. Jiang; A. Men; H. Wang; |
505 | Adaptive Intra-Group Aggregation for Co-Saliency Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, the feature aggregation between group feature representation and individual feature representation is still a challenging issue. In this work, we propose a novel adaptive intra-group aggregation (AIGA) method, which provides a new perspective to investigate the interaction relationship between group and single-image features and aggregate these features in an adaptive way. |
G. Ren; T. Dai; T. Stathaki; |
506 | Novel Class Discovery: A Dependency Approach Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we look at the problem where the model is required to discover novel classes never encountered in the labeled set. |
T. Mukherjee; N. Deligiannis; |
507 | Single-Shot Balanced Detector for Geospatial Object Detection Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, to achieve excellent speed/accuracy trade-off for geospatial object detection, a single-shot balanced detector is presented. |
Y. Liu; Q. Li; Y. Yuan; Q. Wang; |
508 | Regularized Latent Space Exploration for Discriminative Face Super-Resolution Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a regularized latent space exploration approach to facilitate self-supervised face super-resolution. |
R. Shi; J. Zhang; Y. Li; S. Ge; |
509 | Enhancing and Dissecting Crowd Counting By Synthetic Data Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this article, we propose a simulated crowd counting dataset CrowdX, which has a large scale, accurate labeling, parameterized realization, and high fidelity. |
Y. Hou; et al. |
510 | Multi-Pose Virtual Try-On Via Self-Adaptive Feature Filtering Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Prior methods lack an effective geometric deformation to maintain the original image details resulting in many details loss in the head and garment. To address this problem, we propose a new multi-pose virtual try-on network, which can fit a garment to the corresponding area of a person in arbitrary poses. |
C. Du; F. Yu; M. Jiang; X. Wei; T. Peng; X. Hu; |
511 | Histogram-Guided Semantic-Aware Colorization Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel histogram-guided semantic-aware colorization method, which explicitly builds the correspondences between global colors and local features with an attention mechanism and uses a differentiable histogram loss to impose the histogram of the results. |
J. Zhang; Y. Xiao; G. Chen; Q. Sun; F. Xu; C. -S. Leung; |
512 | Content Preserving Scale Space Network for Fast Image Restoration from Noisy-Blurry Pairs Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a fast method to estimate a latent image given a pair of noisy-blurry images. |
G. R. K S; N. Krishnan; B. H. Pawan Prasad; S. Lomte; |
513 | Flow-Based Point Cloud Completion Network with Adversarial Refinement Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a coarse-to-fine approach to complete the partial point cloud with two stages: 1) Flow-based Completion Network, a principled probabilistic model that built on continuous normalizing flow to generate coarse completions conditioned on partial inputs. |
R. Bao; Y. Ren; G. Li; W. Gao; S. Liu; |
514 | Weakly Supervised Point Cloud Upsampling VIA Optimal Transport Literature Review Related Patents Related Grants Related Orgs Related Experts Details Highlight: Existing learning-based methods usually train a point cloud upsampling model with synthesized, paired sparse-dense point clouds. |
Z. Li; W. Wang; N. Lei; R. Wang; |
515 | Point Cloud Denoising Using Normal Vector-Based Graph Wavelet Shrinkage Literature Review Related Patents |