Paper Digest: Recent Papers on AI for Music
Paper Digest Team extracted all recent AI for Music related papers on our radar and generated highlight sentences for them. The results are sorted by relevance and date. In addition to this ‘static’ page, we also provide a real-time version of this article, which has broader coverage and is updated continuously with the most recent work on this topic.
This list is created by the Paper Digest Team. Experience the cutting-edge capabilities of Paper Digest, an innovative AI-powered research platform that empowers you to read, write, get answers and review.
Try us today and unlock the full potential of our services for free!
TABLE 1: Paper Digest: Recent Papers on AI for Music
# | Paper | Author(s) | Source | Date
---|---|---|---|---
1 | Music and Art: A Study in Cross-modal Interpretation. Highlight: We propose guidelines for using music to enhance the experience of viewing art, and we propose directions for future research. | Paul Warren; Paul Mulholland; Naomi Barker; | arxiv-cs.HC | 2025-01-09
2 | Music Tagging with Classifier Group Chains. Highlight: We propose music tagging with classifier chains that model the interplay of music tags. | Takuya Hasumi; Tatsuya Komatsu; Yusuke Fujita; | arxiv-cs.SD | 2025-01-09
3 | Evaluating Interval-based Tokenization for Pitch Representation in Symbolic Music Analysis. Highlight: In this work, we introduce a general framework for building interval-based tokenizations. | Dinh-Viet-Toan Le; Louis Bigo; Mikaela Keller; | arxiv-cs.IR | 2025-01-08
4 | Multi-label Cross-lingual Automatic Music Genre Classification from Lyrics with Sentence BERT. Highlight: We present a multi-label, cross-lingual genre classification system based on multilingual sentence embeddings generated by sBERT. | Tiago Fernandes Tavares; Fabio José Ayres; | arxiv-cs.IR | 2025-01-07
5 | MAJL: A Model-Agnostic Joint Learning Framework for Music Source Separation and Pitch Estimation. Highlight: However, these methods still face two critical challenges that limit the improvement of both tasks: the lack of labeled data and joint learning optimization. To address these challenges, we propose a Model-Agnostic Joint Learning (MAJL) framework for both tasks. | Haojie Wei; Jun Yuan; Rui Zhang; Quanyu Dai; Yueguo Chen; | arxiv-cs.SD | 2025-01-07
6 | SYKI-SVC: Advancing Singing Voice Conversion with Post-Processing Innovations and An Open-Source Professional Testset. Highlight: In this paper, we propose a high-fidelity singing voice conversion system. | YIQUAN ZHOU et. al. | arxiv-cs.SD | 2025-01-06
7 | A System for Melodic Harmonization Using Schoenberg Regions, Giant Steps, and Church Modes. Highlight: In this paper, I describe Harmonizer, a prototype system for melodic harmonization. | Frederick Fernandes; | arxiv-cs.SD | 2025-01-05
8 | Can Impressions of Music Be Extracted from Thumbnail Images? Highlight: This type of information is underrepresented in existing music caption datasets due to the challenges associated with extracting it directly from music data. To address this issue, we propose a method for generating music caption data that incorporates non-musical aspects inferred from music thumbnail images, and validated the effectiveness of our approach through human evaluations. | Takashi Harada; Takehiro Motomitsu; Katsuhiko Hayashi; Yusuke Sakai; Hidetaka Kamigaito; | arxiv-cs.CL | 2025-01-05
9 | MusicGen-Stem: Multi-stem Music Generation and Edition Through Autoregressive Modeling. Highlight: To do so, we train one specialized compression algorithm per stem to tokenize the music into parallel streams of tokens. | Simon Rouard; Robin San Roman; Yossi Adi; Axel Roebel; | arxiv-cs.SD | 2025-01-03
10 | On The Robustness of Cover Version Identification Models: A Study Using Cover Versions from YouTube. Highlight: In this paper, we annotate a subset of songs from YouTube sampled by a multi-modal uncertainty sampling approach and evaluate state-of-the-art models. | Simon Hachmeier; Robert Jäschke; | arxiv-cs.MM | 2025-01-02
11 | MMVA: Multimodal Matching Based on Valence and Arousal Across Images, Music, and Musical Captions. Highlight: We introduce Multimodal Matching based on Valence and Arousal (MMVA), a tri-modal encoder framework designed to capture emotional content across images, music, and musical captions. | Suhwan Choi; Kyu Won Kim; Myungjoo Kang; | arxiv-cs.SD | 2025-01-02
12 | MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization. Highlight: In this paper, we propose a self-supervised music representation learning model for music understanding. | HAINA ZHU et. al. | arxiv-cs.SD | 2025-01-02
13 | Unrolled Creative Adversarial Network For Generating Novel Musical Pieces. Highlight: In this paper, a classical system was employed alongside a new system to generate creative music. | Pratik Nag; | arxiv-cs.SD | 2024-12-31
14 | Text2midi: Generating Symbolic Music from Captions. Highlight: This paper introduces text2midi, an end-to-end model to generate MIDI files from textual descriptions. | KESHAV BHANDARI et. al. | arxiv-cs.SD | 2024-12-21
15 | Music Genre Classification: Ensemble Learning with Subcomponents-level Attention. Highlight: The letter introduces a novel approach by combining ensemble learning with attention to sub-components, aiming to enhance the accuracy of identifying music genres. | Yichen Liu; Abhijit Dasgupta; Qiwei He; | arxiv-cs.SD | 2024-12-20
16 | Tuning Music Education: AI-Powered Personalization in Learning Music. Highlight: In the second case study we prototype adaptive piano method books that use Automatic Music Transcription to generate exercises at different skill levels while retaining a close connection to musical interests. | Mayank Sanganeria; Rohan Gala; | arxiv-cs.SD | 2024-12-18
17 | Detecting Machine-Generated Music with Explainability — A Challenge and Early Benchmarks. Highlight: By providing a comprehensive comparison of benchmark results and their interpretability, we propose several directions to inspire future research to develop more robust and effective detection methods for MGM. | Yupei Li; Qiyang Sun; Hanqian Li; Lucia Specia; Björn W. Schuller; | arxiv-cs.SD | 2024-12-17
18 | Leveraging User-Generated Metadata of Online Videos for Cover Song Identification. Highlight: In this paper, we propose a multi-modal approach for cover song identification on online video platforms. | Simon Hachmeier; Robert Jäschke; | arxiv-cs.MM | 2024-12-16
19 | A Benchmark and Robustness Study of In-Context-Learning with Large Language Models in Music Entity Detection. Highlight: In this paper, we provide a novel dataset of user-generated metadata and conduct a benchmark and a robustness study using recent LLMs with in-context-learning (ICL). | Simon Hachmeier; Robert Jäschke; | arxiv-cs.CL | 2024-12-16
20 | Interpreting Graphic Notation with MusicLDM: An AI Improvisation of Cornelius Cardew’s Treatise. Highlight: This work presents a novel method for composing and improvising music inspired by Cornelius Cardew’s Treatise, using AI to bridge graphic notation and musical expression. | Tornike Karchkhadze; Keren Shao; Shlomo Dubnov; | arxiv-cs.SD | 2024-12-12
21 | Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation. Highlight: We introduce a novel method named Visuals Music Bridge (VMB). | BAISEN WANG et. al. | arxiv-cs.CV | 2024-12-12
22 | Frechet Music Distance: A Metric For Generative Symbolic Music Evaluation. Highlight: In this paper we introduce the Frechet Music Distance (FMD), a novel evaluation metric for generative symbolic music models, inspired by the Frechet Inception Distance (FID) in computer vision and Frechet Audio Distance (FAD) in generative audio. | Jan Retkowski; Jakub Stępniak; Mateusz Modrzejewski; | arxiv-cs.SD | 2024-12-10
23 | Source Separation & Automatic Transcription for Music. Highlight: Using spectrogram masking, deep neural networks, and the MuseScore API, we attempt to create an end-to-end pipeline that allows for an initial music audio mixture (e.g.. | Bradford Derby; Lucas Dunker; Samarth Galchar; Shashank Jarmale; Akash Setti; | arxiv-cs.SD | 2024-12-09
24 | AI TrackMate: Finally, Someone Who Will Give Your Music More Than Just Sounds Great! Highlight: The rise of bedroom producers has democratized music creation, while challenging producers to objectively evaluate their work. To address this, we present AI TrackMate, an LLM-based music chatbot designed to provide constructive feedback on music productions. | Yi-Lin Jiang; Chia-Ho Hsiung; Yen-Tung Yeh; Lu-Rong Chen; Bo-Yu Chen; | arxiv-cs.SD | 2024-12-09
25 | Advancing Music Therapy: Integrating Eastern Five-Element Music Theory and Western Techniques with AI in The Novel Five-Element Harmony System. Highlight: In this article, we developed a music therapy system for the first time by applying the theory of the five elements in music therapy to practice. | Yubo Zhou; Weizhen Bian; Kaitai Zhang; Xiaohan Gu; | arxiv-cs.HC | 2024-12-09
26 | MuMu-LLaMA: Multi-modal Music Understanding and Generation Via Large Language Models. Highlight: To address this, we introduce a dataset with 167.69 hours of multi-modal data, including text, images, videos, and music annotations. Based on this dataset, we propose MuMu-LLaMA, a model that leverages pre-trained encoders for music, images, and videos. | Shansong Liu; Atin Sakkeer Hussain; Qilong Wu; Chenshuo Sun; Ying Shan; | arxiv-cs.SD | 2024-12-09
27 | VidMusician: Video-to-Music Generation with Semantic-Rhythmic Alignment Via Hierarchical Visual Features. Highlight: In this paper, we propose VidMusician, a parameter-efficient video-to-music generation framework built upon text-to-music models. | SIFEI LI et. al. | arxiv-cs.SD | 2024-12-09
28 | Jess+: Designing Embodied AI for Interactive Music-making. Highlight: In this paper, we discuss the conceptualisation and design of embodied AI within an inclusive music-making project. | Craig Vear; Johann Benerradi; | arxiv-cs.HC | 2024-12-09
29 | M6: Multi-generator, Multi-domain, Multi-lingual and Cultural, Multi-genres, Multi-instrument Machine-Generated Music Detection Databases. Highlight: Detecting machine-generated music (MGMD) is, therefore, critical to safeguarding these domains, yet the field lacks comprehensive datasets to support meaningful progress. To address this gap, we introduce M6, a large-scale benchmark dataset tailored for MGMD research. | Yupei Li; Hanqian Li; Lucia Specia; Björn W. Schuller; | arxiv-cs.SD | 2024-12-08
30 | Semi-Supervised Contrastive Learning for Controllable Video-to-Music Retrieval. Highlight: However, identifying the best music for a video can be a difficult and time-consuming task. To address this challenge, we propose a novel framework for automatically retrieving a matching music clip for a given video, and vice versa. | Shanti Stewart; Gouthaman KV; Lie Lu; Andrea Fanelli; | arxiv-cs.MM | 2024-12-08
31 | Aligned Music Notation and Lyrics Transcription. Highlight: This paper introduces and formalizes, for the first time, the Aligned Music Notation and Lyrics Transcription (AMNLT) challenge, which addresses the complete transcription of vocal scores by jointly considering music symbols, lyrics, and their synchronization. | Eliseo Fuentes-Martínez; Antonio Ríos-Vila; Juan C. Martinez-Sevilla; David Rizo; Jorge Calvo-Zaragoza; | arxiv-cs.CV | 2024-12-05
32 | Missing Melodies: AI Music Generation and Its Nearly Complete Omission of The Global South. Highlight: We conducted an extensive analysis of over one million hours of audio datasets used in AI music generation research and manually reviewed more than 200 papers from eleven prominent AI and music conferences and organizations (AAAI, ACM, EUSIPCO, EURASIP, ICASSP, ICML, IJCAI, ISMIR, NeurIPS, NIME, SMC) to identify a critical gap in the fair representation and inclusion of the musical genres of the Global South in AI research. | Atharva Mehta; Shivam Chauhan; Monojit Choudhury; | arxiv-cs.SD | 2024-12-05
33 | Exploring Transformer-Based Music Overpainting for Jazz Piano Variations. Highlight: We introduce VAR4000, a subset of a larger dataset for jazz piano performances, consisting of 4,352 training pairs. | Eleanor Row; Ivan Shanin; György Fazekas; | arxiv-cs.SD | 2024-12-05
34 | Relationships Between Keywords and Strong Beats in Lyrical Music. Highlight: Artificial Intelligence (AI) song generation has emerged as a popular topic, yet the focus on exploring the latent correlations between specific lyrical and rhythmic features remains limited. In contrast, this pilot study particularly investigates the relationships between keywords and rhythmically stressed features such as strong beats in songs. | Callie C. Liao; Duoduo Liao; Ellie L. Zhang; | arxiv-cs.SD | 2024-12-05
35 | SINGER: Vivid Audio-driven Singing Video Generation with Multi-scale Spectral Diffusion Model. Highlight: We observe that the differences between singing and talking audios manifest in terms of frequency and amplitude. | YAN LI et. al. | arxiv-cs.CV | 2024-12-04
36 | MusicGen-Chord: Advancing Music Generation Through Chord Progressions and Interactive Web-UI. Highlight: MusicGen is a music generation language model (LM) that can be conditioned on textual descriptions and melodic features. We introduce MusicGen-Chord, which extends this capability by incorporating chord progression features. | Jongmin Jung; Andreas Jansson; Dasaem Jeong; | arxiv-cs.SD | 2024-11-29
37 | Music2Fail: Transfer Music to Failed Recorder Style. Highlight: In this paper, we investigate another style transfer scenario called “failed-music style transfer”. | CHON IN LEONG et. al. | arxiv-cs.SD | 2024-11-27
38 | Synthesising Handwritten Music with GANs: A Comprehensive Evaluation of CycleWGAN, ProGAN, and DCGAN. Highlight: This paper addresses the data scarcity problem by applying Generative Adversarial Networks (GANs) to synthesise realistic handwritten music sheets. | Elona Shatri; Kalikidhar Palavala; George Fazekas; | arxiv-cs.CV | 2024-11-25
39 | Proceedings of The 6th International Workshop on Reading Music Systems. Abstract: The International Workshop on Reading Music Systems (WoRMS) is a workshop that tries to connect researchers who develop systems for reading music, such as in the field of Optical … | Jorge Calvo-Zaragoza; Alexander Pacha; Elona Shatri; | arxiv-cs.CV | 2024-11-24
40 | A Training-Free Approach for Music Style Transfer with Latent Diffusion Models. Highlight: This paper introduces a novel training-free approach leveraging pre-trained Latent Diffusion Models (LDMs). | SOOYOUNG KIM et. al. | arxiv-cs.SD | 2024-11-24
41 | Mode-conditioned Music Learning and Composition: A Spiking Neural Network Inspired By Neuroscience and Psychology. Highlight: In this paper, we propose a spiking neural network inspired by brain mechanisms and psychological theories to represent musical modes and keys, ultimately generating musical pieces that incorporate tonality features. | Qian Liang; Yi Zeng; Menghaoran Tang; | arxiv-cs.SD | 2024-11-22
42 | DAIRHuM: A Platform for Directly Aligning AI Representations with Human Musical Judgments Applied to Carnatic Music. Highlight: This paper presents a platform for exploring the Direct alignment between AI music model Representations and Human Musical judgments (DAIRHuM). | Prashanth Thattai Ravikumar; | arxiv-cs.SD | 2024-11-22
43 | Generative AI for Music and Audio. Highlight: In this dissertation, I introduce the three main directions of my research centered around generative AI for music and audio: 1) multitrack music generation, 2) assistive music creation tools, and 3) multimodal learning for audio and music. | Hao-Wen Dong; | arxiv-cs.SD | 2024-11-21
44 | Building Music with Lego Bricks and Raspberry Pi. Highlight: In this paper, a system to build music in an intuitive and accessible way, with Lego bricks, is presented. | Ana M. Barbancho; Lorenzo J. Tardon; Isabel Barbancho; | arxiv-cs.HC | 2024-11-20
45 | Song Form-aware Full-Song Text-to-Lyrics Generation with Multi-Level Granularity Syllable Count Control. Highlight: We propose a framework for lyrics generation that enables multi-level syllable control at the word, phrase, line, and paragraph levels, aware of song form. | Yunkee Chae; Eunsik Shin; Hwang Suntae; Seungryeol Paik; Kyogu Lee; | arxiv-cs.CL | 2024-11-20
46 | Oblivious Algorithms for Maximum Directed Cut: New Upper and Lower Bounds. Highlight: In this work, we narrow the gap between upper and lower bounds on the best approximation ratio achievable by oblivious algorithms for Max-Directed-Cut. | Samuel Hwang; Noah G. Singer; Santhoshini Velusamy; | arxiv-cs.DS | 2024-11-19
47 | Improving Controllability and Editability for Pretrained Text-to-Music Generation Models. Highlight: These challenges include providing sufficient control over the generated content and allowing for flexible, precise edits. This thesis tackles these issues by introducing a series of advancements that progressively build upon each other, enhancing the controllability and editability of text-to-music generation models. | Yixiao Zhang; | arxiv-cs.SD | 2024-11-19
48 | Zero-Shot Crate Digging: DJ Tool Retrieval Using Speech Activity, Music Structure And CLAP Embeddings. Highlight: In this work we demonstrate an approach to discovering DJ tools in personal music collections. | Iroro Orife; | arxiv-cs.SD | 2024-11-18
49 | Do Captioning Metrics Reflect Music Semantic Alignment? Highlight: We present cases where traditional metrics are vulnerable to syntactic changes, and show they do not correlate well with human judgments. By addressing these issues, we aim to emphasize the need for a critical reevaluation of how music captions are assessed. | Jinwoo Lee; Kyogu Lee; | arxiv-cs.SD | 2024-11-18
50 | Examining Platformization in Cultural Production: A Comparative Computational Analysis of Hit Songs on TikTok and Spotify. Highlight: This study explores how TikTok and Spotify, situated in different governance and user contexts, could influence digital music production and reception within each platform and between each other. | Na Ta; Fang Jiao; Cong Lin; Cuihua Shen; | arxiv-cs.SI | 2024-11-17
51 | Language Models for Music Medicine Generation. Highlight: We propose fine-tuning MusicGen, a music-generating transformer model, to create short musical clips that assist patients in transitioning from negative to desired emotional states. | EMMANOUIL NIKOLAKAKIS et. al. | arxiv-cs.SD | 2024-11-13
52 | PerceiverS: A Multi-Scale Perceiver with Effective Segmentation for Long-Term Expressive Symbolic Music Generation. Highlight: However, creating symbolic music that is both long-structured and expressive remains a considerable challenge. In this paper, we propose PerceiverS (Segmentation and Scale), a novel architecture designed to address this issue by leveraging both Effective Segmentation and Multi-Scale attention mechanisms. | Yungang Yi; Weihua Li; Matthew Kuo; Quan Bai; | arxiv-cs.AI | 2024-11-12
53 | Music Discovery Dialogue Generation Using Human Intent Analysis and Large Language Models. Highlight: In this paper, we present a data generation framework for rich music discovery dialogue using a large language model (LLM) and user intents, system actions, and musical attributes. | SeungHeon Doh; Keunwoo Choi; Daeyong Kwon; Taesu Kim; Juhan Nam; | arxiv-cs.SD | 2024-11-11
54 | Timing and Dynamics of The Rosanna Shuffle. Highlight: In this analysis, we examine the timing and dynamics of the original drum track, focusing on rhythmic variations such as swing factor, microtiming deviations, tempo drift, and the overall dynamics of the hi-hat pattern. | Esa Räsänen; Niko Gullsten; Otto Pulkkinen; Tuomas Virtanen; | arxiv-cs.SD | 2024-11-11
55 | Generating Mixcode Popular Songs with Artificial Intelligence: Concepts, Plans, and Speculations. Highlight: This paper discusses a proposed project integrating artificial intelligence and popular music, with the ultimate goal of creating a powerful tool for implementing music for social transformation, education, healthcare, and emotional well-being. | Abhishek Kaushik; Kayla Rush; | arxiv-cs.IR | 2024-11-10
56 | Harnessing High-Level Song Descriptors Towards Natural Language-Based Music Recommendation. Highlight: In this paper, we assess LMs’ effectiveness in recommending songs based on user natural language descriptions and items with descriptors like genres, moods, and listening contexts. | Elena V. Epure; Gabriel Meseguer-Brocal; Darius Afchar; Romain Hennequin; | arxiv-cs.IR | 2024-11-08
57 | Long-Form Text-to-Music Generation with Adaptive Prompts: A Case of Study in Tabletop Role-Playing Games Soundtracks. Highlight: We introduce Babel Bardo, a system that uses Large Language Models (LLMs) to transform speech transcriptions into music descriptions for controlling a text-to-music model. | Felipe Marra; Lucas N. Ferreira; | arxiv-cs.SD | 2024-11-06
58 | MoMu-Diffusion: On Learning Long-Term Motion-Music Synchronization and Correspondence. Highlight: However, to date, there has been no work that considers them jointly to explore the modality alignment within. To bridge this gap, we propose a novel framework, termed MoMu-Diffusion, for long-term and synchronous motion-music generation. | FUMING YOU et. al. | arxiv-cs.SD | 2024-11-04
59 | PIAST: A Multimodal Piano Dataset with Audio, Symbolic and Text. Highlight: While piano music has become a significant area of study in Music Information Retrieval (MIR), there is a notable lack of datasets for piano solo music with text labels. To address this gap, we present PIAST (PIano dataset with Audio, Symbolic, and Text), a piano music dataset. | HAYEON BANG et. al. | arxiv-cs.SD | 2024-11-04
60 | Sing-On-Your-Beat: Simple Text-Controllable Accompaniment Generations. Highlight: With advancements in deep learning, previous research has focused on generating suitable accompaniments but often lacks precise alignment with the desired instrumentation and genre. To address this, we propose a straightforward method that enables control over the accompaniment through text prompts, allowing the generation of music that complements the vocals and aligns with the song’s instrumental and genre requirements. | Quoc-Huy Trinh; Minh-Van Nguyen; Trong-Hieu Nguyen Mau; Khoa Tran; Thanh Do; | arxiv-cs.SD | 2024-11-03
61 | I’ve Heard This Before: Initial Results on Tiktok’s Impact On The Re-Popularization of Songs. Highlight: In this paper, we analyze how TikTok helps to revitalize older songs. | Breno Matos; Francisco Galuppo; Rennan Cordeiro; Flavio Figueiredo; | arxiv-cs.SI | 2024-11-02
62 | Assessing The Impact of Sampling, Remixes, and Covers on Original Song Popularity. Highlight: Using Who Sampled data and Google Trends, we examine how the popularity of a borrowing song affects the original. | Guilherme Soares S. dos Santos; Flavio Figueiredo; | arxiv-cs.SI | 2024-11-02
63 | Music Foundation Model As Generic Booster for Music Downstream Tasks. Highlight: We introduce SoniDo, a music foundation model (MFM) designed to extract hierarchical features from target music samples. | WEIHSIANG LIAO et. al. | arxiv-cs.SD | 2024-11-02
64 | MIRFLEX: Music Information Retrieval Feature Library for Extraction. Highlight: This paper introduces an extendable modular system that compiles a range of music feature extraction models to aid music information retrieval research. | Anuradha Chopra; Abhinaba Roy; Dorien Herremans; | arxiv-cs.SD | 2024-11-01
65 | Machine Learning Framework for Audio-Based Content Evaluation Using MFCC, Chroma, Spectral Contrast, and Temporal Feature Engineering. Highlight: This study presents a machine learning framework for assessing similarity between audio content and predicting sentiment score. | Aris J. Aristorenas; | arxiv-cs.SD | 2024-10-31
66 | Semi-Supervised Self-Learning Enhanced Music Emotion Recognition. Highlight: To handle the noisy label issue, we propose a semi-supervised self-learning (SSSL) method, which can differentiate between samples with correct and incorrect labels in a self-learning manner, thus effectively utilizing the augmented segment-level data. | Yifu Sun; Xulong Zhang; Monan Zhou; Wei Li; | arxiv-cs.SD | 2024-10-29
67 | Emotion-Guided Image to Music Generation. Highlight: This paper presents an emotion-guided image-to-music generation framework that leverages the Valence-Arousal (VA) emotional space to produce music that aligns with the emotional tone of a given image. | Souraja Kundu; Saket Singh; Yuji Iwahori; | arxiv-cs.SD | 2024-10-29
68 | ExpressiveSinger: Multilingual and Multi-Style Score-based Singing Voice Synthesis with Expressive Performance Control. Abstract: Singing Voice Synthesis (SVS) has significantly advanced with deep generative models, achieving high audio quality but still struggling with musicality, mainly due to the lack of … | Shuqi Dai; Ming-Yu Liu; Rafael Valle; Siddharth Gururani; | ACM Multimedia | 2024-10-28
69 | MidiTok Visualizer: A Tool for Visualization and Analysis of Tokenized MIDI Symbolic Music. Highlight: Symbolic music research plays a crucial role in music-related machine learning, but MIDI data can be complex for those without musical expertise. To address this issue, we present MidiTok Visualizer, a web application designed to facilitate the exploration and visualization of various MIDI tokenization methods from the MidiTok Python package. | Michał Wiszenko; Kacper Stefański; Piotr Malesa; Łukasz Pokorzyński; Mateusz Modrzejewski; | arxiv-cs.SD | 2024-10-27
70 | Symbotunes: Unified Hub for Symbolic Music Generative Models. Highlight: Therefore, directly comparing the methods or becoming acquainted with them may present challenges. To mitigate this issue we introduce Symbotunes, an open-source unified hub for symbolic music generative models. | Paweł Skierś; Maksymilian Łazarski; Michał Kopeć; Mateusz Modrzejewski; | arxiv-cs.SD | 2024-10-27
71 | An Approach to Hummed-tune and Song Sequences Matching. Highlight: This paper covers how the data is pre-processed from its original format (mp3) into a usable form for training and inference. | LOC BAO PHAM et. al. | arxiv-cs.SD | 2024-10-27
72 | MusicFlow: Cascaded Flow Matching for Text Guided Music Generation. Highlight: We introduce MusicFlow, a cascaded text-to-music generation model based on flow matching. | K R PRAJWAL et. al. | arxiv-cs.SD | 2024-10-27
73 | Arabic Music Classification and Generation Using Deep Learning. Highlight: The dataset used in this project consists of new and classical Egyptian music pieces composed by different composers. | MOHAMED ELSHAARAWY et. al. | arxiv-cs.SD | 2024-10-25
74 | Melody Construction for Persian Lyrics Using LSTM Recurrent Neural Networks. Highlight: The present paper investigated automatic melody construction for Persian lyrics as an input. | Farshad Jafari; Farzad Didehvar; Amin Gheibi; | arxiv-cs.SD | 2024-10-23
75 | Exploring Tokenization Methods for Multitrack Sheet Music Generation. Highlight: This study explores the tokenization of multitrack sheet music in ABC notation, introducing two methods: bar-stream and line-stream patching. | Yashan Wang; Shangda Wu; Xingjian Du; Maosong Sun; | arxiv-cs.SD | 2024-10-23
76 | Striking A New Chord: Neural Networks in Music Information Dynamics. Highlight: Specifically, we compare LSTM, Transformer, and GPT models against a widely-used Markov model to predict a chord event following a sequence of chords. | Farshad Jafari; Claire Arthur; | arxiv-cs.IT | 2024-10-23
77 | Audio-to-Score Conversion Model Based on Whisper Methodology. Highlight: This thesis innovatively introduces the Orpheus’ Score, a custom notation system that converts music information into tokens, designs a custom vocabulary library, and trains a corresponding custom tokenizer. | Hongyao Zhang; Bohang Sun; | arxiv-cs.SD | 2024-10-22
78 | Musinger: Communication of Music Over A Distance with Wearable Haptic Display and Touch Sensitive Surface. Highlight: This study explores the integration of auditory and tactile experiences in musical haptics, focusing on enhancing sensory dimensions of music through touch. Addressing the gap in translating auditory signals to meaningful tactile feedback, our research introduces a novel method involving a touch-sensitive recorder and a wearable haptic display that captures musical interactions via force sensors and converts these into tactile sensations. | MIGUEL ALTAMIRANO CABRERA et. al. | arxiv-cs.HC | 2024-10-21
79 | Music2P: A Multi-Modal AI-Driven Tool for Simplifying Album Cover Design. Highlight: Our ultimate goal is to provide a tool that empowers musicians and producers, especially those with limited resources or expertise, to create compelling album covers. | Joong Ho Choi; Geonyeong Choi; Ji Eun Han; Wonjin Yang; Zhi-Qi Cheng; | cikm | 2024-10-21
80 | OpenMU: Your Swiss Army Knife for Music Understanding. Highlight: We present OpenMU-Bench, a large-scale benchmark suite for addressing the data scarcity issue in training multimodal language models to understand music. | MENGJIE ZHAO et. al. | arxiv-cs.SD | 2024-10-20
81 | ArchiTone: A LEGO-Inspired Gamified System for Visualized Music Education. Highlight: Informed by formative investigation and inspired by LEGO, we introduce ArchiTone, a gamified system that employs constructivism by visualizing music theory concepts as musical blocks and buildings for music education. | JIAXING YU et. al. | arxiv-cs.HC | 2024-10-20
82 | Audio Processing Using Pattern Recognition for Music Genre Classification. Highlight: This research aims to contribute to improving music recommendation systems and content curation. | Sivangi Chatterjee; Srishti Ganguly; Avik Bose; Hrithik Raj Prasad; Arijit Ghosal; | arxiv-cs.SD | 2024-10-19
83 | Music Therapy for Autism Spectrum Disorder: A Comprehensive Literature Review on Therapeutic Efficacy, Limitations, and AI Integration. Abstract: Autism Spectrum Disorder (ASD) is a neurological and developmental condition that presents considerable social, behavioral, and communicative challenges to those diagnosed with … | Beatrice Low; Xindi Liu; Richard Z. Li; Elizabeth Ren; Jasmine X Zhang; | 2024 IEEE 15th Annual Ubiquitous Computing, Electronics & … | 2024-10-17
84 | CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models. Abstract: Challenges in managing linguistic diversity and integrating various musical modalities are faced by current music information retrieval systems. These limitations reduce their … | SHANGDA WU et. al. | ArXiv | 2024-10-17
85 | MeloTrans: A Text to Symbolic Music Generation Model Following Human Composition Habit. Highlight: To achieve this, we develop the POP909_M dataset, the first to include labels for musical motifs and their variants, providing a basis for mimicking human compositional habits. Building on this, we propose MeloTrans, a text-to-music composition model that employs principles of motif development rules. | YUTIAN WANG et. al. | arxiv-cs.SD | 2024-10-17
86 | Do We Need More Complex Representations for Structure? A Comparison of Note Duration Representation for Music Transformers. Highlight: In this work, we inquire if the off-the-shelf Music Transformer models perform just as well on structural similarity metrics using only unannotated MIDI information. | Gabriel Souza; Flavio Figueiredo; Alexei Machado; Deborah Guimarães; | arxiv-cs.SD | 2024-10-14
87 | M2M-Gen: A Multimodal Framework for Automated Background Music Generation in Japanese Manga Using Large Language Models. Highlight: This paper introduces M2M Gen, a multimodal framework for generating background music tailored to Japanese manga. | Megha Sharma; Muhammad Taimoor Haseeb; Gus Xia; Yoshimasa Tsuruoka; | arxiv-cs.SD | 2024-10-13
88 | Small Tunes Transformer: Exploring Macro & Micro-Level Hierarchies for Skeleton-Conditioned Melody Generation. Highlight: In this study, we delve into the multi-level structures within music from macro-level and micro-level hierarchies. | Yishan Lv; Jing Luo; Boyuan Ju; Xinyu Yang; | arxiv-cs.SD | 2024-10-11
89 | Symbolic Music Generation with Fine-grained Interactive Textural Guidance. Highlight: The problem of symbolic music generation presents unique challenges due to the combination of limited data availability and the need for high precision in note pitch. To overcome these difficulties, we introduce Fine-grained Textural Guidance (FTG) within diffusion models to correct errors in the learned distributions. | Tingyu Zhu; Haoyu Liu; Zhimin Jiang; Zeyu Zheng; | arxiv-cs.SD | 2024-10-10
90 | Song Emotion Classification of Lyrics with Out-of-Domain Data Under Label Scarcity. Highlight: We examine the novel usage of a large out-of-domain dataset as a creative solution to the challenge of training data scarcity in the emotional classification of song lyrics. | Jonathan Sakunkoo; Annabella Sakunkoo; | arxiv-cs.CL | 2024-10-08
91 | Algorithmic Collective Action in Recommender Systems: Promoting Songs By Reordering Playlists. Highlight: The success of the collective is measured by the increase in test-time recommendations of the targeted song, given a constraint on the impact on user experience. We introduce two easy-to-implement strategies towards this goal and test their efficacy on a publicly available recommender system model used in production by a major music streaming platform. | Joachim Baumann; Celestine Mendler-Dünner; | nips | 2024-10-07
92 | Art2Mus: Bridging Visual Arts and Music Through Cross-Modal Generation. Highlight: However, existing image-to-music models are limited to simple images, lacking the capability to generate music from complex digitized artworks. To address this gap, we introduce Art2Mus, a novel model designed to create music from digitized artworks or text inputs. | Ivan Rinaldi; Nicola Fanelli; Giovanna Castellano; Gennaro Vessio; | arxiv-cs.MM | 2024-10-07
93 | UniMuMo: Unified Text, Music and Motion Generation. Abstract: We introduce UniMuMo, a unified multimodal model capable of taking arbitrary text, music, and motion data as input conditions to generate outputs across all three modalities. To … | HAN YANG et. al. | ArXiv | 2024-10-06
94 | Enriching Music Descriptions with A Finetuned-LLM and Metadata for Text-to-Music Retrieval. Highlight: However, users also articulate a need to explore music that shares similarities with their favorite tracks or artists, such as “I need a similar track to Superstition by Stevie Wonder”. To address these concerns, this paper proposes an improved Text-to-Music Retrieval model, denoted as TTMR++, which utilizes rich text descriptions generated with a finetuned large language model and metadata. | SeungHeon Doh; Minhee Lee; Dasaem Jeong; Juhan Nam; | arxiv-cs.SD | 2024-10-04
95 | SoundSignature: What Type of Music Do You Like? Highlight: In this paper, we highlight the application’s innovative features and educational potential, and present findings from a pilot user study that evaluates its efficacy and usability. | Brandon James Carone; Pablo Ripollés; | arxiv-cs.SD | 2024-10-04
96 | CoLLAP: Contrastive Long-form Language-Audio Pretraining with Musical Temporal Structure Augmentation. Highlight: We propose Contrastive Long-form Language-Audio Pretraining (CoLLAP) to significantly extend the perception window for both the input audio (up to 5 minutes) and the language descriptions (exceeding 250 words), while enabling contrastive learning across modalities and temporal dynamics. | JUNDA WU et. al. | arxiv-cs.SD | 2024-10-03
97 | Generating Symbolic Music from Natural Language Prompts Using An LLM-Enhanced Dataset. Highlight: In this work, we present MetaScore, a new dataset consisting of 963K musical scores paired with rich metadata, including free-form user-annotated tags, collected from an online music forum. | Weihan Xu; Julian McAuley; Taylor Berg-Kirkpatrick; Shlomo Dubnov; Hao-Wen Dong; | arxiv-cs.SD | 2024-10-02
98 | Agent-Driven Large Language Models for Mandarin Lyric Generation. Highlight: In this research, we developed a multi-agent system that decomposes the melody-to-lyric task into sub-tasks, with each agent controlling rhyme, syllable count, lyric-melody alignment, and consistency. | Hong-Hsiang Liu; Yi-Wen Liu; | arxiv-cs.CL | 2024-10-02
99 | Analyzing Byte-Pair Encoding on Monophonic and Polyphonic Symbolic Music: A Focus on Musical Phrase Segmentation. Highlight: Given that symbolic music can differ significantly from text, particularly with polyphony, we investigate how BPE behaves with different types of musical content. This study provides a qualitative analysis of BPE’s behavior across various instrumentations and evaluates its impact on a musical phrase segmentation task for both monophonic and polyphonic music. | Dinh-Viet-Toan Le; Louis Bigo; Mikaela Keller; | arxiv-cs.IR | 2024-10-02
100 | Do Music Generation Models Encode Music Theory? Highlight: Thus, we introduce SynTheory, a synthetic MIDI and audio music theory dataset, consisting of tempos, time signatures, notes, intervals, scales, chords, and chord progressions concepts. | Megan Wei; Michael Freeman; Chris Donahue; Chen Sun; | arxiv-cs.SD | 2024-10-01
101 | Melody-Guided Music Generation. Highlight: We present the Melody-Guided Music Generation (MG2) model, a novel approach using melody to guide the text-to-music generation that, despite a simple method and limited resources, achieves excellent performance. | Shaopeng Wei; Manzhen Wei; Haoyu Wang; Yu Zhao; Gang Kou; | arxiv-cs.SD | 2024-09-30
102 | Integrating Text-to-Music Models with Language Models: Composing Long Structured Music Pieces. Highlight: This paper proposes integrating a text-to-music model with a large language model to generate music with form. | Lilac Atassi; | arxiv-cs.SD | 2024-09-30
103 | SongTrans: An Unified Song Transcription and Alignment Method for Lyrics and Notes. Highlight: While there are tools available for lyric or note transcription tasks, they all need pre-processed data which is relatively time-consuming (e.g., vocal and accompaniment separation). Besides, most of these tools are designed to address a single task and struggle with aligning lyrics and notes (i.e., identifying the corresponding notes of each word in lyrics). | SIWEI WU et. al. | arxiv-cs.SD | 2024-09-22
104 | MuCodec: Ultra Low-Bitrate Music Codec. Highlight: Due to the complexity of music backgrounds and the richness of vocals, solely relying on modeling semantic or acoustic information cannot effectively reconstruct music with both vocals and backgrounds. To address this issue, we propose MuCodec, specifically targeting music compression and reconstruction tasks at ultra low bitrates. | YAOXUN XU et. al. | arxiv-cs.SD | 2024-09-20
105 | Exploring Bat Song Syllable Representations in Self-supervised Audio Encoders. Highlight: How well can deep learning models trained on human-generated sounds distinguish between another species’ vocalization types? | Marianne de Heer Kloots; Mirjam Knörnschild; | arxiv-cs.SD | 2024-09-19
106 | $\text{M}^\text{6}(\text{GPT})^\text{3}$: Generating Multitrack Modifiable Multi-Minute MIDI Music from Text Using Genetic Algorithms, Probabilistic Methods and GPT Models in Any Progression and Time Signature. Highlight: We propose a genetic algorithm for the generation of melodic elements. | Jakub Poćwiardowski; Mateusz Modrzejewski; Marek S. Tatara; | arxiv-cs.SD | 2024-09-19
107 | FruitsMusic: A Real-World Corpus of Japanese Idol-Group Songs. Highlight: This study presents FruitsMusic, a metadata corpus of Japanese idol-group songs in the real world, precisely annotated with who sings what and when. | Hitoshi Suda; Shunsuke Yoshida; Tomohiko Nakamura; Satoru Fukayama; Jun Ogata; | arxiv-cs.SD | 2024-09-19
108 | Simultaneous Music Separation and Generation Using Multi-Track Latent Diffusion Models. Highlight: In this work, we introduce a latent diffusion-based multi-track generation model capable of both source separation and multi-track music synthesis by learning the joint probability distribution of tracks sharing a musical context. | Tornike Karchkhadze; Mohammad Rasool Izadi; Shlomo Dubnov; | arxiv-cs.SD | 2024-09-18
109 | METEOR: Melody-aware Texture-controllable Symbolic Orchestral Music Generation. Highlight: In this work, we propose METEOR, a model for Melody-aware Texture-controllable Orchestral music generation. | Dinh-Viet-Toan Le; Yi-Hsuan Yang; | arxiv-cs.SD | 2024-09-18
110 | Evaluation of Pretrained Language Models on Music Understanding. Abstract: Music-text multimodal systems have enabled new approaches to Music Information Research (MIR) applications such as audio-to-text and text-to-audio retrieval, text-based song … | Yannis Vasilakis; Rachel M. Bittner; Johan Pauwels; | ArXiv | 2024-09-17
111 | PDMX: A Large-Scale Public Domain MusicXML Dataset for Symbolic Music Processing. Highlight: Such issues highlight the need for publicly available, copyright-free musical data, in which there is a large shortage, particularly for symbolic music data. To alleviate this issue, we present PDMX: a large-scale open-source dataset of over 250K public domain MusicXML scores collected from the score-sharing forum MuseScore, making it the largest available copyright-free symbolic music dataset to our knowledge. | Phillip Long; Zachary Novack; Taylor Berg-Kirkpatrick; Julian McAuley; | arxiv-cs.SD | 2024-09-16
112 | Unveiling and Mitigating Bias in Large Language Model Recommendations: A Path to Fairness. Highlight: This study explores the interplay between bias and LLM-based recommendation systems, focusing on music, song, and book recommendations across diverse demographic and cultural groups. | Anindya Bijoy Das; Shahnewaz Karim Sakib; | arxiv-cs.IR | 2024-09-16
113 | MusicLIME: Explainable Multimodal Music Understanding. Highlight: In this paper, we introduce MusicLIME, a model-agnostic feature importance explanation method designed for multimodal music models. | Theodoros Sotirou; Vassilis Lyberatos; Orfeas Menis Mastromichalakis; Giorgos Stamou; | arxiv-cs.SD | 2024-09-16
114 | ELMI: Interactive and Intelligent Sign Language Translation of Lyrics for Song Signing. Highlight: We present ELMI, an accessible song-signing tool that assists in translating lyrics into sign language. | Suhyeon Yoo; Khai N. Truong; Young-Ho Kim; | arxiv-cs.HC | 2024-09-15
115 | Constructing A Singing Style Caption Dataset. Highlight: However, existing open-source audio-text datasets for voice generation tend to capture only a very limited range of attributes, often missing musical characteristics of the audio. To fill this gap, we introduce S2Cap, an audio-text pair dataset with a diverse set of attributes. | Hyunjong Ok; Jaeho Lee; | arxiv-cs.CL | 2024-09-15
116 | Prevailing Research Areas for Music AI in The Era of Foundation Models. Highlight: We then overview different generative models, forms of evaluating these models, and their computational constraints/limitations. Subsequently, we highlight applications of these generative models towards extensions to multiple modalities and integration with artists’ workflow as well as music education systems. | Megan Wei; Mateusz Modrzejewski; Aswin Sivaraman; Dorien Herremans; | arxiv-cs.SD | 2024-09-14
117 | A Survey of Foundation Models for Music Understanding. Highlight: The rapid advancement of artificial intelligence (AI) has introduced new ways to analyze music, aiming to replicate human understanding of music and provide related services. | WENJUN LI et. al. | arxiv-cs.SD | 2024-09-14
118 | Towards Leveraging Contrastively Pretrained Neural Audio Embeddings for Recommender Tasks. Highlight: Our experiments demonstrate that neural embeddings, particularly those generated with the Contrastive Language-Audio Pretraining (CLAP) model, present a promising approach to enhancing music recommendation tasks within graph-based frameworks. | Florian Grötschla; Luca Strässle; Luca A. Lanzendörfer; Roger Wattenhofer; | arxiv-cs.SD | 2024-09-13
119 | Seed-Music: A Unified Framework for High Quality and Controlled Music Generation. Highlight: We introduce Seed-Music, a suite of music generation systems capable of producing high-quality music with fine-grained style control. | YE BAI et. al. | arxiv-cs.SD | 2024-09-13
120 | Bridging Paintings and Music — Exploring Emotion Based Music Generation Through Paintings. Highlight: This research develops a model capable of generating music that resonates with the emotions depicted in visual arts, integrating emotion labeling, image captioning, and language models to transform visual inputs into musical compositions. | Tanisha Hisariya; Huan Zhang; Jinhua Liang; | arxiv-cs.SD | 2024-09-12
121 | VMAS: Video-to-Music Generation Via Semantic Alignment in Web Music Videos Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a framework for learning to generate background music from video inputs. |
Yan-Bo Lin; Yu Tian; Linjie Yang; Gedas Bertasius; Heng Wang; | arxiv-cs.MM | 2024-09-11 |
122 | A Two-Stage Band-Split Mamba-2 Network For Music Separation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper applies Mamba-2 with a two-stage strategy, which introduces residual mapping based on the mask method, effectively compensating for the details absent in the mask and further improving separation performance. |
Jinglin Bai; Yuan Fang; Jiajie Wang; Xueliang Zhang; | arxiv-cs.SD | 2024-09-10 |
123 | RobustSVC: HuBERT-based Melody Extractor and Adversarial Learning for Robust Singing Voice Conversion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, current separating methods struggle to fully remove noise or excessively suppress signal components, affecting the naturalness and similarity of the processed audio. To tackle this, our study introduces RobustSVC, a novel any-to-one SVC framework that converts noisy vocals into clean vocals sung by the target singer. |
WEI CHEN et. al. | arxiv-cs.SD | 2024-09-10 |
124 | An End-to-End Approach for Chord-Conditioned Song Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Given the inaccuracy of automatic chord extractors, we devise a robust cross-attention mechanism augmented with dynamic weight sequence to integrate extracted chord information into song generations and reduce frame-level flaws, and propose a novel model termed Chord-Conditioned Song Generator (CSG) based on it. |
SHUOCHEN GAO et. al. | arxiv-cs.SD | 2024-09-10 |
125 | Benchmarking Sub-Genre Classification For Mainstage Dance Music Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the absence of comprehensive datasets and high-performing methods in the classification of mainstage dance music, this work introduces a novel benchmark comprising a new dataset and a baseline. |
Hongzhi Shu; Xinglin Li; Hongyu Jiang; Minghao Fu; Xinyu Li; | arxiv-cs.SD | 2024-09-10 |
126 | Musical Chords: A Novel Java Algorithm and App Utility to Enumerate Chord-Progressions Adhering to Music Theory Guidelines Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these limitations, a novel Java Algorithm and automated music theory chord progression and variations generator App has been developed. This App offers a piano user interface that applies music theory to generate all possible four-chord and eight-chord progressions and produces three alternate variations of the generated progressions selected by the user. |
Aditya Lakshminarasimhan; | arxiv-cs.SD | 2024-09-09 |
127 | SongCreator: Lyrics-based Universal Song Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While various aspects of song generation have been explored by previous works, such as singing voice, vocal composition and instrumental arrangement, etc., generating songs with both vocals and accompaniment given lyrics remains a significant challenge, hindering the application of music generation models in the real world. In this light, we propose SongCreator, a song-generation system designed to tackle this challenge. |
SHUN LEI et. al. | arxiv-cs.SD | 2024-09-09 |
128 | Latent Diffusion Bridges for Unsupervised Musical Audio Timbre Transfer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel method based on dual diffusion bridges, trained using the CocoChorales Dataset, which consists of unpaired monophonic single-instrument audio data. |
MICHELE MANCUSI et. al. | arxiv-cs.SD | 2024-09-09 |
129 | Mel-RoFormer for Vocal Separation and Vocal Melody Transcription Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Mel-RoFormer, a spectrogram-based model featuring two key designs: a novel Mel-band Projection module at the front-end to enhance the model’s capability to capture informative features across multiple frequency bands, and interleaved RoPE Transformers to explicitly model the frequency and time dimensions as two separate sequences. |
Ju-Chiang Wang; Wei-Tsung Lu; Jitong Chen; | arxiv-cs.SD | 2024-09-06 |
130 | Enhancing Sequential Music Recommendation with Personalized Popularity Awareness Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Moreover, music consumption is characterized by a prevalence of repeated listening, i.e., users frequently return to their favourite tracks, an important signal that could be framed as individual or personalized popularity. This paper addresses these challenges by introducing a novel approach that incorporates personalized popularity information into sequential recommendation. |
Davide Abbattista; Vito Walter Anelli; Tommaso Di Noia; Craig Macdonald; Aleksandr Vladimirovich Petrov; | arxiv-cs.IR | 2024-09-06 |
131 | Multi-Track MusicLDM: Towards Versatile Music Generation with Latent Diffusion Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This process involves composing each instrument to align with existing ones in terms of beat, dynamics, harmony, and melody, requiring greater precision and control over tracks than text prompts usually provide. In this work, we address these challenges by extending the MusicLDM, a latent diffusion model for music, into a multi-track generative model. |
Tornike Karchkhadze; Mohammad Rasool Izadi; Ke Chen; Gerard Assayag; Shlomo Dubnov; | arxiv-cs.SD | 2024-09-04 |
132 | MusicMamba: A Dual-Feature Modeling Approach for Generating Chinese Traditional Music with Modal Precision Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing research primarily focuses on Western music and encounters challenges in generating melodies for Chinese traditional music, especially in capturing modal characteristics and emotional expression. To address these issues, we propose a new architecture, the Dual-Feature Modeling Module, which integrates the long-range dependency modeling of the Mamba Block with the global structure capturing capabilities of the Transformer Block. |
JIATAO CHEN et. al. | arxiv-cs.SD | 2024-09-04 |
133 | Diffusion-Based Sound Synthesis in Music Production Related Papers Related Patents Related Grants Related Venues Related Experts View |
Pierre-Louis Wolfgang Léon Suckrow; Christoph Johannes Weber; Sylvia Rothe; | El Farmaceutico | 2024-09-02 |
134 | Considerations and Concerns of Professional Game Composers Regarding Artificially Intelligent Music Technology Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Artificially intelligent music technology (AIMT) is a promising field with great potential for creating innovation in music. However, the considerations and concerns surrounding … |
Kyle Worrall; Tom Collins; | IEEE Transactions on Games | 2024-09-01 |
135 | MMT-BERT: Chord-aware Symbolic Music Generation Based on Multitrack Music Transformer and MusicBERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current techniques dedicated to symbolic music generation generally encounter two significant challenges: training data’s lack of information about chords and scales and the requirement of a specially designed model architecture adapted to the unique format of symbolic music representation. In this paper, we solve the above problems by introducing a new symbolic music representation with the MusicLang chord analysis model. |
Jinlong Zhu; Keigo Sakurai; Ren Togo; Takahiro Ogawa; Miki Haseyama; | arxiv-cs.SD | 2024-09-01 |
136 | Application Research of Short-Time Fourier Transform in Music Generation Based on The Parallel WaveGan System Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Despite the widespread use of Fourier transform (FT) networks and generative adversarial networks (GANs) in audio signal processing, their practical effectiveness in unsupervised … |
Jun Min; Zhiwei Gao; Lei Wang; Aihua Zhang; | IEEE Transactions on Industrial Informatics | 2024-09-01 |
137 | FLUX That Plays Music Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper explores a simple extension of diffusion-based rectified flow Transformers for text-to-music generation, termed as FluxMusic. |
Zhengcong Fei; Mingyuan Fan; Changqian Yu; Junshi Huang; | arxiv-cs.SD | 2024-08-31 |
138 | Toward A More Complete OMR Solution Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we focus on the MUSCIMA++ v2.0 dataset, which represents musical notation as a graph with pairwise relationships among detected music objects, and we consider both stages together. |
Guang Yang; Muru Zhang; Lin Qiu; Yanming Wan; Noah A. Smith; | arxiv-cs.CV | 2024-08-30 |
139 | REFFLY: Melody-Constrained Lyrics Editing Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce REFFLY (REvision Framework For Lyrics), the first revision framework designed to edit arbitrary forms of plain text draft into high-quality, full-fledged song lyrics. |
Songyan Zhao; Bingxuan Li; Yufei Tian; Nanyun Peng; | arxiv-cs.CL | 2024-08-30 |
140 | Video to Music Moment Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to bridge the gap, we propose in this paper video to music moment retrieval (VMMR) as a new task. |
ZIJIE XIN et. al. | arxiv-cs.MM | 2024-08-29 |
141 | Do Recommender Systems Promote Local Music? A Reproducibility Study Using Music Streaming Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To assess the robustness of this study’s conclusions, we conduct a comparative analysis using proprietary listening data from a global music streaming service, which we publicly release alongside this paper. |
KRISTINA MATROSOVA et. al. | arxiv-cs.IR | 2024-08-29 |
142 | Transformers Meet ACT-R: Repeat-Aware and Sequential Listening Session Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This is a crucial limitation for music recommendation, as repeatedly listening to the same song over time is a common phenomenon that can even change the way users perceive this song. In this paper, we introduce PISA (Psychology-Informed Session embedding using ACT-R), a session-level sequential recommender system that overcomes this limitation. |
Viet-Anh Tran; Guillaume Salha-Galvan; Bruno Sguerra; Romain Hennequin; | arxiv-cs.IR | 2024-08-29 |
143 | Multimodal Music Datasets? Challenges and Future Goals in Music Processing Related Papers Related Patents Related Grants Related Venues Related Experts View |
Anna-Maria Christodoulou; Olivier Lartillot; Alexander Refsum Jensenius; | Int. J. Multim. Inf. Retr. | 2024-08-28 |
144 | Knowledge Discovery in Optical Music Recognition: Enhancing Information Retrieval with Instance Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We make our implementation, pre-processing scripts, trained models, and evaluation results publicly available to support further research and development. |
Elona Shatri; George Fazekas; | arxiv-cs.IR | 2024-08-27 |
145 | Unlocking Potential in Pre-Trained Music Language Models for Versatile Multi-Track Music Arrangement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a unified sequence-to-sequence framework that enables the fine-tuning of a symbolic music language model for multiple multi-track arrangement tasks, including band arrangement, piano reduction, drum arrangement, and voice separation. |
Longshen Ou; Jingwei Zhao; Ziyu Wang; Gus Xia; Ye Wang; | arxiv-cs.SD | 2024-08-27 |
146 | Foundation Models for Music: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The paper offers insights into future challenges and trends on FMs for music, aiming to shape the trajectory of human-AI collaboration in the music realm. |
YINGHAO MA et. al. | arxiv-cs.SD | 2024-08-26 |
147 | SONICS: Synthetic Or Not — Identifying Counterfeit Songs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, we highlight the importance of modeling long-range temporal dependencies in songs for effective authenticity detection, an aspect overlooked in existing methods. To capture these patterns, we propose a novel model, SpecTTTra, that is up to 3 times faster and 6 times more memory efficient compared to popular CNN and Transformer-based models while maintaining competitive performance. |
Md Awsafur Rahman; Zaber Ibn Abdul Hakim; Najibul Haque Sarker; Bishmoy Paul; Shaikh Anowarul Fattah; | arxiv-cs.SD | 2024-08-26 |
148 | LyCon: Lyrics Reconstruction from The Bag-of-Words Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our study introduces a novel method for generating copyright-free lyrics from publicly available Bag-of-Words (BoW) datasets, which contain the vocabulary of lyrics but not the lyrics themselves. |
Haven Kim; Kahyun Choi; | arxiv-cs.CL | 2024-08-26 |
149 | SONICS: Synthetic Or Not – Identifying Counterfeit Songs Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The recent surge in AI-generated songs presents exciting possibilities and challenges. While these inventions democratize music creation, they also necessitate the ability to … |
Md Awsafur Rahman; Zaber Ibn Abdul Hakim; Najibul Haque Sarker; Bishmoy Paul; S. Fattah; | ArXiv | 2024-08-26 |
150 | A Tighter Complexity Analysis of SparseGPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we improved the analysis of the running time of SparseGPT [Frantar, Alistarh ICML 2023] from $O(d^{3})$ to $O(d^{\omega} + d^{2+a+o(1)} + d^{1+\omega(1,1,a)-a})$ for any $a \in [0, 1]$, where $\omega$ is the exponent of matrix multiplication. |
Xiaoyu Li; Yingyu Liang; Zhenmei Shi; Zhao Song; | arxiv-cs.DS | 2024-08-22 |
151 | Towards Estimating Personal Values in Song Lyrics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, as highly subjective text, song lyrics present a challenge in terms of sampling songs to be annotated, annotation methods, and in choosing a method for aggregation. In this project, we take a perspectivist approach, guided by social science theory, to gathering annotations, estimating their quality, and aggregating them. |
Andrew M. Demetriou; Jaehun Kim; Sandy Manolios; Cynthia C. S. Liem; | arxiv-cs.CL | 2024-08-22 |
152 | Oh, Behave! Country Representation Dynamics Created By Feedback Loops in Music Recommender Systems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate the dynamics of representation of local (i.e., country-specific) and US-produced music in user profiles and recommendations. |
Oleg Lesota; Jonas Geiger; Max Walder; Dominik Kowald; Markus Schedl; | arxiv-cs.IR | 2024-08-21 |
153 | SentHYMNent: An Interpretable and Sentiment-Driven Model for Algorithmic Melody Harmonization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce two major novel elements: a nuanced mixture-based representation for musical sentiment, including a web tool to gather data, as well as a sentiment- and theory-driven harmonization model, SentHYMNent. |
STEPHEN HAHN et. al. | kdd | 2024-08-21 |
154 | Moûsai: Efficient Text-to-Music Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our work, we bridge text and music via a text-to-music generation model that is highly efficient, expressive, and can handle long-term structure. |
Flavio Schneider; Ojasv Kamal; Zhijing Jin; Bernhard Schölkopf; | acl | 2024-08-20 |
155 | Rage Music Classification and Analysis Using K-Nearest Neighbour, Random Forest, Support Vector Machine, Convolutional Neural Networks, and Gradient Boosting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We compare methods of classification in the application of audio analysis with machine learning and identify optimal models. |
Akul Kumar; | arxiv-cs.SD | 2024-08-20 |
156 | Text-to-Song: Towards Controllable Music Generation Incorporating Vocal and Accompaniment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel task called Text-to-Song synthesis which incorporates both vocal and accompaniment generation. |
ZHIQING HONG et. al. | acl | 2024-08-20 |
157 | Do Large Language Models Latently Perform Multi-Hop Reasoning? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study whether Large Language Models (LLMs) latently perform multi-hop reasoning with complex prompts such as "The mother of the singer of 'Superstition' is …". |
Sohee Yang; Elena Gribovskaya; Nora Kassner; Mor Geva; Sebastian Riedel; | acl | 2024-08-20 |
158 | DisMix: Disentangling Mixtures of Musical Instruments for Source-level Pitch and Timbre Manipulation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To fill the gap, we propose DisMix, a generative framework in which the pitch and timbre representations act as modular building blocks for constructing the melody and instrument of a source, and the collection of which forms a set of per-instrument latent representations underlying the observed mixture. |
YIN-JYUN LUO et. al. | arxiv-cs.SD | 2024-08-20 |
159 | Rhyme-aware Chinese Lyric Generator Based on GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To enhance the rhyming quality of generated lyrics, we incorporate integrated rhyme information into our model, thereby improving lyric generation performance. |
YIXIAO YUAN et. al. | arxiv-cs.CL | 2024-08-19 |
160 | The Evolution of Inharmonicity and Noisiness in Contemporary Popular Music Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we use modified MPEG-7 features to explore and characterise the evolution of noise and inharmonicity in popular music since 1961. |
Emmanuel Deruty; David Meredith; Stefan Lattner; | arxiv-cs.SD | 2024-08-15 |
161 | A New Dataset, Notation Software, and Representation for Computational Schenkerian Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With a larger corpus of Schenkerian data, it may be possible to infuse machine learning models with a deeper understanding of musical structure, thus leading to more human-like results. To encourage further research in Schenkerian analysis and its potential benefits for music informatics and generation, this paper presents three main contributions: 1) a new and growing dataset of SchAs, the largest in human- and computer-readable formats to date (>140 excerpts), 2) novel software for visualization and collection of SchA data, and 3) a novel, flexible representation of SchA as a heterogeneous-edge graph data structure. |
STEPHEN NI-HAHN et. al. | arxiv-cs.SD | 2024-08-13 |
162 | Tactile Melodies: A Desk-Mounted Haptics for Perceiving Musical Experiences Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a novel interface for experiencing music through haptic impulses to the palm of the hand. |
Raj Varshith Moora; Gowdham Prabhakar; | arxiv-cs.HC | 2024-08-12 |
163 | TEAdapter: Supply Abundant Guidance for Controllable Text-to-music Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Accordingly, we introduce the TEAcher Adapter (TEAdapter), a compact plugin designed to guide the generation process with diverse control information provided by users. |
JIALING ZOU et. al. | arxiv-cs.SD | 2024-08-09 |
164 | Quantifying The Corpus Bias Problem in Automatic Music Transcription Systems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We identify two primary sources of distribution shift: the music, and the sound. |
Lukáš Samuel Marták; Patricia Hu; Gerhard Widmer; | arxiv-cs.SD | 2024-08-08 |
165 | The Algorithmic Nature of Song-sequencing: Statistical Regularities in Music Albums Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on a review of anecdotal beliefs, we explored patterns of track-sequencing within professional music albums. |
Pedro Neto; Martin Hartmann; Geoff Luck; Petri Toiviainen; | arxiv-cs.MM | 2024-08-08 |
166 | Arrange, Inpaint, and Refine: Steerable Long-term Music Audio Generation and Editing Via Content-based Controls Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While Large Language Models (LLMs) have shown promise in generating high-quality music, their focus on autoregressive generation limits their utility in music editing tasks. To bridge this gap, we propose a novel approach leveraging a parameter-efficient heterogeneous adapter combined with a masking training scheme. |
Liwei Lin; Gus Xia; Yixiao Zhang; Junyan Jiang; | ijcai | 2024-08-03 |
167 | Re-creation of Creations: A New Paradigm for Lyric-to-Melody Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Current lyric-to-melody generation methods struggle with the lack of paired lyric-melody data to train, and the lack of adherence to composition guidelines, resulting in melodies that do not sound human-composed. To address these issues, we propose a novel paradigm called Re-creation of Creations (ROC) that combines the strengths of both rule-based and neural-based methods. |
Ang Lv; Xu Tan; Tao Qin; Tie-Yan Liu; Rui Yan; | ijcai | 2024-08-03 |
168 | InstructME: An Instruction Guided Music Edit Framework with Latent Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we develop InstructME, an Instruction guided Music Editing and remixing framework based on latent diffusion models. |
BING HAN et. al. | ijcai | 2024-08-03 |
169 | Retrieval Guided Music Captioning Via Multimodal Prefixes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we put forward a new approach to music captioning, the task of automatically generating natural language descriptions for songs. |
Nikita Srivatsan; Ke Chen; Shlomo Dubnov; Taylor Berg-Kirkpatrick; | ijcai | 2024-08-03 |
170 | MusicMagus: Zero-Shot Text-to-Music Editing Via Diffusion Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the task of editing such generated music remains a significant challenge. This paper introduces a novel approach to editing music generated by such models, enabling the modification of specific attributes, such as genre, mood, and instrument, while maintaining other aspects unchanged. |
YIXIAO ZHANG et. al. | ijcai | 2024-08-03 |
171 | MuChin: A Chinese Colloquial Description Benchmark for Evaluating Language Models in The Field of Music Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we present MuChin, the first open-source music description benchmark in Chinese colloquial language, designed to evaluate the performance of multimodal LLMs in understanding and describing music. |
ZIHAO WANG et. al. | ijcai | 2024-08-03 |
172 | Generating High-quality Symbolic Music Using Fine-grained Discriminators Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose to decouple the melody and rhythm from music, and design corresponding fine-grained discriminators to tackle the aforementioned issues. |
ZHEDONG ZHANG et. al. | arxiv-cs.SD | 2024-08-03 |
173 | Six Dragons Fly Again: Reviving 15th-Century Korean Court Music with Transformers and Novel Encoding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a project that revives two pieces of 15th-century Korean court music, Chihwapyeong and Chwipunghyeong, composed upon the poem Songs of the Dragon Flying to Heaven. |
DANBINAERIN HAN et. al. | arxiv-cs.SD | 2024-08-02 |
174 | Nested Music Transformer: Sequentially Decoding Compound Tokens in Symbolic Music and Audio Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the Nested Music Transformer (NMT), an architecture tailored for decoding compound tokens autoregressively, similar to processing flattened tokens, but with low memory usage. |
Jiwoo Ryu; Hao-Wen Dong; Jongmin Jung; Dasaem Jeong; | arxiv-cs.SD | 2024-08-02 |
175 | MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, their evaluation poses considerable challenges, and it remains unclear how to effectively assess their ability to correctly interpret music-related inputs with current methods. Motivated by this, we introduce MuChoMusic, a benchmark for evaluating music understanding in multimodal language models focused on audio. |
BENNO WECK et. al. | arxiv-cs.SD | 2024-08-02 |
176 | Music2P: A Multi-Modal AI-Driven Tool for Simplifying Album Cover Design Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our ultimate goal is to provide a tool that empowers musicians and producers, especially those with limited resources or expertise, to create compelling album covers. |
Joong Ho Choi; Geonyeong Choi; Ji-Eun Han; Wonjin Yang; Zhi-Qi Cheng; | arxiv-cs.MM | 2024-08-02 |
177 | PiCoGen2: Piano Cover Generation with Transfer Learning Approach and Weakly Aligned Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This would, however, result in the loss of piano information and accordingly cause inconsistencies between the original and remapped piano versions. To overcome this limitation, we propose a transfer learning approach that pre-trains our model on piano-only data and fine-tunes it on weakly-aligned paired data constructed without note remapping. |
Chih-Pin Tan; Hsin Ai; Yi-Hsin Chang; Shuen-Huei Guan; Yi-Hsuan Yang; | arxiv-cs.SD | 2024-08-02 |
178 | ChordSync: Conformer-Based Alignment of Chord Annotations to Music Audio Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce ChordSync, a novel conformer-based model designed to seamlessly align chord annotations with audio, eliminating the need for weak alignment. |
Andrea Poltronieri; Valentina Presutti; Martín Rocamora; | arxiv-cs.SD | 2024-08-01 |
179 | Towards Explainable and Interpretable Musical Difficulty Estimation: A Parameter-efficient Approach Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Through our baseline, we illustrate how building on top of past research can offer alternatives for music difficulty assessment which are explainable and interpretable. With this, we aim to promote a more effective communication between the Music Information Retrieval (MIR) community and the music education one. |
Pedro Ramoneda; Vsevolod Eremenko; Alexandre D’Hooge; Emilia Parada-Cabaleiro; Xavier Serra; | arxiv-cs.SD | 2024-08-01 |
180 | Can LLMs Reason in Music? An Evaluation of LLMs’ Capability of Music Understanding and Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent research has extended the application of large language models (LLMs) such as GPT-4 and Llama2 to the symbolic music domain including understanding and generation. Yet scant research explores the details of how these LLMs perform on advanced music understanding and conditioned generation, especially from the multi-step reasoning perspective, which is a critical aspect in the conditioned, editable, and interactive human-computer co-creation process. |
ZIYA ZHOU et. al. | arxiv-cs.SD | 2024-07-31 |
181 | PiCoGen: Generate Piano Covers with A Two-stage Approach Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Cover song generation stands out as a popular way of music making in the music-creative community. In this study, we introduce Piano Cover Generation (PiCoGen), a two-stage … |
Chih-Pin Tan; Shuen-Huei Guan; Yi-Hsuan Yang; | arxiv-cs.SD | 2024-07-30 |
182 | Emotion-driven Piano Music Generation Via Two-stage Disentanglement and Functional Representation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To further capture features that shape valence, an aspect less explored by previous approaches, we introduce a novel functional representation of symbolic music. |
Jingyue Huang; Ke Chen; Yi-Hsuan Yang; | arxiv-cs.SD | 2024-07-30 |
183 | Emotion-Driven Melody Harmonization Via Melodic Variation and Functional Representation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel functional representation for symbolic music. |
Jingyue Huang; Yi-Hsuan Yang; | arxiv-cs.SD | 2024-07-29 |
184 | Futga: Towards Fine-grained Music Understanding Through Temporally-enhanced Generative Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Existing music captioning methods are limited to generating concise global descriptions of short music clips, which fail to capture fine-grained musical characteristics and time-aware musical changes. To address these limitations, we propose FUTGA, a model equipped with fine-grained music understanding capabilities through learning from generative augmentation with temporal compositions. |
JUNDA WU et. al. | arxiv-cs.SD | 2024-07-29 |
185 | Start from Video-Music Retrieval: An Inter-Intra Modal Loss for Cross Modal Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Experiments also show that II loss improves various self-supervised and supervised uni-modal and cross-modal retrieval tasks, and can yield good retrieval models with a small number of training samples. |
ZEYU CHEN et. al. | arxiv-cs.MM | 2024-07-28 |
186 | Simulation of Neural Responses to Classical Music Using Organoid Intelligence Methods Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Hence, we present the PyOrganoid library, an innovative tool that facilitates the simulation of organoid learning models, integrating sophisticated machine learning techniques with biologically inspired organoid simulations. |
Daniel Szelogowski; | arxiv-cs.NE | 2024-07-25 |
187 | Dance2MIDI: Dance-driven Multi-instrument Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Bo Han; Yuheng Li; Yixuan Shen; Yi Ren; Feilin Han; | Comput. Vis. Media | 2024-07-24 |
188 | Audio Prompt Adapter: Unleashing Music Editing Abilities for Text-to-Music with Lightweight Finetuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, editing music audios remains challenging due to the conflicting desiderata of performing fine-grained alterations on the audio while maintaining a simple user interface. To address this challenge, we propose Audio Prompt Adapter (or AP-Adapter), a lightweight addition to pretrained text-to-music models. |
FANG-DUO TSAI et. al. | arxiv-cs.SD | 2024-07-23 |
189 | Towards Assessing Data Replication in Music Generation with Music Similarity Metrics on Raw Audio Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: A relevant discussion and related technical challenge is the potential replication and plagiarism of the training set in AI-generated music, which could lead to misuse of data and intellectual property rights violations. To tackle this issue, we present the Music Replication Assessment (MiRA) tool: a model-independent open evaluation method based on diverse audio music similarity metrics to assess data replication. |
Roser Batlle-Roca; Wei-Hisang Liao; Xavier Serra; Yuki Mitsufuji; Emilia Gómez; | arxiv-cs.SD | 2024-07-19 |
190 | Reducing Barriers to The Use of Marginalised Music Genres in AI Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Identified XAI opportunities included improving the transparency and control of AI models, explaining their ethics and bias, fine-tuning large models with small datasets to reduce bias, and explaining style-transfer opportunities with AI models. Participants in the research emphasised that whilst it is hard to work with small datasets such as marginalised music in AI, such approaches strengthen the cultural representation of underrepresented cultures and contribute to addressing bias in deep learning models. |
Nick Bryan-Kinns; Zijin Li; | arxiv-cs.SD | 2024-07-18 |
191 | GraphMuse: A Library for Symbolic Music Graph Processing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Graph Neural Networks (GNNs) have recently gained traction in symbolic music tasks, yet a lack of a unified framework impedes progress. Addressing this gap, we present GraphMuse, a graph processing framework and library that facilitates efficient music graph processing and GNN training for symbolic music tasks. |
Emmanouil Karystinaios; Gerhard Widmer; | arxiv-cs.SD | 2024-07-17 |
192 | Audio Conditioning for Music Generation Via Discrete Bottleneck Features Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For the second model we train a music language model from scratch jointly with a text conditioner and a quantized audio feature extractor. |
Simon Rouard; Yossi Adi; Jade Copet; Axel Roebel; Alexandre Défossez; | arxiv-cs.SD | 2024-07-17 |
193 | BandControlNet: Parallel Transformers-based Steerable Popular Music Generation with Fine-Grained Spatiotemporal Features Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Controllable music generation promotes the interaction between humans and composition systems by projecting the users’ intent on their desired music. The challenge of introducing … |
Jing Luo; Xinyu Yang; Dorien Herremans; | arxiv-cs.SD | 2024-07-15 |
194 | Popular Hooks: A Multimodal Dataset of Musical Hooks for Music Understanding and Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The Internet is rich in unimodal music data, available in either symbolic or audio representations. However, there is a notable scarcity of multimodal music datasets that offer … |
Xinda Wu; Jiaming Wang; Jiaxing Yu; Tieyao Zhang; Kejun Zhang; | 2024 IEEE International Conference on Multimedia and Expo … | 2024-07-15 |
195 | Striking The Right Chord: A Comprehensive Approach to Amazon Music Search Spell Correction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we build a multi-stage framework for spell correction solution for music, media and named entity heavy search engines. |
Siddharth Sharma; Shiyun Yang; Ajinkya Walimbe; Tarun Sharma; Joaquin Delgado; | sigir | 2024-07-14 |
196 | The Interpretation Gap in Text-to-Music Generation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a framework to describe the musical interaction process, which includes expression, interpretation, and execution of controls. |
Yongyi Zang; Yixiao Zhang; | arxiv-cs.SD | 2024-07-14 |
197 | A Preliminary Investigation on Flexible Singing Voice Synthesis Through Decomposed Framework with Inferrable Features Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As collecting large singing datasets labeled with music scores is an expensive task, we investigate an alternative approach by decomposing the SVS system and inferring different singing voice features. |
Lester Phillip Violeta; Taketo Akama; | arxiv-cs.SD | 2024-07-12 |
198 | Music Proofreading with RefinPaint: Where and How to Modify Compositions Given Context Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose RefinPaint, an iterative technique that improves the sampling process. |
Pedro Ramoneda; Martin Rocamora; Taketo Akama; | arxiv-cs.SD | 2024-07-12 |
199 | Adversarial-MidiBERT: Symbolic Music Understanding Model Based on Unbias Pre-training and Mask Fine-tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: It also has a significant influence on the performance of downstream tasks, which also happens in SMU. To address this challenge, we propose Adversarial-MidiBERT, a symbolic music understanding model based on Bidirectional Encoder Representations from Transformers (BERT). |
Zijian Zhao; | arxiv-cs.SD | 2024-07-11 |
200 | From Real to Cloned Singer Identification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As such, methods to identify the original singer in synthetic voices are needed. In this paper, we investigate how singer identification methods could be used for such a task. |
Dorian Desblancs; Gabriel Meseguer-Brocal; Romain Hennequin; Manuel Moussallam; | arxiv-cs.SD | 2024-07-11 |
201 | Music Genre Classification Using Contrastive Dissimilarity Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In the digital age, streaming platforms have revolutionized how we access and interact with music, highlighting the need for more intuitive ways to organize and categorize our … |
Gabriel Henrique Costanzi; Lucas O. Teixeira; G. Felipe; George D. C. Cavalcanti; Yandre M. G. Costa; | 2024 31st International Conference on Systems, Signals and … | 2024-07-09 |
202 | MelodyVis: Visual Analytics for Melodic Patterns in Sheet Music Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, we present MelodyVis, a visual application designed in collaboration with musicology experts to explore melodic patterns in digital sheet music. |
MATTHIAS MILLER et. al. | arxiv-cs.HC | 2024-07-07 |
203 | Exploring Real-Time Music-to-Image Systems for Creative Inspiration in Music Creation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a study on the use of a real-time music-to-image system as a mechanism to support and inspire musicians during their creative process. |
Meng Yang; Maria Teresa Llano; Jon McCormack; | arxiv-cs.HC | 2024-07-07 |
204 | Music Era Recognition Using Supervised Contrastive Learning and Artist Information Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We formulate the task as a music classification problem and propose solutions based on supervised contrastive learning. |
QIQI HE et. al. | arxiv-cs.SD | 2024-07-07 |
205 | MuseBarControl: Enhancing Fine-Grained Control in Symbolic Music Generation Through Pre-Training and Counterfactual Loss Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The model often fails to respond adequately to new, fine-grained bar-level control signals. To address this, we propose two innovative solutions. |
Yangyang Shu; Haiming Xu; Ziqin Zhou; Anton van den Hengel; Lingqiao Liu; | arxiv-cs.SD | 2024-07-05 |
206 | PAGURI: A User Experience Study of Creative Interaction with Text-to-music Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While they are unquestionably a showcase of technological progress, it is not clear yet how they can be realistically integrated into the artistic practice of musicians and music practitioners. This paper aims to address this question via Prompt Audio Generation User Research Investigation (PAGURI), a user experience study where we leverage recent text-to-music developments to study how musicians and practitioners interact with these systems, evaluating their satisfaction levels. |
Francesca Ronchini; Luca Comanducci; Gabriele Perego; Fabio Antonacci; | arxiv-cs.SD | 2024-07-05 |
207 | MUSIC-lite: Efficient MUSIC Using Approximate Computing: An OFDM Radar Case Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present MUSIC-lite, which exploits approximate computing to generate a design space exploring accuracy-area-power trade-offs. |
Rajat Bhattacharjya; Arnab Sarkar; Biswadip Maity; Nikil Dutt; | arxiv-cs.AR | 2024-07-05 |
208 | MuDiT & MuSiT: Alignment with Colloquial Expression in Description-to-Song Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel task of Colloquial Description-to-Song Generation, which focuses on aligning the generated content with colloquial human expressions. |
ZIHAO WANG et. al. | arxiv-cs.SD | 2024-07-03 |
209 | MelodyT5: A Unified Score-to-Score Transformer for Symbolic Music Processing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In the domain of symbolic music research, the progress of developing scalable systems has been notably hindered by the scarcity of available training data and the demand for models tailored to specific tasks. To address these issues, we propose MelodyT5, a novel unified framework that leverages an encoder-decoder architecture tailored for symbolic music processing in ABC notation. |
Shangda Wu; Yashan Wang; Xiaobing Li; Feng Yu; Maosong Sun; | arxiv-cs.SD | 2024-07-02 |
210 | Novice-Centered Application Design for Music Creation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In the field of music production, the integration of machine learning-based technologies, particularly in the form of intelligent digital audio workstations and smart plugins, has … |
Atsuya Kobayashi; Tetsuro Sato; Kei Tateno; | Companion Publication of the 2024 ACM Designing Interactive … | 2024-07-01 |
211 | Pictures Of MIDI: Controlled Music Generation Via Graphical Prompts for Image-Based Diffusion Inpainting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores a user-friendly graphical interface enabling the drawing of masked regions for inpainting by an Hourglass Diffusion Transformer (HDiT) model trained on MIDI piano roll images. |
Scott H. Hawley; | arxiv-cs.SD | 2024-07-01 |
212 | Harmonizing Tradition with Technology: Using AI in Traditional Music Preservation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Traditional music plays a unique role in preserving our history, connecting us to our roots, and fostering a sense of identity and continuity in a rapidly changing world. However, … |
Tiexin Yu; Xinxia Wang; Xu Xiao; Rongshan Yu; | 2024 International Joint Conference on Neural Networks … | 2024-06-30 |
213 | Subtractive Training for Music Stem Insertion Using Latent Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Subtractive Training, a simple and novel method for synthesizing individual musical instrument stems given other instruments as context. |
IVAN VILLA-RENTERIA et. al. | arxiv-cs.SD | 2024-06-27 |
214 | PianoBART: Symbolic Piano Music Generation and Understanding with Large-Scale Pre-Training Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Learning musical structures and composition patterns is necessary for both music generation and understanding, but current methods do not make uniform use of learned features to … |
XIAO LIANG et. al. | 2024 IEEE International Conference on Multimedia and Expo … | 2024-06-26 |
215 | Reflection Across AI-based Music Composition Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Reflection is fundamental to creative practice. However, the plurality of ways in which people reflect when using AI Generated Content (AIGC) is underexplored. This paper takes … |
COREY FORD et. al. | Proceedings of the 16th Conference on Creativity & Cognition | 2024-06-23 |
216 | Analysing The Effectiveness of Online Digital Audio Software and Offline Audio Studios in Fostering Chinese Folk Music Composition Skills in Music Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Object: This study was designed to compare the effectiveness of online digital audio software Logic Pro X and a university offline audio studio in terms of the perspective and … |
Xiaowei Lei; | J. Comput. Assist. Learn. | 2024-06-22 |
217 | The Music Maestro or The Musically Challenged, A Massive Music Evaluation Benchmark for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: With ZIQI-Eval, we aim to provide a standardized and robust evaluation framework that facilitates a comprehensive assessment of LLMs’ music-related abilities. |
JIAJIA LI et. al. | arxiv-cs.SD | 2024-06-22 |
218 | Mustango: Toward Controllable Text-to-Music Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Mustango: a music-domain-knowledge-inspired text-to-music system based on diffusion. |
JAN MELECHOVSKY et. al. | naacl | 2024-06-20 |
219 | Emotion-aware Personalized Music Recommendation with A Heterogeneity-aware Deep Bayesian Network Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this article, we propose four types of heterogeneity that an EMRS should account for: emotion heterogeneity across users, emotion heterogeneity within a user, music mood preference heterogeneity across users, and music mood preference heterogeneity within a user. |
ERKANG JING et. al. | arxiv-cs.AI | 2024-06-20 |
220 | JEN-1 DreamStyler: Customized Musical Concept Learning Via Pivotal Parameters Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel method for customized text-to-music generation, which can capture the concept from a two-minute reference music and generate a new piece of music conforming to the concept. |
Boyu Chen; Peike Li; Yao Yao; Alex Wang; | arxiv-cs.SD | 2024-06-18 |
221 | MusicScore: A Dataset for Music Score Modeling and Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose MusicScore, a large-scale music score dataset collected and processed from the International Music Score Library Project (IMSLP). |
Yuheng Lin; Zheqi Dai; Qiuqiang Kong; | arxiv-cs.MM | 2024-06-17 |
222 | A Bayesian Drift-Diffusion Model of Schachter-Singer’s Two Factor Theory of Emotion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we adopt the same Bayesian framework to model the emotion process in accordance with Schachter-Singer’s Two-Factor theory, which argues that emotion is the outcome of cognitive labeling or attribution of a diffuse pattern of autonomic arousal (Schachter & Singer, 1962). |
Lance Ying; Audrey Michal; Jun Zhang; | arxiv-cs.CE | 2024-06-16 |
223 | Joint Audio and Symbolic Conditioning for Temporally Controlled Text-to-Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present JASCO, a temporally controlled text-to-music generation model utilizing both symbolic and audio-based conditions. |
Or Tal; Alon Ziv; Itai Gat; Felix Kreuk; Yossi Adi; | arxiv-cs.SD | 2024-06-16 |
224 | Diff-BGM: A Diffusion Model for Video Background Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose to align the video and music sequentially by introducing a segment-aware cross-attention layer. |
Sizhe Li; Yiming Qin; Minghang Zheng; Xin Jin; Yang Liu; | cvpr | 2024-06-13 |
225 | Are We There Yet? A Brief Survey of Music Emotion Prediction Datasets, Models and Outstanding Challenges Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we provide a comprehensive overview of the available music-emotion datasets and discuss evaluation standards as well as competitions in the field. |
Jaeyong Kang; Dorien Herremans; | arxiv-cs.SD | 2024-06-13 |
226 | MeLFusion: Synthesizing Music from Image and Language Cues Using Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: MeLFusion is a text-to-music diffusion model with a novel "visual synapse" which effectively infuses the semantics from the visual modality into the generated music. To facilitate research in this area we introduce a new dataset MeLBench and propose a new evaluation metric IMSM. |
Sanjoy Chowdhury; Sayan Nag; K J Joseph; Balaji Vasan Srinivasan; Dinesh Manocha; | cvpr | 2024-06-13 |
227 | MuseChat: A Conversational Music Recommendation System for Videos IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Their inability to interact with users for further refinements or to provide explanations leads to a less satisfying experience. We address these issues with MuseChat, a first-of-its-kind dialogue-based recommendation system that personalizes music suggestions for videos. |
Zhikang Dong; Xiulong Liu; Bin Chen; Pawel Polak; Peng Zhang; | cvpr | 2024-06-13 |
228 | DITTO: Diffusion Inference-Time T-Optimization for Music Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Diffusion Inference-Time T-Optimization (DITTO), a general-purpose framework for controlling pre-trained text-to-music diffusion models at inference-time via optimizing initial noise latents. |
Zachary Novack; Julian McAuley; Taylor Berg-Kirkpatrick; Nicholas J. Bryan; | icml | 2024-06-12 |
229 | Flexible Music-Conditioned Dance Generation with Style Description Prompts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce Flexible Dance Generation with Style Description Prompts (DGSDP), a diffusion-based framework suitable for diversified tasks of dance generation by fully leveraging the semantics of music style. |
Hongsong Wang; Yin Zhu; Xin Geng; | arxiv-cs.CV | 2024-06-12 |
230 | Adaptive Accompaniment with ReaLchords Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose ReaLchords, an online generative model for improvising chord accompaniment to user melody. |
YUSONG WU et. al. | icml | 2024-06-12 |
231 | TokSing: Singing Voice Synthesis Based on Discrete Tokens Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce TokSing, a discrete-based SVS system equipped with a token formulator that offers flexible token blendings. |
YUNING WU et. al. | arxiv-cs.SD | 2024-06-12 |
232 | Emotion Manipulation Through Music — A Deep Learning Interactive Visual Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel way to manipulate the emotional content of a song using AI tools. |
Adel N. Abdalla; Jared Osborne; Razvan Andonie; | arxiv-cs.SD | 2024-06-12 |
233 | MusicFlow: Cascaded Flow Matching for Text Guided Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce MusicFlow, a cascaded text-to-music generation model based on flow matching. |
K R PRAJWAL et. al. | icml | 2024-06-12 |
234 | LLark: A Multimodal Instruction-Following Language Model for Music Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present LLark, an instruction-tuned multimodal model for music understanding. |
Joshua P Gardner; Simon Durand; Daniel Stoller; Rachel M Bittner; | icml | 2024-06-12 |
235 | Zero-Shot Unsupervised and Text-Based Audio Editing Using DDPM Inversion IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore two zero-shot editing techniques for audio signals, which use DDPM inversion with pre-trained diffusion models. |
Hila Manor; Tomer Michaeli; | icml | 2024-06-12 |
236 | MOSA: Music Motion with Semantic Annotation Dataset for Cross-Modal Music Processing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present the MOSA (Music mOtion with Semantic Annotation) dataset, which contains high quality 3-D motion capture data, aligned audio recordings, and note-by-note semantic annotations of pitch, beat, phrase, dynamic, articulation, and harmony for 742 professional music performances by 23 professional musicians, comprising more than 30 hours and 570 K notes of data. |
YU-FEN HUANG et. al. | arxiv-cs.SD | 2024-06-10 |
237 | VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we systematically study music generation conditioned solely on the video. |
ZEYUE TIAN et. al. | arxiv-cs.CV | 2024-06-06 |
238 | Innovations in Cover Song Detection: A Lyrics-Based Approach Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Cover songs are alternate versions of a song by a different artist. Long being a vital part of the music industry, cover songs significantly influence music culture and are … |
Maximilian Balluff; Peter Mandl; Christian Wolff; | ArXiv | 2024-06-06 |
239 | Negative Feedback for Music Personalization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show the benefits of using real negative feedback both as inputs into the user sequence and also as negative targets for training a next-song recommender system for internet radio. |
M. Jeffrey Mei; Oliver Bembom; Andreas F. Ehmann; | arxiv-cs.LG | 2024-06-06 |
240 | STraDa: A Singer Traits Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The annotated-strada consists of two hundred tracks that are balanced in terms of 2 genders, 5 languages, and 4 age groups. To show its use for model training and bias analysis thanks to its metadata’s richness and downloadable audio files, we benchmarked singer sex classification (SSC) and conducted bias analysis. |
Yuexuan Kong; Viet-Anh Tran; Romain Hennequin; | arxiv-cs.SD | 2024-06-06 |
241 | Intelligent Text-Conditioned Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this project, we apply a similar approach to bridge the gap between natural language and music. |
Zhouyao Xie; Nikhil Yadala; Xinyi Chen; Jing Xi Liu; | arxiv-cs.MM | 2024-06-02 |
242 | DITTO-2: Distilled Diffusion Inference-Time T-Optimization for Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Distilled Diffusion Inference-Time T-Optimization (or DITTO-2), a new method to speed up inference-time optimization-based control and unlock faster-than-real-time generation for a wide variety of applications such as music inpainting, outpainting, intensity, melody, and musical structure control. |
Zachary Novack; Julian McAuley; Taylor Berg-Kirkpatrick; Nicholas Bryan; | arxiv-cs.SD | 2024-05-30 |
243 | May The Dance Be with You: Dance Generation Framework for Non-Humanoids Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: If an agent can recognize the relationship between visual rhythm and music, it will be able to dance by generating a motion to create a visual rhythm that matches the music. Based on this, we propose a framework for any kind of non-humanoid agents to learn how to dance from human videos. |
Hyemin Ahn; | arxiv-cs.CV | 2024-05-30 |
244 | CoDancers: Music-Driven Coherent Group Dance Generation with Choreographic Unit Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Dance and music are intimately interconnected, with group dance being a crucial part of dance artistry. Consequently, Music-Driven Group Dance Generation has been a fundamental … |
KAIXING YANG et. al. | Proceedings of the 2024 International Conference on … | 2024-05-30 |
245 | Socially-Motivated Music Recommendation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Extensive literature spanning psychology, sociology, and musicology has sought to understand the motivations for why people listen to music, including both individually and … |
Benjamin Lacker; Samuel F. Way; | International Conference on Web and Social Media | 2024-05-28 |
246 | Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models Via Instruction Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Previous approaches in this domain have been constrained by the necessity to train specific editing models from scratch, which is both resource-intensive and inefficient; other research uses large language models to predict edited music, resulting in imprecise audio reconstruction. To combine the strengths and address these limitations, we introduce Instruct-MusicGen, a novel approach that finetunes a pretrained MusicGen model to efficiently follow editing instructions such as adding, removing, or separating stems. |
YIXIAO ZHANG et. al. | arxiv-cs.SD | 2024-05-28 |
247 | Enhancing Music Genre Classification Through Multi-Algorithm Analysis and User-Friendly Visualization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The aim of this study is to teach an algorithm how to recognize different types of music. |
Navin Kamuni; Dheerendra Panwar; | arxiv-cs.SD | 2024-05-27 |
248 | Quality-aware Masked Diffusion Transformer for Enhanced Music Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In recent years, diffusion-based text-to-music (TTM) generation has gained prominence, offering an innovative approach to synthesizing musical content from textual descriptions. … |
CHANG LI et. al. | ArXiv | 2024-05-24 |
249 | QA-MDT: Quality-aware Masked Diffusion Transformer for Enhanced Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In open-source datasets, issues such as low-quality music waveforms, mislabeling, weak labeling, and unlabeled data significantly hinder the development of music generation models. To address these challenges, we propose a novel paradigm for high-quality music generation that incorporates a quality-aware training strategy, enabling generative models to discern the quality of input music waveforms during training. |
CHANG LI et. al. | arxiv-cs.SD | 2024-05-24 |
250 | SearchLVLMs: A Plug-and-Play Framework for Augmenting Large Vision-Language Models By Searching Up-to-Date Internet Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a plug-and-play framework, for augmenting existing LVLMs in handling visual question answering (VQA) about up-to-date knowledge, dubbed SearchLVLMs. |
CHUANHAO LI et. al. | arxiv-cs.CV | 2024-05-23 |
251 | Music Genre Classification: Training An AI Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research I explore various machine learning algorithms for the purpose of music genre classification, using features extracted from audio signals. The systems are a Multilayer Perceptron (built from scratch), a k-Nearest Neighbours classifier (also built from scratch), a Convolutional Neural Network, and lastly a Random Forest wide model. |
Keoikantse Mogonediwa; | arxiv-cs.SD | 2024-05-23 |
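For readers who want to reproduce the classical side of such comparisons, a minimal baseline pipeline (MFCC statistics followed by k-NN and random-forest classifiers) might look like the sketch below; `files` and `labels` are placeholders for the reader's own dataset, and this is not the study's code.

```python
# Minimal genre-classification baseline: MFCC statistics + two classic
# classifiers. `files`/`labels` are placeholders for the reader's dataset.
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def mfcc_stats(path, n_mfcc=20):
    y, sr = librosa.load(path, mono=True, duration=30.0)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    # Summarize each coefficient over time with mean and standard deviation.
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

def evaluate(files, labels):
    X = np.stack([mfcc_stats(f) for f in files])
    for clf in (KNeighborsClassifier(n_neighbors=5),
                RandomForestClassifier(n_estimators=200)):
        scores = cross_val_score(clf, X, labels, cv=5)
        print(type(clf).__name__, scores.mean())
```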
252 | The Rarity of Musical Audio Signals Within The Space of Possible Audio Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: A white noise signal can access any possible configuration of values, though statistically over many samples tends to a uniform spectral distribution, and is highly unlikely to … |
Nick Collins; | arxiv-cs.SD | 2024-05-23 |
253 | A Dataset and Baselines for Measuring and Predicting The Music Piece Memorability Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite the abundance of music, certain pieces remain more memorable and often gain greater popularity. Inspired by this phenomenon, we focus on measuring and predicting music memorability. |
Li-Yang Tseng; Tzu-Ling Lin; Hong-Han Shuai; Jen-Wei Huang; Wen-Whei Chang; | arxiv-cs.IR | 2024-05-21 |
254 | SYMPLEX: Controllable Symbolic Music Generation Using Simplex Diffusion with Vocabulary Priors Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a new approach for fast and controllable generation of symbolic music based on the simplex diffusion, which is essentially a diffusion process operating on probabilities rather than the signal space. |
Nicolas Jonason; Luca Casini; Bob L. T. Sturm; | arxiv-cs.SD | 2024-05-21 |
255 | What Makes A Viral Song? Unraveling Music Virality Factors Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The viral phenomenon is present in several contexts, combining the advantages of streaming platforms and other social networks. Music is no exception. Viral songs are widely … |
Gabriel P. Oliveira; Ana Paula Couto da Silva; Mirella M. Moro; | Proceedings of the 16th ACM Web Science Conference | 2024-05-21 |
256 | A Genre-Based Analysis of New Music Streaming at Scale Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The rise of on-demand music streaming platforms and novel recommendation algorithms have brought a transformative shift in music listening, where users have an effectively endless … |
Julie Jiang; Aditya Ponnada; Ang Li; Benjamin Lacker; Samuel F. Way; | Proceedings of the 16th ACM Web Science Conference | 2024-05-21 |
257 | End-to-End Full-Page Optical Music Recognition for Pianoform Sheet Music Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present the first truly end-to-end approach for page-level OMR. |
Antonio Ríos-Vila; Jorge Calvo-Zaragoza; David Rizo; Thierry Paquet; | arxiv-cs.CV | 2024-05-20 |
258 | Whole-Song Hierarchical Generation of Symbolic Music Using Cascaded Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we make the first attempt to model a full music piece under the realization of compositional hierarchy. |
Ziyu Wang; Lejun Min; Gus Xia; | arxiv-cs.SD | 2024-05-16 |
259 | MVBIND: Self-Supervised Music Recommendation For Videos Via Embedding Space Binding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces MVBind, an innovative Music-Video embedding space Binding model for cross-modal retrieval. |
Jiajie Teng; Huiyu Duan; Yucheng Zhu; Sijing Wu; Guangtao Zhai; | arxiv-cs.MM | 2024-05-15 |
260 | SMUG-Explain: A Framework for Symbolic Music Graph Explanations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present Score MUsic Graph (SMUG)-Explain, a framework for generating and visualizing explanations of graph neural networks applied to arbitrary prediction tasks on musical scores. |
Emmanouil Karystinaios; Francesco Foscarin; Gerhard Widmer; | arxiv-cs.SD | 2024-05-15 |
261 | Dance Any Beat: Blending Beats with Visuals in Dance Video Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These methods also require precise keypoint annotations, complicating data collection and limiting the use of self-collected video datasets. To overcome these challenges, we introduce a novel task: generating dance videos directly from images of individuals guided by music. |
Xuanchen Wang; Heng Wang; Dongnan Liu; Weidong Cai; | arxiv-cs.CV | 2024-05-15 |
262 | Naturalistic Music Decoding from EEG Data Via Latent Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we explore the potential of using latent diffusion models, a family of powerful generative models, for the task of reconstructing naturalistic music from electroencephalogram (EEG) recordings. |
EMILIAN POSTOLACHE et. al. | arxiv-cs.SD | 2024-05-14 |
263 | Changing Your Tune: Lessons for Using Music to Encourage Physical Activity Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Our research investigated whether music can communicate physical activity levels in daily life. Past studies have shown that simple musical tunes can provide wellness information, … |
Matthew Clark; Afsaneh Doryab; | Proceedings of the ACM on Interactive, Mobile, Wearable and … | 2024-05-13 |
264 | Modeling User Attention in Music Recommendation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: With the popularity of online music services, personalized music recommendation has garnered much research interest. Recommendation models are typically trained on datasets … |
SUNHAO DAI et. al. | 2024 IEEE 40th International Conference on Data Engineering … | 2024-05-13 |
265 | MARingBA: Music-Adaptive Ringtones for Blended Audio Notification Delivery Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Audio notifications provide users with an efficient way to access information beyond their current focus of attention. Current notification delivery methods, like phone ringtones, … |
Alexander Wang; Yi Fei Cheng; David Lindlbauer; | Proceedings of the CHI Conference on Human Factors in … | 2024-05-11 |
266 | A Way for Deaf and Hard of Hearing People to Enjoy Music By Exploring and Customizing Cross-modal Music Concepts Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Deaf and hard of hearing (DHH) people enjoy music and access it using a music-sensory substitution system that delivers sound together with the corresponding visual and tactile … |
Youjin Choi; Junryeol Jeon; ChungHa Lee; Yeongeun Noh; Jin-Hyuk Hong; | Proceedings of the CHI Conference on Human Factors in … | 2024-05-11 |
267 | Towards An Accessible and Rapidly Trainable Rhythm Sequencer Using A Generative Stacked Autoencoder Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes the integration of generative stacked autoencoder structures for rhythm generation, within a conventional melodic step-sequencer. |
Alex Wastnidge; | arxiv-cs.SD | 2024-05-11 |
268 | Waves Push Me to Slumberland: Reducing Pre-Sleep Stress Through Spatio-Temporal Tactile Displaying of Music Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Despite the fact that spatio-temporal patterns of vibration, characterized as rhythmic compositions of tactile content, have exhibited an ability to elicit specific emotional … |
HUI ZHANG et. al. | Proceedings of the CHI Conference on Human Factors in … | 2024-05-11 |
269 | Music Emotion Prediction Using Recurrent Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study explores the application of recurrent neural networks to recognize emotions conveyed in music, aiming to enhance music recommendation systems and support therapeutic interventions by tailoring music to fit listeners’ emotional states. |
Xinyu Chang; Xiangyu Zhang; Haoruo Zhang; Yulu Ran; | arxiv-cs.SD | 2024-05-10 |
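A minimal sketch of the general idea (not the study's architecture): an LSTM consumes mel-spectrogram frames and regresses valence and arousal.

```python
# Illustrative only: a small LSTM that regresses valence/arousal
# from a sequence of mel-spectrogram frames.
import torch
import torch.nn as nn

class EmotionRNN(nn.Module):
    def __init__(self, n_mels=64, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(n_mels, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, 2)  # valence, arousal

    def forward(self, mel):              # mel: (batch, frames, n_mels)
        out, _ = self.lstm(mel)
        return self.head(out[:, -1])     # predict from the final frame state

model = EmotionRNN()
mel = torch.randn(4, 300, 64)            # 4 clips, 300 frames each
target = torch.rand(4, 2)                # annotated valence/arousal in [0, 1]
loss = nn.MSELoss()(model(mel), target)
loss.backward()
```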
270 | Encoder-Decoder Framework for Interactive Free Verse Generation with Controllable High-Quality Rhyming Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel fine-tuning approach that prepends the rhyming word at the start of each lyric, which allows the critical rhyming decision to be made before the model commits to the content of the lyric (as during reverse language modeling), but maintains compatibility with the word order of regular PLMs as the lyric itself is still generated in left-to-right order. |
TOMMASO PASINI et. al. | arxiv-cs.CL | 2024-05-08 |
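The central preprocessing idea, prepending each lyric line's final rhyming word as a control prefix, can be illustrated with a toy function; the `<rhyme>` tag format below is an assumption for illustration, not the paper's exact scheme.

```python
# Toy version of the "prepend the rhyming word" preprocessing idea:
# the last word of each lyric line is moved to the front as a control tag,
# so the model decides the rhyme before generating the line left-to-right.
def prepend_rhyme_words(lyric_lines, tag="<rhyme>"):
    processed = []
    for line in lyric_lines:
        words = line.strip().split()
        if not words:
            continue
        processed.append(f"{tag} {words[-1]} {tag} {' '.join(words)}")
    return processed

print(prepend_rhyme_words(["the night is young", "a song unsung"]))
# ['<rhyme> young <rhyme> the night is young',
#  '<rhyme> unsung <rhyme> a song unsung']
```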
271 | Mozart’s Touch: A Lightweight Multi-modal Music Generation Framework Based on Pre-Trained Large Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, current models for image- and video-to-music synthesis struggle to capture the nuanced emotions and atmosphere conveyed by visual content. To fill this gap, we propose Mozart’s Touch, a multi-modal music generation framework capable of generating music aligned with cross-modal inputs such as images, videos, and text. |
Jiajun Li; Tianze Xu; Xuesong Chen; Xinrui Yao; Shuchang Liu; | arxiv-cs.SD | 2024-05-04 |
272 | DialectDecoder: Human/machine Teaming for Bird Song Classification and Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts View |
BRITTANY STORY et. al. | Ecol. Informatics | 2024-05-01 |
273 | ComposerX: Multi-Agent Symbolic Music Composition with LLMs IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To further explore and enhance LLMs’ potential in music composition by leveraging their reasoning ability and the large knowledge base in music history and theory, we propose ComposerX, an agent-based symbolic music generation framework. |
QIXIN DENG et. al. | arxiv-cs.SD | 2024-04-28 |
274 | COCOLA: Coherence-Oriented Contrastive Learning of Musical Audio Representations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present COCOLA (Coherence-Oriented Contrastive Learning for Audio), a contrastive learning method for musical audio representations that captures the harmonic and rhythmic coherence between samples. |
RUBEN CIRANNI et. al. | arxiv-cs.SD | 2024-04-25 |
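COCOLA's exact objective is described in the paper; as background, a generic InfoNCE contrastive loss over paired embeddings (e.g., coherent stems taken from the same song region) looks like the following sketch.

```python
# Generic InfoNCE objective of the kind contrastive audio-representation
# methods build on; this is not COCOLA's exact formulation.
import torch
import torch.nn.functional as F

def info_nce(z_a, z_b, temperature=0.1):
    # z_a, z_b: (batch, dim) embeddings of two coherent views;
    # matching rows are the positive pairs, all other rows are negatives.
    z_a = F.normalize(z_a, dim=-1)
    z_b = F.normalize(z_b, dim=-1)
    logits = z_a @ z_b.t() / temperature          # (batch, batch) similarities
    targets = torch.arange(z_a.size(0), device=z_a.device)
    return F.cross_entropy(logits, targets)

loss = info_nce(torch.randn(16, 128), torch.randn(16, 128))
```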
275 | Music Style Transfer With Diffusion Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The existing music style transfer methods generate spectrograms with artifacts, leading to significant noise in the generated audio. To address these issues, this study proposes a music style transfer framework based on diffusion models (DM) and uses spectrogram-based methods to achieve multi-to-multi music style transfer. |
Hong Huang; Yuyi Wang; Luyao Li; Jun Lin; | arxiv-cs.SD | 2024-04-23 |
276 | Musical Word Embedding for Music Tagging and Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, in the domain of music, the word embedding may have difficulty understanding musical contexts or recognizing music-related entities like artists and tracks. To address this issue, we propose a new approach called Musical Word Embedding (MWE), which involves learning from various types of texts, including both everyday and music-related vocabulary. |
SeungHeon Doh; Jongpil Lee; Dasaem Jeong; Juhan Nam; | arxiv-cs.SD | 2024-04-21 |
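A minimal sketch of the underlying idea, training word2vec on a mixture of everyday and music-related text so that music entities share an embedding space with general vocabulary; the tiny corpora are placeholders and this is not the authors' training setup.

```python
# Sketch: word2vec trained on a mixture of general and music-domain text,
# so music entities (artists, tags) share a space with everyday words.
from gensim.models import Word2Vec

general_corpus = [["the", "weather", "is", "calm", "and", "quiet"]]
music_corpus = [["miles", "davis", "plays", "modal", "jazz", "trumpet"]]

model = Word2Vec(
    sentences=general_corpus + music_corpus,
    vector_size=100, window=5, min_count=1, epochs=50,
)
print(model.wv.most_similar("jazz", topn=3))
```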
277 | Music Consistency Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nevertheless, the application of consistency models in music generation remains largely unexplored. To address this gap, we present Music Consistency Models (\texttt{MusicCM}), which leverages the concept of consistency models to efficiently synthesize mel-spectrogram for music clips, maintaining high quality while minimizing the number of sampling steps. |
Zhengcong Fei; Mingyuan Fan; Junshi Huang; | arxiv-cs.SD | 2024-04-20 |
278 | Track Role Prediction of Single-Instrumental Sequences Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces a deep learning model designed to automatically predict the track-role of single-instrumental music sequences. |
Changheon Han; Suhyun Lee; Minsam Ko; | arxiv-cs.SD | 2024-04-20 |
279 | Large Language Models: From Notes to Musical Form Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Adapting a recent music generation model, this paper proposes a novel method to generate music with form. |
Lilac Atassi; | arxiv-cs.SD | 2024-04-18 |
280 | MIDGET: Music Conditioned 3D Dance Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a MusIc conditioned 3D Dance GEneraTion model, named MIDGET, based on a Dance motion Vector Quantised Variational AutoEncoder (VQ-VAE) model and a Motion Generative Pre-Training (GPT) model, to generate vibrant and high-quality dances that match the music rhythm. |
Jinwu Wang; Wei Mao; Miaomiao Liu; | arxiv-cs.SD | 2024-04-18 |
281 | Violin Music Emotion Recognition with Fusion of CNN-BiGRU and Attention Mechanism Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Music emotion recognition has garnered significant interest in recent years, as the emotions expressed through music can profoundly enhance our understanding of its deeper … |
Sihan Ma; Ruohua Zhou; | Inf. | 2024-04-16 |
282 | Long-form Music Generation with Latent Diffusion IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We show that by training a generative model on long temporal contexts it is possible to produce long-form music of up to 4m45s. |
ZACH EVANS et. al. | arxiv-cs.SD | 2024-04-16 |
283 | Using Tangible Interaction to Design Musicking Artifacts for Non-musicians Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a Research through Design exploration of the potential for using tangible interactions to enable active music experiences – musicking – for non-musicians. |
Lucía Montesinos; Halfdan Hauch Jensen; Anders Sundnes Løvlie; | arxiv-cs.HC | 2024-04-15 |
284 | MIR-MLPop: A Multilingual Pop Music Dataset with Time-Aligned Lyrics and Audio Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We introduce MIR-MLPop, a publicly available multilingual pop music dataset designed for automatic lyrics transcription and lyrics alignment in polyphonic music. The dataset … |
J. Wang; Chung-Che Wang; Chon-In Leong; Jyh-Shing Roger Jang; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
285 | Girls Rocking The Code: Gender-dependent Stereotypes, Engagement & Comprehension in Music Programming Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: One of the greatest challenges in early programming education is to achieve learning success while also creating initial interest. This is particularly difficult for girls, who … |
Isabella Graßl; Gordon Fraser; | 2024 IEEE/ACM 46th International Conference on Software … | 2024-04-14 |
286 | A Scalable Sparse Transformer Model for Singing Melody Extraction Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Extracting the melody of a singing voice is an essential task within the realm of music information retrieval (MIR). Recently, transformer based models have drawn great attention … |
Shuai Yu; Jun Liu; Yi Yu; Wei Li; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
287 | GPT-4 Driven Cinematic Music Generation Through Text Processing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper presents Herrmann-11, a multimodal framework to generate background music tailored to movie scenes, by integrating state-of-the-art vision, language, music, and speech … |
Muhammad Taimoor Haseeb; Ahmad Hammoudeh; Gus G. Xia; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
288 | MuPT: A Generative Symbolic Music Pretrained Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore the application of Large Language Models (LLMs) to the pre-training of music. |
XINGWEI QU et. al. | arxiv-cs.SD | 2024-04-09 |
289 | Exploring Diverse Sounds: Identifying Outliers in A Music Corpus Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore music outliers, investigating their potential usefulness for music discovery and recommendation systems. |
Le Cai; Sam Ferguson; Gengfa Fang; Hani Alshamrani; | arxiv-cs.SD | 2024-04-09 |
290 | The NES Video-Music Database: A Dataset of Symbolic Video Game Music Paired with Gameplay Videos Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Neural models are one of the most popular approaches for music generation, yet there aren’t standard large datasets tailored for learning music directly from game data. To address this research gap, we introduce a novel dataset named NES-VMDB, containing 98,940 gameplay videos from 389 NES games, each paired with its original soundtrack in symbolic format (MIDI). |
Igor Cardoso; Rubens O. Moraes; Lucas N. Ferreira; | arxiv-cs.SD | 2024-04-05 |
291 | A Computational Analysis of Lyric Similarity Perception Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, many of these systems do not fully consider human perceptions of lyric similarity, primarily due to limited research in this area. To bridge this gap, we conducted a comparative analysis of computational methods for modeling lyric similarity with human perception. |
Haven Kim; Taketo Akama; | arxiv-cs.CL | 2024-04-02 |
292 | Practical End-to-End Optical Music Recognition for Pianoform Music Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: (b) We create a dev and test set for benchmarking typeset OMR with MusicXML ground truth based on the OpenScore Lieder corpus. |
Jiří Mayer; Milan Straka; Jan Hajič jr.; Pavel Pecina; | arxiv-cs.CV | 2024-03-20 |
293 | Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose Prompt-Singer, the first SVS method that enables attribute controlling on singer gender, vocal range and volume with natural language. |
YONGQI WANG et. al. | arxiv-cs.SD | 2024-03-18 |
294 | Using Multimodal Learning Analytics to Examine Learners’ Responses to Different Types of Background Music During Reading Comprehension Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Previous studies have evaluated the affordances and challenges of performing cognitively demanding learning tasks with background music (BGM), yet the effects of various types of … |
Ying Que; J. T. D. Ng; Xiao Hu; Mitchell Kam Fai Mak; Peony Tsz Yan Yip; | Proceedings of the 14th Learning Analytics and Knowledge … | 2024-03-18 |
295 | QEAN: Quaternion-Enhanced Attention Network for Visual Dance Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a Quaternion-Enhanced Attention Network (QEAN) for visual dance synthesis from a quaternion perspective, which consists of a Spin Position Embedding (SPE) module and a Quaternion Rotary Attention (QRA) module. |
ZHIZHEN ZHOU et. al. | arxiv-cs.GR | 2024-03-18 |
296 | CoPlay: Audio-agnostic Cognitive Scaling for Acoustic Sensing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we design a deep learning model and test it on common types of sensing signals (sine wave or Frequency Modulated Continuous Wave FMCW) as inputs with various agnostic concurrent music and speech. |
Yin Li; Rajalakshmi Nanadakumar; | arxiv-cs.SD | 2024-03-15 |
297 | Influencing Factors and Modeling Methods of Vocal Music Teaching Quality Supported By Artificial Intelligence Technology Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In order to explore the maturity of online concerts and the digital content of music resources, this article analyzes the role of artificial intelligence in music education, … |
Yang Yuan; | Int. J. Web Based Learn. Teach. Technol. | 2024-03-13 |
298 | Application-Oriented Talents Training for Music Majors in Colleges and Universities Based on Internet Remote Technology Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper mainly studies the cultivation of applied talents of music majors in colleges and universities based on internet remote technology. By analyzing the definition and … |
Lin Shui; Yuan Feng; Mengting Zhong; Yuanhui Qin; | Int. J. Web Based Learn. Teach. Technol. | 2024-03-12 |
299 | SiTunes: A Situational Music Recommendation Dataset with Physiological and Psychological Signals Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: With an increasing number of music tracks available online, music recommender systems have become popular and ubiquitous. Previous research indicates that people’s preferences, … |
VADIM GRIGOREV et. al. | Proceedings of the 2024 Conference on Human Information … | 2024-03-10 |
300 | Can Audio Reveal Music Performance Difficulty? Insights from The Piano Syllabus Dataset Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Automatically estimating the performance difficulty of a music piece represents a key process in music education to create tailored curricula according to the individual needs of … |
Pedro Ramoneda; Minhee Lee; Dasaem Jeong; J. J. Valero-Mas; Xavier Serra; | arxiv-cs.SD | 2024-03-06 |
301 | Interactive Melody Generation System for Enhancing The Creativity of Musicians Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes a system designed to emulate the process of collaborative composition among humans, using automatic music composition technology. |
So Hirawata; Noriko Otani; | arxiv-cs.SD | 2024-03-05 |
302 | Optimized Multiscale Deep Bidirectional Gated Recurrent Neural Network Fostered Practical Teaching of University Music Course Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Music education has a rich historical background. Nevertheless, the introduction of modern teaching methods is relatively delayed. In recent years, there has been a remarkable … |
Yuanyuan Hu; | J. Intell. Fuzzy Syst. | 2024-02-28 |
303 | Natural Language Processing Methods for Symbolic Music Generation and Information Retrieval: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Such models, possibly originally developed for text and adapted for symbolic music, are trained on various tasks. We describe these models, in particular deep learning models, through different prisms, highlighting music-specialized mechanisms. |
Dinh-Viet-Toan Le; Louis Bigo; Mikaela Keller; Dorien Herremans; | arxiv-cs.IR | 2024-02-27 |
304 | SongComposer: A Large Language Model for Lyric and Melody Composition in Song Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present SongComposer, an innovative LLM designed for song composition. |
SHUANGRUI DING et. al. | arxiv-cs.SD | 2024-02-27 |
305 | Classical Music Education in China: The Effectiveness of The WeChat Social Media Platform and Its Impact on The Communicative and Cognitive Skills of Music Students Related Papers Related Patents Related Grants Related Venues Related Experts View |
Tao Chen; | Educ. Inf. Technol. | 2024-02-27 |
306 | Singer Identification Model Using Data Augmentation and Enhanced Feature Conversion with Hybrid Feature Vector and Machine Learning Related Papers Related Patents Related Grants Related Venues Related Experts View |
Serhat Hizlisoy; R. Arslan; E. Çolakoğlu; | EURASIP J. Audio Speech Music. Process. | 2024-02-26 |
307 | Do Large Language Models Latently Perform Multi-Hop Reasoning? IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We study whether Large Language Models (LLMs) latently perform multi-hop reasoning with complex prompts such asThe mother of the singer of ‘Superstition’ is. We look for … |
Sohee Yang; E. Gribovskaya; Nora Kassner; Mor Geva; Sebastian Riedel; | ArXiv | 2024-02-26 |
308 | ChatMusician: Understanding and Generating Music Intrinsically with LLM IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce ChatMusician, an open-source LLM that integrates intrinsic musical abilities. |
RUIBIN YUAN et. al. | arxiv-cs.SD | 2024-02-25 |
309 | A Survey of Music Generation in The Context of Interaction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This article presents a thorough review of music representation, feature analysis, heuristic algorithms, statistical and parametric modelling, and human and automatic evaluation measures, along with a discussion of which approaches and models seem most suitable for live interaction. |
ISMAEL AGCHAR et. al. | arxiv-cs.SD | 2024-02-23 |
310 | ByteComposer: A Human-like Melody Composition Method Based on Language Model Agent Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, how to design a human-aligned and interpretable melody composition system is still under-explored. To solve this problem, we propose ByteComposer, an agent framework emulating a human’s creative pipeline in four separate steps: Conception Analysis – Draft Composition – Self-Evaluation and Modification – Aesthetic Selection. |
XIA LIANG et. al. | arxiv-cs.SD | 2024-02-23 |
311 | Understanding Human-AI Collaboration in Music Therapy Through Co-Design with Therapists Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We presented the co-design outcomes involving the integration of musical AIs into a music therapy process, which was developed from a theoretical framework rooted in emotion-focused therapy. |
Jingjing Sun; Jingyi Yang; Guyue Zhou; Yucheng Jin; Jiangtao Gong; | arxiv-cs.HC | 2024-02-22 |
312 | Below 58 BPM, Involving Real-time Monitoring and Self-medication Practices in Music Performance Through IoT Technology Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The project presented in this paper illustrates the design process for the development of an IoT system that monitors a specific bio-metric parameter (heart rate) in real time and … |
Nicolò Merendino; Antonio Rodà; Raul Masu; | Frontiers Comput. Sci. | 2024-02-21 |
313 | Music Style Transfer with Time-Varying Inversion of Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a music style transfer approach that effectively captures musical attributes using minimal data. |
SIFEI LI et. al. | arxiv-cs.SD | 2024-02-21 |
314 | MCSSME: Multi-Task Contrastive Learning for Semi-supervised Singing Melody Extraction from Polyphonic Music Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, to deal with data scarcity limitation, we propose a self-consistency regularization (SCR) method to train the model on the unlabeled data. |
Shuai Yu; | aaai | 2024-02-20 |
315 | MusER: Musical Element-Based Regularization for Generating Symbolic Music with Emotion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, prior research on deep learning-based emotional music generation has rarely explored the contribution of different musical elements to emotions, let alone the deliberate manipulation of these elements to alter the emotion of music, which is not conducive to fine-grained element-level control over emotions. To address this gap, we present a novel approach employing musical element-based regularization in the latent space to disentangle distinct elements, investigate their roles in distinguishing emotions, and further manipulate elements to alter musical emotions. |
Shulei Ji; Xinyu Yang; | aaai | 2024-02-20 |
316 | N-gram Unsupervised Compoundation and Feature Injection for Better Symbolic Music Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel method, NG-Midiformer, for understanding symbolic music sequences that leverages the N-gram approach. |
Jinhao Tian; Zuchao Li; Jiajia Li; Ping Wang; | aaai | 2024-02-20 |
317 | Structure-informed Positional Encoding for Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, multi-scale hierarchical structure is a distinctive feature of music signals. To leverage this information, we propose a structure-informed positional encoding framework for music generation with Transformers. |
Manvi Agarwal; Changhong Wang; Gaël Richard; | arxiv-cs.SD | 2024-02-20 |
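One hypothetical way to encode multi-scale structure (not necessarily the paper's formulation) is to sum sinusoidal encodings computed from token, beat, and bar indices, so that tokens in the same bar or beat share part of their positional signal.

```python
# Hypothetical sketch of a structure-aware positional encoding: sinusoidal
# encodings at token, beat, and bar level are summed per position.
import torch

def sinusoid(pos, dim):
    # pos: (n,) integer positions -> (n, dim) sinusoidal encoding
    i = torch.arange(dim // 2, dtype=torch.float32)
    angles = pos.float().unsqueeze(1) / (10000 ** (2 * i / dim))
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=1)

def structured_pe(token_idx, beat_idx, bar_idx, dim=128):
    return sinusoid(token_idx, dim) + sinusoid(beat_idx, dim) + sinusoid(bar_idx, dim)

# 16 tokens, 4 tokens per beat, 16 tokens per bar (toy values)
pe = structured_pe(torch.arange(16), torch.arange(16) // 4, torch.arange(16) // 16)
print(pe.shape)  # torch.Size([16, 128])
```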
318 | Responding to The Call: Exploring Automatic Music Composition Using A Knowledge-Enhanced Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, we train the composition module using the call-response pairs, supplementing it with musical knowledge in terms of rhythm, melody, and harmony. |
ZHEJING HU et. al. | aaai | 2024-02-20 |
319 | Does AI-assisted Creation of Polyphonic Music Increase Academic Motivation? The DeepBach Graphical Model and Its Use in Music Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In the modern music industry, AI music generators have gained particular importance. The use of AI greatly simplifies the creation of polyphony. In addition, it can increase … |
Na Yuan; | J. Comput. Assist. Learn. | 2024-02-20 |
320 | V2Meow: Meowing to The Visual Beat Via Video-to-Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose V2Meow, a video-to-music generation system capable of producing high-quality music audio for a diverse range of video input types using a multi-stage autoregressive model. |
KUN SU et. al. | aaai | 2024-02-20 |
321 | DeepSRGM — Sequence Classification and Ranking in Indian Classical Music with Deep Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a deep learning based approach to Raga recognition. |
Sathwik Tejaswi Madhusudhan; Girish Chowdhary; | arxiv-cs.SD | 2024-02-15 |
322 | Phantom in The Opera: Adversarial Music Attack for Robot Dialogue System Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study explores the vulnerability of robot dialogue systems’ automatic speech recognition (ASR) module to adversarial music attacks. Specifically, we explore music as a … |
Sheng Li; Jiyi Li; Yang Cao; | Frontiers Comput. Sci. | 2024-02-15 |
323 | DeepSRGM – Sequence Classification and Ranking in Indian Classical Music Via Deep Learning IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: A vital aspect of Indian Classical Music (ICM) is Raga, which serves as a melodic framework for compositions and improvisations alike. Raga Recognition is an important music … |
S. Madhusudhan; Girish V. Chowdhary; | ArXiv | 2024-02-15 |
324 | An Order-Complexity Aesthetic Assessment Model for Aesthetic-aware Music Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Computational aesthetic evaluation has made remarkable contribution to visual art works, but its application to music is still rare. |
Xin Jin; Wu Zhou; Jingyu Wang; Duo Xu; Yongsen Zheng; | arxiv-cs.CV | 2024-02-13 |
325 | Sheet Music Transformer: End-To-End Optical Music Recognition Beyond Monophonic Transcription Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents the Sheet Music Transformer, the first end-to-end OMR model designed to transcribe complex musical scores without relying solely on monophonic strategies. |
Antonio Ríos-Vila; Jorge Calvo-Zaragoza; Thierry Paquet; | arxiv-cs.CV | 2024-02-12 |
326 | RaveNET: Connecting People and Exploring Liminal Space Through Wearable Networks in Music Performance Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: RaveNET connects people to music, enabling musicians to modulate sound using signals produced by their own bodies or the bodies of others. We present three wearable prototype … |
Rachel Freire; Valentin Martinez-Missir; Courtney N. Reed; P. Strohmeier; | Proceedings of the Eighteenth International Conference on … | 2024-02-11 |
327 | Evaluating Co-Creativity Using Total Information Flow Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a method to compute the information flow using pre-trained generative models as entropy estimators. |
Vignesh Gokul; Chris Francis; Shlomo Dubnov; | arxiv-cs.SD | 2024-02-09 |
328 | MusicMagus: Zero-Shot Text-to-Music Editing Via Diffusion Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, music generation usually involves iterative refinements, and how to edit the generated music remains a significant challenge. This paper introduces a novel approach to the editing of music generated by such models, enabling the modification of specific attributes, such as genre, mood and instrument, while maintaining other aspects unchanged. |
YIXIAO ZHANG et. al. | arxiv-cs.SD | 2024-02-08 |
329 | Hierarchical Multi-head Attention LSTM for Polyphonic Symbolic Melody Generation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Ahmet Kaşif; Selçuk Sevgen; Alper Ozcan; C. Catal; | Multim. Tools Appl. | 2024-02-08 |
330 | Towards Feature-based Versioning for Musicological Research Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper discusses the management of revisions and variants of musical works for the context of musicological research. Domain-specific languages (DSLs) are a fundamental tool … |
P. Grünbacher; Markus Neuwirth; | Proceedings of the 18th International Working Conference on … | 2024-02-07 |
331 | Melody Generation Based on Deep Ensemble Learning Using Varying Temporal Context Length Related Papers Related Patents Related Grants Related Venues Related Experts View |
Baibhav Nag; Asif Iqbal Middya; Sarbani Roy; | Multim. Tools Appl. | 2024-02-02 |
332 | Everyday Uses of Music Listening and Music Technologies By Caregivers and People with Dementia: Survey and Focus Group Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To ensure music technologies are appropriately designed for supporting caregivers and people living with dementia, there remains a need to better understand how music is currently used in everyday care at home. We aimed to understand how people with dementia and their caregivers use music technologies in everyday caring, as well as challenges they experience using music and technology. |
DIANNA VIDAS et. al. | arxiv-cs.HC | 2024-02-01 |
333 | Dance-to-Music Generation with Encoder-based Textual Inversion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: They often overlook the nuanced management of temporal rhythm, which is indispensable in crafting music for dance, since it intricately aligns the musical beats with the dancers’ movements. Recognizing this gap, we propose an encoder-based textual inversion technique to augment text-to-music models with visual control, facilitating personalized music generation. |
SIFEI LI et. al. | arxiv-cs.SD | 2024-01-31 |
334 | Dance-to-Music Generation with Encoder-based Textual Inversion of Diffusion Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The seamless integration of music with dance movements is essential for communicating the artistic intent of a dance piece. This alignment also significantly improves the … |
SIFEI LI et. al. | ArXiv | 2024-01-31 |
335 | SongBsAb: A Dual Prevention Approach Against Singing Voice Conversion Based Illegal Song Covers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose SongBsAb, the first proactive approach to tackle SVC-based illegal song covers. |
GUANGKE CHEN et. al. | arxiv-cs.SD | 2024-01-30 |
336 | Music Auto-Tagging with Robust Music Representation Learned Via Domain Adversarial Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes a method inspired by speech-related tasks to enhance music auto-tagging performance in noisy settings. |
Haesun Joung; Kyogu Lee; | arxiv-cs.SD | 2024-01-27 |
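Domain adversarial training typically relies on a gradient-reversal layer so the feature extractor learns to fool a domain classifier (e.g., clean vs. noisy recordings) while the tagging head trains normally; the generic PyTorch sketch below illustrates that building block and is not the paper's code.

```python
# Standard gradient-reversal layer used in domain adversarial training.
import torch
from torch.autograd import Function

class GradReverse(Function):
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)          # identity in the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        # Negate (and scale) the gradient flowing back to the features.
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

features = torch.randn(8, 256, requires_grad=True)
domain_logits = torch.nn.Linear(256, 2)(grad_reverse(features))
# Backpropagating a domain-classification loss through `domain_logits`
# updates the feature extractor in the direction that removes
# domain (noise-condition) information from the features.
```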
337 | MoodLoopGP: Generating Emotion-Conditioned Loop Tablature Music with Multi-Granular Features Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, building upon LooperGP, a loopable tablature generation model, this paper explores endowing systems with control over conveyed emotions. To enable such conditional generation, we propose integrating musical knowledge by utilizing multi-granular semantic and musical features during model training and inference. |
Wenqian Cui; Pedro Sarmento; Mathieu Barthet; | arxiv-cs.SD | 2024-01-23 |
338 | MELODY: Robust Semi-Supervised Hybrid Model for Entity-Level Online Anomaly Detection with Multivariate Time Series Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the problem of anomaly detection for deployments. |
Jingchao Ni; Gauthier Guinet; Peihong Jiang; Laurent Callot; Andrey Kan; | arxiv-cs.LG | 2024-01-18 |
339 | Exploring The Diversity of Music Experiences for Deaf and Hard of Hearing People Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Sensory substitution or enhancement techniques have been proposed to enable deaf or hard of hearing (DHH) people to listen to and even compose music. |
Kyrie Zhixuan Zhou; Weirui Peng; Yuhan Liu; Rachel F. Adler; | arxiv-cs.HC | 2024-01-17 |
340 | Emotional Behavior Analysis of Music Course Evaluation Based on Online Comment Mining Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study investigates the method of analyzing emotional tendencies in music courses and its application in lesson plan evaluation. Using a weighted method to analyze emotional … |
Nan Li; | Int. J. Inf. Technol. Web Eng. | 2024-01-17 |
341 | Link Me Baby One More Time: Social Music Discovery on Spotify Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore the social and contextual factors that influence the outcome of person-to-person music recommendations and discovery. |
Shazia’Ayn Babul; Desislava Hristova; Antonio Lima; Renaud Lambiotte; Mariano Beguerisse-Díaz; | arxiv-cs.SI | 2024-01-16 |
342 | ScripTONES: Sentiment-Conditioned Music Generation for Movie Scripts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a two-stage pipeline for generating music from a movie script. |
Vishruth Veerendranath; Vibha Masti; Utkarsh Gupta; Hrishit Chaudhuri; Gowri Srinivasa; | arxiv-cs.MM | 2024-01-13 |
343 | Sub-Band and Full-Band Interactive U-Net with Dprnn for Demixing Cross-Talk Stereo Music Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper proposes to use U-Nets to extract information from predefined sub-bands and full-band for stereo music demixing. Experimental results show that the proposed system can … |
HAN YIN et. al. | 2024 IEEE International Conference on Acoustics, Speech, … | 2024-01-11 |
344 | Singer Identity Representation Learning Using Self-Supervised Techniques Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the same level of progress has not been achieved for singing voices. To bridge this gap, we suggest a framework for training singer identity encoders to extract representations suitable for various singing-related tasks, such as singing voice similarity and synthesis. |
Bernardo Torres; Stefan Lattner; Gaël Richard; | arxiv-cs.SD | 2024-01-10 |
345 | On Using Artificial Intelligence to Predict Music Playlist Success Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The emergence of digital music platforms has fundamentally transformed the way we engage with and organize music. As playlist creation has gained widespread popularity, there is … |
Roberto Cavicchioli; J. Hu; Marco Furini; | 2024 IEEE 21st Consumer Communications & Networking … | 2024-01-06 |
346 | MusicAOG: An Energy-Based Model for Learning and Sampling A Hierarchical Representation of Symbolic Music Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In addressing the challenge of interpretability and generalizability of artificial music intelligence, this paper introduces a novel symbolic representation that amalgamates both explicit and implicit musical information across diverse traditions and granularities. |
YIKAI QIAN et. al. | arxiv-cs.SD | 2024-01-05 |
347 | Continuous Emotion-Based Image-to-Music Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Image-to-music generation aims to generate realistic pure music according to a given image. Although many previous works are conducted on bridging image and music, they mainly … |
Yajie Wang; Mulin Chen; Xuelong Li; | IEEE Transactions on Multimedia | 2024-01-01 |
348 | Full-Page Music Symbols Recognition: State-of-the-Art Deep Model Comparison for Handwritten and Printed Music Scores Related Papers Related Patents Related Grants Related Venues Related Experts View |
Ali Yesilkanat; Yann Soullard; Bertrand Coüasnon; Nathalie Girard; | International Workshop on Document Analysis Systems | 2024-01-01 |
349 | Time-Delay Estimation Based on An Enhanced Modified MUSIC With Co-Prime Frequency Sampling for Rough Pavement Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In cases involving a rough interface, the echo frequency behavior of ultrawideband ground-penetrating radar (UWB-GPR) approximates a nonlinear Gaussian function. This … |
BIYUN MA et. al. | IEEE Geoscience and Remote Sensing Letters | 2024-01-01 |
350 | Copyright and The Production of Hip Hop Music Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Whereas the role of patents in cumulative innovation has been well established, little work has examined the impact that copyright policy may have on cumulative innovation in … |
J. Watson; | SSRN Electronic Journal | 2024-01-01 |
351 | MusicECAN: An Automatic Denoising Network for Music Recordings With Efficient Channel Attention Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this work, we address the long-standing problem of automatic recorded music denoising. In previous audio denoising research, the primary focus has been on speech, and music … |
Haonan Cheng; Shulin Liu; Zhicheng Lian; Long Ye; Qin Zhang; | IEEE/ACM Transactions on Audio, Speech, and Language … | 2024-01-01 |
352 | Multi-Layer Combined Frequency and Periodicity Representations for Multi-Pitch Estimation of Multi-Instrument Music Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Multi-pitch estimation (MPE) is one of the most important tasks in automatic music transcription (AMT). Since music generally involves a wide variety of instruments, MPE should be … |
Tomoki Matsunaga; Hiroaki Saito; | IEEE/ACM Transactions on Audio, Speech, and Language … | 2024-01-01 |
353 | Cross-Modal Interaction Via Reinforcement Feedback for Audio-Lyrics Retrieval Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The task of retrieving audio content relevant to lyric queries and vice versa plays a critical role in music-oriented applications. In this process, robust feature representations … |
Dong Zhou; Fang Lei; Lin Li; Yongmei Zhou; Aimin Yang; | IEEE/ACM Transactions on Audio, Speech, and Language … | 2024-01-01 |
354 | DanceComposer: Dance-to-Music Generation Using A Progressive Conditional Music Generator Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: A wonderful piece of music is the essence and soul of dance, which motivates the study of automatic music generation for dance. To create appropriate music from dance, cross-modal … |
Xiao Liang; Wensheng Li; Lifeng Huang; Chengying Gao; | IEEE Transactions on Multimedia | 2024-01-01 |
355 | Beyond The Trends: Evolution and Future Directions in Music Recommender Systems Research Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The study of Music Recommender Systems (MRS) has become crucial in digital music consumption, influencing how people discover and interact with music. This comprehensive analysis … |
Babak Amiri; Nikan Shahverdi; Amirali Haddadi; Yalda Ghahremani; | IEEE Access | 2024-01-01 |
356 | A Computationally Light MUSIC Based Algorithm for Automotive RADARs Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this paper, a computationally light single-snapshot multiple signal classification (MUSIC) algorithm is presented for multidimensional estimation in the framework of automotive … |
M. A. Maisto; A. Dell’Aversano; Adriana Brancaccio; Ivan Russo; Raffaele Solimene; | IEEE Transactions on Computational Imaging | 2024-01-01 |
357 | A Short Survey and Comparison of CNN-Based Music Genre Classification Using Multiple Spectral Features Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The goal of music genre classification is to identify the genre of given feature vectors representing certain characteristics of music clips. In addition, to improve the accuracy … |
W. Seo; Sung-Hyun Cho; Paweł Teisseyre; Jaesung Lee; | IEEE Access | 2024-01-01 |
358 | Moûsai: Efficient Text-to-Music Diffusion Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent years have seen the rapid development of large generative models for text; however, much less research has explored the connection between text and another … |
Flavio Schneider; Ojasv Kamal; Zhijing Jin; Bernhard Schölkopf; | Annual Meeting of the Association for Computational … | 2024-01-01 |
359 | The Beauty of Repetition: An Algorithmic Composition Model With Motif-Level Repetition Generator and Outline-to-Music Generator in Symbolic Music Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Most musical compositions utilize repetition as a fundamental element to create captivating aesthetic experiences. However, the potential of repetition in machine-learning-based … |
ZHEJING HU et. al. | IEEE Transactions on Multimedia | 2024-01-01 |
360 | Drawlody: Sketch-Based Melody Creation With Enhanced Usability and Interpretability Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Sketch-based melody creation systems enable people to compose melodies by converting human-sketched melody contours into coherent melodies that fit the depicted contours. This … |
Qihao Liang; Ye Wang; | IEEE Transactions on Multimedia | 2024-01-01 |
361 | EXPLORE — Explainable Song Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a novel approach combining advanced algorithms and an interactive user interface. |
Abhinav Arun; Mehul Soni; Palash Choudhary; Saksham Arora; | arxiv-cs.IR | 2023-12-30 |
362 | Deep Neural Network Architectures for Audio Emotion Recognition Performed on Song and Speech Modalities Related Papers Related Patents Related Grants Related Venues Related Experts View |
Souha Ayadi; Z. Lachiri; | Int. J. Speech Technol. | 2023-12-28 |
363 | EnchantDance: Unveiling The Potential of Music-Driven Dance Movement Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce the EnchantDance framework, a state-of-the-art method for dance generation. |
BO HAN et. al. | arxiv-cs.SD | 2023-12-26 |
364 | Combinatorial Music Generation Model with Song Structure Graph Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a symbolic music generation model with the song structure graph analysis network. |
Seonghyeon Go; Kyogu Lee; | arxiv-cs.SD | 2023-12-23 |
365 | Improving Chinese Pop Song and Hokkien Gezi Opera Singing Voice Synthesis By Enhancing Local Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, the synthesized audio exhibits local incongruities (e.g., local pronunciation jitter or local noise). To address this problem, we propose two methods to enhance local modeling in the acoustic model. |
Peng Bai; Yue Zhou; Meizhen Zheng; Wujin Sun; Xiaodong Shi; | emnlp | 2023-12-22 |
366 | Byte Pair Encoding for Symbolic Music IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we show that Byte Pair Encoding, a compression technique widely used for natural language, significantly decreases the sequence length while increasing the vocabulary size. |
Nathan Fradet; Nicolas Gutowski; Fabien Chhel; Jean-Pierre Briot; | emnlp | 2023-12-22 |
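The mechanism is easy to see on integer token ids: repeatedly merge the most frequent adjacent pair into a new token, which shortens sequences while growing the vocabulary. The toy single-merge step below is illustrative only, not the paper's tokenizer.

```python
# Toy illustration of Byte Pair Encoding on symbolic-music token ids:
# the most frequent adjacent pair is merged into a new token id.
from collections import Counter

def bpe_merge_once(seqs, next_id):
    pairs = Counter(p for s in seqs for p in zip(s, s[1:]))
    if not pairs:
        return seqs, None
    (a, b), _ = pairs.most_common(1)[0]
    merged = []
    for s in seqs:
        out, i = [], 0
        while i < len(s):
            if i + 1 < len(s) and (s[i], s[i + 1]) == (a, b):
                out.append(next_id)    # replace the pair with the new token
                i += 2
            else:
                out.append(s[i])
                i += 1
        merged.append(out)
    return merged, (a, b)

seqs = [[60, 62, 64, 60, 62], [60, 62, 60, 62, 67]]
seqs, pair = bpe_merge_once(seqs, next_id=128)
print(pair, seqs)   # (60, 62) [[128, 64, 128], [128, 128, 67]]
```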
367 | ALCAP: Alignment-Augmented Music Captioner Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, a comprehensive understanding of music necessitates the integration of both these elements. In this study, we delve into this overlooked realm by introducing a method to systematically learn multimodal alignment between audio and lyrics through contrastive learning. |
ZIHAO HE et. al. | emnlp | 2023-12-22 |
368 | Total Variation in Popular Rap Vocals from 2009-2023: Extension of The Analysis By Georgieva, Ripolles & McFee Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present an analysis of fundamental frequency (F0) variation in rap vocals over the past 14 years, focusing on song examples that represent the state of modern rap music. |
Kelvin L Walls; Iran R Roman; Bea Steers; Elena Georgieva; | arxiv-cs.SD | 2023-12-21 |
369 | A Unified Representation Framework for The Evaluation of Optical Music Recognition Systems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we identify the need of a common music representation language and propose the Music Tree Notation (MTN) format, with the idea to construct a common endpoint for OMR research that allows coordination, reuse of technology and fair evaluation of community efforts. |
Pau Torras; Sanket Biswas; Alicia Fornés; | arxiv-cs.CV | 2023-12-20 |
370 | User-centric Item Characteristics for Personalized Multimedia Systems: A Systematic Review Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Multimedia item characteristics are used in domains, such as recommender systems and information retrieval. In this work we distinguish two main groups of item characteristics: … |
Elham Motamedi; Marko Tkalcic; | Intelligenza Artificiale | 2023-12-15 |
371 | StemGen: A Music Generation Model That Listens IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present an alternative paradigm for producing music generation models that can listen and respond to musical context. |
JULIAN D. PARKER et. al. | arxiv-cs.SD | 2023-12-14 |
372 | WikiMuTe: A Web-sourced Dataset of Semantic Descriptions for Music Audio Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we present WikiMuTe, a new and open dataset containing rich semantic descriptions of music. |
Benno Weck; Holger Kirchhoff; Peter Grosche; Xavier Serra; | arxiv-cs.CL | 2023-12-14 |
373 | Computational Copyright: Towards A Royalty Model for Music Generative AI Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we present algorithmic solutions employing data attribution techniques. |
Junwei Deng; Shiyuan Zhang; Jiaqi Ma; | arxiv-cs.AI | 2023-12-11 |
374 | MART: Learning Hierarchical Music Audio Representations with Part-Whole Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our quest to comprehend the bottom-up structure of music, we introduce MART, a hierarchical music representation learning approach that facilitates feature interactions among cropped music clips while considering their part-whole hierarchies. |
DONG YAO et. al. | arxiv-cs.SD | 2023-12-11 |
375 | Semantic Dependency Network for Lyrics Generation from Melody Related Papers Related Patents Related Grants Related Venues Related Experts View |
Wei Duan; Yi Yu; Keizo Oyama; | Neural Computing and Applications | 2023-12-09 |
376 | The Impact of Social Robots’ Presence and Roles on Children’s Performance in Musical Instrument Practice Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Research on the educational applications of social robots has shown how they can motivate children and help improve academic learning outcomes. Here, we examine how robots can … |
Heqiu Song; Emilia I. Barakova; Jaap Ham; P. Markopoulos; | Br. J. Educ. Technol. | 2023-12-07 |
377 | Music-Graph2Vec: An Efficient Method for Embedding Pitch Segment Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Learning low-dimensional continuous vector representations for short pitch segments extracted from songs has been confirmed to capture tonal features of music, which is key to … |
Taiwei Wu; Jianhao Zhang; Lian Duan; Yuanzhe Cai; | Proceedings of the 5th ACM International Conference on … | 2023-12-06 |
378 | A Semi-Supervised Deep Learning Approach to Dataset Collection for Query-By-Humming Task Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a deep learning data collection technique and introduce Covers and Hummings Aligned Dataset (CHAD), a novel dataset that contains 18 hours of short music fragments, paired with time-aligned hummed versions. |
AMANTUR AMATOV et. al. | arxiv-cs.SD | 2023-12-02 |
379 | Performance Bound Optimization for MIMO Radar Direction Finding With MUSIC Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The multiple signal classification (MUSIC) algorithm has been widely applied in direction finding with multiple-input-multiple-output (MIMO) radar. To enhance the angle estimation … |
Wenjun Wu; Bo Tang; Ran Tao; | IEEE Transactions on Aerospace and Electronic Systems | 2023-12-01 |
380 | DanceMeld: Unraveling Dance Phrases with Hierarchical Latent Codes for Music-to-Dance Synthesis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In the realm of 3D digital human applications, music-to-dance presents a challenging task. Given the one-to-many relationship between music and dance, previous methods have been … |
Xin Gao; Liucheng Hu; Peng Zhang; Bang Zhang; Liefeng Bo; | ArXiv | 2023-11-30 |
381 | Barwise Music Structure Analysis with The Correlation Block-Matching Segmentation Algorithm Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we extend an MSA algorithm called the Correlation Block-Matching (CBM) algorithm introduced by Marmoret et al. (2020, 2022b). |
Axel Marmoret; Jérémy E. Cohen; Frédéric Bimbot; | arxiv-cs.SD | 2023-11-30 |
382 | Motion to Dance Music Generation Using Latent Diffusion Model Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The role of music in games and animation, particularly in dance content, is essential for creating immersive and entertaining experiences. Although recent studies have made … |
Vanessa Tan; Junghyun Nam; Juhan Nam; Jun-yong Noh; | SIGGRAPH Asia 2023 Technical Communications | 2023-11-28 |
383 | A Brief Scoping Review of Musical Performance Support System in IEEE Study Fields Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper investigates the current state of research regarding technological innovation for practical music performance and education. Digital devices and informatic technologies … |
Yasumasa Yamaguchi; | 2023 IEEE International Conference on Teaching, Assessment … | 2023-11-28 |
384 | Visual Signatures of Music Mood Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The majority of existing music visualization methods mostly utilize frequency, tempo, and volume, which are rendered in real time as animated images for the music being played. … |
Hanqin Wang; A. Sourin; | SIGGRAPH Asia 2023 Posters | 2023-11-28 |
385 | Automatic Time Signature Determination for New Scores Using Lyrics for Latent Rhythmic Structure Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel approach that only uses lyrics as input to automatically generate a fitting time signature for lyrical songs and uncover the latent rhythmic structure utilizing explainable machine learning models. |
Callie C. Liao; Duoduo Liao; Jesse Guessford; | arxiv-cs.LG | 2023-11-26 |
386 | Predominant Audio Source Separation in Polyphonic Music Related Papers Related Patents Related Grants Related Venues Related Experts View |
L. Reghunath; R. Rajan; | EURASIP Journal on Audio, Speech, and Music Processing | 2023-11-24 |
387 | A Recurrent Connectionist Model of Melody Perception: An Exploration Using TRACX2 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Are similar, or even identical, mechanisms used in the computational modeling of speech segmentation, serial image processing and music processing? We address this question by exploring how TRACX2 (French et al., 2011; French & Cottrell, 2014; Mareschal & French, 2017), a recognition-based, recursive connectionist autoencoder model of chunking and sequence segmentation, which has successfully simulated speech and serial-image processing, might be applied to elementary melody perception. |
Daniel Defays; Robert French; Barbara Tillmann; | arxiv-cs.AI | 2023-11-21 |
388 | Equipping Pretrained Unconditional Music Transformers with Instrument and Genre Controls Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The “pretraining-and-finetuning” paradigm has become a norm for training domain-specific models in natural language processing and computer vision. In this work, we aim to examine this paradigm for symbolic music generation through leveraging the largest ever symbolic music dataset sourced from the MuseScore forum. |
Weihan Xu; Julian McAuley; Shlomo Dubnov; Hao-Wen Dong; | arxiv-cs.SD | 2023-11-20 |
389 | M2UGen: Multi-modal Music Understanding and Generation with The Power of Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The current landscape of research leveraging large language models (LLMs) is experiencing a surge. Many works harness the powerful reasoning capabilities of these models to … |
Atin Sakkeer Hussain; Shansong Liu; Chenshuo Sun; Ying Shan; | ArXiv | 2023-11-19 |
390 | Encoding Performance Data in MEI with The Automatic Music Performance Analysis and Comparison Toolkit (AMPACT) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a new method of encoding performance data in MEI using the recently added … |
Johanna Devaney; Cecilia Beauchamp; | arxiv-cs.SD | 2023-11-19 |
391 | M²UGen: Multi-modal Music Understanding and Generation with The Power of Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, research that combines both understanding and generation using LLMs is still limited and in its nascent stage. To address this gap, we introduce a Multi-modal Music Understanding and Generation (M²UGen) framework that integrates LLM’s abilities to comprehend and generate music for different modalities. |
Shansong Liu; Atin Sakkeer Hussain; Qilong Wu; Chenshuo Sun; Ying Shan; | arxiv-cs.SD | 2023-11-19 |
392 | The Persian Piano Corpus: A Collection Of Instrument-Based Feature Extracted Data Considering Dastgah Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The research in the field of music is rapidly growing, and this trend emphasizes the need for comprehensive data. Though researchers have made an effort to contribute their own datasets, many data collections lack the requisite inclusivity for comprehensive study because they are frequently focused on particular components of music or other specific topics. |
Parsa Rasouli; Azam Bastanfard; | arxiv-cs.SD | 2023-11-18 |
393 | ConCollA – A Smart Emotion-based Music Recommendation System for Drivers Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Music recommender system is an area of information retrieval system that suggests customized music recommendations to users based on their previous preferences and experiences … |
JIGNA S. PATEL et. al. | Scalable Comput. Pract. Exp. | 2023-11-17 |
394 | Retrieval Augmented Generation of Symbolic Music with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore the use of large language models (LLMs) for music generation using a retrieval system to select relevant examples. |
Nicolas Jonason; Luca Casini; Carl Thomé; Bob L. T. Sturm; | arxiv-cs.SD | 2023-11-17 |
395 | The Song Describer Dataset: A Corpus of Audio Captions for Music-and-Language Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the Song Describer dataset (SDD), a new crowdsourced corpus of high-quality audio-caption pairs, designed for the evaluation of music-and-language models. |
ILARIA MANCO et. al. | arxiv-cs.SD | 2023-11-16 |
396 | Can MusicGen Create Training Data for MIR Tasks? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We are investigating the broader concept of using AI-based generative music systems to generate training data for Music Information Retrieval (MIR) tasks. |
Nadine Kroher; Helena Cuesta; Aggelos Pikrakis; | arxiv-cs.SD | 2023-11-15 |
397 | Exploring Variational Auto-Encoder Architectures, Configurations, and Datasets for Generative Music Explainable AI Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper contributes a systematic examination of the impact that different combinations of Variational Auto-Encoder models (MeasureVAE and AdversarialVAE), configurations of latent space in the AI model (from 4 to 256 latent dimensions), and training datasets (Irish folk, Turkish folk, Classical, and pop) have on music generation performance when 2 or 4 meaningful musical attributes are imposed on the generative model. |
Nick Bryan-Kinns; Bingyuan Zhang; Songyan Zhao; Berker Banar; | arxiv-cs.SD | 2023-11-14 |
398 | Music ControlNet: Multiple Time-varying Controls for Music Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Music ControlNet, a diffusion-based music generation model that offers multiple precise, time-varying controls over generated audio. |
Shih-Lun Wu; Chris Donahue; Shinji Watanabe; Nicholas J. Bryan; | arxiv-cs.SD | 2023-11-12 |
399 | Attitudes of Music Scholars Towards Digital Musicology Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Music research appears to lag behind other fields in adopting methods coming from the digital humanities (DH). Researchers have hypothesized that this might be due to the gulf … |
Audrey Laplante; Jean-Sébastien Sauvé; | Proceedings of the 10th International Conference on Digital … | 2023-11-10 |
400 | Text Boundaries Do Not Provide A Better Segmentation of Gregorian Antiphons Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: It has been previously proposed that syllable and word boundaries in Gregorian chant texts can be used to segment chant melodies in a more meaningful way than segmentation methods … |
Vojtěch Lanz; Jan Hajič; | Proceedings of the 10th International Conference on Digital … | 2023-11-10 |
401 | Exploring Early Vocal Music and Its Lute Arrangements: Using F-TEMPO As A Musicological Tool Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In its earliest state, F-TEMPO (Full-Text searching of Early Music Prints Online) enabled searching in the musical content of about 30,000 page-images of early printed music from … |
Tim Crawford; David Lewis; Alastair Porter; | Proceedings of the 10th International Conference on Digital … | 2023-11-10 |
402 | Visual Presentation and Exploration of Musical Corpora: Case Study: Oskar Kolberg’s Opera Omnia Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Extensive musical collections are growing with increasing momentum, and there are progressively more digital tools for analysing musical corpora. These tools visualize statistical … |
Anna Maria Matuszewska; | Proceedings of the 10th International Conference on Digital … | 2023-11-10 |
403 | Understanding The Needs of Music Editors in A Digital World. Adding Support for Editorial Markup to The Mei-friend Editor Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The mei-friend editor aims to address the challenges faced in the last mile of preparing MEI encodings, specifically the conversion and correction of the encodings through a … |
Anna Plaksin; | Proceedings of the 10th International Conference on Digital … | 2023-11-10 |
404 | An Algorithmic Approach to Automated Symbolic Transcription of Hindustani Vocals Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Although a sizable body of digital music scholarship has focused on automatic transcription, it has almost exclusively been applied to Western music. In this paper, we outline an … |
Rhythm Jain; Claire Arthur; | Proceedings of the 10th International Conference on Digital … | 2023-11-10 |
405 | Proceedings of The 5th International Workshop on Reading Music Systems Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: The International Workshop on Reading Music Systems (WoRMS) is a workshop that tries to connect researchers who develop systems for reading music, such as in the field of Optical … |
Jorge Calvo-Zaragoza; Alexander Pacha; Elona Shatri; | arxiv-cs.CV | 2023-11-07 |
406 | The Music Meta Ontology: A Flexible Semantic Model for The Interoperability of Music Metadata Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It is nonetheless an open challenge due to the complexity of musical concepts arising from different genres, styles, and periods — standing to benefit from a lingua franca to accommodate various stakeholders (musicologists, librarians, data engineers, etc.). To initiate this transition, we introduce the Music Meta ontology, a rich and flexible semantic model to describe music metadata related to artists, compositions, performances, recordings, and links. |
Jacopo de Berardinis; Valentina Anita Carriero; Albert Meroño-Peñuela; Andrea Poltronieri; Valentina Presutti; | arxiv-cs.IR | 2023-11-07 |
407 | Are Words Enough? On The Semantic Conditioning of Affective Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This concern seems highly relevant today, considering the exponential growth of natural language processing using deep learning models where it is possible to prompt semantic propositions to generate music automatically. This scoping review aims to analyze and discuss the possibilities of music generation conditioned by emotions. |
Jorge Forero; Gilberto Bernardes; Mónica Mendes; | arxiv-cs.MM | 2023-11-06 |
408 | Controllable Music Production with Diffusion Models and Guidance Gradients IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We demonstrate how conditional generation from diffusion models can be used to tackle a variety of realistic tasks in the production of music in 44.1kHz stereo audio with sampling-time guidance. |
Mark Levy; Bruno Di Giorgi; Floris Weers; Angelos Katharopoulos; Tom Nickson; | arxiv-cs.SD | 2023-11-01 |
409 | Video2Music: Suitable Music Generation from Videos Using An Affective Multimodal Transformer Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we develop a generative music AI framework, Video2Music, that can match a provided video. |
Jaeyong Kang; Soujanya Poria; Dorien Herremans; | arxiv-cs.SD | 2023-11-01 |
410 | Multimodal Multifaceted Music Emotion Recognition Based on Self-Attentive Fusion of Psychology-Inspired Symbolic and Acoustic Features Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper describes automatic music emotion recognition (MER) that aims to estimate the valence and arousal (V/A) scores of a piece of piano music. The emotion is multifaceted in … |
Jiahao Zhao; Kazuyoshi Yoshii; | 2023 Asia Pacific Signal and Information Processing … | 2023-10-31 |
411 | JEN-1 Composer: A Unified Framework for High-Fidelity Multi-Track Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This departure from the typical workflows of professional composers hinders the ability to refine details in specific tracks. To address this gap, we propose JEN-1 Composer, a unified framework designed to efficiently model marginal, conditional, and joint distributions over multi-track music using a single model. |
Yao Yao; Peike Li; Boyu Chen; Alex Wang; | arxiv-cs.SD | 2023-10-29 |
412 | HeartRhythm: ECG-Based Music Preference Classification in Popular Music Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Electrocardiogram (ECG) is a promising signal for music psychology research, but it is rare to find a music study applying ECG to machine learning, especially in music preference … |
Phairot Autthasan; Petchkla Sukontaman; Theerawit Wilaiprasitporn; Soravitt Sangnark; | 2023 IEEE SENSORS | 2023-10-29 |
413 | Exploring The Emotional Landscape of Music: An Analysis of Valence Trends and Genre Variations in Spotify Music Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper conducts an intricate analysis of musical emotions and trends using Spotify music data, encompassing audio features and valence scores extracted through the Spotipi API. |
Shruti Dutta; Shashwat Mookherjee; | arxiv-cs.SD | 2023-10-29 |
414 | Content-based Controls For Music Large Language Modeling IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We aim to further equip the models with direct and content-based controls on innate music languages such as pitch, chords and drum track. |
Liwei Lin; Gus Xia; Junyan Jiang; Yixiao Zhang; | arxiv-cs.AI | 2023-10-26 |
415 | Miditok: A Python Package for MIDI File Tokenization IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Language models such as Transformers have been used with symbolic music for a variety of tasks, including music generation, modeling, and transcription, achieving state-of-the-art performance. |
Nathan Fradet; Jean-Pierre Briot; Fabien Chhel; Amal El Fallah Seghrouchni; Nicolas Gutowski; | arxiv-cs.LG | 2023-10-26 |
416 | HoloSinger: Semantics and Music Driven Motion Generation with Octahedral Holographic Projection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Lyrics and music are both significant for a singer to perform a song. Therefore, it is important in singer’s motion generation to model both semantic and acoustic correlation with … |
ZEYU JIN et. al. | Proceedings of the 31st ACM International Conference on … | 2023-10-26 |
417 | The Effects of Viewing Formats and Song Genres on Audience Experiences in Virtual Avatar Concerts Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: With the recent advancements in multimedia and computer graphics technology, virtual avatar concerts have become increasingly popular among both well-known celebrities and … |
Sebin Lee; Daye Kim; Jungjin Lee; | Proceedings of the 31st ACM International Conference on … | 2023-10-26 |
418 | Towards A System Supporting Music Feedback Exercise in Physical Tele-Rehabilitation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: For decades music has been used successfully in sports and especially physical rehabilitation in order to motivate people and increase satisfaction in the actual workout. The … |
Alexander Carôt; Thomas Hans Fritz; Katja Englert; | 2023 4th International Symposium on the Internet of Sounds | 2023-10-26 |
419 | Analysis of Accessible Digital Musical Instruments Through The Lens of Disability Models: A Case Study with Instruments Targeting D/Deaf People Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Music educators and researchers have grown increasingly aware of the need for traditional musical practices to promote inclusive music for disabled people. Inclusive music … |
Erivan Gonçalves Duarte; Isabelle Cossette; Marcelo M. Wanderley; | Frontiers Comput. Sci. | 2023-10-24 |
420 | Efficient Neural Music Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present MeLoDy (M for music; L for LM; D for diffusion), an LM-guided diffusion model that generates music audio of state-of-the-art quality while reducing 95.7% or 99.6% of the forward passes in MusicLM for sampling 10s or 30s of music, respectively. |
MAX W. Y. LAM et. al. | nips | 2023-10-24 |
421 | Simple and Controllable Music Generation IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce MusicGen, a single Language Model (LM) that operates over several streams of compressed discrete music representation, i.e., tokens. |
JADE COPET et. al. | nips | 2023-10-24 |
422 | MARBLE: Music Audio Representation Benchmark for Universal Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In the era of extensive intersection between art and Artificial Intelligence (AI), such as image generation and fiction co-creation, AI for music remains relatively nascent, particularly in music understanding. This is evident in the limited work on deep music representations, the scarcity of large-scale datasets, and the absence of a universal and community-driven benchmark. To address this issue, we introduce the Music Audio Representation Benchmark for universaL Evaluation, termed MARBLE. |
RUIBIN YUAN et. al. | nips | 2023-10-24 |
423 | DISCO-10M: A Large-Scale Music Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Music datasets play a crucial role in advancing research in machine learning for music. However, existing music datasets suffer from limited size, accessibility, and lack of audio resources. To address these shortcomings, we present DISCO-10M, a novel and extensive music dataset that surpasses the largest previously available music dataset by an order of magnitude. |
Luca Lanzendörfer; Florian Grötschla; Emil Funke; Roger Wattenhofer; | nips | 2023-10-24 |
424 | Playing with Feeling: Exploring Vibrotactile Feedback and Aesthetic Experiences for Developing Haptic Wearables for Blind and Low Vision Music Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Musical haptic wearables (MHWs) that convey information through vibrotactile feedback hold the potential to support the music learning of a blind or low vision (BLV) music … |
Leon Lu; Jin Kang; Chase Crispin; Audrey Girouard; | Proceedings of the 25th International ACM SIGACCESS … | 2023-10-22 |
425 | MUSE: Music Recommender System with Shuffle Play Recommendation Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Based on our observation that the shuffle play sessions hinder the overall training process of music recommender systems mainly due to the high unique transition rates of shuffle play sessions, we propose a Music Recommender System with Shuffle Play Recommendation Enhancement (MUSE). |
Yunhak Oh; Sukwon Yun; Dongmin Hyun; Sein Kim; Chanyoung Park; | cikm | 2023-10-21 |
426 | Composer Style-specific Symbolic Music Generation Using Vector Quantized Discrete Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose to combine a vector quantized variational autoencoder (VQ-VAE) and discrete diffusion models for the generation of symbolic music with desired composer styles. |
Jincheng Zhang; György Fazekas; Charalampos Saitis; | arxiv-cs.SD | 2023-10-21 |
427 | Fast Diffusion GAN Model for Symbolic Music Generation Controlled By Emotions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a diffusion model combined with a Generative Adversarial Network, aiming to (i) alleviate one of the remaining challenges in algorithmic music generation which is the control of generation towards a target emotion, and (ii) mitigate the slow sampling drawback of diffusion models applied to symbolic music generation. |
Jincheng Zhang; György Fazekas; Charalampos Saitis; | arxiv-cs.SD | 2023-10-21 |
428 | Music Augmentation and Denoising For Peak-Based Audio Fingerprinting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, real-world applications of audio identification often happen in noisy environments, which can cause these systems to fail. In this work, we tackle this problem by introducing and releasing a new audio augmentation pipeline that adds noise to music snippets in a realistic way, by stochastically mimicking real-world scenarios. |
Kamil Akesbi; Dorian Desblancs; Benjamin Martin; | arxiv-cs.SD | 2023-10-20 |
429 | MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by the recent success of large language models (LLMs) in task automation, we develop a system, named MusicAgent, which integrates numerous music-related tools and an autonomous workflow to address user requirements. More specifically, we build 1) a toolset that collects tools from diverse sources, including Hugging Face, GitHub, and Web APIs, and 2) an autonomous workflow empowered by LLMs (e.g., ChatGPT) that organizes these tools and automatically decomposes user requests into multiple sub-tasks, invoking the corresponding music tools. |
DINGYAO YU et. al. | arxiv-cs.CL | 2023-10-18 |
430 | Loop Copilot: Conducting AI Ensembles for Music Generation and Iterative Editing IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing AI music systems fall short in orchestrating multiple subsystems for diverse needs. To address this gap, we introduce Loop Copilot, a novel system that enables users to generate and iteratively refine music through an interactive, multi-round dialogue interface. |
Yixiao Zhang; Akira Maezawa; Gus Xia; Kazuhiko Yamamoto; Simon Dixon; | arxiv-cs.SD | 2023-10-18 |
431 | Multimodal Exploration in Elementary Music Classroom Related Papers Related Patents Related Grants Related Venues Related Experts View |
Martha Papadogianni; M. E. Altinsoy; Areti Andreopoulou; | Journal on Multimodal User Interfaces | 2023-10-18 |
432 | Leveraging Diverse Semantic-based Audio Pretrained Models for Singing Voice Conversion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we investigate the knowledge within classical semantic-based pretrained models in much detail. |
XUEYAO ZHANG et. al. | arxiv-cs.SD | 2023-10-17 |
433 | Exploring Musical, Lyrical, and Network Dimensions of Music Sharing Among Depression Individuals Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work seeks to study the differences in music preferences between individuals diagnosed with depression and non-diagnosed individuals, exploring numerous facets of music, including musical features, lyrics, and musical networks. |
Qihan Wang; Anique Tahir; Zeyad Alghamdi; Huan Liu; | arxiv-cs.CY | 2023-10-17 |
434 | Computational Analysis of Jazz Music: Estimating Tonality Through Chord Progression Distances Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Currently, research in music informatics focuses extensively on music theory, particularly on the theoretical systems of Western classical music dating back to the 19th century. … |
Yuta Yamamoto; Tetsuya Mizutani; | Proceedings of the 7th International Conference on Computer … | 2023-10-17 |
435 | Lyricist-Singer Entropy Affects Lyric-Lyricist Classification Performance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we observed a relationship between lyricist-singer entropy, or the variety of singers associated with a single lyricist, and lyric-lyricist classification performance. |
Mitsuki Morita; Masato Kikuchi; Tadachika Ozono; | arxiv-cs.SD | 2023-10-17 |
436 | Joint Music and Language Attention Models for Zero-shot Music Tagging Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a zero-shot music tagging system modeled by a joint music and language attention (JMLA) model to address the open-set music tagging problem. |
Xingjian Du; Zhesong Yu; Jiaju Lin; Bilei Zhu; Qiuqiang Kong; | arxiv-cs.SD | 2023-10-16 |
437 | BeatDance: A Beat-Based Model-Agnostic Contrastive Learning Framework for Music-Dance Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing methods often suffer from unnatural generation effects or fail to fully explore the correlation between music and dance. To overcome these challenges, we propose BeatDance, a novel beat-based model-agnostic contrastive learning framework. |
KAIXING YANG et. al. | arxiv-cs.SD | 2023-10-16 |
438 | CoCoFormer: A Controllable Feature-rich Polyphonic Music Generation Method Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we improve the loss function of the self-supervised method and perform joint training with both conditional control inputs and unconditional inputs. |
Jiuyang Zhou; Tengfei Niu; Hong Zhu; Xingping Wang; | arxiv-cs.SD | 2023-10-15 |
439 | Impact of Time and Note Duration Tokenizations on Deep Learning Symbolic Music Modeling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we analyze the common tokenization methods and experiment with time and note duration representations. |
Nathan Fradet; Nicolas Gutowski; Fabien Chhel; Jean-Pierre Briot; | arxiv-cs.SD | 2023-10-12 |
440 | LLark: A Multimodal Instruction-Following Language Model for Music Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present LLark, an instruction-tuned multimodal model for music understanding. |
Josh Gardner; Simon Durand; Daniel Stoller; Rachel M. Bittner; | arxiv-cs.SD | 2023-10-10 |
441 | Generating Appreciation Contents of Artworks By Combining Images and Music Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Nowadays, accessing artwork online has become commonplace. However, there is a lack of interest and participation in appreciating these works. To address this, we adopt a novel … |
Takumi Miyamoto; Yuanyuan Wang; | 2023 IEEE 12th Global Conference on Consumer Electronics … | 2023-10-10 |
442 | MuseChat: A Conversational Music Recommendation System for Videos IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Their inability to interact with users for further refinements or to provide explanations leads to a less satisfying experience. We address these issues with MuseChat, a first-of-its-kind dialogue-based recommendation system that personalizes music suggestions for videos. |
Zhikang Dong; Bin Chen; Xiulong Liu; Pawel Polak; Peng Zhang; | arxiv-cs.LG | 2023-10-09 |
443 | The Latest Technological Developments in Chinese Music Education: Motifs of National Musical Culture and Folklore in Modern Electronic Music Related Papers Related Patents Related Grants Related Venues Related Experts View |
Lei Lei; | Educ. Inf. Technol. | 2023-10-09 |
444 | Video-Music Retrieval with Fine-Grained Cross-Modal Alignment Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper presents a novel video-music retrieval method for videos containing humans. Our method constructs a cross-modal common embedding space of video features based on human … |
Yuki Era; Ren Togo; Keisuke Maeda; Takahiro Ogawa; M. Haseyama; | 2023 IEEE International Conference on Image Processing … | 2023-10-08 |
445 | Stackable Music: A Marker-Based Augmented Reality Music Synthesis Game Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Augmented reality (AR) allows the rendering of digital content on top of the physical space, which is a promising medium for tangible interaction. Marker-based AR is widely used … |
Max Chen; Shano Liang; Gillian Smith; | Companion Proceedings of the Annual Symposium on … | 2023-10-06 |
446 | Deep Generative Models of Music Expectation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose to use modern deep probabilistic generative models in the form of a Diffusion Model to compute an approximate likelihood of a musical input sequence. |
Ninon Lizé Masclef; T. Anderson Keller; | arxiv-cs.SD | 2023-10-05 |
447 | Analysis for Online Music Education Under Internet and Big Data Environment Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Online music teaching has brought great challenges to traditional music education. This paper investigates the paradigm of online music education and uses neural networks to … |
Lanfang Zhang; | Int. J. Web Based Learn. Teach. Technol. | 2023-10-02 |
448 | M2C: Concise Music Representation for 3D Dance Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Generating 3D dance motions that are synchronized with music is a difficult task, as it involves modelling the complex interplay between musical rhythms and human body movements. … |
Matthew Marchellus; In Kyu Park; | 2023 IEEE/CVF International Conference on Computer Vision … | 2023-10-02 |
449 | Time Delay Stability Analysis of Pairwise Interactions Amongst Ensemble-Listener RR Intervals and Expressive Music Features Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Time Delay Stability (TDS) can reveal physiological function and states in networked organs. Here, we introduce a novel application of TDS to a musical setting to study … |
Mateusz Soliński; Courtney N. Reed; Elaine Chew; | 2023 Computing in Cardiology (CinC) | 2023-10-01 |
450 | Syllable-level Lyrics Generation from Melody Exploiting Character-level Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, pre-trained language models specifically designed at the syllable level are publicly unavailable. To solve these challenging issues, we propose to exploit fine-tuning character-level language models for syllable-level lyrics generation from symbolic melody. |
Zhe Zhang; Karol Lasocki; Yi Yu; Atsuhiro Takasu; | arxiv-cs.CL | 2023-10-01 |
451 | Investigating The Influence of Background Music on The Performance of A CVEP-Based BCI Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Brain-computer interfaces (BCIs), e.g., different types of EEG-based BCI spellers (to date the most common BCI applications), allow new methods of control and interaction … |
LISA HENKE et. al. | 2023 IEEE International Conference on Systems, Man, and … | 2023-10-01 |
452 | Music- and Lyrics-driven Dance Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To complement it, we introduce JustLMD, a new multimodal dataset of 3D dance motion with music and lyrics. |
WENJIE YIN et. al. | arxiv-cs.MM | 2023-09-30 |
453 | Exploring The Effects of Event-induced Sudden Influx of Newcomers to Online Pop Music Fandom Communities: Content, Interaction, and Engagement Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Online fandom communities (OFCs) provide a convenient space for fans to create, collect, and discuss the content of their mutual interest (e.g., music artists). Real-world events … |
Qingyu Guo; Chuhan Shi; Zhuohao Yin; Chengzhong Liu; Xiaojuan Ma; | Proceedings of the ACM on Human-Computer Interaction | 2023-09-28 |
454 | Video Background Music Generation: Dataset, Method and Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This is a challenging task since it requires music-video datasets, efficient architectures for video-to-music generation, and reasonable metrics, none of which currently exist. To close this gap, we introduce a complete recipe including dataset, benchmark model, and evaluation metric for video background music generation. |
LE ZHUO et. al. | iccv | 2023-09-27 |
455 | Looking at The FAccTs: Exploring Music Industry Professionals’ Perspectives on Music Streaming Services and Recommendations Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Music recommender systems, commonly integrated into streaming services, help listeners find music. Previous research on such systems has focused on providing the best possible … |
Karlijn Dinnissen; Isabella Saccardi; Marloes Vredenborg; Christine Bauer; | Proceedings of the 2nd International Conference of the ACM … | 2023-09-27 |
456 | Synthia’s Melody: A Benchmark Framework for Unsupervised Domain Adaptation in Audio Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We, in part, attribute this to the lack of an appropriate benchmark dataset. To address this gap, we present Synthia’s melody, a novel audio data generation framework capable of simulating an infinite variety of 4-second melodies with user-specified confounding structures characterised by musical keys, timbre, and loudness. |
Chia-Hsin Lin; Charles Jones; Björn W. Schuller; Harry Coppock; | arxiv-cs.SD | 2023-09-26 |
457 | Machine Learning Music Emotion Recognition Based on Audio Features Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Music is a kind of language for expressing emotion; different structures and content convey different emotions. For the music emotion classification problem, a novel music emotion … |
Keju Wang; Cheng Qian; Lijun Zhang; | 2023 IEEE 6th International Conference on Information … | 2023-09-23 |
458 | CrossSinger: A Cross-Lingual Multi-Singer High-Fidelity Singing Voice Synthesizer Trained on Monolingual Singers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose CrossSinger, which is a cross-lingual singing voice synthesizer based on Xiaoicesing2. |
Xintong Wang; Chang Zeng; Jun Chen; Chunhui Wang; | arxiv-cs.SD | 2023-09-22 |
459 | Interactive Music Distance Education Platform Based on RBF Algorithm Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: INTRODUCTION: Since the 21st century, Internet technology has been developing rapidly, and the field of education has gradually broken through the traditional offline teaching … |
Sujie He; | EAI Endorsed Trans. Scalable Inf. Syst. | 2023-09-22 |
460 | Performance Conditioning for Diffusion-Based Multi-Instrument Music Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As the main contribution of this work, we propose enhancing control of multi-instrument synthesis by conditioning a generative model on a specific performance and recording environment, thus allowing for better guidance of timbre and style. |
Ben Maman; Johannes Zeitler; Meinard Müller; Amit H. Bermano; | arxiv-cs.SD | 2023-09-21 |
461 | Passage Summarization with Recurrent Models for Audio-Sheet Music Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, two challenges that arise out of this strategy are the requirement of strongly aligned data to train the networks, and the inherent discrepancies of musical content between audio and sheet music snippets caused by local and global tempo differences. In this paper, we address these two shortcomings by designing a cross-modal recurrent network that learns joint embeddings that can summarize longer passages of corresponding audio and sheet music. |
Luis Carvalho; Gerhard Widmer; | arxiv-cs.SD | 2023-09-21 |
462 | Self-Supervised Contrastive Learning for Robust Audio-Sheet Music Retrieval Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the scarcity of annotated data from real musical content affects the capability of such methods to generalize to real retrieval scenarios. In this work, we investigate whether we can mitigate this limitation with self-supervised contrastive learning, by exposing a network to a large amount of real music data as a pre-training step, by contrasting randomly augmented views of snippets of both modalities, namely audio and sheet images. |
Luis Carvalho; Tobias Washüttl; Gerhard Widmer; | arxiv-cs.SD | 2023-09-21 |
463 | Towards Robust and Truly Large-Scale Audio-Sheet Music Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article we attempt to provide an insightful examination of the current developments on audio-sheet music retrieval via deep learning methods. |
Luis Carvalho; Gerhard Widmer; | arxiv-cs.SD | 2023-09-21 |
464 | K-pop Lyric Translation: Dataset, Analysis, and Neural-Modelling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To broaden the scope of genres and languages in lyric translation studies, we introduce a novel singable lyric translation dataset, approximately 89% of which consists of K-pop song lyrics. |
Haven Kim; Jongmin Jung; Dasaem Jeong; Juhan Nam; | arxiv-cs.CL | 2023-09-20 |
465 | Investigating Personalization Methods in Text to Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate the personalization of text-to-music diffusion models in a few-shot setting. |
Manos Plitsis; Theodoros Kouzelis; Georgios Paraskevopoulos; Vassilis Katsouros; Yannis Panagakis; | arxiv-cs.SD | 2023-09-20 |
466 | Popularity Degradation Bias in Local Music Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the effect of popularity degradation bias in the context of local music recommendations. |
April Trainor; Douglas Turnbull; | arxiv-cs.IR | 2023-09-20 |
467 | MelodyGLM: Multi-task Pre-training for Symbolic Melody Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose MelodyGLM, a multi-task pre-training framework for generating melodies with long-term structure. |
XINDA WU et. al. | arxiv-cs.SD | 2023-09-19 |
468 | Motif-Centric Representation Learning for Symbolic Music Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we aim to learn the implicit relationship between motifs and their variations via representation learning, using the Siamese network architecture and a pretraining and fine-tuning pipeline. |
Yuxuan Wu; Roger B. Dannenberg; Gus Xia; | arxiv-cs.SD | 2023-09-19 |
469 | HumTrans: A Novel Open-Source Dataset for Humming Melody Transcription and Beyond Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces the HumTrans dataset, which is publicly available and primarily designed for humming melody transcription. |
Shansong Liu; Xu Li; Dian Li; Ying Shan; | arxiv-cs.SD | 2023-09-18 |
470 | Positive and Risky Message Assessment for Music Products Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce a pioneering research challenge: evaluating positive and potentially harmful messages within music products. |
Yigeng Zhang; Mahsa Shafaei; Fabio A. González; Thamar Solorio; | arxiv-cs.CL | 2023-09-18 |
471 | Unified Pretraining Target Based Video-music Retrieval With Music Rhythm And Video Optical Flow Information Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, our proposed approach leverages a unified target set to perform video/music pretraining and produces clip-level embeddings to preserve temporal information. |
Tianjun Mao; Shansong Liu; Yunxuan Zhang; Dian Li; Ying Shan; | arxiv-cs.MM | 2023-09-17 |
472 | Estimating Mutual Information for Spike Trains: A Bird Song Example Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Zebra finches are a model animal used in the study of audition. |
Jake Witter; Conor Houghton; | arxiv-cs.IT | 2023-09-14 |
473 | Localify.org: Locally-Focused Music Artist and Event Recommendation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Cities with strong local music scenes enjoy many social and economic benefits. To this end, we are interested in developing a locally-focused artist and event recommendation … |
DOUGLAS TURNBULL et. al. | Proceedings of the 17th ACM Conference on Recommender … | 2023-09-14 |
474 | MuRS: Music Recommender Systems Workshop Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Music recommendation has been a prominent use case in the RecSys community since the early days [4, 14]. With the growth of music streaming … |
Andres Ferraro; Peter Knees; Massimo Quadrana; Tao Ye; F. Gouyon; | Proceedings of the 17th ACM Conference on Recommender … | 2023-09-14 |
475 | SingFake: Singing Voice Deepfake Detection IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose the singing voice deepfake detection task. |
Yongyi Zang; You Zhang; Mojtaba Heydari; Zhiyao Duan; | arxiv-cs.SD | 2023-09-14 |
476 | Comparative Assessment of Markov Models and Recurrent Neural Networks for Jazz Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study aims to compare the performance of a simple Markov chain model and a recurrent neural network (RNN) model, two popular models for sequence generating tasks, in jazz music improvisation. |
Conrad Hsu; Ross Greer; | arxiv-cs.SD | 2023-09-14 |
477 | Undecidability Results and Their Relevance in Modern Music Making Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The study adopts a multidimensional approach, focusing on five key areas: (1) the Turing completeness of Ableton, a widely used digital audio workstation, (2) the undecidability of satisfiability in sound creation utilizing an array of effects, (3) the undecidability of constraints on polymeters in musical compositions, (4) the undecidability of satisfiability in just intonation harmony constraints, and (5) the undecidability of new ordering systems. |
Halley Young; | arxiv-cs.SD | 2023-09-11 |
478 | Exploring Music Genre Classification: Algorithm Analysis and Deployment Architecture Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a study on music genre classification using a combination of Digital Signal Processing (DSP) and Deep Learning (DL) techniques. |
Ayan Biswas; Supriya Dhabal; Palaniandavar Venkateswaran; | arxiv-cs.SD | 2023-09-09 |
479 | Using Deep Learning and Genetic Algorithms for Melody Generation and Optimization in Music Related Papers Related Patents Related Grants Related Venues Related Experts View |
Ling Dong; | Soft Computing | 2023-09-09 |
480 | A Long-Tail Friendly Representation Framework for Artist and Music Similarity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a Long-Tail Friendly Representation Framework (LTFRF) that utilizes neural networks to model the similarity relationship. |
Haoran Xiang; Junyu Dai; Xuchen Song; Furao Shen; | arxiv-cs.SD | 2023-09-08 |
481 | Exploring Significance of SPOC: A Path to Modernization of Music Cloud Computing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: INTRODUCTION: With the development of the information age and the application of cloud computing and big data technology, new changes have occurred in the field of education. … |
Zhaoxia Li; | EAI Endorsed Trans. Scalable Inf. Syst. | 2023-09-06 |
482 | Self-Similarity-Based and Novelty-based Loss for Music Structure Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we propose a supervised approach for the task of music boundary detection. |
Geoffroy Peeters; | arxiv-cs.SD | 2023-09-05 |
483 | FSD: An Initial Chinese Dataset for Fake Song Detection IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Thus, we employ the FSD dataset for the training of ADD models. We subsequently evaluate these models under two scenarios: one with the original songs and another with separated vocal tracks. |
YUANKUN XIE et. al. | arxiv-cs.SD | 2023-09-05 |
484 | DAACI-VoDAn: Improving Vocal Detection with New Data and Methods Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Vocal detection (VD) algorithms aim to detect the presence of vocals in music recordings and are an essential preprocessing step for other tasks, including singer identification … |
Helena Cuesta; N. Kroher; A. Pikrakis; Stojan Djordjevic; | 2023 31st European Signal Processing Conference (EUSIPCO) | 2023-09-04 |
485 | MDSC: Towards Evaluating The Style Consistency Between Music and Dance Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose MDSC (Music-Dance-Style Consistency), the first evaluation metric which assesses to what degree the dance moves and music match. |
Zixiang Zhou; Baoyuan Wang; | arxiv-cs.SD | 2023-09-03 |
486 | Towards Contrastive Learning in Music Video Domain Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Contrastive learning is a powerful way of learning multimodal representations across various domains such as image-caption retrieval and audio-visual representation learning. In this work, we investigate if these findings generalize to the domain of music videos. |
Karel Veldkamp; Mariya Hendriksen; Zoltán Szlávik; Alexander Keijser; | arxiv-cs.IR | 2023-09-01 |
487 | Enhancing The Vocal Range of Single-speaker Singing Voice Synthesis with Melody-unsupervised Pre-training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on our previous work, this work proposes a melody-unsupervised multi-speaker pre-training method conducted on a multi-singer dataset to enhance the vocal range of the single-speaker, while not degrading the timbre similarity. |
Shaohuan Zhou; Xu Li; Zhiyong Wu; Ying Shan; Helen Meng; | arxiv-cs.SD | 2023-09-01 |
488 | HADES: Hash-Based Audio Copy Detection System for Copyright Protection in Decentralized Music Sharing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Preventive measures to stop copyright infringement are yet to be implemented on current decentralized music-sharing platforms. There is no mechanism to reject modified audio … |
MUHAMMAD RASYID et. al. | IEEE Transactions on Network and Service Management | 2023-09-01 |
489 | Exploring T-spherical Fuzzy Sets for Enhanced Evaluation of Vocal Music Classroom Teaching Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Vocal music is a relatively complex skill-based course, which not only plays an important role in music education in universities but also cultivates students’ musical level … |
Yani Lu; | Int. J. Knowl. Based Intell. Eng. Syst. | 2023-08-31 |
490 | An Interactive Tool for Exploring Score-Aligned Performances: Opportunities for Enhanced Music Engagement Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Music scholars and enthusiasts have long been engaged with both performance recordings and musical scores, but inconveniently, these two closely connected mediums are usually … |
Caitlin Sales; Peiyi Wang; Yucong Jiang; | Proceedings of the 18th International Audio Mostly … | 2023-08-30 |
491 | Sequential Pitch Distributions for Raga Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we attempt to detect the raga using a novel feature to extract sequential or temporal information from an audio sample. |
Vishwaas Narasinh; Senthil Raja G; | arxiv-cs.SD | 2023-08-30 |
492 | Symbolic & Acoustic: Multi-domain Music Emotion Modeling for Instrumental Music Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, most prior works were uni-domain and showed weak consistency between arousal modeling performance and valence modeling performance. Based on this background, we designed a multi-domain emotion modeling method for instrumental music that combines symbolic analysis and acoustic analysis. |
Kexin Zhu; Xulong Zhang; Jianzong Wang; Ning Cheng; Jing Xiao; | arxiv-cs.SD | 2023-08-28 |
493 | InstructME: An Instruction Guided Music Edit And Remix Framework with Latent Diffusion Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we develop InstructME, an Instruction guided Music Editing and remixing framework based on latent diffusion models. |
BING HAN et. al. | arxiv-cs.SD | 2023-08-28 |
494 | Automated Conversion of Music Videos Into Lyric Videos Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, making such videos can be challenging and time-consuming as the lyrics need to be added in synchrony and visual harmony with the video. Informed by prior work and close examination of existing lyric videos, we propose a set of design guidelines to help creators make such videos. |
JIAJU MA et. al. | arxiv-cs.HC | 2023-08-28 |
495 | Fairness Through Domain Awareness: Mitigating Popularity Bias For Music Discovery Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we explore the intrinsic relationship between music discovery and popularity bias. |
Rebecca Salganik; Fernando Diaz; Golnoosh Farnadi; | arxiv-cs.CY | 2023-08-28 |
496 | Utilizing Mood-Inducing Background Music in Human-Robot Interaction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: An earlier version of part of the material in this paper appeared originally in the first author’s Ph.D. … |
Elad Liebman; Peter Stone; | arxiv-cs.AI | 2023-08-27 |
497 | A Computational Evaluation Framework for Singable Lyric Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a computational framework for the quantitative evaluation of singable lyric translation, which seamlessly integrates musical, linguistic, and cultural dimensions of lyrics. |
Haven Kim; Kento Watanabe; Masataka Goto; Juhan Nam; | arxiv-cs.CL | 2023-08-25 |
498 | Training Audio Transformers for Cover Song Identification Related Papers Related Patents Related Grants Related Venues Related Experts View |
Te Zeng; F. Lau; | EURASIP Journal on Audio, Speech, and Music Processing | 2023-08-25 |
499 | A Comprehensive Survey for Evaluation Methodologies of AI-Generated Music Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to comprehensively evaluate the subjective, objective, and combined methodologies for assessing AI-generated music, highlighting the advantages and disadvantages of each approach. |
Zeyu Xiong; Weitao Wang; Jing Yu; Yue Lin; Ziyan Wang; | arxiv-cs.SD | 2023-08-25 |
500 | Emotion-Aligned Contrastive Learning Between Images and Music Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we address the task of retrieving emotionally-relevant music from image queries by learning an affective alignment between images and music audio. |
Shanti Stewart; Kleanthis Avramidis; Tiantian Feng; Shrikanth Narayanan; | arxiv-cs.MM | 2023-08-24 |
501 | A Survey of AI Music Generation Tools and Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this work, we provide a comprehensive survey of AI music generation tools, including both research projects and commercialized applications. To conduct our analysis, we … |
Yueyue Zhu; Jared Baca; Banafsheh Rekabdar; Reza Rawassizadeh; | ArXiv | 2023-08-24 |
502 | Music Genre Classification Using Support Vector Machine Techniques Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The classification of musical genres is crucial for enhancing music lovers’ listening experiences, considering the vast amount of music available worldwide. This study conducted … |
Arvin Yuwono; Christopher Alexander Tjiandra; Christopher Owen; I. Manuaba; | 2023 International Conference on Information Management and … | 2023-08-24 |
503 | Deep Learning for Music: A Systematic Literature Review Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recently, Artificial Intelligence development and implementation are becoming faster and more popular. Artificial Intelligence has appeared to help humans in their daily … |
Daniel Kevin Kurniawan; Gregorius Revyanno Alexander; Sidharta Sidharta; | 2023 International Conference on Information Management and … | 2023-08-24 |
504 | Exploiting Time-Frequency Conformers for Music Audio Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, the need for music audio enhancement (referred to as music enhancement from this point onward), which involves transforming degraded audio recordings into pristine high-quality music, has surged as a means of augmenting the auditory experience. To address this issue, we propose a music enhancement system based on the Conformer architecture that has demonstrated outstanding performance in speech enhancement tasks. |
Yunkee Chae; Junghyun Koo; Sungho Lee; Kyogu Lee; | arxiv-cs.SD | 2023-08-24 |
505 | LingGe: An Automatic Ancient Chinese Poem-to-Song Generation System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel system, named LingGe ("伶歌" in Chinese), to generate songs for ancient Chinese poems automatically. |
Yong Shan; Jinchao Zhang; Huiying Ren; Yao Qiu; Jie Zhou; | ijcai | 2023-08-23 |
506 | Humming2Music: Being A Composer As Long As You Can Humming Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present an automatic music generation system to lower the threshold of creating music. |
Yao Qiu; Jinchao Zhang; Huiying Ren; Yong Shan; Jie Zhou; | ijcai | 2023-08-23 |
507 | Q&A: Query-Based Representation Learning for Multi-Track Symbolic Music Re-Arrangement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we tackle rearrangement problems via self-supervised learning, in which the mapping styles can be regarded as conditions and controlled in a flexible way. |
Jingwei Zhao; Gus Xia; Ye Wang; | ijcai | 2023-08-23 |
508 | Graph-based Polyphonic Multitrack Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Nonetheless, there is a lack of works that consider graph representations in the context of deep learning systems for music generation. This paper bridges this gap by introducing a novel graph representation for music and a deep Variational Autoencoder that generates the structure and the content of musical graphs separately, one after the other, with a hierarchical architecture that matches the structural priors of music. |
Emanuele Cosenza; Andrea Valenti; Davide Bacciu; | ijcai | 2023-08-23 |
509 | Linear-Sized Spectral Sparsifiers and The Kadison-Singer Problem Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The Marcus-Spielman-Srivastava (MSS) resolution of the Kadison-Singer problem [Ann. of Math. 182, 327-350 (2015)] has been informally thought of as a strengthening of Batson, Spielman, and Srivastava’s theorem that every undirected graph has a linear-sized spectral sparsifier [SICOMP 41, 1704-1721 (2012)]. We formalize this intuition by using a corollary of the MSS result to derive the existence of spectral sparsifiers with a number of edges linear in their number of vertices for all undirected, weighted graphs. |
Phevos Paschalidis; Ashley Zhuang; | arxiv-cs.DS | 2023-08-23 |
510 | Discrete Diffusion Probabilistic Models for Symbolic Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work presents the direct generation of Polyphonic Symbolic Music using D3PMs. |
Matthias Plasser; Silvan Peter; Gerhard Widmer; | ijcai | 2023-08-23 |
511 | JEPOO: Highly Accurate Joint Estimation of Pitch, Onset and Offset for Music Information Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a highly accurate method for joint estimation of pitch, onset and offset, named JEPOO. |
Haojie Wei; Jun Yuan; Rui Zhang; Yueguo Chen; Gang Wang; | ijcai | 2023-08-23 |
512 | MusicJam: Visualizing Music Insights Via Generated Narrative Illustrations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, in these techniques, the figures are usually pre-selected or statically generated, so they cannot precisely convey insights of different pieces of music. To address this issue, in this paper, we introduce MusicJam, a music visualization system that is able to generate narrative illustrations to represent the insight of the input music. |
CHUER CHEN et. al. | arxiv-cs.HC | 2023-08-22 |
513 | Modeling Bends in Popular Music Guitar Tablatures Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Tablature notation is widely used in popular music to transcribe and share guitar musical content. As a complement to standard score notation, tablatures transcribe performance … |
Alexandre D’Hooge; Louis Bigo; Ken Déguernel; | ArXiv | 2023-08-22 |
514 | Music Understanding LLaMA: Advancing Text-to-Music Generation with Question Answering and Captioning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Text-to-music generation (T2M-Gen) faces a major obstacle due to the scarcity of large-scale publicly available music datasets with natural language captions. To address this, we propose the Music Understanding LLaMA (MU-LLaMA), capable of answering music-related questions and generating captions for music files. |
Shansong Liu; Atin Sakkeer Hussain; Chenshuo Sun; Ying Shan; | arxiv-cs.SD | 2023-08-22 |
515 | Exploring Integration Mechanism of Music Instructional Design and Education Informatization Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: INTRODUCTION: In the new era of information society, the knowledge reform of school music education is getting more and more attention. Traditional teaching methods need to make … |
Chenchen Wang; | EAI Endorsed Trans. Scalable Inf. Syst. | 2023-08-21 |
516 | An Intelligent Sparse Feature Extraction Approach for Music Data Component Recognition and Analysis of Hybrid Instruments Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this paper, a sparse feature extraction method is presented based on sparse decomposition and multiple musical instrument component dictionaries to address the challenges of … |
Yi Liao; Zhen Gui; | J. Intell. Fuzzy Syst. | 2023-08-20 |
517 | Contrastive Learning Based Deep Latent Masking for Music Source Separation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent studies on music source separation have extended their applicability to generic audio signals. Real-time applications for music source separation are necessary to provide … |
Jihyun Kim; Hong-Goo Kang; | Interspeech | 2023-08-20 |
518 | MTANet: Multi-band Time-frequency Attention Network for Singing Melody Extraction from Polyphonic Music Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Singing melody extraction is an important task in music information retrieval. In this paper, we propose a multi-band time-frequency attention network (MTANet) for singing melody … |
Yuan Gao; Ying Hu; Liusong Wang; Hao Huang; Liang He; | Interspeech | 2023-08-20 |
519 | TrOMR: Transformer-Based Polyphonic Optical Music Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a transformer-based approach with excellent global perceptual capability for end-to-end polyphonic OMR, called TrOMR. |
Yixuan Li; Huaping Liu; Qiang Jin; Miaomiao Cai; Peng Li; | arxiv-cs.CL | 2023-08-18 |
520 | Digital Musical Instruments in Special Educational Needs Schools: Requirements from The Music Teachers’ Perspective and The Status Quo in Germany Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Digital musical instruments (DMIs) offer the possibility to create barrier-free access to active music-making and to unique sound aesthetics for a broad group of people, including … |
Andreas Förster; Steffen Lepa; | ACM Transactions on Accessible Computing | 2023-08-17 |
521 | “Why Are There So Many Steps?”: Improving Access to Blind and Low Vision Music Learning Through Personal Adaptations and Future Design Ideas Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Music can be a catalyst for self-development, creative expression, and community building for blind or low vision (BLV) individuals. However, BLV music learners face complex … |
Leon Lu; K. Cochrane; Jin Kang; Audrey Girouard; | ACM Transactions on Accessible Computing | 2023-08-16 |
522 | Modeling The Local Geography of Country Music Concerts in U.S. Urban Areas: Insights from Big Data Analysis of Live Music Events Related Papers Related Patents Related Grants Related Venues Related Experts View |
Tianyu Li; | Urban Informatics | 2023-08-15 |
523 | Ms3: A Parser for MuseScore Files, Serving As Data Factory for Annotated Music Corpora Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Digital Musicology is a vibrant and quickly growing discipline that addresses traditional and novel music-related research questions with digital and computational means (Honing, … |
Johannes Hentschel; M. Rohrmeier; | J. Open Source Softw. | 2023-08-14 |
524 | BigWavGAN: A Wave-To-Wave Generative Adversarial Network for Music Super-Resolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To unleash the potential of large DNN models in music SR, we propose BigWavGAN, which incorporates Demucs, a large-scale wave-to-wave model, with State-Of-The-Art (SOTA) discriminators and adversarial training strategies. |
Yenan Zhang; Hiroshi Watanabe; | arxiv-cs.SD | 2023-08-12 |
525 | An Autoethnographic Exploration of XAI in Algorithmic Composition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces an autoethnographic study of the use of the MeasureVAE generative music XAI model with interpretable latent dimensions trained on Irish folk music. |
Ashley Noel-Hirst; Nick Bryan-Kinns; | arxiv-cs.SD | 2023-08-11 |
526 | DiVa: An Iterative Framework to Harvest More Diverse and Valid Labels from User Comments for Music Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, current solutions fail to resolve it as they cannot produce diverse enough mappings to make up for the information missed by the gold labels. Based on the observation that such missing information may already be present in user comments, we propose to study automated music labeling in an essential but under-explored setting, where the model is required to harvest more diverse and valid labels from the users’ comments given limited gold labels. |
HONGRU LIANG et. al. | arxiv-cs.IR | 2023-08-09 |
527 | Research on Musical Tone Recognition Method Based on Improved RNN for Vocal Music Teaching Network Courses Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The test results show that the fast Fourier process with multiple time superposition and a dimension length of 40 is most beneficial to the accuracy of the model. The loss curve … |
Kaiyi Long; | Int. J. Web Based Learn. Teach. Technol. | 2023-08-09 |
528 | JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces JEN-1, a universal high-fidelity model for text-to-music generation. |
PEIKE LI et. al. | arxiv-cs.SD | 2023-08-09 |
529 | Sudowoodo: A Chinese Lyric Imitation System with Source Lyrics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Sudowoodo, a Chinese lyrics imitation system that can generate new lyrics based on the text of source lyrics. |
YONGZHU CHANG et. al. | arxiv-cs.CL | 2023-08-08 |
530 | Amplifying The Music Listening Experience Through Song Comments on Music Streaming Platforms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, such emotional aspects are often ignored by current platforms, which affects the listeners’ ability to find music that triggers specific personal feelings. To address this gap, this study proposes a novel approach that leverages deep learning methods to capture contextual keywords, sentiments, and induced mechanisms from song comments. |
LONGFEI CHEN et. al. | arxiv-cs.HC | 2023-08-07 |
531 | Search Engine and Recommendation System for The Music Industry Built with JinaAI Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: People often face difficulty searching for a song based solely on its title; hence, a solution is proposed that completes a search analysis from a single query input and matches it against the lyrics of the songs in the database. |
Ishita Gopalakrishnan; Sanjjushri Varshini R; Ponshriharini V; | arxiv-cs.LG | 2023-08-07 |
532 | Bootstrapping Contrastive Learning Enhanced Music Cold-Start Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, there are hardly any studies on this task. Therefore, in this paper, we formalize the problem of Music Cold-Start Matching in detail and propose a scheme. |
Xinping Zhao; Ying Zhang; Qiang Xiao; Yuming Ren; Yingchun Yang; | arxiv-cs.IR | 2023-08-05 |
533 | DiffDance: Cascaded Human Motion Diffusion Model for Dance Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Conventional autoregressive methods introduce compounding errors during sampling and struggle to capture the long-term structure of dance sequences. To address these limitations, we present a novel cascaded motion diffusion model, DiffDance, designed for high-resolution, long-form dance generation. |
QIAOSONG QI et. al. | arxiv-cs.GR | 2023-08-05 |
534 | Towards Improving Harmonic Sensitivity and Prediction Stability for Singing Melody Extraction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an input feature modification and a training objective modification based on two assumptions. |
Keren Shao; Ke Chen; Taylor Berg-Kirkpatrick; Shlomo Dubnov; | arxiv-cs.SD | 2023-08-04 |
535 | An Interpretable, Flexible, and Interactive Probabilistic Framework for Melody Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unfortunately, most recent models are practically impossible to interpret or musically fine-tune, as they use deep neural networks with thousands of parameters. We introduce an interpretable, flexible, and interactive model, SchenkComposer, for melody generation that empowers users to be creative in all aspects of the music generation pipeline and allows them to learn from the process. |
Stephen Hahn; Rico Zhu; Simon Mak; Cynthia Rudin; Yue Jiang; | kdd | 2023-08-04 |
536 | MusicLDM: Enhancing Novelty in Text-to-Music Generation Using Beat-Synchronous Mixup Strategies IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, to tackle these challenges, we first construct a state-of-the-art text-to-music model, MusicLDM, that adapts Stable Diffusion and AudioLDM architectures to the music domain. We achieve this by retraining the contrastive language-audio pretraining model (CLAP) and the Hifi-GAN vocoder, as components of MusicLDM, on a collection of music data samples. |
KE CHEN et. al. | arxiv-cs.SD | 2023-08-03 |
537 | Music De-limiter Networks Via Sample-wise Gain Inversion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce music de-limiter networks that estimate uncompressed music from heavily compressed signals. |
Chang-Bin Jeon; Kyogu Lee; | arxiv-cs.SD | 2023-08-02 |
538 | Exploring How A Generative AI Interprets Music Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We aim to investigate how closely neural networks (NNs) mimic human thinking. As a step in this direction, we study the behavior of artificial neuron(s) that fire most when the … |
G. Barenboim; L. Debbio; J. Hirn; V. Sanz; | ArXiv | 2023-07-31 |
539 | LP-MusicCaps: LLM-Based Pseudo Music Captioning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite its importance, researchers face challenges due to the costly and time-consuming collection process of existing music-language datasets, which are limited in size. To address this data scarcity issue, we propose the use of large language models (LLMs) to artificially generate the description sentences from large-scale tag datasets. |
SeungHeon Doh; Keunwoo Choi; Jongpil Lee; Juhan Nam; | arxiv-cs.SD | 2023-07-30 |
540 | Towards A New Interface for Music Listening: A User Experience Study on YouTube Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We also propose wireframes of a video streaming service for better audio-visual music listening in two stages: search and listening. With these wireframes, we offer practical solutions to enhance user satisfaction with YouTube for music listening. |
Ahyeon Choi; Eunsik Shin; Haesun Joung; Joongseek Lee; Kyogu Lee; | arxiv-cs.HC | 2023-07-27 |
541 | DisCover: Disentangled Music Representation Learning for Cover Song Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we set the goal of disentangling version-specific and version-invariant factors, which could make it easier for the model to learn invariant music representations for unseen query songs. |
JIAHAO XUN et. al. | sigir | 2023-07-25 |
542 | When The Music Stops: Tip-of-the-Tongue Retrieval for Music Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a study of Tip-of-the-tongue (ToT) retrieval for music, where a searcher is trying to find an existing music entity, but is unable to succeed as they cannot accurately recall important identifying information. |
Samarth Bhargav; Anne Schuth; Claudia Hauff; | sigir | 2023-07-25 |
543 | Music, Motion, and Mixed Reality: An Interdisciplinary, Problem-Based Educational Experience Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This talk describes an interdisciplinary educational experience involving cohorts of students studying Computer Science, Dance, 3D Digital Design and Music. The experience takes a … |
Joe Geigel; Thomas Warfield; Yunn-Shan Ma; Dan Roach; S. Foster; | ACM SIGGRAPH 2023 Educator’s Forum | 2023-07-23 |
544 | Melody Slot Machine HD Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Melody Slot Machine HD is an application that allows users to experience generating melodies using musical structures while playing slot machines. To make it possible, we use the … |
Masatoshi Hamanaka; | ACM SIGGRAPH 2023 Appy Hour | 2023-07-23 |
545 | Music Genre Classification with ResNet and Bi-GRU Using Visual Spectrograms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: On the other hand, deep learning classification architectures like the traditional Convolutional Neural Networks (CNN) are effective in capturing spatial hierarchies but struggle to capture the temporal dynamics inherent in music data. To address these challenges, this study proposes a novel approach that uses visual spectrograms as input and a hybrid model that combines the strengths of the Residual Neural Network (ResNet) and the Gated Recurrent Unit (GRU). |
Junfei Zhang; | arxiv-cs.SD | 2023-07-20 |
546 | From West to East: Who Can Understand The Music of The Others Better? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: At the same time, the vast majority of these models have been trained on Western pop/rock music and related styles. This leads to research questions on whether these models can be used to learn representations for different music cultures and styles, or whether we can build similar music audio embedding models trained on data from different cultures or styles. |
Charilaos Papaioannou; Emmanouil Benetos; Alexandros Potamianos; | arxiv-cs.SD | 2023-07-19 |
547 | Polyffusion: A Diffusion Model for Polyphonic Score Generation with Internal and External Controls IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: We propose Polyffusion, a diffusion model that generates polyphonic music scores by regarding music as image-like piano roll representations. The model is capable of controllable … |
Lejun Min; Junyan Jiang; Gus G. Xia; Jingwei Zhao; | ArXiv | 2023-07-19 |
548 | Mood Classification of Bangla Songs Based on Lyrics Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Music can evoke various emotions, and with the advancement of technology, it has become more accessible to people. Bangla music, which portrays different human emotions, lacks … |
Maliha Mahajebin; Mohammad Rifat Ahmmad Rashid; N. Mansoor; | ArXiv | 2023-07-19 |
549 | Collaborative Music-making: Special Educational Needs School Assistants As Facilitators in Performances with Accessible Digital Musical Instruments Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The field of research dedicated to Accessible Digital Musical Instruments (ADMIs) is growing and there is an increased interest in promoting diversity and inclusion in … |
Hans Lindetorp; Marianne Svahn; Josefine Hölling; Kjetil Falkenberg; Emma Frid; | Frontiers Comput. Sci. | 2023-07-19 |
550 | Deep Learning for Multi-Structured Javanese Gamelan Note Generator Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Javanese gamelan, a traditional Indonesian musical style, has several song structures called gendhing. Gendhing (songs) are written in conventional notation and require gamelan … |
Arik Kurniawati; E. M. Yuniarno; Y. Suprapto; | Knowl. Eng. Data Sci. | 2023-07-18 |
551 | JAZZVAR: A Dataset of Variations Found Within Solo Piano Performances of Jazz Standards for Music Overpainting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we outline the curation process for obtaining and sorting the repertoire, the pipeline for creating the Original and Variation pairs, and our analysis of the dataset. |
Eleanor Row; Jingjing Tang; George Fazekas; | arxiv-cs.SD | 2023-07-18 |
552 | ProgGP: From GuitarPro Tablature Neural Generation To Progressive Metal Production Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We extend this work by fine-tuning a pre-trained Transformer model on ProgGP, a custom dataset of 173 progressive metal songs, for the purposes of creating compositions from that genre through a human-AI partnership. |
Jackson Loth; Pedro Sarmento; CJ Carr; Zack Zukowski; Mathieu Barthet; | arxiv-cs.SD | 2023-07-11 |
553 | On The Effectiveness of Speech Self-supervised Learning for Music Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nevertheless, research exploring the effectiveness of applying speech SSL models to music recordings has been limited. We explore the music adaptation of SSL with two distinctive speech-related models, data2vec1.0 and HuBERT, and refer to them as music2vec and musicHuBERT, respectively. |
YINGHAO MA et. al. | arxiv-cs.SD | 2023-07-11 |
554 | Collaborative Song Dataset (CoSoD): An Annotated Dataset of Multi-artist Collaborations in Popular Music Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: The Collaborative Song Dataset (CoSoD) is a corpus of 331 multi-artist collaborations from the 2010-2019 Billboard Hot 100 year-end charts. The corpus is annotated with formal … |
M. Duguay; Kate Mancey; J. Devaney; | ArXiv | 2023-07-10 |
555 | VampNet: Music Generation Via Masked Acoustic Token Modeling IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce VampNet, a masked acoustic token modeling approach to music synthesis, compression, inpainting, and variation. |
Hugo Flores Garcia; Prem Seetharaman; Rithesh Kumar; Bryan Pardo; | arxiv-cs.SD | 2023-07-10 |
556 | Emotion-Guided Music Accompaniment Generation Based on Variational Autoencoder Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Existing models struggle to effectively characterize human emotions within neural network models while composing music. To address this issue, we propose the use of an easy-to-represent emotion flow model, the Valence/Arousal Curve, which allows for the compatibility of emotional information within the model through data transformation and enhances interpretability of emotional factors by utilizing a Variational Autoencoder as the model structure. |
Qi Wang; Shubing Zhang; Li Zhou; | arxiv-cs.SD | 2023-07-08 |
557 | Unsupervised Melody-to-Lyrics Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a method for generating high-quality lyrics without training on any aligned melody-lyric data. |
YUFEI TIAN et. al. | acl | 2023-07-08 |
558 | Songs Across Borders: Singable and Controllable Neural Lyric Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper bridges the singability quality gap by formalizing lyric translation into a constrained translation problem, converting theoretical guidance and practical techniques from translatology literature to prompt-driven NMT approaches, exploring better adaptation methods, and instantiating them to an English-Chinese lyric translation system. |
Longshen Ou; Xichu Ma; Min-Yen Kan; Ye Wang; | acl | 2023-07-08 |
559 | LaunchpadGPT: Language Model As Music Visualization Designer on Launchpad Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Launchpad is a musical instrument that allows users to create and perform music by pressing illuminated buttons. To assist and inspire the design of Launchpad light effects, and to provide a more accessible approach for beginners to create music visualizations with this instrument, we propose the LaunchpadGPT model to generate music visualization designs on Launchpad automatically. |
Siting Xu; Yunlong Tang; Feng Zheng; | arxiv-cs.SD | 2023-07-07 |
560 | Track Mix Generation on Music Streaming Services Using Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces Track Mix, a personalized playlist generation system released in 2022 on the music streaming service Deezer. |
WALID BENDADA et. al. | arxiv-cs.IR | 2023-07-06 |
561 | Classification Of Carnatic Music Ragas Using RNN Deep Learning Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Present-day listeners can find it challenging to invest the time and effort necessary to learn the fundamentals of a particular genre of music. It would be easier to study … |
Krishnendu R; P. S S; | 2023 14th International Conference on Computing … | 2023-07-06 |
562 | Detection of Explicit Lyrics in Hindi Music Using LSTM Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Music can significantly impact a child’s development, but the increasing amount of sexual and violent content in song lyrics has raised concerns. Despite efforts to filter … |
NOMI BARUAH et. al. | 2023 14th International Conference on Computing … | 2023-07-06 |
563 | Recommendation of Independent Music Based on Sentiment Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recently, the use of social media platforms for recommendations has dramatically increased. Today, people frequently use social media websites like Facebook and Twitter, as well … |
Amrutha B.; S. M.; | 2023 14th International Conference on Computing … | 2023-07-06 |
564 | Deep Learning Approaches for Melody Generation: An Evaluation Using LSTM, BILSTM and GRU Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Music generation is an application of machine learning that has garnered significant attention over the recent past. In this study we generated musical notes using three deep … |
Meera Subramanian; Lakshmi Swetha S; Rajalakshmi V R; | 2023 14th International Conference on Computing … | 2023-07-06 |
565 | Track Mix Generation on Music Streaming Services Using Transformers Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper introduces Track Mix, a personalized playlist generation system released in 2022 on the music streaming service Deezer. Track Mix automatically generates “mix” … |
WALID BENDADA et. al. | Proceedings of the 17th ACM Conference on Recommender … | 2023-07-06 |
566 | Design and Implementation of AI Based Efficient Emotion Detection and Music Recommendation System Summary Related Papers Related Patents |