Paper Digest: ICDE 2025 Papers & Highlights
Interested users can choose to read all ICDE-2025 papers in our digest console, which supports more features.
To search for papers presented at ICDE-2025 on a specific topic, please use the search by venue (ICDE-2025) service. To summarize the latest research published at ICDE-2025 on a specific topic, you can use the review by venue (ICDE-2025) service. To synthesize the findings from ICDE 2025 into comprehensive reports, try ICDE-2025 Research. If you are interested in browsing papers by author, we have a comprehensive list of all ICDE-2025 authors & their papers.
This curated list is created by the Paper Digest Team. Paper Digest is an AI-powered research platform that delivers personalized and comprehensive updates on the latest research in your field. It also helps you read articles, write articles, get answers, conduct literature reviews and generate research reports.
TABLE 1: Paper Digest: ICDE 2025 Papers & Highlights
| # | Paper | Author(s) |
|---|---|---|
| 1 | Description-Similarity Rules: Towards Flexible Feature Engineering for Entity Matching. Highlight: Consequently, they suffer model retraining cost to select features, and can hardly customize to different EM tasks. To tackle this problem, we propose Description-Similarity Rules (DSR) for EM feature engineering. | Yafeng Tang; Zheng Liang; Hongzhi Wang; Xiaoou Ding; Tianyu Mu; Huan Hu |
| 2 | Online Federated Learning on Distributed Unknown Data Using UAVs. Highlight: In this paper, we focus on the scenario of multiple UAVs performing Federated Learning (FL) tasks. | Xichong Zhang; Haotian Xu; Yin Xu; Mingjun Xiao; Jie Wu; Jinrui Zhou |
| 3 | Grounding Natural Language to SQL Translation with Data-Based Self-Explanations. Highlight: In this paper, we propose CycleSQL, an iterative framework designed for end-to-end translation models to autonomously generate the best output through self-evaluation. | Yuankai Fan; Tonghui Ren; Can Huang; Zhenying He; X. Sean Wang |
| 4 | AllHands: Ask Me Anything on Large-scale Verbatim Feedback Via Large Language Models. Highlight: This paper introduces AllHands, an innovative analytic framework that transforms traditional large-scale feedback analysis tasks through a natural language interface, leveraging large language models (LLMs). | Chaoyun Zhang; Zicheng Ma; Yuhao Wu; Shilin He; Si Qin; Minghua Ma; Xiaoting Qin; Yu Kang; Yuyi Liang; Xiaoyu Gou; Yajie Xue; Qingwei Lin; Saravan Rajmohan; Dongmei Zhang; Qi Zhang |
| 5 | Learnable Sparse Customization in Heterogeneous Edge Computing. Highlight: In this work, we propose Learnable Personalized Sparsification for heterogeneous Federated learning (FedLPS), which achieves the learnable customization of heterogeneous sparse models with importance-associated patterns and adaptive ratios to simultaneously tackle system and statistical heterogeneity. | Jingjing Xue; Sheng Sun; Min Liu; Yuwei Wang; Zhuotao Liu; Jingyuan Wang |
| 6 | Many Hands Make Light Work: Accelerating Edge Inference Via Multi-Client Collaborative Caching. Highlight: To address the aforementioned challenges, we propose an efficient inference framework, CoCa, which leverages a multi-client collaborative caching mechanism to accelerate edge inference. | Wenyi Liang; Jianchun Liu; Hongli Xu; Chunming Qiao; Liusheng Huang |
| 7 | ALT-Index: A Hybrid Learned Index for Concurrent Memory Database Systems. Highlight: In the optimized ART layer, we introduce a fast and compact pointer buffer to further improve the overall performance. | Yuxin Yang; Fang Wang; Mengya Lei; Peng Zhang; Dan Feng |
| 8 | Large-Scale Spatiotemporal Kernel Density Visualization. Highlight: Although a recent approach, the sliding-window-based solution (SWS), reduces the time complexity of STKDV, it (i) is unable to reduce the time complexity for supporting STKDV-based exploratory analysis, (ii) is not theoretically efficient, and (iii) does not provide optimization techniques for bandwidth tuning. To eliminate these drawbacks, we propose a prefix-set-based solution (PREFIX) that encompasses three methods, namely PREFIXsingle (addressing (i)), PREFIXmultiple (addressing (ii)), and PREFIXtuning (addressing (iii)). | Tsz Nam Chan; Pak Lon Ip; Bojian Zhu; Leong Hou U; Dingming Wu; Jianliang Xu; Christian S. Jensen |
| 9 | HourglassSketch: An Efficient and Scalable Framework for Graph Stream Summarization. Highlight: In this paper, we propose HourglassSketch, a two-stage data structure, for high-accuracy graph stream summarization. | Jiarui Guo; Boxuan Chen; Kaicheng Yang; Tong Yang; Zirui Liu; Qiuheng Yin; Sha Wang; Yuhan Wu; Xiaolin Wang; Bin Cui; Tao Li; Xi Peng; Renhai Chen; Gong Zhang |
| 10 | CADRL: Category-Aware Dual-Agent Reinforcement Learning for Explainable Recommendations Over Knowledge Graphs. Highlight: (2) The excessive reliance on short recommendation paths due to efficiency concerns. To surmount these challenges, we propose a category-aware dual-agent reinforcement learning (CADRL) model for explainable recommendations over KGs. | Shangfei Zheng; Hongzhi Yin; Tong Chen; Xiangjie Kong; Jian Hou; Pengpeng Zhao |
| 11 | SOUND: Sanity Checking of Pipelines for Uncertain and Sparse Data Series. Highlight: In this paper, we present SOUND to enable sanity checking of pipelines in the presence of typical quality issues in data series. | Hermann Stolte; Iftach Sadeh; Elisa Pueschel; Avigdor Gal; Matthias Weidlich |
| 12 | Preserving K-Connectivity in Dynamic Graphs. Highlight: We propose efficient algorithms to significantly improve the theoretical running time for both edge insertion and edge deletion compared with the baseline. | Gengda Zhao; Dong Wen; Xiaoyang Wang; Kai Wang; Xuemin Lin |
| 13 | An Adaptive Sampling Algorithm for The Top-$K$ Group Betweenness Centrality. Abstract: Betweenness centrality is one of the key centrality measures in many applications including community detections in biological networks, vulnerability detections in communication … | Wenzheng Xu; Honglin Mao; Heng Shao; Weifa Liang; Jian Peng; Wen Huang; Zichuan Xu; Pan Zhou; Jeffrey Xu Yu |
| 14 | BQSched: A Non-Intrusive Scheduler for Batch Concurrent Queries Via Reinforcement Learning. Highlight: The latest reinforcement learning (RL) based methods have the potential to capture these patterns from feedback, but it is non-trivial to apply them directly due to the large scheduling space, high sampling cost, and poor sample utilization. Motivated by these challenges, we propose BQSched, a non-intrusive Scheduler for Batch concurrent Queries via reinforcement learning. | Chenhao Xu; Chunyu Chen; Jinglin Peng; Jiannan Wang; Jun Gao |
| 15 | DEPA-Delta Shifting and Distribution Shaping for Efficient Adaptive Indexing. Highlight: In this paper, we propose two novel techniques, namely Delta Shift Partitioning and Distribution Shaping Partitioning, that achieve tenfold better performance than the competitors without compromising the adaptivity of the index. | Ahmad Khazaie; Holger Pirk |
| 16 | DaVinci Sketch: A Versatile Sketch for Efficient and Comprehensive Set Measurements. Highlight: This paper introduces DaVinci Sketch, a versatile sketch designed to efficiently handle various set measurement tasks using a single unified data structure. | Yanshu Wang; Jianan Ji; Chao-Hsuan Liu; Hengyang Zhou; Tong Yang |
| 17 | Explaining Expert Search and Team Formation Systems with ExES. Highlight: However, state-of-the-art solutions to this problem lack transparency. To address this issue, we propose ExES, a tool designed to explain expert search and team formation systems using factual and counterfactual methods from the field of explainable artificial intelligence (XAI). | Kiarash Golzadeh; Lukasz Golab; Jarek Szlichta |
| 18 | OOCC: One-Round Optimistic Concurrency Control for Read-Only Disaggregated Transactions. Highlight: This paper introduces OOCC, a novel One-round Optimistic Concurrency Control method tailored for disaggregated transactions. | Hao Wu; Mingxing Zhang; Kang Chen; Xia Liao; Yingdi Shan; Yongwei Wu |
| 19 | Efficient Learning-Based Graph Simulation for Temporal Graphs. Highlight: In this paper, we focus on simulating temporal graphs, which aim to reproduce the structural and temporal properties of the observed real-life temporal graphs. | Sheng Xiang; Chenhao Xu; Dawei Cheng; Xiaoyang Wang; Ying Zhang |
| 20 | Towards Fair Graph Neural Networks Via Graph Counterfactual Without Sensitive Attributes. Highlight: In this paper, we propose a framework named Fairwos (improving Fairness withOut sensitive attributes). | Xuemin Wang; Tianlong Gu; Xuguang Bao; Liang Chang |
| 21 | HINSCAN: Efficient Structural Graph Clustering Over Heterogeneous Information Networks. Highlight: However, in many real applications, such as bibliographic networks and knowledge graphs, the input graphs are heterogeneous information networks consisting of multi-typed and interconnected objects, so SCAN cannot be applied to cluster them. Therefore, in this paper, we study the SCAN problem over heterogeneous information networks. | Long Yuan; Xiaotong Sun; Zi Chen; Peng Cheng; Longbin Lai; Xuemin Lin |
| 22 | Ratel: Optimizing Holistic Data Movement to Fine-tune 100B Model on A Consumer GPU. Highlight: In this paper, we focus on LLM fine-tuning on a single consumer-grade GPU in a commodity server with limited main memory capacity, which is accessible to most AI researchers. | Changyue Liao; Mo Sun; Zihan Yang; Jun Xie; Kaiqi Chen; Binhang Yuan; Fei Wu; Zeke Wang |
| 23 | Efficient Pruning Via Entailment Cardinality Estimation for Fast Top-Down Logic Rule Mining. Highlight: In this paper, we propose an efficient pruning method based on entailment cardinality estimation. | Ruoyu Wang; Raymond Wong; Daniel Sun |
| 24 | Hyperion: Co-Optimizing SSD Access and GPU Computation for Cost-Efficient GNN Training. Highlight: In this work, we present Hyperion, a cost-efficient system for terabyte-scale GNN training. | Jie Sun; Mo Sun; Zheng Zhang; Zuocheng Shi; Jun Xie; Zihan Yang; Jie Zhang; Zeke Wang; Fei Wu |
| 25 | Boosting Accuracy and Efficiency for Vector Retrieval with Local Scaling Graph. Highlight: Through both empirical and theoretical analysis, we identify that the existence of antihubs is the root cause of these performance limitations. To mitigate the negative impact of antihubs, we propose a highly efficient graph-based vector retrieval framework named Local Scaling Graph (LSG) by introducing more incident edges for them in a systematic way. | Hongya Wang; Wenlong Wu; Cong Luo; Aobei Bian; Chunguang Meng; Yishuo Wu; Ji Sun |
| 26 | Promi: Progressive Live Migration in Distributed Database Systems. Highlight: This paper introduces Promi, a live data migration method that progressively migrates data at the granularity of mini-partitions instead of entire partitions. | Zhenghao Ding; Xinyi Zhang; Wei Lu; Wenlong Ma; Wenliang Zhang; Xiaoyong Du |
| 27 | Efficient Methods for Accurate Sparse Trajectory Recovery and Map Matching. Highlight: In this paper, we present efficient methods TRMMA and MMA for accurate trajectory recovery and map matching, respectively, where MMA serves as the first step of TRMMA. | Wei Tian; Jieming Shi; Man Lung Yiu |
| 28 | Incremental Stream Query Placement in Massively Distributed and Volatile Infrastructures. Highlight: In this paper, we propose ISQP, a framework that keeps the operator placements valid under query and infrastructure changes. | Ankit Chaudhary; Kaustubh Beedkar; Jeyhun Karimov; Felix Lang; Steffen Zeuch; Volker Markl |
| 29 | LETFramework: Let The Universal Sketch Be Accurate. Highlight: In this paper, we propose LETFramework (short for Lossless ExTraction Framework) to optimize the performance of the universal sketch. | Ruijie Miao; Xiangwei Deng; Zicang Xu; Ziyun Zhang; Tong Yang |
| 30 | A Cost-Effective and Decompression-Transparent Compressor for OLTP-Oriented Databases. Highlight: To this end, we present DPTC, a cost-effective and decompression-transparent approach designed to compress data pages, the basic storage unit of OLTP database systems. | Hao Hu; Qiyang Zheng; Xiangyu Zou; Lisha Qin; Chengwei Zhang; Wanchuan Zhang; Zhaoheng Jiang; Dingwen Tao; Hongpeng Wang; Wen Xia |
| 31 | Heterogeneous-Aware Traffic Prediction: A Privacy-Preserving Federated Learning Framework. Highlight: To tackle data quality heterogeneity, Fed4TP introduces a dual-driven method, i.e., global detection and local denoising, to improve client data quality. | Zhihao Zeng; Ziquan Fang; Yuting Huang; Qilong Wang; Lu Chen; Yunjun Gao |
| 32 | GeoTP: Latency-Aware Geo-Distributed Transaction Processing in Database Middlewares. Highlight: In this paper, we propose GeoTP, a latency-aware geo-distributed transaction processing approach in database middleware. | Qiyu Zhuang; Xinyue Shi; Shuang Liu; Wei Lu; Zhanhao Zhao; Yuxing Chen; Tong Li; Anqun Pan; Xiaoyong Du |
| 33 | Exploring SIMD Vectorization in Aggregation Pipelines for Encoded IoT Data. Highlight: This paper identifies operators to process and accelerate IoT aggregation queries based on encoded data arrays, extensible to integrate thread-level and instruction-level designs. | Rui Kang; Shaoxu Song; Jianmin Wang |
| 34 | CommunityDF: A Guided Denoising Diffusion Approach for Community Search. Highlight: In this work, we propose CommunityDF, a novel framework that applies DDPMs to the community search problem, which involves identifying subgraphs containing nodes closely related to a given query node. | Jiazun Chen; Yikuan Xia; Jun Gao; Zhao Li; Hongyang Chen |
| 35 | Having It Both Ways: Single Trajectory Embedding for Similarity Computation with Pairwise Learning. Highlight: However, creating a robust embedding model presents challenges, including the lack of direct involvement in the computational similarity process, adherence to non-metric similarity spaces, and the integration of precise similarity computation alignments. To address these challenges, we introduce DTisT, a novel embedding framework that enhances trajectory embeddings by pairwise learning from dual-trajectory input models. | Jianing Si; Haitao Yuan; Xiang Li; Nan Jiang; Xiao Ma; Guoliang Li; Shangguang Wang |
| 36 | Auto-TSF: Towards Proxy-Model-Based Meta-Learning for Automatic Time Series Forecasting Algorithm Selection. Highlight: In this paper, we propose a Proxy-Model-based meta-learning TSF-CASH approach named Auto-TSF. | Tianyu Mu; Hongzhi Wang; Chen Liang; Xinyue Shao |
| 37 | HC-SpMM: Accelerating Sparse Matrix-Matrix Multiplication for Graphs with Hybrid GPU Cores. Highlight: In this paper, we present HC-SpMM, a pioneering algorithm that leverages Hybrid GPU Cores (Tensor cores and CUDA cores) to accelerate SpMM for graphs. | Zhonggen Li; Xiangyu Ke; Yifan Zhu; Yunjun Gao; Yaofeng Tu |
| 38 | Efficient $\eta$-Threshold Maintenance in Dynamic Uncertain Graphs. Highlight: It is costly to recompute all $\eta$-thresholds from scratch whenever the uncertain graphs face update operations, e.g., edge insertion and deletion, and the modifications on edge probability. Motivated by this, we introduce efficient $\eta$-threshold maintenance algorithms tailored for dynamic uncertain graphs in this paper. | Yu Chen; Qing Liu; Yifan Zhu; Yunjun Gao |
| 39 | Dataset Discovery Via Line Charts. Highlight: In this paper, we explore a novel dataset discovery problem, dataset discovery via line charts, focusing on the use of line charts as queries to discover datasets within a large data repository that are capable of generating similar line charts. | Daomin Ji; Hui Luo; Zhifeng Bao; J. Shane Culpepper |
| 40 | Vista: Vector Indexing and Search for Large-Scale Imbalanced Datasets. Highlight: In this work, we address the challenges faced by current advanced ANNS approaches when dealing with vectors characterized by imbalanced distributions, which negatively impact search efficiency. | Yujian Fu; Cheng Chen; Yao Chen; Weng-Fai Wong; Bingsheng He |
| 41 | Time-Aware Influence Minimization Via Blocking Social Networks. Highlight: In this paper, we investigate the Time-aware Influence Minimization (TIMIN) problem in social networks, focusing on minimizing negative influence concerning a critical deadline by temporarily blocking specific nodes in the given social network. | Xueqin Chang; Jiajie Fu; Qing Liu; Yunjun Gao; Baihua Zheng |
| 42 | TierBase: A Workload-Driven Cost-Optimized Key-Value Store. Highlight: This paper introduces a Space-Performance Cost Model for key-value store, designed to guide cost-effective storage configuration decisions. | Zhitao Shen; Shiyu Yang; Weibo Chen; Kunming Wang; Yue Li; Jiabao Jin; Wei Jia; Junwei Chen; Yuan Su; Xiaoxia Duan; Wei Chen; Lei Wang; Jie Song; Ruoyi Ruan; Xuemin Lin |
| 43 | Joinable Search Over Multi-Source Spatial Datasets: Overlap, Coverage, and Efficiency. Highlight: To support two cases of joinable search over multiple spatial data sources seamlessly, we propose a multi-source spatial dataset search framework. | Wenzhe Yang; Sheng Wang; Zhiyu Chen; Yuan Sun; Zhiyong Peng |
| 44 | Maximal Clique Enumeration with Hybrid Branching and Early Termination. Highlight: In this paper, we present a novel approach, HBBMC, a hybrid framework combining vertex-oriented BK branching and edge-oriented BK branching, where the latter adopts a branch-and-bound framework which forms the sub-branches by expanding the partial clique with an edge. | Kaixin Wang; Kaiqiang Yu; Cheng Long |
| 45 | TopTune: Tailored Optimization for Categorical and Continuous Knobs Towards Accelerated and Improved Database Performance Tuning. Highlight: In this paper, we propose TopTune, which employs tailored optimization for continuous and categorical knobs, to achieve accelerated tuning efficiency and improved tuning performance. | Rukai Wei; Yu Liu; Yufeng Hou; Heng Cui; Yongqiang Zhang; Ke Zhou |
| 46 | CrossEM: A Prompt Tuning Framework for Cross-Modal Entity Matching. Highlight: To support EM on heterogeneous entity with different data formats and modalities, we propose cross-modal entity matching in this paper. | Qin Yuan; Ye Yuan; Zhenyu Wen; Chi Chen; Guoren Wang |
| 47 | CrossETR: A Semantic-Driven Framework for Entity Matching Across Images and Graph. Highlight: Current semantically matching solutions over cross-modal data face the obstacle of low training efficiency, since their time complexity quadratically grows with the number of entities. To alleviate this issue, we present a novel framework (namely CrossETR) that follows an exploration-then-refinement paradigm. | Qin Yuan; Zhenyu Wen; Jiaxu Qian; Ye Yuan; Guoren Wang |
| 48 | StructRide: A Framework to Exploit The Structure Information of Shareability Graph in Ridesharing. Highlight: In addition, the graph is a powerful tool to analyze the structure information between nodes. Therefore, in this paper, we propose a framework, namely StructRide, to utilize the structure information to improve the results for ridesharing problems. | Jiexi Zhan; Yu Chen; Peng Cheng; Lei Chen; Wangze Ni; Xuemin Lin |
| 49 | A Storage Model with Fine-Grained In-Storage Query Processing for Spatio-Temporal Data. Highlight: However, this approach suffers from significant data movement overhead between hosts and drives. To address this issue, this work introduces Groundhog, an efficient in-storage computing technique designed specifically for spatio-temporal queries, aimed at reducing unnecessary data movement and computations. | Yang Guo; Tianyu Wang; Zizhan Chen; Zili Shao |
| 50 | Numerical Estimation of Spatial Distributions Under Differential Privacy. Highlight: In this paper, we address the problem of private spatial distribution estimation, where we collect spatial data from individuals and aim to minimize the distance between the actual distribution and estimated one under Local Differential Privacy (LDP). | Leilei Du; Peng Cheng; Libin Zheng; Xiang Lian; Lei Chen; Wei Xi; Wangze Ni |
| 51 | Accelerating D-Core Maintenance Over Dynamic Directed Graphs. Highlight: However, the peeling-based method suffers from efficiency issues, e.g., it may degenerate into recomputing all the D-cores and is inefficient for batch updates due to sequential processing. To address these limitations, we introduce novel algorithms for incrementally maintaining D-cores in dynamic graphs. | Xuankun Liao; Qing Liu; Jiaxin Jiang; Byron Choi; Bingsheng He; Jianliang Xu |
| 52 | Most Probable Maximum Weighted Butterfly Search. Highlight: In this paper, we introduce the Most Probable Maximum Weighted Butterfly (MPMB), which holds the highest probability of becoming a maximum weighted butterfly on an uncertain bipartite network. | Yu Shao; Peng Cheng; Longbin Lai; Long Yuan; Wangze Ni; Xuemin Lin |
| 53 | OneRoundSTL: In-Database Seasonal-Trend Decomposition. Highlight: In this paper, we propose OneRoundSTL, which pre-calculates offline some results in each individual page and concatenates the pre-calculated results online at query time to obtain the decomposition outcome. | Zijie Chen; Shaoxu Song; Jianmin Wang |
| 54 | BLEND: A Unified Data Discovery System. Highlight: To reduce the execution runtime of discovery pipelines, we propose a unified index structure and a rule- and cost-based optimizer that rewrites SQL statements into low-level operators when possible. | Mahdi Esmailoghli; Christoph Schnell; Renée J. Miller; Ziawasch Abedjan |
| 55 | Efficient Route and Area Matching Query in Dynamic Road Networks. Highlight: In this paper, we study the route and area matching (ROAM) problem in dynamic road networks. | Yikun Wang; Dian Ouyang; Zhuoran Wang; Dong Wen; Xuemin Lin |
| 56 | PBSM: Predictive Bi-Preference Stable Matching in Spatial Crowdsourcing. Highlight: As a result, they gain suboptimal assignment results in most cases. Inspired by this, we propose a novel problem, named the Predictive Bi-preference Stable Match problem (PBSM), with the goal of maximizing the preferences of both workers and tasks by taking into account the social network of workers and task completion sequence. | Yuan Xie; Yumeng Liu; Xu Zhou; Yifang Yin; Kenli Li; Roger Zimmermann |
| 57 | Self-Supervised Trajectory Representation Learning with Multi-Scale Spatio-Temporal Feature Exploration. Highlight: Moreover, most existing methods do not sufficiently capture the multi-faceted temporal features within trajectories. To fill these gaps, we propose a novel self-supervised Trajectory Representation Learning model with multi-scale spatio-temporal features exploration called TrajRL. | Hong Xia; Xiao Zhang; Yuan Cao; Lei Cao; Yanwei Yu; Junyu Dong |
| 58 | Towards Robustness of Text-to-Visualization Translation Against Lexical and Phrasal Variability. Highlight: In this study, we thoroughly examine the robustness of current text-to-vis models, an area that has not previously been explored. | Jinwei Lu; Yuanfeng Song; Haodi Zhang; Chen Jason Zhang; Kaishun Wu; Raymond Chi-Wing Wong |
| 59 | DAG*: A Novel A*-Alike Algorithm for Optimal Workflow Execution Across IoT Platforms. Highlight: We introduce DAG*, an A*-alike algorithm that prunes large amounts of the search space explored for suggesting the most efficient workflow execution with formal optimality guarantees. | Errikos Streviniotis; Dimitrios Banelas; Nikos Giatrakos; Antonios Deligiannakis |
| 60 | Revelio: Revealing Important Message Flows in Graph Neural Networks. Highlight: In this paper, we introduce Revelio, a novel method to provide faithful explanations of message flows in GNNs. | Haoyu He; Isaiah J. King; H. Howie Huang |
| 61 | FAHL: An Efficient Labeling Index for Flow-Aware Shortest Path Querying in Road Networks. Highlight: (2) maintenance latency, the traffic-flow and edges’ weights undergo high-frequency changes with different traffic conditions, meaning that our index must be able to support high-frequency updates. To this end, we propose a novel Flow-Aware Hierarchical Labeling Index (FAHL) in this paper. | Tangpeng Dan; Xiao Pan; Bolong Zheng; Xiaofeng Meng |
| 62 | DATA-WA: Demand-Based Adaptive Task Assignment with Dynamic Worker Availability Windows. Highlight: To reduce the search space of task assignments and be efficient, we propose a worker dependency separation approach based on graph partition and a task value function with reinforcement learning. | Jinwen Chen; Jiannan Guo; Dazhuo Qiu; Yawen Li; Guanhua Ye; Yan Zhao; Kai Zheng |
| 63 | On Simplifying Large-Scale Spatial Vectors: Fast, Memory-Efficient, and Cost-Predictable $k$-Means. Highlight: In this paper, we propose a fast, memory-efficient, and cost-predictable $k$-means called Dask-means. | Yushuai Ji; Zepeng Liu; Sheng Wang; Yuan Sun; Zhiyong Peng |
| 64 | QaVA: Query-Aware Video Analysis Framework Based on Data Access Pattern. Highlight: This paper proposes a query-aware video analysis framework, QaVA, to improve query performance further. | Tianxiong Zhong; Zhiwei Zhang; Yihang Fu; Guo Lu; Ye Yuan; Guoren Wang |
| 65 | Approximate Vector Set Search: A Bio-Inspired Approach for High-Dimensional Spaces. Highlight: In this work, we aim to address the efficiency challenges posed by the combinatorial explosion in vector set search, as well as the curse of dimensionality inherited from single-vector search. | Yiqi Li; Sheng Wang; Zhiyu Chen; Shangfeng Chen; Zhiyong Peng |
| 66 | DaRec: A Disentangled Alignment Framework for Large Language Model and Recommender System. Highlight: In this work, we prove that directly aligning the representations of LLMs and collaborative models is suboptimal for enhancing downstream recommendation tasks performance, based on the information theorem. | Xihong Yang; Heming Jing; Zixing Zhang; Jindong Wang; Huakang Niu; Shuaiqiang Wang; Yu Lu; Junfeng Wang; Dawei Yin; Xinwang Liu; En Zhu; Defu Lian; Erxue Min |
| 67 | Scaling Asynchronous Graph Query Processing Via Partitioned Stateful Traversal Machines. Highlight: Despite the widespread availability of many-core CPUs and high-speed networking in modern datacenters, existing distributed graph query systems struggle with their inherent inefficiencies, resulting in low hardware utilization and poor query performance on these state-of-the-art hardware. To address these challenges, we introduce the Partitioned Stateful Traversal Machine (PSTM), which extends the Gremlin graph traversal machine. | Shaoyuan Chen; Hongtao Chen; Shaonan Ma; Yajie Qin; Zheng Wang; Weiyu Xie; Mingxing Zhang; Kang Chen; Xia Liao; Yingdi Shan; Jinlei Jiang; Yongwei Wu |
| 68 | GFlux: A Fast GPU-Based Out-of-Memory Multi-Hop Query Processing Framework for Trillion-Edge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Three key issues arise when processing multi-hop queries on large-scale graphs using GPUs: the need for an efficient graph format, effective scheduling of accesses to graph partitions on storage, and dynamic buffer management on both the host and GPUs. To address these issues, we propose an efficient GPU-based out-of-memory multi-hop query processing framework called GFlux. |
Seyeon Oh; Heeyong Yoon; Donghyoung Han; Min-soo Kim; |
| 69 | TempSched: A Temperature-Aware Storage Scheduler for Time Series Across Cloud-Edge-Device Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although numerous studies have examined hot and cold classification for relational data, these methods are not suitable for time series, which have strong timeliness and complex access patterns. Therefore, in this paper, we present TempSched, a temperature-aware storage scheduler for time series across CED, which can identify hot and cold time series and predict data temperature efficiently to perform storage scheduling in advance. |
Shuangshuang Cui; Hongzhi Wang; Xianglong Liu; Xiaoou Ding; |
| 70 | Federated Trajectory Similarity Learning with Privacy-Preserving Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To enable decentralized training and improved privacy, we propose a federated trajectory similarity learning framework that features privacy-preserving clustering based on a client-server architecture. |
Hao Miao; Ziqiao Liu; Yan Zhao; Kai Zheng; Yupu Zhang; Christian S. Jensen; |
| 71 | Path-Based Summary Explanations for Graph Recommenders Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose summary explanations, i.e., explanations that highlight why a user or a group of users receive a set of item recommendations and why an item, or a group of items, is recommended to a set of users as an effective means to provide insights into the collective behavior of the recommender. |
Danae Pla Karidi; Evaggelia Pitoura; |
| 72 | Data Poisoning Attacks to Local Differential Privacy Protocols for Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we bridge the gap by demonstrating that an attacker can inject fake users into LDP protocols for graphs and design data poisoning attacks to degrade the quality of graph metrics. |
Xi He; Kai Huang; Qingqing Ye; Haibo Hu; |
| 73 | A Bargaining-Based Approach for Feature Trading in Vertical Federated Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose a bargaining-based feature trading approach in VFL to facilitate economically efficient transactions. |
Yue Cui; Liuyi Yao; Zitao Li; Yaliang Li; Keqin Zhong; Bingyi Liu; Bolin Ding; Xiaofang Zhou; |
| 74 | An Efficient Memoization Engine for Concurrent Graph Query Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present KGraph, a novel graph processing memoization engine to efficiently handle CGQs on large graphs by performing memoization on graphs. |
Sen Gao; Shengliang Lu; Shixuan Sun; Yuchen Li; Bingsheng He; |
| 75 | TeMatch: A Fast Temporal Subgraph Matching Framework with Temporal-Aware Subgraph Matching Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce TeMatch, a high-performance framework designed to be compatible with any enumeration-based solution for temporal subgraph matching. |
Chengying Huan; Heng Zhang; Yongchao Liu; Likang Chen; Xuran Wang; Yongchun Jiang; Shaonan Ma; Yanjun Wu; |
| 76 | Efficient Maximum Fair Clique Search Over Large Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The pruning techniques derived from these upper bounds can significantly trim unnecessary search space during the branch-and-bound procedure. Adding to this, we present a heuristic algorithm with a linear time complexity, employing both a degree-based greedy strategy and a colored degree-based greedy strategy to identify a larger relative fair clique. |
Qi Zhang; Rong-Hua Li; Zifan Zheng; Hongchao Qin; Ye Yuan; Guoren Wang; |
| 77 | Exact and Efficient Similar Subtrajectory Search: Integrating Constraints and Simplification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, SimSub may return a subtrajectory with extremely limited length, e.g., a single point, which may not align with the expectations of real-world applications. To solve this issue, we propose a constrained SimSub (cSimSub) problem, where the length of the returned subtrajectory must be greater than or equal to a user-specified integer $C$. |
Liwei Deng; Fei Wang; Tianfu Wang; Yan Zhao; Yuyang Xia; Kai Zheng; |
| 78 | A Deep Dive Into Protocol Design: How to Improve IPFS Performance Without Sacrificing Decentralization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we conduct a series of experiments and analyses to identify the performance advantages and constraints associated with the IPFS decentralized protocol. |
Wenbin Zhu; Zhaoyan Shen; Mengying Zhao; Dongxiao Yu; Bingzhe Li; |
| 79 | Maximal Similar-Weight Biclique Enumeration for Large Bipartite Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the problem of maximal similar-weight biclique enumeration for large bipartite graphs. |
Jianye Yang; Lei Xing; Ziyi Ma; Xi Luo; Cuiyun Gao; Xuemin Lin; |
| 80 | Effective and General Distance Computation for Approximate Nearest Neighbor Search Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In particular, compared to ADSampling, our method achieves a speedup of 1.6 to 2.1 times on real-world datasets while providing higher accuracy. |
Mingyu Yang; Wentao Li; Jiabao Jin; Xiaoyao Zhong; Xiangyu Wang; Zhitao Shen; Wei Jia; Wei Wang; |
| 81 | Collaborative Imputation for Multivariate Time Series with Convergence Guarantee Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study the collaborative imputation with the convergence guarantee. |
Yu Sun; Xinyu Yang; Shaoxu Song; Ying Zhang; Xiaojie Yuan; |
| 82 | DELRec: Distilling Sequential Pattern to Enhance LLMs-Based Sequential Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To improve the performance of LLMs-based SR, we propose a novel framework, Distilling Sequential Pattern to Enhance LLMs-based Sequential Recommendation (DELRec), which aims to extract knowledge from conventional SR models and enable LLMs to easily comprehend and utilize the extracted knowledge for more effective SRs. |
Haoyi Zhang; Guohao Sun; Jinhu Lu; Guanfeng Liu; Xiu Susie Fang; |
| 83 | Chameleon: Adaptive and Scalable Stream Processing Over Sensor Sources Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: One way of achieving this is to adapt data generation to the rate of changes in the real world. In this systems paper, we propose Chameleon, a sensor-driven protocol for network-efficient data management that treats sensors as first-class components of a stream processing system. |
Dimitrios Giouroukis; Varun Pandey; Steffen Zeuch; Volker Markl; |
| 84 | Facility Location for Fair and Equitable Query Results Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we develop methods to choose items that are highly representative of their surrounding data items, while still satisfying a social equity constraint. |
Sara Cohen; Helen Sternbach; |
| 85 | High Throughput Shortest Distance Query Processing on Large Dynamic Road Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing solutions can hardly handle high throughput queries on large dynamic road networks due to either slow query efficiency or poor dynamic adaption. In this paper, we leverage graph partitioning and propose novel Partitioned Shortest Path (PSP) indexes to address this problem. |
Xinjie Zhou; Mengxuan Zhang; Lei Li; Xiaofang Zhou; |
| 86 | Pilos: Scalable Large-Subgraph Matching By Online Spectral Filtering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Pilos, a novel matching algorithm that substantially improves the filtering phase of a typical matching algorithm and computes up to 60% fewer candidates for verification. |
Konstantinos Skitsas; Davide Mottin; Panagiotis Karras; |
| 87 | Updating An Adaptive Spatial Index Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we introduce GLIDE, a novel method that intertwines the adaptive indexing and incremental updating of a spatial-object data set. |
Fatemeh Zardbani; Konstantinos Lampropoulos; Nikos Mamoulis; Panagiotis Karras; |
| 88 | Trail: A Knowledge Graph-Based Approach for Attributing Advanced Persistent Threats Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate their utility for cyberattack attribution. |
Isaiah J. King; Ramiro Ramirez; Benjamin Bowman; H. Howie Huang; |
| 89 | Experimental Analysis of Multi-Step Pipelines for Fair Classifications – More Than The Sum of Their Parts? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We take a broader perspective, considering approaches in the context of a multi-step pipeline. |
Nico Lassig; Melanie Herschel; |
| 90 | BSG4Bot: Efficient Bot Detection Based on Biased Heterogeneous Subgraphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they still face limitations including the expensive training on large underlying graphs, the performance degradation when the “similar neighborhood patterns” assumption preferred by GNNs is not satisfied, and the distinguishable features of bots in a highly adversarial context. Motivated by these limitations, this paper proposes a method named BSG4Bot with an intuition that GNNs training on Biased SubGraphs can improve both performance and time/space efficiency in bot detection. |
Hao Miao; Zida Liu; Jun Gao; |
| 91 | MassBFT: Fast and Scalable Geo-Distributed Byzantine Fault-Tolerant Consensus Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents MassBFT, a Byzantine fault-tolerant geo-consensus protocol that achieves high performance and scalability. |
Zeshun Peng; Yanfeng Zhang; Tinghao Feng; Weixing Zhou; Xiaohua Li; Ge Yu; |
| 92 | CuckooGraph: A Scalable and Space-Time Efficient Data Structure for Large-Scale Dynamic Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a novel data structure for large-scale dynamic graphs called CuckooGraph. |
Zhuochen Fan; Yalun Cai; Zirui Liu; Jiarui Guo; Xin Fan; Tong Yang; Bin Cui; |
| 93 | Towards Lightweight Time Series Forecasting: A Patch-Wise Transformer with Weak Data Enriching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To contend with the two limitations, we propose LiPFormer, a novel Lightweight Patch-wise Transformer with weak data enriching. |
Meng Wang; Jintao Yang; Bin Yang; Hui Li; Tongxin Gong; Bo Yang; Jiangtao Cui; |
| 94 | A Length Enhanced B+-Tree Based Index for Efficient Set Similarity Query Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Building upon LeB, we present an efficient algorithm, LeBQ, which leverages length filtering and symmetric difference allocation to determine the key bounds for a query, enabling the key bounds computation only once for each query $Q$ and avoiding costly similarity bounds computation in a node-wise manner. |
Lianyin Jia; Shiqi Luo; Jiaman Ding; Suprio Ray; Mengjuan Li; Xiuxing Li; |
| 95 | Know Your Account: Double Graph Inference-Based Account De-Anonymization on Ethereum Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel double graph-based Ethereum account de-anonymization inference method, dubbed DBG4ETH, which aims to capture the behavioral patterns of accounts comprehensively and has more robust analytical and judgment capabilities for current complex and continuously generated transaction behaviors. |
Shuyi Miao; Wangjie Qiu; Hongwei Zheng; Qinnan Zhang; Xiaofan Tu; Xunan Liu; Yang Liu; Jin Dong; Zhiming Zheng; |
| 96 | More Bang for Your Buck(et): Fast and Space-Efficient Hardware-Accelerated Coarse-Granular Indexing on GPUs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show that all three problems can be tackled by a single design change: Generalizing RX to become a coarse-granular index cgRX, which no longer indexes individual keys, but key buckets. |
Justus Henneberg; Felix Schuhknecht; Rosina Kharal; Trevor Brown; |
| 97 | Structure-Preference Enabled Graph Embedding Generation Under Differential Privacy Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Graph embedding generation techniques aim to learn low-dimensional vectors for each node in a graph and have recently gained increasing research attention. |
Sen Zhang; Qingqing Ye; Haibo Hu; |
| 98 | PGB: Benchmarking Differentially Private Synthetic Graph Generation Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose PGB (Private Graph Benchmark), a comprehensive benchmark designed to enable researchers to compare differentially private graph generation algorithms fairly. |
Shang Liu; Hao Du; Yang Cao; Bo Yan; Jinfei Liu; Masatoshi Yoshikawa; |
| 99 | Think Twice Before Imputation: Optimizing Data Imputation Order for Machine Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A critical challenge within this context is establishing the optimal order for imputing a set of incomplete samples. To address this, we propose an iterative approach that strategically determines the imputation order based on the potential impact on model performance. |
Jiaxuan Zhang; Haitao Yuan; Jianing Si; Nan Jiang; Shangguang Wang; |
| 100 | BOS: Bit-Packing with Outlier Separation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose to store both the upper and lower outliers separately, namely Bit-packing with Outlier Separation (BOS). |
Jinzhao Xiao; Zihan Guo; Shaoxu Song; |
| 101 | SAGE: A Framework of Precise Retrieval for RAG Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It is hard to strike an ideal balance. In this paper, we introduce a RAG framework, named SAGE, designed to overcome these limitations. |
Jintao Zhang; Guoliang Li; Jinyang Su; |
| 102 | Searching Society Over Large Heterogeneous Information Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel model called society. |
Xuan Liu; Lu Chen; Chengfei Liu; Rui Zhou; |
| 103 | Efficient Dynamic Attributed Graph Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To fill the research gap, we introduce VRDAG, a novel variational recurrent framework for efficient dynamic attributed graph generation. |
Fan Li; Xiaoyang Wang; Dawei Cheng; Cong Chen; Ying Zhang; Xuemin Lin; |
| 104 | Computing Shapley Values in Preference Queries Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper tackles the novel problem of computing Shapley values when multiple data owners collaborate to answer preference queries. |
Jiayao Zhang; Chirong Zhang; Jian Pei; Xuan Luo; Jianliang Xu; Jinfei Liu; |
| 105 | Interactive Search with Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This may harm the long-term benefit, leading to a large number of rounds in the overall process. To address this, we propose two algorithms based on reinforcement learning, aiming to effectively improve the overall interaction process. |
Weicheng Wang; Victor Junqiu Wei; Min Xie; Di Jiang; Lixin Fan; Haijun Yang; |
| 106 | Bargaining-Based Data Markets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Gearing toward raw data trading, we propose a three-stage bargaining model to formulate trading dynamics, which ascertains the data price agreed by both sellers and buyers. |
Yuran Bi; Jinfei Liu; Kui Ren; Yihang Wu; Yang Cao; |
| 107 | HIGGS: HIerarchy-Guided Graph Stream Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes a novel item-based, bottom-up hierarchical structure, called HIGGS. |
Xuan Zhao; Xike Xie; Christian S. Jensen; |
| 108 | Universal Set Similarity Search Via Multi-Task Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Set similarity search, as a foundational operation in data processing with diverse applications in different domains, has been extensively studied. |
Zhong Yang; Bolong Zheng; Guohui Li; Xi Zhao; Xiaofang Zhou; |
| 109 | Guiding Index Tuning Exploration with Potential Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a novel learning-based index advisor named GITEE, which increases tuning efficiency and effectiveness by intelligently guiding the exploration of the large search space of candidate indexes. |
Kecheng Luo; Ruiyang Ma; Peng Cai; Aoying Zhou; Zhiwei Ye; Dunbo Cai; Ling Qian; |
| 110 | Meta-Learning Based CTR Algorithm Selection and Hyperparameter Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, ordinary users fail to do so due to the lack of domain knowledge. In this paper, we remedy this deficiency by proposing AutoCTR, an efficient meta-learning based Combined Algorithm Selection and Hyperparameter Optimization (CASH) algorithm, to help non-expert users quickly find the best CTR model. |
Chunnan Wang; Junzhe Wang; Xiang Chen; Xintong Song; Tianyu Mu; Hongzhi Wang; |
| 111 | TabSketchFM: Sketch-Based Tabular Representation Learning for Data Discovery Over Data Lakes Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present TabSketchFM, a neural tabular model for data discovery over data lakes. |
Aamod Khatiwada; Harsha Kokel; Ibrahim Abdelaziz; Subhajit Chaudhury; Julian Dolby; Oktie Hassanzadeh; Zhenhan Huang; Tejaswini Pedapati; Horst Samulowitz; Kavitha Srinivas; |
| 112 | All-in-One: Heterogeneous Interaction Modeling for Cold-Start Rating Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the explicit relations constructed based on data between different entities may be unreliable and irrelevant, which limits the performance ceiling of a specific recommendation task. Motivated by this, in this paper, we propose a flexible framework dubbed heterogeneous interaction rating network (HIRE). |
Shuheng Fang; Kangfei Zhao; Yu Rong; Jeffrey Xu Yu; Zhixun Li; |
| 113 | CDA: Cost-Sensitive Data Acquisition for Incomplete Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We start with a scenario where data records are available on a row-wise basis, which proves to be an NP-hard problem. To solve this problem, we introduce an efficient row-wise greedy algorithm (RGreedy), which approaches an approximation ratio of 1. |
Kaiyu Li; Xiaohui Yu; Jian Pei; |
| 114 | Hyper: Hybrid Physical Design Advisor with Multi-agent Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the extension of these efforts to the present problem has proven challenging due to 1) the larger search space of hybrid PD selections, 2) the inadequate consideration of the complex interactions between heterogeneous PDs, and 3) the inaccurate evaluation made by the what-if optimizer. To address these issues, we propose a Hybrid physical design advisor (Hyper) with multi-agent reinforcement learning. |
Zhicheng Pan; Yuanjia Zhang; Chengcheng Yang; Ahmad Ghazal; Rong Zhang; Huiqi Hu; Xiaoju Wu; Yu Dong; Xuan Zhou; |
| 115 | Learned Compression of Nonlinear Time Series with Random Access Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Furthermore, all these methods lack awareness of certain special regularities of time series, whose trends over time can often be described by some linear and nonlinear functions. To address these issues, we introduce NeaTS, a randomly-accessible compression scheme that approximates the time series with a sequence of nonlinear functions of different kinds and shapes, carefully selected and placed by a partitioning algorithm to minimise the space. |
Andrea Guerra; Giorgio Vinciguerra; Antonio Boffa; Paolo Ferragina; |
| 116 | MISS: An Incomplete Tabular Data Representation System with Missing Mechanism Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel incomplete tabular data representation system, named MISS. |
Yangyang Wu; Shuwei Liang; Lei Qiang; Xiaoye Miao; Xinkui Zhao; Junlan Cai; Yunjun Gao; Jianwei Yin; |
| 117 | SIT: Selective Incremental Training for Dynamic Knowledge Graph Embedding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an efficient selective incremental training framework for DKGE, namely SIT. |
Zhifeng Jia; Hanmo Liu; Haoyang Li; Lei Chen; |
| 118 | DyFMVP: Say Goodbye to Staleness! Fresh Memory Vigorous Preserver for Continuous-Time Dynamic Graph Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite the effectiveness, memory-based methods suffer from two fatal fundamental flaws (i.e. long-standing staleness and memory offline update time issues), which are usually overlooked by constraints of rigid assumptions in recent works. To address the above flaws, in this work, we propose a novel Dynamic Fresh Memory Vigorous Preserver named DyFMVP, a well-designed state-traceable dual memory architecture for efficient anti-staling representation learning. |
Jianye Pang; Xinjie Zhu; Xiaofei Xiong; |
| 119 | MEST: An Efficient Authenticated Secondary Index in Blockchain Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose the first authenticated secondary index MEST for verifiable non-primary key queries. |
Jinping Jia; Yichen Gao; Yifei Zhen; Zhao Zhang; Qian Kun; Cheqing Jin; |
| 120 | Are We Wasting Time? A Fast, Accurate Performance Evaluation Framework for Knowledge Graph Link Predictors Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, we show that this approach has serious limitations since the ranking metrics produced do not properly reflect true outcomes. In this paper, we present a thorough analysis of these effects along with the following findings. |
Filip Cornell; Yifei Jin; Jussi Karlgren; Sarunas Girdzijauskas; |
| 121 | Towards Dynamic Boolean Range Query Over Hybrid-Storage Blockchains: A Secure and Reliably Verifiable Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To support dynamic queries with forward security, we propose an adaptive version-control update scheme to integrate into VKF. |
Ningning Cui; Dong Wang; Jianxin Li; Huaijie Zhu; Xiaochun Yang; Jianliang Xu; |
| 122 | TwCache: Thread-Wise Cache Management with High Concurrency Performance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This contention arises when multiple threads attempt to update the LRU list simultaneously. Motivated by this issue, we propose a new cache management scheme called twCache, designed to deliver high performance in concurrent environments. |
Yigui Yuan; Peiquan Jin; Xiaoliang Wang; |
| 123 | FreewayML: An Adaptive and Stable Streaming Learning Framework for Dynamic Data Streams Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a shift graph based on the distances between data distributions and define three distinct data shift patterns. |
Zheng Qin; Zheheng Liang; Lijie Xu; Wentao Wu; Mingchao Wu; Wuqiang Shen; Wei Wang; |
| 124 | DataVisT5: A Pre-Trained Language Model for Jointly Understanding Text and Data Visualization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce DataVisT5, a novel PLM tailored for DV that enhances the T5 architecture through a hybrid objective pre-training and multi-task fine-tuning strategy, integrating text and DV datasets to effectively interpret cross-modal semantics. |
Zhuoyue Wan; Yuanfeng Song; Shuaimin Li; Chen Jason Zhang; Raymond Chi-Wing Wong; |
| 125 | Training-Free Heterogeneous Graph Condensation Via Data Selection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The second is low efficiency: HGCond follows the existing GC methods designed for homogeneous graphs and leverages the sophisticated optimization paradigm, resulting in a time-consuming condensing procedure. In light of these challenges, we present the first Training Free Heterogeneous Graph Condensation method, termed FreeHGC, facilitating both efficient and high-quality generation of heterogeneous condensed graphs. |
Yuxuan Liang; Wentao Zhang; Xinyi Gao; Ling Yang; Chong Chen; Hongzhi Yin; Yunhai Tong; Bin Cui; |
| 126 | LORE: Learning-Based Resource Recommendation for Big Data Queries Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Mapping queries to their resource consumption is a complex task. To tackle this challenge, we propose a novel learning-based query resource recommendation method called LORE. |
Yan Li; Liwei Wang; Bolong Zheng; Zhiyong Peng; |
| 127 | Extendible RDMA-Based Remote Memory KV Store with Dynamic Perfect Hashing Index Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a novel dynamic perfect hashing index without sacrificing associativity, and uses it to devise an RDMA-based remote memory KV store called CuckooDuo. |
Zirui Liu; Xian Niu; Wei Zhou; Yisen Hong; Zhouran Shi; Tong Yang; Yuchao Zhang; Yuhan Wu; Yikai Zhao; Zhuochen Fan; Bin Cui; |
| 128 | Towards Scalable and Efficient Graph Structure Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our review of the existing GSL literature, combined with empirical studies, reveals two primary limitations: low scalability and low efficiency. To mitigate these limitations, we introduce Random Walk-based Graph Structure Learning (RWGSL), a new GSL method that utilizes random walk strategies and operates in a parameter-free manner. |
Siqi Shen; Wentao Zhang; Chengshuo Du; Chong Chen; Fangcheng Fu; Yingxia Shao; Bin Cui; |
| 129 | Effective Task Assignment in Mobility Prediction-Aware Spatial Crowdsourcing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate a problem we term Task Assignment in Mobility Prediction-aware Spatial Crowdsourcing (TAMP). |
Huiling Li; Yafei Li; Wei Chen; Shuo He; Mingliang Xu; Jianliang Xu; |
| 130 | IsGCL: Informative Sample-Aware Progressive Graph Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: 2) For informative negatives mining, most existing studies either overly emphasize hard negatives despite their potential unreliability, or rely on precise clustering pseudo-labels, which are error-prone especially in the early training stage. To solve the above challenges, we propose an informative sample-aware progressive graph contrastive learning framework, which filters both uninformative positives and negatives. |
Juxiang Zeng; Pinghui Wang; Linbo Ma; Jing Tao; Xiaohong Guan; |
| 131 | Columnar Formatted Inverted Index for Highly-Paralleled, Vectorized Query Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Driven by the findings, we propose to reconcile the in-memory index as columnar structures. |
Weichen Zhao; Minghao Zhao; Huiqi Hu; Weining Qian; |
| 132 | Rottnest: Indexing Data Lakes for Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Rottnest, a general system that builds additional lightweight indices on top of data lakes. |
Ziheng Wang; Sasha Krassovsky; Conor Kennedy; Alex Aiken; Weston Pace; Rain Jiang; Huayi Zhang; Chenyu Jiang; Wei Xu; |
| 133 | TDT: Tensor Based Directed Truss Decomposition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a truss decomposition framework based on tensors (TDT), which can leverage the parallelism of heterogeneous hardware backends to speed up the computation and seamlessly integrate with downstream graph ML tasks. |
Guojing Li; Yuanyuan Zhu; Junchao Ma; Ming Zhong; Tieyun Qian; Jeffrey Xu Yu; |
| 134 | Scalable Tabular Hierarchical Metadata Classification in Heterogeneous Structured Large-Scale Datasets Using Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Medical, security, data science research literature, Web tables, contain thousands of such complex tables, but often lack or incorrectly label their complex metadata. In this work, we describe an unsupervised, scalable, contrastive-learning approach for classification of multi-layer, hierarchical metadata in such tables. |
Bhimesh Kandibedala; Gyanendra Shrestha; Anna Pyayt; Todor Ivanov; Michael Gubanov; |
| 135 | UltraWiki: Ultra-Fine-Grained Entity Set Expansion with Negative Seed Entities Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Entity Set Expansion (ESE) aims to identify new entities belonging to the same semantic class as the given set of seed entities. Traditional methods solely relied on positive seed … |
Yangning Li; Qingsong Lv; Tianyu Yu; Yinghui Li; Xuming Hu; Wenhao Jiang; Hai-Tao Zheng; Hui Wang; |
| 136 | Understanding and Estimating Error Propagation in Neural Networks for Scientific Data Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a comprehensive framework for optimizing neural network inference in scientific computing by combining data reduction and model quantization while maintaining error-controlled outcomes. |
Weiming He; Qi Chen; Qian Gong; Jing Li; Qing Liu; Norbert Podhorszki; Scott Klasky; Kisung Jung; Cristian Lacey; Jackie Chen; Hongjian Zhu; |
| 137 | Query Weak Equivalence and Its Verification in Analytical Databases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, for posed queries, we extract their filter condition expressions, which are then transformed into symbolic representations, namely first-order logic formulae. In terms of their partial order, i.e. containment relationship, we introduce Query Lattice, a novel structure that is constructed as a lattice which is partitioned into equivalence classes that are convex to answer queries if we determine they belong to the classes. |
Jinguo You; Wanting Fu; Yuxuan Wang; Peilei He; Kaiqi Liu; Quanqing Xu; |
| 138 | OSTOR: Online Scheduling Framework for Trading Continuous Queries Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present OSTOR, the first online scheduling framework for trading continuous queries. |
Jin Cheng; Ningning Ding; John C.S. Lui; Jianwei Huang; |
| 139 | BL-Tree: The Best of Both Worlds By Combining B+-Tree on Top and LSM-Tree on Bottom Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the problem, we propose BL-Tree by replacing the shattered Level-0 in LSM-Tree with a B+-Tree in byte-addressable Persistent Memory (PM). |
Suzhen Wu; Zuocheng Wang; Shengzhe Wang; Jiahong Chen; Chunfeng Du; Ke Zhou; Jie Zhang; Bo Mao; |
| 140 | SIGMA: An Efficient Heterophilous Graph Neural Network with Fast Global Aggregation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose SIGMA, an efficient global heterophilous GNN aggregation integrating the structural similarity measurement SimRank. |
Haoyu Liu; Ningyi Liao; Siqiang Luo; |
| 141 | LOVO: Efficient Complex Object Query in Large-Scale Video Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present LOVO, a novel system designed to efficiently handle compLex Object queries in large-scale VideO datasets. |
Yuxin Liu; Yuezhang Peng; Hefeng Zhou; Hongze Liu; Xinyu Lu; Jiong Lou; Chentao Wu; Wei Zhao; Jie Li; |
| 142 | AimTS: Augmented Series and Image Contrastive Learning for Time Series Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a two-level prototype-based contrastive learning method to effectively utilize various augmentations in multi-source pre-training, which learns representations for TSC that can be generalized to different domains. |
Yuxuan Chen; Shanshan Huang; Yunyao Cheng; Peng Chen; Zhongwen Rao; Yang Shu; Bin Yang; Lujia Pan; Chenjuan Guo; |
| 143 | WAF: An Efficient WebAssembly-Based Execution Environment for User-Defined Functions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This process involves data copying and data layout adjustments, which can significantly impact performance. To address these challenges, we present WAF, a WASM-based UDF execution environment. |
Zhuo Huang; Hao Fan; Junhui Peng; Qi Wu; Song Wu; Chen Yu; Hai Jin; Qiming Liu; Wei Yang; Shuo Yu; |
| 144 | pFedAFM: Adaptive Feature Mixture for Data-Level Personalization in Heterogeneous Federated Learning on Mobile Edge Devices Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, different data samples in one client may also have different features, which are often ignored, resulting in constrained model performances. To bridge this gap, we propose a novel model-heterogeneous personalized Federated learning approach with Adaptive Feature Mixture (pFedAFM) to achieve data-level personalization while maintaining efficient communication and computation. |
Liping Yi; Han Yu; Gang Wang; Xiaoguang Liu; Xiaoxiao Li; |
| 145 | LeSAX Index: A Learned SAX Representation Index for Time Series Similarity Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a learned index approach for TSSS. |
Guozhong Li; Byron Choi; Rundong Zuo; Sourav S Bhowmick; Jianliang Xu; |
| 146 | A-DARTS: Stable Model Selection for Data Repair in Time Series Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a new configuration-free system, A-DARTS (for Automated DAta Repair in Time Series), to automatically select the best imputation technique for a given faulty time series. |
Mourad Khayati; Guillaume Chacun; Zakhar Tymchenko; Philippe Cudré-Mauroux; |
| 147 | Hamava: Fault-tolerant Reconfigurable Geo-Replication on Heterogeneous Clusters Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents heterogeneous and reconfigurable clustered replication for the general environment with arbitrary failures. |
Tejas Mane; Xiao Li; Mohammad Sadoghi; Mohsen Lesani; |
| 148 | Adaptive Data and Task Joint Scheduling for Multi-Task Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These differences are the source of intricate task gradient relationships and could further lead to varying degrees of impact from conflicts on tasks. To tackle these challenges, we propose DTJS, a novel adaptive Data and Task Joint Scheduling approach for MTL, which uniquely considers the influence of data within each task and the distinct task perception of gradient conflicts from an innovative scheduling perspective. |
Zeyu Liu; Heyan Chai; Chaoyang Li; Lingzhi Wang; Qing Liao; |
| 149 | Adaptive Local Clustering Over Attributed Graphs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Given a graph $\mathcal{G}$ and a seed node $v_{s}$, the objective of local graph clustering (LGC) is to identify a subgraph $\mathcal{C}_{s} \in \mathcal{G}$ (a.k.a. local … |
Haoran Zheng; Renchi Yang; Jianliang Xu; |
| 150 | Intervention-Driven Correlation Reduction: A Data Generation Approach for Achieving Counterfactually Fair Predictors Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, in the context of counterfactual fairness, existing methods for generating fair data are often limited in their applicability and lead to significant performance losses in downstream predictors. To address these issues, this paper proposes a new algorithm for generating counterfactually fair data, allowing predictors trained on this generated data to adhere to counterfactual fairness. |
Dehua Zhou; Bowei Wu; Ke Wang; Qifen Yang; Yuhui Deng; Siu-Ming Yiu; |
| 151 | Random Sampling Over Spatial Range Joins Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This is because we must obtain random join samples without running spatial range joins. We address this challenging problem for the first time and aim at designing a time- and space-efficient algorithm. |
Daichi Amagata; |
| 152 | Triangle Counting Over Signed Graphs with Differential Privacy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate a new problem of developing triangle counting algorithms for signed graphs that adhere to centralized differential privacy and local differential privacy, respectively. |
Zening Li; Rong-Hua Li; Fusheng Jin; |
| 153 | DIFFODE: Neural ODE with Differentiable Hidden State for Irregular Time Series Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Neural Ordinary Differential Equations (NODEs) assume a continuous latent dynamic and provide an elegant framework for irregular time series analysis, yet they suffer from limitations like fragmented latent processes and the inability to fully exploit interdependencies among observations. To address these challenges, we propose a novel Differentiable hidden state enhanced neural ODE framework, termed DIFFODE, designed to effectively model irregular time series. |
Yudong Zhang; Xu Wang; Xuan Yu; Zhengyang Zhou; Xing Xu; Lei Bai; Yang Wang; |
| 154 | Towards Learning on Vertically Partitioned Data with Distributed Differential Privacy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Can we achieve privacy-utility trade-offs for VFL with DP comparable to the centralized setting, without trusting any party? In this paper, we make a significant step towards providing a positive answer to this question. |
Ergute Bao; Fei Wei; Yin Yang; Xiaokui Xiao; Tianyu Pang; Chao Du; |
| 155 | Efficient Indexing for Label-Constrained Cohesive Subgraph Queries Over Large Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an index-based method to address the problem of querying k-cores with label constraints in edge-labeled graphs. |
Xin Deng; Peng Peng; Chuanyu Liu; Xianyan Xie; Hui Zhou; Zheng Qin; |
| 156 | Featpilot: Automatic Feature Augmentation on Tabular Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Featpilot, a novel framework that explores and integrates high-quality features in tabular data for ML models. |
Jiaming Liang; Chuan Lei; Xiao Qin; Jiani Zhang; Asterios Katsifodimos; Christos Faloutsos; Huzefa Rangwala; |
| 157 | A-Tune-Online: Efficient and QoS-Aware Online Configuration Tuning for Dynamic Workloads Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose A-Tune-Online, an online configuration tuning system that tackles dynamic workloads, delivering superior tuning efficiency, and QoS guarantee simultaneously to a wide range of online scenarios. |
Yu Shen; Beicheng Xu; Yupeng Lu; Donghui Chen; Huaijun Jiang; Zhipeng Xie; Senbo Fu; Nan Zhang; Yuxin Ren; Ning Jia; Xinwei Hu; Bin Cui; |
| 158 | LIFTus: An Adaptive Multi-Aspect Column Representation Learning for Table Union Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, a significant amount of non-linguistic data, notably represented by domain-specific strings and numerical data in the data lake, are still under-explored in the existing methods. To address this issue, we propose LIFTus, an adaptive multi-aspect column representation for table unionable search, where aspect refers to a concept more flexible than data types, so that a single column can exhibit multiple aspects simultaneously. |
Ermu Qiu; Jun Gao; Yaofeng Tu; Jingru Yang; |
| 159 | pFSSL-D: Generalization Meets Personalization in Dual-Phase Federated Semi-Supervised Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, current FSSL methods are often hindered by an over-reliance on labeled data for initialization, high communication overhead, and suboptimal global model performance in heterogeneous data settings. To overcome these limitations, we propose pFSSL-D, a novel Dual-Phase Generalization and Personalization Pipeline designed to generate several models for unlabeled clients. |
Yuting Li; Wenhua Wang; Tian Wang; |
| 160 | SeSeMI: Secure Serverless Model Inference on Sensitive Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our goal is to design a serverless model inference system that protects models and user request data from untrusted cloud providers. |
Guoyu Hu; Yuncheng Wu; Gang Chen; Tien Tuan Anh Dinh; Beng Chin Ooi; |
| 161 | A Translation-Based Heterogeneous Graph Neural Network for Multiple Knowledge Graphs Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, few studies consider combining these approaches to model translation semantics of various orders. To fill this gap, we propose KG2HIN, a novel KG encoder, which innovatively views head entities, relations, and tail entities as three types of nodes, thereby transforming KGs into HINs (heterogeneous information networks). |
Yaming Yang; Zhuofeng Luo; Zhe Wang; Weigang Lu; Yiheng Lu; Ziyu Guan; Wei Zhao; Yuanhai Lv; |
| 162 | Enhancing Large-Scale Entity Alignment with Critical Structure and High-Quality Context Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: 3) How to address scenarios without alignment seeds? To tackle these challenges, we propose a novel method called ELsEA. |
Qian Zhou; Wei Chen; Li Zhang; Pengpeng Zhao; Jiajie Xu; Lei Zhao; |
| 163 | FedRoad: Secure and Efficient Road Network Queries Over Traffic Data Federation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing studies primarily focused on federated queries over structural data, which does not apply to non-structural road network queries prevalent in daily travel scenarios. To tackle this limitation, this paper proposes FedRoad, the first traffic data federation with secure and efficient road network shortest-path queries over it. |
Shuai Huang; Guoliang Li; Wei Zhou; |
| 164 | Towards Automatic and Efficient Prediction Query Processing in Analytical Database Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce PEPS, an end-to-end analytical database for automatic and efficient prediction query processing. |
Yuchen Peng; Zhongle Xie; Ke Chen; Gang Chen; Lidan Shou; |
| 165 | Backdoor Graph Condensation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce an effective backdoor attack against graph condensation, termed BGC. |
Jiahao Wu; Ning Lu; Zeyu Dai; Kun Wang; Wenqi Fan; Shengcai Liu; Qing Li; Ke Tang; |
| 166 | Tailoring The Shapley Value for In-Context Example Selection Towards Data Wrangling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A problem yet to be explored is how to select the examples, to maximize task effectiveness given constraints on the size of the examples. To fill this gap, we introduce the constrained Shapley value (CSV), a tailored variant of the Shapley value with a constraint on the LLM prompt size, to guide example selection. |
Zheng Liang; Hongzhi Wang; Xiaoou Ding; Zhiyu Liang; Chen Liang; Yafeng Tang; Jianzhong Qi; |
| 167 | Imputing Sparse and Noisy Labels for GNNs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Label Boosting Rules (LBRs), which extend graded bisimilarity and embed ML labeling models as predicates. |
Wenfei Fan; Kehan Pang; Chao Tian; |
| 168 | CAM: Asynchronous GPU-Initiated, CPU-Managed SSD Management for Batching Storage Access Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose CAM, the first asynchronous GPU-initiated, CPU-managed SSD management for batching storage access. |
Ziyu Song; Jie Zhang; Jie Sun; Mo Sun; Zihan Yang; Zheng Zhang; Xuzheng Chen; Fei Wu; Huajin Tang; Zeke Wang; |
| 169 | PFCA: Efficient Path Filtering with Causal Analysis for Healthcare Risk Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Additionally, the joint relationships between patient features were not emphasized, which are highlighted in clinical practice. In this paper, we propose an efficient Path Filtering with Causal Analysis (PFCA) approach for enhanced healthcare risk prediction to address these challenges. |
Hao Wang; Jiyun Shi; Yuhao Chen; Haochen Xu; Chi Zhang; Zhaojing Luo; Meihui Zhang; |
| 170 | BFES: Towards Optimal Bayesian Frequency Estimation Sketches in Data-Streams Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper studies the problem of sketch-based frequency estimation from a Bayesian statistics point of view which captures uncertainties regarding the frequencies of items in a more flexible and quantitative way compared to the state of the art. |
Francesco Da Dalt; Adrian Perrig; |
| 171 | Distributed Evaluation of Graph Queries Using Recursive Relational Algebra Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a method and its implementation Dist-μ-RA for the optimized distributed evaluation of recursive relational algebraic terms. |
Sarah Chlyah; Pierre Genevès; Nabil Layaïda; |
| 172 | Efficient Frequency-Aware K-Core Query on Temporal Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the problem, we propose 1) a minimum slope algorithm for computing the frequency in linear time, 2) a space-efficient index that stores the distinct “core frequency” of vertices for addressing arbitrary queries, 3) a propagation algorithm that collects core frequencies by message passing for index construction, and 4) efficient algorithms for retrieving a specific or all skyline results from the index respectively. |
Zhongfan Du; Ming Zhong; Yuanyuan Zhu; Tieyun Qian; Mengchi Liu; Jeffrey Xu Yu; |
| 173 | Towards Accurate Distance Estimation for Distribution-Aware C-ANN Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we reformulate the c-ANN query from the perspective of data distribution. |
Liwei Deng; Penghao Chen; Ximu Zeng; Yuchen Fang; Jin Chen; Yan Zhao; |
| 174 | HAIDES: Adaptive Approximation of Inference Queries Over Unstructured Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present HAIDES, an index-based, domain-agnostic framework for approximating inference on unstructured data. |
Christos C. Papadopoulos; Alkis Simitsis; Torben Bach Pedersen; |
| 175 | Loom: A Deterministic Execution Framework Towards Nested Contract Transactions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Loom, a deterministic execution framework that enhances the efficiency of nested contract transactions. |
Huan Zhang; Xiaodong Qi; Haibo Tang; Zhao Zhang; Cheqing Jin; Aoying Zhou; |
| 176 | Querying Templatized Document Collections with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our key insight is that documents in a collection often follow similar templates that impart a common semantic structure. |
Yiming Lin; Madelon Hulsebos; Ruiying Ma; Shreya Shankar; Sepanta Zeighami; Aditya G. Parameswaran; Eugene Wu; |
| 177 | MaskSearch: Querying Image Masks at Scale Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we formalize the problem and propose MaskSearch, a system that focuses on accelerating queries over databases of image masks while guaranteeing the query result accuracy. |
Dong He; Jieyu Zhang; Maureen Daum; Alexander Ratner; Magdalena Balazinska; |
| 178 | GRACEFUL: A Learned Cost Estimator for UDFs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce GRACEFUL, a novel learned cost model to make accurate cost predictions of query plans with UDFs enabling optimization decisions for UDFs in DBMS. |
Johannes Wehrstein; Tiemo Bang; Roman Heinrich; Carsten Binnig; |
| 179 | Fast and Exact Similarity Search in Less Than A Blink of An Eye Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present the SymbOlic Fourier Approximation index (SOFA), which implements fast, exact similarity queries. |
Patrick Schäfer; Jakob Brand; Ulf Leser; Botao Peng; Themis Palpanas; |
| 180 | Indexing Labeled Property Multidigraphs in Entropy Space, with Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on the static data case and propose a novel self-index, called CGraphIndex, to compress and index labeled property multidigraphs that for the first time achieves the high-order entropy space for multidigraph properties (the dominant term in practice) and the 1st-order graph entropy for multidigraph structures. |
Hongwei Huo; Yongze Yu; Zongtao He; Jeffrey Scott Vitter; |
| 181 | On Temporal-Constraint Subgraph Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper addresses the challenge of identifying subgraphs that not only structurally align with a given query graph but also satisfy specific temporal-constraints on the edges. We introduce three novel algorithms to tackle this issue: the TCSM-V2V algorithm, which uses a vertex-to-vertex expansion strategy and effectively prunes non-matching vertices by integrating both query and temporal-constraints into a temporal-constraint query graph; the TCSM-E2E algorithm, which employs an edge-to-edge expansion strategy, significantly reducing matching time by minimizing vertex permutation processes; and the TCSM-EVE algorithm, which combines edge-vertex-edge expansion to eliminate duplicate matches by avoiding both vertex and edge permutations. |
Xiaoyu Leng; Guang Zeng; Hongchao Qin; Longlong Lin; Rong-Hua Li; |
| 182 | GCON: Differentially Private Graph Convolutional Network Via Objective Perturbation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing solutions either perturb the graph topology or inject randomness into the graph convolution operations, or overestimate the amount of noise required, resulting in severe distortions of the network’s message aggregation and, thus, poor model utility. Motivated by this, we propose GCON, a novel and effective solution for training GCNs with edge differential privacy. |
Jianxin Wei; Yizheng Zhu; Xiaokui Xiao; Ergute Bao; Yin Yang; Kuntai Cai; Beng Chin Ooi; |
| 183 | Simple Yet Effective Node Property Prediction on Edge Streams Under Distribution Shifts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose SPLASH, a simple yet powerful method for predicting node properties on edge streams under distribution shifts. |
Jongha Lee; Taehyung Kwon; Heechan Moon; Kijung Shin; |
| 184 | CloudyBench: A Testbed for A Comprehensive Evaluation of Cloud-Native Databases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new testbed for cloud-native databases, named CloudyBench. |
Chao Zhang; Guoliang Li; Leyao Liu; Tao Lv; Ju Fan; |
| 185 | MuSha: Subgraph Matching By Multilevel Sharing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Real-world patterns used in graph analysis are often symmetric and contain isomorphic substructures, but existing SM algorithms fail to explore such properties. To fill this gap, we propose MuSha, a multi-objective optimization framework for SM, leveraging multilevel sharing of isomorphic substructure results to speed up SM and symmetry breaking to avoid directly computing symmetric results. |
Hongtai Cao; Qihao Wang; Xiaodong Li; Mohammad Matin Najafi; Kevin Chen-Chuan Chang; Reynold Cheng; |
| 186 | $t$DCDiscover: Mining Threshold Denial Constraints from Time Series Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Denial constraints are vital in data quality management, but traditional mining algorithms struggle with time series data. To address this, we introduce a novel data quality rule, threshold Denial Constraints ($t$DCs), which enables predicate scaling in numerical contexts. |
Xiaoou Ding; Muyun Zhou; Yida Liu; Zekai Qian; Chen Wang; Hongzhi Wang; Jianmin Wang; |
| 187 | FedEcover: Fast and Stable Converging Model-Heterogeneous Federated Learning with Efficient-Coverage Submodel Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce FedEcover, a model-heterogeneous framework to learn a fast and stable converging global model in challenging scenarios with dual heterogeneity of data and client capacity. |
Juntao Liang; Lan Zhang; Xiangmou Qu; Jun Wang; |
| 188 | Privacy-Preserving Triangle Counting in Directed Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, we propose a new centralized differentially private algorithm that adds Laplacian noise to the exact numbers by analyzing global sensitivity. |
Ziyao Wei; Qing Liu; Zhikun Zhang; Shouling Ji; Yunjun Gao; |
| 189 | The Most Influenced Community Search on Social Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we address a novel problem in social network analysis: the Most Influenced Community Search (MICS). |
Xueqin Chang; Qing Liu; Yunjun Gao; Baihua Zheng; Yi Cai; Qing Li; |
| 190 | Orthrus: Accelerating Multi-BFT Consensus Through Concurrent Partial Ordering of Transactions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Orthrus, a Multi-BFT protocol that accelerates transaction confirmation through partial ordering while reserving global ordering for transactions requiring stricter sequencing. |
Hanzheng Lyu; Shaokang Xie; Jianyu Niu; Ivan Beschastnikh; Yinqian Zhang; Mohammad Sadoghi; Chen Feng; |
| 191 | Efficient $k$-Truss Breaking and Minimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the $k$-truss breaking problem (TBP) that aims to find the smallest set of edges whose removal makes the graph free of $k$-truss. |
Ruicheng Zhu; Xintong Wang; Kai Wang; Fan Zhang; Zhengping Qian; Long Yuan; |
| 192 | Towards Online Spatio-Temporal Prediction: A Knowledge Distillation Driven Continual Learning Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, the single-pass nature of online data, combined with the constrained resources of online environments, highlights the urgent need for more efficient and lightweight solutions for online spatio-temporal prediction. To address these challenges, we propose Storm, a knowledge distillation driven continual learning framework. |
Tinghui Luo; Ziquan Fang; Kaixuan Duan; Lu Chen; Panpan Feng; Mingfan Lu; |
| 193 | Defending Against Attribute Inference Attacks in Post-Training of Recommendation Systems Via Unlearning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing attribute protection methods are primarily applied during training, suffering from significant limitations, such as architectural inflexibility, dependence on interaction data, and potential catastrophic degradation in recommendation performance. To overcome these challenges, we propose AttrCloak, an efficient and effective post-training attribute unlearning (AU) framework that removes sensitive information from user embeddings without altering RS training architectures. |
Wenhan Wu; Yili Gong; Jiawei Jiang; Chuang Hu; Xiaobo Zhou; Dazhao Cheng; |
| 194 | Truss Decomposition Under Edge Local Differential Privacy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To obtain more accurate estimates, we propose the Local algorithm that leverages the local information during the truss decomposition process. |
Yuting Zhang; Wei Ni; Kai Wang; Yizhang He; Conggai Li; |
| 195 | Boosting with Fewer Tokens: Multi-Query Optimization for LLMs Using Node Text and Neighbor Cues Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Besides, this paper offers a fresh methodology for optimizing LLM processing of graph tasks, demonstrating great potential. |
Yujie Fang; Xin Li; Yuangang Pan; Xin Huang; Ivor W. Tsang; |
| 196 | Fast Private Retrieval on Key-Value Store with Multiple Values Per Key Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We are the first to extend the setting that keys in the store may appear with different values multiple times. To solve this problem, we propose FEDPIR, a fast single-server keyword PIR protocol that supports querying a large-scale key-value store with multiple values per key. |
Fangming Dong; Pinghui Wang; Yuance Wang; Chen Zhang; Lizhen Cui; |
| 197 | Sustainability-Oriented Task Recommendation in Spatial Crowdsourcing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we consider a novel problem of sustainable task recommendation in SC, which aims to minimize the environmental footprint (i.e., pollution) while maintaining acceptable levels of task completion, worker satisfaction, and overall task recommendation efficiency. |
Jinwen Chen; Hao Miao; Dazhuo Qiu; Jiannan Guo; Yawen Li; Yan Zhao; |
| 198 | FeVisQA: Free-Form Question Answering Over Data Visualizations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new task named FeVisQA, referring to Free-form Question Answering over data Visualizations. |
Yuanfeng Song; Jinwei Lu; Yuanwei Song; Caleb Chen Cao; Raymond Chi-Wing Wong; Haodi Zhang; |
| 199 | Interpretable Video Based Stress Detection with Self-Refine Chain Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, most models function as black boxes, lacking transparency in their decision-making process, which hinders their trustworthiness. To address this, we propose an interpretable video-based stress detection model that incorporates Chain-of-Thought (CoT) reasoning of large foundation models. |
Yi Dai; Yang Ding; Lei Cao; Kaisheng Zeng; Junrui Tian; Zexi Lin; Ling Feng; |
| 200 | Efficient Execution of SPARQL Queries with OPTIONAL and UNION Expressions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose techniques for optimizing SPARQL-UO queries using BGP execution as a building block, based on a novel BGP-based Evaluation (BE)-Tree representation of query plans. |
Yue Pang; Lei Zou; M. Tamer Özsu; Jiaqi Chen; |
| 201 | Federated Data Analytics with Differentially Private Density Estimation Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing approaches, such as output perturbation that adds noise to query results based on differential privacy, often suffer from degraded accuracy due to cumulative privacy budget consumption. In this paper, we introduce ADAPT, a novel framework that addresses this problem by training a privacy-preserving density model over decentralized data. |
Jiayi Wang; Lei Cao; Chengliang Chai; Guoliang Li; |
| 202 | Indexing Strings with Utilities Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: (3) We propose a novel space-efficient algorithm for estimating the set of the top-$K$ frequent substrings of $S$, thus improving the construction space of the data structure for USI. |
Giulia Bernardini; Huiping Chen; Alessio Conte; Roberto Grossi; Veronica Guerrini; Grigorios Loukides; Nadia Pisanti; Solon P. Pissis; |
| 203 | FedSDP: Federated Self-Derived Prototypes for Personalized Federated Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A personalized layer decouples the body and head, strengthening the generalization and personalization, respectively. Based on this architecture, this study proposes a new PFL framework, Federated Self-Derived Prototypes (FedSDP), to dynamically balance personalization and generalization. |
Jihoon Moon; Ling Liu; Hyuk-Yoon Kwon; |
| 204 | Hounding Data Diversity: Towards Participant Selection in Vertical Federated Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study the participant selection problem (PSP) for VFL, which chooses a given number of participants to conduct training while maximizing model accuracy. |
Xiaokai Zhou; Xiao Yan; Fangcheng Fu; Xinyan Li; Hao Huang; Quanqing Xu; Chuanhui Yang; Bo Du; Tieyun Qian; Jiawei Jiang; |
| 205 | Autumn: A Scalable Read Optimized LSM-Tree Based Key-Value Stores with Fast Point and Range Reads Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present Autumn, a scalable and read-optimized LSM-tree based key-value store with near-optimal worst-case point and range read costs. |
Fuheng Zhao; Zach Miller; Leron Reznikov; Divyakant Agrawal; Amr El Abbadi; |
| 206 | Optimizing Multi-Center Collaboration for Task Assignment in Spatial Crowdsourcing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In particular, we propose an Iterative Multi-center Task Assignment and Optimization (IMTAO) framework. |
Ximu Zeng; Jianxing Lin; Liwei Deng; Yuchen Fang; Yan Zhao; Kai Zheng; |
| 207 | KnowTrans: Boosting Transferability of Data Preparation LLMs Via Knowledge Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, transferring DP-LLMs to novel datasets and tasks typically requires a substantial amount of labeled data, which is impractical in many real-world scenarios. To address this, we propose a knowledge augmentation framework for data preparation, dubbed KNOWTRANS. |
Yuhang Ge; Fengyu Li; Yuren Mao; Yanbo Yang; Congcong Ge; Zhaoqiang Chen; Jiang Long; Yunjun Gao; |
| 208 | No Rule Is Forever: Datalog Reasoning with Rule Amendments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces Zodiac, a method for reasoning under rule amendments, and ZodiacEdge, a system implementing this method. |
Weiqin Xu; Riccardo Tommasini; Olivier Curé; |
| 209 | White-Box Micro-Adaptive Query Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Additionally, statically optimized plans fail to adapt when data characteristics vary within a table. To address these problems, we propose a hazard-adaptive approach to query execution. |
Jack Pearce; Hubert Mohr-Daurat; Holger Pirk; |
| 210 | TrajEdge: An Efficient and Lightweight Trajectory Data Analysis Framework in Edge Environments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These include resource constraints, dynamic network conditions, and inefficient query handling, leading to sub-optimal performance in edge scenarios. To fill this gap, we propose TrajEdge, an efficient and lightweight framework for trajectory data analysis in edge environments. |
Changhao He; Ziquan Fang; Linsen Li; Yunjun Gao; |
| 211 | CaliEX: A Disk-Based Large-Scale GNN Training System with Joint Design of Caching and Execution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these systems either overlook the unique data characteristics of GNN workloads when designing cache plans or fail to fully exploit the multilevel hierarchy of storage and computation in system execution, thus resulting in disk I/O bottleneck and resource under-utilization. To address these issues, we present CaliEX, an advanced disk-based GNN system that employs joint optimizations of caching and execution within and across different training stages. |
Can Su; Haipeng Zhang; Hanyu Zhao; Wenting Shen; Baole Ai; Yong Li; Kaigui Bian; Bin Cui; |
| 212 | Efficient Data Valuation Approximation in Federated Learning: A Sampling-Based Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: After that, we identify a phenomenon termed key combinations, where only limited dataset combinations have a high-impact on final data value. Building on these insights, we propose a practical approximation algorithm, IPSS, which strategically selects high-impact dataset combinations rather than evaluating all possible combinations, thus substantially reducing time cost with minor approximation error. |
Shuyue Wei; Yongxin Tong; Zimu Zhou; Tianran He; Yi Xu; |
| 213 | CrossST: An Efficient Pre-Training Framework for Cross-District Pattern Generalization in Urban Spatio-Temporal Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose CrossST, an efficient pre-training framework designed to capture universal spatio-temporal patterns across large-scale, cross-district scenarios. |
Aoyu Liu; Yaying Zhang; |
| 214 | A Zero-Training Error Correction System with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a zero-training and interpretable EC system, named ZeroEC, that leverages large language models (LLMs) to generate chain-of-thoughts (CoTs) and correction rules for EC, without the need for model training. |
Yangyang Wu; Chen Yang; Mengying Zhu; Xiaoye Miao; Wei Ni; Meng Xi; Xinkui Zhao; Jianwei Yin; |
| 215 | ChainsFormer: Numerical Reasoning on Knowledge Graphs From A Chain Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these approaches often fail to fully leverage the potential of logical paths within the graph, limiting their effectiveness in exploiting the reasoning process. To address these limitations, we propose ChainsFormer, a novel chain-based framework designed to support numerical reasoning. |
Ze Zhao; Bin Lu; Xiaoying Gan; Gu Tang; Luoyi Fu; Xinbing Wang; |
| 216 | Local-to-Cloud Database Synchronization Via Fine-Grained Hybrid Compression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This can result in data compression rates failing to align well with network bandwidth, causing data to wait for compression or transmission, thereby leading to inferior performance. To address the above issues, we propose a fine-grained hybrid adaptive compression system that (1) parses binlog files into multiple fine-grained blocks, and (2) applies a hybrid combination of multiple compression methods to seamlessly align compression speed with the network bandwidth. |
Guoying Zhu; Haipeng Dai; Kang Yuan; Qian Wang; Lida Chen; Zhenghong Luo; Meng Li; Rong Gu; Xizi Ni; Hua Fan; Dachao Fu; Wenchao Zhou; |
| 217 | TardySketch: A Framework for Cardinality Estimation Adaptable to Sliding Windows Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing sliding cardinality estimation methods suffer from a cardinality barrel-down problem caused by unexpired item elimination in advance and item excessive removal, which remains unresolved so far. In this paper, we propose TardySketch, a sketch framework to make sliding cardinality estimation accurate and efficient by solving the above problem. |
Xuyang Jing; Qinghua Cao; Chenhao Zhang; Zheng Yan; Wenxiu Ding; Witold Pedrycz; Pu Wang; |
| 218 | A Sketch Propagation Framework for Hub Queries on Unmaterialized Relational Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel sketch propagation framework for approximate hub queries in induced relational graphs that avoids explicitly materializing those graphs. |
Yudong Niu; Yuchen Li; Panagiotis Karras; Yanhao Wang; |
| 219 | Privacy-Preserving Approximate Nearest Neighbor Search on High-Dimensional Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing PP-ANNS solutions fall short of meeting the requirements of data privacy, efficiency, accuracy, and minimal user involvement concurrently. To tackle this challenge, we introduce a novel solution that primarily executes PP-ANNS on a single cloud server to avoid the heavy communication overhead between the cloud and the user. |
Yingfan Liu; Yandi Zhang; Jiadong Xie; Hui Li; Jeffrey Xu Yu; Jiangtao Cui; |
| 220 | Hypersistent Sketch: Enhanced Persistence Estimation Via Fast Item Separation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we introduce the Hypersistent Sketch, an algorithm that significantly enhances persistence estimation through innovative filtering techniques. |
Lu Cao; Qilong Shi; Weiqiang Xiao; Nianfu Wang; Wenjun Li; Zhijun Li; Weizhe Zhang; Mingwei Xu; |
| 221 | Timestamp Approximate Nearest Neighbor Search Over High-Dimensional Vector Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce the timestamp graph, a novel structure that supports rapid index updates while minimizing storage costs. |
Yuxiang Wang; Ziyuan He; Yongxin Tong; Zimu Zhou; Yiman Zhong; |
| 222 | Accurate and Efficient Multivariate Time Series Forecasting Via Offline Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing methods, particularly those based on Transformer architectures, compute pairwise dependencies across all time steps, leading to a computational complexity that scales quadratically with the length of the input. To overcome these challenges, we introduce the Forecaster with Offline Clustering Using Segments (FOCUS), a novel approach to MTS forecasting that simplifies long-range dependency modeling through the use of prototypes extracted via offline clustering. |
Yiming Niu; Jinliang Deng; Lulu Zhang; Zimu Zhou; Yongxin Tong; |
| 223 | MultiRAG: A Knowledge-Guided Framework for Mitigating Hallucination in Multi-Source Retrieval Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These challenges manifest primarily in two aspects: the sparse distribution of multi-source data that hinders the capture of logical relationships and the inherent inconsistencies among different sources that lead to information conflicts. To address these challenges, we propose MultiRAG, a novel framework designed to mitigate hallucination in multi-source retrieval-augmented generation through knowledge-guided approaches. |
Wenlong Wu; Haofen Wang; Bohan Li; Peixuan Huang; Xinzhe Zhao; Lei Liang; |
| 224 | Interactive Learning for Diverse Top-k Set Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the TDIA algorithm that is asymptotically optimal regarding the user effort needed for interaction. |
Weicheng Wang; Raymond Chi-Wing Wong; Jinyang Li; H.V. Jagadish; |
| 225 | Efficient Maximum Balanced K-biplex Search Over Bipartite Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite its utility, the MBBC model suffers from practical limitations: its strict all-to-all connectivity and exact size-equality requirements make it impractical for noisy, incomplete real-world bipartite data. To overcome these limitations, we propose the maximum balanced k-biplex (MBKBP) model, which relaxes the stringent requirements of MBBC. |
Long Yuan; Junyue Xu; Zi Chen; Chuan Ma; Jianqiu Xu; Lu Qin; |
| 226 | MARIOH: Multiplicity-Aware Hypergraph Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, practical constraints often lead to their simplification into projected graphs, resulting in substantial information loss and ambiguity in representing higher-order relationships. In this work, we propose MARIOH, a supervised approach for reconstructing the original hypergraph from its projected graph by leveraging edge multiplicity. |
Kyuhan Lee; Geon Lee; Kijung Shin; |
| 227 | ZeroED: Hybrid Zero-Shot Error Detection Through Large Language Model Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose ZeroED, a novel hybrid error detection framework, which combines LLM reasoning ability with the machine learning pipeline via zero-shot prompting. |
Wei Ni; Kaihang Zhang; Xiaoye Miao; Xiangyu Zhao; Yangyang Wu; Yaoshu Wang; Jianwei Yin; |
| 228 | On Scalable Query Pricing in Data Marketplaces Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel arbitrage-free and scalable pricing framework ARIA to calculate the prices for various query types in linear time, including select-project-join and simple aggregate (SPJA) queries. |
Huanhuan Peng; Xiaoye Miao; Yicheng Fu; Jinshan Zhang; Shuiguang Deng; Jianwei Yin; |
| 229 | The SpaceSaving± Family of Algorithms for Data Streams with Bounded Deletions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present an advanced analysis of near optimal algorithms that use limited space to solve the frequency estimation, heavy hitters, frequent items, and top-k approximation in the bounded deletion model. |
Fuheng Zhao; Divyakant Agrawal; Amr El Abbadi; Claire Mathieu; Ahmed Metwally; Michel de Rougemont; |
| 230 | Efficient Multivariate Time Series Forecasting Via Calibrated Language Models with Privileged Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the deployment of LLMs often suffers from low efficiency during the inference phase. To address this problem, we introduce TimeKD, an efficient MTSF framework that leverages the calibrated language models and privileged knowledge distillation. |
Chenxi Liu; Hao Miao; Qianxiong Xu; Shaowen Zhou; Cheng Long; Yan Zhao; Ziyue Li; Rui Zhao; |
| 231 | Historically Relevant Event Structuring for Temporal Knowledge Graph Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These inadequacies restrict representation ability to reflect historical dependencies and future trends thoroughly. To overcome these drawbacks, we propose an innovative TKG reasoning approach towards Historically Relevant Events Structuring (HisRES). |
Jinchuan Zhang; Ming Sun; Chong Mu; Jinhao Zhang; Quanjiang Guo; Ling Tian; |
| 232 | NRP: An Efficient Index for Stochastic Routing in Road Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an efficient index-based solution for RSP queries, called Non-dominated Reliable Path (NRP). |
Libin Wang; Raymond Chi-Wing Wong; |
| 233 | A Just-In-Time Framework for Routing-Oriented Traffic Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a novel Just-In-Time Traffic Prediction framework that integrates traffic condition with routing queries for efficient localized predictions in multi-query urban environments. |
Jing Zhao; Lei Li; Mengxuan Zhang; Haolun Ma; Xiaofang Zhou; |
| 234 | Fast Maximization of Current Flow Group Closeness Centrality Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Given the widespread applications of identifying crucial nodes, we investigate the problem of maximizing CFCC for a node group $S$ subject to the cardinality constraint $\vert S \vert =k< |
Haisong Xia; Zhongzhi Zhang; |
| 235 | Towards Robust Trajectory Embedding for Similarity Computation: When Triangle Inequality Violations in Distance Metrics Matter Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing Euclidean-based trajectory embeddings often face challenges due to the triangle inequality constraints that do not universally hold for trajectory data. To address this issue, this paper introduces a novel approach by incorporating non-Euclidean geometry, specifically hyperbolic space, into trajectory representation learning. |
Jianing Si; Haitao Yuan; Nan Jiang; Minxiao Chen; Xiao Ma; Shangguang Wang; |
| 236 | Clique Comparator: A Fundamental Operator for Finding A Concise Clique Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we identify the challenge of summary updating to be how to efficiently estimate the overlap between cliques in the summary and each newly found maximal clique. |
Xiaofan Li; Rui Zhou; Lu Chen; Chengfei Liu; |
| 237 | Differentially Private Triangle Counting Assisted By $k$-Anonymity in Two-Party Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The central model cannot be applied when graph data is distributed among multiple parties without a trusted central server, while the local model provides unsatisfactory performance. In this paper, we explore a two-party scenario where each party holds private information about a group of users and is not allowed to disclose this information to the other party. |
Tingxuan Han; Wei Tong; Sheng Zhong; |
| 238 | Towards A Unified Query Plan Representation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present an exploratory case study to investigate query plan representations in nine widely-used database systems. |
Jinsheng Ba; Manuel Rigger; |
| 239 | Ultra-Flexible, Explainable, and Scalable Traffic Prediction with Dynamic Future Routes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, mainstream deep learning frameworks, which rely heavily on historical data, often struggle in real-world applications due to their inadaptability to dynamic future changes, neglect of future traffic flow as the root cause of traffic conditions, and the complexity of model structures for city-scale road networks. To solve these limitations, we propose a Route Data Management System (RouteSys) that integrates a macroscopic simulation module with lightweight traffic prediction models to estimate the future traffic conditions on individual road segments by accurately and efficiently simulating vehicle travel sequences and traffic states in advance. |
Zizhuo Xu; Lei Li; Mengxuan Zhang; Yehong Xu; Xiaofang Zhou; |
| 240 | CLEAR: A Parser-Independent Disambiguation Framework for NL2SQL Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To bridge the gap, this paper introduces the CLEAR framework, a systematic study of disambiguation for NL2SQL, including ambiguity detection, clarification, and reformulation, which benefits any NL2SQL parsers. |
Meng Zhang; Kexin Ma; Liyang Xu; Kedi Zhang; Yuanxi Peng; Ruochun Jin; |
| 241 | Finding Near-Optimal Maximum Set of Disjoint $k$-Cliques in Real-World Social Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study a new problem, that finds a maximum set of disjoint $k$-cliques in a given large real-world graph with a user-defined fixed number $k$, which can contribute to a good performance of teaming collaborative events in online games. |
Xin Chen; Wenqing Lin; Haoxuan Xie; Sibo Wang; Siqiang Luo; |
| 242 | Hybrid DRAM-NVM R-Trees with Consistency Guarantee Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To reduce DRAM consumption, which mainly depends on the metadata size of a leaf node, we present a shared byte strategy to abolish restrictions on metadata size while still keeping HR-Tree consistency. |
Kaiqi Zhang; Chengyou Shen; Siyuan Zhang; Shengfei Shi; Hong Gao; Yaofeng Tu; Jianzhong Li; |
| 243 | Leveraging Heterogeneous Experts with Advantageous Pattern Memory Learning for Traffic Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, a single modeling approach often struggles to excel across diverse traffic patterns due to the inherent complexities and external influences in traffic scenarios. To address these issues, we propose a method named Memory-enhanced Heterogeneous Mixture of Experts (MH-MoE), which leverages memory-enhanced gating to integrate multiple pretrained models. |
Yueyang Yao; Xingyuan Dai; Yisheng Lv; |
| 244 | Structure and Position-Aware Graph Modeling for Trajectory Similarity Computation Over Road Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the trajectory similarity learning over road networks. |
Peilun Yang; Hanchen Wang; Zhangyi Xu; Zhengping Qian; Yongheng Wang; Ying Zhang; |
| 245 | OMeGa: Boosting Large-scale Graph Embeddings with Heterogeneous Memory Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Additionally, the inherent sparsity of graphs induces numerous random accesses in the fundamental Sparse Matrix and Dense Matrix Multiplication (SpMM) operations of graph embedding, hindering high-performance heterogeneous memory processing. To address these challenges, this paper presents OMeGa that focuses on Optimizing heterogeneous Memory processing for large-scale Graph embedding. |
Peng Fang; Siqiang Luo; Fang Wang; Bolong Zheng; Hong Jiang; Dan Feng; Hechang Pan; Xingyu Wan; |
| 246 | Enhance Stability of Network By Edge Anchor Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce and explore the anchor trussness reinforcement problem to reinforce the overall user engagement of networks by anchoring some edges. |
Hongbo Qiu; Renjie Sun; Chen Chen; Xiaoyang Wang; |
| 247 | Machine Learning Inference Pipeline Execution Using Pure SQL Based on Operator Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, this strategy may suffer from the combination explosion on search space for complex ML pipelines. To reduce this space, we propose a greedy-based strategy by exploiting the independence among ML operators. |
Qingfeng Pan; Jiahe Zhi; Chenyang Zhang; Chen Xu; Zhao Zhang; Anita Shao; Guanglei Bao; Qiu Cui; Xiaowei Chen; Aoying Zhou; |
| 248 | Space-Efficient Compact Representations for Graph Analytics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To minimize the sizes of these representations, we mathematically formulate two graph reordering problems, MUIP and MHVIP, and provide an NP-hardness analysis. To solve these problems, we propose a space-efficient edge-dropping framework, which, powered by a weight-priority approach, offers approximation ratio guarantees. |
Boyu Yang; Weiguo Zheng; Xiang Lian; Lingfei Zheng; |
| 249 | KARMAD: KAN-Based Adversarial Robust Model for Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing methods face significant challenges, including imbalanced data, lack of labeled samples, reliance on prior knowledge, poor generalization across diverse industrial scenarios, and low sensitivity to subtle anomalies. To address these limitations, we propose KARMAD, a novel framework that integrates Kolmogorov-Arnold Networks (KANs) for bidirectional function learning, adversarial training to enhance sensitivity to minor anomalies, and an adaptive thresholding strategy for improved precision and transferability. |
Fangke Chen; Xiaotian Qiu; Yihan Ye; Ruyue Jing; Yining Chen; Dawei Gao; |
| 250 | Analyzing and Optimizing Perturbation of DP-SGD Geometrically Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our analysis reveals that, in terms of a perturbed gradient, only the noise on direction has eminent impact on the model efficiency while that on magnitude can be mitigated by optimization techniques, i.e., fine-tuning gradient clipping and learning rate. |
Jiawei Duan; Haibo Hu; Qingqing Ye; Xinyue Sun; |
| 251 | FrontOrder: Frontier-Guided Graph Reordering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: On top of the feature vectors, we propose FrontOrder, which has a customized distance metric to characterize the locality between different vertices and leverages $K$-means to cluster vertices with high locality to guide graph reordering. |
Xinmiao Zhang; Cheng Liu; Shengwen Liang; Chenwei Xiong; Yu Zhang; Lei Zhang; Huawei Li; Xiaowei Li; |
| 252 | PrivIM: Differentially Private Graph Neural Networks for Influence Maximization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This is because IM requires more complex structural information for training, resulting in an extremely larger DP noise scale than node-level tasks. To tackle these issues, we propose PrivIM, a novel differentially private subgraph-based GNNs framework for IM tasks, which ensures node-level DP guarantees. |
Renxuan Hou; Qingqing Ye; Xun Ran; Sen Zhang; Haibo Hu; |
| 253 | Efficient Structural Clustering Over Hypergraphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, SCAN, the fundamental structural clustering model, is designed for pairwise graphs and fails to capture the unique structural information inherent in hypergraphs when clustering hypergraphs. Motivated by this, we propose a new structural clustering model, HSCAN, specifically for hypergraphs. |
Dong Pan; Xu Zhou; Lingwei Li; Quanqing Xu; Chuanhui Yang; Chenhao Ma; KenLi Li; |
| 254 | AdvSGM: Differentially Private Graph Learning Via Adversarial Skip-Gram Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Nevertheless, when applying differential privacy to skip-gram in graphs, it becomes highly challenging due to the complex link relationships, which potentially result in high sensitivity and necessitate substantial noise injection. To tackle this challenge, we present AdvSGM, a differentially private skip-gram for graphs via adversarial training. |
Sen Zhang; Qingqing Ye; Haibo Hu; Jianliang Xu; |
| 255 | Accelerating Shortest Path Counting on Road Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper aims to improve the efficiency of shortest path counting. |
Zebin Chen; Kaiyu Chen; Dong Wen; Zhengyi Yang; Wentao Li; Ying Zhang; |
| 256 | Dual Utilization of Perturbation for Stream Data Publication Under Local Differential Privacy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By incorporating this deviation into the perturbation process of subsequent values, the previous noise can be calibrated. Following this insight, we introduce the Iterative Perturbation Parameterization (IPP) method, which utilizes current perturbed results to calibrate the subsequent perturbation process. |
Rong Du; Qingqing Ye; Yaxin Xiao; Liantong Yu; Yue Fu; Haibo Hu; |
| 257 | Learning from The Past: Adaptive Parallelism Tuning for Stream Processing Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose StreamTune, a novel approach for adaptive parallelism tuning in stream processing systems. |
Yuxing Han; Lixiang Chen; Haoyu Wang; Zhanghao Chen; Yifan Zhang; Chengcheng Yang; Kongzhang Hao; Zhengyi Yang; |
| 258 | Multi-Class Item Mining Under Local Differential Privacy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose frameworks for multi-class item mining, along with two mechanisms: validity perturbation to reduce the impact of invalid data, and correlated perturbation to preserve the relationship between labels and items. |
Yulian Mao; Qingqing Ye; Rong Du; Qi Wang; Kai Huang; Haibo Hu; |
| 259 | Joint Dependency and Conflicting Task Allocation in Collaboration-Aware Spatial Crowdsourcing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we define and formulate a new problem, called Joint Dependency and Conflicting Task Allocation in Collaboration-aware Spatial Crowdsourcing (JDCTA), which is proved to be NP-hard. |
Jiajun Yao; Lei Yang; Hao Liu; Hui Xiong; |
| 260 | LSM-Community: A Graph Storage System Exploiting Community Structure in Social Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To dynamically maintain the community structure during graph updates, we present the community-centric dynamic community detection algorithm $(C^{3}D)$. |
Songyao Wang; Chaokun Wang; Fang Niu; Cheng Wu; |
| 261 | Efficient Temporal Simple Path Graph Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To accelerate the processing, we propose an efficient method named Verification in Upper-bound Graph. |
Zhiyang Tang; Yanping Wu; Xiangjun Zai; Chen Chen; Xiaoyang Wang; Ying Zhang; |
| 262 | VGQ: Enabling Verifiable Graph Queries on Blockchain Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To support queries more generally, we propose VGQ, the first verifiable graph query (VGQ) framework that enables efficient graph queries on blockchain systems without altering blockchain storage structures. |
Zhongming Yao; Tianyi Li; Junchang Xin; Yushuai Li; Chenxu Wang; Zhiqiong Wang; Divesh Srivastava; Christian S. Jensen; |
| 263 | Birds of A Feather: Enhancing Multimodal Fake News Detection Via Multi-Element Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, their performance is constrained by the inherent knowledge paucity within the target news. To address this challenge, we propose ReTIP, a novel retrieval-enhanced framework for multimodal fake news detection. |
Xueqin Chen; Xiaoyu Huang; Qiang Gao; Li Huang; Jiajing Yu; Guisong Liu; |
| 264 | Catch Me If You Can: A Multi-Agent Synthetic Fraud Detection Framework for Complex Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, imbalanced data distributions and limited labeled examples increase the difficulty of detecting fraud agents. To address these challenges, we propose Catch Me If You Can—a Multi-Agent Framework to generate synthetic datasets and simulate various types of fraudulent behavior, including but not limited to anti-money laundering (AML), credit card fraud, bot attacks, and malicious traffic. |
Qianyu Wang; Wei-Tek Tsai; Tianyu Shi; Zhuang Liu; Bowen Du; |
| 265 | Tag-Filtered Approximate Nearest Neighbor Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Pre-filtering on these tags could boost the recall but lead to a large memory footprint. To address this issue, we propose three strategies in constructing a graph that strikes a balance between the performance and memory footprint; note that we are the first work on tag-frequency-aware graph-based indexing for TFANNS. |
Jiarui Luo; Miao Qiao; Chaoji Zuo; Dong Deng; |
| 266 | AdaMove: Efficient Test-Time Adaptation for Human Mobility Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing approaches mainly train a supervised model based on an offline training dataset, which overlooks the phenomenon that the mobility behaviors of humans vary across time, and the trained models may not achieve ideal performance when applied to the testing data. To tackle this challenge, in this paper, we propose AdaMove, an efficient Test-Time Adaptive (TTA) model for human mobility prediction. |
Huaxu Han; Shuliang Wang; Sijie Ruan; Qianyu Yang; Yuxuan Liang; Ziqiang Yuan; Cheng Long; Hanning Yuan; Yu Zheng; |
| 267 | Optimizing Queries with Many-to-Many Joins Highlight: This has led to much work on developing join algorithms for handling cyclic queries, on compressed (factorized) representations for more efficient storage of intermediate results, and on the use of semi-joins or predicate transfer to avoid generating large redundant intermediate results. In this paper, we address a core query optimization problem in this context. |
Hasara Kalumin; Amol Deshpande; |
| 268 | TspSZ: An Efficient Parallel Error-Bounded Lossy Compressor for Topological Skeleton Preservation Highlight: In this work, we introduce TspSZ, an efficient error-bounded lossy compression framework designed to preserve both critical points and separatrices. |
Mingze Xia; Bei Wang; Yuxiao Li; Pu Jiao; Xin Liang; Hanqi Guo; |
| 269 | TopK-BC: Efficient Maintenance of Top K (p,q)-bicliques Over Streaming Bipartite Graphs Highlight: In this paper, we study a new problem to maintain top $k$ densest (p, q)-bicliques over a streaming bipartite graph. |
Xin Deng; Zheng Qin; Peng Peng; Hui Zhou; |
| 270 | Identifying Maximum Defective Bicliques in Large Bipartite Graphs Highlight: Finding dense subgraphs in a bipartite graph is a powerful tool for uncovering meaningful patterns and extracting valuable insights across various domains. In this paper, we relax the definition of biclique to $k$-defective biclique by allowing up to $k$ missing edges, such that larger, but still dense, substructures can be identified. |
Zhiyi Wang; Lijun Chang; Jeffrey Xu Yu; |
| 271 | UMGAD: Unsupervised Multiplex Graph Anomaly Detection Highlight: (2) In unsupervised scenarios, selecting appropriate anomaly score thresholds remains a significant challenge for accurate anomaly detection. To address the above challenges, we propose a novel Unsupervised Multiplex Graph Anomaly Detection method, named UMGAD. |
Xiang Li; Jianpeng Qi; Zhongying Zhao; Guanjie Zheng; Lei Cao; Junyu Dong; Yanwei Yu; |
| 272 | Online Timestamp-Based Transactional Isolation Checking of Database Systems Highlight: Specifically, we design Chronos, an efficient timestamp-based offline SI checker. |
Hexu Li; Hengfeng Wei; Hongrong Ouyang; Yuxing Chen; Na Yang; Ruohao Zhang; Anqun Pan; |
| 273 | Privacy-Preserving Screening for Record Linkage Highlight: Nevertheless, the substantial computational and communication overheads associated with PPRL hinder its practical adoption in data markets with numerous potential collaborators. Therefore, we present the Screening-then-Linkage framework, which incorporates a lightweight Screening phase prior to the resource-intensive PPRL phase, i.e., PPRS, to mitigate the scalability issue of PPRL. |
Chenyu Huang; Fan Zhang; Huangxun Chen; Yongjun Zhao; Huaming Rao; Peng Chen; Danqing Huang; |
| 274 | Real-Time Single-Source Personalized PageRank Over Evolving Social Networks Highlight: In this study, we define a novel personalized PageRank query, n-steps SSPPR, designed to address the challenges of dynamic environments. |
Sujun Shuai; Xuan Rao; Lisi Chen; Shuo Shang; Shen Gao; |
| 275 | Consistency-Aware Scalable and Authenticated Learned Index for Range Query Highlight: Moreover, the efficiency of storage, query, verification, and update is still a major hindrance when processing large-scale data. To address these challenging issues, in this paper, we propose a novel idea of authenticated learned index that is carefully designed and actively optimized for authenticated query processing. |
Ningning Cui; Dong Wang; Huaijie Zhu; Mo Li; Jingxian Cheng; Jianxin Li; Xiaochun Yang; |
| 276 | Towards Unsupervised Entity Alignment for Highly Heterogeneous Knowledge Graphs Highlight: Unfortunately, there is no solution for unsupervised HHEA. To bridge this gap, this paper formally investigates the unsupervised HHEA problem and proposes an effective unsupervised HHEA solution, AdaCoAgentEA, which addresses the challenges of unsupervised HHEA from the perspective of multi-agent collaboration. |
Runhao Zhao; Weixin Zeng; Jiuyang Tang; Yawen Li; Guanhua Ye; Junping Du; Xiang Zhao; |
| 277 | Rabbit: Retrieval-Augmented Generation Enables Better Automatic Database Knob Tuning Highlight: However, the existing LLM-based tuning methods do not effectively harmonize multi-source external knowledge, leading to missed opportunities for enhanced knob tuning. In light of this, we propose Rabbit, a novel approach that leverages Retrieval-augmented generation to enhance database knob tuning tools, which seamlessly integrates structured historical tuning experience with graph-encoded static knowledge. |
Wenwen Sun; Zhicheng Pan; Zirui Hu; Yu Liu; Chengcheng Yang; Rong Zhang; Xuan Zhou; |
| 278 | Fairness-Aware Active Online Learning with Changing Environments Highlight: This work tackles this challenge by addressing a novel paradigm: Fairness-Aware Active Online Learning. We introduce a simple yet effective approach – FACTION, which actively selects the most crucial data points for labeling, going beyond traditional methods by considering both model uncertainty (epistemic uncertainty) and a newly introduced fairness notion derived from this very uncertainty. |
Sadaf MD Halim; Chen Zhao; Xintao Wu; Latifur Khan; Christan Earl Grant; Feng Chen; |
| 279 | Towards Fine-Grained Scalability for Stateful Stream Processing Systems Highlight: In this paper, we propose DRRS, an on-the-fly scaling method that reduces performance overhead at the system level with three key innovations: (i) fine-grained scaling signals coupled with a re-routing mechanism that significantly mitigates propagation delay, (ii) a sophisticated record-scheduling mechanism that substantially reduces processing suspension, and (iii) subscale division, a mechanism that partitions migrating states into independent subsets, thereby reducing dependency-related overhead to enable finer-grained control and better runtime adaptability during scaling. |
Yunfan Qing; Wenli Zheng; |
| 280 | Synthesizing Scoring Functions for Rankings Using Symbolic Gradient Descent Highlight: To further improve RankHow’s scalability, we propose a novel approximation technique called symbolic gradient descent (SYM-GD). |
Zixuan Chen; Panagiotis Manolios; Mirek Riedewald; |
| 281 | Efficient Integration of Multi-View Attributed Graphs for Clustering and Embedding Highlight: In this paper, we present a spectrum-guided Laplacian aggregation scheme with an effective objective formulation and two efficient algorithms SGLA and SGLA+, to cohesively integrate all views of $\mathcal{G}$ into an MVAG Laplacian matrix, which readily enables classic graph algorithms to handle $\mathcal{G}$ with superior performance in clustering and embedding tasks. |
Yiran Li; Gongyao Guo; Jieming Shi; Sibo Wang; Qing Li; |
| 282 | MemQ: A Graph-Based Query Memory Prediction Framework for Effective Workload Scheduling Highlight: This paper introduces MemQ, a graph-based memory prediction framework designed for effective workload scheduling. |
Yang Wu; Xuanhe Zhou; Xiaoguang Li; Jinhuai Kang; Chunxiao Xing; Tongliang Li; Xinjun Yang; Wenchao Zhou; Feifei Li; Yong Zhang; |
| 283 | $\mathrm{E}^{3}\text{FS}$: Efficient, Secure, and Verifiable Fuzzy Search with Data Updates in Hybrid-Storage Blockchains Highlight: In this paper, we propose E3FS, the first efficient, secure, and verifiable search scheme over dynamically updatable datasets in HSB systems, supporting multi-keyword fuzzy search, an important search function. |
Pengcheng Sun; Lan Zhang; Jiandong Liu; Chen Tang; Jialiang Wang; |
| 284 | Efficient Core Propagation Based Hierarchical Graph Clustering Highlight: To tackle it, we propose theoretically guaranteed fast solutions, in terms of algorithm complexity and hierarchy levels. |
Jinbin Huang; Zihan Jia; Xin Huang; |
| 285 | GraphPrompter: Multi-Stage Adaptive Prompt Optimization for Graph In-Context Learning Highlight: Graph In-Context Learning, with the ability to adapt pre-trained graph models to novel and diverse downstream graphs without updating any parameters, has gained much attention in the community. |
Rui Lv; Zaixi Zhang; Kai Zhang; Qi Liu; Weibo Gao; Jiawei Liu; Jiaxia Yan; Linan Yue; Fangzhou Yao; |
| 286 | Proving Cypher Query Equivalence Highlight: In this paper, we propose GraphQE, an automated prover to determine whether two Cypher queries are semantically equivalent. |
Lei Tang; Wensheng Dou; Yingying Zheng; Lijie Xu; Wei Wang; Jun Wei; Tao Huang; |
| 287 | AID-SQL: Adaptive In-Context Learning of Text-to-SQL with Difficulty-Aware Instruction and Retrieval-Augmented Generation Highlight: In this paper, we propose an adaptive in-context learning approach with difficulty-aware instruction and retrieval-augmented generation to enhance the performance of Text-to-SQL translation (AID-SQL). |
Xiuwen Li; Qifeng Cai; Yang Shu; Chenjuan Guo; Bin Yang; |
| 288 | Approximate Borderline Sampling Using Granular-Ball for Classification Tasks Highlight: In this paper, an approximate borderline sampling method using GBs is proposed for classification tasks. |
Qin Xie; Qinghua Zhang; Shuyin Xia; |
| 289 | With Anchors or Not: Fairness-Aware Truss-Based Community Search on Attributed Graphs Highlight: Thus, in this paper, we use the k-truss model, which is a relaxation of the clique but whose members have large engagement and high tie strength, to describe fair communities, namely fair k-truss communities (FTC) and anchored fair k-truss communities (AFTC, using anchored vertices to help satisfy the fairness constraint). |
Xinrui Wang; Zilong Liu; Shixin Ye; Xin Huang; Hong Gao; Xiuzhen Cheng; Dongxiao Yu; |
| 290 | Finding A Summary for All Maximal Bicliques Highlight: Using a summary of these gene-protein relationships not only provides more representative insights but also significantly reduces the time needed for analysis. To find such representative maximal bicliques faster, we propose a method to determine whether to terminate the current search by computing lower bounds. |
Xintong Yu; Rui Zhou; Xiaofan Li; Lu Chen; Chengfei Liu; |
| 291 | Boosting End-to-End Database Isolation Checking Via Mini-Transactions Highlight: By leveraging MTs’ read-modify-write pattern, we develop highly efficient algorithms to verify strong isolation levels in linear or quadratic time. |
Hengfeng Wei; Jiang Xiao; Na Yang; Si Liu; Zijing Yin; Yuxing Chen; Anqun Pan; |
| 292 | Effective and Scalable Heterogeneous Graph Neural Network Framework with Convolution-oriented Attention Highlight: In this study, we introduce the gatekeeping theory in heterogeneous graph learning and investigate the primary challenges limiting current HGNNs. To address these challenges, we propose a novel, effective, and scalable heterogeneous graph neural network framework, the Heterogeneous Convolution-oriented Attention Network (HCAN). |
Ziqian Zhang; Chaokun Wang; Shuwen Zheng; Cheng Wu; Ziyang Liu; Hao Feng; |
| 293 | EPAS: Efficient Online Log Parsing Via Asynchronous Scheduling of LLM Queries Highlight: Large language models (LLMs) offer new opportunities with their advanced semantic capabilities, yet current LLM-based log parsing methods are limited by sequential processing inefficiencies, suboptimal sampling strategies, and lack of robust template refinement mechanisms. To address these challenges, we propose EPAS (Efficient Parsing via Asynchronous Scheduling), a novel parser that utilizes asynchronous scheduling to optimize LLM-based log parsing. |
Xiaolei Chen; Jia Chen; Jie Shi; Peng Wang; Wei Wang; |
| 294 | Efficient Mixed Precision Quantization in Graph Neural Networks Highlight: In this paper, we introduce a theorem for efficient quantized message passing to aggregate integer messages. |
Samir Moustafa; Nils Kriege; Wilfried N. Gansterer; |
| 295 | Anomaly Diagnosis with Siamese Discrepancy Networks in Distributed Cloud Databases Highlight: However, since anomalies seldom occur, and anomalies with the same root cause may exhibit significantly different behaviors across different cloud database clusters, existing methods often lack sufficient training data, and they cannot generalize well from some clusters to others. Therefore, we take both anomaly and normal data into consideration, based on an observation that the discrepancy between the anomaly and normal data is relatively consistent compared to the behaviors of anomalies themselves. |
Lingsen Yan; Bolong Zheng; Junjie Qing; Wenlong You; Tingyang Chen; Zhi Xu; Shuncheng Liu; Kai Zeng; Tao Ye; Xiaofang Zhou; |
| 296 | Compatible Unsupervised Anomaly Detection with Multi-Perspective Spatio-Temporal Learning Highlight: However, the implementation of existing anomaly detection methods is still challenging in (i) capturing the complex spatial and temporal correlations of multivariate time series, (ii) effectively adapting to the unsupervised condition, and (iii) generalizing across nodes in distributed systems. To address these challenges, we design a multi-perspective spatio-temporal attention model, called STAMP, which consists of a prediction module ST-ATTN, a reconstruction module AutoEncoder, and an adversarial optimizing module. |
Tingyang Chen; Bolong Zheng; Shuncheng Liu; Zhujiong Fan; Zhi Xu; Lingsen Yan; Kai Zeng; Tao Ye; Xiaofang Zhou; |
| 297 | OptMatch: An Efficient and Generic Neural Network-Assisted Subgraph Matching Approach Highlight: Nevertheless, the accuracy of the returned approximate results could be improved significantly. Motivated by these observations, we proposed OptMatch, an efficient and generic neural network-assisted subgraph matching approach, in this work. |
Wenzhe Hou; Xiang Zhao; Bo Tang; |
| 298 | Personalized Multi-Interest Modeling for Cross-Domain Recommendation to Cold-Start Users Highlight: To this end, we propose a personalized multi-interest modeling framework for CDR to cold-start users, termed NF-NPCDR. |
Xiaodong Li; Jiawei Sheng; Jiangxia Cao; Xinghua Zhang; Wenyuan Zhang; Yong Sun; Shirui Pan; Zhihong Tian; Tingwen Liu; |
| 299 | Pseudo-label-Based Unsupervised Granular-Ball Division and Fast Spectral Clustering for High-Dimensional Data Highlight: Clustering such data presents a significant challenge, as existing methods often suffer from slow execution speeds and reduced clustering accuracy. To tackle these issues, we introduce the granular-ball approach, which aims to decrease the number of sample points and enhance processing speed, while also improving clustering accuracy through feature selection. |
Dongdong Cheng; Xiaocui Jiang; Shuyin Xia; Guoyin Wang; |
| 300 | FastFT: Accelerating Reinforced Feature Transformation Via Advanced Exploration Strategies Highlight: (3) Rare significant transformations lead to sparse valuable feedback that hinders the learning processes or leads to less effective results. In response to these challenges, we introduce FASTFT, an innovative framework that leverages a trio of advanced strategies. |
Tianqi He; Xiaohan Huang; Yi Du; Qingqing Long; Ziyue Qiao; Min Wu; Yanjie Fu; Yuanchun Zhou; Meng Xiao; |
| 301 | MLKV: Efficiently Scaling Up Large Embedding Model Training with Disk-based Key-Value Storage Highlight: This paper presents MLKV, an efficient, extensible, and reusable data storage framework designed to address the scalability challenges in embedding model training, specifically data stall and staleness. |
Yongjun He; Roger Waleffe; Zhichao Han; Johnu George; Binhang Yuan; Zitao Zhang; Yinan Shan; Yang Zhao; Debojyoti Dutta; Theodoros Rekatsinas; Ce Zhang; |
| 302 | A Bilateral Perspective for Modeling Real-Time Traffic Trends in Live-Streaming Recommendation Highlight: However, existing approaches predominantly focus on static user-item modeling or unilateral consumer-side metrics, failing to capture the spatiotemporal dependencies in multi-behavior traffic or address the interdependence between production and consumption. To bridge these gaps, we propose a novel paradigm for live-streaming recommendation from a bilateral perspective. |
Rui Li; Pengyuan Gao; Haihan Li; Ling Chai; Shaohao Huang; Ting Xie; |
| 303 | CDMap: Complementarity and Disparity-aware Map Inference Quality Enhancement Highlight: In view of that, we propose a Complementarity and Disparity-aware Map Inference Framework, called CDMap, consisting of grid dual feature extraction, contextual road difference-embedded grid representation, dual feature complementary network-based road topology prediction and parallel roads disparity-enhanced model optimization. |
Wenyu Wu; Jiali Mao; Jiafan Liu; Yixiao Tong; Lisheng Zhao; Shaosheng Cao; Jilin Hu; Aoying Zhou; Lin Zhou; |
| 304 | Stability Is Not Downtime: Comprehensive Stability Evaluation for Large-Scale Cloud Servers in Alibaba Cloud Highlight: However, through our extensive engagement in stability-related work, we have discovered that only 27% of stability issues are related to unavailability, which is in fact a subset of stability. Therefore, in this paper, we propose the Comprehensive Damage Indicator (CDI), comprising three distinct sub-metrics: the Unavailability Indicator, Performance Indicator, and Control-plane Indicator. |
Haoyu Wang; Zhicheng Liu; Yeliang Qiu; Haozhe Li; Hongke Guo; Zhaoliang Zhu; You Zhang; Yu Zhou; Xudong Zheng; |
| 305 | OceanBase Unitization: Building The Next Generation of Online Map Applications Highlight: In this paper, we propose the architectural design of OceanBase (OB), a distributed database system that “unitizes” services and operations into individual machines. |
Quanqing Xu; Wei Sun; Chuanhui Yang; Jinlong Liu; Ziyun Wei; Fusheng Han; Liang Wang; Xiaowei Zhai; |
| 306 | PAS: Plug-and-Play Prompt Augmentation System Highlight: However, users often face challenges in writing prompts due to the steep learning curve and significant time investment, and existing automatic prompt engineering (APE) models can be difficult to use. To address this issue, we propose PAS, an LLM-based plug-and-play APE system. |
Miao Zheng; Hao Liang; Fan Yang; Bin Cui; Zenan Zhou; Wentao Zhang; |
| 307 | Scaling and Hardening XLOG: The SQL Azure Hyperscale Log Service Highlight: This paper describes how to address scalability challenges by applying various techniques. |
Jack Hu; Eric Lee; Prashanth Purnananda; Hanuma Kodavalla; |
| 308 | Hyperscale Resilient Buffer Pool Extension in Azure SQL Database Highlight: This paper introduces Resilient Buffer Pool Extension (RBPEX), which is a persistent cache present on both compute and storage nodes. |
Rogerio Ramos; Prashanth Purnananda; Hanuma Kodavalla; Chaitanya Gottipati; Harshil Ambagade; Ankit Anvesh; Srikanth Sampath; |
| 309 | DataSculpt: A Holistic Data Management Framework for Long-Context LLMs Training Highlight: Through extensive experimental analysis, we identified three key challenges in designing effective data management strategies that enable the model to achieve long-context capability without sacrificing performance in other tasks: (1) a shortage of long documents across multiple domains, (2) effective construction of context windows, and (3) efficient organization of large-scale datasets. To address these challenges, we introduce DataSculpt, a novel data management framework designed for long-context training. |
Keer Lu; Xiaonan Nie; Zheng Liang; Da Pan; Shusen Zhang; Keshi Zhao; Weipeng Chen; Zenan Zhou; Guosheng Dong; Bin Cui; Wentao Zhang; |
| 310 | BBS: Batch-Based Snapshot for The Cloud Database Backup Highlight: In this paper, we observed that access to snapshots has the characteristics of locality and continuity. |
Xiaoshuang Peng; Xiaopeng Fan; Shi Cheng; Lingbin Meng; Cuiyun Fu; Wenchao Zhou; Chuliang Weng; |
| 311 | FlowFill: A Fast and Energy-Efficient Streaming Framework for Recovering Missing Values in Industrial Sensor Data Highlight: Existing imputation methods often struggle to efficiently handle large data volumes in a streaming manner, typically requiring full access to historical time series and consuming excessive energy and processing time. To address these limitations, we propose FlowFill, a fast and energy-efficient streaming framework for recovering missing values in industrial sensor data. |
Hao Huang; Scott Evans; |
| 312 | GEX: Guiding Expert Tuning with EXplainable AI Highlight: We present GEX, a system that provides interpretable insights into database optimizer behavior using explainable AI techniques. |
Andrew Chai; Alexander Bianchi; Vincent Corvinelli; Parke Godfrey; Jarek Szlichta; Calisto Zuzarte; |
| 313 | Enabling Light-Weight Reasoning Via Cypher Triggers Highlight: In this paper, we present a general scheme for generating active rules that correctly handle recursion, aggregation, and stratified negation, so as to deploy reactive reasoners over graph data managers. |
Davide Magnanimi; Andrea Colombo; Luigi Bellomarini; Anna Bernasconi; Stefano Ceri; Davide Martinenghi; |
| 314 | Advancing Cloud-Native Cyber Threat Detection with Graph-Based Feature Engineering Highlight: Despite a range of solutions in the literature, attack classification remains an arduous task in industrial environments. To address this, we propose a comprehensive and deployment-friendly graph-based framework. |
Tailai Song; Mukharbek Organokov; Lennart Gulikers; Giulio Grassi; Giovanna Carofiglio; Michela Meo; |
| 315 | MSPR: A Multi-Scenario Player Recommendation Framework for Online Gaming Highlight: However, existing recommendation systems are typically designed for single gaming scenarios, making them unable to accommodate diverse needs, which limits their accuracy and relevance. To overcome these challenges, this paper presents a multi-scenario player recommendation (MSPR) framework that delivers context-aware recommendations by capturing the nuanced distinctions and commonalities across multiple scenarios. |
Minghao Chen; Haodong Chen; Yi Zeng; Zhenfeng Liang; |
| 316 | Effectively PAIRing LLMs with Online Marketing Via Progressive Prompting Augmentation Highlight: In this paper, we seek to carefully prompt a Large Language Model (LLM) with domain-level knowledge as a better marketing-oriented knowledge miner for marketing-oriented knowledge graph construction, which is non-trivial, suffering from several inevitable issues in real-world marketing scenarios, i.e., uncontrollable relation generation of LLMs, insufficient prompting ability of a single prompt, unaffordable deployment cost of LLMs. |
Chunjing Gan; Dan Yang; Binbin Hu; Ziqi Liu; Yue Shen; Zhiqiang Zhang; Jinjie Gu; Jun Zhou; Guannan Zhang; |
| 317 | Model-Accuracy Aware Query Routing for Smart Logistics Service Highlight: In this work, we propose a Model-Accuracy Aware Service that facilitates a flexible trade-off between model prediction accuracy and primary database performance. |
Shanshan Huang; Zhiwei Ye; Peng Cai; Qiwen Dong; |
| 318 | BlendHouse: A Cloud-Native Vector Database System in ByteHouse Highlight: In this paper, we present BlendHouse, a cloud-native and generalized vector database system built on top of the disaggregated storage and computation architecture. |
Zhaojie Niu; Xinhui Tian; Xindong Peng; Xing Chen; |
| 319 | DataLab: A Unified Platform for LLM-Powered Business Intelligence Highlight: In this paper, we introduce DataLab, a unified BI platform that integrates a one-stop LLM-based agent framework with an augmented computational notebook interface. |
Luoxuan Weng; Yinghao Tang; Yingchaojie Feng; Zhuo Chang; Ruiqin Chen; Haozhe Feng; Chen Hou; Danqing Huang; Yang Li; Huaming Rao; Haonan Wang; Canshi Wei; Xiaofeng Yang; Yuhui Zhang; Yifeng Zheng; Xiuqi Huang; Minfeng Zhu; Yuxin Ma; Bin Cui; Peng Chen; Wei Chen; |
| 320 | SylphDB: An Active and Adaptive LSM Engine for Update-Intensive Workloads Highlight: In this work, we demonstrate that an LSM-tree faces two issues when dealing with update-intensive workloads. |
Jun-Peng Zhu; Zhiwei Ye; Xiaolong He; Peng Cai; Xuan Zhou; Aoying Zhou; Dunbo Cai; Ling Qian; Kai Xu; Liu Tang; Qi Liu; |
| 321 | Scalable Machine Learning for Real-Time Fault Diagnosis in Industrial IoT Cooling Roller Systems Highlight: Evaluations of state-of-the-art fault diagnosis (FD) methods, including online continual learning (OCL) algorithms like Camel, reveal their limitations in meeting the real-time adaptability and data processing demands of HRSCR-HMS. To address these challenges, we propose SRTFD, a scalable framework tailored for real-time fault diagnosis in industrial IoT systems. |
Dandan Zhao; Karthick Sharma; Yuxin Qi; Qixun Liu; Shuhao Zhang; |
| 322 | Few Labels Are All You Need: A Weakly Supervised Framework for Appliance Localization in Smart-Meter Series Highlight: In this paper, we introduce CamAL, a weakly supervised approach for appliance pattern localization that only requires information on the presence of an appliance in a household to be trained. |
Adrien Petralia; Paul Boniol; Philippe Charpentier; Themis Palpanas; |
| 323 | GraphEx: A Graph-Based Extraction Method for Advertiser Keyphrase Recommendation Highlight: We introduce GraphEx, an innovative graph-based approach that recommends keyphrases to sellers using extraction of token permutations from item titles. |
Ashirbad Mishra; Soumik Dey; Hansi Wu; Jinyu Zhao; He Yu; Kaichen Ni; Binbin Li; Kamesh Madduri; |
| 324 | Bridging The Gap: LLM-Powered Transfer Learning for Log Anomaly Detection in New Software Systems Highlight: Cross-system log anomaly detection methods attempt to transfer knowledge from mature systems to new ones but often struggle with syntax differences and system-specific knowledge, which hinders their effectiveness. To address these issues, this paper proposes LogSynergy, a novel transfer learning-based log anomaly detection framework. |
Yicheng Sui; Xiaotian Wang; Tianyu Cui; Tong Xiao; Chenghao He; Shenglin Zhang; Yuzhi Zhang; Xiao Yang; Yongqian Sun; Dan Pei; |
| 325 | MAP: Abandoned Property Detection Using Multifaceted Urban Service Data Highlight: In this work, we propose a novel approach to abandoned property detection by combining urban service data, such as utility payment records, service requests, and complaints reported by residents, which are available from digital city management systems. |
Liangkai Zhou; Agbonlahor Edomwonyi; Shan Lin; |
| 326 | M2oERank: Multi-Objective Mixture-of-Experts Enhanced Ranking for Satisfaction-Oriented Web Search Highlight: In this paper, we propose a novel PLM-based ranking approach, M2oERank: Multi-objective Mixture-of-Experts (MoE) enhanced Ranking. |
Yuchen Li; Hao Zhang; Yongqi Zhang; Xinyu Ma; Wenwen Ye; Naifei Song; Shuaiqiang Wang; Haoyi Xiong; Dawei Yin; Lei Chen; |
| 327 | BIGCity: A Universal Spatiotemporal Model for Unified Trajectory and Traffic State Data Analysis Highlight: Although recent advances in ST data pre-training and ST foundation models aim to develop universal models for ST data analysis, most existing models are “multi-task, solo-data modality” (MTSM), meaning they can handle multiple tasks within either trajectory data or traffic state data, but not both simultaneously. To address this gap, this paper introduces BIGCity, a pioneering multi-task, multi-data modality (MTMD) model for ST data analysis. |
Xie Yu; Jingyuan Wang; Yifan Yang; Qian Huang; Ke Qu; |
| 328 | GalaxyView: Property Graph Transformation for Materialized View Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the practical use of graph databases, storing graphs separately enhances maintainability, while integrating them into a unified graph facilitates advanced analytics. To address these dual needs, we present a GQL-compatible framework for creating graph views across multiple property graphs. |
Bing Tong; Jianheng Tang; Yan Zhou; Chen Zhang; Jia Li; Lei Chen; |
| 329 | Large-Scale Private Computation for Real-World Applications Via Trusted Hardware and Obliviousness Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this tutorial, we present the latest advancements in TEE-based oblivious primitives, highlighting multiple applications such as private contact discovery for Signal, our new proposal for anonymous communication resistant to long-term traffic analysis (SP’25), oblivious relational (USENIX’25) and graph databases (PVLDB’24), key transparency, searchable encryption, large-scale software activity monitoring, federated learning, privacy-preserving LLMs, Google’s Privacy Sandbox initiative, Google’s FLEDGE, Google’s Titan Security Key and Google’s Asylo. |
Ioannis Demertzis; |
| 330 | How to Answer Secure and Private SQL Queries? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This tutorial highlights the importance of integrating robust security and privacy measures into query processing to build trustworthy database systems. |
Qiyao Luo; Quanqing Xu; Chuanhui Yang; |
| 331 | A Unified Narrative for Query Processing in Graph Databases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We thus propose this tutorial to provide a unified narrative for graph query processing, so as to bridge the gap between existent lines of work and offer a comprehensive view of the query processing workflow in graph databases. |
Yue Pang; Lei Zou; M. Tamer Özsu; |
| 332 | An Overview of Path Queries on Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this tutorial, we focus on four major categories of path queries: plain shortest path queries, constrained shortest path queries, shortest path summary queries, and non-shortest path queries. |
Wentao Li; Dong Wen; Lu Qin; Ying Zhang; |
| 333 | AIGC for Graphs: Current Techniques and Future Trends Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This tutorial outlines the latest developments in AIGC for graph generation. |
Hanchen Wang; Dawei Cheng; Ying Zhang; Wenjie Zhang; |
| 334 | Towards Retrieval-Augmented Large Language Models: Data Management and System Design Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Retrieval-augmented generation (RAG) has become a transformative approach for enhancing large language models (LLMs) by integrating external, reliable, and up-to-date knowledge. |
Wenqi Fan; Pangjing Wu; Yujuan Ding; Liangbo Ning; Shijie Wang; Qing Li; |
| 335 | Machine Learning on The Fly: A Hands-On Tutorial for Streaming Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This tutorial introduces key concepts and techniques in data stream learning, blending foundational theory with practical demonstrations. |
Heitor Murilo Gomes; Nuwan Gunasekara; Yibin Sun; |
| 336 | Data Driven Decision Making with Time Series and Spatio-Temporal Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As part of the continued digitalization of processes throughout society, increasingly large volumes of time series and spatio-temporal data are available. In this tutorial, we focus on data-driven decision making with such data, e.g., enabling greener and more efficient transportation based on traffic time series forecasting. |
Bin Yang; Yuxuan Liang; Chenjuan Guo; Christian S. Jensen; |
| 337 | Navigating Data Errors in Machine Learning Pipelines: Identify, Debug, and Learn Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Addressing data errors such as wrong, missing, noisy, biased, or out-of-distribution values has become a crucial part of the machine learning (ML) development lifecycle. … |
Bojan Karlaš; Babak Salimi; Sebastian Schelter; |
| 338 | DBMS with CXL Memory: What’s New and What’s Next Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this tutorial, we introduce the advantages and application scenarios of tiered CXL memory, pooled CXL memory, and shared CXL memory, as well as the new challenges that arise, including: (1) reducing data access, data exchange, and data transfer cost; (2) optimizing memory allocation and competition to improve memory utilization; (3) managing shared data for failure, series operators, and distributed transactions. |
Yunyan Guo; Zhuopeng Li; Guoliang Li; |
| 339 | DataMorpher: Automatic Data Transformation Using LLM-Based Zero-Shot Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing approaches rely on supervised learning, which requires tremendous data labeling and training overhead. To alleviate such overhead while improving accuracy, we demonstrate a novel system DataMorpher that leverages Large Language Models (LLMs) to generate code that transforms source datasets into a user-specified target format. |
Ankita Sharma; Jaykumar Tandel; Xuanmao Li; Lanjun Wang; Anna Fariha; Liang Zhang; Syed Arsalan Ahmed Naqvi; Irbaz Bin Riaz; Lei Cao; Jia Zou; |
| 340 | Db2une: Tuning IBM Db2 with Deep Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the design of, and a demonstration plan for, Db2une, an automated, query-aware tuning system employing deep-learning techniques to enhance performance while also conserving resources. |
Alexander Bianchi; Rafael Dolores; Andrew Chai; Vincent Corvinelli; Parke Godfrey; Jarek Szlichta; Calisto Zuzarte; |
| 341 | DreamCreek: AI for Battery Formation and Grading Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We will demonstrate how DreamCreek works with a guided tour, and show how it reduces the time of the formation and grading phase to 4 hours, with an error rate in the range of [0.06 %, 1%]. |
Wenfei Fan; Yang Leng; Daji Li; Shuhao Liu; Mingliang Ouyang; Yaoshu Wang; Min Xie; Qiang Yuan; |
| 342 | BitTuner: A Toolbox for Automatically Configuring Learned Data Compressors Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, determining the optimal configuration of underlying ML models to maximize compression efficacy is nontrivial. To address this, by analyzing the distribution characteristics of input keys, we propose BitTuner, a novel framework that automatically sets model hyper-parameters to provably achieve the best compression ratio. |
Qiyu Liu; Yuxin Luo; Mengke Cui; Siyuan Han; Jingshu Peng; Jin Li; Lei Chen; |
| 343 | DeviceScope: An Interactive App to Detect and Localize Appliance Patterns in Electricity Consumption Time Series Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces DeviceScope [1], an interactive tool designed to facilitate understanding smart meter data by detecting and localizing individual appliance patterns within a given time period. |
Adrien Petralia; Paul Boniol; Philippe Charpentier; Themis Palpanas; |
| 344 | Towards On-Database Contextual Model Explanation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We demonstrate DBxAI, a system for explaining predictions of arbitrary machine learning models even if model owners opt not to, giving the right-to-explanation to model users as requested by GDPR. |
Shuai An; Yang Cao; |
| 345 | scRAG: An Efficient Retrieval Augmented Generation System for scRNA-seq Data Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we demonstrate scRAG, which can efficiently remove batch effect in cell-type identification and enable reliable new cell discovery, facilitated by GPU-based scRNA-seq data management and Large Language Models (LLMs). |
Yuren Mao; Yifan Zhu; Qing Liu; Peigen Liu; Haoran Yu; Yunjun Gao; |
| 346 | EasyTime: Time Series Forecasting Made Easy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By demonstrating EasyTime (https://decisionintelligence.github.io/EasyTime), we aim to show how it simplifies the use of time-series forecasting and facilitates the development of new generations of time series forecasting methods. |
Xiangfei Qiu; Xiuwen Li; Ruiyang Pang; Zhicheng Pan; Xingjian Wu; Liu Yang; Jilin Hu; Yang Shu; Xuesong Lu; Chengcheng Yang; Chenjuan Guo; Aoying Zhou; Christian S. Jensen; Bin Yang; |
| 347 | Data Backup System with No Impact on Business Processing Utilizing Storage and Container Technologies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We integrated the storage and container technologies into the demonstration system, which can eliminate both system slowdown and downtime. |
Satoru Watanabe; |
| 348 | SwiftDP: An Efficient Framework for Automated Data Preparation Pipeline Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose SwiftDP, an efficient framework for automated data preparation based on Monte Carlo Tree Search. |
Liangwei Li; Yiyi Zhang; Ning Wang; |
| 349 | MITra: Populating Graph Traversal Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We demonstrate MITra, a system for synthesizing Multi-Instance graph Traversal algorithms that traverse from multiple source vertices simultaneously over a single thread. |
Wenyue Zhao; Jia Li; Yang Cao; Nikos Ntarmos; |
| 350 | Graphint: Graph-Based Time Series Clustering Visualisation Tool Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, prevailing methods often encounter obstacles in maintaining data relationships and ensuring interpretability. We present Graphint, an innovative system based on the $k$-Graph methodology that addresses these challenges. |
Paul Boniol; Donato Tiano; Angela Bonifati; Themis Palpanas; |
| 351 | HRLMS: A Data-Driven Hierarchical Reinforcement Learning System for Interactive Rule Intervention and Visualization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, there is an urgent need for the ability to intervene directly and simply during training. To address these issues, we present the Interactive Hierarchical Reinforcement Learning Monitoring System (HRLMS). |
Haodi Zhang; Xiangyu Zeng; Chen Zhang; Yuanfeng Song; Kaishun Wu; |
| 352 | EAST: An Interpretable Knob Estimation System for Cloud Database Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unfortunately, less attention has been paid to estimating the performance of the knob configuration. To fill this gap, we propose EAST, a knob estimation system to provide interpretable & transferable knob estimation services for cloud databases. |
Yu Yan; Hongzhi Wang; Jian Geng; Zixuan Wang; Xingyan Li; Tianqing Wang; |
| 353 | HiVQ: A Real-time Interactive Visual Query System on Geospatial Big Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, developing such systems has become increasingly challenging in recent years due to the conflict between the unprecedented volume of data and the need for instantaneous feedback. To address this challenge, we present HiVQ, a High-performance Visual Query system for real-time interactive visual query of geospatial big data. |
Zebang Liu; Anran Yang; Mengyu Ma; Luo Chen; Jiali Zhou; Ning Jing; |
| 354 | Popper: A Dataflow System for In-Flight Error Handling in Machine Learning Workflows Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Popper, a dataflow system for building Machine Learning (ML) workflows. |
Adnan Shakeel Ahmed; Abhilash Jindal; Kaustubh Beedkar; |
| 355 | UniClean: A Multi-Signal Fusion Pipeline for Optimizing Data Cleaning Workflow Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the growing demand for advanced cleaning tools driven by the complexity of data activities, we propose the UniClean framework for on-demand big data cleaning. |
Xiaoou Ding; Zekai Qian; Hongzhi Wang; Zhe Sun; Siying Chen; Hongbin Su; Huan Hu; |
| 356 | LineageX: A Column Lineage Extraction System for SQL Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we demonstrate LineageX, a lightweight Python library that infers column-level lineage from SQL queries and visualizes it through an interactive interface. |
Shi Heng Zhang; Zhengjie Miao; Jiannan Wang; |
| 357 | CSKQS: A Query System for Collective Spatial Keyword Queries Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we demonstrate a Collective Spatial Keyword Queries System (CSKQS) that supports both the conventional CoSKQ, and a variant called Cost-Aware and Distance-Constrained CoSKQ. |
Harry Kai-Ho Chan; |
| 358 | PixelsDB: Serverless and NL-Aided Data Analytics with Flexible Service Levels and Prices Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents PixelsDB, an open-source data analytic system that allows users who lack system or SQL expertise to explore data efficiently. |
Haoqiong Bian; Dongyang Geng; Haoyang Li; Yunpeng Chai; Anastasia Ailamaki; |
| 359 | RAISIN: A Parallel Subgraph Matching Tool Exploiting Community Structures in Social Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a subgraph matching tool called RAISIN. |
Songyao Wang; Chaokun Wang; |
| 360 | Chat2DB: Chatting to The Database with Interactive Agent Assisted Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The existing Text-to-SQL parser exhibits limitations in its adaptability to new databases, and its execution accuracy is not sufficient for building conversational applications, typically necessitating further fine-tuning for specific databases. In this paper, we introduce Chat2DB, a conversational system designed for database interactions that enhances parser capabilities, rendering them applicable in real-world contexts. |
Boyan Xu; Yuyuan Cai; Shaobin Shi; Zhifeng Hao; Ruichu Cai; |
| 361 | ARAG: Analysis and Retrieval Augmented Generation for Comprehensive Reasoning Over Socioeconomic Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, applying them independently has limited their effectiveness in scenarios that require a synthesis of both data analysis and contemporary information retrieval. To bridge this gap, we introduce the Analysis and Retrieval Augmented Generation (ARAG) framework, which integrates data analysis with the retrieval of up-to-date information. |
Yixiong Xiao; Jingjia Cao; Yangxin Jiang; Jingbo Zhou; |
| 362 | Artemis: A Customizable Workload Generation Toolkit for Benchmarking Cardinality Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To satisfy the need, we introduce Artemis, a customizable workload generator, which can be used to generate various scenarios with the sensitive features for CardEst, including various data dependencies, complex SQL structures, and diverse cardinalities. |
Zirui Hu; Rong Zhang; Chengcheng Yang; Xuan Zhou; Quanqing Xu; Chuanhui Yang; |
| 363 | IKGA: An Interactive Visualization Tool for Knowledge Graph Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, there is currently no interactive tool to support KGA research, particularly for visualizing alignment results, hence limiting the understanding of the procedure and also the development of more advanced solutions. To fill in this gap, in this paper, we introduce IKGA, an interactive visualization tool for KGA, which visualizes the alignment process by integrating various algorithms of representation learning and alignment inference, the two key steps in KGA. |
Weixin Zeng; Shiqi Zhang; Huang Peng; Zhen Tan; Weidong Xiao; Xiang Zhao; |
| 364 | CBAClean: A Comprehensive System for Recommending Data Cleaning Solutions Through Cost-Benefit Analysis in Data Quality Management Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This omission can lead to the failure of promising data analysis solutions. To address this, we propose CBAClean, a comprehensive system that integrates cost-benefit analysis into data cleaning. |
Xiaoou Ding; Hongbin Su; Zekai Qian; Wenxuan Cui; Siying Chen; Zheng Liang; Chen Wang; Hongzhi Wang; |
| 365 | Training Data Distribution Estimation for Optimized Pre-training Data Management Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a new approach, data distribution estimation, which enables the automatic estimation of pretraining data distributions by analyzing the generated outputs of LLMs. |
Hao Liang; Keshi Zhao; Yajie Yang; Bin Cui; Zenan Zhou; Wentao Zhang; |
| 366 | Beyond Bandwidth Doubling: Embrace Bit-Flips and Unlock Processing-in-NAND Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Much of the power (and thus heat) within a NAND chip results from transferring data at a high rate, another symptom of a compute-centric style of processing. Therefore, we argue for data-centric Processing-in-NAND (PiN). |
Maximilian Berens; Yun-Chih Chen; Jian-Jia Chen; Jens Teubner; |
| 367 | Unify: An Unstructured Data Analytics System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Therefore, a pertinent question arises: how can we automate unstructured data analytics? To address these challenges, this paper introduces Unify, an innovative system leveraging the capabilities of large language models (LLMs) to automatically generate, optimize, and execute query plans for unstructured data analytics, where queries are articulated in natural language. |
Jiayi Wang; Jianhua Feng; |
| 368 | Optimizing Cloud Data Lake Queries By Minimizing The Query Coverage Set Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We begin by formalizing the problem and proposing a straightforward yet robust theoretical framework that clearly outlines the associated trade-offs. |
Grisha Weintraub; Ehud Gudes; Shlomi Dolev; |
| 369 | MetaLoRA: Tensor-Enhanced Adaptive Low-Rank Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, existing approaches mainly address static adaptations, neglecting the potential benefits of task-aware parameter generation in handling diverse task distributions. To address these limitations, this Ph.D. research proposes a LoRA generation approach to model task relationships and introduces MetaLoRA, a novel parameter-efficient adaptation framework incorporating meta-learning principles. |
Maolin Wang; Xiangyu Zhao; Ruocheng Guo; Junhui Wang; |
| 370 | Leveraging LLMs for Diffusion Prediction in Social Networks: A Fused Attention-aware Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the rich information contained in user profiles and post content is neglected, and the diffusion influence between users is challenging to model. To address these limitations, this paper proposes a fused attention-aware diffusion prediction method based on our fine-tuned Diffusion-LLM to improve the accuracy and interpretability of prediction results. |
Wenbo Shang; |
| 371 | Simplifying Data Integration: SLM-Driven Systems for Unified Semantic Queries Across Heterogeneous Databases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel Small Language Model (SLM)-driven system that synergizes advancements in lightweight Retrieval-Augmented Generation (RAG) and semantic-aware data structuring to enable efficient, accurate, and scalable query resolution across diverse data formats. |
Teng Lin; |
| 372 | StreamSC: A Learning-Based Framework for Efficient Subgraph Counting in Stream Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a deep learning-based framework named StreamSC to solve the subgraph counting problem in stream graphs. |
Zhen Xie; Xiang Zhao; |
| 373 | Explore The Disentanglement Mechanism for Deep Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research explores disentanglement mechanisms in deep learning across three core dimensions: model expressivity, optimization paradigms, and interpretability. |
Haiquan Qiu; Quanming Yao; |
| 374 | O(1)-Time Complexity for Fixed Sliding-Window Aggregation Over Out-of-Order Data Streams: (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes two solutions: (1) CMiX for computing the current window, and (2) PWiX for updating the past windows. |
Savong Bou; Toshiyuki Amagasa; Hiroyuki Kitagawa; |
| 375 | Efficient Unsupervised Graph Embedding with Attributed Graph Reduction and Dual-Level Loss: (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an efficient unsupervised graph embedding method named GEARED. |
Ziyang Liu; Chaokun Wang; Hao Feng; Ziyang Chen; |
| 376 | SkipNode: On Alleviating Performance Degradation for Deep Graph Convolutional Networks (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose SkipNode, a plug-and-play module that mitigates degradation in deep GCNs. |
Weigang Lu; Yibing Zhan; Binbin Lin; Ziyu Guan; Liu Liu; Baosheng Yu; Wei Zhao; Yaming Yang; Dacheng Tao; |
| 377 | Natural Language Interfaces for Tabular Data Querying and Visualization: A Survey (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our systematic review serves as a roadmap for developing NLIs in the foundation model era. |
Weixu Zhang; Yifei Wang; Yuanfeng Song; Victor Junqiu Wei; Yuxing Tian; Yiyan Qi; Jonathan H. Chan; Raymond Chi-Wing Wong; Haiqin Yang; |
| 378 | Kairos: Enabling Prompt Monitoring of Information Diffusion Over Temporal Networks (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For analyses of temporal information diffusion, temporal graph traversal platforms have recently been proposed; however, it is still infeasible to handle infinitely evolving temporal data, especially for monitoring applications. In this paper, we propose an incremental approach and its graph processing engine, Kairos, to enable prompt monitoring of temporal information diffusion. |
Haifa Gaza; Jaewook Byun; |
| 379 | CausalFormer: An Interpretable Transformer for Temporal Causal Discovery (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current deep learning-based methods usually analyze the parameters of some components of the trained models, which is an incomplete mapping process from the model parameters to the causality and fails to investigate the other components. To address this, this paper presents an interpretable transformer-based causal discovery model termed CausalFormer, which consists of: 1) the causality-aware transformer which learns the causal representation with the multi-kernel causal convolution under the temporal priority constraint, and 2) the decomposition-based causality detector which identifies causality by interpreting the global structure of the trained transformer with the regression relevance propagation. |
Lingbai Kong; Wengen Li; Hanchen Yang; Yichao Zhang; Jihong Guan; Shuigeng Zhou; |
| 380 | Condensing Pre-Augmented Recommendation Data Via Lightweight Policy Gradient Estimation (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the limitations, we propose a lightweight condensation framework tailored for recommendation (DConRec), focusing on condensing user-item historical interaction sets. |
Jiahao Wu; Wenqi Fan; Jingfan Chen; Shengcai Liu; Qijiong Liu; Rui He; Qing Li; Ke Tang; |
| 381 | Interrelated Dense Pattern Detection in Multilayer Networks (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we introduce INDUEN, a novel algorithm designed to detect interrelated densest subgraphs in multilayer networks by leveraging joint optimization of coupled factorization and local search for an elaborately designed joint density measure. |
Wenjie Feng; Li Wang; Bryan Hooi; See-Kiong Ng; Shenhua Liu; |
| 382 | BGAE: Auto-encoding Multi-view Bipartite Graph Clustering (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Enlightened by the prevalent encoding-decoding in deep learning, this paper rethinks existing paradigms and proposes a novel “auto-encoding” MVBGC framework, named BGAE. |
Liang Li; Yuangang Pan; Jie Liu; Yue Liu; Xinwang Liu; Kenli Li; Ivor W. Tsang; Keqin Li; |
| 383 | A Novel Key Point Based MLCS Algorithm for Big Sequences Mining (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Mining multiple longest common subsequences (MLCS) from a set of sequences of three or more over a finite alphabet $\Sigma$ (a classical NP-hard problem [1]) is an important task … |
Yanni Li; Bing Liu; Tihua Duan; Zhi Wang; Hui Li; Jiangtao Cui; |
| 384 | MC2LS: Towards Efficient Collective Location Selection in Competition: (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a $(1-\frac{1}{e})$-approximate greedy solution to MC2LS, and empirical studies demonstrate the superiority of our proposed solution over the state-of-the-art techniques. |
Meng Wang; Mengfei Zhao; Hui Li; Jiangtao Cui; Bo Yang; Tao Xue; |
| 385 | FELight: Fairness-Aware Traffic Signal Control Via Sample-Efficient Reinforcement Learning (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, existing methods overlook the challenge of sample efficiency, especially when dealing with diversity-limited traffic data. Therefore, we propose a Fairness-aware and sample-Efficient traffic signal control method called FELight. |
Xinqi Du; Ziyue Li; Cheng Long; Yongheng Xing; Philip S. Yu; Hechang Chen; |
| 386 | BiTDB: Constructing A Built-in TEE Secure Database for Embedded Systems (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose BiTDB, a built-in Trusted Execution Environment (TEE) database for embedded systems, to realize higher system availability while ensuring data confidentiality. |
Chengyan Ma; Di Lu; Chaoyue Lv; Ning Xi; Xiaohong Jiang; Yulong Shen; Jianfeng Ma; |
| 387 | Generalized Measure-Biased Sampling and Priority Sampling: (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, weighted sampling must create a sample for each measure column, which leads to expensive storage cost for any table with dozens of columns. To address this issue, we generalize both measure-biased sampler and priority sampler, which can compress the samples but still provide fast approximate answers to both distribution query and subset-sum query within a user-specified error bound. |
Zhao Chang; Feifei Li; Yulong Shen; |
| 388 | Characterizing Submanifold Region for Out-of-Distribution Detection: (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a data structure-aware approach to mitigate the sensitivity of distances to the “curse of dimensionality”, where high-dimensional features are mapped to the manifold of ID samples, leveraging the well-known manifold assumption. |
Xuhui Li; Zhen Fang; Yonggang Zhang; Ning Ma; Jiajun Bu; Bo Han; Haishuai Wang; |
| 389 | Learning Prioritized Node-Wise Message Propagation in Graph Neural Networks (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Graphs are ubiquitous in the real world; in graphs, nodes represent entities and edges capture their relationships. Recently, graph neural networks (GNNs) [3]–[6] have been … |
Yao Cheng; Minjie Chen; Caihua Shan; Xiang Li; |
| 390 | Smoothing Outlier Scores Is All You Need to Improve Outlier Detectors (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Existing outlier detectors calculate outlier scores for data objects independently, ignoring the consistency between score similarity and object similarity. As a result, these … |
Jiawei Yang; Susanto Rahardja; Pasi Fränti; |
| 391 | HAQJSK: Hierarchical-Aligned Quantum Jensen-Shannon Kernels for Graph Classification (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a family of Hierarchical Aligned Quantum Jensen-Shannon Kernels (HAQJSK) for un-attributed graphs. |
Lu Bai; Lixin Cui; Yue Wang; Ming Li; Jing Li; Philip S. Yu; Edwin R. Hancock; |
| 392 | AEGK: Aligned Entropic Graph Kernels Through Continuous-Time Quantum Walks: (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a family of Aligned Entropic Graph Kernels (AEGK) for graph classification, based on the Averaged Mixing Matrix (AMM) of Continuous-time Quantum Walks (CTQWs). |
Lu Bai; Lixin Cui; Ming Li; Peng Ren; Yue Wang; Lichi Zhang; Philip S. Yu; Edwin R. Hancock; |
| 393 | Robust and Consistent Anchor Graph Learning for Multi-View Clustering (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Additionally, post-processing is needed to obtain final results after anchor graph construction, which negatively affects clustering performance. In this paper, we propose a Robust and Consistent Anchor Graph Learning method (RCAGL) for multi-view clustering to address these challenges. |
Suyuan Liu; Qing Liao; Siwei Wang; Xinwang Liu; En Zhu; |
| 394 | Hyperedge Graph Contrastive Learning [Extended Abstract] Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although various graph contrastive learning (GCL) techniques have been employed to generate augmented views and maximize their mutual information, current solutions only consider the pairwise relationships based on edges, neglecting the high-order information that can help generate more informative augmented views and make better contrast. To fill in this gap, we propose to leverage hyperedge to facilitate GCL, as it connects two or more nodes and can model high-order relationships among multiple nodes. |
Junfeng Zhang; Weixin Zeng; Jiuyang Tang; Xiang Zhao; |
| 395 | Online Feature Selection with Varying Feature Spaces (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unfortunately, data streams generated in many real scenarios commonly exhibit arbitrarily incomplete feature spaces and scarce labels, making existing approaches unsuitable for real applications. To fill these gaps, this study proposes a new problem called Online Feature Selection with Varying Feature Spaces (OFSVF). |
Shengda Zhuo; Jinjie Qiu; Changdong Wang; Shuqiang Huang; |
| 396 | Efficient Projection-Based Algorithms for Tip Decomposition on Dynamic Bipartite Graphs (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a pioneering projection-based algorithm, coupled with advanced incremental maintenance strategies for edge modifications, tailored specifically for dynamic graphs. |
Tongfeng Weng; Yumeng Liu; Mo Sha; Xinyuan Chen; Xu Zhou; Kenli Li; Kian-Lee Tan; |
| 397 | Efficient Skyline Frequent-Utility Itemset Mining Algorithm on Massive Data (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This gap arises from the theoretical complexity of integrating these dimensions and the practical difficulty of setting appropriate thresholds for both. To overcome this limitation, we introduce Skyline frequent-utility itemset mining (SFUIM), a method that examines frequent and high-utility itemsets without requiring predefined thresholds. |
Jingxuan He; Xixian Han; Xiaolong Wan; Jinbao Wang; |