Time series forecasting is ubiquitous across scientific and industrial domains. Powered by recurrent, convolutional, and self-attention mechanisms, deep learning exhibits high efficacy in time series forecasting. However, existing forecasting methods suffer from several limitations. For example, recurrent neural networks are limited by the vanishing-gradient problem, convolutional neural networks require more parameters, and self-attention is weak at capturing local dependencies. Moreover, they all rely on an assumption of time invariance or stationarity, since they share parameters by repeating a set of fixed architectures with fixed parameters over time or space. To address the above issues, this paper proposes a novel time-variant framework named Self-Attention-based Time-Variant Neural Networks (SATVNN). Its time-variant structure captures dynamic changes of time series on different scales more accurately, and its self-attention blocks seek to better capture the dynamic changes of recent data with the help of a Gaussian distribution, a Laplace distribution, and a novel Cauchy distribution, respectively. SATVNN outperforms classical time series prediction methods and state-of-the-art deep learning models on many widely used real-world datasets.
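The abstract does not specify how the Gaussian, Laplace, and Cauchy distributions enter the attention computation; one plausible reading is that each distribution supplies a distance-dependent prior added to the attention logits, so that nearby time steps are favored but each kernel decays differently. The sketch below is a minimal illustration under that assumption; the function names and the way the bias enters the scores are hypothetical, not SATVNN's actual design.

```python
import numpy as np

def locality_bias(length, kind="cauchy", scale=2.0):
    """Log-density-style bias over pairwise distances |i - j|.

    Illustrative assumption: the distribution prior is added to the
    attention logits before the softmax. The Cauchy kernel has the
    heaviest tails, so distant steps are penalized the least.
    """
    d = np.abs(np.arange(length)[:, None] - np.arange(length)[None, :])
    if kind == "gaussian":
        return -(d ** 2) / (2 * scale ** 2)
    if kind == "laplace":
        return -d / scale
    if kind == "cauchy":
        return -np.log1p((d / scale) ** 2)
    raise ValueError(kind)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def biased_attention(q, k, v, kind="cauchy"):
    # standard scaled dot-product attention plus the locality prior
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores += locality_bias(len(q), kind)
    return softmax(scores) @ v
```

Comparing the three kernels at a fixed distance makes the ordering of tail weights explicit, which is presumably why a heavy-tailed Cauchy prior helps when relevant context occasionally lies far in the past.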
Time series analysis is the process of exploring and analyzing past trends to predict future events over any given time interval. Powered by recent advances in convolutional, recurrent, and self-attention mechanisms, many deep learning methods have facilitated the investigation of time series forecasting. However, despite their effectiveness, it is doubtful that future trends can be accurately predicted, due to intricate temporal irregularities. Moreover, time series frequently exhibit features at various time scales, but existing approaches do not adequately take this into account. To address the above issues, this paper offers new Multi-scale Adaptive attention-based Time-Variant neural Networks (MATVN) for multi-step ahead time series forecasting. Specifically, we contribute a novel framework capable of capturing the irregular dynamic behaviors observed in temporal data over time with a Time-Variant architecture. Furthermore, a newly proposed Multi-scale Multi-head Adaptive attention module is introduced into the Time-Variant architecture to encode temporal dependencies over various pre-defined scale-aware ranges. Additionally, we endow the proposed module with more flexible individual representation learning and scale-aware attention scopes for each token to better capture multi-scale temporal patterns, by designing a new Adaptive Window-aware Mask strategy. Experimental results on a wide range of application scenarios, including climatology and energy consumption, demonstrate that the proposed model outperforms many recent state-of-the-art methods in multi-step time series forecasting tasks.
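The scale-aware attention scopes described above can be pictured as boolean masks that restrict each query token to a window of neighbors, with one window size per head or scale. The sketch below shows only this fixed-window masking idea; MATVN's adaptive, learned per-token windows go beyond it, and the function names here are illustrative assumptions.

```python
import numpy as np

def scale_aware_mask(length, window):
    """Boolean mask letting each position attend within +/- `window` steps.

    A fixed scale-aware scope; the paper's Adaptive Window-aware Mask
    additionally adapts the scope per token, which is not shown here.
    """
    idx = np.arange(length)
    return np.abs(idx[:, None] - idx[None, :]) <= window

def multi_scale_masks(length, windows=(2, 4, 8)):
    # one mask per head/scale, mimicking pre-defined scale-aware ranges
    return [scale_aware_mask(length, w) for w in windows]
```

Applying each mask to a different attention head lets small windows specialize in fine-grained local patterns while large windows capture slow trends, which is the intuition behind encoding dependencies from several pre-defined ranges at once.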
Time series prediction with deep learning methods, especially the Long Short-Term Memory neural network (LSTM), has achieved significant success in recent years. Although LSTM can help capture long-term dependencies, its ability to pay different degrees of attention to sub-window features within multiple time steps is insufficient. To address this issue, an evolutionary attention-based LSTM trained with competitive random search is proposed for multivariate time series prediction. By transferring shared parameters, an evolutionary attention learning approach is introduced to the LSTM. Thus, as in biological evolution, the pattern for importance-based attention sampling can be confirmed during temporal relationship mining. To avoid being trapped in local optima, as traditional gradient-based methods often are, a competitive random search method inspired by evolutionary computation is proposed, which can effectively configure the parameters in the attention layer. Experimental results illustrate that the proposed model achieves competitive prediction performance compared with other baseline methods.
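The core idea of a gradient-free, competition-based search over attention-layer parameters can be sketched in a few lines: candidates perturb the current best parameter vector and compete, with the winner retained. This is a generic hill-climbing reading of "competitive random search"; the paper's actual update rule, population handling, and parameter transfer are not reproduced here, and `fitness` stands in for validation performance of the attention layer.

```python
import random

def competitive_random_search(fitness, dim, pop=8, iters=50, sigma=0.1, seed=0):
    """Gradient-free search over a `dim`-dimensional parameter vector.

    Minimal sketch: each round, `pop` candidates perturb the incumbent
    with Gaussian noise and the highest-fitness candidate wins. This is
    an assumption-level illustration, not the paper's exact algorithm.
    """
    rng = random.Random(seed)
    best = [rng.uniform(-1, 1) for _ in range(dim)]
    best_f = fitness(best)
    for _ in range(iters):
        for _ in range(pop):
            cand = [b + rng.gauss(0, sigma) for b in best]
            f = fitness(cand)
            if f > best_f:  # competition: winner replaces the incumbent
                best, best_f = cand, f
    return best, best_f
```

Because only fitness evaluations are needed, such a search cannot diverge from exploding gradients and can escape plateaus where gradient-based training of an attention layer stalls, at the cost of more function evaluations.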
Graph neural networks (GNNs) have shown remarkable performance on homophilic graph data while being far less impressive when handling non-homophilic graph data, due to the inherent low-pass filtering property of GNNs. In general, since real-world graphs are often complex mixtures of diverse subgraph patterns, learning a universal spectral filter on the graph from a global perspective, as in most current works, may still struggle to adapt to variations in local patterns. On the basis of a theoretical analysis of local patterns, we rethink existing spectral filtering methods and propose Node-oriented spectral Filtering for Graph Neural Networks (NFGNN). By estimating a node-oriented spectral filter for each node, NFGNN gains the capability of precise local node positioning via the generalized translation operator, thus adaptively discriminating variations in local homophily patterns. Meanwhile, re-parameterization brings a good trade-off between global consistency and local sensitivity in learning the node-oriented spectral filters. Furthermore, we theoretically analyze the localization property of NFGNN, demonstrating that the signal after adaptive filtering is still positioned around the corresponding node. Extensive experimental results demonstrate that the proposed NFGNN achieves more favorable performance.
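The contrast between one universal spectral filter and a per-node filter can be made concrete with polynomial filters of the normalized Laplacian: giving each node its own polynomial coefficients lets homophilic and heterophilic regions receive different frequency responses. The sketch below assumes a connected graph with a 1-D signal and is only an illustration of the node-oriented idea, not NFGNN's parameterization or its generalized translation operator.

```python
import numpy as np

def node_oriented_filter(adj, signal, coeffs):
    """Apply a per-node polynomial spectral filter to a graph signal.

    coeffs[v] holds the polynomial coefficients used at node v, so each
    node mixes the Laplacian-power responses with its own weights
    (assumed connected graph; 1-D signal for simplicity).
    """
    deg = adj.sum(1)
    d = deg ** -0.5
    lap = np.eye(len(adj)) - d[:, None] * adj * d[None, :]  # normalized Laplacian
    powers = [signal]
    for _ in range(coeffs.shape[1] - 1):
        powers.append(lap @ powers[-1])         # L^k x, higher frequencies
    stacked = np.stack(powers, axis=1)          # shape (n, K)
    return np.einsum("nk,nk->n", coeffs, stacked)
```

With all rows of `coeffs` equal this collapses to an ordinary global polynomial filter; letting the rows differ is exactly the extra freedom a node-oriented filter buys.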
Accurate forecasting of time series mitigates the uncertainty of future outlooks and greatly helps reduce errors in decision-making. Despite years of research, there are still challenges to accurate time series forecasting, including the difficulty of dynamic modeling, the problem of capturing short-term correlations, and the conundrum of long-term forecasting. This paper offers an Adversarial Truncated Cauchy Self-Attentive Time-Variant Neural Network (ASATVN) for multi-step ahead time series forecasting. Specifically, the proposed model builds on Generative Adversarial Networks, in which the generator is composed of a novel time-variant model. The time-variant model learns dynamic time-series changes with its time-variant architecture and employs a newly proposed Truncated Cauchy Self-Attention block to better capture local sequential dependencies. For the discriminator, two self-attentive discriminators are presented to regularize predictions for fidelity and continuity, which is beneficial when predicting sequences over longer time horizons. Our proposed ASATVN model outperforms state-of-the-art predictive models on eleven real-world benchmark datasets, demonstrating its effectiveness.
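One way to read "Truncated Cauchy" is as a Cauchy locality prior whose weight is cut to zero beyond a fixed number of steps and renormalized, combining heavy-tailed weighting inside the window with a hard local scope. The sketch below shows only that weighting scheme under this assumption; how ASATVN folds it into its self-attention block is not specified by the abstract.

```python
import math

def truncated_cauchy_weights(length, scale=2.0, trunc=4):
    """Normalized Cauchy weights over distances 0..length-1, truncated.

    Distances beyond `trunc` get zero weight; the remainder is
    renormalized to sum to one. An illustrative stand-in for the
    paper's Truncated Cauchy Self-Attention prior, not its definition.
    """
    w = []
    for d in range(length):
        if d > trunc:
            w.append(0.0)
        else:
            # standard Cauchy density evaluated at distance d
            w.append(1.0 / (math.pi * scale * (1.0 + (d / scale) ** 2)))
    total = sum(w)
    return [x / total for x in w]
```

Compared with an untruncated kernel, the hard cutoff guarantees that attention mass cannot leak to irrelevant distant steps, while the Cauchy shape keeps the in-window weights from decaying too sharply.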
•A time-variant network learns various dynamics across multiple time steps.
•A new self-attention block is more sensitive to the local context of time series.
•Two discriminators regularize the predictor to offer realistic and continuous forecasts.
Although path-based and embedding-based models with knowledge graphs (KGs) achieve better recommendation performance than other deep learning based methods, such improvement is limited because they fail to model users' dynamic interests. To address this issue, we explore a principled model that provides a semantic understanding of each item in a user's historical interest sequence in KGs. Specifically, we propose a multi-granularity dynamic interest sequence learning method, based on knowledge-enhanced path mining and interest fluctuation signal discovery, to obtain semantic-enhanced paths. Furthermore, the paths are embedded by SEP2Vec and merged through the proposed entropy-aware pooling layer to obtain the user preference representation, which is then used to learn the dynamic user interest sequence. Experimental results on two public datasets for movie and music recommendation, and two industrial datasets for personalized local service recommendation in the Alipay App, illustrate that the proposed model achieves significantly better prediction performance than other known baselines.
While large enterprises have benefited from their global supply chains in recent years, it is not easy for Small and Medium-sized Enterprises (SMEs) to find supply chain partners. Treating this as a supply chain mining problem, some deep learning methods, especially knowledge graph (KG) enhanced ones, can achieve workable performance by utilizing explicit structure information from the KG. However, such improvement is limited when facing the challenges of scalability, complexity, and noisiness in large-scale KGs. To address these issues, we propose a novel Meta-tag Supported Connectivity representation Learning framework, MSCL. Specifically, a Meta-tag Collaborative Filtering (MCF) method is proposed to highlight the representative schema among the huge number of paths connecting two enterprises in a large-scale KG. Furthermore, DPPs-induced Hierarchical Path Sampling (DHPS), a novel sampling framework, is developed to capture latent connectivity patterns in the KG more effectively. Moreover, the path-wise knowledge representations and the underlying information inherent in pairwise enterprises are aggregated by a connectivity representation learning (CRL) approach for SME supply chain mining. Experimental results from two real-world industries illustrate that the proposed model achieves competitive performance compared with other existing baselines.
•A novel framework is proposed to integrate KGE into the DLRS in a recommender system.
•A CIS layer is designed to address the challenge of data sparsity and achieve information sharing.
•Results show that our method can work well on industrial-level online recommendation tasks.
Although existing knowledge graph embedding (KGE) based methods can achieve better recommendation performance than deep learning based ones, such improvement is limited because they do not capture the shared information between user-item interactions and the item-item relations encoded in the knowledge graph (KG) by fully leveraging both implicit and explicit relationships. To address this issue, in this paper we propose a principled Deep Knowledge-Enhanced Network (DKEN) framework based on deep learning and KGE to model the semantics of entities and relations encoded in the KG. In particular, DKEN utilizes deep neural networks (DNNs) to learn higher-order feature interactions and naturally ensembles KGE features with DNN features in an end-to-end learning process to exploit both implicit interaction and explicit semantic features. Furthermore, a cross information sharing (CIS) layer is designed to facilitate information sharing between items and entities, and two aggregators are developed to improve the performance of the model. Extensive experiments on several public datasets, as well as online A/B tests in an industrial recommendation scenario at the Ant Financial Service Group, demonstrate that DKEN achieves remarkably better performance than several state-of-the-art baselines.
An actual recommendation system generally involves a variety of heterogeneous interactive relationships, such as the typical user-user (U2U), item-item (I2I), and user-item (U2I) interactions. With the application of graph neural networks (GNNs) to embedding these interactive relations, recommendation technology has made gratifying progress in recent years, benefiting greatly from their powerful relation-modeling ability. However, most existing GNN-based methods fail to collaboratively explore these heterogeneous interactive relationships, including the internal correlations among multiple relationships and the intrinsic associations behind different relationships. As a consequence, the user's personalized preference for the items to be recommended is not well captured. In this paper, we propose a Sylvester equation induced Collaborative Representation Learning framework (S-CRL) for recommendation systems that utilizes the heterogeneous multiple interactive relationships. In particular, we define a novel Sylvester equation that tactfully associates the multiple heterogeneous relations. From the perspective of rating propagation, this Sylvester equation is shown theoretically to be the optimal solution of a local-structure-sensitive rating propagation function. Additionally, to seek more expressive embeddings of users and items, layer-wise attention is introduced to aggregate multi-hop information from the U2U and I2I graphs, respectively, so as to promote aggregation with the corresponding embeddings from the U2I interaction graph. Extensive experiments on three real-world datasets verify that our model achieves more favorable performance than currently representative methods.
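The classical Sylvester equation AX + XB = C naturally couples three matrices of the kinds the abstract names: reading A as a user-user relation, B as an item-item relation, and C as user-item ratings, the solution X mixes all three. The sketch below solves a small Sylvester equation by vectorization as an illustration of that coupling; S-CRL's actual operator and propagation function differ, and the matrix roles here are assumptions.

```python
import numpy as np

def solve_sylvester(a, b, c):
    """Solve A X + X B = C via the Kronecker (vectorized) form.

    Identity used: vec(AX + XB) = (I_m ⊗ A + B^T ⊗ I_n) vec(X),
    with column-stacking vec (Fortran order). Fine for small
    matrices; larger problems would use a Bartels-Stewart solver.
    """
    n, m = c.shape
    k = np.kron(np.eye(m), a) + np.kron(b.T, np.eye(n))
    x = np.linalg.solve(k, c.flatten(order="F"))
    return x.reshape((n, m), order="F")
```

A unique solution exists whenever A and -B share no eigenvalue, which is the standard solvability condition for Sylvester equations and would need to hold for any relation matrices plugged in here.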