In pattern classification, polynomial classifiers are well-studied methods, as they are capable of generating complex decision surfaces. Unfortunately, the use of multivariate polynomials is limited to kernels as in support vector machines, because polynomials quickly become impractical for high-dimensional problems. In this paper, we effectively overcome the curse of dimensionality by employing the tensor train (TT) format to represent a polynomial classifier. Based on the structure of TTs, two learning algorithms are proposed, which involve solving different optimization problems of low computational complexity. Furthermore, we show how both regularization to prevent overfitting and parallelization, which enables the use of large training sets, are incorporated into these methods. The efficiency and efficacy of our tensor-based polynomial classifier are then demonstrated on the two popular data sets U.S. Postal Service (USPS) and Modified NIST (MNIST).
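As a concrete illustration of why the TT format sidesteps the curse of dimensionality here, the following minimal Python sketch (our own toy setup, not the paper's learning algorithms; the per-feature monomial map and all sizes are assumptions) evaluates a polynomial classifier whose weight tensor is stored as TT cores, touching only O(d n r^2) numbers instead of the n^d entries of the full weight tensor.

```python
# Minimal sketch: evaluating a polynomial classifier whose weight tensor
# is kept in tensor train (TT) format. Each of the d features is lifted to
# monomials [1, x_k, ..., x_k^(n-1)]; the TT cores replace the n^d-entry tensor.
import numpy as np

def tt_poly_eval(cores, x):
    """Evaluate f(x) = <W, z(x_1) o ... o z(x_d)> with W given by TT cores.

    cores : list of d arrays, core k has shape (r_{k-1}, n, r_k), r_0 = r_d = 1.
    x     : 1-D array with d entries.
    """
    v = np.ones(1)                            # running boundary vector
    for k, G in enumerate(cores):
        n = G.shape[1]
        z = x[k] ** np.arange(n)              # monomial features [1, x_k, x_k^2, ...]
        v = np.einsum('a,anb,n->b', v, G, z)  # contract one TT core at a time
    return v.item()

# Usage: a random rank-2 TT over d=4 features, degree-2 monomials per feature.
rng = np.random.default_rng(0)
d, n, r = 4, 3, 2
shapes = [(1, n, r)] + [(r, n, r)] * (d - 2) + [(r, n, 1)]
cores = [rng.standard_normal(s) for s in shapes]
print(tt_poly_eval(cores, rng.standard_normal(d)))
```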
This article introduces the Tensor Network B-spline (TNBS) model for the regularized identification of nonlinear systems using a nonlinear autoregressive exogenous (NARX) approach. Tensor network theory is used to alleviate the curse of dimensionality of multivariate B-splines by representing the high-dimensional weight tensor as a low-rank approximation. An iterative algorithm based on the alternating linear scheme is developed to directly estimate the low-rank tensor network approximation, removing the need to ever explicitly construct the exponentially large weight tensor. This reduces the computational and storage complexity significantly, allowing the identification of NARX systems with a large number of inputs and lags. The proposed algorithm is numerically stable, robust to noise, guaranteed to converge monotonically, and allows the straightforward incorporation of regularization. The TNBS-NARX model is validated through the identification of the cascaded water tank benchmark nonlinear system, on which it achieves state-of-the-art performance while identifying a 16-dimensional B-spline surface in 4 s on a standard desktop computer. An open-source MATLAB implementation is available on GitHub.
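For context, the multivariate B-spline basis that TNBS builds on is a product of univariate bases like the one sketched below, and the weight tensor attached to that basis is what the tensor network keeps in low-rank form. This is a minimal sketch of the standard Cox-de Boor recursion (our own illustration with an assumed knot vector, not the paper's MATLAB implementation).

```python
# Minimal sketch: a univariate B-spline basis via the Cox-de Boor recursion.
# TNBS models a p-dimensional surface with products of such bases, storing the
# n^p weight tensor in low-rank tensor train form instead of explicitly.
import numpy as np

def bspline_basis(t, knots, degree):
    """Values of all degree-`degree` B-spline basis functions at scalar t."""
    # Degree-0 basis: indicator functions of the knot spans.
    B = np.array([(knots[i] <= t < knots[i + 1])
                  for i in range(len(knots) - 1)], dtype=float)
    for p in range(1, degree + 1):            # Cox-de Boor recursion
        B_new = np.zeros(len(knots) - p - 1)
        for i in range(len(B_new)):
            left = right = 0.0
            if knots[i + p] > knots[i]:
                left = (t - knots[i]) / (knots[i + p] - knots[i]) * B[i]
            if knots[i + p + 1] > knots[i + 1]:
                right = (knots[i + p + 1] - t) / (knots[i + p + 1] - knots[i + 1]) * B[i + 1]
            B_new[i] = left + right
        B = B_new
    return B

knots = np.array([0, 0, 0, 0.5, 1, 1, 1])    # clamped knot vector
print(bspline_basis(0.3, knots, degree=2))   # 4 quadratic basis values, summing to 1
```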
Specifying a prior distribution is an essential part of solving Bayesian inverse problems. The prior encodes a belief about the nature of the solution, and this regularizes the problem. In this article we completely characterize a Gaussian prior that encodes the belief that the solution is a structured tensor. We first define the notion of (A,b)-constrained tensors and show that they describe a large variety of different structures, such as Hankel, circulant, triangular, and symmetric. We then completely characterize the Gaussian probability distribution of such tensors by specifying its mean vector and covariance matrix. Furthermore, explicit expressions are proved for the covariance matrix of tensors whose entries are invariant under a permutation. These results unlock a whole new class of priors for Bayesian inverse problems. We illustrate how new kernel functions can be designed and efficiently computed, and apply our results to two particular Bayesian inverse problems: completing a Hankel matrix from a few noisy measurements and learning an image classifier of handwritten digits. The effectiveness of the proposed priors is demonstrated for both problems. All applications have been implemented as reactive Pluto notebooks in Julia.
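To make the idea of a structure-encoding Gaussian prior concrete, here is a minimal sketch for the Hankel case (our own construction in Python rather than the article's Julia notebooks; the sizes and the 0/1 selection matrix S are assumptions): a linear map from an i.i.d. Gaussian generator to vec(H) yields a zero-mean prior with covariance S S^T whose samples are exactly Hankel.

```python
# Illustrative sketch: a Gaussian prior whose samples are exact Hankel matrices.
# Drawing the anti-diagonal generator g ~ N(0, I) and mapping it linearly to
# vec(H) gives mean zero and covariance S @ S.T, with S a 0/1 selection matrix.
import numpy as np
from scipy.linalg import hankel

m, n = 4, 3                       # Hankel matrices H with H[i, j] = g[i + j]
S = np.zeros((m * n, m + n - 1))  # vec(H) = S @ g  (column-major vec)
for j in range(n):
    for i in range(m):
        S[j * m + i, i + j] = 1.0
cov = S @ S.T                     # covariance matrix of the structured prior

rng = np.random.default_rng(1)
g = rng.standard_normal(m + n - 1)
H = hankel(g[:m], g[m - 1:])      # a sample from the prior: exactly Hankel
assert np.allclose(H.flatten(order='F'), S @ g)
print(cov.shape)                  # (12, 12); the covariance has rank m + n - 1 = 6
```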
This article extends the tensor network Kalman filter to matrix outputs, with an application in the recursive identification of discrete-time nonlinear multiple-input multiple-output (MIMO) Volterra systems. This extension completely supersedes previous work, where only $l$ scalar outputs were considered. The Kalman tensor equations are modified to accommodate matrix outputs, and their implementation using tensor networks is discussed. The MIMO Volterra system identification application requires the conversion of the output model matrix, which has a row-wise Kronecker product structure, into its corresponding tensor network, for which we propose an efficient algorithm. Numerical experiments demonstrate both the efficacy of the proposed matrix conversion algorithm and the improved convergence of the Volterra kernel estimates when using matrix outputs.
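The row-wise Kronecker product structure mentioned above can be shown in a few lines; the sketch below (our own minimal example, not the proposed tensor network conversion algorithm) constructs the dense object that the algorithm converts into tensor network form.

```python
# Minimal sketch: the row-wise Kronecker (transposed Khatri-Rao) product,
# where row i of the output matrix is kron(row i of A, row i of B).
import numpy as np

def rowwise_kron(A, B):
    """Row-wise Kronecker product of A (m x p) and B (m x q), giving m x (p*q)."""
    m = A.shape[0]
    assert B.shape[0] == m
    # einsum builds all pairwise products per row, then flattens each row
    return np.einsum('ip,iq->ipq', A, B).reshape(m, -1)

rng = np.random.default_rng(2)
A, B = rng.standard_normal((3, 2)), rng.standard_normal((3, 4))
C = rowwise_kron(A, B)                        # shape (3, 8)
assert np.allclose(C[0], np.kron(A[0], B[0]))
```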
This article introduces two Tensor Network-based iterative algorithms for the identification of high-order discrete-time nonlinear multiple-input multiple-output (MIMO) Volterra systems. The system identification problem is rewritten in terms of a Volterra tensor, which is never explicitly constructed, thus avoiding the curse of dimensionality. It is shown how each iteration of the two identification algorithms involves solving a linear system of low computational complexity. The proposed algorithms are guaranteed to monotonically converge and numerical stability is ensured through the use of orthogonal matrix factorizations. The performance and accuracy of the two identification algorithms are illustrated by numerical experiments, where accurate degree-10 MIMO Volterra models are identified in about 1 s using Matlab on a 3.3 GHz quad-core desktop computer with 16 GB RAM.
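For orientation, the sketch below shows the dense object the algorithms avoid: the output of a degree-d Volterra model as an inner product between kernel vectors and Kronecker powers of the lagged input (a toy SISO example under our own conventions, not the proposed Tensor Network method; note how the kernel sizes already grow exponentially in the degree).

```python
# Illustrative sketch: a degree-d Volterra model output computed naively, i.e.
# y = sum_d <h_d, u^(kron d)>. The TN algorithms estimate the kernels h_d
# without ever forming these exponentially large vectors.
import numpy as np

def volterra_output(u_lagged, kernels):
    """y = sum_d <h_d, kron(u, ..., u)> for a vector of (lagged) inputs."""
    y, u_pow = 0.0, np.ones(1)
    for h in kernels:                  # kernels[d-1] has length len(u_lagged)**d
        u_pow = np.kron(u_pow, u_lagged)
        y += h @ u_pow
    return y

rng = np.random.default_rng(3)
u = rng.standard_normal(4)             # current input and 3 lags
kernels = [rng.standard_normal(4 ** d) for d in (1, 2, 3)]
print(volterra_output(u, kernels))     # kernel sizes grow as 4, 16, 64, ...
```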
We generalize the matrix Kronecker product to tensors and propose the tensor Kronecker product singular value decomposition (TKPSVD) that decomposes a real $k$-way tensor $\mathcal{A}$ into a linear combination of tensor Kronecker products with an arbitrary number of $d$ factors. We show how to construct $\mathcal{A} = \sum_{j=1}^{R} \sigma_j\, \mathcal{A}_j^{(d)} \otimes \cdots \otimes \mathcal{A}_j^{(1)}$, where each factor $\mathcal{A}_j^{(i)}$ is also a $k$-way tensor, thus including matrices ($k=2$) as a special case. This problem is readily solved by reshaping and permuting $\mathcal{A}$ into a $d$-way tensor, followed by an orthogonal polyadic decomposition. Moreover, we introduce the new notion of general symmetric tensors (encompassing symmetric, persymmetric, centrosymmetric, Toeplitz and Hankel tensors, etc.) and prove that when $\mathcal{A}$ is structured, its factors $\mathcal{A}_j^{(1)}, \ldots, \mathcal{A}_j^{(d)}$ will also inherit this structure.
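For the matrix special case ($k = 2$, $d = 2$) the construction reduces to the classical Kronecker product SVD; the minimal Python sketch below (our own conventions; the index ordering and sizes are assumptions) follows the abstract's recipe literally: reshape and permute $\mathcal{A}$, then take an (orthogonal) SVD.

```python
# Sketch of the k = 2, d = 2 special case: the classical Kronecker product SVD
# of a matrix, computed by rearranging A so each row holds one block, then SVD.
import numpy as np

def kpsvd(A, m1, n1, m2, n2):
    """Decompose A ((m1*m2) x (n1*n2)) as sum_j s[j] * kron(B[j], C[j])."""
    # Rearrange A: row (i1, j1) of R holds the (m2 x n2) block A[i1*m2:, j1*n2:].
    R = A.reshape(m1, m2, n1, n2).transpose(0, 2, 1, 3).reshape(m1 * n1, m2 * n2)
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    B = [U[:, j].reshape(m1, n1) for j in range(len(s))]
    C = [Vt[j, :].reshape(m2, n2) for j in range(len(s))]
    return s, B, C

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 8))                  # m1*m2 = 2*3, n1*n2 = 2*4
s, B, C = kpsvd(A, 2, 2, 3, 4)
A_rec = sum(sj * np.kron(Bj, Cj) for sj, Bj, Cj in zip(s, B, C))
assert np.allclose(A, A_rec)                     # exact reconstruction
```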
We present an iterative algorithm, called the symmetric tensor eigen-rank-one iterative decomposition (STEROID), for decomposing a symmetric tensor into a real linear combination of symmetric rank-1 unit-norm outer factors using only eigendecompositions and least-squares fitting. Originally designed for symmetric tensors whose order is a power of two, STEROID is shown to be applicable to any order through an innovative tensor embedding technique. Numerical examples demonstrate the high efficiency and accuracy of the proposed scheme, even for large-scale problems. Furthermore, we show how STEROID readily solves a problem in nonlinear block-structured system identification and nonlinear state-space identification.
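STEROID's base ingredient is easiest to see at order two, where a symmetric tensor is a symmetric matrix and a single eigendecomposition already yields the desired real combination of symmetric rank-1 unit-norm terms; the sketch below illustrates only this base case (our own toy example, not the full algorithm or its embedding technique).

```python
# Minimal sketch of the order-2 base case: a symmetric matrix is a real linear
# combination of symmetric rank-1 unit-norm terms v_i v_i^T, obtained from one
# eigendecomposition. STEROID lifts this idea to higher orders.
import numpy as np

rng = np.random.default_rng(5)
X = rng.standard_normal((5, 5))
S = (X + X.T) / 2                      # symmetric "tensor" of order 2
lam, V = np.linalg.eigh(S)             # eigendecomposition, columns of V orthonormal
S_rec = sum(l * np.outer(v, v) for l, v in zip(lam, V.T))
assert np.allclose(S, S_rec)           # S = sum_i lambda_i * v_i (outer) v_i
```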
In recent years, tensors have seen increasingly widespread application in fields involving data analytics and numerical computation. Due to the explosive growth of data, low-rank tensor decompositions have become a powerful tool for taming the notorious curse of dimensionality. The main forms of tensor decomposition include the CP decomposition, the Tucker decomposition, and the tensor train (TT) decomposition. Existing TT decomposition algorithms, such as the TT-SVD and the randomized TT-SVD, have been successful, but none of them can decompose large-scale sparse tensors both accurately and efficiently. Building on previous research, this paper proposes a new quasi-optimal fast TT decomposition algorithm for large-scale sparse tensors; its correctness is proven and an upper bound on its computational complexity is derived. On demand, it can also efficiently produce a sparse TT with no numerical error at the cost of slightly larger TT-ranks. Numerical experiments verify that the proposed algorithm decomposes sparse tensors much faster than the TT-SVD and has advantages in speed, precision, and versatility over the randomized TT-SVD and TT-cross. Moreover, it enables large-scale sparse-matrix TT decomposition that was previously unachievable, allowing tensor decomposition based algorithms to be applied in more scenarios.
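For reference, the dense TT-SVD that serves as the baseline here fits in a few lines; the sketch below (our own compact implementation with an assumed truncation rule, not the paper's sparse algorithm) sweeps left to right, peeling off one TT core per truncated SVD.

```python
# Compact sketch of the (dense) TT-SVD: repeatedly reshape the remainder and
# truncate its SVD, extracting one TT core per step.
import numpy as np

def tt_svd(T, eps=1e-10):
    """Decompose a dense d-way array T into TT cores of shape (r_prev, n_k, r_next)."""
    dims = T.shape
    cores, r_prev = [], 1
    M = T.reshape(r_prev * dims[0], -1)
    for k in range(len(dims) - 1):
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        r = max(1, int(np.sum(s > eps * s[0])))       # drop tiny singular values
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))
        M = (s[:r, None] * Vt[:r]).reshape(r * dims[k + 1], -1)
        r_prev = r
    cores.append(M.reshape(r_prev, dims[-1], 1))
    return cores

T = np.einsum('i,j,k->ijk', *(np.arange(1, n + 1.0) for n in (3, 4, 5)))  # rank-1 test
cores = tt_svd(T)
print([c.shape for c in cores])        # TT-ranks collapse to 1 for a rank-1 tensor
```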
This article introduces a Tensor Network Kalman filter, which can estimate state vectors that are exponentially large without ever having to explicitly construct them. The Tensor Network Kalman filter also easily accommodates the case where several different state vectors need to be estimated simultaneously. The key lies in rewriting the standard Kalman equations as tensor equations and then implementing them using Tensor Networks, which effectively reduces the exponential storage cost and computational complexity to a linear one. We showcase the power of the proposed framework through an application in recursive nonlinear system identification of high-order discrete-time multiple-input multiple-output (MIMO) Volterra systems. The identification problem is transformed into a linear state estimation problem wherein the state vector contains all Volterra kernel coefficients and is estimated using the Tensor Network Kalman filter. The accuracy and robustness of the scheme are demonstrated via numerical experiments, which show that updating the Kalman filter estimate of a state vector of length $10^9$, together with its covariance matrix, takes about 0.007 s on a standard desktop computer in Matlab.
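For contrast, the standard dense measurement update that the Tensor Network Kalman filter re-expresses in tensor form is sketched below (a textbook baseline under our own conventions, not the proposed TN implementation); for a state of length $10^9$ the dense covariance P below could never even be stored, which is precisely what the TN reformulation avoids.

```python
# Sketch of the dense Kalman measurement update for y = C x + noise, cov R.
# The TN filter performs these operations core-by-core in tensor train form.
import numpy as np

def kalman_update(x, P, C, y, R):
    """Standard measurement update: returns the posterior mean and covariance."""
    S = C @ P @ C.T + R                      # innovation covariance
    K = np.linalg.solve(S, C @ P).T          # Kalman gain P C^T S^{-1}
    x_new = x + K @ (y - C @ x)
    P_new = (np.eye(len(x)) - K @ C) @ P
    return x_new, P_new

rng = np.random.default_rng(6)
n, m = 8, 2
x, P = np.zeros(n), np.eye(n)
C = rng.standard_normal((m, n))
x, P = kalman_update(x, P, C, y=rng.standard_normal(m), R=0.1 * np.eye(m))
print(x.shape, P.shape)
```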
Tensor Networks (TNs) have recently been used to speed up kernel machines by constraining the model weights, yielding exponential computational and storage savings. In this paper we prove that the outputs of Canonical Polyadic Decomposition (CPD)- and Tensor Train (TT)-constrained kernel machines recover a Gaussian Process (GP), which we fully characterize, when i.i.d. priors are placed over their parameters. We analyze the convergence of both CPD- and TT-constrained models, and show that, for the same number of model parameters, TT yields models exhibiting more GP behavior than CPD. We empirically observe this behavior in two numerical experiments, in which we respectively analyze convergence to the GP and performance at prediction. We thereby establish a connection between TN-constrained kernel machines and GPs.
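In the spirit of the paper's first experiment, the toy Monte Carlo sketch below (our own setup; all sizes, the feature vectors, and the variance-normalizing scale are assumptions) draws i.i.d. Gaussian TT cores and samples the model output at a fixed input; the output distribution looks increasingly Gaussian as the rank grows, consistent with the GP limit.

```python
# Toy Monte Carlo sketch: outputs of a TT-constrained kernel machine with
# i.i.d. N(0,1) core entries, sampled at one fixed input.
import numpy as np

def tt_output(cores, feats):
    """f(x) = <W, phi_1(x) o ... o phi_d(x)>, weight tensor W in TT format."""
    v = np.ones(1)
    for G, z in zip(cores, feats):
        v = np.einsum('a,anb,n->b', v, G, z)   # contract one core at a time
    return v.item()

rng = np.random.default_rng(7)
d, n, r, trials = 4, 3, 10, 2000
feats = [rng.standard_normal(n) for _ in range(d)]            # one fixed input
shapes = [(1, n, r)] + [(r, n, r)] * (d - 2) + [(r, n, 1)]
scale = r ** (-(d - 1) / 2)    # makes the output variance independent of the rank r
samples = scale * np.array([
    tt_output([rng.standard_normal(s) for s in shapes], feats)
    for _ in range(trials)
])
print(samples.mean(), samples.std())  # mean near 0; histogram approaches a Gaussian
```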