Based on data at ~40°N at different longitudes during different stratospheric sudden warming (SSW) events, this paper studies the responses of zonal winds in the stratosphere, mesosphere, and lower thermosphere to SSWs. The variations of zonal wind over Langfang, China (39.4°N, 116.7°E), from MF radar and Modern-Era Retrospective analysis for Research and Applications (MERRA) wind data during the 2010 and 2013 SSWs, are compared with those over Fort Collins, USA (41°N, 105°W), from lidar and MERRA wind data during the 2009 SSW. Results show that the zonal wind at ~40°N does respond to SSWs, although the specifics differ between SSW events and between locations, and the zonal wind shows significant anomalies during the SSWs. Over Langfang, before the onset of the 2010 and 2013 SSWs, the zonal wind reverses from eastward to westward below about 60–70 km and accelerates above this region; westward wind prevails from 30 to 100 km after the onset of the 2010 SSW, while after the onset of the 2013 SSW westward wind prevails at 30–60 and 85–100 km and eastward wind prevails at 60–85 km. Over Fort Collins during the 2009 SSW, eastward wind reverses to westward at 20–30 km before the onset, while after the onset westward wind prevails at 20–30 and 60–97 km and eastward wind prevails at 30–60 and 97–100 km. Moreover, simulations with the specified-dynamics version of the Whole Atmosphere Community Climate Model (SD-WACCM) are used to explain the different responses of the zonal wind to the SSW events. It is found that modulation by planetary waves (PWs) plays the main role: different PW phases lead to different zonal winds along longitude, and the different PW amplitudes and phases in different SSW events lead to the different zonal wind responses.
A survey on deep learning for big data. Zhang, Qingchen; Yang, Laurence T.; Chen, Zhikui; et al. Information Fusion, Volume 42, July 2018. Journal article, peer reviewed.
• We review the four most typical deep learning models and their variants.
• We provide a survey on deep learning models for big data feature learning.
• We discuss the remaining challenges and future trends of big data deep learning.
Deep learning, as one of the currently most remarkable machine learning techniques, has achieved great success in many applications such as image analysis, speech recognition, and text understanding. It uses supervised and unsupervised strategies to learn multi-level representations and features in hierarchical architectures for classification and pattern recognition tasks. Recent developments in sensor networks and communication technologies have enabled the collection of big data. Although big data provides great opportunities for a broad range of areas, including e-commerce, industrial control, and smart healthcare, it poses many challenging issues for data mining and information processing due to its characteristics of large volume, large variety, large velocity, and large veracity. In the past few years, deep learning has played an important role in big data analytic solutions. In this paper, we review emerging research on deep learning models for big data feature learning. Furthermore, we point out the remaining challenges of big data deep learning and discuss future topics.
To improve the efficiency of big data feature learning, this paper proposes a privacy-preserving deep computation model that offloads the expensive operations to the cloud. Privacy concerns become evident because a large amount of private data is produced by various applications in the smart city, such as sensitive government data or proprietary enterprise information. To protect the private data, the proposed model uses the BGV encryption scheme to encrypt the private data and employs cloud servers to perform the high-order back-propagation algorithm on the encrypted data efficiently for deep computation model training. Furthermore, the proposed scheme approximates the sigmoid function by a polynomial to support secure computation of the activation function under BGV encryption. In our scheme, only the encryption and decryption operations are performed by the client, while all the computation tasks are performed on the cloud. Experimental results show that, on a cloud of ten nodes, our scheme improves training efficiency by approximately 2.5 times over the conventional deep computation model, without disclosing the private data. More importantly, our scheme is highly scalable by employing more cloud servers, which makes it particularly suitable for big data.
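The polynomial substitution for the activation can be sketched as follows. This is a minimal illustration assuming a degree-3 least-squares fit over [-8, 8]; the paper's exact approximation, and the BGV encryption itself, are not reproduced here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# BGV supports only additions and multiplications on ciphertexts, so
# the activation must be replaced by a polynomial before encryption.
# Degree 3 over [-8, 8] is an illustrative choice, not the paper's
# exact approximation.
xs = np.linspace(-8.0, 8.0, 401)
coeffs = np.polyfit(xs, sigmoid(xs), deg=3)
poly = np.poly1d(coeffs)

# Worst-case deviation of the polynomial stand-in over the interval.
max_err = np.max(np.abs(poly(xs) - sigmoid(xs)))
```

A server can then evaluate `poly` on encrypted inputs using only ciphertext additions and multiplications, which is what makes the activation compatible with a somewhat-homomorphic scheme such as BGV.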
Convolutional neural networks (CNNs) are among the most representative deep learning models. CNNs have been used extensively in many aspects of medical image analysis, allowing for great progress in computer-aided diagnosis in recent years. In this paper, we provide a survey on convolutional neural networks in medical image analysis. First, we review the CNNs commonly used in medical image processing, including AlexNet, GoogleNet, ResNet, R-CNN, and FCNN. Then, we present an overview of the use of CNNs for image classification, segmentation, detection, and other tasks such as registration, content-based image retrieval, and image generation and enhancement, in typical medical diagnosis areas such as the brain, breast, and abdomen. Finally, we discuss the remaining challenges of CNNs in medical image analysis and accordingly present some ideas for future research directions.
Currently, large amounts of industrial data, usually referred to as big data, are collected from the Internet of Things (IoT). Big data are typically heterogeneous, i.e., each object in a big dataset is multimodal, posing a challenging issue for the convolutional neural network (CNN), one of the most representative deep learning models. In this paper, a deep convolutional computation model (DCCM) is proposed to learn hierarchical features of big data, using the tensor representation model to extend the CNN from the vector space to the tensor space. To make full use of the local features and topologies contained in big data, a tensor convolution operation is defined to prevent overfitting and improve training efficiency. Furthermore, a high-order backpropagation algorithm is proposed to train the parameters of the model in the high-order space. Finally, experiments on three datasets, i.e., CUAVE, SNAE2, and STL-10, are carried out to verify the performance of the DCCM. Experimental results show that the DCCM achieves higher classification accuracy than the deep computation model or the multimodal model for big data in IoT.
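The idea of a convolution defined directly in tensor space can be sketched as below. This is a hypothetical stand-in for the paper's tensor convolution operation, assuming a simple "valid" window that slides along every mode at once instead of flattening the multimodal input into a vector.

```python
import numpy as np

def tensor_conv_valid(x, k):
    """Valid convolution of a 3-way tensor x with a 3-way kernel k.

    Illustrative stand-in for a tensor convolution: the kernel slides
    along every mode simultaneously, so cross-mode local structure is
    used directly rather than being lost to vectorization.
    """
    out_shape = tuple(xs - ks + 1 for xs, ks in zip(x.shape, k.shape))
    out = np.zeros(out_shape)
    for idx in np.ndindex(out_shape):
        window = tuple(slice(i, i + ks) for i, ks in zip(idx, k.shape))
        out[idx] = np.sum(x[window] * k)
    return out

x = np.arange(27, dtype=float).reshape(3, 3, 3)
k = np.ones((2, 2, 2))
y = tensor_conv_valid(x, k)   # output shape (2, 2, 2)
```

Each output entry summarizes a small multimodal neighborhood, which is the structural information a vector-space model would discard.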
• A CP-HOPCM algorithm based on canonical polyadic decomposition is proposed.
• The canonical polyadic decomposition in CP-HOPCM is used to compress the attributes.
• A TT-HOPCM algorithm based on the tensor-train network is proposed.
• The tensor-train network in TT-HOPCM is used to compress the attributes.
• The proposed schemes compress the objects greatly without a large accuracy drop.
Internet of Things (IoT) connects the physical world and the cyber world to offer intelligent services through big data mining. Each big data sample typically involves a large number of attributes, posing a remarkable challenge for the high-order possibilistic c-means algorithm (HOPCM). Specifically, HOPCM requires high-performance servers with large-scale memory and powerful computing units to cluster big samples, limiting its applicability in IoT systems built on low-end devices, such as portable computing units and embedded devices, which have only limited memory space and computing power. In this paper, we propose two high-order possibilistic c-means algorithms based on the canonical polyadic decomposition (CP-HOPCM) and the tensor-train network (TT-HOPCM) for clustering big data. In detail, we use the canonical polyadic decomposition and the tensor-train network to compress the attributes of each big data sample. To evaluate the performance of our algorithms, we conduct experiments on two representative big data datasets, i.e., NUS-WIDE-14 and SNAE2, comparing with the conventional high-order possibilistic c-means algorithm in terms of attribute reduction, execution time, memory usage, and clustering accuracy. Results imply that CP-HOPCM and TT-HOPCM are promising for big data clustering in IoT systems with low-end devices, since they achieve a high compression rate for heterogeneous samples and save memory space significantly without a significant drop in clustering accuracy.
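The attribute-compression idea can be sketched with a tensor-train decomposition built from sequential truncated SVDs. This is an illustrative TT-SVD, not the paper's exact TT-HOPCM procedure; the sample shape and `max_rank` are arbitrary choices, and the CP variant is not shown.

```python
import numpy as np

def tt_svd(x, max_rank):
    """Tensor-train decomposition via sequential truncated SVDs.

    Each big-data sample (a tensor) is stored as a chain of small
    3-way cores instead of the full array; max_rank caps every TT
    rank, which controls the compression/accuracy trade-off.
    """
    shape, d = x.shape, x.ndim
    cores, r = [], 1
    m = x.reshape(shape[0], -1)
    for n in range(d - 1):
        u, s, vt = np.linalg.svd(m, full_matrices=False)
        rank = min(max_rank, len(s))
        cores.append(u[:, :rank].reshape(r, shape[n], rank))
        m = s[:rank, None] * vt[:rank]
        r = rank
        if n < d - 2:
            m = m.reshape(r * shape[n + 1], -1)
    cores.append(m.reshape(r, shape[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the chain of cores back into a full tensor."""
    out = cores[0]
    for c in cores[1:]:
        out = np.tensordot(out, c, axes=([out.ndim - 1], [0]))
    return out.reshape([c.shape[1] for c in cores])

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 5, 6))
exact = tt_svd(x, max_rank=30)        # no truncation: lossless
compressed = tt_svd(x, max_rank=2)    # lossy but far fewer numbers
n_params = sum(c.size for c in compressed)
```

With `max_rank` large enough the decomposition is exact; a small cap trades a little reconstruction accuracy for the memory savings that make clustering feasible on low-end devices.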
Supervised machine learning algorithms, especially classification algorithms, have been widely used in the analysis of industrial big data. Among them, the support vector machine (SVM) has achieved great success in binary classification in areas such as image processing, computer vision, and pattern recognition. However, an SVM cannot achieve the desired classification results for the heterogeneous and high-dimensional data generated by thousands of industrial sensors in physical environments, because the traditional vector-based, feature-aligned SVM algorithm may lose structural information and rich contextual information. Although the support tensor machine (STM) extends the traditional vector-based SVM to tensor space, it fails to deal with multiclass classification problems. Therefore, designing a general multiclass classification algorithm for heterogeneous and high-dimensional data is a challenging but promising topic. To achieve this goal, this article presents a support multimode tensor machine (SMTM) algorithm that applies the multimode product to generalize the formulation of the STM. Furthermore, this article presents an efficient algorithm to train the parameters. Experiments conducted on various datasets validate the better performance of the SMTM over other algorithms in multiclass classification and imply the potential of the proposed model for multiclass classification on industrial big data.
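The mode-n product underlying the multimode product can be sketched as follows. This is the standard tensor operation shown for a single mode with arbitrary example shapes, not the SMTM training algorithm itself; the multimode product applies such products along several modes at once.

```python
import numpy as np

def mode_n_product(x, m, n):
    """Mode-n product: multiply tensor x by matrix m along mode n.

    Entry-wise, (x ×_n M)[..., j, ...] = sum_k M[j, k] * x[..., k, ...],
    where the contracted index sits in mode n.
    """
    # tensordot contracts x's mode n with m's columns and appends the
    # new axis at the end; moveaxis restores it to position n.
    return np.moveaxis(np.tensordot(x, m, axes=(n, 1)), -1, n)

x = np.arange(24.0).reshape(2, 3, 4)
m = np.ones((5, 3))
y = mode_n_product(x, m, 1)   # shape (2, 5, 4)
```

Generalizing the STM with this product lets one weight tensor act on every mode of a multimodal sample, which is how structural information survives that flattening would destroy.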
With the rapid advances in sensing technologies and wireless communications, large amounts of dynamic data pertaining to industrial production are being collected from the many sensor nodes deployed in the industrial Internet of Things. Analyzing those data effectively can help to improve industrial services and mitigate unexpected system breakdowns. As an important data analysis technique, clustering attempts to find the underlying pattern structures embedded in unlabeled information. Unfortunately, most current clustering techniques can only deal with static data and become infeasible for clustering the significant volumes of data found in dynamic industrial applications. To tackle this problem, this paper proposes an incremental clustering algorithm that combines fast search-and-find of density peaks with k-medoids. In the proposed algorithm, two cluster operations, namely cluster creating and cluster merging, are defined to integrate the current pattern into the previous one for the final clustering result, and k-medoids is employed to adjust the cluster centers according to newly arriving objects. Finally, experiments are conducted to validate the proposed scheme on three popular UCI datasets and two real datasets collected from the industrial Internet of Things, in terms of clustering accuracy and computational time.
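The incremental k-medoids update can be sketched as below. This is a minimal illustration of the assign-and-refresh step only; the density-peak search and the paper's cluster-creating and cluster-merging rules are omitted, and all names and data here are hypothetical.

```python
import numpy as np

def update_medoid(points):
    """Pick the cluster member with the smallest total distance to the rest."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    return points[np.argmin(d.sum(axis=1))]

def absorb_new_objects(medoids, clusters, new_points):
    """Assign newly arriving objects to the nearest medoid, then
    refresh each medoid, mirroring the incremental k-medoids step."""
    for p in new_points:
        i = int(np.argmin([np.linalg.norm(p - m) for m in medoids]))
        clusters[i].append(p)
    return [update_medoid(np.asarray(c)) for c in clusters]

# Two toy clusters and a small batch of arriving objects.
medoids = [np.array([0.0, 0.0]), np.array([10.0, 10.0])]
clusters = [[medoids[0]], [medoids[1]]]
new_points = [np.array([1.0, 1.0]), np.array([9.0, 9.0]), np.array([1.0, 0.0])]
medoids = absorb_new_objects(medoids, clusters, new_points)
```

Because each batch only touches the medoids and the affected clusters, the previous clustering result is reused rather than recomputed, which is what makes the scheme viable for streaming industrial data.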
About 90% of esophageal cancers are esophageal squamous cell carcinoma (ESCC), which has a very poor prognosis and high mortality. Nevertheless, the key metabolic pathways associated with ESCC progression have not yet been revealed. Metabolomics has become a new platform for biomarker discovery in recent years. We aim to elucidate the dominant metabolic pathways across all ESCC tumor/node/metastasis (TNM) stages and in adjacent cancerous tissues. We collected 60 postoperative esophageal tissues and 15 normal tissues adjacent to the tumor, then performed liquid chromatography with tandem mass spectrometry (LC-MS/MS) analyses. The metabolite data were analyzed with differential-expression and correlational heatmaps for stage I vs. control, stage I vs. stage II, stage II vs. stage III, and stage III vs. stage IV, respectively. Metabolic pathways were obtained from the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database, and the pathway-related genes via Gene Set Enrichment Analysis (GSEA). mRNA expression of the ESCC metabolic pathway genes was examined in two public datasets: the gene expression data series GSE23400 and The Cancer Genome Atlas (TCGA). Receiver operating characteristic (ROC) curve analysis was applied to the metabolic pathway genes. In total, 712 metabolites were identified. Glycerophospholipid metabolism was significantly distinct during ESCC progression. mRNA expression of 16 of the 77 glycerophospholipid metabolism genes differed significantly between ESCC and normal controls. Phosphatidylserine synthase 1 (PTDSS1) and lysophosphatidylcholine acyltransferase 1 (LPCAT1) had good diagnostic value, with an area under the ROC curve (AUC) > 0.9. In this study, we found that glycerophospholipid metabolism is associated with ESCC tumorigenesis and progression and could be a potential therapeutic target.
Deep learning, as the most important architecture of current computational intelligence, achieves superior performance in predicting cloud workloads for industrial informatics. However, training a deep learning model efficiently is a nontrivial task, since such a model often includes a great number of parameters. In this paper, an efficient deep learning model based on the canonical polyadic decomposition is proposed to predict cloud workloads for industrial informatics. In the proposed model, the parameters are compressed significantly by converting the weight matrices to the canonical polyadic format. Furthermore, an efficient learning algorithm is designed to train the parameters. Finally, the proposed model is applied to the workload prediction of virtual machines on the cloud. Experiments are conducted on datasets collected from PlanetLab to validate the performance of the proposed model by comparison with other machine-learning-based approaches to virtual machine workload prediction. Results indicate that the proposed model achieves higher training efficiency and workload prediction accuracy than state-of-the-art machine-learning-based approaches, demonstrating its potential to provide predictive services for industrial informatics.
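The parameter-compression idea can be sketched for a single weight matrix: store two small factors instead of the dense matrix and apply them in sequence. The sizes and rank below are hypothetical, and a full canonical polyadic treatment of higher-order weight tensors is not shown; for a 2-way weight, the CP format reduces to this low-rank factorization.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer sizes and rank, for illustration only.
m, n, rank = 256, 128, 8
a = rng.standard_normal((m, rank))
b = rng.standard_normal((rank, n))
w = a @ b                     # a rank-8 weight matrix

# Factored forward pass: never materialize w; apply the two small
# factors in sequence. Higher-order weight tensors in CP format get
# one factor matrix per mode.
x = rng.standard_normal(n)
y_dense = w @ x
y_cp = a @ (b @ x)

params_dense = m * n          # 32768 parameters
params_cp = rank * (m + n)    # 3072 parameters
```

The factored pass produces the same output while storing roughly a tenth of the parameters at this rank, which is where the training-efficiency gain comes from.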