Candida auris, first described from an ear infection in Japan, is the most talked-about multidrug-resistant emerging pathogenic fungal species. Its environmental niche remained a mystery until its first isolation from wetlands of the Andaman Islands, India, in 2020. We screened a subset of the world’s largest sequence repository, the Sequence Read Archive at NCBI, using a DNA metabarcoding approach based on either the ITS1 or ITS2 region of the official primary fungal DNA barcode, to identify potential environmental sources of C. auris. Our search identified 34 matches with partial C. auris ITS sequences from seven metabarcoding studies, providing wider evidence for the presence of C. auris outside human-maintained facilities.
Social media contains a massive amount of information, which provides researchers and practitioners with an invaluable source of data to conduct research from end-users' perspectives, in order to influence firm strategic choices. Although extensive research has been conducted in B2C and B2B marketing contexts, few social media studies explore potential linkages between external and internal marketing contexts in an industry-specific paradigm. This study aims to bridge B2C and B2B social media marketing by adopting the outside-in perspective as a theoretical lens. Using a large-scale dataset collected from a micro-blogging site, together with consumer-oriented information assembled from multiple sources, we empirically examine the inter-relationship between firm-generated messages, consumer digital engagement, and firm sales performance in the movie industry. Theoretically, this study builds upon the outside-in perspective and extends current knowledge of it to the social media context. It also bridges the B2C and B2B marketing literature by demonstrating that the insight garnered from B2C social media interactions should be integrated into B2B firm interactions, communications, and decision making. Managerially, this study provides movie practitioners with important implications.
•Consumer digital engagement measurements can be considered good outside-in performance metrics.
•Consumer digital engagement, as triggered by varying types of FGC, mediates the impact of FGC on firm sales performance differentially.
•Insight from B2C social media communications can be value-added and should be integrated into B2B firm dynamics and decision-making processes.
This paper presents a multi-objective optimization model for a green supply chain management scheme that minimizes the inherent risk posed by hazardous materials, the associated carbon emissions, and economic cost. The model's parameters are drawn from a big data analysis. Three scenarios are proposed to improve green supply chain management. The first scenario divides optimization into three options: the first involves minimizing risk and then dealing with carbon emissions (and thus economic cost); the second minimizes both risk and carbon emissions first, with the ultimate goal of minimizing overall cost; and the third option attempts to minimize risk, carbon emissions, and economic cost simultaneously. This paper provides a case study to verify the optimization model. Finally, the limitations of this research and approach are discussed to lay a foundation for further improvement.
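The three optimization options described above can be contrasted with a minimal sketch. It is not the paper's model: the candidate plans, their normalized scores, the equal weights, and all function names are hypothetical, chosen only to show how the ordering of objectives can change which plan is selected.

```python
from typing import NamedTuple, Sequence, Tuple


class Plan(NamedTuple):
    """One candidate plan; scores are hypothetical and normalized to [0, 1]."""
    name: str
    risk: float
    carbon: float
    cost: float


# Illustrative candidates only, not data from the paper.
PLANS = [
    Plan("A", risk=0.2, carbon=0.9, cost=0.8),
    Plan("B", risk=0.4, carbon=0.3, cost=0.7),
    Plan("C", risk=0.5, carbon=0.5, cost=0.1),
]


def risk_then_carbon(plans: Sequence[Plan]) -> Plan:
    # Option 1: lexicographic order, risk first, then carbon, then cost.
    return min(plans, key=lambda p: (p.risk, p.carbon, p.cost))


def risk_and_carbon_then_cost(plans: Sequence[Plan]) -> Plan:
    # Option 2: equal-weight sum of risk and carbon first; cost only breaks ties.
    return min(plans, key=lambda p: (0.5 * p.risk + 0.5 * p.carbon, p.cost))


def all_at_once(
    plans: Sequence[Plan],
    weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3),
) -> Plan:
    # Option 3: a single weighted sum over all three objectives (assumed weights).
    w_risk, w_carbon, w_cost = weights
    return min(
        plans,
        key=lambda p: w_risk * p.risk + w_carbon * p.carbon + w_cost * p.cost,
    )


if __name__ == "__main__":
    print(risk_then_carbon(PLANS).name)           # A
    print(risk_and_carbon_then_cost(PLANS).name)  # B
    print(all_at_once(PLANS).name)                # C
```

With these made-up scores each option picks a different plan, which is the reason the order in which objectives are handled matters. A weighted sum is only one way to combine objectives; the paper may use a different formulation.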
A Survey on Large-Scale Machine Learning
Wang, Meng; Fu, Weijie; He, Xiangnan ...
IEEE Transactions on Knowledge and Data Engineering, 06/2022, Volume 34, Issue 6
Journal Article, peer reviewed, open access
Machine learning can provide deep insights into data, allowing machines to make high-quality predictions, and it has been widely used in real-world applications such as text mining, visual classification, and recommender systems. However, most sophisticated machine learning approaches suffer from huge time costs when operating on large-scale data. This issue calls for Large-scale Machine Learning (LML), which aims to learn patterns from big data efficiently while retaining comparable performance. In this paper, we offer a systematic survey of existing LML methods to provide a blueprint for future developments in this area. We first divide these LML methods according to the way they improve scalability: 1) model simplification on computational complexities, 2) optimization approximation on computational efficiency, and 3) computation parallelism on computational capabilities. Then we categorize the methods in each perspective according to their targeted scenarios and introduce representative methods in line with intrinsic strategies. Lastly, we analyze their limitations and discuss potential directions as well as open issues that are promising to address in the future.
Proteins have evolved to perform diverse cellular functions, from serving as reaction catalysts to coordinating cellular propagation and development. Frequently, proteins do not exert their full potential as monomers but rather undergo concerted interactions as either homo-oligomers or with other proteins as hetero-oligomers. The experimental study of such protein complexes and interactions has been arduous. Theoretical structure prediction methods are an attractive alternative. Here, we investigate homo-oligomeric interfaces by tracing residue coevolution via the global statistical direct coupling analysis (DCA). DCA can accurately infer spatial adjacencies between residues. These adjacencies can be included as constraints in structure prediction techniques to predict high-resolution models. By taking advantage of the ongoing exponential growth of sequence databases, we go significantly beyond anecdotal cases of a few protein families and apply DCA to a systematic large-scale study of nearly 2,000 Pfam protein families with sufficient sequence information and structurally resolved homo-oligomeric interfaces. We find that large interfaces are commonly identified by DCA. We further demonstrate that DCA can differentiate between subfamilies with different binding modes within one large Pfam family. Sequence-derived contact information for the subfamilies proves sufficient to assemble accurate structural models of the diverse protein-oligomers. Thus, we provide an approach to investigate oligomerization for arbitrary protein families leading to structural models complementary to often-difficult experimental methods. Combined with ever more abundant sequential data, we anticipate that this study will be instrumental to allow the structural description of many heteroprotein complexes in the future.
•We introduce an approach to a content analysis of geotagged photos for CES uses.
•By using automated tags and a network analysis, themes of the photos were grouped.
•This method made it possible to distinguish CES- and non-CES-related photos.
•This approach can provide spatial information about socio-cultural uses.
•Our approach is applicable for crowd-sourced photos available in other regions.
The volume of accessible geotagged crowdsourced photos has increased. Such data include spatial, temporal, and thematic information on recreation and outdoor activities, and thus can be used to quantify the demand for cultural ecosystem services (CES). So far, photo content has been analyzed based on user-labeled tags or the manual labeling of photos. Both approaches are challenged with respect to consistency and cost-efficiency, especially for large-scale studies with an enormous volume of photos. In this study, we aim to develop a new method to analyze the content of large volumes of photos and to derive indicators of socio-cultural usage of landscapes. The method uses machine learning and network analysis to identify clusters of photo content that can be used as an indicator of cultural services provided by landscapes. The approach was applied in the Mulde river basin in Saxony, Germany. All public Flickr photos (n = 12,635) belonging to the basin were tagged by deep convolutional neural networks through a cloud computing platform, Clarifai. The machine-predicted tags were analyzed by a network analysis that led to nine hierarchical clusters. Those clusters were used to distinguish between photos related to CES (65%) and not related to CES (35%). Among the nine clusters, two clusters were related to CES: ‘landscape aesthetics’ and ‘existence’. This step allowed mapping of different aspects of CES and separation of non-relevant photos from further analysis. We further analyzed the impact of protected areas on the spatial pattern of CES-related and non-CES-related photos. The presence of protected areas had a significant positive impact on the areas with both ‘landscape aesthetics’ and ‘existence’ photos: the total number of days in each mapping unit where at least one photo was taken by a user (‘photo-user-day’) increased with the share of protected areas around the location.
The presented approach has shown its potential for reliable mapping of socio-cultural uses of landscapes. It is expected to scale well with large numbers of photos and to be easily transferable to different regions.
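The ‘photo-user-day’ indicator defined above can be illustrated with a minimal sketch. It assumes each photo has already been reduced to a (mapping unit, user, date) record; the record layout, the sample values, and the function name are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, Set, Tuple

# Hypothetical record layout: (mapping_unit_id, user_id, date_taken).
PhotoRecord = Tuple[str, str, date]


def photo_user_days(photos: Iterable[PhotoRecord]) -> Dict[str, int]:
    """Count distinct (user, day) pairs per mapping unit.

    Several photos by the same user on the same day in the same unit
    count once, so prolific photographers do not dominate the indicator.
    """
    seen: Dict[str, Set[Tuple[str, date]]] = defaultdict(set)
    for unit, user, day in photos:
        seen[unit].add((user, day))
    return {unit: len(pairs) for unit, pairs in seen.items()}


# Illustrative records only, not data from the study.
SAMPLE = [
    ("cell_1", "alice", date(2016, 5, 1)),
    ("cell_1", "alice", date(2016, 5, 1)),  # same user, same day: counted once
    ("cell_1", "alice", date(2016, 5, 2)),
    ("cell_1", "bob", date(2016, 5, 1)),
    ("cell_2", "bob", date(2016, 5, 3)),
]

if __name__ == "__main__":
    print(photo_user_days(SAMPLE))  # {'cell_1': 3, 'cell_2': 1}
```

Counting unique user and day pairs instead of raw photos is what makes the indicator comparable across mapping units with very different numbers of uploads.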
In Geological Carbon Sequestration (GCS), mineralization is a secure carbon dioxide (CO2) trapping mechanism that prevents possible leakage at a later stage of the GCS project. Modeling the mineralization mechanism during GCS relies on numerical reservoir simulation, but the computational cost is prohibitively high due to the complex physical processes. Therefore, deep learning (DL) models can be used as a computationally cheaper and, at the same time, more reliable alternative to conventional numerical simulations. In this work, we have developed a DL approach to effectively predict the dissolution and precipitation of various essential minerals, including Anorthite, Kaolinite, and Calcite, during CO2 injection into deep saline aquifers. We have established a reservoir model to simulate the geological CO2 storage process. Seven hundred twenty-two numerical realizations were performed to generate a comprehensive dataset for training DL models. Two convolutional neural network (CNN) models, Fourier Neural Operator (FNO) and U-Net, were trained. The trained models used reservoir and well properties along with time information as input and predicted the precipitation and dissolution of minerals in space and time. During the training process, root-mean-squared error (RMSE) was used as the loss function. To gauge prediction performance, we applied the trained models to predict the concentrations of different minerals on the test dataset, which is 15% of the entire dataset, and adopted two metrics: the average absolute percentage error (AAPE) and the coefficient of determination (R2). The FNO model achieved an R2 of 0.95 for the Calcite model, 0.94 for the Kaolinite model, and 0.93 for the Anorthite model. The U-Net model achieved an R2 of 0.88 for the Calcite model, 0.89 for the Kaolinite model, and 0.912 for the Anorthite model.
The model’s prediction CPU time (0.2 s/case) was much lower than that of the physics-based reservoir simulator (3600 s/case). Therefore, the proposed method offers predictions as accurate as our physics-based reservoir simulations while providing a substantial computational time acceleration.
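The metrics named above can be written out as a minimal sketch. The formulas are the standard textbook definitions of RMSE, AAPE, and R2 and are assumed here, since the abstract does not spell them out; the function names are hypothetical. The last helper reproduces the speed-up implied by the reported CPU times (3600 s/case versus 0.2 s/case, roughly 18,000 times faster).

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    # Root-mean-squared error: square root of the mean squared residual.
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def aape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    # Average absolute percentage error, in percent.
    # Assumes no true value is zero; zeros would need special handling.
    n = len(y_true)
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    # Coefficient of determination: 1 - SS_res / SS_tot.
    # Assumes y_true is not constant (SS_tot > 0).
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def speedup(simulator_seconds: float, model_seconds: float) -> float:
    # Ratio of per-case run times.
    return simulator_seconds / model_seconds


if __name__ == "__main__":
    # Toy values for illustration only, not results from the paper.
    truth, pred = [2.0, 4.0], [1.0, 5.0]
    print(rmse(truth, pred), aape(truth, pred), r2(truth, pred))
    print(round(speedup(3600.0, 0.2)))
```

Note that an R2 of 0 means the predictions do no better than always predicting the mean of the true values, which puts the reported values of 0.88 to 0.95 in context.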
•A robust deep learning (DL) workflow is presented.
•DL workflow can efficiently predict the spatial and temporal mineralization process.
•DL workflow showed substantial acceleration compared to full numerical reservoir simulation.
•Multi-model approach enhanced the prediction performance of CO2 mineralization process.
Abstract
English is one of the world’s universal languages. As our country's exchanges with the world become more frequent, society has a higher demand for English talent. According to the analysis of big data, the goals of English education in our country are to reform traditional teaching methods and to cultivate well-rounded English talent. This covers the multi-dimensional interactive teaching mode, the limitations of the traditional mode, practical teaching activities, and the training of students’ language expression ability. Providing students with a good language learning environment is the function of the interactive teaching mode, which should be promoted in our education 1.
Football is the most popular sport in the world, with four billion fans worldwide. Reportedly, the incidence of violence is high during or after matches. Violent or destructive behavior carried out by a person who watches or plays the game in the stadium is known as football hooliganism. To prevent or control the violence, a real-time violence detection system is needed to monitor the behavior of the crowd and players so that necessary action can be taken before violence occurs. The system must also determine whether an attack in the game is intentional or unintentional. In this paper, a real-time violence detection system is proposed which processes the huge input streaming data and recognizes violence with human intelligence simulation. The input to the system is an enormous number of real-time video streams from different sources, which are processed in the Spark framework. In the Spark framework, the frames are separated and the features of individual frames are extracted using the HOG (Histogram of Oriented Gradients) function. The frames are then labeled, based on their features, as violence model, human part model, or negative model, and these labels are used to train the Bidirectional Long Short-Term Memory (BDLSTM) network for recognition of violence scenes. The bidirectional LSTM can access information in both the forward and reverse directions, so the output is generated in the context of both past and future information. The network is trained with the violent interaction dataset (VID), containing 2314 videos, of which 1077 are fight videos and 1237 are no-fight videos. Moreover, to make the model robust for violence detection, we have created a dataset with 410 video clips of non-violence scenes and 409 video clips of violence scenes, acquired from the football stadium. The performance of this model is validated, and it demonstrates the robustness of the system with an accuracy of 94.5 percent in recognizing violent action.
This paper presents a new approach to analyzing measurement records from industrial processes. The proposed methodology is based on the model of contextual processing and uses big data from experimental process tomography datasets. Electrical capacitance tomography is used for noninvasive flow monitoring and for data acquisition. The measurement data are collected, stored, and processed to identify process regimes and process threats. A specific physical modification was introduced into the pneumatic conveying flow rig in order to study flow behavior under extreme conditions, extending the available knowledge base. A support vector machine was applied for data classification. This study illustrates how contextual processing can facilitate data interpretation and opens the way for the development of methods for detecting pre-emergency flow patterns.