Given the importance of automatically classifying Tibetan La case example sentences in Tibetan natural language processing, this paper classifies Tibetan La case example sentences according to the usage and attachment rules of the La case, defines the classification concept, and proposes an automatic classification model for Tibetan La case example sentences that fuses dual-channel syllable features. The proposed model first uses word2vec and GloVe to construct dual-channel Tibetan syllable embeddings, and combines the dual-channel syllable features in each convolution to enrich the expression of the input features and improve the spatial representation ability of the convolutional layer. Then, in each convolution, a Bi-LSTM combined with a hierarchical attention mechanism is used to learn temporal features, and the multi-channel features are concatenated to improve the learning of contextual temporal features. Finally, the automatic classification of Tibetan La case example sentences is performed.
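The dual-channel input described above can be sketched roughly as follows: two separate syllable-embedding lookup tables (one standing in for word2vec, one for GloVe) are stacked as channels of the convolutional input. This is an illustrative reconstruction, not the authors' code; `fake_table`, the vocabulary, the 8-dimensional embeddings, and the zero-vector handling of unknown syllables are all placeholder assumptions.

```python
import random

EMB_DIM = 8  # placeholder embedding dimension

def fake_table(vocab, dim, seed=0):
    """Stand-in for a trained word2vec or GloVe syllable lookup table."""
    rng = random.Random(seed)
    return {s: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for s in vocab}

def dual_channel_input(syllables, w2v, glove, dim=EMB_DIM):
    """Stack the two embeddings as channels: shape (2, n_syllables, dim).
    Unknown syllables fall back to a zero vector (an assumption)."""
    zero = [0.0] * dim
    return [[w2v.get(s, zero) for s in syllables],
            [glove.get(s, zero) for s in syllables]]
```

A convolutional layer would then slide filters over the syllable axis with the two tables as input channels, so each filter sees both embeddings of every syllable at once.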
Purpose/significance: To address the difficulty users face in locating information within a library's massive digital resources, this paper constructs a personalized knowledge service system, an inevitable choice for libraries seeking to relieve users of information overload and improve the quality of knowledge service. Method/process: First, the paper builds a mapping model between the Chinese Library Classification (CLC) and subject classification. Then, based on the Hadoop distributed processing platform, it proposes an automatic classification model for massive academic resources in libraries using an improved TF-IDF + Bayesian algorithm; the model supports the construction of personalized knowledge service systems in libraries. Result/conclusion: In the experimental part, more than 6 million documents collected from CNKI (covering 75 disciplines) served as the original training corpus to test the classification model, and the experimental results confirm its effectiveness.
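The abstract does not detail the "improved TF-IDF + Bayesian" algorithm; the sketch below shows only the standard combination it presumably builds on: TF-IDF weighting feeding a multinomial Naive Bayes whose class statistics are accumulated from the weights. `WeightedNB`, the smoothing constant, and the toy corpus are assumptions for illustration, not the paper's implementation.

```python
import math
from collections import Counter, defaultdict

def tf_idf(docs):
    """docs: list of token lists -> list of {term: tf-idf weight}."""
    df = Counter()
    for d in docs:
        df.update(set(d))
    n = len(docs)
    return [{t: (c / len(d)) * math.log(n / df[t])
             for t, c in Counter(d).items()} for d in docs]

class WeightedNB:
    """Multinomial Naive Bayes fed with TF-IDF weights as soft counts."""
    def fit(self, weighted_docs, labels, alpha=1.0):
        self.vocab = {t for d in weighted_docs for t in d}
        self.alpha = alpha
        self.prior = Counter(labels)
        self.term_w = defaultdict(lambda: defaultdict(float))
        self.totals = defaultdict(float)  # summed weight per class
        for d, y in zip(weighted_docs, labels):
            for t, w in d.items():
                self.term_w[y][t] += w
                self.totals[y] += w
        return self

    def predict(self, doc_tokens):
        best, best_lp = None, -math.inf
        n_total = sum(self.prior.values())
        for y, ny in self.prior.items():
            lp = math.log(ny / n_total)
            denom = self.totals[y] + self.alpha * len(self.vocab)
            for t in doc_tokens:
                if t in self.vocab:
                    lp += math.log((self.term_w[y][t] + self.alpha) / denom)
            if lp > best_lp:
                best, best_lp = y, lp
        return best
```

On a Hadoop platform, the per-class weight accumulation in `fit` is the part that would be distributed as a map-reduce job over document shards.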
Sparse Representation-based Classification (SRC) has proven to be a reliable face recognition technique. The ℓ1 Bayesian approach based on the Lasso algorithm has proven most effective in terms of class identification and computational complexity. In this paper, we revisit the classification algorithm and recommend a group-based classification. The proposed modified algorithm, called Group Class Residual Sparse Representation-based Classification (GCR-SRC), extends the coherency of the test sample to all training samples of the identified class rather than only to the nearest training sample. Our method is based on the nearest coherency between a test sample and the identified training samples. To reduce the dimension of the training samples, we use random projection for feature extraction, chosen to reduce the computational cost without increasing the algorithm's complexity. Simulation results show that a reduction factor (ρ) of 64 achieves a maximum recognition rate about 10% higher than the original SRC with the downscaling method. Our method's feasibility and effectiveness are tested on four popular face databases: AT&T, Yale B, Georgia Tech, and AR. GCR-SRC and GCR-RP-SRC achieved up to 4% higher accuracy than SRC with random projection and class-specific residuals. The experimental results show that face recognition based on random projection and group-class residuals not only reduces the dimension of the face data but also increases recognition accuracy, indicating that it is a feasible method for face recognition.
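The random-projection feature-extraction step (but not the full GCR-SRC residual computation) can be sketched as follows, assuming a Gaussian projection matrix; the dimensions and seeding are placeholders.

```python
import math
import random

def gaussian_projection(d, k, seed=0):
    """k x d Gaussian random matrix, scaled by 1/sqrt(k) so projected
    norms are preserved in expectation (Johnson-Lindenstrauss style)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(d)]
            for _ in range(k)]

def project(R, x):
    """Map a d-dimensional sample x down to k dimensions via R."""
    return [sum(r * xi for r, xi in zip(row, x)) for row in R]
```

The same matrix `R` must be applied to both the training samples and the test sample before class residuals are compared; with a reduction factor of 64, a 64×64 face image (d = 4096) would be projected down to k = 64 features.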
•The BACH challenge was organized to push forward methods for automatic classification of breast cancer biopsies using clinical hematoxylin-eosin stained histopathological images.
•A large public dataset, composed of 400 microscopy images and 30 whole-slide images, was specifically compiled for the BACH challenge.
•A total of 64 methods were submitted, out of 677 registrations, and a detailed comparative analysis was carried out for the methods with the highest accuracy scores.
•Several submitted algorithms performed better than the state-of-the-art in terms of accuracy (top score of 87%).
•Convolutional neural networks dominated the submissions and were the method of choice in the algorithm that won the challenge.
Breast cancer is the most common invasive cancer in women, affecting more than 10% of women worldwide. Microscopic analysis of a biopsy remains one of the most important methods for diagnosing the type of breast cancer. This requires specialized analysis by pathologists, in a task that i) is highly time- and cost-consuming and ii) often leads to nonconsensual results. The relevance and potential of automatic classification algorithms using hematoxylin-eosin stained histopathological images have already been demonstrated, but the reported results are still sub-optimal for clinical use. With the goal of advancing the state-of-the-art in automatic classification, the Grand Challenge on BreAst Cancer Histology images (BACH) was organized in conjunction with the 15th International Conference on Image Analysis and Recognition (ICIAR 2018). BACH aimed at the classification and localization of clinically relevant histopathological classes in microscopy and whole-slide images from a large annotated dataset, specifically compiled and made publicly available for the challenge. Following a positive response from the scientific community, a total of 64 submissions, out of 677 registrations, effectively entered the competition. The submitted algorithms improved the state-of-the-art in automatic classification of breast cancer with microscopy images to an accuracy of 87%. Convolutional neural networks were the most successful methodology in the BACH challenge. Detailed analysis of the collective results allowed the identification of remaining challenges in the field and recommendations for future developments. The BACH dataset remains publicly available so as to promote further improvements in the field of automatic classification in digital pathology.
Applying artificial intelligence methods, the paper develops the algorithm structure and software for the formalized determination (automatic classification) of the distribution type of the probability density function and of the vector of limit values, by theoretically justifying security gradations and quantitatively determining security indicators. The methodological basis of the research is applied systems theory, statistical analysis, and methods of artificial intelligence (cluster analysis). A study of existing approaches showed the absence of a theoretical basis for determining security gradations and of their theoretical quantitative justification. The theoretical basis for determining security gradations is the concept of an extended "homeostatic plateau", which connects three levels of security in both directions, optimal, crisis, and critical, with the spheres of positive, neutral, and negative feedback. To determine the bifurcation points (vector of limit values), the "t-criterion" method is used, which consists of constructing the probability density function of a "benchmark" sample, determining the distribution type it belongs to by calculating statistical characteristics (mathematical expectation, mean square deviation, and asymmetry coefficient), and formally calculating the vector of limit values for the characteristic distribution types (normal, lognormal, exponential). To solve the problem of recognising (automatically classifying) the distribution type of the probability density functions of security indicators, artificial intelligence methods are used, namely a discriminant method from the class of cluster analysis methods using quantitative and qualitative metrics: Euclidean distance, the Manhattan metric, and recognition by characteristic features.
To digitize the determination of the vector of safety indicators limit values, an algorithm structure and software in the C++ programming language (version 6) have been developed, which ensures full automation of all stages of the algorithm and the adequacy of recognising graphic digital data with a predetermined number of clusters (types of distribution). A distinctive feature of the proposed method of formalized determination of the security indicators limit values is a complete absence of subjectivity and complete mathematical formalization, which significantly increases the speed, quality and reliability of the results obtained when evaluating the level of sustainable development, economic security, national security or national stability, regardless of the level of a researcher's qualification.
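A minimal sketch of the recognition step, computing the statistical characteristics named above (mathematical expectation, mean square deviation, asymmetry coefficient) and assigning the nearest "benchmark" distribution under a Euclidean or Manhattan metric, might look as follows. It is a simplification in Python rather than the authors' C++ implementation, and the benchmark feature vectors are placeholder assumptions (in practice the features would be normalized before comparison).

```python
import math

def stat_features(xs):
    """(mathematical expectation, mean square deviation, asymmetry coefficient)."""
    n = len(xs)
    m = sum(xs) / n
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / n)
    skew = sum((x - m) ** 3 for x in xs) / (n * s ** 3) if s else 0.0
    return (m, s, skew)

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def classify_distribution(sample, benchmarks, metric=euclidean):
    """Assign the benchmark distribution type whose feature vector is
    nearest to the sample's features under the chosen metric."""
    f = stat_features(sample)
    return min(benchmarks, key=lambda name: metric(f, benchmarks[name]))
```

Once the distribution type is identified, the vector of limit values would be computed from the formulas specific to that type (normal, lognormal, or exponential).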
In the Yangtze River Delta in China, known for its intricate water network, achieving harmonious development between humans and nature in rural areas is imperative. However, the identification of water-net landscape characteristics and the relationship between these characteristics and rural sustainability remain unclear. This study bridges this gap by proposing a novel framework for investigating the relationship between landscape characteristics and rural sustainability from a typo-morphological perspective. Specifically, the influence of multilevel spatial characteristics of the rural landscape on sustainability was examined through regression analysis. First, multilevel metrics were introduced to delineate the rural landscape characteristics, including single and multiple landscape elements and landscape types, using deep learning methods for automatic classification. Subsequently, employing an improved entropy method, rural sustainability indicators were comprehensively quantified along the economic, social, and ecological dimensions. Finally, the ordinary least squares (OLS) model and two spatially varying coefficient models, geographically weighted regression (GWR) and multiscale geographically weighted regression (MGWR), were used to quantitatively analyze the relationship between the landscape characteristics and rural sustainability. Significant regression model performance was obtained, with adjusted R2 values of 0.33, 0.35, and 0.40 at the respective landscape characteristic levels. The adjusted R2 values for the GWR and MGWR models, which incorporated all of the landscape characteristic metrics, were 0.84 and 0.88, respectively. The results demonstrate that rural sustainability depends strongly on the proposed multilevel characteristics and exhibits spatial heterogeneity.
The findings of this study improve our understanding of the typo-morphological characteristics of the landscape and provide important planning and decision-making references for sustainable development in rural areas.
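The abstract does not specify how the entropy method was improved; the sketch below shows the standard entropy weight method it presumably builds on, which scores each sustainability indicator by how unevenly it is distributed across spatial units. The toy matrix is an assumption, and inputs are presumed already min-max normalized to positive values.

```python
import math

def entropy_weights(matrix):
    """matrix[i][j]: value of indicator j for spatial unit i (positive,
    already normalised). Returns one weight per indicator, summing to 1."""
    n, m = len(matrix), len(matrix[0])
    k = 1.0 / math.log(n)
    divergences = []
    for j in range(m):
        col = [row[j] for row in matrix]
        total = sum(col)
        p = [c / total for c in col]
        e = -k * sum(pi * math.log(pi) for pi in p if pi > 0)
        divergences.append(1.0 - e)  # higher = indicator varies more
    s = sum(divergences)
    return [d / s for d in divergences]
```

An indicator that takes the same value everywhere gets weight zero, since it cannot discriminate between more and less sustainable villages; the composite sustainability score is then the weighted sum of the normalized indicators.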
•Using typo-morphological metrics to delineate multilevel landscape characteristics.
•Spatial correlation study of landscape characteristics and rural sustainability.
•Employing deep learning methods for the classification of landscape types.
•Landscape characteristics have a multiscale impact on rural sustainability.
•Multilevel landscape characteristics can inform development policy in water-net regions.
Data imbalance is frequently encountered in biomedical applications. Resampling techniques can be used in binary classification to tackle this issue; however, such solutions are not desirable when the number of samples in the small class is limited. Moreover, the use of inadequate performance metrics, such as accuracy, leads to poor generalization because classifiers tend to predict the majority class. A good approach to this issue is to optimize performance metrics that are designed to handle data imbalance. The Matthews Correlation Coefficient (MCC) is widely used in bioinformatics as such a performance metric. We develop a new classifier based on the MCC metric to handle imbalanced data. We derive an optimal Bayes classifier for the MCC metric using an approach based on the Fréchet derivative, and we show that the proposed algorithm has the desirable theoretical property of consistency. Using simulated data, we verify the correctness of our optimality result by searching the space of all possible binary classifiers. The proposed classifier is evaluated on 64 datasets covering a wide range of data imbalance. We compare both classification performance and CPU efficiency for three classifiers: 1) the proposed algorithm (MCC-classifier), 2) the Bayes classifier with a default threshold (MCC-base), and 3) imbalanced SVM (SVM-imba). The experimental evaluation shows that MCC-classifier performs close to SVM-imba while being simpler and more efficient.
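The MCC itself, and a generic way of optimizing it by scanning decision thresholds, can be sketched as follows. This illustrates the metric being optimized, not the paper's Fréchet-derivative-based MCC-classifier; the threshold scan is a simple stand-in.

```python
import math

def mcc(tp, fp, tn, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts.
    Ranges from -1 to 1; 0 when any marginal is empty (by convention)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def best_mcc_threshold(scores, labels):
    """Scan candidate thresholds over predicted scores; return the
    threshold maximising MCC and the MCC achieved."""
    best_t, best_m = 0.5, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        m = mcc(tp, fp, tn, fn)
        if m > best_m:
            best_t, best_m = t, m
    return best_t, best_m
```

Unlike accuracy, the MCC involves all four confusion-matrix cells, so a classifier that simply predicts the majority class scores 0 rather than appearing deceptively good on imbalanced data.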