Handwritten signatures are biometric traits at the center of debate in the scientific community. Over the last 40 years, interest in signature studies has grown steadily, with automatic signature verification as its main reference application, as previously published reviews in 1989, 2000, and 2008 bear witness. Since then, over the last 10 years, handwritten signature technology has evolved strongly, and much research has focused on the possibility of applying systems based on handwritten signature analysis and processing to a multitude of new fields. After several years of haphazard growth of this research area, it is time to assess its current developments for their applicability in order to draw a structured way forward. This perspective reports a systematic review of the last 10 years of the literature on handwritten signatures with respect to the new scenario, focusing on the most promising domains of research and trying to elicit possible future research directions in this subject.
In document image analysis, segmentation is the task that identifies the regions of a document. The increasing number of applications of document analysis requires a good knowledge of the available technologies. This survey highlights the variety of the approaches that have been proposed for document image segmentation since 2008. It provides a clear typology of documents and of document image segmentation algorithms. We also discuss the technical limitations of these algorithms, the way they are evaluated, and the general trends of the community.
• Extensive review of the state of the art with a well-defined scope.
• Analysis of the algorithms from a scientific and an industrial point of view.
• Well-defined document and algorithm typologies.
• Discussion on the trends of the field and the evaluation of the algorithms.
Global Positioning Systems and other navigation systems that collect spatial data through an array of sensors carried by people and distributed in space have changed the way we navigate complex environments, such as cities. However, indoor navigation without reliable GPS signals relies on wall-mounted antennas, WiFi, or quantum sensors. Despite the gains of such technologies, these navigation systems dismiss the human wayfinding ability based on visual recognition of spatial features. In this paper, we propose a robust and parsimonious approach that uses a Deep Convolutional Neural Network (DCNN) to recognize and interpret interior space. DCNNs have achieved remarkable success in object and scene recognition. In this study we design and train a DCNN to classify pre-zoned indoor space and to recognize the learned space features from a single phone photo, with no need for additional assistive technology. We collected more than 600,000 images inside MIT campus buildings to train our DCNN model, and achieved 97.9% accuracy on the validation dataset and 81.7% accuracy on the test dataset with a model of fixed spatial scale. Furthermore, the recognition accuracy and spatial resolution can potentially be improved through a multiscale classification model. We identify the discriminative image regions with the Class Activation Mapping (CAM) technique, to observe how the model recognizes space and to interpret its behavior in an abstract way. By evaluating the results with a misclassification matrix, we investigate the visual spatial features of interior space in terms of visual similarity and visual distinctiveness, giving insights into interior design and into research on human indoor perception and wayfinding. The contribution of this paper is threefold. First, we propose a robust and parsimonious approach for indoor navigation using DCNN. Second, we demonstrate that DCNN also has a potential capability in space feature learning and recognition, even under severe appearance changes. Third, we introduce a DCNN-based approach to look into the visual similarity and visual distinctiveness of interior space.
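The abstract does not describe the network architecture, so the following is only a minimal sketch of the standard Class Activation Mapping computation, assuming a network that ends in global average pooling followed by a linear classifier. Under that assumption, the map for a class is the weighted sum of the last convolutional feature maps, using that class's classifier weights. The function names and the toy 2x2 feature maps are hypothetical and are not taken from the paper.

```python
from typing import List

Matrix = List[List[float]]


def class_activation_map(feature_maps: List[Matrix],
                         class_weights: List[float]) -> Matrix:
    """Weighted sum of the last conv feature maps for one class.

    Assumes one classifier weight per feature map (the usual setup
    after global average pooling). All names here are illustrative.
    """
    if len(feature_maps) != len(class_weights):
        raise ValueError("one weight per feature map is required")
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * fmap[y][x]
    return cam


def normalize(cam: Matrix) -> Matrix:
    """Rescale the map to [0, 1] so it can be overlaid as a heat map."""
    lo = min(min(row) for row in cam)
    hi = max(max(row) for row in cam)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in cam]
    return [[(v - lo) / span for v in row] for row in cam]


# Toy example: two 2x2 feature maps and hypothetical class weights.
toy_maps = [[[1.0, 0.0], [0.0, 0.0]],
            [[0.0, 0.0], [0.0, 2.0]]]
toy_weights = [0.5, 1.0]
heat = normalize(class_activation_map(toy_maps, toy_weights))
```

In practice the normalized map would be upsampled to the input image size before being overlaid on the photo; that step is omitted here.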
• Participants read multiple documents with or without media multitasking.
• Participants summarized main ideas of paragraphs or reread paragraphs.
• Media multitasking negatively affected processing and comprehension.
• Main idea summarization mitigated effects of multitasking on comprehension.
Media multitasking refers to simultaneous engagement in two activities, or the act of switching between multiple activities, of which at least one is a media activity. Based on this definition, we had 134 Norwegian undergraduates read four partly conflicting documents on sun exposure and health on a computer in order to write a report on the issue, with half of the participants (randomly assigned) receiving and reading short, authentic social media messages on a smartphone while reading the documents, and the other half reading the documents without being sent any such messages. Further, we manipulated what participants did after reading each document paragraph, with half of the participants (randomly assigned) briefly summarizing the main idea of each paragraph in writing, and the other half just rereading each paragraph. Participants’ integrative processing (i.e., cross-text elaboration strategies) was assessed with a task-specific self-report measure immediately after reading all four documents, and their comprehension of the documents was assessed by analyzing their written reports in terms of their ability to elaborate and integrate information within and across the perspectives discussed in the documents. Results indicated that social media multitasking on a smartphone disturbed both the integrative processing and the integrated understanding of the documents, with main idea summarization mitigating or counteracting these negative effects of multitasking. However, when controlling for working memory, reading comprehension skills, and prior knowledge, integrative processing was not found to mediate the effect of multitasking on integrated understanding of the documents. Limitations of the present study and directions for future research are discussed.
Text Mining in Big Data Analytics Hassani, Hossein; Beneki, Christina; Unger, Stephan ...
Big data and cognitive computing, 03/2020, Volume 4, Issue 1
Journal Article
Peer reviewed
Open access
Text mining in big data analytics is emerging as a powerful tool for harnessing the power of unstructured textual data by analyzing it to extract new knowledge and to identify significant patterns and correlations hidden in the data. This study seeks to determine the state of text mining research by examining the developments within published literature over recent years and to provide valuable insights for practitioners and researchers on the predominant trends, methods, and applications of text mining research. Accordingly, more than 200 academic journal articles on the subject are included and discussed in this review; the state-of-the-art text mining approaches and techniques used for analyzing transcripts and speeches, meeting transcripts, and academic journal articles, as well as websites, emails, blogs, and social media platforms, across a broad range of application areas are also investigated. Additionally, the benefits and challenges related to text mining are briefly outlined.
The organization of scientific papers typically follows a standardized pattern, the well‐known IMRaD structure (introduction, methods, results, and discussion). Using the full text of 45,000 papers published in the PLoS series of journals as a case study, this paper investigates, from the viewpoint of bibliometrics, how references are distributed along the structure of scientific papers as well as the age of these cited references. Once the sections of articles are realigned to follow the IMRaD sequence, the position of cited references along the text of articles is invariant across all PLoS journals, with the introduction and discussion accounting for most of the references. The paper also provides evidence that the age of cited references varies by section, with older references being found in the methods and more recent references in the discussion. These results provide insight into the different roles citations have in the scholarly communication process.
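As an illustration of the kind of measurement described above, the sketch below computes the share of in-text citations per IMRaD section and a histogram of citation positions along the normalized length of a text. It is not the authors' pipeline: the input format (a section label and a character offset per citation), the function names, and the sample values are all assumptions made for this example.

```python
from collections import Counter
from typing import Dict, List

IMRAD = ("introduction", "methods", "results", "discussion")


def section_shares(citation_sections: List[str]) -> Dict[str, float]:
    """Fraction of in-text citations that fall in each IMRaD section."""
    counts = Counter(citation_sections)
    total = sum(counts.values())
    return {section: counts[section] / total for section in IMRAD}


def position_histogram(offsets: List[int], text_length: int,
                       bins: int = 10) -> List[int]:
    """Count citations per bin along the normalized length of the text."""
    hist = [0] * bins
    for offset in offsets:
        # Clamp so that a citation at the very end lands in the last bin.
        index = min(int(offset / text_length * bins), bins - 1)
        hist[index] += 1
    return hist


# Hypothetical paper with four in-text citations.
shares = section_shares(
    ["introduction", "introduction", "discussion", "methods"])
hist = position_histogram([0, 50, 99, 100], text_length=100)
```

Normalizing positions by text length is what makes papers of different lengths comparable; comparing journals would then amount to averaging such histograms per journal.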
In recent years, archive services have undertaken vast digitization campaigns in order to preserve their document collections. These scanned documents are then available as images, that is, matrices of pixels. Our objective is to automatically recognize the content of these images in order to extract interpreted information from them. This is known as automatic document image analysis.
Biomedical research has become an essential part of human life. However, finding trends in health-related research topics within a repository is challenging. In this study, we implemented topic modelling with the LDA method to analyze biomedical research trends. Topic modelling was carried out on 7000 articles from PubMed, which were preprocessed with lowercasing, punctuation removal, tokenization, stop-word removal, and lemmatization. For validation, LDA was run on 75% and 100% of the corpus. For each corpus size, the alpha and beta parameters were varied over 0.01, 0.31, 0.61, 0.91, symmetric, and asymmetric. With 75% of the corpus, the optimal number of topics is 7, with a coherence value of 0.52, whereas with 100% of the corpus, the optimal number of topics is 10, with a coherence value of 0.51. In addition, the topic modelling results show several trending topics, including disease diagnosis, patient care, and genetic or cell research. When biomedical topics were classified into seven categories with the Random Forest algorithm, the best accuracy, precision, and recall values obtained were 85.57%, 87.36%, and 87.58%, respectively. The results of this study suggest that topic modelling using LDA can be used to identify trends in biomedical research with high accuracy. This information can help stakeholders make informed decisions about the direction of future research.
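The preprocessing steps listed above can be sketched with the Python standard library alone. This is a minimal illustration, not the authors' code: the stop-word list is a tiny hypothetical one, the sample sentence is invented, and lemmatization is left out because it needs an external linguistic resource that the abstract does not name.

```python
import string
from typing import List, Set

# Tiny illustrative stop-word list (an assumption, not the one used in the study).
STOP_WORDS: Set[str] = {"the", "of", "and", "in", "a", "is", "for", "to", "with"}

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(text: str, stop_words: Set[str] = STOP_WORDS) -> List[str]:
    """Lowercase, strip punctuation, tokenize on whitespace, drop stop words."""
    lowered = text.lower()
    no_punct = lowered.translate(_PUNCT_TABLE)
    tokens = no_punct.split()
    return [token for token in tokens if token not in stop_words]


tokens = preprocess("The diagnosis of disease, and patient care.")
```

The resulting token lists would then be turned into a bag-of-words corpus and passed to an LDA implementation, with the number of topics chosen by comparing coherence scores.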
In this research, the creation of highly specialized chatbots is presented. The influence of multi-threaded search across subject areas is noted. The use of related subject areas in chatbot text analysis is defined. The advantages of using multiple related subject areas are shown using the example of an intelligent chatbot.
The problem of projecting multidimensional data into lower dimensions has been pursued by many researchers due to its potential application to data analyses of various kinds. This paper presents a novel multidimensional projection technique based on least square approximations. The approximations compute the coordinates of a set of projected points based on the coordinates of a reduced number of control points with defined geometry. We name the technique least square projections (LSP). From an initial projection of the control points, LSP defines the positioning of their neighboring points through a numerical solution that aims at preserving a similarity relationship between the points given by a metric in mD. In order to perform the projection, a small number of distance calculations are necessary, and no repositioning of the points is required to obtain a final solution with satisfactory precision. The results show the capability of the technique to form groups of points by degree of similarity in 2D. We illustrate that capability through its application to mapping collections of textual documents from varied sources, a strategic yet difficult application. LSP is faster and more accurate than other existing high-quality methods, particularly where it was mostly tested, that is, for mapping text sets.
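The abstract does not state the equations, so the following is only a sketch of how such a least-squares system is commonly set up. It assumes that each projected point should lie at the centroid of its nearest neighbors and that the control points enter as extra rows with known coordinates. The symbols $V_i$, $k_i$, $L$, $C$, $A$, and $b$ are notation introduced here for illustration.

```latex
% Assumed notation: p_i are the original points in mD, x_i their 2D projections,
% V_i is the set of k_i nearest neighbors of p_i, and x_c stacks the already
% projected coordinates of the control points.
\begin{align}
  x_i - \frac{1}{k_i} \sum_{p_j \in V_i} x_j &= 0, \qquad i = 1, \dots, n \\
  L x &= 0, \qquad
  L_{ij} =
  \begin{cases}
    1 & \text{if } i = j \\
    -1/k_i & \text{if } p_j \in V_i \\
    0 & \text{otherwise}
  \end{cases} \\
  A &= \begin{pmatrix} L \\ C \end{pmatrix}, \qquad
  b = \begin{pmatrix} 0 \\ x_c \end{pmatrix}, \qquad
  C_{ij} =
  \begin{cases}
    1 & \text{if } p_j \text{ is the } i\text{-th control point} \\
    0 & \text{otherwise}
  \end{cases} \\
  \hat{x} &= \arg\min_x \lVert A x - b \rVert^2 = (A^\top A)^{-1} A^\top b
\end{align}
```

The system is solved once per output coordinate. Because $L$ is sparse when each point has few neighbors, a formulation of this kind needs only a small number of distance calculations and no iterative repositioning, which matches the properties claimed in the abstract.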