We introduce a novel exploratory technique, termed biarchetype analysis, which extends archetype analysis to simultaneously identify archetypes of both observations and features. This unsupervised machine learning tool aims to represent observations and features through instances of pure types, or biarchetypes, which are easily interpretable because they embody mixtures of observations and features. In turn, the observations and features are expressed as mixtures of the biarchetypes, which makes the structure of the data easier to understand. We propose an algorithm to solve biarchetype analysis. Although clustering is not the primary aim of this technique, biarchetype analysis is demonstrated to offer significant advantages over biclustering methods, particularly in terms of interpretability: biarchetypes are extreme instances, in contrast to the centroids produced by biclustering, which inherently enhances human comprehension. The application of biarchetype analysis across various machine learning challenges underscores its value, and both the source code and examples are readily accessible in R and Python at https://github.com/aleixalcacer/JA-BIAA.
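As a rough illustration of the idea (not the authors' algorithm), biarchetype analysis can be viewed as factorizing a data matrix X as X ≈ A·B·X·C·D, where the row-stochastic matrices B and C build biarchetypes as mixtures of observations and features, and A and D express observations and features as mixtures of those biarchetypes. The sketch below implements a naive alternating projected-gradient scheme under that formulation; the function names, the cheap simplex surrogate, and the greedy step-size heuristic are our own assumptions, not the JA-BIAA implementation.

```python
import numpy as np

def _simplex_rows(M):
    """Clip to nonnegative and renormalise each row to sum to 1
    (a cheap surrogate for exact simplex projection)."""
    M = np.maximum(M, 1e-12)
    return M / M.sum(axis=1, keepdims=True)

def biarchetypes_fit(X, k, l, steps=200, lr=0.01, seed=0):
    """Toy alternating minimisation of ||X - A B X C D||_F^2 with
    row-stochastic A (n x k), B (k x n), C (m x l), D (l x m)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    A = _simplex_rows(rng.random((n, k)))
    B = _simplex_rows(rng.random((k, n)))
    C = _simplex_rows(rng.random((m, l)))
    D = _simplex_rows(rng.random((l, m)))

    def err(A, B, C, D):
        return np.linalg.norm(X - A @ B @ X @ C @ D) ** 2

    best = err(A, B, C, D)
    for _ in range(steps):
        R = A @ B @ X @ C @ D - X            # residual
        # gradients of the squared Frobenius error w.r.t. each factor
        gA = 2 * R @ (B @ X @ C @ D).T
        gB = 2 * A.T @ R @ (X @ C @ D).T
        gC = 2 * (A @ B @ X).T @ R @ D.T
        gD = 2 * (A @ B @ X @ C).T @ R
        cand = [_simplex_rows(M - lr * g)
                for M, g in ((A, gA), (B, gB), (C, gC), (D, gD))]
        e = err(*cand)
        if e < best:                         # greedy accept, else shrink step
            A, B, C, D = cand
            best = e
        else:
            lr *= 0.5
    return A, B, C, D, best
```

The rows of B pick out extreme observations (and those of C extreme features), which is what makes the resulting biarchetypes easier to read than biclustering centroids.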
The prospects of using a Reconfigurable Intelligent Surface (RIS) to aid wireless communication systems have recently received much attention from academia and industry. Most papers make theoretical studies based on elementary models, while prototyping of RIS-aided wireless communication and real-world field trials remain scarce. In this paper, we describe a new RIS prototype consisting of 1100 controllable elements operating in the 5.8 GHz band. We propose an efficient algorithm for configuring the RIS over the air by exploiting the geometrical properties of the array and a practical receiver-RIS feedback link. In our indoor test, where the transmitter and receiver are separated by a 30 cm thick concrete wall, our RIS prototype provides a 26 dB power gain compared to the baseline case in which the RIS is replaced by a copper plate. A 27 dB power gain was observed in the short-distance outdoor measurement. We also carried out long-distance measurements and successfully transmitted a 32 Mbps data stream over 500 m. A 1080p video was live-streamed, and it played smoothly only when the RIS was utilized. The power consumption of the RIS is around 1 W. Our paper provides vivid proof that the RIS is a very promising technology for future wireless communications.
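The paper's over-the-air configuration algorithm relies on a feedback link; the underlying geometric principle, however, is classic co-phasing: choose each element's phase shift so that all Tx-element-Rx paths add coherently at the receiver. The toy sketch below (our own illustration under an ideal narrowband, unit-amplitude path model, not the prototype's algorithm) shows why co-phasing an N-element surface yields an N² power gain over incoherent reflection.

```python
import numpy as np

C = 3e8               # speed of light, m/s
FREQ = 5.8e9          # carrier frequency of the prototype's band
LAM = C / FREQ        # wavelength
K = 2 * np.pi / LAM   # wavenumber

def ris_phases(elem_xyz, tx, rx):
    """Per-element phase shift that co-phases every Tx->element->Rx path.
    elem_xyz: (N, 3) element positions; tx, rx: (3,) terminal positions."""
    d_tx = np.linalg.norm(elem_xyz - tx, axis=1)
    d_rx = np.linalg.norm(elem_xyz - rx, axis=1)
    # pick theta_n so that exp(-1j*K*(d_tx+d_rx)) * exp(1j*theta_n)
    # has the same phase for every element
    return np.mod(K * (d_tx + d_rx), 2 * np.pi)

def received_power(elem_xyz, tx, rx, phases):
    """Narrowband power of the coherent sum of all element paths,
    with per-element path amplitudes normalised to 1."""
    d = (np.linalg.norm(elem_xyz - tx, axis=1)
         + np.linalg.norm(elem_xyz - rx, axis=1))
    field = np.sum(np.exp(1j * (phases - K * d)))
    return np.abs(field) ** 2
```

With co-phased elements every term in the sum has phase zero, so the power scales as N²; random phases leave the terms mostly cancelling.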
New malware variants appear rapidly and continuously, increasing the difficulty of classifying malware into the correct families. This brings two challenges for malware classification. The first is the scarce-samples problem: collecting a large volume of samples from a newly detected malware family to train a classifier can be extremely hard, and training on a small number of samples inevitably leads to overfitting. The second is the dynamic recognition problem: most widely adopted classifiers are trained on predefined known malware families and lack the ability to incrementally identify novel families, requiring retraining from scratch. To tackle these challenges, in this study we employ a meta-learning-based few-shot learning (FSL) technique and propose a new few-shot malware classification model called SIMPLE (Supervised Infinite Mixture Prototypes LEarning). With the help of meta-learning, SIMPLE is trained on predefined malware families and retains the ability to classify novel malware families that it has never encountered. Furthermore, the prior knowledge learned via meta-learning helps prevent the overfitting caused by scarce samples. SIMPLE introduces multi-prototype modeling, generating multiple prototypes for each family to enhance generalization ability, based on API invocation sequences from dynamic analysis. This is inspired by the observation that behaviors within the same family often match multiple subpatterns and follow a multimodal data distribution. In extensive experiments, SIMPLE achieves state-of-the-art few-shot malware classification performance and outperforms all the baselines. With only 5 samples per family, SIMPLE reaches 90% accuracy on the 5-way classification task over novel malware families, substantially alleviating the problems of scarce samples and dynamic recognition. We also analyze why the multi-prototype modeling and fast-adaptation features are effective, providing more interpretability for the results.
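To make the multi-prototype idea concrete, here is a minimal, hypothetical sketch (our own illustration, not the SIMPLE implementation): each family's support embeddings are split into several sub-pattern centroids with a tiny k-means, and a query is assigned to the family owning its nearest prototype. A single-centroid prototype network would average away multimodal families; multiple prototypes per family preserve the subpatterns.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Tiny k-means used to split one family's support embeddings
    into several sub-pattern centroids (prototypes)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(X[:, None] - centers[None], axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            pts = X[assign == j]
            if len(pts):
                centers[j] = pts.mean(axis=0)
    return centers

def multi_prototype_classify(query, support, labels, protos_per_class=2):
    """Assign each query to the family of its nearest prototype,
    where each family owns several prototypes instead of one centroid."""
    classes = np.unique(labels)
    protos, proto_cls = [], []
    for c in classes:
        protos.append(kmeans(support[labels == c], protos_per_class,
                             seed=int(c)))
        proto_cls += [c] * protos_per_class
    protos = np.vstack(protos)
    proto_cls = np.array(proto_cls)
    d = np.linalg.norm(query[:, None] - protos[None], axis=2)
    return proto_cls[d.argmin(axis=1)]
```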
We tackle the persistent problem of people from specific demographic groups (e.g., women) being undervalued in professional contexts in which traits associated with their group do not align with the traits perceived to be essential for success (the professional prototype). We introduce the concept of balancing professional prototypes such that group membership becomes irrelevant to determining an individual’s prototypicality. Using a novel technique called prototype inversion, we emphasize the importance of professional traits typically associated with an underrepresented group, without dismissing those associated with the currently prototypical group. By balancing the prototype in this way, it becomes easier to recognize the professional potential of members of underrepresented groups, without incurring backlash from the currently prototypical group. We conducted a full-cycle research project to demonstrate the effectiveness of this strategy in the extreme context of women in firefighting using qualitative and quantitative methods and participants from both the laboratory and the field.
Among various approaches to few-shot named entity recognition (NER), two-stage models based on prototype networks are widely used. However, these methods cannot fully utilize the semantic information in entity labels and overly rely on entity type prototype vectors in distance calculation, resulting in poor generalization ability. To address these issues, this paper proposes a few-shot named entity recognition method based on label semantic information awareness. The method consists of a two-stage process: entity span detection and entity type classification. When constructing entity type prototype vectors, the semantic information associated with the corresponding entity types is considered and fused with the prototype vectors through a dimension transformation layer. During entity recognition on new samples, entity type positive and negative samples are combined with the entity type prototype vectors to form entity type triplets, and the samples are classified based on their distance to these triplets.
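A minimal sketch of the fusion step, under our own simplifying assumptions (a plain linear map standing in for the dimension transformation layer, Euclidean nearest-prototype classification, and no triplet construction):

```python
import numpy as np

def fuse_prototype(span_embs, label_emb, W, b):
    """Build one entity type's prototype by fusing the mean span
    embedding with the label-name embedding through a linear
    'dimension transformation' layer (2d inputs -> d outputs)."""
    proto = span_embs.mean(axis=0)                 # vanilla prototype
    fused_in = np.concatenate([proto, label_emb])  # append label semantics
    return W @ fused_in + b

def classify_spans(queries, prototypes):
    """Nearest-prototype (Euclidean) entity-type assignment."""
    d = np.linalg.norm(queries[:, None] - prototypes[None], axis=2)
    return d.argmin(axis=1)
```

In a trained model W and b would be learned so that label semantics shift the prototypes toward type-discriminative directions; here they are free parameters for illustration.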
Directed motion at the nanoscale is a central attribute of life, and chemically driven motor proteins are nature’s choice to accomplish it. Motivated and inspired by such bionanodevices, in the past few decades chemists have developed artificial prototypes of molecular motors, namely, multicomponent synthetic species that exhibit directionally controlled, stimuli-induced movements of their parts. In this context, photonic and redox stimuli represent highly appealing modes of activation, particularly from a technological viewpoint. Here we describe the evolution of the field of photo- and redox-driven artificial molecular motors, and we provide a comprehensive review of the work published in the past 5 years. After an analysis of the general principles that govern controlled and directed movement at the molecular scale, we describe the fundamental photochemical and redox processes that can enable its realization. The main classes of light- and redox-driven molecular motors are illustrated, with a particular focus on recent designs, and a thorough description of the functions performed by these kinds of devices according to literature reports is presented. Limitations, challenges, and future perspectives of the field are critically discussed.
We developed a first prototype of an end-to-end machine-learning-based simulation framework for arbitrary analysis ntuples at the CMS experiment. This framework, called FlashSim, is capable of simulating a wide variety of physical objects with good performance on 1D distributions, correlations, and the desired physical content when compared to the current state-of-the-art simulation. Current methods are based on Monte Carlo (MC) techniques, are computationally expensive, and require a long time to compute. Our prototype was trained to replicate the samples produced by state-of-the-art methods using the Normalizing Flows algorithm, and it showed compatible results with a speedup of several orders of magnitude. This type of approach opens the way to general, analysis-agnostic simulation frameworks that may be able to tackle the simulation challenges of the HL-LHC and future collaborations.
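The core mechanism behind Normalizing Flows is a chain of exactly invertible transforms with tractable Jacobian log-determinants, which allows training by maximum likelihood and fast sampling. The sketch below (a generic affine coupling layer, our own illustration and not the FlashSim architecture) shows one such transform and its exact inverse; in a real flow, s_fn and t_fn are neural networks conditioned on half of the input.

```python
import numpy as np

def coupling_forward(x, s_fn, t_fn):
    """One affine coupling layer, x -> z: the first half of x
    conditions an affine map applied to the second half.
    Returns z and log|det J|, which is just the sum of log-scales."""
    x1, x2 = np.split(x, 2, axis=-1)
    s, t = s_fn(x1), t_fn(x1)
    z2 = x2 * np.exp(s) + t
    return np.concatenate([x1, z2], axis=-1), s.sum(axis=-1)

def coupling_inverse(z, s_fn, t_fn):
    """Exact inverse: s_fn and t_fn are re-evaluated on the
    untouched half, so no numerical inversion is needed."""
    z1, z2 = np.split(z, 2, axis=-1)
    s, t = s_fn(z1), t_fn(z1)
    x2 = (z2 - t) * np.exp(-s)
    return np.concatenate([z1, x2], axis=-1)
```

Sampling a simulated event then amounts to drawing Gaussian noise and running the inverse chain, which is what makes this approach orders of magnitude faster than MC simulation.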