For a target task where labeled data are unavailable, domain adaptation can transfer a learner from a different source domain. Previous deep domain adaptation methods mainly learn a global domain shift, i.e., they align the global source and target distributions without considering the relationships between two subdomains within the same category of different domains. Because they do not capture this fine-grained information, their transfer learning performance is unsatisfying. Recently, more and more researchers have paid attention to subdomain adaptation, which focuses on accurately aligning the distributions of the relevant subdomains. However, most of these methods are adversarial, contain several loss functions, and converge slowly. Based on this, we present a deep subdomain adaptation network (DSAN) that learns a transfer network by aligning the relevant subdomain distributions of domain-specific layer activations across different domains based on a local maximum mean discrepancy (LMMD). DSAN is simple but effective: it does not need adversarial training and converges fast. The adaptation can be achieved easily with most feedforward network models by extending them with an LMMD loss, which can be trained efficiently via backpropagation. Experiments demonstrate that DSAN achieves remarkable results on both object recognition tasks and digit classification tasks. Our code will be available at https://github.com/easezyc/deep-transfer-learning .
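To make the idea of a class-wise (local) discrepancy concrete, the following is a minimal NumPy sketch of one way an LMMD-style quantity could be computed. It is not the authors' implementation: the single-bandwidth Gaussian kernel, the use of soft target predictions as class weights, and the function names `gaussian_kernel`, `class_weights` and `lmmd` are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma**2))


def class_weights(probs):
    """Normalize each class column so that its weights sum to 1.

    probs has shape (n_samples, n_classes): one-hot labels for the source,
    predicted class probabilities for the target (an assumption here).
    """
    probs = np.asarray(probs, dtype=float)
    col_sum = probs.sum(axis=0, keepdims=True)
    return np.divide(probs, col_sum, out=np.zeros_like(probs), where=col_sum > 0)


def lmmd(xs, ys_onehot, xt, yt_prob, sigma=1.0):
    """Average, over classes, of a weighted squared MMD between the domains."""
    ws, wt = class_weights(ys_onehot), class_weights(yt_prob)
    k_ss = gaussian_kernel(xs, xs, sigma)
    k_tt = gaussian_kernel(xt, xt, sigma)
    k_st = gaussian_kernel(xs, xt, sigma)
    n_classes = ws.shape[1]
    total = 0.0
    for c in range(n_classes):
        a, b = ws[:, c], wt[:, c]
        # Squared RKHS distance between the weighted class-c means.
        total += a @ k_ss @ a + b @ k_tt @ b - 2.0 * (a @ k_st @ b)
    return total / n_classes
```

In this sketch the value is zero when the class-conditional samples of the two domains coincide and grows as they move apart, which is the quantity a subdomain-alignment loss would try to minimize.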
We propose a variant of residual networks (ResNets) for galaxy morphology classification. The variant, together with other popular convolutional neural networks (CNNs), is applied to a sample of 28790 galaxy images from the Galaxy Zoo 2 dataset to classify galaxies into five classes, i.e., completely round smooth, in-between smooth (between completely round and cigar-shaped), cigar-shaped smooth, edge-on and spiral. Various metrics, such as accuracy, precision, recall, F1 value and AUC, show that the proposed network achieves state-of-the-art classification performance compared with the other networks, namely, Dieleman, AlexNet, VGG, Inception and ResNets. The overall classification accuracy of our network on the testing set is 95.2083%, and the accuracy of each type is as follows: completely round, 96.6785%; in-between, 94.4238%; cigar-shaped, 58.6207%; edge-on, 94.3590% and spiral, 97.6953%. Our model can be applied to large-scale galaxy classification in forthcoming surveys, such as the Large Synoptic Survey Telescope (LSST) survey.
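As an illustration of how the overall and per-class figures quoted above relate to each other, the sketch below derives accuracy, precision, recall and F1 from a confusion matrix. It assumes rows are true classes and columns are predicted classes, and it assumes that the "accuracy of each type" means the fraction of each true class classified correctly (which then coincides with per-class recall); neither convention is stated in the abstract. The function name `per_class_metrics` is hypothetical.

```python
import numpy as np


def _safe_div(num, den):
    """Element-wise division that returns 0 where the denominator is 0."""
    num = np.asarray(num, dtype=float)
    den = np.asarray(den, dtype=float)
    return np.divide(num, den, out=np.zeros_like(num), where=den > 0)


def per_class_metrics(confusion):
    """Compute metrics from a confusion matrix (rows: true, columns: predicted)."""
    cm = np.asarray(confusion, dtype=float)
    true_pos = np.diag(cm)
    recall = _safe_div(true_pos, cm.sum(axis=1))      # per true class
    precision = _safe_div(true_pos, cm.sum(axis=0))   # per predicted class
    f1 = _safe_div(2.0 * precision * recall, precision + recall)
    accuracy = true_pos.sum() / cm.sum()              # overall accuracy
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

A class with few test samples can have a low per-class value while barely moving the overall accuracy, which is one way a figure such as the cigar-shaped result can coexist with a high overall score.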
Nowadays, trendy research in biomedical sciences attaches the term 'precision' to medicine and public health, alongside companion words like big data, data science, and deep learning. Technological advancements permit the collection and merging of large heterogeneous datasets from different sources, from genome sequences to social media posts or from electronic health records to wearables. Additionally, complex algorithms supported by high-performance computing allow one to transform these large datasets into knowledge. Despite such progress, many barriers still exist against achieving precision medicine and precision public health interventions for the benefit of the individual and the population.
The present work focuses on analyzing both the technical and societal hurdles related to the development of prediction models of health risks, diagnoses and outcomes from integrated biomedical databases. Methodological challenges that need to be addressed include improving semantics of study designs: medical record data are inherently biased, and even the most advanced deep learning's denoising autoencoders cannot overcome the bias if not handled a priori by design. Societal challenges to face include evaluation of ethically actionable risk factors at the individual and population level; for instance, usage of gender, race, or ethnicity as risk modifiers, not as biological variables, could be replaced by modifiable environmental proxies such as lifestyle and dietary habits, household income, or access to educational resources.
Data science for precision medicine and public health warrants an informatics-oriented formalization of the study design and interoperability throughout all levels of the knowledge inference process, from the research semantics, to model development, and ultimately to implementation.
The extensive usage of fossil fuels has caused significant environmental pollution, climate change and energy crises. The significant advantages of hydrogen, such as cleanliness, high efficiency, and a wide range of sources, make it quite promising. However, hydrogen tends to damage the materials that contain it, which may lead to leakage. Leaking high-pressure hydrogen is highly susceptible to spontaneous combustion due to its combustion characteristics, which may cause jet fire or explosion accidents, resulting in serious casualties and property damage. This paper presents a detailed review of the research progress on hydrogen leak diffusion characteristics, leak spontaneous combustion mechanisms and material hydrogen damage mechanisms from the perspectives of theoretical analysis, experiments and numerical simulations. The review points out that although a large number of research results have been obtained on the safety characteristics of hydrogen, some deficiencies and limitations remain. Further research topics are clarified, such as further optimizing the kinetic mechanism of the high-pressure hydrogen leakage reaction and the turbulence model, exploring the expansion and dilution law of hydrogen clouds after liquid hydrogen flooding, further studying the spontaneous combustion mechanism of leaked hydrogen and the interaction between mechanisms, and investigating the synergistic damage effect of hydrogen and other components on materials. The leakage spontaneous combustion process in open space, the development process of the bidirectional effect of hydrogen jet fuel and crack growth under the impact of high-pressure hydrogen jet fuel on the material may need to be explored next.
Visual communication technology is widely used in the field of design. With the continuous advancement of science and technology, multimedia technology in China has also made great progress. The application of multimedia technology to visual communication technology is of great significance. Only by combining the two ingeniously and actively can the long-term development and vitality of visual communication technology be ensured. Based on this, this article briefly describes the concepts of multimedia technology and visual communication, analyzes the impact of multimedia on visual communication technology, and proposes innovative strategies for multimedia and visual communication technology for reference.
There is an increasing interest in developing artificial intelligence (AI) systems to process and interpret electronic health records (EHRs). Natural language processing (NLP) powered by pretrained language models is the key technology for medical AI systems utilizing clinical narratives. However, there are few clinical language models, and the largest one trained in the clinical domain is comparatively small at 110 million parameters (compared with billions of parameters in the general domain). It is not clear how large clinical language models with billions of parameters can help medical AI systems utilize unstructured EHRs. In this study, we develop from scratch a large clinical language model, GatorTron, using >90 billion words of text (including >82 billion words of de-identified clinical text) and systematically evaluate it on five clinical NLP tasks: clinical concept extraction, medical relation extraction, semantic textual similarity, natural language inference (NLI), and medical question answering (MQA). We examine how (1) scaling up the number of parameters and (2) scaling up the size of the training data could benefit these NLP tasks. GatorTron models scale up the clinical language model from 110 million to 8.9 billion parameters and improve five clinical NLP tasks (e.g., 9.6% and 9.5% improvement in accuracy for NLI and MQA), which can be applied to medical AI systems to improve healthcare delivery. The GatorTron models are publicly available at: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara/models/gatortron_og .
Irrigation water management and real-time monitoring of crop water stress status can enhance agricultural water use efficiency, crop yield, and crop quality. The aim of this study was to simplify the calculation of the crop water stress index (CWSI) and improve its diagnostic accuracy. Simplified CWSI (CWSIsi) was used to diagnose water stress for cotton that had received four different irrigation treatments (no stress, mild stress, moderate stress, and severe stress) at the flowering and boll stage. High-resolution thermal infrared and multispectral images were taken using an unmanned aerial vehicle remote sensing platform at midday (local time 13:00), and stomatal conductance (gs), transpiration rate (tr), and cotton root zone soil volumetric water content (θ) were concurrently measured. The soil background pixels of the thermal images were eliminated using Canny edge detection to obtain a unimodal histogram of pure canopy temperatures. Then the wet reference temperature (Twet), dry reference temperature (Tdry), and mean canopy temperature (Tl) were obtained from the canopy temperature histogram to calculate CWSIsi. The other two methods of CWSI evaluation were empirical CWSI (CWSIe), in which the temperature parameters were determined by measuring natural reference cotton leaves, and statistical CWSI (CWSIs), in which Twet was the mean of the lowest 5% of canopy temperatures and Tdry was the air temperature (Tair) + 5 °C. Compared with CWSIe, CWSIs and spectral indices (NDVI, TCARI, OSAVI, TCARI/OSAVI), CWSIsi has a higher correlation with gs (R² = 0.660) and tr (R² = 0.592). The correlation coefficient (R) between θ (0–45 cm) and CWSIsi is also high (0.812). The plotted high-resolution map of CWSIsi shows how cotton water stress is distributed across the different irrigation treatments. These findings demonstrate that CWSIsi, which only requires parameters from a canopy temperature histogram, may potentially be applied to precision irrigation management.
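The statistical variant described above lends itself to a short worked example. The sketch below assumes the commonly used form CWSI = (Tl − Twet) / (Tdry − Twet), which the abstract does not spell out, and implements only the CWSIs reference temperatures as described (Twet as the mean of the lowest 5% of canopy temperatures, Tdry as Tair + 5 °C). The function names and the rounding of the 5% sample count are illustrative choices; how CWSIsi picks its references from the histogram is not specified here and is not reproduced.

```python
import numpy as np


def cwsi(t_canopy, t_wet, t_dry):
    """Crop water stress index in its common normalized form (assumed)."""
    return (t_canopy - t_wet) / (t_dry - t_wet)


def cwsi_statistical(canopy_temps, t_air, low_fraction=0.05, dry_offset=5.0):
    """CWSIs-style index from soil-free canopy pixel temperatures (deg C)."""
    temps = np.sort(np.asarray(canopy_temps, dtype=float))
    # Number of pixels in the coldest fraction; at least one pixel is used.
    n_low = max(1, int(np.ceil(low_fraction * temps.size)))
    t_wet = temps[:n_low].mean()   # mean of the lowest 5% of temperatures
    t_dry = t_air + dry_offset     # air temperature + 5 deg C
    return cwsi(temps.mean(), t_wet, t_dry)
```

Under this form, a canopy at the wet reference gives 0 and a canopy at the dry reference gives 1, so larger values indicate stronger water stress.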
Alzheimer’s Disease (AD) pathology has been increasingly explored through single-cell and single-nucleus RNA-sequencing (scRNA-seq & snRNA-seq) and spatial transcriptomics (ST). However, the surge in data demands a comprehensive, user-friendly repository. Addressing this, we introduce a single-cell and spatial RNA-seq database for Alzheimer’s disease (ssREAD). It offers a broader spectrum of AD-related datasets, an optimized analytical pipeline, and improved usability. The database encompasses 1,053 samples (277 integrated datasets) from 67 AD-related scRNA-seq & snRNA-seq studies, totaling 7,332,202 cells. Additionally, it archives 381 ST datasets from 18 human and mouse brain studies. Each dataset is annotated with details such as species, gender, brain region, disease/control status, age, and AD Braak stages. ssREAD also provides an analysis suite for cell clustering, identification of differentially expressed and spatially variable genes, cell-type-specific marker genes and regulons, and spot deconvolution for integrative analysis. ssREAD is freely available at https://bmblx.bmi.osumc.edu/ssread/ .
Outdoor air contamination is frequently observed in northeast China in winter, when severely cold days are common, and has drawn considerable attention; understanding the dynamic characteristics of indoor air quality in this area is essential for occupants' ventilation strategies. Herein, using a spectrophotometric method, a GC-MS (gas chromatography-mass spectrometry) method, a TSI DustTrak particle tester and a Telaire 7001 CO2 tester, we investigated the changes in concentration of formaldehyde (HCHO), volatile organic compounds (VOCs), PM2.5 and CO2 in 21 houses over 4 seasons under 2 conditions (airtight and natural ventilation). Moreover, 6 houses were selected for the installation of convenient sensors for long-term monitoring. Outdoor temperature, concentration of PM2.5, residents' window-opening behavior, indoor infiltration rate, temperature, relative humidity and furniture surface area were also recorded for comparison and correlation analysis. According to the measurements, the worst indoor conditions were average concentrations of HCHO (autumn), TVOC (summer), PM2.5 (winter) and CO2 (winter) of 0.094 mg/m³, 0.924 mg/m³, 0.073 mg/m³, and 883 ppm, respectively. Indoor concentrations of PM2.5, TVOC and CO2 exceeded the standard limits and varied seasonally. The 4 observed factors have different effects on indoor air quality. Based on this research, it is recommended that outdoor conditions be considered when opening and closing windows in winter and that ventilation be strengthened in summer.
• A survey was conducted to discover indoor air quality in cold northeast China.
• Correlation analysis and risk analysis were performed according to the detected data.
• Discussion and recommendation were based on on-site measurement and sensors.
Overly restrictive eligibility criteria for clinical trials may limit the generalizability of the trial results to their target real-world patient populations. We developed a novel machine learning approach using large collections of real-world data (RWD) to better inform clinical trial eligibility criteria design. We extracted patients' clinical events from electronic health records (EHRs), which include demographics, diagnoses, and drugs, and assumed that certain compositions of these clinical events within an individual's EHRs can determine the subphenotypes, i.e., homogeneous clusters of patients in which the patients within each subgroup share similar clinical characteristics. We introduced an outcome-guided probabilistic model to identify those subphenotypes, such that the patients within the same subgroup not only share similar clinical characteristics but are also at similar risk levels of encountering severe adverse events (SAEs). We evaluated our algorithm on two previously conducted clinical trials with EHRs from the OneFlorida+ Clinical Research Consortium. Our model can clearly identify the patient subgroups who are more likely to suffer or not suffer from SAEs as subphenotypes in a transparent and interpretable way. Our approach identified a set of clinical topics and derived novel patient representations based on them. Each clinical topic represents a certain clinical event composition pattern learned from the patient EHRs. Tested on both trials, patient subgroup (#SAE=0) and patient subgroup (#SAE>0) can be well separated by k-means clustering using the inferred topics. The inferred topics characterized as likely to align with the patient subgroup (#SAE>0) revealed meaningful combinations of clinical features and can provide data-driven recommendations for refining the exclusion criteria of clinical trials. The proposed supervised topic modeling approach can infer the clinical topics from the subphenotypes with or without SAEs.
The potential rules for describing the patient subgroups with SAEs can be further derived to inform the design of clinical trial eligibility criteria.