•We have developed a systematic methodology (METRIC) for identifying metamorphic relations (MRs).
•We have also developed a tool to support and partly automate the METRIC methodology.
•The positive results of our experiment have confirmed that METRIC is effective at identifying MRs.
Metamorphic testing is a promising technique for testing software systems when the oracle problem exists, and has been successfully applied to various application domains and paradigms. An essential task in metamorphic testing is the identification of metamorphic relations. Because no systematic, specification-based methodology has been available, this has often been done in an ad hoc manner, which has hindered the applicability and effectiveness of metamorphic testing. To address this, this paper introduces METRIC, a systematic methodology for identifying metamorphic relations based on the category-choice framework. A tool implementing this methodology has been developed and examined in an experiment to determine the viability and effectiveness of METRIC. The results of the experiment confirm that METRIC is both effective and efficient at identifying metamorphic relations.
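To make the category-choice idea concrete, here is a minimal sketch; it is not the METRIC tool itself. It assumes an imagined search program described by two made-up categories, enumerates complete test frames (one choice per category), and lists pairs of distinct frames as candidates that a tester could inspect for a metamorphic relation. The category names, choices, and helper functions are all hypothetical.

```python
from itertools import combinations, product

# Hypothetical categories and choices for an imagined search program.
CATEGORIES = {
    "key_present": ["yes", "no"],
    "list_order": ["ascending", "descending"],
}


def complete_test_frames(categories):
    """Enumerate frames that pick exactly one choice from every category."""
    names = sorted(categories)
    return [
        dict(zip(names, combo))
        for combo in product(*(categories[name] for name in names))
    ]


def candidate_pairs(frames):
    """Pair up distinct frames; each pair is a candidate for MR inspection."""
    return list(combinations(frames, 2))


frames = complete_test_frames(CATEGORIES)
pairs = candidate_pairs(frames)
print(len(frames), "frames,", len(pairs), "candidate pairs")
```

Whether a given pair of frames actually yields a metamorphic relation remains a judgment made by the tester; the sketch only enumerates the candidates.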
Question Answering (QA) is an attractive and challenging area in the NLP community. Diverse algorithms have been proposed, and various benchmark datasets with different topics and task formats have been constructed. QA software is also now widely used in daily life. However, current QA software is mainly tested in a reference-based paradigm, in which the expected outputs (labels) of test cases must be annotated with much human effort before testing. As a result, neither just-in-time testing during usage nor extensible testing on massive unlabeled real-life data is feasible, which keeps the current testing of QA software from being flexible and sufficient. In this paper, we propose a method, QAAskeR, with three novel Metamorphic Relations for testing QA software. QAAskeR does not require annotated labels; instead, it tests QA software by checking its behavior on multiple recursively asked questions that are related to the same knowledge. Experimental results show that QAAskeR can reveal violations in over 80% of valid cases without using any pre-annotated labels. Diverse answering issues, especially limited generalization on question types across datasets, are revealed in a state-of-the-art QA algorithm.
Ischemia-reperfusion (IR) injury is a main cause of, and mechanism underlying, necrosis after flap transplantation. Little research has been conducted on the role and possible mechanism of keratinocyte growth factor (KGF) in IR flap injury.
A CoCl2-stimulated hypoxia cell model was established to investigate the effects of KGF on cell viability, apoptosis, cell cycle, and reactive oxygen species level. The experiments were performed using cell counting kit-8 and flow cytometry as required. Meanwhile, the expression of cell cycle-related and nuclear factor E2–related factor 2 (Nrf2) signaling–related genes was determined using quantitative real-time PCR and Western blot. The right dorsolateral areas of Institute of Cancer Research mice were marked as flaps, and the pedicle of each flap was clamped and then released to produce an IR process. Tissue morphology was observed using hematoxylin and eosin staining 24 h after the surgery. The effects of KGF on cell apoptosis and the expression of associated genes were studied by terminal deoxynucleotidyl transferase-mediated dUTP-biotin nick end labeling, immunohistochemistry, and Western blot.
Treating HaCaT cells with 40 μM CoCl2 not only reduced cell viability, promoted cell apoptosis, arrested the cell cycle in G1 phase, and increased the activity of reactive oxygen species, but also downregulated the expression of c-myc, c-fos, transforming growth factor-α, Nrf2, heme oxygenase-1, and gamma-glutamyl cysteine synthetase. Adding recombinant human KGF protected the cells from hypoxia injury. In IR flap tissues, recombinant human KGF also significantly inhibited cell apoptosis, increased KGF activity, and raised the protein levels of Nrf2, heme oxygenase-1, and gamma-glutamyl cysteine synthetase.
KGF played an important role in protecting mouse flaps from IR injury, possibly by activating Nrf2 signaling.
Key message
Prolonged hypomethylation of DNA leads to telomere truncation correlated with increased telomere recombination, transposon mobilization and stem cell death.
Epigenetic pathways, including DNA methylation, are crucial for telomere maintenance. Deficient in DNA Methylation 1 (DDM1) encodes a nucleosome remodeling protein, required to maintain DNA methylation in Arabidopsis thaliana. Plants lacking DDM1 can be self-propagated, but in the sixth generation (G6) hypomethylation leads to rampant transposon activation and infertility. Here we examine the role of DDM1 in telomere length homeostasis through a longitudinal study of successive generations of ddm1-2 mutants. We report that bulk telomere length remains within the wild-type range for the first five generations (G1–G5), and then precipitously drops in G6. While telomerase activity becomes more variable in later generation ddm1-2 mutants, there is no correlation between enzyme activity and telomere length. Plants lacking DDM1 also exhibit no dysregulation of several known telomere-associated transcripts, including TERRA. Instead, telomere shortening coincides with increased G-overhangs and extra-chromosomal circles, consistent with deletional recombination. Telomere shortening also correlates with transcriptional activation of retrotransposons, and a hypersensitive DNA damage response in root apical meristems. Since abiotic stresses, including DNA damage, stimulate homologous recombination, we hypothesize that telomere deletion in G6 ddm1-2 mutants is a by-product of elevated genome-wide recombination in response to transposon mobilization. Further, we speculate that telomere truncation may be beneficial in adverse environmental conditions by accelerating the elimination of stem cells with aberrant genomes.
Failure indexing is a longstanding crux in software debugging. Its goal is to automatically divide failures (e.g., failed test cases) into distinct groups according to their culprit root causes, so that multiple faults residing in a faulty program can be handled independently and simultaneously. The failure indexing community has long been plagued by two challenges: 1) The effectiveness of division is still far from promising. Specifically, existing failure indexing techniques employ only a limited source of software run-time data, for example, code coverage, as the failure proximity used to divide failures, which typically delivers unsatisfactory results. 2) The outcome is hardly comprehensible. Specifically, a developer who receives the division result is aware only of how the failures are divided, without knowing why they should be divided that way. This makes it difficult for developers to be convinced by the division result, which in turn affects the adoption of the results. To tackle these two problems, in this paper, we propose SURE, a viSUalized failuRe indExing approach using the program memory spectrum. We first collect the run-time memory information (i.e., variables' names and values, as well as the depth of the stack frame) at several preset breakpoints during the execution of a failed test case, and transform the gathered memory information into a human-friendly image (called program memory spectrum, PMS). Then, any pair of PMS images that serve as proxies for two failures is fed to a trained Siamese convolutional neural network, to predict the likelihood of them being triggered by the same fault. Last, a clustering algorithm is adopted to divide all failures based on this likelihood. In the experiments, we use 30% of the simulated faults to train the neural network, and use 70% of the simulated faults as well as real-world faults to test.
Results demonstrate the effectiveness of SURE: It achieves 101.20% and 41.38% improvements in fault number estimation, as well as 105.20% and 35.53% improvements in clustering, compared with the state-of-the-art technique in this field, in simulated and real-world environments, respectively. Moreover, we carry out a human study to quantitatively evaluate the comprehensibility of PMS, revealing that this novel type of representation can help developers better comprehend failure indexing results.
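The final step, dividing failures from pairwise same-fault likelihoods, can be illustrated with a small sketch. The abstract does not name the clustering algorithm, so the code below substitutes a simple stand-in: it links two failures when their predicted likelihood reaches a threshold and returns the connected components. The function name, the threshold of 0.5, and the example scores are assumptions for illustration only.

```python
def group_failures(n, likelihood, threshold=0.5):
    """Group failures 0..n-1 by linking pairs whose likelihood >= threshold.

    `likelihood` maps an index pair (i, j) to a same-fault probability.
    Returns the connected components as sorted lists of indices.
    This is an illustrative stand-in, not the algorithm used by SURE.
    """
    neighbours = {i: set() for i in range(n)}
    for (i, j), score in likelihood.items():
        if score >= threshold:
            neighbours[i].add(j)
            neighbours[j].add(i)

    seen, groups = set(), []
    for start in range(n):
        if start in seen:
            continue
        # Depth-first traversal collects one component.
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for other in neighbours[node]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        groups.append(sorted(component))
    return groups


# Hypothetical likelihoods for four failed test cases.
scores = {(0, 1): 0.9, (0, 2): 0.1, (1, 2): 0.2, (2, 3): 0.8}
groups = group_failures(4, scores)
print(groups)
```

In this toy input the number of groups doubles as an estimate of the number of faults, which mirrors the two quantities the evaluation reports (fault number estimation and clustering quality).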
A very important function of an issue tracking system is to assign labels to issue reports, such as bug, feature, enhancement, etc., in order to categorize issues to facilitate various development activities. In practice, it is very common for an issue to have multiple labels. However, current work is mainly based on single-label prediction, which is not suitable for just-in-time multi-labeling services because of its low efficiency. Therefore, in this paper, we propose MULA, a just-in-time MUlti-LAbeling system, which learns and automatically assigns multiple labels to issue reports. We have built a dataset with 81,601 entries and 11 labels, as the first benchmark for this task, and implemented a GitHub app. To the best of our knowledge, this is the first work and tool for online multi-labeling of GitHub issues based on their categories. We conduct a comprehensive empirical study, including comparisons with five commonly adopted labeling models that show the superiority of MULA, as well as an evaluation that shows high consistency between MULA's suggestions and developers' opinions.
Adipose-derived stem cells (ADSCs) and vascular endothelial growth factor (VEGF) contribute to wound healing. The purpose of the present study was to investigate the role of VEGF produced by ADSCs in protecting the fibroblasts and skin of mice from ultraviolet (UV) radiation. ADSCs and fibroblasts were extracted from abdominal adipose tissue and skin of mice by enzyme digestion. ADSC surface markers were detected using flow cytometry, and immunofluorescence was used to identify fibroblasts. The expression of VEGF in ADSCs modified with lentivirus was determined. Fibroblasts were injured by UV radiation and co-cultured with ADSCs carrying overexpressed VEGF or normal VEGF. Cell cycle was assessed by flow cytometry. Mice were treated with UV radiation dorsally and injected with ADSCs containing overexpressed VEGF or normal VEGF. mRNA and protein levels of cell senescence-related genes were measured by qPCR and western blot. VEGF overexpression enhanced the effects of ADSCs: it strengthened the down-regulation of senescence-associated (SA)-β-Gal, p21 and matrix metalloproteinase (MMP)-1, the healing of UV-injured wounds, and the up-regulation of collagen I expression in fibroblasts and wounds, as well as the inhibition of cell cycle arrest in UV-injured fibroblasts and the prevention of skin photoaging caused by UV radiation. VEGF expression in ADSCs played a key role in protecting skin fibroblasts from ageing, which further allowed the skin to resist photoaging, thereby promoting the recovery of wounds injured by UV radiation.
•An approach for predicting whether a crashing fault resides in a stack trace.
•A learning model based on 89 features from stack traces and source code.
•Empirical evaluation on crashes from 7 open-source projects, with accuracy of 92%.
•Able to reduce the search space of crashing faults and assist crash localization.
Given a stack trace reported at the time of software crash, crash localization aims to pinpoint the root cause of the crash. Crash localization is known as a time-consuming and labor-intensive task. Without tool support, developers have to spend tedious manual effort examining a large amount of source code based on their experience. In this paper, we propose an automatic approach, namely CraTer, which predicts whether a crashing fault resides in stack traces or not (referred to as predicting crashing fault residence). We extract 89 features from stack traces and source code to train a predictive model based on known crashes. We then use the model to predict the residence of newly-submitted crashes. CraTer can reduce the search space for crashing faults and help prioritize crash localization efforts. Experimental results on crashes of seven real-world projects demonstrate that CraTer can achieve an average accuracy of over 92%.
•Two variants of Sendys to patch its NOR problem and improve its performance.
•First comprehensive theoretical analysis on Sendys and its variants.
•A short-cut reformulation of the enhanced Sendys with the best performance.
•Empirical studies on single and multiple faults to complement theoretical analysis.
Combining spectrum-based fault localization (SBFL) with other techniques is generally regarded as a feasible approach, as advantages from both techniques would be preserved. Sendys, which combines SBFL with slicing-hitting-set-computation, is one of the promising techniques. However, all current evaluations of Sendys were obtained via empirical studies, which have inevitable threats to validity. Besides, purely empirical studies cannot reveal the essential reason why Sendys performs well or badly, or whether all the complicated computations are necessary. Therefore, in this paper, we provide an in-depth theoretical analysis of Sendys, which can give definite and convincing conclusions. We generalize our previous theoretical framework on SBFL to make it applicable to combined techniques like Sendys. We first provide a variant of the current Sendys by patching its loophole of ignoring “zero or negative risk values” in normalization. This variant serves as a substitute for the original Sendys, as well as one of the baselines in our analysis. Then, by modifying a few steps of this variant, we propose an enhanced Sendys and theoretically prove its superiority over several other methods in the single-fault scenario. Moreover, we provide a short-cut reformulation of the enhanced Sendys that preserves its performance while requiring only very simple computations. It is proved to be even better than traditional SBFL maximal formulas. As a complement, our empirical studies with 13 subject programs demonstrate the obvious superiority of the enhanced Sendys, as well as its stability across different formulas in the single-fault scenario. For multiple-fault cases, the variant of Sendys is observed to have the best performance. Besides, this variant has proved very helpful in improving the poor performance of the original Sendys when encountering the NOR problem.
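For readers unfamiliar with SBFL, the sketch below shows how a risk value is computed from a coverage spectrum. It uses Ochiai, one well-known SBFL formula, purely as an example; the abstract does not say which formulas the analysis covers, and the sketch does not implement Sendys or its normalization step. The function names and the tiny coverage matrix are illustrative assumptions.

```python
import math


def ochiai(ef, ep, nf):
    """Ochiai risk value for one statement.

    ef: failed tests that execute the statement
    ep: passed tests that execute the statement
    nf: failed tests that do not execute the statement
    """
    denominator = math.sqrt((ef + nf) * (ef + ep))
    return ef / denominator if denominator else 0.0


def rank_statements(coverage, outcomes):
    """Rank statements by risk value, highest first.

    coverage[t] is the set of statement ids executed by test t;
    outcomes[t] is True when test t passed.
    """
    total_failed = sum(1 for passed in outcomes if not passed)
    scores = {}
    for stmt in set().union(*coverage):
        ef = sum(1 for cov, ok in zip(coverage, outcomes) if stmt in cov and not ok)
        ep = sum(1 for cov, ok in zip(coverage, outcomes) if stmt in cov and ok)
        scores[stmt] = ochiai(ef, ep, total_failed - ef)
    # Ties are broken by statement id for a deterministic ranking.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical spectrum: three tests over statements 1..3.
coverage = [{1, 2}, {1, 3}, {1, 2, 3}]
outcomes = [True, False, False]
ranking = rank_statements(coverage, outcomes)
print(ranking)
```

Statement 3 ranks first here because it is executed by every failed test and by no passed test.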
Unsupervised machine learning is the training of an artificial intelligence system using information that is neither classified nor labeled, with a view to modeling the underlying structure or distribution in a dataset. Since unsupervised machine learning systems are widely used in many real-world applications, assessing the appropriateness of these systems and validating their implementations with respect to individual users' requirements and specific application scenarios/contexts are indisputably two important tasks. Such assessment and validation tasks, however, are fairly challenging due to the absence of a priori knowledge of the data. In view of this challenge, in this article, we develop a METamorphic Testing approach to assessing and validating unsupervised machine LEarning systems, abbreviated as METTLE. Our approach provides a new way to unveil the (possibly latent) characteristics of various machine learning systems, by explicitly considering the specific expectations and requirements of these systems from individual users' perspectives. To support METTLE, we have further formulated 11 generic metamorphic relations (MRs), covering users' generally expected characteristics that should be possessed by machine learning systems. We have performed an experiment and a user evaluation study to evaluate the viability and effectiveness of METTLE. Our experiment and user evaluation study have shown that, guided by user-defined MR-based adequacy criteria, end users are able to assess, validate, and select appropriate clustering systems in accordance with their own specific needs. Our investigation has also yielded insightful understanding and interpretation of the behavior of the machine learning systems from an end-user software engineering perspective, rather than from the perspective of a designer or implementor, who normally adopts a theoretical approach.
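As an illustration of what an MR for a clustering system can look like, the sketch below checks one plausible expectation: reordering the input points should not change the resulting partition. This relation is an assumption chosen for the example, and the abstract does not list the 11 MRs. The toy 1-D clustering routine, its `max_gap` parameter, and the sample data are likewise hypothetical.

```python
import random


def gap_clusters(points, max_gap=1.0):
    """Toy 1-D clustering: sort the points and start a new cluster
    whenever the gap to the previous point exceeds `max_gap`.
    Returns the partition as a set of frozensets."""
    if not points:
        return set()
    ordered = sorted(points)
    clusters, current = [], [ordered[0]]
    for previous, value in zip(ordered, ordered[1:]):
        if value - previous > max_gap:
            clusters.append(frozenset(current))
            current = []
        current.append(value)
    clusters.append(frozenset(current))
    return set(clusters)


def order_mr_holds(cluster_fn, points, seed=0):
    """Metamorphic check: shuffling the input must leave the partition unchanged."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    return cluster_fn(points) == cluster_fn(shuffled)


data = [0.1, 0.4, 5.0, 5.3, 9.9]
print(order_mr_holds(gap_clusters, data))
```

No expected clustering is needed for the check, which is what makes the approach usable when there is no a priori knowledge of the data. A clustering routine whose output depends on input order would fail this check, and whether that matters depends on the user's own requirements.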