Evaluating the correctness of code generated by AI is a challenging open problem. In this paper, we propose a fully automated method, named ACCA, to evaluate the correctness of AI-generated code for security purposes. The method uses symbolic execution to assess whether the AI-generated code behaves like a reference implementation. We use ACCA to assess four state-of-the-art models trained to generate security-oriented assembly code, and we compare the results with several baseline solutions, including the output similarity metrics widely used in the field and ChatGPT, the AI-powered language model developed by OpenAI.
Our experiments show that our method outperforms the baseline solutions and assesses the correctness of the AI-generated code similarly to the human-based evaluation, which is considered the ground truth for the assessment in the field. Moreover, ACCA has a very strong correlation with the human evaluation (Pearson’s correlation coefficient r = 0.84 on average). Finally, since it is a fully automated solution that requires no human intervention, the proposed method assesses each code snippet in ∼0.17 s on average, which, in our experience, is considerably lower than the average time human analysts need to inspect the code manually.
• ACCA aligns with human evaluation of code correctness in 93% of cases.
• Code correctness computed by ACCA is the closest to the human evaluation.
• ACCA is the most correlated with the human evaluation over all the predictions.
• The computational time required by ACCA is lower than that of human evaluation, on average.
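The agreement reported above is summarized by Pearson's correlation coefficient between automated and human correctness scores. As an illustration only, the following minimal sketch computes r with the standard library. The function name `pearson_r` and the per-snippet scores are hypothetical and are not taken from the paper or its data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-snippet verdicts (1 = correct, 0 = incorrect); illustrative only.
human_scores = [1, 0, 1, 1, 0, 1, 0, 1]
automated_scores = [1, 0, 1, 0, 0, 1, 0, 1]

r = pearson_r(human_scores, automated_scores)
print(f"Pearson r = {r:.2f}")
```

A value of r close to 1 means the automated verdicts rise and fall with the human ones; it does not by itself say how often the two agree on individual snippets.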
The Internet of Things (IoT) is experiencing strong growth in both industrial and consumer scenarios. At the same time, the devices that deliver IoT services, usually characterized by limited hardware and software resources, are increasingly targeted by cyberattacks. This calls for designing and evaluating new approaches for protecting IoT systems, which are challenged by the limited computational capabilities of devices and by the scarce availability of reliable datasets. In line with this need, in this paper we compare three state-of-the-art machine-learning models for Anomaly Detection based on autoencoders, i.e., shallow Autoencoder, Deep Autoencoder (DAE), and Ensemble of Autoencoders (viz. KitNET). In addition, we evaluate the robustness of these solutions under a Data Poisoning Attack (DPA), to assess detection performance when the benign traffic used for learning the legitimate behavior of devices is mixed with malicious traffic. The evaluation relies on the public Kitsune Network Attack Dataset. Results reveal that the models do not differ in performance when trained with unpoisoned benign traffic, reaching (at 1% FPR) an F1 score of ≈ 97%. However, when DPA occurs, DAE proves to be the most robust in detection, retaining an F1 score above 50% with 10% poisoning. In contrast, the other models show strong performance drops (down to ≈ 20% F1 score) when only 0.5% of malicious traffic is injected.
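To make the "F1 score at 1% FPR" operating point concrete, the sketch below shows one common way to obtain it: choose a threshold on anomaly scores (for an autoencoder, typically the reconstruction error) so that at most 1% of benign validation samples exceed it, then compute F1 on labeled test data. This is an assumption about the general procedure, not the authors' implementation; the function names and all scores are hypothetical.

```python
def threshold_at_fpr(benign_scores, target_fpr=0.01):
    """Smallest observed score such that at most target_fpr of benign scores exceed it."""
    ordered = sorted(benign_scores)
    n = len(ordered)
    # Number of benign samples allowed above the threshold (epsilon guards float rounding).
    allowed = int(target_fpr * n + 1e-9)
    return ordered[n - allowed - 1]


def f1_score(labels, scores, threshold):
    """F1 for the positive (malicious = 1) class; a sample is flagged if score > threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s > threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s > threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s <= threshold)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical reconstruction errors on benign validation traffic (illustrative only).
benign_validation = [i / 1000 for i in range(100)]
threshold = threshold_at_fpr(benign_validation, target_fpr=0.01)

# Hypothetical labeled test samples: 0 = benign, 1 = malicious.
test_labels = [0, 0, 0, 1, 1, 1]
test_scores = [0.01, 0.05, 0.12, 0.50, 0.20, 0.03]
f1 = f1_score(test_labels, test_scores, threshold)
print(f"threshold = {threshold}, F1 = {f1:.3f}")
```

Under this reading, poisoning the benign training traffic shifts the scores the model assigns to malicious samples toward the benign range, so fewer of them clear a threshold fixed at the same FPR and F1 drops.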
The aim of the study was to evaluate the cytotoxicity of three epoxy resin-based endodontic sealers: AH Plus, Sicura Seal and Top Seal. Direct and indirect cytotoxicity were evaluated by the 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyl tetrazolium bromide assay and the LIVE/DEAD® Viability/Cytotoxicity Assay on MG63 osteoblast-like cells. Data were statistically analyzed by analysis of variance and the Tukey test, with a significance level of 5%. In both the direct and indirect cell viability tests, all groups were significantly more cytotoxic than the negative control group. After one week of culture, the sealers showed no direct cytotoxicity but a medium rate of indirect cytotoxicity. All three epoxy resin-based sealers (AH Plus, Top Seal and Sicura Seal) showed a medium rate of cytotoxicity on osteoblast-like cells in vitro. No significant difference was found among the sealers analyzed.
Cadophora luteo-olivacea represents a critical problem for kiwifruit in the post-harvest phase, mainly because its epidemiology is little known. This study presents results on the possibility of protecting kiwifruit from skin pitting symptoms using alternatives to fungicides. The mechanisms of action of antagonists against pathogen isolates were tested in in vitro assays. Trichoderma harzianum (Th1) showed the highest inhibitory activity against C. luteo-olivacea isolates in the volatile, non-volatile, and dual culture assays, with inhibition of 90%, 70.6%, and 78.8%, respectively, compared with 23.3% and 25.8%, 50% and 34.7%, and 22.5% and 23.6%, respectively, for Aureobasidium pullulans (L1 and L8). Further, the sensitivity of CFU and mycelial growth of C. luteo-olivacea isolates to fludioxonil and CaCl2 was tested, displaying interesting EC50 values (0.36 and 0.92 g L−1, 22.5 g L−1, respectively). The effect of Brassica nigra defatted meal was tested in biofumigation assays and through FT-IR (Fourier-Transform Infrared) spectroscopy. The above-mentioned treatments were applied in vivo to evaluate their efficacy on kiwifruits. Our data demonstrate that alternative solutions could be considered to control postharvest pathogens such as C. luteo-olivacea.
We evaluated the role of CRP and other laboratory parameters in predicting the worsening of clinical conditions during hospitalization, ICU admission, and fatal outcome among patients with COVID-19. Consecutive adult inpatients with SARS-CoV-2 infection and respiratory symptoms treated in three different COVID centres were enrolled, and they were tested for laboratory parameters within 48 h of admission. Three hundred ninety patients were enrolled. Age, baseline CRP, and LDH were associated with a P/F ratio < 200 during hospitalization. Male gender and CRP > 60 mg/L were shown to be independently associated with ICU admission. Lymphocytes < 1000 cells/μL were associated with the worst P/F ratio. CRP > 60 mg/L predicted a fatal outcome. We subsequently devised an 11-point numeric ordinal scoring system based on age, sex, CRP, and LDH at admission (ASCL score). Patients with an ASCL score of 0 or 2 were shown to be protected against a P/F ratio < 200, while patients with an ASCL score of 6 to 8 were shown to be at risk for a P/F ratio < 200. Patients with an ASCL score ≥ 7 had a significantly increased probability of death during hospitalization. In conclusion, patients with elevated CRP and LDH and an ASCL score > 6 at admission should be prioritized for careful respiratory function monitoring and early treatment to prevent progression of the disease.