Natural language processing, specifically text classification (text categorization), has become a trend in computer science. Text classification is commonly used to categorize large amounts of data so that less time is needed to retrieve information. Students, as well as research advisers and panelists, spend extra effort and time classifying research documents. To address this problem, the researchers applied two state-of-the-art supervised term weighting schemes, TF-MONO and SQRTF-MONO, to three machine learning algorithms (K-Nearest Neighbor, Linear Support Vector, and Naive Bayes classifiers), creating a total of six classifier models to ascertain which performs best in classifying research documents, with Optical Character Recognition used for text extraction. The results showed that, among all classification models trained, the combination of SQRTF-MONO and Linear SVC outperformed the others, with an F1 score of 0.94 on both the abstract and the background-of-the-study datasets. In conclusion, the developed classification model and application prototype can help researchers, advisers, and panelists reduce the time spent classifying research documents.
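The F1 score reported above combines precision and recall per class. The following is a minimal sketch of how a macro-averaged F1 could be computed from true and predicted labels. The abstract does not say whether its 0.94 is a macro, micro, or weighted average, so macro averaging is an assumption here, and the function names and category labels are hypothetical.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1} computed from paired true/predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # label p was predicted but is wrong
            fn[t] += 1  # true label t was missed
    scores = {}
    for label in set(y_true) | set(y_pred):
        # F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging mode)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical research-document categories, for illustration only.
y_true = ["AI", "AI", "Networks", "Networks", "Security", "Security"]
y_pred = ["AI", "AI", "Networks", "Security", "Security", "Security"]
print(round(macro_f1(y_true, y_pred), 3))  # 0.822
```

In this toy example the per-class scores are 1.0, 0.667, and 0.8, so the macro average is about 0.822.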
The seminal importance of DNA sequencing to the life sciences, biotechnology and medicine has driven the search for more scalable and lower-cost solutions. Here we describe a DNA sequencing technology in which scalable, low-cost semiconductor manufacturing techniques are used to make an integrated circuit able to directly perform non-optical DNA sequencing of genomes. Sequence data are obtained by directly sensing the ions produced by template-directed DNA polymerase synthesis using all-natural nucleotides on this massively parallel semiconductor-sensing device or ion chip. The ion chip contains ion-sensitive, field-effect transistor-based sensors in perfect register with 1.2 million wells, which provide confinement and allow parallel, simultaneous detection of independent sequencing reactions. Use of the most widely used technology for constructing integrated circuits, the complementary metal-oxide semiconductor (CMOS) process, allows for low-cost, large-scale production and scaling of the device to higher densities and larger array sizes. We show the performance of the system by sequencing three bacterial genomes, and its robustness and scalability by producing ion chips with up to 10 times as many sensors and sequencing a human genome.
We have micromachined a silicon-chip device that transports DNA with a Brownian ratchet that rectifies the Brownian motion of microscopic particles. Transport properties for a DNA 50-mer agree with theoretical predictions, and the DNA diffusion constant agrees with previous experiments. This type of micromachine could provide a generic pump or separation component for DNA or other charged species as part of a microscale lab-on-a-chip. A device with reduced feature size could produce a size-based separation of DNA molecules, with applications including the detection of single-nucleotide polymorphisms.
Enhancing prediction models through hybridization and the combination of various machine learning techniques remains an ongoing research interest in data mining. This study modifies the genetic algorithm (GA) with a new crossover mating structure, the Flip Multi-Sliced Average Crossover (FMSAX) operator, and a rank-based selection function, and integrates it with the C4.5 algorithm. To measure the accuracy of C4.5, the dataset is split 70:30 into training and testing sets, respectively. The results showed that the prediction model of the modified GA combined with the C4.5 algorithm outperformed both the C4.5 prediction model and the GA with average crossover (AX) and a roulette wheel selection function. The model obtained an accuracy of 98.6207%, with MAE, RMSE, precision, recall, and F-measure values of 0.0066, 0.0738, 0.987, 0.986, and 0.986, respectively.
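The 70:30 split and the MAE and RMSE figures mentioned above can be illustrated with a short sketch. This is not the study's code: the shuffling, the seed, the helper names, and the choice to compute errors on predicted class probabilities are all assumptions that the abstract does not specify.

```python
import math
import random


def split_70_30(rows, seed=0):
    """Shuffle a copy of rows and split it 70% train / 30% test (assumed procedure)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Toy data, for illustration only.
rows = list(range(10))
train, test = split_70_30(rows)
print(len(train), len(test))  # 7 3

y_true = [1, 0, 1, 1]          # hypothetical true class indicators
y_prob = [0.9, 0.2, 0.6, 1.0]  # hypothetical predicted probabilities
print(mae(y_true, y_prob), rmse(y_true, y_prob))
```

Because RMSE squares each error before averaging, it penalizes large individual errors more heavily than MAE, so it is never smaller than MAE on the same predictions.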
Genetic algorithms (GAs) are commonly employed in optimization, and one of their crucial components is the crossover function, which often encounters the issue of premature convergence (PC). The proposed crossover operator, Flip Multi-Sliced Average Crossover (FMSAX), significantly enhances the performance of the GA in variable minimization. The approach generates offspring by dividing the chromosomes into three equal segments, producing a head-body-tail structure, then flipping each gene and computing the average to generate a new offspring. The FMSAX operator is intended as a more efficient technique for variable optimization, aimed at mitigating premature convergence. In the simulations, two datasets containing 30 and 40 variables, respectively, were used. The results demonstrate that the GA with the enhanced crossover operator outperformed the original GA with an average crossover operator, with 25 variables (62.50%) and 10 variables (25%) eliminated, respectively. Moreover, it yielded Root Mean Squared Error (RMSE) and Mean Absolute Percentage Error (MAPE) values of 0.0066 and 0.0738, respectively.
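The head-body-tail crossover described above can be sketched as follows. The abstract does not define "flipping" precisely, so this is only one possible reading and not the authors' operator: each of the three segments of the second parent is reversed, then averaged gene by gene with the matching segment of the first parent. The function names and the real-valued encoding are hypothetical.

```python
def split_three(chromosome):
    """Split a chromosome into head, body, and tail segments of equal length."""
    n = len(chromosome)
    if n % 3 != 0:
        raise ValueError("length must be divisible by 3 for equal segments")
    k = n // 3
    return chromosome[:k], chromosome[k:2 * k], chromosome[2 * k:]


def flip_sliced_average(parent_a, parent_b):
    """ASSUMED reading of the operator: reverse each segment of parent_b,
    then average it gene-wise with the matching segment of parent_a."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    child = []
    for seg_a, seg_b in zip(split_three(parent_a), split_three(parent_b)):
        flipped = seg_b[::-1]  # the "flip" step, as interpreted here
        child.extend((a + b) / 2 for a, b in zip(seg_a, flipped))
    return child


# Toy real-valued parents, for illustration only.
a = [1, 2, 3, 4, 5, 6]
b = [6, 5, 4, 3, 2, 1]
print(flip_sliced_average(a, b))  # [3.0, 4.0, 3.0, 4.0, 3.0, 4.0]
```

Under this reading, the flip pairs each gene with a different position inside the same segment, so the offspring differs from a plain position-wise average of the two parents.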
An in situ, real-time process control tool was developed for MEMS deep reactive-ion etch (DRIE) fabrication. DRIE processes are used to manufacture high-aspect-ratio silicon structures up to several hundred microns thick, which would be difficult or impossible to produce by other methods. DRIE MEMS technologies promise to deliver new devices with increased performance and functionality at lower cost. A major difficulty with DRIE is the control of etch depth. Our research shows that it is possible to monitor the etch depth of various MEMS structures (holes, pillars, trenches, etc.) through measurement and analysis of the infrared reflectance spectrum. Depths as large as 150 μm have been measured. Excellent correlation is found between the etch depths determined by analysis of these measurements and those measured with an SEM. In addition to etch depth, other parameters such as the photoresist thickness (e.g., mask erosion) can be simultaneously extracted. Based on these results, an infrared-reflectance etch monitor was integrated onto a reactive ion etcher at the Berkeley Sensor and Actuator Center for real-time monitoring and end-point determination. The integrated optical metrology system demonstrated accurate real-time monitoring of the etch depth and photoresist mask erosion.
An interdigitated electrode array (IDEA) device has been designed and used to transport DNA based on a Brownian ratchet mechanism. This migration is produced by the periodic formation of an asymmetric sawtooth electric field in the device. Oligonucleotides of 25, 50, and 100 bases in length were tested using two different array geometries. DNA transport as a function of DNA size, electric field frequency, and array geometry is shown to be in qualitative agreement with theory. Such a device could provide for DNA separations over a broad size range, and can be readily scaled as a component in a microfabricated DNA analysis system.
Life beyond Earth may be based on RNA or DNA if such life is related to life on Earth through shared ancestry due to meteoritic exchange, such as may be the case for Mars, or if delivery of similar building blocks to habitable environments has biased the evolution of life toward utilizing nucleic acids. In this case, in situ sequencing is a powerful approach to identify and characterize such life without the limitations or expense of returning samples to Earth, and can monitor forward contamination. A new semiconductor sequencing technology based on sensing hydrogen ions released during nucleotide incorporation can enable massively parallel sequencing in a small, robust, optics-free CMOS chip format. We demonstrate that these sequencing chips survive several analogues of space radiation at doses consistent with a 2-year Mars mission, including protons with solar particle event-distributed energy levels and 1 GeV oxygen and iron ions. We find no measurable impact of irradiation at 1 and 5 Gy doses on sequencing quality or on low-level hardware characteristics. Further testing is required to study the impacts of soft errors as well as to characterize performance under neutron and gamma irradiation and at higher doses, which would be expected during operation in environments with significant trapped energetic particles such as during a mission to Europa. Our results support future efforts to use in situ sequencing to test theories of panspermia and/or whether life has a common chemical basis.
Recent advancements in information technologies have made it common practice to analyze educational data from various sources. Analyzing these data could improve the education system by identifying the factors that influence students' academic performance and by assessing their performance status. Thus, predicting students' academic performance with high accuracy and extracting insights from massive volumes of educational data remain significant topics among researchers. This study predicts the academic performance of students using decision tree (DT), logistic regression (LR), random forest (RF), support vector machine (SVM), and naïve Bayes (NB) algorithms with data from Agusan del Sur State College of Agriculture and Technology (ASSCAT). The results show that the best of the five models was logistic regression, which obtained an accuracy of 0.91, followed by the support vector machine, with an accuracy of 0.90. The logistic regression model can help university administrators, faculty, and students predict which students may underperform, allowing for timely intervention.