Peer reviewed Open access
  • Strojno učenje u uvjetima m...
    Juričić, Vedran

    Politehnika, 12/2023, Volume: 7, Issue: 2
    Journal Article, Web Resource

    Machine learning is the subject of numerous scientific and professional research projects, and an important component of systems used in medicine, banking, computer security, communications and many other domains. It is one of the most active areas of research, with steady progress in the development of new algorithms and approaches and in the improvement of existing methods. The performance of a machine learning model depends significantly on the dataset used for training: the quality of the data, how evenly the values are distributed, and the size of the set. This is a potential problem for machine learning methods that require pre-labelled data, because collecting such data can be extremely complex, expensive and time-consuming. In that case, a classical machine learning model will most likely not perform well.

    One approach to this problem is transfer learning, in which the model uses data not only from the target domain but also from another, ideally related, domain. In this paper, conditions of reduced dataset availability are simulated, and the performance of three neural-network-based models is analyzed under those conditions; one of the three builds on a previously trained model. The procedure for creating the training sets is described, and the results of analyzing the three models on training sets of different sizes are presented.
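As an illustration of how reduced data availability might be simulated, the following is a minimal sketch that draws class-balanced subsets of a labelled dataset at several sizes. The abstract does not state how the training sets were actually created, so the function name `stratified_subset`, the sampling fractions and the toy labels are all assumptions made for this example.

```python
import random
from collections import defaultdict


def stratified_subset(labels, fraction, seed=0):
    """Return sorted indices of a subset holding about `fraction` of each class.

    Hypothetical helper for illustration; not the procedure from the paper.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    # Group example indices by their class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # fixed seed so the subsets are reproducible
    chosen = []
    for label in sorted(by_class):
        indices = by_class[label]
        # Keep at least one example per class so no class disappears.
        keep = max(1, round(len(indices) * fraction))
        chosen.extend(rng.sample(indices, keep))
    return sorted(chosen)


# Assumed toy data: 100 examples of class 0 followed by 100 of class 1.
labels = [0] * 100 + [1] * 100

# Assumed fractions simulating progressively scarcer training data.
subsets = {f: stratified_subset(labels, f) for f in (1.0, 0.5, 0.1)}
```

Each model under comparison would then be trained on the examples selected by each entry of `subsets` and evaluated on the same held-out test set, so that any difference in performance can be attributed to training-set size rather than to a shift in class balance.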