The 54th meeting and public conference of ALTE (Association of Language Testers in Europe) took place in Ljubljana from 6 to 8 November 2019. The meeting, on the theme Monolingual Testing in a Multilingual Reality: Language Ideologies and Their Influence on Language Testing, was organised by the University of Ljubljana, Faculty of Arts, and its Centre for Slovene as a Second and Foreign Language at the Department of Slovene Studies. As part of the event, the round table (Close) Encounters of Language Policy Makers ((Bližnja) srečanja oblikovalcev jezikovne politike) was held on 8 November 2019. We publish a transcript of the recorded discussion among the participants.
The aim of this paper is to identify, in the context of testing the ChatGPT model on student assignments in statistics, cases in which large language models behave similarly to human reasoning and cases in which they "reason" differently, and to identify the opportunities, risks, and limitations of applying artificial intelligence in teaching. The paper analyses the capabilities and limitations of large language models and the ways in which this rapidly growing field seeks to overcome existing biases and shortcomings. ChatGPT, a chatbot based on the GPT-4 large language model, is tested on its knowledge of an introductory statistics course taught to second-year informatics students. Testing was carried out by manually entering 170 quiz questions in statistics into the ChatGPT browser interface. The questions are divided into three categories: theoretical questions that require reproducing knowledge, theoretical questions that test understanding of the field, and problem-solving tasks. The quiz questions were posed in Croatian, and the answers obtained in Croatian were analysed. The accuracy of students and ChatGPT in answering the quiz questions was compared by question category using the Wilcoxon rank-sum test. The results show that ChatGPT performs statistically significantly better than students in the theoretical categories that require knowledge reproduction and understanding, whereas students are more successful at solving tasks, although that difference in accuracy is not statistically significant (p<0.01).
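The Wilcoxon rank-sum comparison named in this abstract can be sketched as follows. This is a minimal illustration and not the authors' analysis code: it assumes the normal approximation with a tie correction and no continuity correction, and the function names and the accuracy values in the usage example are hypothetical.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist


def midranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected).

    Returns (W, z, p), where W is the sum of the ranks of sample x.
    """
    n1, n2 = len(x), len(y)
    total = n1 + n2
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    w = sum(ranks[:n1])
    mean_w = n1 * (total + 1) / 2
    # Ties shrink the variance of W; t is the size of each group of tied values.
    tie_term = sum(t**3 - t for t in Counter(pooled).values()) / (total * (total - 1))
    var_w = n1 * n2 * ((total + 1) - tie_term) / 12
    if var_w <= 1e-12:
        # All observations are identical, so the samples cannot be distinguished.
        return w, 0.0, 1.0
    z = (w - mean_w) / sqrt(var_w)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return w, z, p


# Hypothetical per-question accuracy values, for illustration only.
student_accuracy = [0.62, 0.71, 0.55, 0.80, 0.67, 0.74]
chatgpt_accuracy = [0.90, 1.00, 0.85, 0.70, 0.95, 1.00]
w, z, p = rank_sum_test(student_accuracy, chatgpt_accuracy)
print(f"W = {w}, z = {z:.3f}, p = {p:.4f}")
```

The normal approximation is only reasonable for moderately large samples; for very small groups an exact test would be the safer choice.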
The aim of the study was to examine the metric properties of E. E. Gordon's Advanced Measures of Music Audiation (AMMA) test (1989) on a sample of students in Croatia who are not majoring in music. The study was conducted with female students of early childhood education at the Faculty of Education (Fakultet za odgojne i obrazovne znanosti) in Osijek (N = 235). The descriptive indicators of the AMMA test and the normality of the score distribution in the Croatian sample do not deviate from the results obtained during standardisation on the American sample and on a Polish sample of students who are not majoring in music. The reliability coefficient of the AMMA test, calculated with the split-half technique, is 0.87, and high intercorrelations were obtained between the total test and the Tonal (r = 0.88) and Rhythm (r = 0.87) subtests. Low values of the item difficulty and discrimination indices (< 0.20) are the weakest properties of the AMMA test when applied to the Croatian sample. Considering the examined metric properties as a whole, it can be concluded that the AMMA test is a reliable measuring instrument for assessing auditory music abilities in non-musician students in Croatia.
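A split-half reliability coefficient of the kind reported here can be computed roughly as sketched below. The abstract does not say how the test was split or whether the Spearman-Brown correction was applied, so the odd/even item split, the correction step, the function names, and the response data are all assumptions made for illustration.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equally long lists of scores."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)


def split_half_reliability(responses):
    """Odd/even split-half reliability with the Spearman-Brown correction.

    `responses` holds one list of item scores per respondent.
    """
    # Score each half of the test separately for every respondent.
    first_half = [sum(row[0::2]) for row in responses]
    second_half = [sum(row[1::2]) for row in responses]
    r_half = pearson(first_half, second_half)
    # Spearman-Brown: estimate full-length reliability from the half-test correlation.
    return 2 * r_half / (1 + r_half)


# Hypothetical item scores (1 = correct, 0 = incorrect), not AMMA data.
responses = [
    [1, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 1],
]
print(round(split_half_reliability(responses), 3))
```

The correction matters because each half is only half as long as the full test, and shorter tests are less reliable; the uncorrected half-test correlation would understate the reliability of the whole instrument.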
The mechanical properties of the composite materials for prosthetic sockets are a key determinant of the quality and usability of prostheses. Our aim was to compare the existing materials used in production at our institution with some modified, potentially improved materials. We conducted an industrial experiment. The existing material (A) was compared with three newly produced materials that introduced changes in the lamination process: B1, where an infusion spiral tube was added; B2, where the resin was degassed; and B3, where a mesh and peel ply were used. The specimens underwent laboratory strength testing. The strength measurements were statistically analysed using one-way analysis of covariance (ANCOVA) that was adjusted for specimen thickness because of the observed negative correlation of thickness with strength. Material A had the highest bending strength, on average, but there were no statistically significant differences in the bending strength between the materials after adjusting for the specimen thickness (p = 0.941). Materials B1 and B2 exhibited statistically significantly lower tensile strengths than material A (p < 0.001). Material B3 had the lowest average tensile strength, but it could not be statistically distinguished from the others, because of the significantly larger average specimen thickness. The compressive strength was tested only for materials B1, B2 and B3; their averages did not differ statistically significantly (p = 0.291). Laboratory strength testing provided important insights into the differences between the various laminated composite prosthetic materials. We did not reach our initial goal to produce a better material, but we will continue our research and development in this field with a more systematic, technological approach.
To develop and content-validate a self-assessment questionnaire on motivational interviewing (MI) practice, as the first steps in creating a questionnaire for use in cross-sectional studies involving practitioners who conduct MI-based alcohol screening and brief intervention (ASBI).
A comprehensive mixed methods approach included a literature review, 3 rounds of expert panel (EP) opinions (n=10), cognitive testing (CT) with 10 MI-based ASBI practitioners, and questionnaire piloting with 31 MI-based ASBI practitioners. Based on the EP opinions in the second round, content validity indices (CVIs) and the modified kappa coefficient (k*) were calculated, focusing on the relevance and understandability of questions and comprehensiveness and meaningfulness of the response options. This analysis was performed in 2020, at the conclusion of the national "Together for a Responsible Attitude Towards Alcohol Consumption" ("Skupaj za odgovoren odnos do pitja alkohola", SOPA) project's pilot implementation.
At the scale level, CVI values based on universal agreement for the entire questionnaire were high for 3 of the 4 categories (S-CVI-UA>0.80), and CVI values based on average agreement were high across all categories (S-CVI-Ave>0.90). At the item level, no CVI value (I-CVI) fell below 0.50, the threshold for automatic item rejection, and the modified kappa value (k*) indicated poor validity for two items in the understandability category (k*=0.33). All problematic parts of the questionnaire were further tested and successfully modified based on the results of CT, and accepted in the third round of testing.
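The indices reported in these results can be illustrated with a short sketch. The abstract does not give its formulas, so the code assumes the commonly used definitions: an expert "agrees" when rating an item 3 or 4 on a 4-point scale, I-CVI is the proportion of agreeing experts, and k* adjusts I-CVI for the binomial probability of chance agreement. The function names and the ratings are hypothetical and do not reproduce the study's data.

```python
from math import comb


def count_agreeing(ratings, agree_from=3):
    """Number of experts whose rating counts as agreement (assumed: 3 or 4 of 4)."""
    return sum(1 for rating in ratings if rating >= agree_from)


def item_cvi(ratings):
    """I-CVI: proportion of experts who rated the item as acceptable."""
    return count_agreeing(ratings) / len(ratings)


def modified_kappa(ratings):
    """k*: I-CVI adjusted for the probability of chance agreement."""
    n = len(ratings)
    a = count_agreeing(ratings)
    # Chance that exactly `a` of `n` experts agree if each agrees with p = 0.5.
    p_chance = comb(n, a) * 0.5**n
    return (a / n - p_chance) / (1 - p_chance)


def scale_cvi(items):
    """Return (S-CVI/Ave, S-CVI/UA) for a list of per-item rating lists."""
    i_cvis = [item_cvi(ratings) for ratings in items]
    average = sum(i_cvis) / len(i_cvis)
    # Universal agreement: share of items that every expert rated acceptable.
    universal = sum(1 for value in i_cvis if value == 1.0) / len(i_cvis)
    return average, universal


# Hypothetical ratings from 10 experts for three items.
items = [
    [4, 4, 3, 4, 4, 3, 4, 4, 4, 3],
    [4, 3, 4, 4, 2, 4, 3, 4, 4, 4],
    [3, 2, 4, 1, 3, 2, 4, 2, 3, 1],
]
for ratings in items:
    print(round(item_cvi(ratings), 2), round(modified_kappa(ratings), 2))
print(scale_cvi(items))
```

S-CVI/UA is the stricter of the two scale-level indices because a single dissenting expert removes an item from the count, which is one reason it can fall below the threshold in a category where S-CVI/Ave remains high.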
The final version of the questionnaire demonstrated appropriate content validity for use in studies among Slovenian MI-based ASBI practitioners and is now ready for further psychometric testing.
In their everyday use of various software, people encounter errors introduced during its development. These errors may be trivial, but they may also be critical to the use of certain software functionality. Errors during development are inevitable, which is why large amounts of money and time are invested in software testing. Nevertheless, despite great effort and investment, it is impossible to find absolutely every error before the software is released to production. Automated testing can help here. This paper presents the process of automated testing of web applications using the following tools: the Geb web driver, the Groovy programming language, and the Spock test framework. The results of this research showed that the combination of these tools is an adequate and complete solution for automated testing of web applications.
Every day, people encounter mistakes and bugs in the software they use. Some of these bugs are trivial, but others may be critical for certain software functions. Since mistakes are inevitable, software testing requires a large amount of money, time and work. Despite these efforts and investment, it is impossible to find all mistakes before software enters the market and production. Automated testing can be very useful in that process. This paper presents a methodology for automated testing of web applications and its specific steps, based on several chosen tools: the Groovy programming language, the Geb web driver and the Spock test framework. The results show that this specific combination of tools is an adequate and complete solution for automated testing of web applications.