This paper presents a hybrid document recommender system intended for use in digital libraries and institutional repositories that are part of the Slovenian Open Access Infrastructure. The recommender system provides recommendations of similar documents across different digital libraries and institutional repositories with the aim of connecting researchers and improving collaboration. The hybrid recommender system uses document processing techniques, document metadata, and the BM25 similarity ranking function to provide content-based recommendations as its primary method. It also uses collaborative filtering as a secondary method in a cascade hybrid recommendation technique. We also provide an analysis of real-world feedback collection for our hybrid recommender system on an academic digital repository, in order to identify suitable time frames for direct feedback collection during the year.
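The cascade described above can be illustrated with a minimal sketch. It is not the system's implementation: the tokenised documents, the `co_views` dictionary (a stand-in for whatever collaborative signal is available), the `bucket` width, and the parameter values `k1=1.5`, `b=0.75` are all assumptions made for illustration. BM25 ranks candidates first; the collaborative signal only reorders candidates whose BM25 scores fall into the same bucket.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score every document against the query with Okapi BM25."""
    n = len(docs_tokens)
    avgdl = sum(len(d) for d in docs_tokens) / n
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for d in docs_tokens for term in set(d))
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        score = 0.0
        for term in set(query_tokens):
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cascade_recommend(source_idx, docs_tokens, co_views, top_n=3, bucket=1.0):
    """Primary ranking by BM25; co-view counts refine order within a bucket.

    co_views maps (source_idx, candidate_idx) to a hypothetical count of
    users who viewed both documents.
    """
    scores = bm25_scores(docs_tokens[source_idx], docs_tokens)
    candidates = [
        i for i in range(len(docs_tokens))
        if i != source_idx and scores[i] > 0
    ]
    candidates.sort(
        key=lambda i: (
            -int(scores[i] // bucket),         # coarse content-based rank
            -co_views.get((source_idx, i), 0), # collaborative refinement
            -scores[i],                        # exact BM25 as final tie-break
        )
    )
    return candidates[:top_n]
```

Documents that share no terms with the source document are never recommended in this sketch, regardless of their co-view counts, which is what makes the combination a cascade rather than a weighted blend.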
This paper stems from three challenges that we observe in the teaching of Slovene in the upper grades of primary schools and in secondary schools: how to eliminate errors against the standard-language norm that persist in pupils' written work; how to improve phraseological competence; and how to improve communicative language ability. These challenges are the focal point in the development of the modern e-learning environment Slovenščina na dlani, which is based on language technologies and information and communication technologies. The environment supports flexible forms of teaching and distance teaching, eases the teacher's workload, and makes it possible to motivate pupils through gamification elements. In the paper we present the design and implementation of each of the four content modules of the e-environment: orthography, grammar, phraseology, and texts.
The OpenScience Slovenia metadata dataset contains metadata entries for Slovenian public domain academic documents, which include undergraduate and postgraduate theses, research and professional articles, and other academic document types. The data was collected as part of the establishment of the Slovenian Open-Access Infrastructure, which defined a unified document collection and cataloguing process for universities in Slovenia within the infrastructure repositories. The data was collected from several already established but separate library systems in Slovenia and merged into a single metadata scheme using metadata deduplication and merging techniques. It consists of text and numerical fields representing attributes that describe documents. These attributes include document titles, keywords, abstracts, typologies, authors, issue years and other identifiers such as URL and UDC. The potential of this dataset lies especially in text mining and text classification tasks; it can also be used for developing or benchmarking content-based recommender systems on real-world data.
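The abstract does not say which deduplication and merging techniques were used, so the following is only a hedged sketch of one common approach: records from different source systems are grouped by a normalised title plus issue year, and missing fields are filled from later duplicates. The field names (`title`, `year`, `keywords`, `url`) and the matching key are assumptions, not the dataset's actual scheme.

```python
import re


def dedup_key(record):
    """Hypothetical matching key: case-folded, punctuation-free title + year."""
    title = record.get("title", "").casefold()
    title = re.sub(r"[^\w\s]", " ", title)  # drop punctuation
    title = " ".join(title.split())         # collapse whitespace
    return (title, record.get("year"))


def merge_records(records):
    """Merge records sharing a key; the first non-empty value of a field wins."""
    merged = {}
    for rec in records:
        target = merged.setdefault(dedup_key(rec), {})
        for field, value in rec.items():
            if value in (None, "", []):
                continue  # nothing to contribute for this field
            if field == "keywords":
                # Union of keyword lists, preserving first-seen order.
                existing = target.setdefault("keywords", [])
                existing.extend(k for k in value if k not in existing)
            else:
                target.setdefault(field, value)
    return list(merged.values())
```

Exact-key matching of this kind misses near-duplicates with differing titles; a real pipeline would likely need fuzzier matching or persistent identifiers.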
Purpose
– The purpose of this paper is to present a technical perspective on implementing the Slovenian open access infrastructure, which consists of four institutional repositories (IRs) and a national portal (NP) that aggregates content from the repositories in order to provide a common search engine, recommendations of similar documents, and similar text detection.
Design/methodology/approach
– During the project, the necessary legal background and processes for mandatory submissions of final study works, research publications and research data were established, as well as processes for data exchange between the IRs and the NP, and processes for similar text detection.
Findings
– The consortium consisted of four Slovenian universities that significantly differ in size, organisation, and workflows. It was anticipated that exactly the same legal background and software would be used for the four repositories. It turned out that complete unification was impossible due to the differences.
Practical implications
– The national open access infrastructure will improve the visibility of Slovenian research organisations. It supports compliance with funders’ open access mandates. The established infrastructure enables the depositing and archiving of approximately 80 percent of the peer-reviewed scientific publications that are annually published by Slovenian researchers. At the same time, the majority of final study works from Slovenian higher education institutions are available in full-text format.
Originality/value
– This paper describes a technical perspective for setting up a national open access infrastructure, which has not been described in the literature previously.
For a non-decreasing sequence $S=(s_1,s_2,\ldots)$ of positive integers, a
partition of the vertex set of a graph $G$ into subsets $X_1,\ldots, X_\ell$,
such that vertices in $X_i$ are pairwise at distance greater than $s_i$ for
every $i\in\{1,\ldots,\ell\}$, is called an $S$-packing $\ell$-coloring of $G$.
The minimum $\ell$ for which $G$ admits an $S$-packing $\ell$-coloring is
called the $S$-packing chromatic number of $G$, denoted by $\chi_S(G)$. In this
paper, we consider $S$-packing colorings of distance graphs
$G(\mathbb{Z},\{k,t\})$, where $k$ and $t$ are positive integers, which are the
graphs whose vertex set is $\mathbb{Z}$, and two vertices $x,y\in \mathbb{Z}$
are adjacent whenever $|x-y|\in\{k,t\}$. We complement partial results from two
earlier papers, thus determining all values of $\chi_S(G(\mathbb{Z},\{k,t\}))$
when $S$ is any sequence with $s_i\le 2$ for all $i$. In particular, if
$S=(1,1,2,2,\ldots)$, then the $S$-packing chromatic number is $2$ if $k+t$ is
even, and $4$ otherwise, while if $S=(1,2,2,\ldots)$, then the $S$-packing
chromatic number is $5$, unless $\{k,t\}=\{2,3\}$ when it is $6$; when
$S=(2,2,2,\ldots)$, the corresponding formula is more complex.
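The definition above can be checked mechanically for periodic colorings. The sketch below is an illustration, not a method from the paper; the function names and the encoding of a coloring as a repeating `pattern` of 1-based colors are assumptions. It relies on the fact that in $G(\mathbb{Z},\{k,t\})$ two vertices are at distance at most $2$ exactly when their difference lies in $\{k,t,2k,2t,k+t,|k-t|\}$, and it handles only $s_i\le 2$, matching the sequences considered here.

```python
def forbidden_differences(k, t, s):
    """Differences |x - y| at which two vertices are at distance <= s."""
    if s == 1:
        return {k, t}
    if s == 2:
        # Adjacent, or reachable by two steps of +-k / +-t.
        return {k, t, 2 * k, 2 * t, k + t, abs(k - t)} - {0}
    raise ValueError("this sketch only handles s_i in {1, 2}")


def is_s_packing_coloring(pattern, S, k, t):
    """Check a periodic coloring of Z: vertex x gets pattern[x % len(pattern)].

    Colors are 1-based, and color c must keep its vertices pairwise at
    distance greater than S[c - 1]. Because the coloring is periodic,
    checking one period of starting vertices covers all of Z.
    """
    p = len(pattern)
    for x in range(p):
        c = pattern[x]
        for d in forbidden_differences(k, t, S[c - 1]):
            if pattern[(x + d) % p] == c:
                return False
    return True


# Parity coloring with S = (1, 1) on G(Z, {1, 3}), where k and t are both odd.
print(is_s_packing_coloring([1, 2], (1, 1), 1, 3))
```

For $k=1$, $t=3$ the parity pattern passes, whereas the same pattern fails for $k=1$, $t=2$ because vertices $x$ and $x+2$ are adjacent and share a color. A checker like this only verifies upper bounds given by explicit colorings; it says nothing about lower bounds.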