Persistent unique identifiers (PIDs) are a critical element of digital research data infrastructure, serving to unambiguously identify, locate, and cite digital representations of a growing range of entities, including publications, data, instruments, organizations, funding awards, and field programs. The IGSN was developed as the International Geo Sample Number to provide a persistent, globally unique, web-resolvable identifier for physical samples. IGSN is both a governance and a technical system for assigning globally unique persistent identifiers to physical samples. Although initially developed for samples in the geosciences, the application of IGSN can be, and has already been, expanded to other domains that rely on physical samples and collections. This paper describes the current architecture and technical implementation of IGSN, how IGSN relates to other sample identifiers, and how its technical systems are supported by an international governance structure.
The International Geo Sample Number (IGSN) is a globally unique persistent identifier (PID) for physical samples that enables discovery of digital sample descriptions via the internet. In this article we describe the implementation of a registration service for IGSNs at the Helmholtz Centre Potsdam - GFZ German Research Centre for Geosciences. This includes the adaptation of the metadata schema developed in the context of the System for Earth Sample Registration (SESAR) to better describe the complex sample hierarchy of drilling cores, core sections, and samples in scientific drilling projects.
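Web resolution of an IGSN works through the Handle System, where IGSNs are registered under the Handle prefix 10273. A minimal sketch of constructing the resolver URL for a registered sample number (the sample number used below is invented for illustration):

```python
def igsn_resolver_url(igsn: str) -> str:
    """Build the Handle System URL that resolves an IGSN to its landing page.

    IGSNs are registered under Handle prefix 10273; sample numbers are
    conventionally written in upper case.
    """
    return f"https://hdl.handle.net/10273/{igsn.upper()}"

# Hypothetical sample number, for illustration only:
url = igsn_resolver_url("gfab000001")
```

An HTTP GET on such a URL is redirected by the Handle proxy to the landing page registered for that sample.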
The GFZ German Research Centre for Geosciences is the national laboratory for geosciences in Germany. As part of the Helmholtz Association, providing and maintaining large-scale scientific infrastructure is an essential part of GFZ activities. This includes the generation of significant volumes and numbers of research data, which subsequently become source materials for data publications. The development and maintenance of data systems is a key component of GFZ Data Services to support state-of-the-art research. A challenge lies not only in the diversity of scientific subjects and communities, but also in the different types and manifestations of how data are managed by research groups and individual scientists. The data repository of GFZ Data Services provides a flexible IT infrastructure for data storage and publication, including the minting of digital object identifiers (DOIs). It was built as a modular system of several independent software components linked together through Application Programming Interfaces (APIs) provided by the eSciDoc framework. The principal application software components are panMetaDocs for data management and DOIDB for logging and moderating data publication activities. Wherever possible, existing software solutions were integrated or adapted. We summarise our experience in operating this service. Data are described through comprehensive landing pages and supplementary documents, such as journal articles or data reports, thus augmenting the scientific usability of the service.
On timescales beyond the life of a research project, a core task in the curation of digital research data is the migration of data and metadata to new storage media, new hardware, and new software systems. These migrations are necessitated by ageing software systems, ageing hardware systems, and the rise of new technologies in data management. Using the example of the German Continental Deep Drilling Program (KTB), we outline the steps taken to keep the acquired data accessible to researchers and trace the history of data management in KTB from a project platform in the early 1990s, through three migrations, up to the current data management platform. The migration steps taken not only preserved the data, but also made data from KTB accessible via the internet and citable through Digital Object Identifiers (DOIs). We also describe measures taken to manage hardware and software obsolescence and to minimise the amount of maintenance necessary to keep data accessible beyond the active project phase. At present, data from KTB are stored in an Open Archival Information System (OAIS) compliant repository based on the eSciDoc repository framework. Information packages consist of self-contained packages of binary data files and discovery metadata in Extensible Markup Language (XML) format. The binary data files were created from a relational database used for data management in the previous version of the system, and from websites generated from a content management system. Metadata are provided in DataCite, GCMD-DIF, and ISO19139/INSPIRE schema definitions. Access to the KTB data is provided through download pages that are produced by XML transformation from the stored metadata.
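The final step, generating download pages by transformation of the stored XML metadata, can be sketched as follows. This is a simplified illustration, not the actual KTB transformation: the element names follow the DataCite kernel, but the record values and the HTML layout are invented.

```python
import xml.etree.ElementTree as ET

DATACITE_NS = {"dc": "http://datacite.org/schema/kernel-4"}

# Minimal DataCite-style record; values are invented for illustration.
record = """<resource xmlns="http://datacite.org/schema/kernel-4">
  <titles><title>Example borehole dataset</title></titles>
  <publicationYear>2015</publicationYear>
</resource>"""

def download_page_html(xml_text: str) -> str:
    """Transform a stored metadata record into a simple HTML download page."""
    root = ET.fromstring(xml_text)
    title = root.find(".//dc:title", DATACITE_NS).text
    year = root.find(".//dc:publicationYear", DATACITE_NS).text
    return f"<html><body><h1>{title}</h1><p>Published: {year}</p></body></html>"
```

In the repository described above, an equivalent transformation (e.g. via XSLT) is applied to each stored metadata record to render its access page.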
EPOS is a Research Infrastructure plan that is undertaking the challenge of integrating data from different solid Earth disciplines and of providing a common knowledge base for the Solid-Earth community in Europe, by implementing and managing a logically centralised catalogue based on the CERIF model. The EPOS catalogue will contain information about all the participating actors, such as Research Infrastructures, Organisations and their assets, in relationship with the people, their roles, and their affiliation within the specific scientific domain. The catalogue will guarantee the discoverability of domain-specific data, data products, software and services (DDSS) and enable the EPOS Integrated Core Services system to perform advanced operations on data, such as processing and visualization, on behalf of an end user. It will also foster the homogenisation of vocabularies, as well as supporting heterogeneous metadata. Clearly, the effort of accommodating the diversity across all the players needs to take into account existing initiatives concerning metadata standards and institutional recommendations, trying to satisfy the EPOS requirements by incorporating and profiling more generic concepts and semantics. The paper describes the approach of the EPOS metadata working group, providing the rationale behind the integration, extension, and mapping strategy used to converge the EPOS metadata baseline model towards the CERIF entities, relationships, and vocabularies. Special attention will be given to the outcomes of the mapping process between two elements of the EPOS baseline, Research Infrastructure and Equipment, and CERIF, by providing detailed insights and descriptions of the two data models, of encountered issues, and of proposed solutions.
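The flavour of such a mapping can be sketched in code. This is a deliberately simplified, hypothetical illustration: the field names below are invented, and real CERIF entities carry richer attribute sets, link entities, and semantic-layer classifications than shown here.

```python
from dataclasses import dataclass

@dataclass
class EposEquipment:
    """Hypothetical, simplified EPOS baseline description of a piece of equipment."""
    name: str
    operator: str   # operating organisation (free text in this sketch)
    category: str   # e.g. "Seismometer" (free text in this sketch)

@dataclass
class CerifEquipment:
    """Hypothetical, simplified CERIF-style equipment entity."""
    equip_id: str
    name: str
    class_term: str  # in real CERIF, a term from a controlled semantic vocabulary

def map_equipment(src: EposEquipment, equip_id: str) -> CerifEquipment:
    # The EPOS free-text category is normalised towards a vocabulary term;
    # a real mapping would look the term up in a CERIF classification scheme.
    return CerifEquipment(equip_id=equip_id,
                          name=src.name,
                          class_term=src.category.lower())
```

The essential point the sketch captures is that convergence towards CERIF is not a field-for-field copy: free-text descriptions are profiled onto controlled entities, relationships, and vocabularies.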