Multiscale mixing patterns in networks
Peel, Leto; Delvenne, Jean-Charles; Lambiotte, Renaud
Proceedings of the National Academy of Sciences (PNAS), 04/2018, Volume 115, Issue 16
Journal Article
Peer reviewed
Open access
Assortative mixing in networks is the tendency for nodes with the same attributes, or metadata, to link to each other. It is a property often found in social networks, manifesting as a higher tendency of links occurring between people of the same age, race, or political belief. Quantifying the level of assortativity or disassortativity (the preference of linking to nodes with different attributes) can shed light on the organization of complex networks. It is common practice to measure the level of assortativity according to the assortativity coefficient, or modularity in the case of categorical metadata. This global value is the average level of assortativity across the network and may not be a representative statistic when mixing patterns are heterogeneous. For example, a social network spanning the globe may exhibit local differences in mixing patterns as a consequence of differences in cultural norms. Here, we introduce an approach to localize this global measure so that we can describe the assortativity, across multiple scales, at the node level. Consequently, we are able to capture and qualitatively evaluate the distribution of mixing patterns in the network. We find that, for many real-world networks, the distribution of assortativity is skewed, overdispersed, and multimodal. Our method provides a clearer lens through which we can more closely examine mixing patterns in networks.
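The global assortativity coefficient mentioned in this abstract can be computed from a network's attribute mixing matrix (Newman's formulation for categorical metadata). A minimal pure-Python sketch on a toy graph; the graph, the group labels, and the attribute name are illustrative, not taken from the paper:

```python
from collections import defaultdict

# Toy undirected graph: two tight groups joined by one cross-link.
edges = [(0, 1), (1, 2), (0, 2),    # group "a"
         (3, 4), (4, 5), (3, 5),    # group "b"
         (2, 3)]                    # single cross-group edge
party = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}

# Mixing matrix e[(x, y)]: fraction of directed edge ends going from
# group x to group y (each undirected edge is counted in both directions).
e = defaultdict(float)
m = 2 * len(edges)
for u, v in edges:
    e[(party[u], party[v])] += 1 / m
    e[(party[v], party[u])] += 1 / m

groups = sorted(set(party.values()))
trace = sum(e[(g, g)] for g in groups)                     # within-group fraction
a = {g: sum(e[(g, h)] for h in groups) for g in groups}    # marginal per group
baseline = sum(a[g] ** 2 for g in groups)                  # expected at random
r = (trace - baseline) / (1 - baseline)
print(round(r, 3))   # → 0.714 (strongly assortative)
```

A value of 1 indicates perfect assortativity, 0 indicates random mixing, and negative values indicate disassortative mixing; the paper's contribution is localizing this single global number to per-node, multiscale values.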
This paper describes the scholarly metadata collected and made available by Crossref, as well as its importance in the scholarly research ecosystem. Containing over 106 million records and expanding at an average rate of 11% a year, Crossref's metadata has become one of the major sources of scholarly data for publishers, authors, librarians, funders, and researchers. The metadata set covers 13 content types, including not only traditional types, such as journals and conference papers, but also data sets, reports, preprints, peer reviews, and grants. The metadata is not limited to basic publication metadata: it can also include abstracts and links to full text, funding and license information, citation links, and information about corrections, updates, retractions, etc. This scale and breadth make Crossref a valuable source for research in scientometrics, including measuring the growth and impact of science and understanding new trends in scholarly communications. The metadata is available through a number of APIs, including a REST API and OAI-PMH. In this paper, we describe the kinds of metadata that Crossref provides and how they are collected and curated. We also look at Crossref's role in the research ecosystem and trends in metadata curation over the years, including the evolution of its citation data provision. We summarize research that has used Crossref metadata and describe plans to improve metadata quality and retrieval in the future.
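The REST API mentioned in this abstract exposes Crossref records as JSON over HTTP. A sketch of composing such a query without sending it; the base URL and the `query.bibliographic`, `filter`, and `rows` parameters follow the public Crossref REST API, while the query values themselves are illustrative:

```python
from urllib.parse import urlencode

# Sketch: building a Crossref REST API works query (no request is sent here).
BASE = "https://api.crossref.org/works"

params = {
    "query.bibliographic": "scholarly metadata",                  # free-text search
    "filter": "type:journal-article,from-pub-date:2018-01-01",    # narrow by type/date
    "rows": 5,                                                    # page size
}
url = f"{BASE}?{urlencode(params)}"
print(url)
```

Fetching this URL with any HTTP client returns a JSON envelope whose `message.items` list holds the matching metadata records.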
Large-scale, data-intensive scientific applications are often expressed as scientific workflows (SWfs). In this paper, we consider the problem of efficiently scheduling a large SWf in a multisite cloud, i.e., a cloud with geo-distributed cloud data centers (sites). The reasons for using multiple cloud sites to run a SWf are that the data is already distributed, the necessary resources exceed the limits of a single site, or the monetary cost is lower. In a multisite cloud, metadata management has a critical impact on the efficiency of SWf scheduling, as it provides a global view of data locations and enables task tracking during execution; it should therefore be readily available to the system at any given time. While it has been shown that efficient metadata handling plays a key role in performance, little research has targeted this issue in multisite clouds. In this paper, we propose to identify and exploit hot metadata (frequently accessed metadata) for efficient SWf scheduling in a multisite cloud, using a distributed approach. We implemented our approach within a scientific workflow management system and show that it reduces the execution time of highly parallel jobs by up to 64 percent and that of whole SWfs by up to 55 percent.
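The core idea of the abstract, separating hot (frequently accessed) metadata from cold, can be illustrated with a simple frequency count. This is only a sketch of the classification step; the access log, key names, and threshold are made up for demonstration and are not the paper's actual system:

```python
from collections import Counter

# Hypothetical log of metadata accesses during workflow execution:
# file-location lookups and task-state updates, keyed by entry.
access_log = ["file_loc:f1", "task_state:t7", "file_loc:f1",
              "file_loc:f2", "file_loc:f1", "task_state:t7"]

counts = Counter(access_log)
threshold = 2          # entries accessed at least this often count as hot
hot = {key for key, n in counts.items() if n >= threshold}
print(sorted(hot))     # → ['file_loc:f1', 'task_state:t7']
```

In a multisite setting, entries classified as hot would then be replicated or kept close to the sites that use them, while cold metadata can tolerate slower, centralized access.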
This paper is a survey of standards used in the domain of digital cultural heritage, with a focus on the Metadata Encoding and Transmission Standard (METS) created by the Library of Congress in the United States of America. The process of digitizing cultural heritage requires silo breaking in a number of areas. One area is that of academic disciplines, to enable rich interdisciplinary work. This lays the foundation for breaking the second form of silo: the silos of knowledge, both traditional and born digital, held in individual institutions such as galleries, libraries, archives, and museums. Disciplinary silo breaking is the key to unlocking these institutional knowledge silos. Interdisciplinary teams, such as developers and librarians, work together to make the data accessible as open data on the "semantic web". Description logic is the area of mathematics that underpins many ontology-building applications today; creating these ontologies requires a human-machine symbiosis. Currently in the cultural heritage domain, the institutions' role is that of provider of this open data to a national aggregator, which in turn can make the data available to the trans-European aggregator known as Europeana. Current ingests to the aggregators are in the form of machine-readable cataloguing metadata, which is limited in the richness it provides to disparate object descriptions. METS can provide this richness.
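A METS record is an XML document that wraps descriptive, file, and structural sections around a digital object. The skeleton below is a heavily reduced sketch (a real, schema-valid record embeds descriptive metadata inside `dmdSec` and much more); the namespace is the official Library of Congress one, but the IDs and file entries are invented:

```python
import xml.etree.ElementTree as ET

# Minimal, illustrative METS skeleton: descriptive section, file section,
# and a structural map tying the object together.
mets_xml = """<mets xmlns="http://www.loc.gov/METS/" OBJID="demo-001">
  <dmdSec ID="DMD1"/>
  <fileSec>
    <fileGrp USE="image">
      <file ID="IMG1" MIMETYPE="image/jpeg"/>
    </fileGrp>
  </fileSec>
  <structMap TYPE="physical">
    <div TYPE="item" DMDID="DMD1"/>
  </structMap>
</mets>"""

NS = {"m": "http://www.loc.gov/METS/"}
root = ET.fromstring(mets_xml)
print(root.get("OBJID"))                                        # → demo-001
print(root.find("m:fileSec/m:fileGrp/m:file", NS).get("ID"))    # → IMG1
```

It is this combination of descriptive richness (`dmdSec`) with explicit file and structure information that the survey argues aggregator ingests currently lack.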
Public access to well-organized migration data repositories is rare. While open migration data are available, most are dispersed, which makes them hard to organize. This work uses a metadata modeling approach to organize migration-related data and research components hosted in an open-source, collaborative research environment. The proposed platform integrates data, computational models, application libraries, and research projects in a unified environment for research. We provide an overview of the metadata modeling process and the XML technology used to integrate the platform components, along with descriptions of the platform's essential features, such as standardized exchange of information for data organization and sharing, and collaborative authoring. A prototype collaborative, open-source migration data and modeling research platform was designed from sample data and software components and hosted on GitHub for testing and refinement; it allows the authors to collaborate on, improve, and assess the platform's functionalities and capabilities. Our platform, COSMOS (Collaborative Open-Source Modeling System), adopts open data formats in alignment with the FAIR (Findable, Accessible, Interoperable, and Reusable) data principles, to ensure that data at the prototype level is consistent with and can be used by the international migration research community in the future. By providing a standardized framework for data organization, data sharing, and collaboration, the platform can help accelerate scientific discovery and advance our understanding of complex social and cultural phenomena such as immigration.
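The abstract's XML-based metadata modeling can be sketched as serializing each platform component into a small XML record. The element names (`dataset`, `title`, `format`, `license`) and the sample values are hypothetical; the abstract does not specify the actual COSMOS schema:

```python
import xml.etree.ElementTree as ET

# Sketch: describing one data component as an XML metadata record,
# using an open format and an open license in the spirit of FAIR.
record = ET.Element("dataset", id="mig-2020-01")            # hypothetical ID
ET.SubElement(record, "title").text = "Sample migration flows"
ET.SubElement(record, "format").text = "CSV"                # open data format
ET.SubElement(record, "license").text = "CC-BY-4.0"         # reuse terms

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Records like this give every component a machine-readable description that collaborators can exchange, validate, and version alongside the data itself.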
Minimally, a research data repository exists to make a collection of data assets available to potential users. If a dataset cannot be discovered and found, it cannot be reused (Garnett et al. 2017). Harvestable metadata catalogues are a key strategy for achieving greater global findability of data assets, as they create a surveyable access point for discovering data products within large data collections. Such catalogues can be especially effective if they are tailored for interoperability with feature-rich infrastructures (e.g. meta-catalogues, see Kapiszewski & Karcher 2020; CRFCB 2014) that are highly visible and widely used, and that are themselves integrated within the larger ecosystem of research infrastructures. This study offers insight into a set of World Data System (WDS) research data repositories' ongoing and successful implementations of harvestable metadata services, which apply established and emerging research data standards and practices to fit global, local, and domain-specific interoperability contexts. Establishing a harvestable metadata service involves making choices in a space where standards and technologies are continuously evolving. The repositories in this study leverage the resources they have, within the policy and funding constraints of their institutions, to serve the changing needs of heterogeneous user groups. This document encapsulates and completes the work carried out by the WDS International Technology Office (ITO) Harvestable Metadata Services Working Group (HMetS-WG).
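A common way repositories expose a harvestable catalogue is the OAI-PMH protocol. The sketch below only builds a harvest request URL; the endpoint is a placeholder, while the `verb`, `metadataPrefix`, and `from` parameters come from the OAI-PMH 2.0 specification:

```python
from urllib.parse import urlencode

# Sketch: composing an OAI-PMH harvest request (no request is sent here).
endpoint = "https://repository.example.org/oai"   # hypothetical endpoint

params = {
    "verb": "ListRecords",          # harvest full records
    "metadataPrefix": "oai_dc",     # Dublin Core, the mandatory baseline format
    "from": "2023-01-01",           # incremental harvest: only recent changes
}
url = f"{endpoint}?{urlencode(params)}"
print(url)
```

An aggregator or meta-catalogue periodically issues requests like this against each repository, which is what makes the local metadata globally findable.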