Mark Davison examines several legal models designed to protect databases, considering in particular the EU Directive, the history of its adoption and its transposition into national laws. He compares the Directive with a range of American legislative proposals, as well as the principles of misappropriation that underpin them. The book also contains a commentary on the appropriateness of the various models in the context of moves for an international agreement on the topic. This book will be of interest to academics and practitioners, including those involved with databases and other forms of new media.
This first-of-its-kind book places spatial data within the broader domain of information technology (IT) while providing a comprehensive and coherent explanation of the guiding principles, methods, implementation and operational management of spatial databases within the workplace. The text explains the key concepts, issues and processes of spatial data implementation and provides a holistic management perspective that complements the technical aspects of spatial data stressed in other textbooks. In this respect, this book is unique in its coverage of spatial database principles and architecture, database modelling including UML, database and spatial data standards, spatial data infrastructure, database implementation, and workplace-oriented project management including user needs study and end user education. The text opens with an overview of the current state of spatial information technology and concludes with a speculative account of likely future developments. Cutting edge research and practical workplace needs are defined and explained. Topics covered, among others, include strategies for end user education, current spatial data standards and their importance, legal issues and liabilities in the ownership and use of spatial data, spatial metadata use within distributed databases, the Internet and Web-based solutions to database deployment, quality assurance and quality control in database implementation and use, spatial decision support, and spatial data mining. The book applies equally to senior undergraduate and graduate courses and students, as well as spatial data managers and practitioners already in the workplace. It will enhance their technical and human-resource based understanding of spatial data management. Certification courses that seek to prepare students for careers in the spatial information industry and courses targeted at enhancing needed geospatial workplace knowledge and skills will benefit greatly from its content.
Abstract
The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. The Entrez system provides search and retrieval operations for most of these data from 39 distinct databases. The E-utilities serve as the programming interface for the Entrez system. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. New resources released in the past year include PubMed Data Management, RefSeq Functional Elements, genome data download, variation services API, Magic-BLAST, QuickBLASTp, and Identical Protein Groups. Resources that were updated in the past year include the genome data viewer, a human genome resources page, Gene, virus variation, OSIRIS, and PubChem. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
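As a rough illustration of what "programming interface for the Entrez system" means in practice, the sketch below builds a search request URL for the E-utilities ESearch endpoint. The base URL and the `db`, `term`, `retmax` and `retmode` parameters come from NCBI's public E-utilities documentation rather than from this abstract, and the helper name `build_esearch_url` is hypothetical. The sketch only constructs the URL and makes no network request.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: public E-utilities base URL, taken from NCBI's documentation,
# not from the abstract above.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Return an ESearch URL that searches one Entrez database for a term.

    db     -- Entrez database name, e.g. "pubmed" or "gene"
    term   -- the search query text
    retmax -- maximum number of record identifiers to return
    """
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Build (but do not send) a PubMed search for a hypothetical query.
    print(build_esearch_url("pubmed", "spatial databases", retmax=5))
```

Fetching the URL with any HTTP client would return the matching record identifiers. The request is left unsent here so that the example stays self-contained.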
The technologies of mobile communications and ubiquitous computing pervade our society, and wireless networks sense the movement of people and vehicles, generating large volumes of mobility data. This is a scenario of great opportunities and risks: on the one hand, mining these data can produce useful knowledge, supporting sustainable mobility and intelligent transportation systems; on the other hand, individual privacy is at risk, as the mobility data contain sensitive personal information. A new multidisciplinary research area is emerging at this crossroads of mobility, data mining, and privacy. This book assesses this research frontier from a computer science perspective, investigating the various scientific and technological issues, open problems, and roadmap. The editors manage a research project called GeoPKDD (Geographic Privacy-Aware Knowledge Discovery and Delivery), funded by the EU Commission and involving 40 researchers from 7 countries. This book tightly integrates and relates their findings in 13 chapters covering all related subjects, including the concepts of movement data and knowledge discovery from movement data, privacy-aware geographic knowledge discovery, wireless network and next-generation mobile technologies, trajectory data models, systems and warehouses, privacy and security aspects of technologies and related regulations, querying, mining and reasoning on spatiotemporal data, and visual analytics methods for movement data. This book will benefit researchers and practitioners in the related areas of computer science, geography, social science, statistics, law, telecommunications and transportation engineering.
Private information retrieval (PIR) is the problem of retrieving, as efficiently as possible, one out of $K$ messages from $N$ non-communicating replicated databases (each holds all $K$ messages) while keeping the identity of the desired message index a secret from each individual database. The information theoretic capacity of PIR (equivalently, the reciprocal of minimum download cost) is the maximum number of bits of desired information that can be privately retrieved per bit of downloaded information. $T$-private PIR is a generalization of PIR to include the requirement that even if any $T$ of the $N$ databases collude, the identity of the retrieved message remains completely unknown to them. Robust PIR is another generalization that refers to the scenario where we have $M \geq N$ databases, out of which any $M - N$ may fail to respond. For $K$ messages and $M \geq N$ databases out of which at least some $N$ must respond, we show that the capacity of $T$-private and Robust PIR is $(1+T/N+T^{2}/N^{2}+\cdots +T^{K-1}/N^{K-1})^{-1}$.
The result includes as special cases the capacity of PIR without robustness ($M=N$) or $T$-privacy constraints ($T=1$).
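To make the capacity expression concrete, the following minimal sketch evaluates it exactly with rational arithmetic for K messages, N responding databases and T colluding databases. The function name `pir_capacity` and the input check that T lies between 1 and N are assumptions added for illustration; the abstract states only the formula itself.

```python
from fractions import Fraction


def pir_capacity(K: int, N: int, T: int = 1) -> Fraction:
    """Evaluate (1 + T/N + (T/N)^2 + ... + (T/N)^(K-1))^(-1) exactly.

    K -- number of messages
    N -- number of databases that respond
    T -- number of databases that may collude (T = 1 means no collusion)
    """
    # Assumption for this sketch: restrict to 1 <= T <= N and K, N >= 1.
    if K < 1 or N < 1 or not (1 <= T <= N):
        raise ValueError("require K >= 1, N >= 1 and 1 <= T <= N")
    ratio = Fraction(T, N)
    # Sum the K terms of the geometric series, then take the reciprocal.
    return 1 / sum(ratio ** k for k in range(K))


if __name__ == "__main__":
    print(pir_capacity(K=2, N=2))        # 2/3
    print(pir_capacity(K=3, N=2, T=1))   # 4/7
    print(pir_capacity(K=4, N=3, T=3))   # 1/4
```

Two sanity checks follow directly from the formula: with a single message (K = 1) the capacity is 1, and when all N responding databases may collude (T = N) it drops to 1/K, the rate obtained by downloading every message.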
Nucleotide sequence and taxonomy reference databases are critical resources for widespread applications including marker-gene and metagenome sequencing for microbiome analysis, diet metabarcoding, and environmental DNA (eDNA) surveys. Reproducibly generating, managing, using, and evaluating nucleotide sequence and taxonomy reference databases creates a significant bottleneck for researchers aiming to generate custom sequence databases. Furthermore, database composition drastically influences results, and lack of standardization limits cross-study comparisons. To address these challenges, we developed RESCRIPt, a Python 3 software package and QIIME 2 plugin for reproducible generation and management of reference sequence taxonomy databases, including dedicated functions that streamline creating databases from popular sources, and functions for evaluating, comparing, and interactively exploring qualitative and quantitative characteristics across reference databases. To highlight the breadth and capabilities of RESCRIPt, we provide several examples for working with popular databases for microbiome profiling (SILVA, Greengenes, NCBI-RefSeq, GTDB), eDNA and diet metabarcoding surveys (BOLD, GenBank), as well as for genome comparison. We show that bigger is not always better, and reference databases with standardized taxonomies and those that focus on type strains have quantitative advantages, though they may not be appropriate for all use cases. Most databases appear to benefit from some curation (quality filtering), though sequence clustering appears detrimental to database quality. Finally, we demonstrate the breadth and extensibility of RESCRIPt for reproducible workflows with a comparison of global hepatitis genomes. RESCRIPt provides tools to democratize the process of reference database acquisition and management, enabling researchers to reproducibly and transparently create reference materials for diverse research applications.
RESCRIPt is released under a permissive BSD-3 license at https://github.com/bokulich-lab/RESCRIPt.
The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. The Entrez system provides search and retrieval operations for most of these data from 37 distinct databases. The E-utilities serve as the programming interface for the Entrez system. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. New resources released in the past year include iCn3D, MutaBind, and the Antimicrobial Resistance Gene Reference Database; and resources that were updated in the past year include My Bibliography, SciENcv, the Pathogen Detection Project, Assembly, Genome, the Genome Data Viewer, BLAST and PubChem. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
Abstract
Drug development involves a deep understanding of the mechanisms of action and possible side effects of each drug, and sometimes results in the identification of new and unexpected uses for drugs, termed drug repurposing. Both in the case of serendipitous observations and of systematic mechanistic explorations, confirmation of new indications for a drug requires hypothesis building around relevant drug-related data, such as molecular targets involved, and patient and cellular responses. These datasets are available in public repositories, but apart from the computational bottleneck imposed by sifting through the sheer amount of data, a major challenge is the difficulty in selecting which databases to use from an increasingly large number of available databases. The database selection is made harder by the lack of an overview of the types of data offered in each database. In order to alleviate these problems and to guide the end user through the drug repurposing efforts, we provide here a survey of 102 of the most promising and drug-relevant databases reported to date. We summarize the target coverage and types of data available in each database and provide several examples of how multi-database exploration can facilitate drug repurposing.
Organizations that collect large amounts of unstructured data are increasingly turning to nonrelational databases, now frequently called NoSQL databases.