A climate data record of global sea surface temperature (SST) spanning 1981-2016 has been developed from 4 × 10¹² satellite measurements of thermal infra-red radiance. The spatial area represented by pixel SST estimates is between 1 km² and 45 km². The mean density of good-quality observations is 13 km⁻² yr⁻¹. SST uncertainty is evaluated per datum, the median uncertainty for pixel SSTs being 0.18 K. Multi-annual observational stability relative to drifting buoy measurements is within 0.003 K yr⁻¹ of zero with high confidence, despite maximal independence from in situ SSTs over the latter two decades of the record. Data are provided at native resolution, gridded at 0.05° latitude-longitude resolution (individual sensors), and aggregated and gap-filled on a daily 0.05° grid. Skin SSTs, depth-adjusted SSTs de-aliased with respect to the diurnal cycle, and SST anomalies are provided. Target applications of the dataset include: climate and ocean model evaluation; quantification of marine change and variability (including marine heatwaves); climate and ocean-atmosphere processes; and specific applications in ocean ecology, oceanography and geophysics.
The HydroATLAS database provides a standardized compendium of descriptive hydro-environmental information for all watersheds and rivers of the world at high spatial resolution. Version 1.0 of HydroATLAS offers data for 56 variables, partitioned into 281 individual attributes and organized in six categories: hydrology; physiography; climate; land cover & use; soils & geology; and anthropogenic influences. HydroATLAS derives the hydro-environmental characteristics by aggregating and reformatting original data from well-established global digital maps, and by accumulating them along the drainage network from headwaters to ocean outlets. The attributes are linked to hierarchically nested sub-basins at multiple scales, as well as to individual river reaches, both extracted from the global HydroSHEDS database at 15 arc-second (~500 m) resolution. The sub-basin and river reach information is offered in two companion datasets: BasinATLAS and RiverATLAS. The standardized format of HydroATLAS ensures easy applicability, while the inherent topological information supports basic network functionality such as identifying up- and downstream connections. HydroATLAS is fully compatible with other products of the overarching HydroSHEDS project, enabling versatile hydro-ecological assessments for a broad user community.
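The accumulation of attributes from headwaters to outlets can be illustrated on a toy network. This is a minimal sketch under stated assumptions, not the HydroATLAS procedure itself: the reach identifiers, the `next_down` mapping and the function `accumulate_downstream` are hypothetical, and local values are simply summed in downstream order.

```python
from collections import defaultdict, deque


def accumulate_downstream(next_down, local_value):
    """Sum each reach's local value with everything upstream of it.

    next_down:   dict reach_id -> downstream reach_id (None at an outlet).
    local_value: dict reach_id -> value contributed by the reach itself.
    Both structures are hypothetical illustrations.
    """
    # Count how many reaches drain directly into each reach.
    n_upstream = defaultdict(int)
    for reach, down in next_down.items():
        if down is not None:
            n_upstream[down] += 1

    total = dict(local_value)
    # Start at headwaters (no upstream neighbours) and move downstream.
    queue = deque(r for r in next_down if n_upstream[r] == 0)
    while queue:
        reach = queue.popleft()
        down = next_down[reach]
        if down is None:
            continue  # outlet reached, nothing further downstream
        total[down] += total[reach]
        n_upstream[down] -= 1
        if n_upstream[down] == 0:
            # All tributaries of `down` are done, so it can be processed.
            queue.append(down)
    return total


# Hypothetical network: reaches 1 and 2 join at 3, which drains to outlet 4.
next_down = {1: 3, 2: 3, 3: 4, 4: None}
local_area_km2 = {1: 10.0, 2: 5.0, 3: 2.0, 4: 1.0}
upstream_area_km2 = accumulate_downstream(next_down, local_area_km2)
```

Processing reaches only after all of their tributaries are complete means each reach is visited once, which matters for a network with millions of reaches.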
CRU TS (Climatic Research Unit gridded Time Series) is a widely used climate dataset on a 0.5° latitude by 0.5° longitude grid over all land domains of the world except Antarctica. It is derived by the interpolation of monthly climate anomalies from extensive networks of weather station observations. Here we describe the construction of a major new version, CRU TS v4. It is updated to span 1901-2018 by the inclusion of additional station observations, and it will be updated annually. The interpolation process has been changed to use angular-distance weighting (ADW), and the production of secondary variables has been revised to better suit this approach. This implementation of ADW provides improved traceability between each gridded value and the input observations, and allows more informative diagnostics that dataset users can utilise to assess how dataset quality might vary geographically.
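To make the idea of angular-distance weighting concrete, the sketch below implements one common textbook formulation: an exponential distance weight, boosted for stations that are directionally isolated from the others. It is not the CRU TS v4 implementation. The function name `adw_interpolate`, the planar coordinates, the decay length `decay_km` and the exponent `power` are all illustrative assumptions.

```python
import math


def adw_interpolate(target, stations, decay_km=1200.0, power=4):
    """Angular-distance-weighted mean of station anomalies at `target`.

    target:   (x_km, y_km) of the grid point (planar coordinates, assumed).
    stations: list of (x_km, y_km, anomaly).
    decay_km and power are illustrative values, not taken from the paper.
    """
    dist_w, bearings, values = [], [], []
    for x, y, value in stations:
        dx, dy = x - target[0], y - target[1]
        d = math.hypot(dx, dy)
        dist_w.append(math.exp(-d / decay_km) ** power)  # distance weight
        bearings.append(math.atan2(dy, dx))              # direction to station
        values.append(value)

    weights = []
    for k, w_k in enumerate(dist_w):
        others = [(w, b) for j, (w, b) in enumerate(zip(dist_w, bearings)) if j != k]
        denom = sum(w for w, _ in others)
        # Angular term: grows when the other stations lie in other directions.
        if denom > 0:
            angular = sum(w * (1 - math.cos(b - bearings[k])) for w, b in others) / denom
        else:
            angular = 0.0
        weights.append(w_k * (1 + angular))
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Two co-located stations to the east, one isolated station to the west,
# all at the same distance from the grid point (hypothetical values).
stations = [(100.0, 0.0, 1.0), (100.0, 0.0, 1.0), (-100.0, 0.0, 0.0)]
estimate = adw_interpolate((0.0, 0.0), stations)
```

With pure distance weighting the three equidistant stations would average to 2/3; the angular term gives the isolated western station more influence, pulling the estimate down to 4/7.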
MIMIC-III ('Medical Information Mart for Intensive Care') is a large, single-center database comprising information relating to patients admitted to critical care units at a large tertiary care hospital. Data includes vital signs, medications, laboratory measurements, observations and notes charted by care providers, fluid balance, procedure codes, diagnostic codes, imaging reports, hospital length of stay, survival data, and more. The database supports applications including academic and industrial research, quality improvement initiatives, and higher education coursework.
As fundamental data, gross domestic product (GDP) and electricity consumption can be used to effectively evaluate the economic status and living standards of residents. Some scholars have estimated gridded GDP and electricity consumption. However, such gridded data have shortcomings, including overestimating real GDP growth, ignoring the heterogeneity of the spatiotemporal dynamics of the grid, and a limited time span. At the same time, the Defense Meteorological Satellite Program's Operational Linescan System (DMSP/OLS) and National Polar-orbiting Partnership's Visible Infrared Imaging Radiometer (NPP/VIIRS) nighttime light data, adopted in these studies as a proxy tool, still face shortcomings, such as imperfect matching results and discontinuity in temporal and spatial changes. In this study, we employed a series of methods, such as a particle swarm optimization-back propagation (PSO-BP) algorithm, to unify the scales of DMSP/OLS and NPP/VIIRS images and obtain continuous 1 km × 1 km gridded nighttime light data during 1992-2019. Subsequently, from a revised real growth perspective, we employed a top-down method to calculate global 1 km × 1 km gridded revised real GDP and electricity consumption during 1992-2019 based on our calibrated nighttime light data.
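In its simplest form, a top-down method distributes a known regional total across grid cells in proportion to a proxy such as nighttime light. The sketch below shows only that proportional step; the study's actual procedure may involve further calibration, and the function name `allocate_top_down` and all numbers are hypothetical.

```python
def allocate_top_down(regional_total, cell_lights):
    """Split a regional total across grid cells in proportion to light.

    regional_total: known total for the region (e.g. national real GDP).
    cell_lights:    calibrated nighttime light value for each grid cell.
    """
    total_light = sum(cell_lights)
    if total_light == 0:
        raise ValueError("no light signal to allocate against")
    return [regional_total * light / total_light for light in cell_lights]


# Hypothetical region with four 1 km cells and a total of 500 units.
cell_lights = [0.0, 10.0, 30.0, 60.0]
cell_gdp = allocate_top_down(500.0, cell_lights)
```

By construction the cell values sum back to the regional total, so the gridded product stays consistent with the statistics it was derived from.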
This data descriptor reports results of a 1972-73 baseline study of modern pollen deposition in the Canadian Arctic, originally intended to aid interpretation of Holocene pollen diagrams from that region, with a particular focus on the arctic tree-line. The data set is geographically unique due to its extent, and allows the assessment of the effects of modern climate change on northern ecosystems, including fluctuations of the arctic tree-line. Repeated sampling was conducted along an interior transect at 29 sites from the Boreal Forest to the High Arctic, with five additional coastal sites covering a total distance of 3,200 km. Static pollen samplers captured both local pollen and long-distance pollen wind-blown from the Boreal Forest. Moss and lichen polsters provided multi-year pollen fallout to assess the effectiveness of the static pollen samplers. The local vegetation was recorded at each site. This descriptor provides information on data archived at the World Data Center PANGAEA, which includes spreadsheets detailing site and sample information as well as raw and processed pollen data obtained on over 500 samples.
Smart meter roll-outs provide easy access to granular meter measurements, enabling advanced energy services ranging from demand response measures and tailored energy feedback to smart home/building automation. To design such services and to train and validate models, access to data that resembles what is expected of smart meters, collected in a real-world setting, is necessary. The REFIT electrical load measurements dataset described in this paper includes whole house aggregate loads and nine individual appliance measurements at 8-second intervals per house, collected continuously over a period of two years from 20 houses. During monitoring, the occupants were conducting their usual routines. At the time of publishing, the dataset has the largest number of houses monitored in the United Kingdom at less than 1-minute intervals over a period greater than one year. The dataset comprises 1,194,958,790 readings, which represent over 250,000 monitored appliance uses. The data is accessible in an easy-to-use comma-separated format, is time-stamped, and has been cleaned to remove invalid measurements, correctly label appliance data and fill in small gaps of missing data.
The knowledge of the vibrational properties of a material is of key importance to understand physical phenomena such as thermal conductivity, superconductivity, and ferroelectricity among others. However, detailed experimental phonon spectra are available only for a limited number of materials, which hinders the large-scale analysis of vibrational properties and their derived quantities. In this work, we perform ab initio calculations of the full phonon dispersion and vibrational density of states for 1521 semiconductor compounds in the harmonic approximation based on density functional perturbation theory. The data is collected along with derived dielectric and thermodynamic properties. We present the procedure used to obtain the results, the details of the provided database and a validation based on the comparison with experimental data.
Most of the existing chest X-ray datasets include labels from a list of findings without specifying their locations on the radiographs. This limits the development of machine learning algorithms for the detection and localization of chest abnormalities. In this work, we describe a dataset of more than 100,000 chest X-ray scans that were retrospectively collected from two major hospitals in Vietnam. Out of this raw data, we release 18,000 images that were manually annotated by a total of 17 experienced radiologists with 22 local labels of rectangles surrounding abnormalities and 6 global labels of suspected diseases. The released dataset is divided into a training set of 15,000 and a test set of 3,000. Each scan in the training set was independently labeled by 3 radiologists, while each scan in the test set was labeled by the consensus of 5 radiologists. We designed and built a labeling platform for DICOM images to facilitate these annotation procedures. All images are made publicly available in DICOM format along with the labels of both the training set and the test set.
Although a key driver of Earth's climate system, global land-atmosphere energy fluxes are poorly constrained. Here we use machine learning to merge energy flux measurements from FLUXNET eddy covariance towers with remote sensing and meteorological data to estimate global gridded net radiation, latent and sensible heat and their uncertainties. The resulting FLUXCOM database comprises 147 products in two setups: (1) 0.0833° resolution using MODIS remote sensing data (RS) and (2) 0.5° resolution using remote sensing and meteorological data (RS + METEO). Within each setup we use a full factorial design across machine learning methods, forcing datasets and energy balance closure corrections. For RS and RS + METEO setups respectively, we estimate 2001-2013 global (±1 s.d.) net radiation as 75.49 ± 1.39 W m⁻² and 77.52 ± 2.43 W m⁻², sensible heat as 32.39 ± 4.17 W m⁻² and 35.58 ± 4.75 W m⁻², and latent heat flux as 39.14 ± 6.60 W m⁻² and 39.49 ± 4.51 W m⁻² (as evapotranspiration, 75.6 ± 9.8 × 10³ km³ yr⁻¹ and 76 ± 6.8 × 10³ km³ yr⁻¹). FLUXCOM products are suitable to quantify global land-atmosphere interactions and benchmark land surface model simulations.
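The link between a latent heat flux in W m⁻² and evapotranspiration can be sketched as a unit conversion. The constants below are assumptions, not values from the abstract: a latent heat of vaporization of 2.45 MJ kg⁻¹, a 365.25-day year, and 1 kg of water per m² equal to 1 mm depth. The land area needed for a volume is not given in the abstract, so it is left as a parameter, and both function names are hypothetical.

```python
LATENT_HEAT_J_PER_KG = 2.45e6         # assumed latent heat of vaporization
SECONDS_PER_YEAR = 365.25 * 86400.0   # assumed year length


def latent_heat_to_et_mm_per_year(le_w_m2):
    """Convert a mean latent heat flux (W m^-2) to ET depth (mm yr^-1).

    Energy per m^2 per year divided by latent heat gives kg m^-2 yr^-1,
    and 1 kg of water spread over 1 m^2 is 1 mm deep.
    """
    return le_w_m2 * SECONDS_PER_YEAR / LATENT_HEAT_J_PER_KG


def et_volume_km3_per_year(et_mm_per_year, area_km2):
    """Convert an ET depth over a given area to a volume (km^3 yr^-1)."""
    return et_mm_per_year * 1e-6 * area_km2  # 1 mm = 1e-6 km


# Mean latent heat flux reported for the RS setup.
et_mm = latent_heat_to_et_mm_per_year(39.14)
```

Under these assumptions, 39.14 W m⁻² corresponds to roughly 500 mm yr⁻¹ of evapotranspiration; the volume then scales linearly with whatever land area is used.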