  • Observational Data Patterns...
    Pastorello, Gilberto; Agarwal, Deb; Samak, Taghrid; Poindexter, Cristina; Faybishenko, Boris; Gunter, Dan; Hollowgrass, Rachel; Papale, Dario; Trotta, Carlo; Ribeca, Alessio; Canfora, Eleonora

    2014 IEEE 10th International Conference on e-Science, 2014-Oct., Volume: 1
    Conference Proceeding

    Observational data are fundamental for scientific research in almost any domain. Recent advances in sensor and data management technologies are enabling unprecedented amounts of observational data to be collected and analyzed. However, an essential part of using observational data is not currently as scalable as data collection and analysis methods: data quality assurance and control. While specialized tools for very narrow domains do exist, general methods are harder to create. This paper explores the identification of data issues that lead to the creation of data tests and tools to perform data quality control activities. Developing this identification step in a systematic manner allows for better and more general quality control tools. As our case study, we use carbon, water, and energy fluxes as well as micro-meteorological data collected at field sites that are part of FLUXNET, a network of over 400 ecosystem-level monitoring stations. In an effort toward the release of a new global data set of fluxes, we are doing data quality control for these data. The experience from this work led to the creation of a catalog of issues identified in the data. This paper presents this catalog and its generalization into a set of patterns of data quality issues that can be detected in observational data.