Many businesses today save time and money, and increase their agility, by outsourcing mundane IT tasks to cloud providers. The author argues that similar methods can be used to overcome the complexities inherent in increasingly data-intensive, computational, and collaborative scientific research. He describes Globus Online, a system that he and his colleagues are developing to realize this vision.
A scientist’s choice of research problem affects his or her personal career trajectory. Scientists’ combined choices affect the direction and efficiency of scientific discovery as a whole. In this paper, we infer preferences that shape problem selection from patterns of published findings and then quantify their efficiency. We represent research problems as links between scientific entities in a knowledge network. We then build a generative model of discovery informed by qualitative research on scientific problem selection. We map salient features from this literature to key network properties: an entity’s importance corresponds to its degree centrality, and a problem’s difficulty corresponds to the network distance it spans. Drawing on millions of papers and patents published over 30 years, we use this model to infer the typical research strategy used to explore chemical relationships in biomedicine. This strategy generates conservative research choices focused on building up knowledge around important molecules. These choices become more conservative over time. The observed strategy is efficient for initial exploration of the network and supports scientific careers that require steady output, but is inefficient for science as a whole. Through supercomputer experiments on a sample of the network, we study thousands of alternatives and identify strategies much more efficient at exploring mature knowledge networks. We find that increased risk-taking and the publication of experimental failures would substantially improve the speed of discovery. We consider institutional shifts in grant making, evaluation, and publication that would help realize these efficiencies.
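The mapping described above (importance as degree centrality, difficulty as network distance) can be illustrated with a minimal sketch. The toy network, the entity names, and the helper functions below are hypothetical and are not taken from the paper; the sketch only shows how the two network properties could be computed for a candidate research problem, not the authors' generative model.

```python
from collections import deque

# Hypothetical toy knowledge network: nodes are scientific entities,
# edges are relationships that have already been published.
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("D", "E")]


def build_adjacency(edges):
    """Return an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def network_distance(adj, source, target):
    """Shortest-path length between two nodes (breadth-first search).

    Returns None if the nodes are not connected.
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nbr in adj[node]:
            if nbr == target:
                return dist + 1
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, dist + 1))
    return None


adj = build_adjacency(EDGES)
importance = degree_centrality(adj)

# A candidate "problem" is a not-yet-published link between two entities.
# Here, linking B and E spans a longer distance (a harder problem, in the
# abstract's terms) than linking B and C.
print(importance["A"], network_distance(adj, "B", "E"), network_distance(adj, "B", "C"))
```

In this toy network, entity A has the highest centrality, so a conservative strategy of the kind the abstract describes would favour new links that involve A and span short distances.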
Off‐site impacts of soil erosion are of greater social and economic concern in Western Europe than on‐site impacts. They fall into two related categories: muddy flooding of properties and ecological impacts on watercourses because of excessive sedimentation and associated pollutants. Critical to these impacts is the connectedness of the runoff and sediment system between agricultural fields and the river system. We argue that well‐connected systems causing off‐site damage are not necessarily related to areas of high erosion rates; emphasis should therefore be on the way in which connections occur. In temperate, arable systems, important elements of connectivity are anthropogenic in origin: roads, tracks, sunken lanes, field drains, ditches, culverts and permeable field boundaries. Mapping these features allows us to understand how they affect runoff and modify its impacts, to design appropriate mitigation measures and to better validate model predictions. Published maps (digital and paper) do not, by themselves, give sufficient information. Field mapping and observation, aided by remote sensing, are also necessary.
With current annual production at over 600 million tonnes, wheat is the third largest crop in the world behind corn and rice, and an essential source of carbohydrates for millions of people. While wheat is grown over a wide range of environments, it is common in the major wheat-producing countries for grain filling to occur when soil moisture is declining and temperature is increasing. Average global temperatures have increased over the last decades and are predicted to continue rising, along with a greater frequency of extremely hot days. Such events have already been reported for major wheat growing regions in the world. However, the direct impact of past temperature variability and changes in averages and extremes on wheat production has not been quantified. Attributing changes in observed yields over recent decades to a single factor such as temperature is not possible due to the confounding effects of other factors. By using simulation modelling, we were able to separate the impact of temperature from other factors and show that the effect of temperature on wheat production has been underestimated. Surprisingly, observed variations in average growing-season temperatures of ±2 °C in the main wheat growing regions of Australia can cause reductions in grain production of up to 50%. Most of this can be attributed to increased leaf senescence as a result of temperatures >34 °C. Temperature conditions during grain filling in the major wheat growing regions of the world are similar to the Australian conditions during grain filling. With average temperatures and the frequency of heat events projected to increase world-wide with global warming, yield reductions due to higher temperatures during the important grain-filling stage alone could substantially undermine future global food security. Adaptation strategies need to be considered now to prevent substantial yield losses in wheat from increasing future heat stress.
As materials data sets grow in size and scope, the role of data mining and statistical learning methods to analyze these materials data sets and build predictive models is becoming more important. This manuscript introduces matminer, an open-source, Python-based software platform to facilitate data-driven methods of analyzing and predicting materials properties. Matminer provides modules for retrieving large data sets from external databases such as the Materials Project, Citrination, Materials Data Facility, and Materials Platform for Data Science. It also provides implementations for an extensive library of feature extraction routines developed by the materials community, with 47 featurization classes that can generate thousands of individual descriptors and combine them into mathematical functions. Finally, matminer provides a visualization module for producing interactive, shareable plots. These functions are designed in a way that integrates closely with machine learning and data analysis packages already developed and in use by the Python data science community. We explain the structure and logic of matminer, provide a description of its various modules, and showcase several examples of how matminer can be used to collect data, reproduce data mining studies reported in the literature, and test new methodologies.
Channel banks can contribute a significant proportion of fine‐grained (<63 μm) sediment to rivers, thereby also contributing to riverine total particulate phosphorus loads. Improving water quality through better agricultural practices alone can be difficult since the contributions from non‐agricultural sources, including channel banks, can generate a ‘spatial mismatch’ between the efficacy of best management applied on farms and the likelihood of meeting environmental objectives. Our study undertook a reconnaissance survey (n = 76 sites each with 3 profiles sampled) to determine the total phosphorus (TP) concentrations of channel banks across England and to determine if TP content can be predicted using readily accessible secondary data. TP concentrations in adjacent field topsoils, local soil type/texture and geological parent material were examined as potential predictors of bank TP. Carbon and nitrogen content were also analysed to explore the impacts of organic matter content on measured TP concentrations. The results suggest that channel bank TP concentrations are primarily controlled by parent material rather than P additions to adjacent topsoils through fertilizer and organic matter inputs, but significant local variability in concentrations prevents the prediction of bank TP content using mapped soil type or geology. A median TP concentration of 873 mg kg−1 was calculated for the middle section of the sampled channel bank profiles, with a 25th percentile of 675 mg kg−1 and a 75th percentile of 1159 mg kg−1. Using these concentrations, and in comparison with previously published estimates, the estimated number of inland WFD waterbodies in England for which channel bank erosion contributes >20% of the riverine total PP load increased from 15 to 25 (corresponding range of 17–35 using the 25th and 75th percentiles of measured TP concentrations). Collectively, these 25 waterbodies account for 0.2% of the total inland WFD waterbody area comprising England.
The assessment and, where necessary, improved management, of particulate phosphorus loss from eroding channel banks at catchment and strategic scale is important for water quality protection and provision of ecosystem services. A reconnaissance survey of the total phosphorus content of channel banks was undertaken at national scale across England and the new information generated was employed to update estimates of the relative contributions of bank‐derived inputs to riverine total phosphorus loads at Water Framework Directive (WFD) waterbody scale.
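The sensitivity of the >20% threshold to the measured TP concentration can be illustrated with a simple mass-balance sketch. The abstract does not state how bank-derived loads were calculated, so the formula below (bank sediment mass multiplied by TP concentration) is an assumption, and the bank sediment load and total PP load are hypothetical values chosen only for illustration. Only the three TP concentrations come from the survey results reported above.

```python
# TP concentrations reported in the abstract (mg of P per kg of bank sediment).
TP_MG_PER_KG = {"p25": 675.0, "median": 873.0, "p75": 1159.0}

# Hypothetical inputs for one waterbody (NOT from the study).
BANK_SEDIMENT_T_PER_YR = 100.0   # fine sediment eroded from banks, tonnes/yr
TOTAL_PP_LOAD_KG_PER_YR = 400.0  # riverine total particulate P load, kg/yr

THRESHOLD = 0.20  # the >20% contribution threshold used in the abstract


def bank_pp_load_kg(bank_sediment_t, tp_mg_per_kg):
    """Assumed mass balance: sediment mass times TP concentration.

    tonnes -> kg of sediment (x 1000), mg of P -> kg of P (x 1e-6).
    """
    return bank_sediment_t * 1000.0 * tp_mg_per_kg * 1e-6


def bank_share(bank_sediment_t, tp_mg_per_kg, total_pp_load_kg):
    """Fraction of the total riverine PP load attributed to channel banks."""
    return bank_pp_load_kg(bank_sediment_t, tp_mg_per_kg) / total_pp_load_kg


shares = {
    label: bank_share(BANK_SEDIMENT_T_PER_YR, tp, TOTAL_PP_LOAD_KG_PER_YR)
    for label, tp in TP_MG_PER_KG.items()
}

for label, share in shares.items():
    flag = "exceeds" if share > THRESHOLD else "is below"
    print(f"{label}: bank share = {share:.1%}, {flag} the 20% threshold")
```

Under these assumed inputs, the hypothetical waterbody crosses the threshold at the median and 75th-percentile concentrations but not at the 25th percentile, which is the kind of sensitivity that produces a range of waterbody counts rather than a single number.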
The Globus Toolkit (GT) has been developed since the late 1990s to support the development of service-oriented distributed computing applications and infrastructures. Core GT components address, within a common framework, fundamental issues relating to security, resource access, resource management, data movement, resource discovery, and so forth. These components enable a broader “Globus ecosystem” of tools and components that build on, or interoperate with, GT functionality to provide a wide range of useful application-level functions. These tools have in turn been used to develop a wide range of both “Grid” infrastructures and distributed applications. I summarize here the principal characteristics of the recent Web Services-based GT4 release, which provides significant improvements over previous releases in terms of robustness, performance, usability, documentation, standards compliance, and functionality. I also introduce the new “dev.globus” community development process, which allows a larger community to contribute to the development of Globus software.
Soil erosion on agricultural land is a growing problem in Western Europe and constitutes a threat to soil quality and to the ability of soils to provide environmental services. The off-site impacts of runoff and eroded soil, principally eutrophication of water bodies, sedimentation of gravel-bedded rivers, loss of reservoir capacity, and muddy flooding of roads and communities, are increasingly recognised and costed. The shift of funding in the European Union (EU) from production-related support to avoidance of pollution and landscape protection raises issues of cross-compliance: public support for agriculture has to be seen to give value-for-money. In this context risk-assessment procedures have been introduced to help farmers recognise sites where either certain crops should not be grown or anti-erosion measures are required. In England, Defra (Defra, 2005a. Controlling Soil Erosion: a Manual for the Assessment and Management of Agricultural Land at Risk of Water Erosion in Lowland England. Revised September 2005. Department for Environment, Food and Rural Affairs, London) sets out a system of risk-assessment, including ranking of crops susceptible to erosion and anti-erosion measures that may be selected. We assess this system using field data for an area of erodible soils in the Rother valley, Sussex. The Defra approach correctly identifies most at-risk fields and, taken together with land-use maps, allows non-compliance with advice to be highlighted. We suggest a simple extension to the system which would further identify at-risk fields in terms of possible damage to roads and rivers from muddy runoff. The increased risk of erosion in the study area is associated with certain crops (potatoes, winter cereals, maize and grazed turnips) and seems unlikely to be the result of changes in rainfall, which over the last 130 years are minimal. We have not evaluated proposed anti-erosion measures in the area because few have been put into practice. The European Water Framework Directive will increasingly focus attention on agricultural fields as a source of river pollution. Assessing the risk of erosion, and the need for field testing of suggested approaches, are issues not simply for the EU but for the management of global agricultural systems.