Taiwan's National Health Insurance Research Database (NHIRD) exemplifies a population-level data source for generating real-world evidence to support clinical decisions and health care policy-making. As with all claims databases, there have been concerns about the validity of studies using the NHIRD, such as the accuracy of diagnosis codes and issues around unmeasured confounders. Endeavors to validate diagnosis codes or to develop methodologic approaches to address unmeasured confounders have largely increased the reliability of NHIRD studies. Recently, Taiwan's Ministry of Health and Welfare (MOHW) established a Health and Welfare Data Center (HWDC), a data repository site that centralizes the NHIRD and about 70 other health-related databases for data management and analyses. To strengthen the protection of data privacy, investigators are required to conduct on-site analysis at an HWDC through remote connection to MOHW servers. Although the tight regulation of this on-site analysis has led to inconvenience for analysts and has increased the time and costs required for research, the HWDC has created opportunities for enriched dimensions of study by linking across the NHIRD and other databases. In the near future, researchers will have greater opportunity to distill knowledge from the NHIRD linked to hospital-based electronic medical records databases containing unstructured patient-level information by using artificial intelligence techniques, including machine learning and natural language processing. We believe that the NHIRD combined with multiple data sources could represent a powerful research engine with enriched dimensions and could serve as a guiding light for real-world evidence-based medicine in Taiwan.
Big Data is an emerging paradigm that has become a strong attractor of global interest, especially within the transportation industry. The combination of disruptive technologies and new concepts such as the Smart City upgrades the transport data life cycle. In this context, Big Data is considered a new pledge for the transportation industry to effectively manage all the data this sector requires for providing safer, cleaner and more efficient means of transport, as well as for users to personalize their transport experience. However, Big Data comes with its own set of technological challenges, stemming from the multiple and heterogeneous transportation/mobility application scenarios. In this survey we analyze the latest research efforts revolving around Big Data for the transportation and mobility industry, its applications, baseline scenarios, fields and use cases such as routing, planning, infrastructure monitoring, and network design, among others. This analysis is done strictly from the Big Data perspective, focusing on those contributions gravitating around techniques, tools and methods for modeling, processing, analyzing and visualizing transport and mobility Big Data. From the literature review a set of trends and challenges is extracted so as to provide researchers with an insightful outlook on the field of transport and mobility.
Big Data: A Survey. Chen, Min; Mao, Shiwen; Liu, Yunhao
Mobile Networks and Applications, 04/2014, Volume 19, Issue 2. Journal Article, Peer reviewed
In this paper, we review the background and state-of-the-art of big data. We first introduce the general background of big data and review related technologies, such as cloud computing, Internet of Things, data centers, and Hadoop. We then focus on the four phases of the value chain of big data, i.e., data generation, data acquisition, data storage, and data analysis. For each phase, we introduce the general background, discuss the technical challenges, and review the latest advances. We finally examine several representative applications of big data, including enterprise management, Internet of Things, online social networks, medical applications, collective intelligence, and smart grid. These discussions aim to provide a comprehensive overview and big picture of this exciting area to readers. This survey concludes with a discussion of open problems and future directions.
Hypoxia is a major concern in coastal waters as it affects ecosystem health, fishery yield, and marine water resources. Accurately modeling coastal hypoxia is still very challenging even with the most advanced numerical models. A data‐driven model for coastal water quality is proposed in this study and is applied to predict the temporal‐spatial variations of dissolved oxygen (DO) and hypoxic conditions in Chesapeake Bay, the largest estuary in the United States, with a mean summer hypoxic zone extending about 150 km along its main axis. The proposed model has three major components: empirical orthogonal function analysis, automatic selection of forcing transformation, and neural network training. It first uses empirical orthogonal functions to extract the principal components, then applies a neural network to train models for the temporal variations of the principal components, and finally reconstructs the three‐dimensional temporal‐spatial variations of the DO. Using the first 75% of the 32‐year (1985–2016) data set for training, the model shows good performance for the testing period (the remaining 25% of the data set). Selection of forcings for the first mode points to the dominant role of streamflow in controlling interannual variability of bay‐wide DO conditions. Unlike previous empirical models, the approach is able to simulate three‐dimensional variations of water quality variables, and it uses only external forcings as model inputs rather than in situ measured water quality variables. Even though the approach is used for the hypoxia problem in Chesapeake Bay, the methodology is readily applicable to other coastal systems that are systematically monitored.
Key Points
We propose a data‐driven approach to model spatial‐temporal variations of water quality
The approach combines data‐dimension reduction method and deep‐learning techniques
The application for DO in Chesapeake Bay shows high model performance
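The decomposition and reconstruction steps described in this abstract can be illustrated with a minimal sketch. It assumes the empirical orthogonal functions are computed by a singular value decomposition of the time-mean-removed data matrix, which is a common approach but not necessarily the authors' implementation. The function names `eof_decompose` and `eof_reconstruct` and the synthetic data are hypothetical, and the neural network step is omitted.

```python
import numpy as np


def eof_decompose(data, n_modes):
    """Split a (time, space) matrix into leading EOFs and principal components.

    Assumption: EOFs come from an SVD of the anomalies (time mean removed).
    """
    mean = data.mean(axis=0)
    anomalies = data - mean
    # Rows of vt are spatial patterns (EOFs); u * s gives their time series (PCs).
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    pcs = u[:, :n_modes] * s[:n_modes]
    eofs = vt[:n_modes]
    return mean, eofs, pcs


def eof_reconstruct(mean, eofs, pcs):
    """Rebuild the (time, space) field from principal components and EOFs."""
    return mean + pcs @ eofs


# Synthetic stand-in for an observed field: 40 time steps, 6 stations.
rng = np.random.default_rng(0)
field = rng.normal(size=(40, 6))

# Chronological split: first 75% of the record for training, the rest for testing.
n_train = int(0.75 * field.shape[0])
train, test = field[:n_train], field[n_train:]

mean, eofs, pcs = eof_decompose(train, n_modes=6)
rebuilt = eof_reconstruct(mean, eofs, pcs)
```

In a full workflow of this kind, a trained model would supply predicted principal components for the testing period, and `eof_reconstruct` would turn them back into a spatial field; keeping fewer modes than stations gives the dimension reduction.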
Agriculture provides for the most basic needs of humankind: food and fiber. The introduction of new farming techniques in the past century (e.g., during the Green Revolution) has helped agriculture keep pace with growing demands for food and other agricultural products. However, further increases in food demand, a growing population, and rising income levels are likely to put additional strain on natural resources. With growing recognition of the negative impacts of agriculture on the environment, new techniques and approaches should be able to meet future food demands while maintaining or reducing the environmental footprint of agriculture. Emerging technologies, such as geospatial technologies, Internet of Things (IoT), Big Data analysis, and artificial intelligence (AI), could be utilized to make informed management decisions aimed to increase crop production. Precision agriculture (PA) entails the application of a suite of such technologies to optimize agricultural inputs to increase agricultural production and reduce input losses. Use of remote sensing technologies for PA has increased rapidly during the past few decades. The unprecedented availability of high resolution (spatial, spectral and temporal) satellite images has promoted the use of remote sensing in many PA applications, including crop monitoring, irrigation management, nutrient application, disease and pest management, and yield prediction. In this paper, we provide an overview of remote sensing systems, techniques, and vegetation indices along with their recent (2015–2020) applications in PA. Remote-sensing-based PA technologies such as variable fertilizer rate application technology in Green Seeker and Crop Circle have already been incorporated in commercial agriculture. Use of unmanned aerial vehicles (UAVs) has increased tremendously during the last decade due to their cost-effectiveness and flexibility in obtaining the high-resolution (cm-scale) images needed for PA applications.
At the same time, the availability of a large amount of satellite data has prompted researchers to explore advanced data storage and processing techniques such as cloud computing and machine learning. Given the complexity of image processing and the amount of technical knowledge and expertise needed, it is critical to explore and develop a simple yet reliable workflow for the real-time application of remote sensing in PA. Development of accurate yet easy to use, user-friendly systems is likely to result in broader adoption of remote sensing technologies in commercial and non-commercial PA applications.
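As an illustration of what a vegetation index computes, the sketch below evaluates the Normalized Difference Vegetation Index, NDVI = (NIR − Red) / (NIR + Red), for a few pixels. The abstract does not name a specific index, so the choice of NDVI is an assumption made here for illustration, and the reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    nir and red are reflectances in the near-infrared and red bands.
    Returns 0.0 when both bands are zero to avoid division by zero
    (a convention chosen for this sketch).
    """
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total


# Hypothetical reflectance pairs (nir, red) for three pixels.
pixels = [(0.50, 0.10), (0.30, 0.30), (0.0, 0.0)]
values = [ndvi(nir, red) for nir, red in pixels]
```

Higher values indicate a stronger near-infrared response relative to red, which is typical of dense green vegetation.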
In the digital age, customers use online reviews to minimize the risks associated with purchasing a product. Major online retailers help customers choose the right product by displaying reviews that received many “helpful” votes at the top of the review section. Given that reviews that have received the most helpfulness votes are considered more important in purchase decisions, understanding the determinants of helpfulness votes offers clear benefits to online retailers and review platforms. This study focuses on the effect of review informativeness, which is measured by the number of attributes discussed in a review, and its interplay with review valence on customers' perception of review helpfulness. We applied a word-level bigram analysis to derive product attributes from review text and examined the influence of the number of attributes on the review's helpfulness votes. More importantly, we also proposed a moderating role of review valence. Estimation results of the Zero-inflated Poisson models on 21,125 reviews across 14 wireless earbuds indicated that the more attributes are discussed in a review, the more helpfulness votes the review earns from customers. Furthermore, the positive association between the number of attributes and helpfulness was stronger among negative reviews. This study contributes to the literature on customers' information processing and offers guidelines to online retailers in designing a better decision support system.
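A word-level bigram analysis of the kind mentioned above can be sketched as follows. This is a minimal illustration, not the study's pipeline: the tokenization rule, the frequency threshold, the function name `word_bigrams`, and the review snippets are all assumptions made for this example.

```python
import re
from collections import Counter


def word_bigrams(text):
    """Return adjacent word pairs from lowercased text."""
    words = re.findall(r"[a-z']+", text.lower())
    return list(zip(words, words[1:]))


# Hypothetical review snippets, not data from the study.
reviews = [
    "Battery life is great and the sound quality is clear.",
    "Sound quality is fine but battery life is short.",
]

counts = Counter()
for review in reviews:
    counts.update(word_bigrams(review))

# Bigrams that recur across reviews are candidate attribute phrases.
candidates = [pair for pair, n in counts.items() if n >= 2]
```

Here `candidates` contains pairs such as `("battery", "life")` and `("sound", "quality")`, along with less useful pairs such as `("life", "is")`; a real analysis would need a filtering step (for example, stop-word removal) before counting attributes per review.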
We present a revised global plate motion model with continuously closing plate boundaries ranging from the Triassic at 230 Ma to the present day, assess differences among alternative absolute plate motion models, and review global tectonic events. Relatively high mean absolute plate motion rates of approximately 9-10 cm yr⁻¹ between 140 and 120 Ma may be related to transient plate motion accelerations driven by the successive emplacement of a sequence of large igneous provinces during that time. An event at ∼100 Ma is most clearly expressed in the Indian Ocean and may reflect the initiation of Andean-style subduction along southern continental Eurasia, whereas an acceleration at ∼80 Ma of mean rates from 6 to 8 cm yr⁻¹ reflects the initial northward acceleration of India and simultaneous speedups of plates in the Pacific. An event at ∼50 Ma expressed in relative, and some absolute, plate motion changes around the globe and in a reduction of global mean plate speeds from about 6 to 4-5 cm yr⁻¹ indicates that an increase in collisional forces (such as the India-Eurasia collision) and ridge subduction events in the Pacific (such as the Izanagi-Pacific Ridge) play a significant role in modulating plate velocities.
Understanding the soil and water conservation (SWC) impact of steep‐slope agricultural practices (e.g. terraces) has arguably never been more relevant than today, in the face of widespread intensifying rainfall conditions. In Italy, a diverse mosaic of terraced and non‐terraced cultivation systems has historically developed from local traditions and more recently from the introduction of machinery. Previous studies suggested that each type of vineyard configuration is characterised by a specific set of soil degradation patterns. However, an extensive analysis of SWC impacts by different vineyard configurations is missing, although this is crucial for providing robust guidelines for future‐proof viticulture. Here, we provide a unique extensive comparison of SWC in 50 vineyards, consisting of 10 sites for each of 5 distinct practices: slope‐wise cultivation (SC), contour cultivation (CC), contour terracing (CT), broad‐base terracing (BT) and oblique terracing (OT). A big‐data analysis approach of physical erosion modelling based on high‐resolution LiDAR data is performed, while four predefined SWC indicators are systematically analysed and statistically quantified. Regular contour terracing (CT) ranked best across all indicators, reflecting a good combination of flow interception and homogeneous distribution of runoff and sediment under intense rainfall conditions. The least SWC‐effective practices (SC, CC, and OT) were related to vineyards optimised for trafficability by access roads or uninterrupted inter‐row paths, which created high upstream‐downstream connectivity and are thus prone to flow accumulation. The novel large‐scale approach of this study offers a robust comparison of SWC impacts under intense rainstorms, which is becoming increasingly relevant for the sustainable future management of such landscapes.
To explore electric vehicle networks in smart cities through big data analysis technology, this study utilizes K-means and fuzzy theory to construct an objective function-based fuzzy mean clustering algorithm (FCM). Then, the FCM algorithm is improved, and the electric vehicle network is simulated. The results show that in the analysis of network data transmission performance, when the probability of successful propagation is 100% and the λ value is between 0.01 and 0.05, the simulation is closest to the actual result, and the data delay is the smallest. In the analysis of the route guidance effects, when facing congested road sections, the route guidance strategy of this study can restrain the spread of congestion effectively and achieve timely evacuation of traffic congestion. In the further analysis of the impact of different factors on traffic conditions, under route guidance, with the increase in market penetration rate (MPR) of devices, following rate (FR) of vehicles, and congestion level (CL), the improvement of the induction strategy becomes clearer, and greater economic benefits are achieved. This study has found that utilizing big data analysis technology to improve electric vehicle transportation networks can significantly reduce network data transmission delay and change paths to suppress the spread of congestion effectively, which provides experimental references for the development of electric vehicle transportation networks.
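For readers unfamiliar with objective-function-based fuzzy clustering, the sketch below shows the standard fuzzy c-means iteration, which alternates between updating cluster centers and updating soft memberships. It is a textbook version under assumed settings (fuzzifier `m = 2`, random initial memberships, a fixed number of iterations), not the improved algorithm developed in the study; the function name and the sample data are hypothetical.

```python
import numpy as np


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, seed=0):
    """Standard fuzzy c-means on an (n_points, n_features) array.

    Returns (centers, memberships); memberships has shape
    (n_points, n_clusters) and each row sums to 1.
    """
    rng = np.random.default_rng(seed)
    # Start from random memberships, normalized per point.
    u = rng.random((points.shape[0], n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        w = u ** m
        # Centers are membership-weighted means of the points.
        centers = (w.T @ points) / w.sum(axis=0)[:, None]
        # Distance from every point to every center (small floor avoids /0).
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)
        # Membership update: u_ik proportional to dist_ik ** (-2 / (m - 1)).
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
    return centers, u


# Hypothetical 2-D data: two well-separated groups of three points each.
data = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                 [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
centers, memberships = fuzzy_c_means(data, n_clusters=2)
labels = memberships.argmax(axis=1)
```

Unlike K-means, each point keeps a degree of membership in every cluster; taking the `argmax` as above recovers a hard assignment when one is needed.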