Data about points of interest (POI) have been widely used in studying urban land use types and for sensing human behavior. However, it is difficult to quantify the correct mix or the spatial relations among different POI types indicative of specific urban functions. In this research, we develop a statistical framework to help discover semantically meaningful topics and functional regions based on the co‐occurrence patterns of POI types. The framework applies the latent Dirichlet allocation (LDA) topic modeling technique and incorporates user check‐in activities on location‐based social networks. Using a large corpus of about 100,000 Foursquare venues and user check‐in behavior in the 10 most populated urban areas of the US, we demonstrate the effectiveness of our proposed methodology by identifying distinctive types of latent topics and, further, by extracting urban functional regions using K‐means clustering and Delaunay triangulation spatial constraints clustering. We show that a region can support multiple functions but with different probabilities, while the same type of functional region can span multiple geographically non‐adjacent locations. Since each region can be modeled as a vector consisting of multinomial topic distributions, similar regions with regard to their thematic topic signatures can be identified. Compared with remote sensing images which mainly uncover the physical landscape of urban environments, our popularity‐based POI topic modeling approach can be seen as a complementary social sensing view on urban space based on human activities.
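The idea of comparing regions by their topic signatures can be illustrated with a minimal sketch. It assumes each region has already been reduced to a topic probability vector (for example by LDA) and uses the Jensen-Shannon divergence as the dissimilarity measure. The abstract does not name a specific measure, so that choice, the function names, and the three example regions with their probabilities are all hypothetical.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    Returns 0.0 for identical distributions and 1.0 for disjoint ones.
    """
    def kl(a, b):
        # Terms with a zero probability in `a` contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def most_similar(target, regions):
    """Return the name of the region whose topic vector is closest to `target`."""
    others = {name: vec for name, vec in regions.items() if name != target}
    return min(others, key=lambda name: js_divergence(regions[target], others[name]))

# Hypothetical topic distributions over three topics (made-up values).
regions = {
    "region_a": [0.70, 0.20, 0.10],
    "region_b": [0.65, 0.25, 0.10],
    "region_c": [0.10, 0.10, 0.80],
}
nearest = most_similar("region_a", regions)
```

Because the measure depends only on the topic vectors, two regions can be judged similar even when they are geographically far apart, which matches the observation that one functional type can span non-adjacent locations.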
Urban areas of interest (AOI) refer to the regions within an urban environment that attract people's attention. Such areas often have high exposure to the general public, and receive a large number of visits. As a result, urban AOI can reveal useful information for city planners, transportation analysts, and location-based service providers to plan new business, extend existing infrastructure, and so forth. Urban AOI exist in people's perception and are defined by behaviors. However, such perception was rarely captured until the Social Web information technology revolution. Social media data record the interactions between users and their surrounding environment, and thus have the potential to uncover interesting urban areas and their underlying spatiotemporal dynamics. This paper presents a coherent framework for extracting and understanding urban AOI based on geotagged photos. Six different cities from six different countries have been selected for this study, and Flickr photo data covering these cities over ten years (2004–2014) have been retrieved. We identify AOI using the DBSCAN clustering algorithm, understand AOI by extracting distinctive textual tags and preferable photos, and discuss the spatiotemporal dynamics as well as some insights derived from the AOI. An interactive prototype has also been implemented as a proof-of-concept. While Flickr data have been used in this study, the presented framework can also be applied to other geotagged photos.
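As an illustration of the clustering step, the following is a minimal, textbook-style DBSCAN sketch in plain Python. It is not the authors' implementation. It assumes photo locations have already been projected to planar coordinates in meters, and the `eps` and `min_pts` values and the sample points are hypothetical.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within `eps` of point i (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE (-1)."""
    labels = [None] * len(points)  # None means "not yet visited"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, keep expanding
    return labels

# Hypothetical photo locations: two dense groups and one isolated photo.
photos = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(photos, eps=1.5, min_pts=3)
```

A density-based method suits this task because the number of AOI is not known in advance and isolated photos should be treated as noise. Raw longitude/latitude pairs would need a projection or a great-circle distance in place of `math.dist`.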
• We propose a framework for extracting and understanding urban AOI from geotagged photos.
• We design an experiment to construct optimal polygons from point clusters.
• We mine knowledge from the extracted AOI, and investigate their spatiotemporal dynamics.
• An online system has been developed as a proof-of-concept to show the AOI in different cities.
• A data-driven approach to construct gazetteers from volunteered geographic information was introduced.
• We built a high-performance Hadoop-based geoprocessing platform to facilitate gazetteer research.
• It connects spatial analysis to the cloud computing environment for Big Geo-Data analytics.
Traditional gazetteers are built and maintained by authoritative mapping agencies. In the age of Big Data, it is possible to construct gazetteers in a data-driven approach by mining rich volunteered geographic information (VGI) from the Web. In this research, we build a scalable distributed platform and a high-performance geoprocessing workflow based on the Hadoop ecosystem to harvest crowd-sourced gazetteer entries. Using experiments based on geotagged datasets in Flickr, we find that the MapReduce-based workflow running on the spatially enabled Hadoop cluster can reduce the processing time compared with traditional desktop-based operations by an order of magnitude. We demonstrate how to use such a novel spatial-computing infrastructure to facilitate gazetteer research. In addition, we introduce a provenance-based trust model for quality assurance. This work offers new insights on enriching future gazetteers with the use of Hadoop clusters, and makes contributions in connecting GIS to the cloud computing environment for the next frontier of Big Geo-Data analytics.
A common need for artificial intelligence models in the broader geosciences is to encode various types of spatial data, such as points, polylines, polygons, graphs, or rasters, in a hidden embedding space so that they can be readily incorporated into deep learning models. One fundamental step is to encode a single point location into an embedding space, such that this embedding is learning-friendly for downstream machine learning models. We call this process location encoding. However, a systematic review of location encoding, its potential applications, and the key challenges that need to be addressed is still lacking. This paper aims to fill this gap. We first provide a formal definition of location encoding, and discuss its necessity for GeoAI research. Next, we provide a comprehensive survey of the current landscape of location encoding research. We classify location encoding models into different categories based on their inputs and encoding methods, and compare them based on whether they are parametric, multi-scale, distance preserving, and direction aware. We demonstrate that existing location encoders can be unified under one formulation framework. We also discuss the applications of location encoding. Finally, we point out several challenges that need to be solved in the future.
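To make the notion of location encoding concrete, the sketch below maps a single 2D point to a fixed-length feature vector using sinusoids at several wavelengths. This shows one common multi-scale family only and is not the unified formulation the paper refers to. The function name, the number of scales, and the wavelength range are hypothetical choices.

```python
import math

def encode_location(x, y, num_scales=4, min_wavelength=1.0, max_wavelength=1000.0):
    """Encode a planar point as sin/cos features at geometrically spaced scales.

    Returns a list of length 4 * num_scales (sin and cos for x and y per scale).
    """
    features = []
    for s in range(num_scales):
        # Wavelengths grow geometrically from min_wavelength to max_wavelength.
        frac = s / (num_scales - 1) if num_scales > 1 else 0.0
        wavelength = min_wavelength * (max_wavelength / min_wavelength) ** frac
        for coord in (x, y):
            angle = 2 * math.pi * coord / wavelength
            features.extend([math.sin(angle), math.cos(angle)])
    return features

# Example: encode one hypothetical point given in meters.
embedding = encode_location(3.0, 4.0)
```

Short wavelengths separate nearby points while long wavelengths capture coarse position, which is why such encoders are described as multi-scale. In a learned encoder, a vector like this would typically be passed through a trainable network, which is omitted here.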
Gaining access to inexpensive, high-resolution, up-to-date, three-dimensional road network data is a top priority beyond research, as such data would fuel applications in industry, governments, and the broader public alike. Road network data are openly available via user-generated content such as OpenStreetMap (OSM) but lack the resolution required for many tasks, e.g., emergency management. More importantly, however, few publicly available data offer information on elevation and slope. For most parts of the world, up-to-date digital elevation products with a resolution of less than 10 meters are a distant dream and, if available, those datasets have to be matched to the road network through an error-prone process. In this paper we present a radically different approach by deriving road network elevation data from massive amounts of in-situ observations extracted from user-contributed data from an online social fitness tracking application. While each individual observation may be of low quality in terms of resolution and accuracy, taken together they form an accurate, high-resolution, up-to-date, three-dimensional road network that excels where other technologies such as LiDAR fail, e.g., in the case of overpasses, overhangs, and so forth. In fact, the 1 m spatial resolution dataset created in this research based on 350 million individual 3D location fixes has an RMSE of approximately 3.11 m compared to a LiDAR-based ground truth and can be used to enhance existing road network datasets where individual elevation fixes differ by up to 60 m. In contrast, using interpolated data from the National Elevation Dataset (NED) results in an RMSE of 4.75 m compared to the baseline. We utilize Linked Data technologies to integrate the proposed high-resolution dataset with OpenStreetMap road geometries without requiring any changes to the OSM data model.
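The principle that many noisy fixes can yield an accurate profile, and the way an RMSE against a reference is computed, can be sketched as follows. The abstract does not describe the actual aggregation procedure, so binning fixes into 1 m cells along a road and taking the per-cell median is an assumption made purely for illustration. All names and sample values are hypothetical.

```python
import math
from collections import defaultdict
from statistics import median

def aggregate_elevation(fixes, cell_size=1.0):
    """Group (distance_along_road_m, elevation_m) fixes into cells; take the median."""
    cells = defaultdict(list)
    for dist, elev in fixes:
        cells[int(dist // cell_size)].append(elev)
    return {cell: median(values) for cell, values in cells.items()}

def rmse(estimated, reference):
    """Root-mean-square error over the cells present in both profiles."""
    shared = estimated.keys() & reference.keys()
    if not shared:
        raise ValueError("no overlapping cells to compare")
    squared = sum((estimated[c] - reference[c]) ** 2 for c in shared)
    return math.sqrt(squared / len(shared))

# Made-up fixes: the 30.0 m reading is an outlier that the median suppresses.
fixes = [(0.2, 10.0), (0.7, 12.0), (0.9, 30.0), (1.5, 20.0)]
profile = aggregate_elevation(fixes)   # cell index -> median elevation
reference = {0: 9.0, 1: 24.0}          # hypothetical ground-truth elevations
error = rmse(profile, reference)
```

A median is used here rather than a mean so that a single poor fix does not dominate a cell. Whether the original work uses a comparable robust estimator is not stated in the abstract.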
The W3C Semantic Sensor Network Incubator group (the SSN-XG) produced the SSN ontology, an OWL 2 ontology to describe sensors and observations, available at http://purl.oclc.org/NET/ssnx/ssn. The SSN ontology can describe sensors in terms of capabilities, measurement processes, observations and deployments. This article describes the SSN ontology. It further gives an example and describes the use of the ontology in recent research projects.
To a large degree, the attraction of Big Data lies in the variety of its heterogeneous multi-thematic and multi-dimensional data sources and not merely its volume. To fully exploit this variety, however, requires conflation. This is a two-step process. First, one has to establish identity relations between information entities across different data sources; and second, attribute values have to be merged according to certain procedures that avoid logical contradictions. The first step, also called matching, can be thought of as a weighted combination of common attributes according to some similarity measures. In this work, we propose such a matching based on multiple attributes of Points of Interest (POI) from the Location-based Social Network Foursquare and the local directory service Yelp. While both contain overlapping attributes that can be used for matching, they have specific strengths and weaknesses that make their conflation desirable. For instance, Foursquare offers information about user check-ins to places, while Yelp specializes in user-contributed reviews. We present a weighted multi-attribute matching strategy, evaluate its performance, and discuss application areas that benefit from a successful matching. Finally, we also outline how the established POI matches can be stored as Linked Data on the Semantic Web. Our strategy can automatically match 97% of randomly selected Yelp POI to their corresponding Foursquare entities.
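Matching as a weighted combination of attribute similarities can be sketched as below. The abstract does not list the attributes, similarity measures, weights, or threshold that were actually used, so the choice of name, location, and category, the weights `(0.5, 0.3, 0.2)`, the 200 m distance cutoff, the 0.7 threshold, and the sample POI records are all hypothetical.

```python
import math
from difflib import SequenceMatcher

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters on a sphere of radius 6,371 km."""
    r = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def match_score(a, b, weights=(0.5, 0.3, 0.2), max_dist_m=200.0):
    """Weighted sum of name, location, and category similarity, each in [0, 1]."""
    name_sim = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    dist = haversine_m(a["lat"], a["lon"], b["lat"], b["lon"])
    geo_sim = max(0.0, 1.0 - dist / max_dist_m)  # linear decay to 0 at max_dist_m
    cat_sim = 1.0 if a["category"].lower() == b["category"].lower() else 0.0
    w_name, w_geo, w_cat = weights
    return w_name * name_sim + w_geo * geo_sim + w_cat * cat_sim

def best_match(poi, candidates, threshold=0.7):
    """Return the highest-scoring candidate, or None if none reaches the threshold."""
    best = max(candidates, key=lambda c: match_score(poi, c), default=None)
    if best is None or match_score(poi, best) < threshold:
        return None
    return best

# Made-up records standing in for one Yelp POI and two Foursquare candidates.
yelp_poi = {"name": "Joe's Coffee", "lat": 34.4140, "lon": -119.8489, "category": "coffee shop"}
cafe = {"name": "Joes Coffee", "lat": 34.4141, "lon": -119.8490, "category": "Coffee Shop"}
library = {"name": "City Library", "lat": 34.4200, "lon": -119.7000, "category": "library"}
matched = best_match(yelp_poi, [library, cafe])
```

Combining several attributes makes the match robust to noise in any single one, for instance a spelling variant in the name or a slightly displaced coordinate. In practice the weights would be tuned against a labeled sample rather than set by hand.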
The volume, velocity, and variety of data that are now becoming available allow us to study urban environments based on human behaviour with a spatial, temporal, and thematic granularity that was not achievable until now. Such data-driven approaches open up additional, complementary perspectives on how urban systems function, especially if they are based on user-generated content (UGC). While the data sources, such as social media, introduce specific biases, they also open up new possibilities for scientists and the broader public. For instance, they provide answers to questions that previously could only be addressed by complex simulations or extensive human-participant surveys. Unfortunately, many of the required data sets are locked in data silos that are accessible only via restricted APIs. Even if these data could be fully accessed, their naïve processing and visualization would surpass the abilities of modern computer architectures. Finally, the established place schemata used to study urban spaces differ substantially from UGC-based point-of-interest (POI) schemata. In this work, we present a multi-granular, data-driven, and theory-informed approach that addresses the key issues outlined above by introducing a theoretical and technical framework to interactively explore the pulse of a city based on social media.
Exosomes are a heterogeneous subpopulation of extracellular vesicles in the 30-150 nm range and of endosomal origin. We explore exosome formation through different systems, including the endosomal sorting complex required for transport (ESCRT) and ESCRT-independent systems, and examine the mechanisms of release. Different isolation techniques and the specificities of exosomes from different tissues and cells are also discussed. Despite more than 30 years of research that followed their definition and indicated their important role in cellular physiology, exosome biology is still in its infancy, with rapidly growing interest. Interest is increasing rapidly because exosomes provide a means of intercellular communication and transmission of macromolecules between cells, with a potential role in the development of diseases. Moreover, they have been investigated as prognostic biomarkers, with potential for further development as diagnostic tools for neurodegenerative diseases and cancer. Interest grows further because exosomes have been reported as useful vectors for drugs.