•A CNN architecture is proposed to infer transportation modes from GPS trajectories.
•An adaptable and efficient layout for the input layer of the CNN is designed.
•Key factors in the CNN: remove anomalies, data augmentation, use the bagging concept.
•The proposed CNN achieves an accuracy of 84.8%, higher than other studies.
Identifying the distribution of users’ transportation modes is an essential part of travel demand analysis and transportation planning. With the advent of ubiquitous GPS-enabled devices (e.g., smartphones), a cost-effective approach for inferring commuters’ mobility modes is to leverage their GPS trajectories. A majority of studies have proposed mode inference models based on hand-crafted features and traditional machine learning algorithms. However, hand-crafted features have major drawbacks, including vulnerability to traffic and environmental conditions and human bias in designing efficient features. One way to overcome these issues is to use Convolutional Neural Network (CNN) schemes, which are capable of automatically deriving high-level features from the raw input. Accordingly, in this paper, we take advantage of CNN architectures to predict travel modes based only on raw GPS trajectories, where the modes are labeled as walk, bike, bus, driving, and train. Our key contribution is designing the layout of the CNN’s input layer so that it is not only compatible with CNN schemes but also represents fundamental motion characteristics of a moving object, including speed, acceleration, jerk, and bearing rate. Furthermore, we improve the quality of GPS logs through several data preprocessing steps. Using the clean input layer, a variety of CNN configurations are evaluated to find the best CNN architecture. The highest accuracy of 84.8% is achieved through an ensemble of the best CNN configuration. We compare our methodology with traditional machine learning algorithms as well as the seminal and most related studies to demonstrate the superiority of our framework.
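The four motion characteristics named above can be derived from consecutive GPS fixes. The following is a minimal sketch, not the paper's implementation: the function names, the haversine distance, and the definition of bearing rate as the absolute change in heading between consecutive steps (wrapped to at most 180 degrees) are assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, used by the haversine formula


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial compass bearing from point 1 to point 2, in [0, 360) degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360.0


def motion_features(points):
    """points: list of (lat, lon, t_seconds) with strictly increasing t.

    Returns a dict of per-step sequences: speed (m/s), acceleration (m/s^2),
    jerk (m/s^3), and bearing_rate (degrees per step; assumed definition).
    """
    dts, speed, bearing = [], [], []
    for (la1, lo1, t1), (la2, lo2, t2) in zip(points, points[1:]):
        dt = t2 - t1
        dts.append(dt)
        speed.append(haversine_m(la1, lo1, la2, lo2) / dt)
        bearing.append(bearing_deg(la1, lo1, la2, lo2))
    # Finite differences; each difference is divided by the earlier interval length.
    accel = [(speed[i + 1] - speed[i]) / dts[i] for i in range(len(speed) - 1)]
    jerk = [(accel[i + 1] - accel[i]) / dts[i] for i in range(len(accel) - 1)]
    bearing_rate = []
    for b1, b2 in zip(bearing, bearing[1:]):
        diff = abs(b2 - b1)
        bearing_rate.append(360.0 - diff if diff > 180.0 else diff)
    return {"speed": speed, "acceleration": accel, "jerk": jerk, "bearing_rate": bearing_rate}
```

Stacking these sequences as channels of a fixed-length segment is one plausible way to form a CNN input; the exact layout used in the paper is not reproduced here.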
•Tweets are mapped into numerical feature vectors using word-embedding models.
•Tweets are classified into non-traffic, traffic incident, and traffic information.
•Classification task is performed using convolutional and recurrent neural networks.
•51,100 tweets are collected, labeled, and publicly released for future research.
•Models’ superiority is demonstrated through several evaluation steps.
In recent years, several studies have harnessed Twitter data for detecting traffic incidents and monitoring traffic conditions. Researchers have utilized the bag-of-words representation for converting tweets into numerical feature vectors. However, the bag-of-words representation not only ignores the order of a tweet's words but also suffers from the curse of dimensionality and sparsity. A common approach in the literature for dimensionality reduction is to build the bag-of-words on top of pre-defined traffic keywords. The immediate criticisms of such a strategy are that the pre-defined set of keywords may not include all traffic keywords and that the language of tweets is subject to change over time. To address these shortcomings, we utilize deep-learning architectures for both representing tweets as numerical vectors and classifying them into three categories: 1) non-traffic, 2) traffic incident, and 3) traffic information and condition. First, we map tweets into a low-dimensional vector space through word-embedding tools, which are also capable of measuring the semantic relationship between words. Supervised deep-learning algorithms, including convolutional neural networks (CNN) and recurrent neural networks (RNN), are then deployed on top of the word-embedding models for detecting traffic events. For training and testing our proposed model, a large volume of traffic tweets is collected through Twitter API endpoints and labeled through an efficient strategy. Experimental results on our labeled dataset show that the proposed approach achieves clear improvements over state-of-the-art methods.
Identification of travelers' transportation modes is a fundamental step for various problems that arise in the domain of transportation, such as travel demand analysis, transport planning, and traffic management. In this paper, we aim to identify travelers' transportation modes purely based on their GPS trajectories. First, a segmentation process is developed to partition a user's trip into GPS segments with only one transportation mode. A majority of studies have proposed mode inference models based on hand-crafted features, which might be vulnerable to traffic and environmental conditions. Furthermore, the classification task in almost all models has been performed in a supervised fashion, while a large amount of unlabeled GPS trajectories has remained unused. Accordingly, we propose a deep SEmi-supervised Convolutional Autoencoder (SECA) architecture that can not only automatically extract relevant features from GPS segments but also exploit useful information in unlabeled data. The SECA integrates a convolutional-deconvolutional autoencoder and a convolutional neural network into a unified framework to concurrently perform supervised and unsupervised learning. The two components are simultaneously trained using both labeled and unlabeled GPS segments, which have already been converted into an efficient representation for the convolutional operation. An optimum schedule for varying the balancing parameters between reconstruction and classification errors is also implemented. The performance of the proposed SECA model, trip segmentation, the method for converting a raw trajectory into a new representation, the hyperparameter schedule, and the model configuration are evaluated by comparing to several baselines and alternatives for various amounts of labeled and unlabeled data. Our experimental results demonstrate the superiority of the proposed model over the state-of-the-art semi-supervised and supervised methods with respect to metrics such as accuracy and F-measure.
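The balance between reconstruction and classification errors mentioned above can be illustrated with a weighted sum of the two losses. This is only a sketch under stated assumptions: the names `alpha` and `beta`, the use of mean squared error and cross-entropy, and the rule that unlabeled segments contribute only a reconstruction term are illustrative choices, not details taken from the paper, and the paper's schedule for varying the weights is not reproduced.

```python
import math


def mse(x, x_hat):
    """Mean squared reconstruction error between an input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class given predicted probabilities."""
    return -math.log(max(probs[label], 1e-12))  # clamp to avoid log(0)


def combined_loss(x, x_hat, probs, label, alpha, beta):
    """Weighted sum of reconstruction and classification errors for one segment.

    alpha and beta are hypothetical balancing parameters. An unlabeled segment
    (label is None) contributes only the reconstruction term.
    """
    loss = alpha * mse(x, x_hat)
    if label is not None:
        loss += beta * cross_entropy(probs, label)
    return loss
```

For example, `combined_loss([0, 0], [1, 1], None, None, alpha=0.5, beta=1.0)` evaluates an unlabeled segment using the reconstruction term alone.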
•A CNN approach for classifying vehicles based on GPS data is proposed.
•A novel representation of a GPS trajectory suitable for deep learning is proposed.
•The approach outperforms traditional machine learning methods.
•The approach increases usability of GPS trajectory data.
Transportation agencies are starting to leverage increasingly available GPS trajectory data to support their analyses and decision making. While this type of mobility data adds significant value to various analyses, one persistent challenge is the lack of information about the types of vehicles that performed the recorded trips, which clearly limits the value of trajectory data in transportation system analysis. To overcome this limitation, a deep Convolutional Neural Network for Vehicle Classification (CNN-VC) is proposed to identify a vehicle’s class from its trajectory. This paper proposes a novel representation of GPS trajectories, which is not only compatible with deep learning models but also captures both vehicle-motion characteristics and roadway features. To this end, an open source navigation system is exploited to obtain more accurate information on travel time and distance between GPS coordinates. Before training the CNN-VC model, an efficient programmatic strategy is designed to label large-scale GPS trajectories by means of vehicle information obtained through Virtual Weigh Station records. Our experimental results reveal that the proposed CNN-VC model consistently outperforms both classical machine learning algorithms and other deep learning baseline methods. From a practical perspective, the CNN-VC model allows us to label raw GPS trajectories with vehicle classes, thereby enriching the data and enabling more comprehensive transportation studies, such as the derivation of vehicle class-specific origin-destination tables that can be used for planning.
Car-following models, an essential part of microscopic traffic simulation, have been used to analyze and estimate drivers’ longitudinal behavior for sixty years. Conventional car-following models use mathematical formulas to replicate human behavior in the car-following phenomenon. The inability of these approaches to capture the complex interactions between vehicles calls for deploying advanced learning frameworks that consider driver behavior in more detail. In this study, we apply the gradient boosting of regression trees (GBRT) algorithm to vehicle trajectory data sets, which have been collected through the Next Generation Simulation (NGSIM) program, to develop a new car-following model. First, the regularization parameters of the proposed method are tuned using cross-validation and sensitivity analysis. Second, the prediction performance of the GBRT is compared to the well-known Gazis-Herman-Rothery (GHR) model, with both models trained on the same data sets. The estimation results of the models on unseen records indicate the superiority of the GBRT algorithm in capturing the motion characteristics of two successive vehicles.
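For context, the GHR baseline is commonly written as a stimulus-response formula in which the follower's acceleration is proportional to the speed difference and inversely related to the spacing. The sketch below uses that commonly cited form; the parameter names `c`, `m`, and `l` follow the usual notation, and the default values are placeholders, not the calibrated values from the study.

```python
def ghr_acceleration(v_follower, dv, dx, c=1.0, m=0.0, l=1.0):
    """Follower acceleration under the commonly cited GHR form:

        a = c * v_follower**m * dv / dx**l

    v_follower : follower speed (m/s)
    dv         : leader speed minus follower speed (m/s), observed one reaction time earlier
    dx         : spacing between leader and follower (m), observed one reaction time earlier
    c, m, l    : calibration parameters (placeholder defaults, not fitted values)
    """
    if dx <= 0:
        raise ValueError("spacing dx must be positive")
    return c * (v_follower ** m) * dv / (dx ** l)
```

With the placeholder defaults, a follower 20 m behind a leader that is 2 m/s faster gets `ghr_acceleration(10.0, 2.0, 20.0)`, a mild positive acceleration; a negative `dv` yields braking.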
The semi-actuated coordinated operation mode is a type of signal control in which minor approaches are equipped with detectors to provide actuated phasing while major movements are coordinated without using detection systems. The objective of this study is to propose a cost-effective approach for reducing delay in semi-actuated coordinated signal operation without incurring any extra costs for installing new detectors or developing adaptive controller systems. We propose a simple approach for further enhancing a pre-optimized timing plan. In this method, the green splits of non-coordinated phases are multiplied by a factor greater than one. The amount of green time added to the non-coordinated phases is subtracted from the coordinated phases to keep the cycle length constant. Thus, if the traffic demand on the side streets exceeds the expected traffic flow, the added time enables the non-coordinated phases to accommodate the additional demand. A regression analysis is conducted to identify the optimal value of this factor, called the actuated factor (ActF). The response variable is the average delay reduction (seconds/vehicle) of the simulation runs under the proposed signal timing plan compared with the simulation runs under the pre-optimized timing plan, obtained through macroscopic signal optimization tools. External traffic movements, left-turn percentage, and ActF are the explanatory variables in the model. Results reveal that the ActF is the only significant variable, with an optimal value of 1.15 that is applicable for a wide range of traffic volumes.
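The split adjustment described above is simple arithmetic and can be sketched as follows. The function name, the single-ring simplification, and the rule for sharing the subtracted time among several coordinated phases (in proportion to their original splits) are assumptions for illustration; the abstract does not specify them.

```python
def apply_actuated_factor(splits, coordinated, actf=1.15):
    """Scale non-coordinated green splits by actf and take the added time
    from the coordinated phases so the cycle length stays constant.

    splits      : dict mapping phase name -> green split (seconds)
    coordinated : set of phase names that are coordinated
    actf        : actuated factor, must be greater than one
    """
    if actf <= 1.0:
        raise ValueError("actf must be greater than one")
    adjusted = dict(splits)
    added = 0.0
    for phase, split in splits.items():
        if phase not in coordinated:
            adjusted[phase] = split * actf
            added += adjusted[phase] - split
    coord_total = sum(splits[p] for p in coordinated)
    if added >= coord_total:
        raise ValueError("added time exceeds the available coordinated green time")
    # Assumption: remove the added time from coordinated phases proportionally.
    for phase in coordinated:
        adjusted[phase] = splits[phase] - added * splits[phase] / coord_total
    return adjusted
```

For a 100-second cycle with a 60-second coordinated split and two 20-second side-street splits, an ActF of 1.15 gives each side street 23 seconds and leaves 54 seconds for the coordinated phase.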
Brain-computer interfaces (BCI) generally require the user to maintain an attentive state. Potential end-users with severe speech and physical impairments may have limited communication abilities to report their current state, so an automatic calculation of state may improve performance. It is not yet known whether drowsiness can be detected reliably by automatic calculation in end-user populations or healthy controls.
In this study, we examined data from 20 healthy participants (33 ± 13 years of age) to understand how self-rated measures (sleepiness and boredom) and automatically calculated drowsiness measures are associated with underlying classifier performance (area under the curve, AUC) during 5 successive 11-min calibration sessions. The BCI system used in this experiment is the RSVPKeyboard™, which relies on P300 responses to target letters (Orhan et al., 2012). EEG was obtained using a DSI-24 system (Wearable Sensing), recorded at 300 Hz and software filtered to reduce the presence of artifacts. Automatically calculated measures based on the literature (Oken et al., 2006) utilized eye-blink rate (blinks/time) and theta (4–7 Hz), alpha (8–12 Hz), and median power frequencies (MPF). The frequency measures were calculated from a stimulus presentation sequence of approximately 10 letters lasting 4 s, fed into MATLAB’s power spectral density function and averaged. P300 amplitude was calculated with EEGLAB v. 14.1 using a peak-to-trough (peak minus trough) method, with the highest positive potential in the 350–600 ms range of the Cz channel taken as the peak and the negative-most peak preceding it taken as the trough. Self-report of sleepiness was assessed using the Karolinska Sleepiness Scale (KSS) (Gillberg et al., 1994). Boredom was assessed using a 6-item boredom scale (Markey et al., 2014).
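The peak-to-trough measure described above can be sketched on a single averaged Cz waveform. This is an illustrative simplification, not the EEGLAB procedure used in the study: the function name is hypothetical, the trough is taken as the lowest sample between stimulus onset (0 ms) and the peak rather than a detected local peak, and the search start at 0 ms is an assumption.

```python
def p300_peak_to_trough(times_ms, amplitudes, window=(350.0, 600.0)):
    """Peak-minus-trough amplitude for one averaged waveform.

    times_ms   : sample times in ms relative to stimulus onset, ascending
    amplitudes : potential at each sample (same length as times_ms)
    window     : latency range in which the positive peak is searched
    """
    in_window = [i for i, t in enumerate(times_ms) if window[0] <= t <= window[1]]
    if not in_window:
        raise ValueError("no samples fall inside the peak window")
    # Peak: highest potential inside the latency window.
    peak_idx = max(in_window, key=lambda i: amplitudes[i])
    # Trough (assumed search range): lowest sample from onset up to the peak.
    before = [i for i in range(peak_idx) if times_ms[i] >= 0.0]
    if not before:
        raise ValueError("no post-onset samples precede the peak")
    trough_idx = min(before, key=lambda i: amplitudes[i])
    return amplitudes[peak_idx] - amplitudes[trough_idx]
```

Samples outside the 350–600 ms window are ignored when locating the peak, so a larger late deflection does not affect the result.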
Using a Wilcoxon signed-rank test to compare scores from the first and last calibration sessions, significant differences were observed in (1) AUC (Z = 3.32, p < .001; mean from .877 to .794), (2) self-report measures of boredom (Z = −3.349, p < .001; mean from 13.3 to 23.65) and sleepiness (Z = −3.91, p < .0001; mean from 3.55 to 7.1), (3) alpha power (Z = −3.18, p = .0015; mean from 7.19E−6 to 9.6E−6), (4) eye blink rate (Z = 2.42, p = .15; mean from 4.35 to 2.67), and (5) P300 amplitude (Z = 3.44, p < .001; mean from 9.8E−6 to 6.37E−6). There were no significant differences in MPF (mean from 8.8 to 9.62) or theta (mean from 1.27E−5 to 1.36E−5).
There were significant declines in performance using a BCI system over the course of this experiment. These results suggest the automatically calculated measures may be able to capture user state and relationships to changes in performance. While the self-report measure findings are of interest, it may be difficult to continually solicit questionnaire feedback and automated measures could be further leveraged in classification models.
The Washington Metropolitan Area Transit Authority (WMATA) operates 1,250 buses on 168 different routes between 10,600 bus stops to support around 370,000 passengers each day. Utilizing sensors on vehicles and analyzing their location and movements throughout an hour, trip, or day can provide valuable information to a transit authority as well as to the users of a transit system. This amount of information can be overwhelming, but big data techniques can make it usable for the transit agency. First, this paper develops a methodology for assessing previous delays in the system by applying big data structure and statistical analysis to the data constantly collected by WMATA buses. This method of analysis also helps quantify the impact of potential transit system improvements. Second, the paper describes a model that uses real-time data representing potential delays to provide future passengers with more accurate arrival predictions despite delays. These analyses are powerful tools for agencies and planners to assess and improve transit service performance using big data analytics and real-time predictions.
Today, a wide range of commercial signal optimization tools have been developed for optimizing traffic signal timing plans. However, the inability of such programs to consider the stochastic variations in traffic conditions calls for integrating optimization techniques with a microscopic simulation environment. In this study, the Particle Swarm Optimization (PSO) algorithm is used as an optimizer for generating arterial traffic signal timing parameters. VISSIM, a microscopic simulator, is used as the evaluation platform for calculating performance measures of the arterial operated under those timing plans. The simulation is controlled through the VISSIM COM interface and implemented in MATLAB. The potential of the proposed method is investigated by applying it to a real-world arterial. Comparing the performance of the timing plans generated by the proposed method with those produced by VISTRO, a widely used signal optimization tool, reveals that the proposed method is promising and outperforms VISTRO for various traffic states.
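The optimizer-plus-simulator loop described above can be sketched with a basic inertia-weight PSO. This is a generic textbook variant written in Python for illustration, not the study's MATLAB implementation: the function name, the parameter values `w`, `c1`, and `c2`, and the idea of passing any callable as the objective are assumptions. In the study's setting the objective would be a simulation run returning a performance measure such as delay; here it is left as an arbitrary function.

```python
import random


def pso(objective, bounds, n_particles=20, n_iter=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective(x) over box bounds with a basic inertia-weight PSO.

    objective : callable taking a list of floats and returning a cost
    bounds    : list of (low, high) pairs, one per decision variable
    Returns (best_position, best_cost).
    """
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp to the feasible range of the timing parameter.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            cost = objective(pos[i])
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = pos[i][:], cost
    return gbest, gbest_cost
```

Because each objective evaluation would correspond to one or more simulation runs, the number of particles and iterations directly controls the computational cost of such a setup.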