Recent technological advances have enabled modern robots to become part of our daily life. In particular, assistive robotics has emerged as an exciting research topic that can provide solutions to improve the quality of life of elderly and vulnerable people. This paper introduces the robotic platform developed in the ENRICHME project, with a particular focus on its innovative perception and interaction capabilities. The project's main goal is to enrich the day-to-day experience of elderly people at home with technologies that enable health monitoring, complementary care, and social support. The paper presents several modules created to provide cognitive stimulation services for elderly users with mild cognitive impairments. The ENRICHME robot was tested in three pilot sites around Europe (Poland, Greece, and the UK) and proved to be an effective assistant for the elderly at home.
Human re-identification is an important feature of domestic service robots, in particular for elderly monitoring and assistance, because it allows them to perform personalized tasks and human-robot interactions. However, vision-based re-identification systems are subject to limitations due to human pose and poor lighting conditions. This paper presents a new re-identification method for service robots using thermal images. In robotic applications, as the number and size of thermal datasets is limited, it is hard to use approaches that require huge amounts of training samples. We propose a re-identification system that can work using only a small amount of data. During training, we perform entropy-based sampling to obtain a thermal dictionary for each person. Then, a symbolic representation is produced by converting each video into sequences of dictionary elements. Finally, we train a classifier using this symbolic representation and the geometric distribution within the new representation domain. The experiments are performed on a new thermal dataset for human re-identification, which includes various situations of human motion, poses and occlusion, and which is made publicly available for research purposes. The proposed approach has been tested on this dataset and its improvements over standard approaches have been demonstrated.
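The training pipeline sketched in this abstract (entropy-based sampling of frames to form a per-person dictionary, then converting a video into a sequence of dictionary indices) can be illustrated as follows. This is a minimal sketch, not the paper's implementation: the histogram-based entropy measure, the L2 nearest-atom distance, and the dictionary size `k` are all illustrative assumptions.

```python
import numpy as np

def frame_entropy(frame, bins=32):
    """Shannon entropy of a frame's intensity histogram (illustrative measure)."""
    hist, _ = np.histogram(frame, bins=bins, range=(0.0, 1.0))
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def build_dictionary(frames, k=5):
    """Entropy-based sampling: keep the k most informative frames as atoms."""
    scores = [frame_entropy(f) for f in frames]
    top = np.argsort(scores)[-k:]
    return [frames[i] for i in top]

def symbolize(frames, dictionary):
    """Symbolic representation: map each frame to its nearest atom's index."""
    symbols = []
    for f in frames:
        dists = [np.linalg.norm(f - atom) for atom in dictionary]
        symbols.append(int(np.argmin(dists)))
    return symbols

# Toy "thermal video": 20 random 16x16 frames in [0, 1).
rng = np.random.default_rng(0)
video = [rng.random((16, 16)) for _ in range(20)]
D = build_dictionary(video, k=5)
seq = symbolize(video, D)
```

A classifier would then be trained on such symbol sequences, one dictionary per enrolled person.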
•A decentralized approach for multi-view multi-person tracking in VSNs.
•Design of overcomplete dictionaries matched to the structure of likelihood functions.
•Sparse representation of likelihoods to reduce communication in the network.
•Comparison of well-known L1-solvers to choose a fast and reliable method.
•Qualitatively and quantitatively, our framework outperforms previous approaches.
Recent developments in low-cost CMOS cameras have created the opportunity to bring imaging capabilities to sensor networks, and a new field called visual sensor networks (VSNs) has emerged. VSNs consist of image sensors, embedded processors, and wireless transceivers which are powered by batteries. Since energy and bandwidth resources are limited, setting up a tracking system in VSNs is a challenging problem. In this paper, we present a framework for human tracking in VSN environments. The traditional approach of sending compressed images to a central node has certain disadvantages, such as decreasing the performance of further processing (i.e., tracking) because of low quality images. Instead, in our decentralized tracking framework, each camera node performs feature extraction and obtains likelihood functions. We propose a sparsity-driven method that can obtain a bandwidth-efficient representation of the likelihoods extracted by the camera nodes. Our approach involves the design of special overcomplete dictionaries that match the structure of the likelihoods and the transmission of likelihood information in the network through sparse representation in such dictionaries. We have applied our method to indoor and outdoor people tracking scenarios and have shown that it can provide major savings in communication bandwidth without significant degradation in tracking performance. We have compared the tracking results and communication loads with a block-based likelihood compression scheme, a decentralized tracking method and a distributed tracking method. Experimental results show that our sparse representation framework is an effective approach that can be used together with any probabilistic tracker in VSNs.
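The core idea above can be sketched in a few lines: build an overcomplete dictionary whose atoms match the expected (here, unimodal) shape of the likelihood functions, sparse-code each likelihood against it, and transmit only the few nonzero (index, value) pairs. The Gaussian-bump atoms, the greedy matching-pursuit solver, and the sparsity level below are assumptions for illustration; the paper itself compares proper L1-solvers.

```python
import numpy as np

def gaussian_dictionary(n=64, widths=(2.0, 4.0, 8.0)):
    """Overcomplete dictionary of unit-norm Gaussian bumps, one atom per (centre, width)."""
    x = np.arange(n)
    atoms = []
    for w in widths:
        for c in range(n):
            a = np.exp(-0.5 * ((x - c) / w) ** 2)
            atoms.append(a / np.linalg.norm(a))
    return np.array(atoms).T  # shape (n, n * len(widths))

def matching_pursuit(y, D, k=3):
    """Greedy sparse coding: repeatedly pick the atom best correlated with the residual."""
    r = y.copy()
    coef = np.zeros(D.shape[1])
    for _ in range(k):
        corr = D.T @ r
        j = int(np.argmax(np.abs(corr)))
        coef[j] += corr[j]
        r = r - corr[j] * D[:, j]
    return coef

n = 64
D = gaussian_dictionary(n)
x = np.arange(n)
likelihood = np.exp(-0.5 * ((x - 20) / 3.0) ** 2)  # toy unimodal likelihood
coef = matching_pursuit(likelihood, D, k=3)
recon = D @ coef
err = np.linalg.norm(likelihood - recon) / np.linalg.norm(likelihood)
```

Only the handful of nonzero coefficients (plus their indices) need to travel to the fusion node, instead of the full likelihood vector.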
In this paper, a facial feature point tracker motivated by applications such as human-computer interfaces and facial expression analysis systems is proposed. The proposed tracker is based on a graphical model framework. The facial features are tracked through video streams by incorporating statistical relations in time as well as spatial relations between feature points. By exploiting the spatial relationships between feature points, the proposed method provides robustness in real-world conditions such as arbitrary head movements and occlusions. A Gabor feature-based occlusion detector is developed and used to handle occlusions. The performance of the proposed tracker has been evaluated on real video data under various conditions, including occluded facial gestures and head movements. It is also compared to two popular methods, one based on Kalman filtering exploiting temporal relations, and the other based on active appearance models (AAM). Improvements provided by the proposed approach are demonstrated through both visual displays and quantitative analysis.
►A graphical model based approach for facial feature tracking.
►Temporal and spatial constraints are utilized for 3D head movements and occlusions.
►Comparison with Kalman filtering and AAM, both qualitatively and quantitatively.
►Although there are some drifts, our tracker appears to perform better than the others.
•A decentralized approach for multi-view multi-person tracking in VSNs.
•Likelihood functions are compressed to reduce communication in the network.
•Comparison of well-known wavelet transforms to choose an appropriate domain.
•Qualitatively and quantitatively, our framework outperforms the centralized approach.
Visual sensor networks (VSNs) consist of image sensors, embedded processors and wireless transceivers which are powered by batteries. Since energy and bandwidth resources are limited, setting up a tracking system in VSNs is a challenging problem. In this paper, we present a framework for human tracking in VSNs. The traditional approach of sending compressed images to a central node has certain disadvantages, such as decreasing the performance of further processing (i.e., tracking) because of low quality images. Instead, we propose a feature compression-based decentralized tracking framework that is better matched with the further inference goal of tracking. In our method, each camera performs feature extraction and obtains likelihood functions. By transforming to an appropriate domain and taking only the significant coefficients, these likelihood functions are compressed and this new representation is sent to the fusion node. As a result, this allows us to reduce the communication in the network without significantly affecting the tracking performance. An appropriate domain is selected by performing a comparison between well-known transforms. We have applied our method to indoor people tracking and demonstrated the superiority of our system over the traditional approach and a decentralized approach that uses a Kalman filter.
Modern service robots are provided with one or more sensors, often including RGB-D cameras, to perceive objects and humans in the environment. This paper proposes a new system for the recognition of human social activities from a continuous stream of RGB-D data. Many works until now have succeeded in recognising activities from clipped videos in datasets, but for robotic applications it is important to be able to move to more realistic scenarios in which such activities are not manually selected. For this reason, it is useful to detect the time intervals when humans are performing social activities, the recognition of which can contribute to triggering human-robot interactions or to detecting situations of potential danger. The main contributions of this research work include a novel system for the recognition of social activities from continuous RGB-D data, combining temporal segmentation and classification, as well as a model for learning the proximity-based priors of the social activities. A new public dataset with RGB-D videos of social and individual activities is also provided and used for evaluating the proposed solutions. The results show the good performance of the system in recognising social activities from continuous RGB-D data.
In this study, the authors propose a complete framework based on a hierarchical activity model to understand and recognise activities of daily living in unstructured scenes. At each particular time of a long video, the framework extracts a set of space-time trajectory features describing the global position of an observed person and the motion of his/her body parts. Human motion information is gathered in a new feature that the authors call perceptual feature chunks (PFCs). The set of PFCs is used to learn, in an unsupervised way, particular regions of the scene (topology) where the important activities occur. Using topologies and PFCs, the video is broken into a set of small events ('primitive events') that have a semantic meaning. The sequences of 'primitive events' and topologies are used to construct hierarchical models for activities. The proposed approach has been tested in a medical application: monitoring patients suffering from Alzheimer's disease and dementia. The authors have compared their approach to their previous study and a rule-based approach. Experimental results show that the framework achieves better performance than existing works and has the potential to be used as a monitoring tool in medical applications.
In this paper, we present a unified approach for abnormal behavior detection and group behavior analysis in video scenes. Existing approaches for abnormal behavior detection use either trajectory-based or pixel-based methods. Unlike these approaches, we propose an integrated pipeline that incorporates the output of object trajectory analysis and pixel-based analysis for abnormal behavior inference. This enables the detection of abnormal behaviors related to the speed and direction of object trajectories, as well as complex behaviors related to the finer motion of each object. By applying our approach on three different data sets, we show that it is able to detect several types of abnormal group behaviors with fewer false alarms compared with existing approaches.
Autonomous vehicles (AVs) must share space with pedestrians, both in carriageway cases such as cars at pedestrian crossings and off-carriageway cases such as delivery vehicles navigating through crowds on pedestrianized high-streets. Unlike static obstacles, pedestrians are active agents with complex, interactive motions. Planning AV actions in the presence of pedestrians thus requires modelling of their probable future behavior as well as detecting and tracking them. This narrative review article is Part II of a pair, together surveying the current technology stack involved in this process, organising recent research into a hierarchical taxonomy ranging from low-level image detection to high-level psychological models, from the perspective of an AV designer. This self-contained Part II covers the higher levels of this stack, consisting of models of pedestrian behavior, from prediction of individual pedestrians' likely destinations and paths, to game-theoretic models of interactions between pedestrians and autonomous vehicles. This survey clearly shows that, although there are good models for optimal walking behavior, high-level psychological and social modelling of pedestrian behavior remains an open research question that requires many conceptual issues to be clarified. Early work has been done on descriptive and qualitative models of behavior, but much work is still needed to translate them into quantitative algorithms for practical AV control.
Autonomous vehicles (AVs) must share space with pedestrians, both in carriageway cases such as cars at pedestrian crossings and off-carriageway cases such as delivery vehicles navigating through crowds on pedestrianized high-streets. Unlike static obstacles, pedestrians are active agents with complex, interactive motions. Planning AV actions in the presence of pedestrians thus requires modelling of their probable future behavior as well as detecting and tracking them. This narrative review article is Part I of a pair, together surveying the current technology stack involved in this process, organising recent research into a hierarchical taxonomy ranging from low-level image detection to high-level psychological models, from the perspective of an AV designer. This self-contained Part I covers the lower levels of this stack, from sensing, through detection and recognition, up to tracking of pedestrians. Technologies at these levels are found to be mature and available as foundations for use in high-level systems, such as behavior modelling, prediction and interaction control.