• The review focuses on developing data-driven models for building energy prediction.
• Seven feature types and five feature extraction methods are presented.
• Data-driven algorithms, including statistical and machine learning methods, are reviewed.
• Aspects reflected from expected prediction results are discussed.
• Input updating strategies for multi-step energy prediction are summarized.
Building energy prediction plays a vital role in developing model predictive controllers for consumers and optimizing energy distribution plans for utilities. Common approaches for energy prediction include physical models, data-driven models and hybrid models. Among them, data-driven approaches have become a popular topic in recent years due to their ability to discover statistical patterns without expert knowledge. To capture the latest research trends, this study first summarizes the limitations of earlier reviews: they seldom present a comprehensive review of the entire data-driven process for building energy prediction, and they rarely summarize the input updating strategies used when applying a trained data-driven model to multi-step energy prediction. To address these gaps, this paper provides a comprehensive review of building energy prediction, covering the entire data-driven process, which includes feature engineering, potential data-driven models and expected outputs. The distribution of 105 papers that focus on building energy prediction by data-driven approaches is reviewed over data source, feature types, model utilization and prediction outputs. Then, in order to apply trained data-driven models to multi-step prediction, input updating strategies are reviewed to deal with the time series property of energy-related data. Finally, the review concludes with some potential future research directions based on a discussion of existing research gaps.
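As a rough illustration of what an input updating strategy can look like, the sketch below implements the recursive approach, in which each prediction is fed back as a lagged input for the next step. The abstract does not say which strategies the review covers, so this choice is an assumption; the names `recursive_forecast` and `mean_of_lags` and the load values are hypothetical.

```python
from typing import Callable, List, Sequence


def recursive_forecast(
    history: Sequence[float],
    one_step_model: Callable[[Sequence[float]], float],
    n_lags: int,
    horizon: int,
) -> List[float]:
    """Predict `horizon` steps ahead, feeding each prediction back as an input."""
    if len(history) < n_lags:
        raise ValueError("history must contain at least n_lags values")
    window = list(history[-n_lags:])
    predictions = []
    for _ in range(horizon):
        y_hat = one_step_model(window)
        predictions.append(y_hat)
        # Input update: drop the oldest lag and append the newest prediction.
        window = window[1:] + [y_hat]
    return predictions


def mean_of_lags(window: Sequence[float]) -> float:
    """Stand-in for a trained one-step model (assumption, illustration only)."""
    return sum(window) / len(window)


hourly_load_kwh = [10.0, 12.0, 14.0]  # invented measurements
forecast = recursive_forecast(hourly_load_kwh, mean_of_lags, n_lags=2, horizon=3)
print(forecast)
```

Because later steps rely on earlier predictions rather than measurements, errors can accumulate over the horizon, which is one reason the choice of updating strategy matters.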
Efforts have been devoted to the identification of the impacts of occupant behavior on building energy consumption. Various factors influence building energy consumption at the same time, leading to a lack of precision when identifying the individual effects of occupant behavior. This paper reports the development of a new methodology for examining the influences of occupant behavior on building energy consumption; the method is based on a basic data mining technique (cluster analysis). To deal with data inconsistencies, min–max normalization is performed as a data preprocessing step before clustering. Grey relational grades, a measure of relevancy between two factors, are used as weighted coefficients of different attributes in cluster analysis. To demonstrate the applicability of the proposed method, the method was applied to a set of residential buildings’ measurement data. The results show that the method facilitates the evaluation of building energy-saving potential by improving the behavior of building occupants, and provides multifaceted insights into building energy end-use patterns associated with the occupant behavior. The results obtained could help prioritize efforts at modification of occupant behavior in order to reduce building energy consumption, and help improve modeling of occupant behavior in numerical simulation.
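The preprocessing and weighting steps described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the standard grey relational coefficient with distinguishing coefficient `rho = 0.5`, and the factor names and values are invented.

```python
import math


def min_max_normalize(values):
    """Rescale a series to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def grey_relational_grades(reference, factors, rho=0.5):
    """Grade of each factor series against the reference series (higher = more related)."""
    ref = min_max_normalize(reference)
    deltas = {
        name: [abs(r - x) for r, x in zip(ref, min_max_normalize(series))]
        for name, series in factors.items()
    }
    all_deltas = [d for ds in deltas.values() for d in ds]
    d_min, d_max = min(all_deltas), max(all_deltas)
    if d_max == 0:
        return {name: 1.0 for name in deltas}
    grades = {}
    for name, ds in deltas.items():
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in ds]
        grades[name] = sum(coeffs) / len(coeffs)
    return grades


def weighted_distance(a, b, weights):
    """Euclidean distance with per-attribute weights, usable inside a clustering loop."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


energy_kwh = [100, 200, 300, 400]           # invented reference series
factors = {
    "floor_area_m2": [50, 100, 150, 200],   # hypothetical attribute
    "occupants": [4, 1, 3, 2],              # hypothetical attribute
}
grades = grey_relational_grades(energy_kwh, factors)
print(grades)
```

Attributes with a higher grade then contribute more to the distance between two buildings, so the clusters are shaped mainly by the factors most related to energy use.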
The collection of digital information by governments, corporations, and individuals has created tremendous opportunities for knowledge- and information-based decision making. Driven by mutual benefits, or by regulations that require certain data to be published, there is a demand for the exchange and publication of data among various parties. Data in its original form, however, typically contains sensitive information about individuals, and publishing such data will violate individual privacy. The current practice in data publishing relies mainly on policies and guidelines as to what types of data can be published and on agreements on the use of published data. This approach alone may lead to excessive data distortion or insufficient protection. Privacy-preserving data publishing (PPDP) provides methods and tools for publishing useful information while preserving data privacy. Recently, PPDP has received considerable attention in research communities, and many approaches have been proposed for different data publishing scenarios. In this survey, we will systematically summarize and evaluate different approaches to PPDP, study the challenges in practical data publishing, clarify the differences and requirements that distinguish PPDP from other related problems, and propose future research directions.
With the increasing prevalence of information networks, research on privacy-preserving network data publishing has received substantial attention recently. There are two streams of relevant research, targeting different privacy requirements. A large body of existing work focuses on preventing node re-identification against adversaries with structural background knowledge, while some other studies aim to thwart edge disclosure. In general, the line of research on preventing edge disclosure is less fruitful, largely due to the lack of a formal privacy model. The recent emergence of differential privacy has shown great promise for rigorous prevention of edge disclosure. Yet recent research indicates that differential privacy is vulnerable to data correlation, which hinders its application to network data that may be inherently correlated. In this paper, we show that differential privacy could be tuned to provide provable privacy guarantees even in the correlated setting by introducing an extra parameter, which measures the extent of correlation. We subsequently provide a holistic solution for non-interactive network data publication. First, we generate a private vertex labeling for a given network dataset to make the corresponding adjacency matrix form dense clusters. Next, we adaptively identify dense regions of the adjacency matrix by a data-dependent partitioning process. Finally, we reconstruct a noisy adjacency matrix by a novel use of the exponential mechanism. To the best of our knowledge, this is the first work providing a practical solution for publishing real-life network data via differential privacy. Extensive experiments demonstrate that our approach performs well on different types of real-life network datasets.
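For readers unfamiliar with the exponential mechanism mentioned above, the following sketch shows its generic textbook form: a candidate is sampled with probability proportional to exp(epsilon * score / (2 * sensitivity)). It is not the paper's reconstruction step and does not include the extra correlation parameter; the candidate names and scores are invented.

```python
import math
import random


def exponential_mechanism_probs(scores, epsilon, sensitivity):
    """Selection probabilities proportional to exp(epsilon * score / (2 * sensitivity))."""
    max_score = max(scores)
    # Subtracting the maximum score avoids overflow and leaves the ratios unchanged.
    weights = [
        math.exp(epsilon * (s - max_score) / (2.0 * sensitivity)) for s in scores
    ]
    total = sum(weights)
    return [w / total for w in weights]


def exponential_mechanism(candidates, scores, epsilon, sensitivity, rng=None):
    """Sample one candidate according to the exponential mechanism."""
    rng = rng or random.Random()
    probs = exponential_mechanism_probs(scores, epsilon, sensitivity)
    return rng.choices(candidates, weights=probs, k=1)[0]


candidates = ["region_a", "region_b", "region_c"]  # hypothetical outputs
scores = [1.0, 2.0, 3.0]                           # hypothetical utility scores
probs = exponential_mechanism_probs(scores, epsilon=1.0, sensitivity=1.0)
choice = exponential_mechanism(candidates, scores, 1.0, 1.0, rng=random.Random(0))
print(probs, choice)
```

A smaller `epsilon` flattens the distribution toward uniform (stronger privacy, less utility), while a larger one concentrates probability on the highest-scoring candidate.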
This paper reports the development of a building energy demand predictive model based on the decision tree method. This method is able to classify and predict categorical variables: its competitive advantage over other widely used modeling techniques, such as the regression method and the ANN method, lies in the ability to generate accurate predictive models with interpretable flowchart-like tree structures that enable users to quickly extract useful information. To demonstrate its applicability, the method is applied to estimate residential building energy performance indexes by modeling building energy use intensity (EUI) levels. The results demonstrate that the decision tree method can classify and predict building energy demand levels accurately (93% for training data and 92% for test data), and can identify and rank significant factors of building EUI automatically. The method can provide the combination of significant factors as well as the threshold values that will lead to high building energy performance. Moreover, the average EUI value of the data records in each classified data subset can be used for reference when performing prediction. One crucial benefit is improving building energy performance and reducing energy consumption. Another advantage of this methodology is that it can be utilized by users without requiring much computation knowledge.
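To make the idea of threshold values in a tree concrete, the sketch below finds the single best split of one numeric factor for a two-level EUI label. It assumes Gini impurity as the split criterion, which the abstract does not specify, and the floor areas and labels are invented.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_impurity) of the best binary split on one feature."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate two equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best[1]:
            best = (threshold, impurity)
    return best


floor_area_m2 = [60, 80, 100, 120, 150, 180]                # invented data
eui_level = ["low", "low", "low", "high", "high", "high"]   # invented labels
threshold, impurity = best_split(floor_area_m2, eui_level)
print(threshold, impurity)
```

A full tree repeats this search over every factor at every node; the order in which factors are chosen for splitting is what yields an interpretable ranking of significant factors.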
• A building energy saving advisory system was developed using a data mining framework.
• Monitored energy usage was used to discover correlations and make recommendations.
• The approach presents an end-to-end solution, from raw energy usage data to feasible recommendations.
• Recommendations are based on the occupants' past behavior, so they are achievable.
Occupants’ behavior and their interaction with home appliances are crucial for assessing building energy consumption. This study proposes a new methodology for monitoring the energy consumed in building end-use loads to build an advisory system. The resulting system alerts occupants to take certain measures (prioritized recommendations) to reduce the energy consumption of end-use loads. The potential savings from following these measures are also quantified. The proposed methodology is also capable of evaluating the energy savings achieved by the occupants. The system analyzes historical data generated by occupants using data mining techniques to output highly feasible recommendations. For demonstration purposes, the methodology was tested on a real dataset of a building in Japan. The dataset includes detailed energy consumption of end-use loads, categorized as hot water supply, lighting, kitchen, refrigerator, entertainment & information, housework & sanitary, and others. Results suggest that the developed models are accurate, and that it is possible to save up to 21% of total energy consumption by only changing occupants’ energy use habits.
Utility mining is a new development of data mining technology. Among utility mining problems, utility mining with the itemset share framework is a hard one, as no anti-monotonicity property holds with the interestingness measure. Prior works on this problem all employ a two-phase, candidate generation approach, with one exception that is, however, inefficient and does not scale to large databases. The two-phase approach suffers from scalability issues due to the huge number of candidates. This paper proposes a novel algorithm that finds high utility patterns in a single phase without generating candidates. The novelties lie in a high utility pattern growth approach, a lookahead strategy, and a linear data structure. Concretely, our pattern growth approach is to search a reverse set enumeration tree and to prune the search space by utility upper bounding. We also look ahead to identify high utility patterns without enumeration by a closure property and a singleton property. Our linear data structure enables us to compute a tight bound for powerful pruning and to directly identify high utility patterns in an efficient and scalable way, which targets the root cause of the problems with prior algorithms. Extensive experiments on sparse and dense, synthetic and real world data suggest that our algorithm is up to 1 to 3 orders of magnitude more efficient and is more scalable than the state-of-the-art algorithms.
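The lack of anti-monotonicity is easy to see on a toy example. The sketch below is a brute-force enumeration for illustration only, not the single-phase algorithm proposed in the paper; it assumes the common definition of utility as purchased quantity times unit profit, and the transactions and profits are invented.

```python
from itertools import combinations

# Each transaction maps an item to its purchased quantity (invented data).
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 3},
]
unit_profit = {"a": 1, "b": 5, "c": 2}  # invented external utilities


def utility(itemset, transactions, unit_profit):
    """Sum of quantity * unit profit over the transactions containing the whole itemset."""
    total = 0
    for transaction in transactions:
        if all(item in transaction for item in itemset):
            total += sum(transaction[item] * unit_profit[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, unit_profit, min_utility):
    """Exhaustively enumerate every itemset whose utility reaches min_utility."""
    items = sorted(unit_profit)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = utility(itemset, transactions, unit_profit)
            if u >= min_utility:
                result[itemset] = u
    return result


found = high_utility_itemsets(transactions, unit_profit, min_utility=12)
print(found)
```

Here the itemset ("a",) has utility 4 and is not high utility, yet its superset ("a", "b") has utility 17 and is. A search therefore cannot discard supersets of a low-utility itemset the way frequent itemset mining does, which is why upper bounds on utility are needed for pruning.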
Reverse engineering is a manually intensive but necessary technique for understanding the inner workings of new malware, finding vulnerabilities in existing systems, and detecting patent infringements in released software. An assembly clone search engine facilitates the work of reverse engineers by identifying those duplicated or known parts. However, it is challenging to design a robust clone search engine, since there exist various compiler optimization options and code obfuscation techniques that make logically similar assembly functions appear to be very different. A practical clone search engine relies on a robust vector representation of assembly code. However, the existing clone search approaches, which rely on a manual feature engineering process to form a feature vector for an assembly function, fail to consider the relationships between features and identify those unique patterns that can statistically distinguish assembly functions. To address this problem, we propose to jointly learn the lexical semantic relationships and the vector representation of assembly functions based on assembly code. We have developed an assembly code representation learning model, Asm2Vec. It only needs assembly code as input and does not require any prior knowledge such as the correct mapping between assembly functions. It can find and incorporate rich semantic relationships among tokens appearing in assembly code. We conduct extensive experiments and benchmark the learning model with state-of-the-art static and dynamic clone search approaches. We show that the learned representation is more robust and significantly outperforms existing methods against changes introduced by obfuscation and optimizations.
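Once every assembly function has a vector representation, clone search reduces to a nearest-neighbor query. The sketch below ranks repository functions by cosine similarity to a query vector; the similarity measure, the function names and the three-dimensional vectors are assumptions for illustration and do not come from Asm2Vec itself.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def search_clones(query_vec, repository, top_k=3):
    """Return the top_k (name, similarity) pairs, most similar first."""
    scored = [
        (name, cosine_similarity(query_vec, vec)) for name, vec in repository.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical embeddings of known functions (real ones would have many more dimensions).
repository = {
    "memcpy_O0": [1.0, 0.0, 1.0],
    "memcpy_O2": [0.9, 0.1, 1.1],
    "sha1_round": [0.0, 1.0, 0.0],
}
results = search_clones([1.0, 0.0, 1.0], repository, top_k=2)
print(results)
```

A robust representation is one where variants of the same function, such as the two hypothetical `memcpy` builds above, stay close to each other despite different optimization levels.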
Many technical solutions have been developed to reduce buildings’ energy consumption, but limited efforts have been made to adequately address the role or action of building occupants in this process. Our earlier investigations have shown that occupants play a significant role in buildings’ energy consumption: savings of up to 20% could be achieved by modifying occupant behavior through direct feedback and recommendations. Studying the role of occupants in building energy consumption requires an understanding of the interrelationships between climatic conditions, building characteristics, and building services and operation. This paper describes the development of a systematic procedure to provide building occupants with direct feedback and recommendations to help them take appropriate action to reduce building energy consumption. The procedure is geared toward developing a Reference Building (RB), an energy-efficient building, for a specific given building. The RB is then compared against its given building to inform the occupants of the given building how they are using end-use loads and how they can improve them. The RB is generated using a data-mining approach, which involves clustering analysis and neural networks. The framework is based on clustering similar buildings by effects unrelated to occupant behavior. The buildings are then grouped based on their energy consumption, and those with lower consumption are combined to generate the RB. Performance is evaluated by comparing a given building with an RB. This comparison provides feedback that can lead occupants to take appropriate measures (e.g., turning off unnecessary lights or heating, ventilation, and air conditioning (HVAC)) to improve building energy performance. A comparison with the existing literature shows that the current methodology achieves more accurate, scalable, and realistic results.
• A model is developed to create a reference building to evaluate the energy saving opportunities.
• The developed data mining framework is applied and validated on a set of 76 buildings.
• The methodology provides the occupants with insights regarding their energy savings.
• The framework is generic and can be applied to a wide range of buildings.
• A methodology is introduced to rank a set of buildings based on the energy performance of occupants.
• Ranking is done in two levels, each of which gives specific information about the occupants of a single building.
• The underlying factors that create the ranks are examined to give occupants clues on how to improve their rank and reduce energy consumption.
Identifying the impacts of occupants on building energy consumption has become an important issue in recent years. This is due to the interrelationship of influencing factors such as urban climate, building characteristics, occupant behavior, and building services and operation, which makes it challenging to identify the role of occupants in energy consumption. The research problem in this study lies in the fact that the occupants of a building may not be cautious regarding energy savings, and there exists no basis for assessing their energy consumption behavior. One solution is the development of a systematic comparison procedure between similar buildings. This paper introduces a new procedure for comparing the occupants of several buildings, showing the rank of each building among the others and advising occupants on how to reduce their energy consumption and improve their rank. The proposed framework is developed based on multiple data-mining methods, including clustering, association rules mining, and neural networks. The proposed methodology is composed of two levels. The first considers the amount of energy usage by occupants after filtering effects unrelated to the occupant behavior. The second ranks the buildings in terms of achieved and potential savings during the time under investigation. To demonstrate the application, the methodology was applied to a set of monitored residential buildings in Japan. Results suggest that the proposed method enhances the evaluation of buildings’ energy-saving potential by revealing the occupants’ contribution. It also provides diverse and prioritized strategies to help occupants manage their energy consumption by revealing the building energy end-use patterns.