As enterprises around the globe embrace globalization, strategic alliances among enterprises have become an important means to gain competitive advantages. Enterprises cooperate to improve the quality or lower the prices of their services, which introduces quality correlations, i.e., the quality of a service is associated with that of other services. Existing approaches for service composition have not fully and systematically considered the quality correlations between services. In this paper, we propose a novel approach named Q2C (Query of Quality Correlation) to systematically model quality correlations and enable efficient queries of quality correlations for service compositions. Given a service composition and a set of candidate services, Q2C first preprocesses the quality correlations among the candidate services and then constructs a quality correlation index graph to enable efficient queries for quality correlations. Extensive experiments are conducted on a real-world web service dataset to demonstrate the effectiveness and efficiency of Q2C.
Generalized synchronization is ubiquitous in nature. The auxiliary system approach has been widely used to verify the presence of generalized synchronization. This approach was first proposed for drive-response systems and later extended to bidirectionally coupled systems and complex networks. However, the well-known auxiliary system method lacks a rigorous theoretical basis for these various applications, and two recent counterexamples demonstrate its inapplicability in certain settings. Inspired by the counterexamples, we pose the following two fundamental questions: i) Why does the auxiliary system approach fail in networks with bidirectional couplings? ii) What are the essential conditions for this approach to apply? This technical note aims at establishing a rigorous theoretical basis for the applicability of the auxiliary system approach. Specifically, the approach is effective only if no node in the network has a directed path back to any of its driving neighbors (the nodes that drive it). Several representative examples are given to validate our theoretical results.
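The stated path condition can be checked mechanically on the coupling graph. The sketch below is a hypothetical helper, not code from the note: it assumes the network is given as an adjacency mapping where an edge u → v means u drives v, and reports whether the applicability condition holds (no driven node can reach its driver).

```python
from collections import deque

def reachable(adj, src, dst):
    """BFS: is there a directed path from src to dst in adj?"""
    seen, queue = {src}, deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return True
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False

def auxiliary_approach_applicable(adj):
    """adj[u] lists the nodes driven by u (edge u -> v).
    Condition: no node has a directed path back to any node that drives it."""
    for u, targets in adj.items():
        for v in targets:
            if reachable(adj, v, u):  # driven node v can reach its driver u
                return False
    return True
```

For a unidirectional chain 1 → 2 → 3 the condition holds, whereas any bidirectional coupling (1 → 2 together with 2 → 1) violates it, consistent with the counterexamples arising in bidirectionally coupled networks.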
Internet-delivered psychological treatment of major depression has been investigated in several trials, but the role of personalized treatment is less investigated. Studies suggest that guidance is important and that automated computerized programmes without therapist support are less effective. Individualized e-mail therapy for depression has not been studied in a controlled trial. Eighty-eight individuals with major depression were randomized to two different forms of Internet-delivered cognitive behaviour therapy (CBT), or to a waiting-list control group. One form of Internet treatment consisted of guided self-help, with weekly modules and homework assignments. Standard CBT components were presented and brief support was provided during the treatment. The other group received e-mail therapy, which was tailored and did not use the self-help texts; i.e., all e-mails were written for the unique patient. Both treatments lasted for 8 weeks. In the guided self-help group, 93% (27/29) completed the posttreatment assessment, and in the e-mail therapy group, 96% (29/30) did. Results showed significant symptom reductions in both treatment groups with moderate to large effect sizes. At posttreatment, 34.5% of the guided self-help group and 30% of the e-mail therapy group reached the criteria of high-end-state functioning (Beck Depression Inventory score below 9). At six-month follow-up the corresponding figures were 47.4% and 43.3%. Overall, the difference between guided self-help and e-mail therapy was small, but in favour of the latter. These findings indicate that both guided self-help and individualized e-mail therapy can be effective.
Discretization is an essential preprocessing technique used in many knowledge discovery and data mining tasks. Its main goal is to transform a set of continuous attributes into discrete ones, by associating categorical values to intervals and thus transforming quantitative data into qualitative data. In this manner, symbolic data mining algorithms can be applied over continuous data and the representation of information is simplified, making it more concise and specific. The literature provides numerous proposals of discretization, and some attempts to categorize them into a taxonomy can be found. However, in previous papers there is a lack of consensus in the definition of the properties, and no formal categorization has been established yet, which may be confusing for practitioners. Furthermore, only a small set of discretizers have been widely considered, while many other methods have gone unnoticed. With the intention of alleviating these problems, this paper provides a survey of discretization methods proposed in the literature from a theoretical and empirical perspective. From the theoretical perspective, we develop a taxonomy based on the main properties pointed out in previous research, unifying the notation and including all the known methods to date. Empirically, we conduct an experimental study in supervised classification involving the most representative and newest discretizers, different types of classifiers, and a large number of data sets. The results of their performances measured in terms of accuracy, number of intervals, and inconsistency have been verified by means of nonparametric statistical tests. Additionally, a set of discretizers are highlighted as the best performing ones.
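As a minimal illustration of what a discretizer does (a generic unsupervised equal-width scheme, not any specific method from the survey), each continuous value can be mapped to the index of the interval that contains it:

```python
def equal_width_discretize(values, k):
    """Split the range of `values` into k equal-width intervals and
    map each value to the index of its interval (0..k-1)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0  # guard against a constant attribute
    labels = []
    for v in values:
        idx = min(int((v - lo) / width), k - 1)  # clamp the maximum into the last bin
        labels.append(idx)
    return labels
```

Supervised discretizers refine this idea by choosing cut points that respect the class labels (e.g., via entropy or statistical tests) rather than splitting the range uniformly.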
The ability to store data in the DNA of a living organism has applications in a variety of areas including synthetic biology and watermarking of patented genetically modified organisms. Data stored in this medium are subject to errors arising from various mutations, such as point mutations, indels, and tandem duplication, which need to be corrected to maintain data integrity. In this paper, we provide error-correcting codes for errors caused by tandem duplications, which create a copy of a block of the sequence and insert it in a tandem manner, i.e., next to the original. In particular, we present two families of codes for correcting errors due to tandem duplications of a fixed length: the first family can correct any number of errors, while the second corrects a bounded number of errors. We also study codes for correcting tandem duplications of length up to a given constant k, where we are primarily focused on the cases of k = 2, 3. Finally, we provide a full classification of the sets of lengths allowed in tandem duplication that result in a unique root for all sequences.
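The duplication-removal step underlying such codes can be illustrated with a small sketch (a hypothetical helper, not the paper's construction): greedily delete any length-k block that is immediately followed by an identical block, and repeat until no tandem repeat of length k remains.

```python
def remove_tandem_duplications(s, k):
    """Greedily undo tandem duplications of fixed length k:
    whenever a block of length k is immediately followed by an
    identical block, delete the copy; repeat until none remain."""
    s = list(s)
    changed = True
    while changed:
        changed = False
        for i in range(len(s) - 2 * k + 1):
            if s[i:i + k] == s[i + k:i + 2 * k]:
                del s[i + k:i + 2 * k]  # drop the duplicated copy
                changed = True
                break
    return "".join(s)
```

For example, "abcabc" reduces to "abc" under k = 3, and "aaaa" reduces to "a" under k = 1; whether every order of greedy removals reaches the same root is exactly the uniqueness question the classification addresses.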
Finding image correspondences remains a challenging problem in the presence of intra-class variations and large changes in scene layout. Semantic flow methods are designed to handle images depicting different instances of the same object or scene category. We introduce a novel approach to semantic flow, dubbed proposal flow, that establishes reliable correspondences using object proposals. Unlike prevailing semantic flow approaches that operate on pixels or regularly sampled local regions, proposal flow benefits from the characteristics of modern object proposals, which exhibit high repeatability at multiple scales, and can take advantage of both local and geometric consistency constraints among proposals. We also show that the corresponding sparse proposal flow can effectively be transformed into a conventional dense flow field. We introduce two new challenging datasets that can be used to evaluate both general semantic flow techniques and region-based approaches such as proposal flow. We use these benchmarks to compare different matching algorithms, object proposals, and region features within proposal flow, to the state of the art in semantic flow. This comparison, along with experiments on standard datasets, demonstrates that proposal flow significantly outperforms existing semantic flow methods in various settings.
In the virtual world, many internet applications are used by large numbers of people for a variety of purposes; they have become a basic, habit-forming part of modern life. Like social media, e-mail is prevalent among people of different categories for personal and official communications. The widespread use of e-mail-based communication is also giving rise to various types of cybercrime, including cyberstalking. Cyberstalkers use e-mail-based approaches to harass victims, employing several content-wise and intent-wise tactics such as spam, phishing, spoofing, malicious, defamatory, and bombing e-mails, as well as non-spam e-mails containing sexist, racist, or threatening content, and finally attempting to hack the victim's account over e-mail. This paper proposes an EBCD model for automatic cyberstalking detection on the textual data of e-mails using a multi-model soft voting technique from the machine learning approach. Initially, experiments were performed to train, test, and validate all classifiers of three model sets on three different labeled datasets: dataset D1 contains spam, fraudulent, and phishing e-mail subjects; dataset D2 contains spam e-mail body text; and dataset D3 contains harassment-related data. After that, the trained, tested, and validated classifiers of all model sets were applied as a combined approach to automatically classify unlabeled e-mails from the user's mailbox using the multi-model soft voting technique. The proposed EBCD model successfully classifies e-mails from the user's mailbox into cyberstalking e-mails, suspicious e-mails (spam and fraudulent), and normal e-mails. In each model set of the EBCD model, several classifiers, namely support vector machine, random forest, naïve Bayes, logistic regression, and soft voting, were used.
The final decision in classifying the e-mails from the user's mailbox was taken by the soft voting technique of each model set. The TF-IDF feature extraction method was used with all of the applied machine learning model sets to obtain feature vectors from the data. Experimental results show that the soft voting technique not only enhances the performance of the e-mail classification task but also supports making the right decision and avoiding misclassification. The overall performance of the soft voting technique was better than that of the other classifiers, although the performance of the support vector machine was also notable. As per the experimental results, the soft voting technique obtained an accuracy of 97.7%, 97.7%, 98.9%, a precision of 97%, 98.3%, 98.6%, a recall of 98.3%, 96.5%, 99.1%, an f-score of 97.6%, 97.4%, 98.9%, and an AUC of 99.4%, 99.7%, 99.9% on datasets D1, D2, and D3, respectively. The average performance of the soft voting of each model set on classified e-mails from the user's mailbox was also notable, with an accuracy of 96.3%, precision of 98.1%, recall of 94%, f-score of 95.9%, and AUC of 96.8%.
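The soft-voting decision rule itself is simple to sketch: each classifier emits a class-probability vector, the vectors are averaged, and the class with the highest mean probability wins. The function below is an illustrative assumption, not the EBCD implementation, and the class ordering (0 = normal, 1 = suspicious, 2 = cyberstalking) is hypothetical.

```python
def soft_vote(prob_lists):
    """Soft voting: average each classifier's class-probability vector
    and return the index of the class with the highest mean probability."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__)

# Three classifiers scoring one e-mail over three classes:
# two of them lean toward class 2 (cyberstalking), so the average does too.
votes = [[0.2, 0.3, 0.5],
         [0.4, 0.1, 0.5],
         [0.5, 0.3, 0.2]]
predicted = soft_vote(votes)  # -> 2
```

Averaging probabilities (soft voting) rather than counting hard labels (hard voting) lets a confident minority classifier outweigh uncertain ones, which is consistent with the reported gain over the individual classifiers.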
In this paper, an Air-Ground Integrated VEhicular Network (AGIVEN) architecture is proposed, where aerial high-altitude platforms (HAPs) proactively push contents to vehicles through large-area broadcast, while ground roadside units (RSUs) provide high-rate unicast services on demand. To efficiently manage the multi-dimensional heterogeneous resources, a service-oriented network slicing approach is introduced, where the AGIVEN is virtually divided into multiple slices and each slice supports a specific application with guaranteed quality of service (QoS). Specifically, the fundamental problem of multi-resource provisioning in AGIVEN slicing is investigated by taking into account the typical vehicular applications of location-based map and popularity-based content services. For the location-based map service, the capability of HAP-vehicle proactive pushing is derived with respect to the HAP broadcast rate and vehicle cache size, wherein a saddle point exists, indicating the optimal communication-cache resource trading. For popular contents of common interest, the average on-board content hit ratio is obtained with HAPs pushing newly generated contents to keep the on-board cache fresh. Then, the minimal RSU transmission rate is derived to meet the average delay requirements of each slice. The obtained analytical results reveal the service-dependent resource provisioning and trading relationships among RSU transmission rate, HAP broadcast rate, and vehicle cache size, which provide guidelines for multi-resource network slicing in practice. Simulation results demonstrate that the proposed AGIVEN network slicing approach matches the multi-dimensional resources across slices, reducing the required RSU transmission rate by 40% while maintaining the same QoS.