Nodes in real-world networks organize into densely linked communities where edges appear with high concentration among the members of the community. Identifying such communities of nodes has proven to be a challenging task due to a plethora of definitions of network communities, the intractability of methods for detecting them, and issues with evaluation that stem from the lack of a reliable gold-standard ground-truth. In this paper, we distinguish between structural and functional definitions of network communities. Structural definitions of communities are based on connectivity patterns, such as the density of connections between community members, while functional definitions are based on the (often unobserved) common function or role of the community members in the network. We argue that the goal of network community detection is to extract functional communities based on the connectivity structure of the nodes in the network. We then identify networks with explicitly labeled functional communities, to which we refer as ground-truth communities. In particular, we study a set of 230 large real-world social, collaboration, and information networks where nodes explicitly state their community memberships. For example, in social networks, nodes explicitly join various interest-based social groups. We use such social groups to define a reliable and robust notion of ground-truth communities. We then propose a methodology that allows us to compare and quantitatively evaluate how different structural definitions of communities correspond to ground-truth functional communities. We study 13 commonly used structural definitions of communities and examine their sensitivity, robustness, and performance in identifying the ground-truth. We show that the 13 structural definitions are heavily correlated and naturally group into four classes. We find that two of these definitions, conductance and triad participation ratio, consistently give the best performance in identifying ground-truth communities. We also investigate the task of detecting communities given a single seed node. We extend the local spectral clustering algorithm into a heuristic, parameter-free community detection method that easily scales to networks with more than 100 million nodes. The proposed method achieves a 30% relative improvement over current local clustering methods.
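To make the two best-performing structural scores concrete, here is a minimal pure-Python sketch of conductance and triad participation ratio (TPR) evaluated on a toy undirected graph. The graph, the candidate community, and the function names are illustrative assumptions, not the paper's data or implementation.

```python
def conductance(adj, S):
    """cut(S, V\\S) / min(vol(S), vol(V\\S)); lower means a better-separated community."""
    S = set(S)
    cut = sum(1 for u in S for v in adj[u] if v not in S)   # edges leaving S
    vol_S = sum(len(adj[u]) for u in S)                     # sum of degrees inside S
    vol_rest = sum(len(adj[u]) for u in adj if u not in S)  # sum of degrees outside S
    return cut / min(vol_S, vol_rest)

def triad_participation_ratio(adj, S):
    """Fraction of nodes in S that close at least one triangle entirely inside S."""
    S = set(S)
    def in_triad(u):
        nbrs = [v for v in adj[u] if v in S]
        return any(w in adj[v] for i, v in enumerate(nbrs) for w in nbrs[i + 1:])
    return sum(in_triad(u) for u in S) / len(S)

# Toy graph: two 4-cliques {0,1,2,3} and {4,5,6,7} joined by a single bridge (3,4).
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7),
         (3, 4)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

print(conductance(adj, {0, 1, 2, 3}))             # 1 cut edge / volume 13 -> ~0.0769
print(triad_participation_ratio(adj, {0, 1, 2, 3}))  # every node closes a triangle -> 1.0
```

A dense, well-separated community thus has low conductance and high TPR, which is the intuition behind both scores performing well against ground-truth communities.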
This open access book includes methods for retrieval, semantic representation, and analysis of Volunteered Geographic Information (VGI), geovisualization and user interactions related to VGI, and discusses selected topics in active participation, social context, and privacy awareness. It presents the results of the DFG-funded priority program "VGI: Interpretation, Visualization, and Social Computing" (2016-2023). The book includes three parts representing the principal research pillars within the program. Part I "Representation and Analysis of VGI" discusses recent approaches to enhance the representation and analysis of VGI. It includes semantic representation of VGI data in knowledge graphs; machine-learning approaches to VGI mining, completion, and enrichment; as well as approaches to the improvement of data quality and fitness for purpose. Part II "Geovisualization and User Interactions related to VGI" explores geovisualizations and user interactions supporting the analysis and presentation of VGI data. When designing these visualizations and user interactions, the specific properties of VGI data, the knowledge and abilities of different target users, and the technical viability of solutions need to be considered. Part III "Active Participation, Social Context and Privacy Awareness" addresses the human impact associated with VGI. It includes chapters on the use of wearable sensors worn by volunteers to record their exposure to environmental stressors on their daily journeys, on the collective behavior of people using location-based social media and movement data from football matches, and on the motivation of volunteers who provide important support in information gathering, filtering, and analysis of social media in disaster situations. The book is of interest to researchers and advanced professionals in geoinformation, cartography, visual analytics, data science, and machine learning.
This open access book covers the use of data science, including advanced machine learning, big data analytics, Semantic Web technologies, natural language processing, social media analysis, and time series analysis, among others, for applications in economics and finance. In addition, it shows some successful applications of advanced data science solutions used to extract new knowledge from data in order to improve economic forecasting models. The book starts with an introduction on the use of data science technologies in economics and finance and is followed by thirteen chapters showing success stories of the application of specific data science methodologies, touching on particular topics related to novel big data sources and technologies for economic analysis (e.g. social media and news); big data models leveraging supervised/unsupervised (deep) machine learning; natural language processing to build economic and financial indicators; and forecasting and nowcasting of economic variables through time series analysis. This book is relevant to all stakeholders involved in digital and data-intensive research in economics and finance, helping them to understand the main opportunities and challenges, become familiar with the latest methodological findings, and learn how to use and evaluate the performance of novel tools and frameworks. It primarily targets data scientists and business analysts exploiting data science technologies, and it will also be a useful resource to research students in disciplines and courses related to these topics. Overall, readers will learn modern and effective data science solutions to create tangible innovations for economic and financial applications.
Much attention is currently being paid in both the academic and practitioner literatures to the value that organisations could create through the use of big data and business analytics (Gillon et al., 2012; Mithas, 2013). For instance, Chen (2012, pp. 1166-1168) suggests that business analytics and related technologies can help organisations to 'better understand its business and markets' and 'leverage opportunities presented by abundant data and domain-specific analytics'. Similarly, LaValle (2011, p. 22) reports that top-performing organisations 'make decisions based on rigorous analysis at more than double the rate of lower performing organisations' and that in such organisations analytic insight is being used to 'guide both future strategies and day-to-day operations'. We argue here that while there is some evidence that investments in business analytics can create value, the thesis that 'business analytics leads to value' needs deeper analysis. In particular, we argue here that the roles of organisational decision-making processes, including resource allocation processes and resource orchestration processes (Helfat et al., 2007; Teece, 2009), need to be better understood in order to understand how organisations can create value from the use of business analytics. Specifically, we propose that the first-order effects of business analytics are likely to be on decision-making processes and that improvements in organisational performance are likely to be an outcome of superior decision-making processes enabled by business analytics.
The business value of information technology (IT) has been one of the top concerns of both practitioners and scholars for decades. Numerous studies have documented the positive effects of IT capability on organizational performance, but our knowledge of the processes through which such gains are achieved remains limited due to a lack of focus on the business environment. Such a linkage therefore remains the subject of debate in the information systems literature. In this study, we fill this gap by investigating the mediating role of business process agility and the moderating roles of environmental factors. On the basis of matched survey data obtained from 214 IT and business executives from manufacturing firms in China, our analyses show that even though firm-wide IT capability presents the characteristics of rarity, appropriability, non-reproducibility, and non-substitutability, its impact on organizational performance is fully mediated by business process agility. Our results also show that the impact of the environment is multifaceted and nuanced. In particular, environmental hostility weakens the effect of IT capability on business process agility, while environmental complexity strengthens it. The theoretical and practical implications of this study, and its limitations, are also discussed.
In this paper, we focus on a critical aspect of work in organizations: using information in work tasks that is provided by information systems (IS) such as enterprise content management (ECM) systems. Our study, based on the IS success model, 34 interviews, and an empirical study of 247 ECM system users at a financial service provider, indicates that it is appropriate to differentiate between contextual and representational information quality as two information quality dimensions. Furthermore, we reveal that in addition to system quality, the two information quality dimensions are important in determining end-user satisfaction, which in turn influences the manifestation of workarounds. Our study also finds that employees' use of workarounds to avoid an ECM system implemented several years earlier is negatively related to the individual net benefits of the ECM system. Hence, we conclude that when investigating large-scale IS such as ECM systems, it is important to differentiate among information quality dimensions to more deeply understand end-user satisfaction and the resulting manifestation of workarounds. Moreover, this research guides organizations in implementing the most appropriate countermeasures based on the importance of either contextual or representational information quality.
Ten years ago, we presented the DeLone and McLean Information Systems (IS) Success Model as a framework and model for measuring the complex-dependent variable in IS research. In this paper, we ...discuss many of the important IS success research contributions of the last decade, focusing especially on research efforts that apply, validate, challenge, and propose enhancements to our original model. Based on our evaluation of those contributions, we propose minor refinements to the model and propose an updated DeLone and McLean IS Success Model. We discuss the utility of the updated model for measuring e-commerce system success. Finally, we make a series of recommendations regarding current and future measurement of IS success.