Research Group Stefan Feuerriegel

Stefan Feuerriegel

Prof. Dr.

Principal Investigator

C4 | Computational Social Sciences

Artificial Intelligence in Management

Stefan Feuerriegel is the head of the Institute of Artificial Intelligence (AI) in Management at LMU Munich.

His research focuses on developing AI algorithms to support data-driven decision-making for businesses and public organizations. He is also dedicated to advancing ‘AI for good’, aiming to create positive social impact through responsible and ethical AI applications.

Team members @MCML

PhD Students

Dominik Bär

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Nils Brockmann

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Marie Brockschmidt

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Octavia-Andreea Ciora

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Kerstin Forster

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Dominique Geißler

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Konstantin Heß

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Pascal Janetzky

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Emil Javurek

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Yuchen Ma

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Abdurahman Maarouf

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Sebastian Maier

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Valentyn Melnychuk

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Simon Schallmoser

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Maresa Schröder

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Jonas Schweisthal

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Yuxin Wang

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Publications @MCML

2025

[76]

Z. Jonassen, K. Lawrence, B. M. Wiesenfeld, S. Feuerriegel and D. Mann.
A qualitative analysis of remote patient monitoring: how a paradox mindset can support balancing emotional tensions in the design of healthcare technologies.
CSCW 2025 - 28th ACM SIGCHI Conference on Computer-Supported Cooperative Work and Social Computing. Bergen, Norway, Oct 18-22, 2025. To be published. Preprint available. DOI

Abstract

Remote patient monitoring (RPM) is the use of digital technologies to improve patient care at a distance. However, current RPM solutions are often biased toward tech-savvy patients. To foster health equity, researchers have studied how to address the socio-economic and cognitive needs of diverse patient groups, but their emotional needs have remained largely neglected. We perform the first qualitative study to explore the emotional needs of diverse patients around RPM. Specifically, we conduct a thematic analysis of 18 interviews and 4 focus groups at a large US healthcare organization. We identify emotional needs that lead to four emotional tensions within and across stakeholder groups when applying an equity focus to the design and implementation of RPM technologies. The four emotional tensions are making diverse patients feel: (i) heard vs. exploited; (ii) seen vs. deprioritized for efficiency; (iii) empowered vs. anxious; and (iv) cared for vs. detached from care. To manage these emotional tensions across stakeholders, we develop design recommendations informed by a paradox mindset (i.e., ‘both-and’ rather than ‘and-or’ strategies).

MCML Authors

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[75]

J. Schweisthal, D. Frauen, M. Schröder, K. Heß, N. Kilbertus and S. Feuerriegel.
Learning Representations of Instruments for Partial Identification of Treatment Effects.
ICML 2025 - 42nd International Conference on Machine Learning. Vancouver, Canada, Jul 13-19, 2025. To be published. Preprint available. arXiv

Abstract

Reliable estimation of treatment effects from observational data is important in many disciplines such as medicine. However, estimation is challenging when unconfoundedness as a standard assumption in the causal inference literature is violated. In this work, we leverage arbitrary (potentially high-dimensional) instruments to estimate bounds on the conditional average treatment effect (CATE). Our contributions are three-fold: (1) We propose a novel approach for partial identification through a mapping of instruments to a discrete representation space so that we yield valid bounds on the CATE. This is crucial for reliable decision-making in real-world applications. (2) We derive a two-step procedure that learns tight bounds using a tailored neural partitioning of the latent instrument space. As a result, we avoid instability issues due to numerical approximations or adversarial training. Furthermore, our procedure aims to reduce the estimation variance in finite-sample settings to yield more reliable estimates. (3) We show theoretically that our procedure obtains valid bounds while reducing estimation variance. We further perform extensive experiments to demonstrate the effectiveness across various settings. Overall, our procedure offers a novel path for practitioners to make use of potentially high-dimensional instruments (e.g., as in Mendelian randomization).

MCML Authors

Jonas Schweisthal

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Maresa Schröder

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Konstantin Heß

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Niki Kilbertus

Prof. Dr.

A3 | Computational Models

Ethics in Systems Design and Machine Learning

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[74]

Y. Ma, D. Frauen, E. Javurek and S. Feuerriegel.
Foundation Models for Causal Inference via Prior-Data Fitted Networks.
Preprint (Jun. 2025). arXiv

Abstract

Prior-data fitted networks (PFNs) have recently been proposed as a promising way to train tabular foundation models. PFNs are transformers that are pre-trained on synthetic data generated from a prespecified prior distribution and that enable Bayesian inference through in-context learning. In this paper, we introduce CausalFM, a comprehensive framework for training PFN-based foundation models in various causal inference settings. First, we formalize the construction of Bayesian priors for causal inference based on structural causal models (SCMs) in a principled way and derive necessary criteria for the validity of such priors. Building on this, we propose a novel family of prior distributions using causality-inspired Bayesian neural networks that enable CausalFM to perform Bayesian causal inference in various settings, including back-door, front-door, and instrumental variable adjustment. Finally, we instantiate CausalFM and explicitly train a foundation model for estimating conditional average treatment effects (CATEs) using back-door adjustment. We show that CausalFM performs competitively for CATE estimation using various synthetic and semi-synthetic benchmarks. In sum, our framework can be used as a general recipe to train foundation models for various causal inference settings. In contrast to the current state-of-the-art in causal inference, CausalFM offers a novel paradigm with the potential to fundamentally change how practitioners perform causal inference in medicine, economics, and other disciplines.

MCML Authors

Yuchen Ma

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Emil Javurek

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[73]

Y. Ma, J. Schweisthal, H. Zhang and S. Feuerriegel.
A Diffusion-Based Method for Learning the Multi-Outcome Distribution of Medical Treatments.
Preprint (Jun. 2025). arXiv

Abstract

In medicine, treatments often influence multiple, interdependent outcomes, such as primary endpoints, complications, adverse events, or other secondary endpoints. Hence, to make optimal treatment decisions, clinicians are interested in learning the distribution of multi-dimensional treatment outcomes. However, the vast majority of machine learning methods for predicting treatment effects focus on single-outcome settings, despite the fact that medical data often include multiple, interdependent outcomes. To address this limitation, we propose a novel diffusion-based method called DIME to learn the joint distribution of multiple outcomes of medical treatments. Our method addresses three challenges relevant in medical practice: (i) it is tailored to learn the joint interventional distribution of multiple medical outcomes, which enables reliable decision-making with uncertainty quantification rather than relying solely on point estimates; (ii) it explicitly captures the dependence structure between outcomes; (iii) it can handle outcomes of mixed type, including binary, categorical, and continuous variables. In DIME, we take into account the fundamental problem of causal inference through causal masking. For training, our method decomposes the joint distribution into a series of conditional distributions with a customized conditional masking to account for the dependence structure across outcomes. For inference, our method auto-regressively generates predictions. This allows our method to move beyond point estimates of causal quantities and thus learn the joint interventional distribution. To the best of our knowledge, DIME is the first neural method tailored to learn the joint, multi-outcome distribution of medical treatments. Across various experiments, we demonstrate that our method effectively learns the joint distribution and captures shared information among multiple outcomes.

MCML Authors

Yuchen Ma

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Jonas Schweisthal

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[72]

C. Kern, U. Fischer-Abaigar, J. Schweisthal, D. Frauen, R. Ghani, S. Feuerriegel, M. van der Schaar and F. Kreuter.
Algorithms for reliable decision-making need causal reasoning.
Nature Computational Science 5 (May. 2025). DOI

Abstract

Decision-making inherently involves cause–effect relationships that introduce causal challenges. We argue that reliable algorithms for decision-making need to build upon causal reasoning. Addressing these causal challenges requires explicit assumptions about the underlying causal structure to ensure identifiability and estimatability, which means that the computational methods must successfully align with decision-making objectives in real-world tasks. Algorithmic decision-making (ADM) has become common in a wide range of domains, including precision medicine, manufacturing, education, hiring, the public sector, and smart cities. At the core of ADM systems are data-driven models that learn from data to recommend decisions, often with the goal of maximizing a defined utility function. For example, in smart city contexts, ADM is frequently used to optimize traffic flow through predictive models that analyze real-time data, thereby reducing congestion and improving urban mobility. Another prominent application area for ADM is normative decision support systems (often subsumed under ‘prescriptive analytics’) or, more recently, artificial intelligence (AI) agents that either inform or automatically execute managerial and operational decisions in industry. Yet, the applications of ADM to high-stakes decisions face safety and reliability issues. Often, the objectives of ADM systems fail to align with the nuanced goals of real-world decision-making, thus creating a tension between the potential of ADM and the risk of harm and failure. Especially when deployed in dynamic, real-world environments, ADM can amplify systemic disadvantages for vulnerable communities and lead to flawed decisions. In this Comment, we argue that reliable algorithmic decision-making (systems that perform safely and robustly under deployment conditions) must be grounded in causal reasoning.

MCML Authors

Christoph Kern

Prof. Dr.

C4 | Computational Social Sciences

Social Data Science and AI Lab

Jonas Schweisthal

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

Frauke Kreuter

Prof. Dr.

C4 | Computational Social Sciences

Social Data Science and AI

[71]

D. Frauen, V. Melnychuk, J. Schweisthal, M. van der Schaar and S. Feuerriegel.
Treatment Effect Estimation for Optimal Decision-Making.
Preprint (May. 2025). arXiv

Abstract

Decision-making across various fields, such as medicine, heavily relies on conditional average treatment effects (CATEs). Practitioners commonly make decisions by checking whether the estimated CATE is positive, even though the decision-making performance of modern CATE estimators is poorly understood from a theoretical perspective. In this paper, we study optimal decision-making based on two-stage CATE estimators (e.g., DR-learner), which are considered state-of-the-art and widely used in practice. We prove that, while such estimators may be optimal for estimating CATE, they can be suboptimal when used for decision-making. Intuitively, this occurs because such estimators prioritize CATE accuracy in regions far away from the decision boundary, which is ultimately irrelevant to decision-making. As a remedy, we propose a novel two-stage learning objective that retargets the CATE to balance CATE estimation error and decision performance. We then propose a neural method that optimizes an adaptively-smoothed approximation of our learning objective. Finally, we confirm the effectiveness of our method both empirically and theoretically. In sum, our work is the first to show how two-stage CATE estimators can be adapted for optimal decision-making.
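As a rough illustration of the setting this abstract describes (not the retargeted objective the authors propose), the sketch below computes the standard doubly robust pseudo-outcome that a DR-learner regresses on covariates in its second stage, and then applies the common "treat if the estimated CATE is positive" rule. The function names and the toy nuisance values are hypothetical; in practice the outcome models and the propensity score would be estimated from data.

```python
def dr_pseudo_outcome(y, a, mu0, mu1, pi):
    """Doubly robust pseudo-outcome for a single unit.

    y   : observed outcome
    a   : binary treatment indicator (0 or 1)
    mu0 : estimated outcome under control, E[Y | X, A=0]
    mu1 : estimated outcome under treatment, E[Y | X, A=1]
    pi  : estimated propensity score, P(A=1 | X), strictly between 0 and 1
    """
    if not 0.0 < pi < 1.0:
        raise ValueError("propensity score must lie strictly between 0 and 1")
    plug_in = mu1 - mu0
    # Inverse-propensity-weighted residual corrections for each arm.
    correction = a * (y - mu1) / pi - (1 - a) * (y - mu0) / (1 - pi)
    return plug_in + correction


def treat_if_positive(cate_estimate):
    """Thresholding rule: recommend treatment when the estimated CATE is positive."""
    return cate_estimate > 0


# Toy values (assumed, for illustration only).
phi = dr_pseudo_outcome(y=3.0, a=1, mu0=1.0, mu1=2.0, pi=0.5)
print(phi, treat_if_positive(phi))
```

The pseudo-outcomes would normally be regressed on covariates to obtain a CATE function; the abstract's point is that accuracy of that function far from the zero threshold does not affect the resulting decisions.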

MCML Authors

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Valentyn Melnychuk

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Jonas Schweisthal

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[70]

D. Frauen, M. Schröder, K. Heß and S. Feuerriegel.
Orthogonal Survival Learners for Estimating Heterogeneous Treatment Effects from Time-to-Event Data.
Preprint (May. 2025). arXiv

Abstract

Estimating heterogeneous treatment effects (HTEs) is crucial for personalized decision-making. However, this task is challenging in survival analysis, which includes time-to-event data with censored outcomes (e.g., due to study dropout). In this paper, we propose a toolbox of novel orthogonal survival learners to estimate HTEs from time-to-event data under censoring. Our learners have three main advantages: (i) we show that learners from our toolbox are guaranteed to be orthogonal and thus come with favorable theoretical properties; (ii) our toolbox allows for incorporating a custom weighting function, which can lead to robustness against different types of low overlap, and (iii) our learners are model-agnostic (i.e., they can be combined with arbitrary machine learning models). We instantiate the learners from our toolbox using several weighting functions and, as a result, propose various neural orthogonal survival learners. Some of these coincide with existing survival learners (including survival versions of the DR- and R-learner), while others are novel and further robust w.r.t. low overlap regimes specific to the survival setting (i.e., survival overlap and censoring overlap). We then empirically verify the effectiveness of our learners for HTE estimation in different low-overlap regimes through numerical experiments. In sum, we provide practitioners with a large toolbox of learners that can be used for randomized and observational studies with censored time-to-event data.

MCML Authors

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Maresa Schröder

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[69]

N. Holzner, S. Maier and S. Feuerriegel.
Generative AI and Creativity: A Systematic Literature Review and Meta-Analysis.
Preprint (May. 2025). arXiv

Abstract

Generative artificial intelligence (GenAI) is increasingly used to support a wide range of human tasks, yet empirical evidence on its effect on creativity remains scattered. Can GenAI generate ideas that are creative? To what extent can it support humans in generating ideas that are both creative and diverse? In this study, we conduct a meta-analysis to evaluate the effect of GenAI on the performance in creative tasks. For this, we first perform a systematic literature search, based on which we identify n = 28 relevant studies (m = 8214 participants) for inclusion in our meta-analysis. We then compute standardized effect sizes based on Hedges’ g. We compare different outcomes: (i) how creative GenAI is; (ii) how creative humans augmented by GenAI are; and (iii) the diversity of ideas by humans augmented by GenAI. Our results show no significant difference in creative performance between GenAI and humans (g = -0.05), while humans collaborating with GenAI significantly outperform those working without assistance (g = 0.27). However, GenAI has a significant negative effect on the diversity of ideas for such collaborations between humans and GenAI (g = -0.86). We further analyze heterogeneity across different GenAI models (e.g., GPT-3.5, GPT-4), different tasks (e.g., creative writing, ideation, divergent thinking), and different participant populations (e.g., laypeople, business, academia). Overall, our results position GenAI as an augmentative tool that can support, rather than replace, human creativity, particularly in tasks benefiting from ideation support.
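For readers unfamiliar with the effect size named in this abstract, the following minimal sketch computes Hedges' g for two independent groups from summary statistics, using the usual pooled standard deviation and the common approximate small-sample correction factor. The function name and the example numbers are illustrative assumptions and are not taken from the study.

```python
import math


def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Hedges' g: standardized mean difference with small-sample correction.

    mean_t, sd_t, n_t : mean, standard deviation, size of the treated group
    mean_c, sd_c, n_c : mean, standard deviation, size of the control group
    """
    df = n_t + n_c - 2
    if df <= 0:
        raise ValueError("need at least three observations in total")
    # Pooled standard deviation across both groups.
    pooled_sd = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    cohens_d = (mean_t - mean_c) / pooled_sd
    # Approximate bias-correction factor J = 1 - 3 / (4 * df - 1).
    correction = 1 - 3 / (4 * df - 1)
    return correction * cohens_d


# Hypothetical summary statistics for one study.
print(hedges_g(mean_t=10.0, mean_c=8.0, sd_t=2.0, sd_c=2.0, n_t=20, n_c=20))
```

A positive g means the first group scored higher; a meta-analysis would then pool such per-study values with weights based on their variances.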

MCML Authors

Sebastian Maier

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[68]

M. Schröder, J. Hartenstein and S. Feuerriegel.
PrivATE: Differentially Private Confidence Intervals for Average Treatment Effects.
Preprint (May. 2025). arXiv

Abstract

The average treatment effect (ATE) is widely used to evaluate the effectiveness of drugs and other medical interventions. In safety-critical applications like medicine, reliable inferences about the ATE typically require valid uncertainty quantification, such as through confidence intervals (CIs). However, estimating treatment effects in these settings often involves sensitive data that must be kept private. In this work, we present PrivATE, a novel machine learning framework for computing CIs for the ATE under differential privacy. Specifically, we focus on deriving valid privacy-preserving CIs for the ATE from observational data. Our PrivATE framework consists of three steps: (i) estimating a differentially private ATE through output perturbation; (ii) estimating the differentially private variance through a truncated output perturbation mechanism; and (iii) constructing the CIs while accounting for the uncertainty from both the estimation and privatization steps. Our PrivATE framework is model agnostic, doubly robust, and ensures valid CIs. We demonstrate the effectiveness of our framework using synthetic and real-world medical datasets. To the best of our knowledge, we are the first to derive a general, doubly robust framework for valid CIs of the ATE under (ε, δ)-differential privacy.
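The idea of output perturbation mentioned in step (i) can be sketched with the classical Gaussian mechanism: a non-private estimate is released after adding noise calibrated to its sensitivity. This is a generic illustration and not the PrivATE procedure itself; the function name, the sensitivity value, and the example estimate are assumptions, and the noise calibration shown is the textbook one that holds for 0 < ε < 1.

```python
import math
import random


def gaussian_mechanism(value, sensitivity, epsilon, delta, rng=None):
    """Release `value` under (epsilon, delta)-differential privacy.

    sensitivity : assumed upper bound on how much `value` can change when
                  one individual's record changes (supplied by the caller)
    Returns the noisy value and the noise standard deviation used.
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("this calibration assumes 0 < epsilon < 1")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    rng = rng or random.Random()
    # Classical Gaussian-mechanism noise scale.
    sigma = sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon
    return value + rng.gauss(0.0, sigma), sigma


# Hypothetical non-private ATE estimate and sensitivity bound.
private_ate, sigma = gaussian_mechanism(
    value=0.30, sensitivity=0.01, epsilon=0.5, delta=1e-5, rng=random.Random(0)
)
print(private_ate, sigma)
```

A valid confidence interval then has to account for this added noise variance as well as the sampling variance, which is the concern of step (iii) in the abstract.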

MCML Authors

Maresa Schröder

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[67]

D. Geißler, A. Maarouf and S. Feuerriegel.
Analyzing User Characteristics of Hate Speech Spreaders on Social Media.
WWW 2025 - ACM Web Conference. Sydney, Australia, Apr 28-May 02, 2025. To be published. Preprint available. arXiv

Abstract

Hate speech on social media threatens the mental and physical well-being of individuals and contributes to real-world violence. Resharing is an important driver behind the spread of hate speech on social media. Yet, little is known about who reshares hate speech and what their characteristics are. In this paper, we analyze the role of user characteristics in hate speech resharing across different types of hate speech (e.g., political hate). For this, we proceed as follows: First, we cluster hate speech posts using large language models to identify different types of hate speech. Then we model the effects of user attributes on users’ probability to reshare hate speech using an explainable machine learning model. To do so, we apply debiasing to control for selection bias in our observational social media data and further control for the latent vulnerability of users to hate speech. We find that, all else equal, users with fewer followers, fewer friends, fewer posts, and older accounts share more hate speech. This shows that users with little social influence tend to share more hate speech. Further, we find substantial heterogeneity across different types of hate speech. For example, racist and misogynistic hate is spread mostly by users with little social influence. In contrast, political anti-Trump and anti-right-wing hate is reshared by users with larger social influence. Overall, understanding the factors that drive users to share hate speech is crucial for detecting individuals at risk of engaging in harmful behavior and for designing effective mitigation strategies.

MCML Authors

Dominique Geißler

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Abdurahman Maarouf

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[66]

K. Forster, V. Wagner, L. Keil, M. A. Müller, T. Sellhorn and S. Feuerriegel.
Tracking ESG Disclosures of European Companies with Retrieval-Augmented Generation.
Climate Change AI @ICLR 2025 - Workshop on Tackling Climate Change with Machine Learning at the 13th International Conference on Learning Representations (ICLR 2025). Singapore, Apr 24-28, 2025. To be published.

Abstract

Corporations play a crucial role in mitigating climate change and accelerating progress toward environmental, social, and governance (ESG) objectives. However, structured information on the current state of corporate ESG efforts remains limited. In this paper, we propose a machine learning framework based on a retrieval-augmented generation (RAG) pipeline to track ESG indicators from N = 9,200 corporate reports. Our analysis includes ESG indicators from 600 of the largest listed corporations in Europe between 2014 and 2023. We focus on two key dimensions: first, we identify gaps in corporate sustainability reporting in light of existing standards. Second, we provide comprehensive bottom-up estimates of key ESG indicators across European industries. Our findings enable policymakers and financial markets to effectively assess corporate ESG transparency and track progress toward global sustainability objectives.

MCML Authors

Kerstin Forster

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[65]

D. Frauen, K. Heß and S. Feuerriegel.
Model-agnostic meta-learners for estimating heterogeneous treatment effects over time.
ICLR 2025 - 13th International Conference on Learning Representations. Singapore, Apr 24-28, 2025. To be published. Preprint available. arXiv

Abstract

Estimating heterogeneous treatment effects (HTEs) over time is crucial in many disciplines such as personalized medicine. For example, electronic health records are commonly collected over several time periods and then used to personalize treatment decisions. Existing works for this task have mostly focused on model-based learners (i.e., learners that adapt specific machine-learning models). In contrast, model-agnostic learners – so-called meta-learners – are largely unexplored. In our paper, we propose several meta-learners that are model-agnostic and thus can be used in combination with arbitrary machine learning models (e.g., transformers) to estimate HTEs over time. Here, our focus is on learners that can be obtained via weighted pseudo-outcome regressions, which allows for efficient estimation by targeting the treatment effect directly. We then provide a comprehensive theoretical analysis that characterizes the different learners and that allows us to offer insights into when specific learners are preferable. Finally, we confirm our theoretical insights through numerical experiments. In sum, while meta-learners are already state-of-the-art for the static setting, we are the first to propose a comprehensive set of meta-learners for estimating HTEs in the time-varying setting.

MCML Authors

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Konstantin Heß

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[64]

K. Heß and S. Feuerriegel.
Stabilized Neural Prediction of Potential Outcomes in Continuous Time.
ICLR 2025 - 13th International Conference on Learning Representations. Singapore, Apr 24-28, 2025. To be published. Preprint available. arXiv

Abstract

Patient trajectories from electronic health records are widely used to predict potential outcomes of treatments over time, which then allows care to be personalized. Yet, existing neural methods for this purpose have a key limitation: while some adjust for time-varying confounding, these methods assume that the time series are recorded in discrete time. In other words, they are constrained to settings where measurements and treatments are conducted at fixed time steps, even though this is unrealistic in medical practice. In this work, we aim to predict potential outcomes in continuous time. The latter is of direct practical relevance because it allows for modeling patient trajectories where measurements and treatments take place at arbitrary, irregular timestamps. We thus propose a new method called stabilized continuous time inverse propensity network (SCIP-Net). For this, we further derive stabilized inverse propensity weights for robust prediction of the potential outcomes. To the best of our knowledge, our SCIP-Net is the first neural method that performs proper adjustments for time-varying confounding in continuous time.
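To give a feel for what "stabilized inverse propensity weights" means, the sketch below shows the familiar discrete-time version: a product over time steps of the ratio between a marginal treatment probability and the treatment probability given the full history. This is deliberately the simpler discrete-time construction, not the continuous-time weights derived for SCIP-Net, and the function name and probabilities are hypothetical.

```python
def stabilized_weight(marginal_probs, conditional_probs):
    """Discrete-time stabilized inverse propensity weight for one trajectory.

    marginal_probs    : per-step probability of the observed treatment given
                        only past treatments (the stabilizing numerator)
    conditional_probs : per-step probability of the observed treatment given
                        the full history including covariates (the denominator)
    """
    if len(marginal_probs) != len(conditional_probs):
        raise ValueError("both sequences must cover the same time steps")
    weight = 1.0
    for p_marginal, p_conditional in zip(marginal_probs, conditional_probs):
        if p_conditional <= 0.0:
            raise ValueError("conditional probabilities must be positive")
        weight *= p_marginal / p_conditional
    return weight


# Hypothetical two-step trajectory.
print(stabilized_weight([0.5, 0.5], [0.25, 0.5]))
```

Dividing by the conditional probability alone can produce very large weights over long horizons; the marginal numerator keeps the product closer to one, which is the motivation for stabilization.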

MCML Authors

Konstantin Heß

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[63]

M. Schröder, V. Melnychuk and S. Feuerriegel.
Differentially private learners for heterogeneous treatment effects.
ICLR 2025 - 13th International Conference on Learning Representations. Singapore, Apr 24-28, 2025. To be published. Preprint available. URL

Abstract

Patient data is widely used to estimate heterogeneous treatment effects and understand the effectiveness and safety of drugs. Yet, patient data includes highly sensitive information that must be kept private. In this work, we aim to estimate the conditional average treatment effect (CATE) from observational data under differential privacy. Specifically, we present DP-CATE, a novel framework for CATE estimation that is doubly robust and ensures differential privacy of the estimates. For this, we build upon non-trivial tools from semi-parametric and robust statistics to exploit the connection between privacy and model robustness. Our framework is highly general and applies to any two-stage CATE meta-learner with a Neyman-orthogonal loss function. It can be used with all machine learning models employed for nuisance estimation. We further provide an extension of DP-CATE where we employ RKHS regression to release the complete doubly robust CATE function while ensuring differential privacy. We demonstrate the effectiveness of DP-CATE across various experiments using synthetic and real-world datasets. To the best of our knowledge, we are the first to provide a framework for CATE estimation that is doubly robust and differentially private.

MCML Authors

Maresa Schröder

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Valentyn Melnychuk

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[62]

Y. Wang, M. Schröder, D. Frauen, J. Schweisthal, K. Heß and S. Feuerriegel.
Constructing Confidence Intervals for Average Treatment Effects from Multiple Datasets.
ICLR 2025 - 13th International Conference on Learning Representations. Singapore, Apr 24-28, 2025. To be published. Preprint available. arXiv

Abstract

Constructing confidence intervals (CIs) for the average treatment effect (ATE) from patient records is crucial to assess the effectiveness and safety of drugs. However, patient records typically come from different hospitals, thus raising the question of how multiple observational datasets can be effectively combined for this purpose. In our paper, we propose a new method that estimates the ATE from multiple observational datasets and provides valid CIs. Our method makes few assumptions about the observational datasets and is thus widely applicable in medical practice. The key idea of our method is that we leverage prediction-powered inference and thereby essentially ‘shrink’ the CIs so that we offer more precise uncertainty quantification as compared to naïve approaches. We further prove the unbiasedness of our method and the validity of our CIs. We confirm our theoretical results through various numerical experiments. Finally, we provide an extension of our method for constructing CIs from combinations of experimental and observational datasets.
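The general mechanism behind prediction-powered inference can be sketched for the simplest case, estimating a mean: predictions on a large unlabeled sample give a low-variance estimate, and a small labeled sample corrects the bias of those predictions. This is the generic mean estimator with a normal-approximation interval, not the ATE procedure of the paper; the function name, the fixed z-value, and the toy data are assumptions.

```python
import math
import statistics


def ppi_mean_ci(y_labeled, pred_labeled, pred_unlabeled, z=1.96):
    """Prediction-powered estimate of a mean with an approximate 95% CI.

    y_labeled      : observed outcomes on the small labeled sample
    pred_labeled   : model predictions for the same labeled units
    pred_unlabeled : model predictions on the large unlabeled sample
    """
    # Rectifier: how far the predictions are off on the labeled data.
    rectifier = [y - f for y, f in zip(y_labeled, pred_labeled)]
    estimate = statistics.fmean(pred_unlabeled) + statistics.fmean(rectifier)
    # Variances of the two independent sample means add up.
    variance = (
        statistics.variance(pred_unlabeled) / len(pred_unlabeled)
        + statistics.variance(rectifier) / len(rectifier)
    )
    half_width = z * math.sqrt(variance)
    return estimate, (estimate - half_width, estimate + half_width)


# Toy data (assumed): predictions are biased downward by 0.5.
est, (lo, hi) = ppi_mean_ci(
    y_labeled=[1.0, 2.0, 3.0],
    pred_labeled=[0.5, 1.5, 2.5],
    pred_unlabeled=[1.0, 2.0, 3.0, 2.0],
)
print(est, lo, hi)
```

When the predictions are accurate, the rectifier has small variance and the interval is narrower than one built from the labeled outcomes alone, which is the sense in which the intervals are "shrunk".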

MCML Authors

Yuxin Wang

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Maresa Schröder

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Jonas Schweisthal

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Konstantin Heß

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[61]

A. Maarouf, S. Feuerriegel and N. Pröllochs.
A fused large language model for predicting startup success.
European Journal of Operational Research 322.1 (Apr. 2025). DOI

Abstract

Investors are continuously seeking profitable investment opportunities in startups and, hence, for effective decision-making, need to predict a startup’s probability of success. Nowadays, investors can use not only various fundamental information about a startup (e.g., the age of the startup, the number of founders, and the business sector) but also textual descriptions of a startup’s innovation and business model, which are widely available through online venture capital (VC) platforms such as Crunchbase. To support the decision-making of investors, we develop a machine learning approach with the aim of locating successful startups on VC platforms. Specifically, we develop, train, and evaluate a tailored, fused large language model to predict startup success. Thereby, we assess to what extent self-descriptions on VC platforms are predictive of startup success. Using 20,172 online profiles from Crunchbase, we find that our fused large language model can predict startup success, with textual self-descriptions being responsible for a significant part of the predictive power. Our work provides a decision support tool for investors to find profitable investment opportunities.

MCML Authors

Abdurahman Maarouf

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[60]

D. Martens, G. Shmueli, T. Evgeniou, K. Bauer, C. Janiesch, S. Feuerriegel, S. Gabel, S. Goethals, T. Greene, N. Klein, M. Kraus, N. Kühl, C. Perlich, W. Verbeke, A. Zharova, P. Zschech and F. Provost.
Beware of 'Explanations' of AI.
Preprint (Apr. 2025). arXiv

Abstract

Understanding the decisions made and actions taken by increasingly complex AI systems remains a key challenge. This has led to an expanding field of research in explainable artificial intelligence (XAI), highlighting the potential of explanations to enhance trust, support adoption, and meet regulatory standards. However, the question of what constitutes a ‘good’ explanation is dependent on the goals, stakeholders, and context. At a high level, psychological insights such as the concept of mental model alignment can offer guidance, but success in practice is challenging due to social and technical factors. As a result of this ill-defined nature of the problem, explanations can be of poor quality (e.g. unfaithful, irrelevant, or incoherent), potentially leading to substantial risks. Instead of fostering trust and safety, poorly designed explanations can actually cause harm, including wrong decisions, privacy violations, manipulation, and even reduced AI adoption. Therefore, we caution stakeholders to beware of explanations of AI: while they can be vital, they are not automatically a remedy for transparency or responsible AI adoption, and their misuse or limitations can exacerbate harm. Attention to these caveats can help guide future research to improve the quality and impact of AI explanations.

MCML Authors

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[59]

D. Bär, N. Pröllochs and S. Feuerriegel.
The role of social media ads for election outcomes: Evidence from the 2021 German election.
PNAS Nexus.pgaf073 (Mar. 2025). DOI

Abstract

Social media ads have become a key communication channel in politics. However, the relationship between political ads from social media and election outcomes is not fully understood. Here, we aim to estimate the association between online political advertising and election outcomes during the 2021 German federal election. For this, we analyze a large-scale dataset of 21,641 political ads from Facebook and Instagram that received ≈126 million impressions. Using regression analysis, we show that political advertising on social media has a positive relationship with a candidate’s election outcome and may even sway elections. All else equal, ≈200,000 additional impressions are predicted to increase a candidate’s votes by 2.1%. We further use a causal sensitivity analysis to evaluate how unobserved confounding may affect our estimates. We find that the estimated impact of ads cannot be reasonably explained away, highlighting the significance of social media for election outcomes.

MCML Authors

Dominik Bär

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[58]

P. Spitzer, D. Hendriks, J. Rudolph, S. Schläger, J. Ricke, N. Kühl, B. F. Hoppe and S. Feuerriegel.
The effect of medical explanations from large language models on diagnostic decisions in radiology.
Preprint (Mar. 2025). DOI

Abstract

Large language models (LLMs) are increasingly used by physicians for diagnostic support. A key advantage of LLMs is the ability to generate explanations that can help physicians understand the reasoning behind a diagnosis. However, the best-suited format for LLM-generated explanations remains unclear. In this large-scale study, we examined the effect of different formats for LLM explanations on clinical decision-making. For this, we conducted a randomized experiment with radiologists reviewing patient cases with radiological images (N=2020 assessments). Participants received either no LLM support (control group) or were supported by one of three LLM-generated explanations: (1) a standard output providing the diagnosis without explanation; (2) a differential diagnosis comparing multiple possible diagnoses; or (3) a chain-of-thought explanation offering a detailed reasoning process for the diagnosis. We find that the format of explanations significantly influences diagnostic accuracy. The chain-of-thought explanations yielded the best performance, improving the diagnostic accuracy by 12.2% compared to the control condition without LLM support (P=0.001). The chain-of-thought explanations are also superior to the standard output without explanation (+7.2%; P=0.040) and the differential diagnosis format (+9.7%; P=0.004). Evidently, explaining the reasoning for a diagnosis helps physicians to identify and correct potential errors in LLM predictions and thus improve overall decisions. Altogether, the results highlight the importance of how explanations in medical LLMs are generated to maximize their utility in clinical practice. By designing explanations to support the reasoning processes of physicians, LLMs can improve diagnostic performance and, ultimately, patient outcomes.

MCML Authors

Boj Friedrich Hoppe

Dr.

C1 | Medicine
→ Group Michael Ingrisch

Clinical Data Science in Radiology

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[57]

D. Tschernutter and S. Feuerriegel.
Data-driven dynamic police patrolling: An efficient Monte Carlo tree search.
European Journal of Operational Research 321.1 (Feb. 2025). DOI

Abstract

Crime is responsible for major financial losses and serious harm to the well-being of individuals, and, hence, a crucial task of police operations is effective patrolling. Yet, in existing decision models aimed at police operations, microscopic routing decisions from patrolling are not considered, and, furthermore, the objective is limited to surrogate metrics (e.g., response time) instead of crime prevention. In this paper, we thus formalize the decision problem of dynamic police patrolling as a Markov decision process that models microscopic routing decisions, so that the expected number of prevented crimes is maximized. We experimentally show that standard solution approaches for our decision problem are not scalable to real-world settings. As a remedy, we present a tailored and highly efficient Monte Carlo tree search algorithm. We then demonstrate our algorithm numerically using real-world crime data from Chicago and show that the decision-making by our algorithm offers significant improvements for crime prevention over patrolling tactics from current practice. Informed by our results, we finally discuss implications for improving the patrolling tactics in police operations.

MCML Authors

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[56]

K. Heß, D. Frauen, V. Melnychuk and S. Feuerriegel.
Efficient and Sharp Off-Policy Learning under Unobserved Confounding.
Preprint (Feb. 2025). arXiv

Abstract

We develop a novel method for personalized off-policy learning in scenarios with unobserved confounding. Thereby, we address a key limitation of standard policy learning: standard policy learning assumes unconfoundedness, meaning that no unobserved factors influence both treatment assignment and outcomes. However, this assumption is often violated, because of which standard policy learning produces biased estimates and thus leads to policies that can be harmful. To address this limitation, we employ causal sensitivity analysis and derive a statistically efficient estimator for a sharp bound on the value function under unobserved confounding. Our estimator has three advantages: (1) Unlike existing works, our estimator avoids unstable minimax optimization based on inverse propensity weighted outcomes. (2) Our estimator is statistically efficient. (3) We prove that our estimator leads to the optimal confounding-robust policy. Finally, we extend our theory to the related task of policy improvement under unobserved confounding, i.e., when a baseline policy such as the standard of care is available. We show in experiments with synthetic and real-world data that our method outperforms simple plug-in approaches and existing baselines. Our method is highly relevant for decision-making where unobserved confounding can be problematic, such as in healthcare and public policy.

MCML Authors

Konstantin Heß

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Valentyn Melnychuk

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[55]

V. Melnychuk, D. Frauen, J. Schweisthal and S. Feuerriegel.
Orthogonal Representation Learning for Estimating Causal Quantities.
Preprint (Feb. 2025). arXiv

Abstract

Representation learning is widely used for estimating causal quantities (e.g., the conditional average treatment effect) from observational data. While existing representation learning methods have the benefit of allowing for end-to-end learning, they do not have favorable theoretical properties of Neyman-orthogonal learners, such as double robustness and quasi-oracle efficiency. Also, such representation learning methods often employ additional constraints, like balancing, which may even lead to inconsistent estimation. In this paper, we propose a novel class of Neyman-orthogonal learners for causal quantities defined at the representation level, which we call OR-learners. Our OR-learners have several practical advantages: they allow for consistent estimation of causal quantities based on any learned representation, while offering favorable theoretical properties including double robustness and quasi-oracle efficiency. In multiple experiments, we show that, under certain regularity conditions, our OR-learners improve existing representation learning methods and achieve state-of-the-art performance. To the best of our knowledge, our OR-learners are the first work to offer a unified framework of representation learning methods and Neyman-orthogonal learners for causal quantities estimation.

MCML Authors

Valentyn Melnychuk

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Jonas Schweisthal

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[54]

S. Feuerriegel, A. Maarouf, D. Bär, D. Geißler, J. Schweisthal, N. Pröllochs, C. E. Robertson, S. Rathje, J. Hartmann, S. M. Mohammad, O. Netzer, A. A. Siegel, B. Plank and J. J. Van Bavel.
Using natural language processing to analyse text data in behavioural science.
Nature Reviews Psychology (Jan. 2025). DOI

Abstract

Language is a uniquely human trait at the core of human interactions. The language people use often reflects their personality, intentions and state of mind. With the integration of the Internet and social media into everyday life, much of human communication is documented as written text. These online forms of communication (for example, blogs, reviews, social media posts and emails) provide a window into human behaviour and therefore present abundant research opportunities for behavioural science. In this Review, we describe how natural language processing (NLP) can be used to analyse text data in behavioural science. First, we review applications of text data in behavioural science. Second, we describe the NLP pipeline and explain the underlying modelling approaches (for example, dictionary-based approaches and large language models). We discuss the advantages and disadvantages of these methods for behavioural science, in particular with respect to the trade-off between interpretability and accuracy. Finally, we provide actionable recommendations for using NLP to ensure rigour and reproducibility.

MCML Authors

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

Abdurahman Maarouf

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Dominik Bär

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Dominique Geißler

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Jonas Schweisthal

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Barbara Plank

Prof. Dr.

B2 | Natural Language Processing

AI and Computational Linguistics

[53]

Y. Feng, S. Feuerriegel and Y. R. Shrestha.
Contextualizing Recommendation Explanations with LLMs: A User Study.
Preprint (Jan. 2025). arXiv

Abstract

Large language models (LLMs) are increasingly prevalent in recommender systems, where LLMs can be used to generate personalized recommendations. Here, we examine how different LLM-generated explanations for movie recommendations affect users’ perceptions of cognitive, affective, and utilitarian needs and consumption intentions. In a pre-registered, between-subject online experiment (N=759) and follow-up interviews (N=30), we compare (a) LLM-generated generic explanations, and (b) LLM-generated contextualized explanations. Our findings show that contextualized explanations (i.e., explanations that incorporate users’ past behaviors) effectively meet users’ cognitive needs while increasing users’ intentions to watch recommended movies. However, adding explanations offers limited benefits in meeting users’ utilitarian and affective needs, raising concerns about the proper design and implications of LLM-generated explanations. Qualitative insights from interviews reveal that referencing users’ past preferences enhances trust and understanding but can feel excessive if overused. Furthermore, users with more active and positive engagement with the recommender system and movie-watching get substantial gains from contextualized explanations. Overall, our research clarifies how LLM-generated recommendations influence users’ motivations and behaviors, providing valuable insights for the future development of user-centric recommender systems, a key element in social media platforms and online ecosystems.

MCML Authors

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

2024

[52]

Y. Ma, V. Melnychuk, J. Schweisthal and S. Feuerriegel.
DiffPO: A causal diffusion model for learning distributions of potential outcomes.
NeurIPS 2024 - 38th Conference on Neural Information Processing Systems. Vancouver, Canada, Dec 10-15, 2024. URL

Abstract

Predicting potential outcomes of interventions from observational data is crucial for decision-making in medicine, but the task is challenging due to the fundamental problem of causal inference. Existing methods are largely limited to point estimates of potential outcomes with no uncertainty quantification; thus, the full information about the distributions of potential outcomes is typically ignored. In this paper, we propose a novel causal diffusion model called DiffPO, which is carefully designed for reliable inferences in medicine by learning the distribution of potential outcomes. In our DiffPO, we leverage a tailored conditional denoising diffusion model to learn complex distributions, where we address the selection bias through a novel orthogonal diffusion loss. Another strength of our DiffPO method is that it is highly flexible (e.g., it can also be used to estimate different causal quantities such as CATE). Across a wide range of experiments, we show that our method achieves state-of-the-art performance.

MCML Authors

Yuchen Ma

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Valentyn Melnychuk

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Jonas Schweisthal

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[51]

V. Melnychuk, S. Feuerriegel and M. van der Schaar.
Quantifying Aleatoric Uncertainty of the Treatment Effect: A Novel Orthogonal Learner.
NeurIPS 2024 - 38th Conference on Neural Information Processing Systems. Vancouver, Canada, Dec 10-15, 2024. URL

Abstract

Estimating causal quantities from observational data is crucial for understanding the safety and effectiveness of medical treatments. However, to make reliable inferences, medical practitioners require not only estimating averaged causal quantities, such as the conditional average treatment effect, but also understanding the randomness of the treatment effect as a random variable. This randomness is referred to as aleatoric uncertainty and is necessary for understanding the probability of benefit from treatment or quantiles of the treatment effect. Yet, the aleatoric uncertainty of the treatment effect has received surprisingly little attention in the causal machine learning community. To fill this gap, we aim to quantify the aleatoric uncertainty of the treatment effect at the individualized (covariate-conditional) level, namely, the conditional distribution of the treatment effect (CDTE). Unlike average causal quantities, the CDTE is not point identifiable without strong additional assumptions. As a remedy, we employ partial identification to obtain sharp bounds on the CDTE and thereby quantify the aleatoric uncertainty of the treatment effect. We then develop a novel, orthogonal learner for the bounds on the CDTE, which we call AU-learner. We further show that our AU-learner has several strengths in that it satisfies Neyman-orthogonality and is doubly robust. Finally, we propose a fully-parametric deep learning instantiation of our AU-learner.

MCML Authors

Valentyn Melnychuk

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[50]

J. Senoner, S. Schallmoser, B. Kratzwald, S. Feuerriegel and T. Netland.
Explainable AI improves task performance in human–AI collaboration.
Scientific Reports 14.31150 (Dec. 2024). DOI

Abstract

Artificial intelligence (AI) provides considerable opportunities to assist human work. However, one crucial challenge of human-AI collaboration is that many AI algorithms operate in a black-box manner where the way the AI makes predictions remains opaque. This makes it difficult for humans to validate a prediction made by AI against their own domain knowledge. For this reason, we hypothesize that augmenting humans with explainable AI as a decision aid improves task performance in human-AI collaboration. To test this hypothesis, we analyze the effect of augmenting domain experts with explainable AI in the form of visual heatmaps. We then compare participants who were either supported by (a) black-box AI or (b) explainable AI, where the latter supports them to follow AI predictions when the AI is accurate or overrule the AI when the AI predictions are wrong. We conducted two preregistered experiments with representative, real-world visual inspection tasks from manufacturing and medicine. The first experiment was conducted with factory workers from an electronics factory, who performed N=9,600 assessments of whether electronic products have defects. The second experiment was conducted with radiologists, who performed N=5,650 assessments of chest X-ray images to identify lung lesions. The results of our experiments with domain experts performing real-world tasks show that task performance improves when participants are supported by explainable AI instead of black-box AI. For example, in the manufacturing setting, we find that augmenting participants with explainable AI (as opposed to black-box AI) leads to a five-fold decrease in the median error rate of human decisions, which gives a significant improvement in task performance.

MCML Authors

Simon Schallmoser

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[49]

A. Bashardoust, S. Feuerriegel and Y. R. Shrestha.
Comparing the Willingness to Share for Human-generated vs. AI-generated Fake News.
CSCW 2024 - 27th ACM SIGCHI Conference on Computer-Supported Cooperative Work and Social Computing. San José, Costa Rica, Nov 09-13, 2024. DOI

Abstract

Generative artificial intelligence (AI) presents large risks for society when it is used to create fake news. A crucial factor for fake news to go viral on social media is that users share such content. Here, we aim to shed light on the sharing behavior of users across human-generated vs. AI-generated fake news. Specifically, we study: (1) What is the perceived veracity of human-generated fake news vs. AI-generated fake news? (2) What is the user’s willingness to share human-generated fake news vs. AI-generated fake news on social media? (3) What socio-economic characteristics let users fall for AI-generated fake news? To this end, we conducted a pre-registered, online experiment with N=988 subjects and 20 fake news items from the COVID-19 pandemic generated by GPT-4 vs. humans. Our findings show that AI-generated fake news is perceived as less accurate than human-generated fake news, but both tend to be shared equally. Further, several socio-economic factors explain who falls for AI-generated fake news.

MCML Authors

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[48]

D. Geißler and S. Feuerriegel.
Analyzing the Strategy of Propaganda using Inverse Reinforcement Learning: Evidence from the 2022 Russian Invasion of Ukraine.
CSCW 2024 - 27th ACM SIGCHI Conference on Computer-Supported Cooperative Work and Social Computing. San José, Costa Rica, Nov 09-13, 2024. DOI

Abstract

The 2022 Russian invasion of Ukraine was accompanied by a large-scale, pro-Russian propaganda campaign on social media. However, the strategy behind the dissemination of propaganda has remained unclear, particularly how the online discourse was strategically shaped by the propagandists’ community. Here, we analyze the strategy of the Twitter community using an inverse reinforcement learning (IRL) approach. Specifically, IRL allows us to model online behavior as a Markov decision process, where the goal is to infer the underlying reward structure that guides propagandists when interacting with users with a supporting or opposing stance toward the invasion. Thereby, we aim to understand empirically whether and how between-user interactions are strategically used to promote the proliferation of Russian propaganda. For this, we leverage a large-scale dataset with 349,455 posts with pro-Russian propaganda from 132,131 users. We show that bots and humans follow a different strategy: bots respond predominantly to pro-invasion messages, suggesting that they seek to drive virality; while messages indicating opposition primarily elicit responses from humans, suggesting that they tend to engage in critical discussions. To the best of our knowledge, this is the first study analyzing the strategy behind propaganda from the 2022 Russian invasion of Ukraine through the lens of IRL.

MCML Authors

Dominique Geißler

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[47]

A. Maarouf, N. Pröllochs and S. Feuerriegel.
The Virality of Hate Speech on Social Media.
CSCW 2024 - 27th ACM SIGCHI Conference on Computer-Supported Cooperative Work and Social Computing. San José, Costa Rica, Nov 09-13, 2024. DOI

Abstract

Online hate speech is responsible for violent attacks such as the Pittsburgh synagogue shooting in 2018, thereby posing a significant threat to vulnerable groups and society in general. However, little is known about what makes hate speech on social media go viral. In this paper, we collect N = 25,219 cascades with 65,946 retweets from X (formerly known as Twitter) and classify them as hateful vs. normal. Using a generalized linear regression, we then estimate differences in the spread of hateful vs. normal content based on author and content variables. We thereby identify important determinants that explain differences in the spreading of hateful vs. normal content. For example, hateful content authored by verified users is disproportionately more likely to go viral than hateful content from non-verified ones: hateful content from a verified user (as opposed to normal content) has a 3.5 times larger cascade size, a 3.2 times longer cascade lifetime, and a 1.2 times larger structural virality. Altogether, we offer novel insights into the virality of hate speech on social media.

MCML Authors

Abdurahman Maarouf

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[46]

D. Bär, A. Maarouf and S. Feuerriegel.
Generative AI may backfire for counterspeech.
Preprint (Nov. 2024). arXiv

Abstract

Online hate speech poses a serious threat to individual well-being and societal cohesion. A promising solution to curb online hate speech is counterspeech. Counterspeech is aimed at encouraging users to reconsider hateful posts by direct replies. However, current methods lack scalability due to the need for human intervention or fail to adapt to the specific context of the post. A potential remedy is the use of generative AI, specifically large language models (LLMs), to write tailored counterspeech messages. In this paper, we analyze whether contextualized counterspeech generated by state-of-the-art LLMs is effective in curbing online hate speech. To do so, we conducted a large-scale, pre-registered field experiment (N=2,664) on the social media platform Twitter/X. Our experiment followed a 2x2 between-subjects design and, additionally, a control condition with no counterspeech. On the one hand, users posting hateful content on Twitter/X were randomly assigned to receive either (a) contextualized counterspeech or (b) non-contextualized counterspeech. Here, the former is generated through LLMs, while the latter relies on predefined, generic messages. On the other hand, we tested two counterspeech strategies: (a) promoting empathy and (b) warning about the consequences of online misbehavior. We then measured whether users deleted their initial hateful posts and whether their behavior changed after the counterspeech intervention (e.g., whether users adopted a less toxic language). We find that non-contextualized counterspeech employing a warning-of-consequence strategy significantly reduces online hate speech. However, contextualized counterspeech generated by LLMs proves ineffective and may even backfire.

MCML Authors

Dominik Bär

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Abdurahman Maarouf

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[45]

P. Janetzky, T. Schlagenhauf and S. Feuerriegel.
Slowing Down Forgetting in Continual Learning.
Preprint (Nov. 2024). arXiv

Abstract

A common challenge in continual learning (CL) is catastrophic forgetting, where the performance on old tasks drops after new, additional tasks are learned. In this paper, we propose a novel framework called ReCL to slow down forgetting in CL. Our framework exploits an implicit bias of gradient-based neural networks due to which these converge to margin maximization points. Such convergence points allow us to reconstruct old data from previous tasks, which we then combine with the current training data. Our framework is flexible and can be applied on top of existing, state-of-the-art CL methods to slow down forgetting. We further demonstrate the performance gain from our framework across a large series of experiments, including different CL scenarios (class incremental, domain incremental, task incremental learning), different datasets (MNIST, CIFAR10), and different network architectures. Across all experiments, we find large performance gains through ReCL. To the best of our knowledge, our framework is the first to address catastrophic forgetting by leveraging models in CL as their own memory buffers.

MCML Authors

Pascal Janetzky

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[44]

Y. Ozyurt, S. Feuerriegel and M. Sachan.
Automated Knowledge Concept Annotation and Question Representation Learning for Knowledge Tracing.
Preprint (Oct. 2024). arXiv

Abstract

Knowledge tracing (KT) is a popular approach for modeling students’ learning progress over time, which can enable more personalized and adaptive learning. However, existing KT approaches face two major limitations: (1) they rely heavily on expert-defined knowledge concepts (KCs) in questions, which is time-consuming and prone to errors; and (2) KT methods tend to overlook the semantics of both questions and the given KCs. In this work, we address these challenges and present KCQRL, a framework for automated knowledge concept annotation and question representation learning that can improve the effectiveness of any existing KT model. First, we propose an automated KC annotation process using large language models (LLMs), which generates question solutions and then annotates KCs in each solution step of the questions. Second, we introduce a contrastive learning approach to generate semantically rich embeddings for questions and solution steps, aligning them with their associated KCs via a tailored false negative elimination approach. These embeddings can be readily integrated into existing KT models, replacing their randomly initialized embeddings. We demonstrate the effectiveness of KCQRL across 15 KT algorithms on two large real-world Math learning datasets, where we achieve consistent performance improvements.

MCML Authors

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[43]

Y. Ozyurt, S. Feuerriegel and C. Zhang.
Document-Level In-Context Few-Shot Relation Extraction via Pre-Trained Language Models.
Preprint (Oct. 2024). arXiv

Abstract

Document-level relation extraction aims at inferring structured human knowledge from textual documents. State-of-the-art methods for this task use pre-trained language models (LMs) via fine-tuning, yet fine-tuning is computationally expensive and cannot adapt to new relation types or new LMs. As a remedy, we leverage the generalization capabilities of pre-trained LMs and present a novel framework for document-level in-context few-shot relation extraction. Our framework has three strengths: it eliminates the need (1) for named entity recognition and (2) for human annotations of documents, and (3) it can be updated to new LMs without re-training. We evaluate our framework using DocRED, the largest publicly available dataset for document-level relation extraction, and demonstrate that our framework achieves state-of-the-art performance. We further show that our framework actually performs much better than the original labels from the development set of DocRED. Finally, we conduct an extensive benchmark demonstrating the effectiveness of our framework, achieving state-of-the-art results across six relation extraction datasets and outperforming more than 30 baseline methods. Unlike our framework, the baseline methods have large computational overhead (e.g., from fine-tuning). To the best of our knowledge, we are the first to reformulate the document-level relation extraction task as a tailored in-context few-shot learning paradigm.

MCML Authors

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[42]

D. Tschernutter, M. Kraus and S. Feuerriegel.
A Globally Convergent Algorithm for Neural Network Parameter Optimization Based on Difference-of-Convex Functions.
Transactions on Machine Learning Research (Sep. 2024). URL

Abstract

We propose an algorithm for optimizing the parameters of single hidden layer neural networks. Specifically, we derive a blockwise difference-of-convex (DC) functions representation of the objective function. Based on the latter, we propose a block coordinate descent (BCD) approach that we combine with a tailored difference-of-convex functions algorithm (DCA). We prove global convergence of the proposed algorithm. Furthermore, we mathematically analyze the convergence rate of parameters and the convergence rate in value (i.e., the training loss). We give conditions under which our algorithm converges linearly or even faster depending on the local shape of the loss function. We confirm our theoretical derivations numerically and compare our algorithm against state-of-the-art gradient-based solvers in terms of both training loss and test loss.

MCML Authors

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[41]

A. Bashardoust, Y. Feng, D. Geißler, S. Feuerriegel and Y. R. Shrestha.
The Effect of Education in Prompt Engineering: Evidence from Journalists.
Preprint (Sep. 2024). arXiv

Abstract

Large language models (LLMs) are increasingly used in daily work. In this paper, we analyze whether training in prompt engineering can improve the interactions of users with LLMs. For this, we conducted a field experiment where we asked journalists to write short texts before and after training in prompt engineering. We then analyzed the effect of training on three dimensions: (1) the user experience of journalists when interacting with LLMs, (2) the accuracy of the texts (assessed by a domain expert), and (3) the reader perception, such as clarity, engagement, and other text quality dimensions (assessed by non-expert readers). Our results show: (1) Our training improved the perceived expertise of journalists but also decreased the perceived helpfulness of LLM use. (2) The effect on accuracy varied by the difficulty of the task. (3) There is a mixed impact of training on reader perception across different text quality dimensions.

MCML Authors

Dominique Geißler

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[40]

M. Kuzmanovic, D. Frauen, T. Hatt and S. Feuerriegel.
Causal Machine Learning for Cost-Effective Allocation of Development Aid.
KDD 2024 - 30th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Barcelona, Spain, Aug 25-29, 2024. DOI

Abstract

The Sustainable Development Goals (SDGs) of the United Nations provide a blueprint of a better future by ‘leaving no one behind’, and, to achieve the SDGs by 2030, poor countries require immense volumes of development aid. In this paper, we develop a causal machine learning framework for predicting heterogeneous treatment effects of aid disbursements to inform effective aid allocation. Specifically, our framework comprises three components: (i) a balancing autoencoder that uses representation learning to embed high-dimensional country characteristics while addressing treatment selection bias; (ii) a counterfactual generator to compute counterfactual outcomes for varying aid volumes to address small sample-size settings; and (iii) an inference model that is used to predict heterogeneous treatment-response curves. We demonstrate the effectiveness of our framework using data with official development aid earmarked to end HIV/AIDS in 105 countries, amounting to more than USD 5.2 billion. For this, we first show that our framework successfully computes heterogeneous treatment-response curves using semi-synthetic data. Then, we demonstrate our framework using real-world HIV data. Our framework points to large opportunities for a more effective aid allocation, suggesting that the total number of new HIV infections could be reduced by up to 3.3% (~50,000 cases) compared to the current allocation practice.

MCML Authors

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[39]

A. Maarouf, D. Bär, D. Geißler and S. Feuerriegel.
HQP: A human-annotated dataset for detecting online propaganda.
ACL 2024 - Findings of the 62nd Annual Meeting of the Association for Computational Linguistics. Bangkok, Thailand, Aug 11-16, 2024. DOI

Abstract

Online propaganda poses a severe threat to the integrity of societies. However, existing datasets for detecting online propaganda have a key limitation: they were annotated using weak labels that can be noisy and even incorrect. To address this limitation, our work makes the following contributions: (1) We present HQP: a novel dataset (N=30000) for detecting online propaganda with high-quality labels. To the best of our knowledge, HQP is the first large-scale dataset for detecting online propaganda that was created through human annotation. (2) We show empirically that state-of-the-art language models fail in detecting online propaganda when trained with weak labels (AUC: 64.03). In contrast, state-of-the-art language models can accurately detect online propaganda when trained with our high-quality labels (AUC: 92.25), which is an improvement of 44%. (3) We show that prompt-based learning using a small sample of high-quality labels can still achieve a reasonable performance (AUC: 80.27) while significantly reducing the cost of labeling. (4) We extend HQP to HQP+ to test how well propaganda across different contexts can be detected. Crucially, our work highlights the importance of high-quality labels for sensitive NLP tasks such as propaganda detection.

MCML Authors

Abdurahman Maarouf

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Dominik Bär

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Dominique Geißler

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[38]

D. Frauen, V. Melnychuk and S. Feuerriegel.
Fair Off-Policy Learning from Observational Data.
ICML 2024 - 41st International Conference on Machine Learning. Vienna, Austria, Jul 21-27, 2024. URL

Abstract

Algorithmic decision-making in practice must be fair for legal, ethical, and societal reasons. To achieve this, prior research has contributed various approaches that ensure fairness in machine learning predictions, while comparatively little effort has focused on fairness in decision-making, specifically off-policy learning. In this paper, we propose a novel framework for fair off-policy learning: we learn decision rules from observational data under different notions of fairness, where we explicitly assume that observational data were collected under a different – potentially discriminatory – behavioral policy. Importantly, our framework applies to different fairness notions for off-policy learning, where fairness is formalized based on actions or policy values. As our main contribution, we propose a neural network-based framework to learn optimal policies under different fairness notions. We further provide theoretical guarantees in the form of generalization bounds for the finite-sample version of our framework. We demonstrate the effectiveness of our framework through extensive numerical experiments using both simulated and real-world data. Altogether, our work enables algorithmic decision-making in a wide array of practical applications where fairness must be ensured.

MCML Authors

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Valentyn Melnychuk

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[37]

J. Schweisthal, D. Frauen, M. van der Schaar and S. Feuerriegel.
Meta-Learners for Partially-Identified Treatment Effects Across Multiple Environments.
ICML 2024 - 41st International Conference on Machine Learning. Vienna, Austria, Jul 21-27, 2024. URL

Abstract

Estimating the conditional average treatment effect (CATE) from observational data is relevant for many applications such as personalized medicine. Here, we focus on the widespread setting where the observational data come from multiple environments, such as different hospitals, physicians, or countries. Furthermore, we allow for violations of standard causal assumptions, namely, overlap within the environments and unconfoundedness. To this end, we move away from point identification and focus on partial identification. Specifically, we show that current assumptions from the literature on multiple environments allow us to interpret the environment as an instrumental variable (IV). This allows us to adapt bounds from the IV literature for partial identification of CATE by leveraging treatment assignment mechanisms across environments. Then, we propose different model-agnostic learners (so-called meta-learners) to estimate the bounds that can be used in combination with arbitrary machine learning models. We further demonstrate the effectiveness of our meta-learners across various experiments using both simulated and real-world data. Finally, we discuss the applicability of our meta-learners to partial identification in instrumental variable settings, such as randomized controlled trials with non-compliance.

MCML Authors

Jonas Schweisthal

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[36]

M. Schröder, D. Frauen, J. Schweisthal, K. Heß, V. Melnychuk and S. Feuerriegel.
Conformal Prediction for Causal Effects of Continuous Treatments.
Preprint (Jul. 2024). arXiv

Abstract

Uncertainty quantification of causal effects is crucial for safety-critical applications such as personalized medicine. A powerful approach for this is conformal prediction, which has several practical benefits due to model-agnostic finite-sample guarantees. Yet, existing methods for conformal prediction of causal effects are limited to binary/discrete treatments and make highly restrictive assumptions such as known propensity scores. In this work, we provide a novel conformal prediction method for potential outcomes of continuous treatments. We account for the additional uncertainty introduced through propensity estimation so that our conformal prediction intervals are valid even if the propensity score is unknown. Our contributions are three-fold: (1) We derive finite-sample prediction intervals for potential outcomes of continuous treatments. (2) We provide an algorithm for calculating the derived intervals. (3) We demonstrate the effectiveness of the conformal prediction intervals in experiments on synthetic and real-world datasets. To the best of our knowledge, we are the first to propose conformal prediction for continuous treatments when the propensity score is unknown and must be estimated from data.

MCML Authors

Maresa Schröder

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Jonas Schweisthal

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Konstantin Heß

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Valentyn Melnychuk

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[35]

C. F. Naumzik, A. Kongsted, W. Vach and S. Feuerriegel.
Data-driven subgrouping of patient trajectories with chronic diseases: Evidence from low back pain.
CHIL 2024 - 5th AHLI Conference on Health, Inference, and Learning. New York City, NY, USA, Jun 27-28, 2024. URL

Abstract

Clinical data informs the personalization of health care with a potential for more effective disease management. In practice, this is achieved by subgrouping, whereby clusters with similar patient characteristics are identified and then receive customized treatment plans with the goal of targeting subgroup-specific disease dynamics. In this paper, we propose a novel mixture hidden Markov model for subgrouping patient trajectories from chronic diseases. Our model is probabilistic and carefully designed to capture different trajectory phases of chronic diseases (i.e., “severe”, “moderate”, and “mild”) through tailored latent states. We demonstrate our subgrouping framework based on a longitudinal study across 847 patients with non-specific low back pain. Here, our subgrouping framework identifies 8 subgroups. Further, we show that our subgrouping framework outperforms common baselines in terms of cluster validity indices. Finally, we discuss the applicability of the model to other chronic and long-lasting diseases.

MCML Authors

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[34]

D. Bär, F. Pierri, G. De Francisci Morales and S. Feuerriegel.
Systematic discrepancies in the delivery of political ads on Facebook and Instagram.
PNAS Nexus.pgae247 (Jun. 2024). DOI

Abstract

Political advertising on social media has become a central element in election campaigns. However, granular information about political advertising on social media was previously unavailable, thus raising concerns regarding fairness, accountability, and transparency in the electoral process. In this article, we analyze targeted political advertising on social media via a unique, large-scale dataset of over 80,000 political ads from Meta during the 2021 German federal election, with more than a billion impressions. For each political ad, our dataset records granular information about targeting strategies, spending, and actual impressions. We then study (i) the prevalence of targeted ads across the political spectrum; (ii) the discrepancies between targeted and actual audiences due to algorithmic ad delivery; and (iii) which targeting strategies on social media attain a wide reach at low cost. We find that targeted ads are prevalent across the entire political spectrum. Moreover, there are considerable discrepancies between targeted and actual audiences, and systematic differences in the reach of political ads (in impressions-per-EUR) among parties, where the algorithm favors ads from populists over others.

MCML Authors

Dominik Bär

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[33]

B. Felderer, L. Repke, W. Weber, J. Schweisthal and L. Bothmann.
Predicting the Validity and Reliability of Survey Questions.
Preprint (Jun. 2024). DOI

Abstract

The Survey Quality Predictor (SQP) is an open-access system to predict the quality, i.e., the reliability and validity, of survey questions based on the characteristics of the questions. The prediction is based on a meta-regression of many multitrait-multimethod (MTMM) experiments in which characteristics of the survey questions were systematically varied. The release of SQP 3.0, which is based on an expanded database compared to previous SQP versions, raised the need for a new meta-regression. To find the best method for analyzing the complex data structure of SQP (e.g., the existence of various uncorrelated predictors), we compared four suitable machine learning methods in terms of their ability to predict both survey quality indicators: LASSO, elastic net, boosting and random forest. The article discusses the performance of the models and illustrates the importance of the individual item characteristics in the random forest model, which was chosen for SQP 3.0.

MCML Authors

Jonas Schweisthal

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Ludwig Bothmann

Dr.

A1 | Statistical Foundations & Explainability
→ Group Bernd Bischl

Statistical Learning and Data Science

[32]

D. Frauen, F. Imrie, A. Curth, V. Melnychuk, S. Feuerriegel and M. van der Schaar.
A Neural Framework for Generalized Causal Sensitivity Analysis.
ICLR 2024 - 12th International Conference on Learning Representations. Vienna, Austria, May 07-11, 2024. URL

Abstract

Unobserved confounding is common in many applications, making causal inference from observational data challenging. As a remedy, causal sensitivity analysis is an important tool to draw causal conclusions under unobserved confounding with mathematical guarantees. In this paper, we propose NeuralCSA, a neural framework for generalized causal sensitivity analysis. Unlike previous work, our framework is compatible with (i) a large class of sensitivity models, including the marginal sensitivity model, f-sensitivity models, and Rosenbaum’s sensitivity model; (ii) different treatment types (i.e., binary and continuous); and (iii) different causal queries, including (conditional) average treatment effects and simultaneous effects on multiple outcomes. This generality is achieved by learning a latent distribution shift that corresponds to a treatment intervention using two conditional normalizing flows. We provide theoretical guarantees that NeuralCSA is able to infer valid bounds on the causal query of interest and also demonstrate this empirically using both simulated and real-world data.

MCML Authors

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Valentyn Melnychuk

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[31]

K. Heß, V. Melnychuk, D. Frauen and S. Feuerriegel.
Bayesian Neural Controlled Differential Equations for Treatment Effect Estimation.
ICLR 2024 - 12th International Conference on Learning Representations. Vienna, Austria, May 07-11, 2024. URL

Abstract

Treatment effect estimation in continuous time is crucial for personalized medicine. However, existing methods for this task are limited to point estimates of the potential outcomes, whereas uncertainty estimates have been ignored. Needless to say, uncertainty quantification is crucial for reliable decision-making in medical applications. To fill this gap, we propose a novel Bayesian neural controlled differential equation (BNCDE) for treatment effect estimation in continuous time. In our BNCDE, the time dimension is modeled through a coupled system of neural controlled differential equations and neural stochastic differential equations, where the neural stochastic differential equations allow for tractable variational Bayesian inference. Thereby, for an assigned sequence of treatments, our BNCDE provides meaningful posterior predictive distributions of the potential outcomes. To the best of our knowledge, ours is the first tailored neural method to provide uncertainty estimates of treatment effects in continuous time. As such, our method is of direct practical value for promoting reliable decision-making in medicine.

MCML Authors

Konstantin Heß

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Valentyn Melnychuk

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[30]

V. Melnychuk, D. Frauen and S. Feuerriegel.
Bounds on Representation-Induced Confounding Bias for Treatment Effect Estimation.
ICLR 2024 - 12th International Conference on Learning Representations. Vienna, Austria, May 07-11, 2024. URL

Abstract

State-of-the-art methods for conditional average treatment effect (CATE) estimation make widespread use of representation learning. Here, the idea is to reduce the variance of the low-sample CATE estimation by a (potentially constrained) low-dimensional representation. However, low-dimensional representations can lose information about the observed confounders and thus lead to bias, because of which the validity of representation learning for CATE estimation is typically violated. In this paper, we propose a new, representation-agnostic refutation framework for estimating bounds on the representation-induced confounding bias that comes from dimensionality reduction (or other constraints on the representations) in CATE estimation. First, we establish theoretically under which conditions CATE is non-identifiable given low-dimensional (constrained) representations. Second, as our remedy, we propose a neural refutation framework which performs partial identification of CATE or, equivalently, aims at estimating lower and upper bounds of the representation-induced confounding bias. We demonstrate the effectiveness of our bounds in a series of experiments. In sum, our refutation framework is of direct relevance in practice where the validity of CATE estimation is of importance.

MCML Authors

Valentyn Melnychuk

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[29]

M. Schröder, D. Frauen and S. Feuerriegel.
Causal Fairness under Unobserved Confounding: A Neural Sensitivity Framework.
ICLR 2024 - 12th International Conference on Learning Representations. Vienna, Austria, May 07-11, 2024. URL

Abstract

Fairness of machine learning predictions is widely required in practice for legal, ethical, and societal reasons. Existing work typically focuses on settings without unobserved confounding, even though unobserved confounding can lead to severe violations of causal fairness and, thus, unfair predictions. In this work, we analyze the sensitivity of causal fairness to unobserved confounding. Our contributions are three-fold. First, we derive bounds for causal fairness metrics under different sources of unobserved confounding. This enables practitioners to examine the sensitivity of their machine learning models to unobserved confounding in fairness-critical applications. Second, we propose a novel neural framework for learning fair predictions, which allows us to offer worst-case guarantees of the extent to which causal fairness can be violated due to unobserved confounding. Third, we demonstrate the effectiveness of our framework in a series of experiments, including a real-world case study about predicting prison sentences. To the best of our knowledge, ours is the first work to study causal fairness under unobserved confounding. To this end, our work is of direct practical value as a refutation strategy to ensure the fairness of predictions in high-stakes applications.

MCML Authors

Maresa Schröder

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[28]

K. Heß, D. Frauen, V. Melnychuk and S. Feuerriegel.
G-Transformer for Conditional Average Potential Outcome Estimation over Time.
Preprint (May 2024). arXiv

Abstract

Estimating potential outcomes for treatments over time based on observational data is important for personalized decision-making in medicine. Yet, existing neural methods for this task either (1) do not perform proper adjustments for time-varying confounders, or (2) suffer from large estimation variance. In order to address both limitations, we introduce the G-transformer (GT). Our GT is a novel, neural end-to-end model which adjusts for time-varying confounders, and provides low-variance estimation of conditional average potential outcomes (CAPOs) over time. Specifically, our GT is the first neural model to perform regression-based iterative G-computation for CAPOs in the time-varying setting. We evaluate the effectiveness of our GT across various experiments. In sum, this work represents a significant step towards personalized decision-making from electronic health records.

MCML Authors

Konstantin Heß

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Valentyn Melnychuk

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[27]

S. Feuerriegel, D. Frauen, V. Melnychuk, J. Schweisthal, K. Heß, A. Curth, S. Bauer, N. Kilbertus, I. S. Kohane and M. van der Schaar.
Causal machine learning for predicting treatment outcomes.
Nature Medicine 30 (Apr. 2024). DOI

Abstract

Causal machine learning (ML) offers flexible, data-driven methods for predicting treatment outcomes including efficacy and toxicity, thereby supporting the assessment and safety of drugs. A key benefit of causal ML is that it allows for estimating individualized treatment effects, so that clinical decision-making can be personalized to individual patient profiles. Causal ML can be used in combination with both clinical trial data and real-world data, such as clinical registries and electronic health records, but caution is needed to avoid biased or incorrect predictions. In this Perspective, we discuss the benefits of causal ML (relative to traditional statistical or ML approaches) and outline the key components and steps. Finally, we provide recommendations for the reliable use of causal ML and effective translation into the clinic.

MCML Authors

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Valentyn Melnychuk

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Jonas Schweisthal

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Konstantin Heß

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Niki Kilbertus

Prof. Dr.

A3 | Computational Models

Ethics in Systems Design and Machine Learning

[26]

M. Maritsch, S. Föll, V. Lehmann, N. Styger, C. Bérubé, M. Kraus, S. Feuerriegel, T. Kowatsch, T. Züger, E. Fleisch, F. Wortmann and C. Stettler.
Smartwatches for non-invasive hypoglycaemia detection during cognitive and psychomotor stress.
Diabetes, Obesity and Metabolism 26.3 (Mar. 2024). DOI

Abstract

Hypoglycaemia is one of the most relevant complications of diabetes and induces alterations in physiological parameters that can be measured with smartwatches and detected using machine learning (ML). The performance of these algorithms when applied to different hypoglycaemic ranges or in situations involving cognitive and psychomotor stress remains unclear. Demanding tasks can significantly affect the physiological responses on which the wearable-based hypoglycaemia detection relies. The present analysis aimed to investigate ML-based hypoglycaemia detection using wearable data at different levels of hypoglycaemia during a complex task involving cognitive and psychomotor challenges (driving).

MCML Authors

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[25]

S. Feuerriegel, J. Hartmann, C. Janiesch and P. Zschech.
Generative AI.
Business and Information Systems Engineering 66.1 (Feb. 2024). DOI

Abstract

In this Catchword article, we provide a conceptualization of generative AI as an entity in socio-technical systems and provide examples of models, systems, and applications. Based on that, we introduce limitations of current generative AI and provide an agenda for BISE research. Previous papers discuss generative AI around specific methods such as language models (e.g., Teubner et al. 2023; Dwivedi et al. 2023; Schöbel et al. 2023) or specific applications such as marketing (e.g., Peres et al. 2023), innovation management (Burger et al. 2023), scholarly research (e.g., Susarla et al. 2023; Davison et al. 2023), and education (e.g., Kasneci et al. 2023; Gimpel et al. 2023). Different from these works, we focus on generative AI in the context of information systems, and, to this end, we discuss several opportunities and challenges that are unique to the BISE community and make suggestions for impactful directions for BISE research.

MCML Authors

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[24]

V. Lehmann, T. Zueger, M. Maritsch, M. Notter, S. Schallmoser, C. Bérubé, C. Albrecht, M. Kraus, S. Feuerriegel, E. Fleisch, T. Kowatsch, S. Lagger, M. Laimer, F. Wortmann and C. Stettler.
Machine Learning to Infer a Health State Using Biomedical Signals - Detection of Hypoglycemia in People with Diabetes while Driving Real Cars.
NEJM AI (Jan. 2024). DOI

Abstract

BACKGROUND: Hypoglycemia, one of the most dangerous acute complications of diabetes, poses a substantial risk for vehicle accidents. To date, both reliable detection and warning of hypoglycemia while driving remain unmet needs, as current sensing approaches are restricted by diagnostic delay, invasiveness, low availability, and high costs. This research aimed to develop and evaluate a machine learning (ML) approach for the detection of hypoglycemia during driving through data collected on driving characteristics and gaze/head motion.
METHODS: We collected driving and gaze/head motion data (47,998 observations) during controlled euglycemia and hypoglycemia from 30 individuals with type 1 diabetes (24 male participants; mean ±SD age, 40.1±10.3 years; mean glycated hemoglobin value, 6.9±0.7% [51.9±8.0 mmol/mol]) while participants drove a real car. ML models were built and evaluated to detect hypoglycemia solely on the basis of data regarding driving characteristics and gaze/head motion.
RESULTS: The ML approach detected hypoglycemia with high accuracy (area under the receiver-operating characteristic curve [AUROC], 0.80±0.11). When restricted to either driving characteristics or gaze/head motion data only, the detection performance remained high (AUROC, 0.73±0.07 and 0.70±0.16, respectively).
CONCLUSIONS: Hypoglycemia could be detected noninvasively during real car driving with an ML approach that used only data on driving characteristics and gaze/head motion, thus improving driving safety and self-management for people with diabetes. Interpretable ML also provided novel insights into behavioral changes in people driving while hypoglycemic. (Funded by the Swiss National Science Foundation and others; ClinicalTrials.gov numbers, NCT04569630 and NCT05308095.)
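For readers unfamiliar with the metric reported in the results, the AUROC can be read as the probability that a randomly chosen positive observation (here, hypoglycemia) receives a higher model score than a randomly chosen negative one (euglycemia). The following is a minimal sketch of that definition only. The function name `auroc` and the toy scores are illustrative assumptions and are not taken from the study.

```python
def auroc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical model scores; label 1 = hypoglycemia, 0 = euglycemia.
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auroc(toy_scores, toy_labels)
```

The pairwise loop is quadratic in the number of observations, which is fine for illustration; rank-based formulas give the same value more efficiently on large datasets.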

MCML Authors

Simon Schallmoser

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

2023

[23]

M. von Zahn, O. Hinz and S. Feuerriegel.
Locating disparities in machine learning.
IEEE BigData 2023 - IEEE International Conference on Big Data. Sorrento, Italy, Dec 15-18, 2023. DOI

Abstract

Machine learning can provide predictions with disparate outcomes, in which subgroups of the population (e.g., defined by age, gender, or other sensitive attributes) are systematically disadvantaged. In order to comply with upcoming legislation, practitioners need to locate such disparate outcomes. However, previous literature typically detects disparities through statistical procedures for when the sensitive attribute is specified a priori. This limits applicability in real-world settings where datasets are high dimensional and, on top of that, sensitive attributes may be unknown. As a remedy, we propose a data-driven framework called Automatic Location of Disparities (ALD) which aims at locating disparities in machine learning. ALD meets several demands from industry: ALD (1) is applicable to arbitrary machine learning classifiers; (2) operates on different definitions of disparities (e.g., statistical parity or equalized odds); (3) deals with both categorical and continuous predictors even if disparities arise from complex and multi-way interactions known as intersectionality (e.g., age above 60 and female). ALD produces interpretable audit reports as output. We demonstrate the effectiveness of ALD based on both synthetic and real-world datasets. As a result, we empower practitioners to effectively locate and mitigate disparities in machine learning algorithms, conduct algorithmic audits, and protect individuals from discrimination.
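To make one of the disparity definitions named in the abstract concrete, the sketch below computes a statistical parity difference for an intersectional subgroup (age above 60 and female). It is not the ALD framework itself, only an illustration of the quantity such an audit would look at. The function name, the records, and the predictions are hypothetical.

```python
def statistical_parity_difference(predictions, in_subgroup):
    """Positive-prediction rate inside the subgroup minus the rate outside it."""
    inside = [p for p, g in zip(predictions, in_subgroup) if g]
    outside = [p for p, g in zip(predictions, in_subgroup) if not g]
    if not inside or not outside:
        raise ValueError("subgroup and its complement must both be non-empty")
    return sum(inside) / len(inside) - sum(outside) / len(outside)


# Hypothetical records: (age, gender, binary prediction of some classifier).
records = [
    (67, "female", 0), (72, "female", 0), (64, "female", 1), (61, "female", 0),
    (45, "female", 1), (70, "male", 1), (38, "male", 1), (52, "male", 0),
]
preds = [r[2] for r in records]
# Intersectional subgroup from the abstract's example: age above 60 and female.
subgroup = [r[0] > 60 and r[1] == "female" for r in records]
gap = statistical_parity_difference(preds, subgroup)
```

A value of zero would indicate statistical parity for this subgroup; in the toy data the subgroup receives positive predictions less often than everyone else, so the gap is negative.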

MCML Authors

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[22]

D. Frauen, V. Melnychuk and S. Feuerriegel.
Sharp Bounds for Generalized Causal Sensitivity Analysis.
NeurIPS 2023 - 37th Conference on Neural Information Processing Systems. New Orleans, LA, USA, Dec 10-16, 2023. URL

Abstract

Causal inference from observational data is crucial for many disciplines such as medicine and economics. However, sharp bounds for causal effects under relaxations of the unconfoundedness assumption (causal sensitivity analysis) are subject to ongoing research. So far, works with sharp bounds are restricted to fairly simple settings (e.g., a single binary treatment). In this paper, we propose a unified framework for causal sensitivity analysis under unobserved confounding in various settings. For this, we propose a flexible generalization of the marginal sensitivity model (MSM) and then derive sharp bounds for a large class of causal effects. This includes (conditional) average treatment effects, effects for mediation analysis and path analysis, and distributional effects. Furthermore, our sensitivity model is applicable to discrete, continuous, and time-varying treatments. It allows us to interpret the partial identification problem under unobserved confounding as a distribution shift in the latent confounders while evaluating the causal effect of interest. In the special case of a single binary treatment, our bounds for (conditional) average treatment effects coincide with recent optimality results for causal sensitivity analysis. Finally, we propose a scalable algorithm to estimate our sharp bounds from observational data.

MCML Authors

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Valentyn Melnychuk

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[21]

V. Melnychuk, D. Frauen and S. Feuerriegel.
Partial Counterfactual Identification of Continuous Outcomes with a Curvature Sensitivity Model.
NeurIPS 2023 - 37th Conference on Neural Information Processing Systems. New Orleans, LA, USA, Dec 10-16, 2023. URL

Abstract

Counterfactual inference aims to answer retrospective ‘what if’ questions and thus belongs to the most fine-grained type of inference in Pearl’s causality ladder. Existing methods for counterfactual inference with continuous outcomes aim at point identification and thus make strong and unnatural assumptions about the underlying structural causal model. In this paper, we relax these assumptions and aim at partial counterfactual identification of continuous outcomes, i.e., when the counterfactual query resides in an ignorance interval with informative bounds. We prove that, in general, the ignorance interval of the counterfactual queries has non-informative bounds, already when functions of structural causal models are continuously differentiable. As a remedy, we propose a novel sensitivity model called Curvature Sensitivity Model. This allows us to obtain informative bounds by bounding the curvature of level sets of the functions. We further show that existing point counterfactual identification methods are special cases of our Curvature Sensitivity Model when the bound of the curvature is set to zero. We then propose an implementation of our Curvature Sensitivity Model in the form of a novel deep generative model, which we call Augmented Pseudo-Invertible Decoder. Our implementation employs (i) residual normalizing flows with (ii) variational augmentations. We empirically demonstrate the effectiveness of our Augmented Pseudo-Invertible Decoder. To the best of our knowledge, ours is the first partial identification model for Markovian structural causal models with continuous outcomes.

MCML Authors

Valentyn Melnychuk

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[20]

J. Schweisthal, D. Frauen, V. Melnychuk and S. Feuerriegel.
Reliable Off-Policy Learning for Dosage Combinations.
NeurIPS 2023 - 37th Conference on Neural Information Processing Systems. New Orleans, LA, USA, Dec 10-16, 2023. URL

Abstract

Decision-making in personalized medicine such as cancer therapy or critical care must often make choices for dosage combinations, i.e., multiple continuous treatments. Existing work for this task has modeled the effect of multiple treatments independently, while estimating the joint effect has received little attention but comes with non-trivial challenges. In this paper, we propose a novel method for reliable off-policy learning for dosage combinations. Our method proceeds along three steps: (1) We develop a tailored neural network that estimates the individualized dose-response function while accounting for the joint effect of multiple dependent dosages. (2) We estimate the generalized propensity score using conditional normalizing flows in order to detect regions with limited overlap in the shared covariate-treatment space. (3) We present a gradient-based learning algorithm to find the optimal, individualized dosage combinations. Here, we ensure reliable estimation of the policy value by avoiding regions with limited overlap. We finally perform an extensive evaluation of our method to show its effectiveness. To the best of our knowledge, ours is the first work to provide a method for reliable off-policy learning for optimal dosage combinations.

MCML Authors

Jonas Schweisthal

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Valentyn Melnychuk

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[19]

D. Geißler, D. Bär, N. Pröllochs and S. Feuerriegel.
Russian propaganda on social media during the 2022 invasion of Ukraine.
EPJ Data Science (Dec. 2023). DOI

Abstract

The Russian invasion of Ukraine in February 2022 was accompanied by practices of information warfare, yet existing evidence is largely anecdotal while large-scale empirical evidence is lacking. Here, we analyze the spread of pro-Russian support on social media. For this, we collected messages from Twitter with pro-Russian support. Our findings suggest that pro-Russian messages received ∼251,000 retweets and thereby reached around 14.4 million users. We further provide evidence that bots played a disproportionate role in the dissemination of pro-Russian messages and amplified its proliferation in early-stage diffusion. Countries that abstained from voting on the United Nations Resolution ES-11/1 such as India, South Africa, and Pakistan showed pronounced activity of bots. Overall, 20.28% of the spreaders are classified as bots, most of which were created at the beginning of the invasion. Together, our findings suggest the presence of a large-scale Russian propaganda campaign on social media and highlight the new threats to society that originate from it. Our results also suggest that curbing bots may be an effective strategy to mitigate such campaigns.

MCML Authors

Dominique Geißler

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Dominik Bär

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[18]

J. Rausch, G. Rashiti, M. Gusev, C. Zhang and S. Feuerriegel.
DSG: An End-to-End Document Structure Generator.
ICDM 2023 - 23rd IEEE International Conference on Data Mining. Shanghai, China, Dec 01-04, 2023. DOI

Abstract

Information in industry, research, and the public sector is widely stored as rendered documents (e.g., PDF files, scans). Hence, to enable downstream tasks, systems are needed that map rendered documents onto a structured hierarchical format. However, existing systems for this task are limited by heuristics and are not end-to-end trainable. In this work, we introduce the Document Structure Generator (DSG), a novel system for document parsing that is fully end-to-end trainable. DSG combines a deep neural network for parsing (i) entities in documents (e.g., figures, text blocks, headers, etc.) and (ii) relations that capture the sequence and nested structure between entities. Unlike existing systems that rely on heuristics, our DSG is trained end-to-end, making it effective and flexible for real-world applications. We further contribute a new, large-scale dataset called E-Periodica comprising real-world magazines with complex document structures for evaluation. Our results demonstrate that our DSG outperforms commercial OCR tools and, on top of that, achieves state-of-the-art performance. To the best of our knowledge, our DSG system is the first end-to-end trainable system for hierarchical document parsing.

MCML Authors

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[17]

S. Feuerriegel, R. DiResta, J. A. Goldstein, S. Kumar, P. Lorenz-Spreen, M. Tomz and N. Pröllochs.
Research can help to tackle AI-generated disinformation.
Nature Human Behaviour 7 (Nov. 2023). DOI

Abstract

Generative artificial intelligence (AI) tools have made it easy to create realistic disinformation that is hard to detect by humans and may undermine public trust. Some approaches used for assessing the reliability of online information may no longer work in the AI age. We offer suggestions for how research can help to tackle the threats of AI-generated disinformation.

MCML Authors

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[16]

Y. R. Shrestha, G. von Krogh and S. Feuerriegel.
Building open-source AI.
Nature Computational Science 3.11 (Oct. 2023). DOI

Abstract

Artificial intelligence (AI) drives innovation across society, economies and science. We argue for the importance of building AI technology according to open-source principles to foster accessibility, collaboration, responsibility and interoperability.
The computer science community has a long tradition of embracing open-source principles. However, companies increasingly restrict access to AI innovations. An example is OpenAI, which was founded to make scientific research openly available but which eventually restricted access to research findings. Although such a strategy reflects a company’s legitimate incentive to obtain financial returns, such protection increases concentration of power, restricting access to AI technology. Further down the road, concentrated power could lead to growing inequality in AI research, education and public use. Here we discuss why proprietary AI technology should be complemented by open-source AI across the essential components for building AI technology: datasets, source codes and models.

MCML Authors

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[15]

Y. Ma, D. Frauen, V. Melnychuk and S. Feuerriegel.
Counterfactual Fairness for Predictions using Generative Adversarial Networks.
Preprint (Oct. 2023). arXiv

Abstract

Fairness in predictions is of direct importance in practice due to legal, ethical, and societal reasons. It is often achieved through counterfactual fairness, which ensures that the prediction for an individual is the same as that in a counterfactual world under a different sensitive attribute. However, achieving counterfactual fairness is challenging as counterfactuals are unobservable. In this paper, we develop a novel deep neural network called Generative Counterfactual Fairness Network (GCFN) for making predictions under counterfactual fairness. Specifically, we leverage a tailored generative adversarial network to directly learn the counterfactual distribution of the descendants of the sensitive attribute, which we then use to enforce fair predictions through a novel counterfactual mediator regularization. If the counterfactual distribution is learned sufficiently well, our method is mathematically guaranteed to ensure the notion of counterfactual fairness. Thereby, our GCFN addresses key shortcomings of existing baselines that are based on inferring latent variables, yet which (a) are potentially correlated with the sensitive attributes and thus lead to bias, and (b) have weak capability in constructing latent representations and thus low prediction performance. Across various experiments, our method achieves state-of-the-art performance. Using a real-world case study from recidivism prediction, we further demonstrate that our method makes meaningful predictions in practice.

MCML Authors

Yuchen Ma

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Valentyn Melnychuk

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[14]

D. Bär, N. Pröllochs and S. Feuerriegel.
New Threats to Society from Free-Speech Social Media Platforms.
Communications of the ACM 66.10 (Sep. 2023). DOI

Abstract

Understanding emerging threats from social media platforms.

MCML Authors

Dominik Bär

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[13]

M. Toetzke, B. Probst and S. Feuerriegel.
Leveraging large language models to monitor climate technology innovation.
Environmental Research Letters 18.9 (Sep. 2023). DOI

Abstract

To achieve net-zero emissions, public policy needs to foster rapid innovation of climate technologies. However, there is a scarcity of comprehensive and up-to-date evidence to guide policymaking by monitoring climate innovation systems. This is notable, especially at the center of the innovation process, where nascent inventions transition into profitable and scalable market solutions. Here, we discuss the potential of large language models (LLMs) to monitor climate technology innovation. By analyzing large pools of unstructured text data sources, such as company reports and social media, LLMs can automate information retrieval processes and thereby improve existing monitoring in terms of cost-effectiveness, timeliness, and comprehensiveness. In this perspective, we show how LLMs can play a crucial role in informing innovation policy for the energy transition by highlighting promising use cases and prevailing challenges for research and policy.

MCML Authors

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[12]

V. Melnychuk, D. Frauen and S. Feuerriegel.
Normalizing Flows for Interventional Density Estimation.
ICML 2023 - 40th International Conference on Machine Learning. Honolulu, Hawaii, Jul 23-29, 2023. URL

Abstract

Existing machine learning methods for causal inference usually estimate quantities expressed via the mean of potential outcomes (e.g., average treatment effect). However, such quantities do not capture the full information about the distribution of potential outcomes. In this work, we estimate the density of potential outcomes after interventions from observational data. For this, we propose a novel, fully-parametric deep learning method called Interventional Normalizing Flows. Specifically, we combine two normalizing flows, namely (i) a nuisance flow for estimating nuisance parameters and (ii) a target flow for parametric estimation of the density of potential outcomes. We further develop a tractable optimization objective based on a one-step bias correction for efficient and doubly robust estimation of the target flow parameters. As a result, our Interventional Normalizing Flows offer a properly normalized density estimator. Across various experiments, we demonstrate that our Interventional Normalizing Flows are expressive and highly effective, and scale well with both sample size and high-dimensional confounding. To the best of our knowledge, our Interventional Normalizing Flows are the first proper fully-parametric, deep learning method for density estimation of potential outcomes.
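As background on why a normalizing flow yields a properly normalized density, the sketch below applies the change-of-variables formula to the simplest possible flow, a one-dimensional affine map of a standard normal. It is not the Interventional Normalizing Flows method and involves no causal adjustment. All function names and parameter values are assumptions chosen for illustration.

```python
import math


def standard_normal_logpdf(z):
    return -0.5 * (z * z + math.log(2.0 * math.pi))


def affine_flow_logpdf(y, shift, log_scale):
    """Log-density of y = shift + exp(log_scale) * z with z ~ N(0, 1)."""
    z = (y - shift) * math.exp(-log_scale)   # inverse transform
    log_det = -log_scale                     # log |dz/dy|
    return standard_normal_logpdf(z) + log_det


def integrate_density(shift, log_scale, lo=-20.0, hi=20.0, steps=40000):
    """Trapezoidal check that the flow density integrates to about one."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(affine_flow_logpdf(lo + i * h, shift, log_scale))
    return total * h


mass = integrate_density(shift=1.5, log_scale=math.log(2.0))
```

The log-determinant term is what keeps the total probability mass at one after the transformation; deeper flows stack many invertible maps and sum their log-determinants in the same way.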

MCML Authors

Valentyn Melnychuk

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[11]

D. Bär, N. Pröllochs and S. Feuerriegel.
Finding Qs: Profiling QAnon Supporters on Parler.
ICWSM 2023 - 17th International AAAI Conference on Web and Social Media. Limassol, Cyprus, Jun 05-08, 2023. DOI

Abstract

The social media platform Parler has emerged into a prominent fringe community where a significant part of the user base are self-reported supporters of QAnon, a far-right conspiracy theory alleging that a cabal of elites controls global politics. QAnon is considered to have had an influential role in the public discourse during the 2020 U.S. presidential election. However, little is known about QAnon supporters on Parler and what sets them apart from other users. Building upon social identity theory, we aim to profile the characteristics of QAnon supporters on Parler. We analyze a large-scale dataset with more than 600,000 profiles of English-speaking users on Parler. Based on users’ profiles, posts, and comments, we then extract a comprehensive set of user features, linguistic features, network features, and content features. This allows us to perform user profiling and understand to what extent these features discriminate between QAnon and non-QAnon supporters on Parler. Our analysis is three-fold: (1) We quantify the number of QAnon supporters on Parler, finding that 34,913 users (5.5% of all users) openly report supporting the conspiracy. (2) We examine differences between QAnon vs. non-QAnon supporters. We find that QAnon supporters differ statistically significantly from non-QAnon supporters across multiple dimensions. For example, they have, on average, a larger number of followers, followees, and posts, and thus have a large impact on the Parler network. (3) We use machine learning to identify which user characteristics discriminate QAnon from non-QAnon supporters. We find that user features, linguistic features, network features, and content features can, to a large extent, discriminate QAnon vs. non-QAnon supporters on Parler. In particular, we find that user features are highly discriminatory, followed by content features and linguistic features.

MCML Authors

Dominik Bär

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[10]

D. Frauen and S. Feuerriegel.
Estimating individual treatment effects under unobserved confounding using binary instruments.
ICLR 2023 - 11th International Conference on Learning Representations. Kigali, Rwanda, May 01-05, 2023. URL

Abstract

Estimating conditional average treatment effects (CATEs) from observational data is relevant in many fields such as personalized medicine. However, in practice, the treatment assignment is usually confounded by unobserved variables and thus introduces bias. A remedy to remove the bias is the use of instrumental variables (IVs). Such settings are widespread in medicine (e.g., trials where the treatment assignment is used as binary IV). In this paper, we propose a novel, multiply robust machine learning framework, called MRIV, for estimating CATEs using binary IVs and thus yield an unbiased CATE estimator. Different from previous work for binary IVs, our framework estimates the CATE directly via a pseudo outcome regression. (1) We provide a theoretical analysis where we show that our framework yields multiple robust convergence rates: our CATE estimator achieves fast convergence even if several nuisance estimators converge slowly. (2) We further show that our framework asymptotically outperforms state-of-the-art plug-in IV methods for CATE estimation, in the sense that it achieves a faster rate of convergence if the CATE is smoother than the individual outcome surfaces. (3) We build upon our theoretical results and propose a tailored deep neural network architecture called MRIV-Net for CATE estimation using binary IVs. Across various computational experiments, we demonstrate empirically that our MRIV-Net achieves state-of-the-art performance. To the best of our knowledge, our MRIV is the first multiply robust machine learning framework tailored to estimating CATEs in the binary IV setting.
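As background on the binary IV setting, the sketch below implements the classical Wald ratio, which divides the instrument's effect on the outcome by its effect on treatment uptake. This is a textbook estimator of a single average effect, not the MRIV framework or a CATE estimator. The function names and the toy data are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def wald_estimate(z, a, y):
    """Classical Wald ratio for a binary instrument z, treatment a, outcome y."""
    y1 = mean([yi for zi, yi in zip(z, y) if zi == 1])
    y0 = mean([yi for zi, yi in zip(z, y) if zi == 0])
    a1 = mean([ai for zi, ai in zip(z, a) if zi == 1])
    a0 = mean([ai for zi, ai in zip(z, a) if zi == 0])
    if a1 == a0:
        raise ValueError("instrument does not shift treatment uptake")
    return (y1 - y0) / (a1 - a0)


# Hypothetical toy data: z = randomized assignment, a = treatment actually taken.
z = [1, 1, 1, 1, 0, 0, 0, 0]
a = [1, 1, 1, 0, 0, 0, 0, 1]
y = [5.0, 6.0, 5.0, 2.0, 2.0, 3.0, 2.0, 5.0]
effect = wald_estimate(z, a, y)
```

Plug-in approaches to CATE estimation evaluate such ratios conditionally on covariates; the abstract contrasts this with estimating the CATE directly through a pseudo-outcome regression.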

MCML Authors

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[9]

N. Banholzer, T. Mellan, H. J. T. Unwin, S. Feuerriegel, S. Mishra and S. Bhatt.
A comparison of short-term probabilistic forecasts for the incidence of COVID-19 using mechanistic and statistical time series models.
Preprint (May 2023). arXiv

Abstract

Short-term forecasts of infectious disease spread are a critical component in risk evaluation and public health decision making. While different models for short-term forecasting have been developed, open questions about their relative performance remain. Here, we compare short-term probabilistic forecasts of popular mechanistic models based on the renewal equation with forecasts of statistical time series models. Our empirical comparison is based on data of the daily incidence of COVID-19 across six large US states over the first pandemic year. We find that, on average, probabilistic forecasts from statistical time series models are overall at least as accurate as forecasts from mechanistic models. Moreover, statistical time series models better capture volatility. Our findings suggest that domain knowledge, which is integrated into mechanistic models by making assumptions about disease dynamics, does not improve short-term forecasts of disease incidence. We note, however, that forecasting is often only one of many objectives and thus mechanistic models remain important, for example, to model the impact of vaccines or the emergence of new variants.
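For orientation, the renewal equation behind the mechanistic models sets expected new infections to the reproduction number times past incidence weighted by the generation interval. The sketch below is a deterministic simplification, whereas the paper compares probabilistic forecasts. The function name, the generation-interval weights, and the reproduction number are illustrative assumptions, not values from the study.

```python
def renewal_forecast(past_incidence, reproduction_number, generation_interval, horizon):
    """Deterministic renewal-equation projection of expected incidence."""
    incidence = list(past_incidence)
    for _ in range(horizon):
        # Infection pressure: past cases weighted by the generation interval,
        # where generation_interval[s - 1] is the weight for a lag of s days.
        pressure = sum(
            w * incidence[-s]
            for s, w in enumerate(generation_interval, start=1)
            if s <= len(incidence)
        )
        incidence.append(reproduction_number * pressure)
    return incidence[len(past_incidence):]


# Hypothetical inputs: flat incidence of 100 cases per day and weights summing to one.
weights = [0.5, 0.3, 0.2]
forecast = renewal_forecast([100.0, 100.0, 100.0], 1.2, weights, horizon=2)
```

Each forecast step feeds back into the next one, which is how assumptions about disease dynamics enter the projection.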

MCML Authors

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[8]

D. Bär, F. Calderon, M. Lawlor, S. Licklederer, M. Totzauer and S. Feuerriegel.
Analyzing Social Media Activities at Bellingcat.
WebSci 2023 - 15th ACM Web Science Conference 2023. Austin, TX, USA, Apr 30-May 01, 2023. DOI

Abstract

Open-source journalism emerged as a new phenomenon in the media ecosystem, which uses crowdsourcing to fact-check and generate investigative reports for world events using open sources (e.g., social media). A particularly prominent example is Bellingcat. Bellingcat is known for its investigations on the illegal use of chemical weapons during the Syrian war, the Russian responsibility for downing flight MH17, the identification of the perpetrators in the attempted murder of Alexei Navalny, and war crimes in the Russo-Ukrainian war. Social media is crucial for this, as it is used to disseminate findings and crowdsource fact-checks. In this work, we characterize the social media activities at Bellingcat on Twitter. For this, we built a comprehensive dataset of all N=24,682 tweets posted by Bellingcat on Twitter since its inception in July 2014. Our analysis is three-fold: (1) We analyze how Bellingcat uses Twitter to disseminate information and collect information from its follower base. Here, we find a steady increase in both posts and replies over time, particularly during the Russo-Ukrainian war, which is in line with the growing importance of Bellingcat for the traditional media ecosystem. (2) We identify characteristics of posts that are successful in eliciting user engagement. User engagement is particularly large for posts embedding additional media items and with a more negative sentiment. (3) We examine how the follower base has responded to the Russian invasion of Ukraine. Here, we find that the sentiment has become more polarized and negative. We attribute this to a ~13-fold increase in bots interacting with the Bellingcat account. Overall, our findings provide recommendations for how open-source journalism such as Bellingcat can successfully operate on social media.

MCML Authors

Dominik Bär

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[7]

N. Pröllochs and S. Feuerriegel.
Mechanisms of True and False Rumor Sharing in Social Media: Collective Intelligence or Herd Behavior?
CHI 2023 - Conference on Human Factors in Computing Systems. Hamburg, Germany, Apr 23-28, 2023. DOI

Abstract

Social media platforms disseminate extensive volumes of online content, including true and, in particular, false rumors. Previous literature has studied the diffusion of offline rumors, yet more research is needed to understand the diffusion of online rumors. In this paper, we examine the role of lifetime and crowd effects in social media sharing behavior for true vs. false rumors. Based on 126,301 Twitter cascades, we find that the sharing behavior is characterized by lifetime and crowd effects that explain differences in the spread of true as opposed to false rumors. All else equal, we find that a longer lifetime is associated with less sharing activity, yet the reduction in sharing is larger for false than for true rumors. Hence, lifetime is an important determinant explaining why false rumors die out. Furthermore, we find that the spread of false rumors is characterized by herding tendencies (rather than collective intelligence), whereby false rumors proliferate at a larger cascade depth. These findings explain differences in the diffusion dynamics of true and false rumors and further offer practical implications for social media platforms.

MCML Authors

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[6]

D. Frauen, T. Hatt, V. Melnychuk and S. Feuerriegel.
Estimating Average Causal Effects from Patient Trajectories.
AAAI 2023 - 37th Conference on Artificial Intelligence. Washington, DC, USA, Feb 07-14, 2023. DOI

Abstract

In medical practice, treatments are selected based on the expected causal effects on patient outcomes. Here, the gold standard for estimating causal effects is the randomized controlled trial; however, such trials are costly and sometimes even unethical. Instead, medical practice is increasingly interested in estimating causal effects among patient (sub)groups from electronic health records, that is, observational data. In this paper, we aim at estimating the average causal effect (ACE) from observational data (patient trajectories) that are collected over time. For this, we propose DeepACE: an end-to-end deep learning model. DeepACE leverages the iterative G-computation formula to adjust for the bias induced by time-varying confounders. Moreover, we develop a novel sequential targeting procedure which ensures that DeepACE has favorable theoretical properties, i.e., is doubly robust and asymptotically efficient. To the best of our knowledge, this is the first work that proposes an end-to-end deep learning model tailored for estimating time-varying ACEs. We evaluate DeepACE in an extensive set of experiments, confirming that it achieves state-of-the-art performance. We further provide a case study for patients suffering from low back pain to demonstrate that DeepACE generates important and meaningful findings for clinical practice. Our work enables practitioners to develop effective treatment recommendations based on population effects.

MCML Authors

Dennis Frauen

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Valentyn Melnychuk

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

[5]

S. Schallmoser, T. Zueger, M. Kraus, M. Saar-Tsechansky, C. Stettler and S. Feuerriegel.
Machine Learning for Predicting Micro- and Macrovascular Complications in Individuals With Prediabetes or Diabetes: Retrospective Cohort Study.
Journal of Medical Internet Research 25 (Feb. 2023). DOI

Abstract

Background: Micro- and macrovascular complications are a major burden for individuals with diabetes and can already arise in a prediabetic state. To allocate effective treatments and to possibly prevent these complications, identification of those at risk is essential.
Objective: This study aimed to build machine learning (ML) models that predict the risk of developing a micro- or macrovascular complication in individuals with prediabetes or diabetes.
Methods: In this study, we used electronic health records from Israel that contain information about demographics, biomarkers, medications, and disease codes; span from 2003 to 2013; and were queried to identify individuals with prediabetes or diabetes in 2008. Subsequently, we aimed to predict which of these individuals developed a micro- or macrovascular complication within the next 5 years. We included 3 microvascular complications: retinopathy, nephropathy, and neuropathy. In addition, we considered 3 macrovascular complications: peripheral vascular disease (PVD), cerebrovascular disease (CeVD), and cardiovascular disease (CVD). Complications were identified via disease codes, and, for nephropathy, the estimated glomerular filtration rate and albuminuria were considered additionally. Inclusion criteria were complete information on age and sex and on disease codes (or measurements of estimated glomerular filtration rate and albuminuria for nephropathy) until 2013 to account for patient dropout. Exclusion criteria for predicting a complication were diagnosis of this specific complication before or in 2008. In total, 105 predictors from demographics, biomarkers, medications, and disease codes were used to build the ML models. We compared 2 ML models: logistic regression and gradient-boosted decision trees (GBDTs). To explain the predictions of the GBDTs, we calculated Shapley additive explanations values.
Results: Overall, 13,904 and 4259 individuals with prediabetes and diabetes, respectively, were identified in our underlying data set. For individuals with prediabetes, the areas under the receiver operating characteristic curve for logistic regression and GBDTs were, respectively, 0.657 and 0.681 (retinopathy), 0.807 and 0.815 (nephropathy), 0.727 and 0.706 (neuropathy), 0.730 and 0.727 (PVD), 0.687 and 0.693 (CeVD), and 0.707 and 0.705 (CVD); for individuals with diabetes, the areas under the receiver operating characteristic curve were, respectively, 0.673 and 0.726 (retinopathy), 0.763 and 0.775 (nephropathy), 0.745 and 0.771 (neuropathy), 0.698 and 0.715 (PVD), 0.651 and 0.646 (CeVD), and 0.686 and 0.680 (CVD). Overall, the prediction performance is comparable for logistic regression and GBDTs. The Shapley additive explanations values showed that increased levels of blood glucose, glycated hemoglobin, and serum creatinine are risk factors for microvascular complications. Age and hypertension were associated with an elevated risk for macrovascular complications.
Conclusions: Our ML models allow for an identification of individuals with prediabetes or diabetes who are at increased risk of developing micro- or macrovascular complications. The prediction performance varied across complications and target populations but was in an acceptable range for most prediction tasks.

MCML Authors

Simon Schallmoser

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Stefan Feuerriegel

Prof. Dr.

C4 | Computational Social Sciences

Artificial Intelligence in Management

2022

[4]

E. Dorigatti, J. Schweisthal, B. Bischl and M. Rezaei.
Robust and Efficient Imbalanced Positive-Unlabeled Learning with Self-supervision.
Preprint (Sep. 2022). arXiv GitHub

Abstract

Learning from positive and unlabeled (PU) data is a setting where the learner only has access to positive and unlabeled samples while having no information on negative examples. Such a PU setting is of great importance in various tasks such as medical diagnosis, social network analysis, financial markets analysis, and knowledge base completion, which also tend to be intrinsically imbalanced, i.e., where most examples are actually negatives. Most existing approaches for PU learning, however, only consider artificially balanced datasets, and it is unclear how well they perform in the realistic scenario of imbalanced and long-tail data distributions. This paper proposes to tackle this challenge via robust and efficient self-supervised pretraining. However, conventional self-supervised learning methods need to be reformulated when applied to highly imbalanced PU distributions. In this paper, we present ImPULSeS, a unified representation learning framework for Imbalanced Positive Unlabeled Learning leveraging Self-Supervised debiased pre-training. ImPULSeS uses a generic combination of large-scale unsupervised learning with debiased contrastive loss and additional reweighted PU loss. We performed different experiments across multiple datasets to show that ImPULSeS is able to halve the error rate of the previous state-of-the-art, even compared with previous methods that are given the true prior. Moreover, our method showed increased robustness to prior misspecification and superior performance even when pretraining was performed on an unrelated dataset. We anticipate such robustness and efficiency will make it much easier for practitioners to obtain excellent results on other PU datasets of interest.

MCML Authors

Emilio Dorigatti

Dr.

A1 | Statistical Foundations & Explainability
→ Group Bernd Bischl

* Former Member

Jonas Schweisthal

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Bernd Bischl

Prof. Dr.

A1 | Statistical Foundations & Explainability

Statistical Learning and Data Science

Mina Rezaei

Dr.

A1 | Statistical Foundations & Explainability
→ Group Bernd Bischl

Statistical Learning and Data Science

2020

[3]

V. Melnychuk, E. Faerman, I. Manakov and T. Seidl.
Matching the Clinical Reality: Accurate OCT-Based Diagnosis From Few Labels.
CIKMW @CIKM 2020 - Workshop at the 29th ACM International Conference on Information and Knowledge Management (CIKM 2020). Galway, Ireland, Oct 19-23, 2020. PDF GitHub

Abstract

Unlabeled data is often abundant in the clinic, making machine learning methods based on semi-supervised learning a good match for this setting. Despite this, they are currently receiving relatively little attention in medical image analysis literature. Instead, most practitioners and researchers focus on supervised or transfer learning approaches. The recently proposed MixMatch and FixMatch algorithms have demonstrated promising results in extracting useful representations while requiring very few labels. Motivated by these recent successes, we apply MixMatch and FixMatch in an ophthalmological diagnostic setting and investigate how they fare against standard transfer learning. We find that both algorithms outperform the transfer learning baseline on all fractions of labelled data. Furthermore, our experiments show that Mean Teacher, which is a component of both algorithms, is not needed for our classification problem, as disabling it leaves the outcome unchanged.

MCML Authors

Valentyn Melnychuk

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Evgeny Faerman

Dr.

A3 | Computational Models
→ Group Matthias Schubert

* Former Member

Thomas Seidl

Prof. Dr.

A3 | Computational Models

Database Systems and Data Mining

[2]

M. Berrendorf, E. Faerman, V. Melnychuk, V. Tresp and T. Seidl.
Knowledge Graph Entity Alignment with Graph Convolutional Networks: Lessons Learned.
ECIR 2020 - 42nd European Conference on Information Retrieval. Virtual, Apr 14-17, 2020. DOI GitHub

Abstract

In this work, we focus on the problem of entity alignment in Knowledge Graphs (KG) and we report on our experiences when applying a Graph Convolutional Network (GCN) based model for this task. Variants of GCN are used in multiple state-of-the-art approaches and therefore it is important to understand the specifics and limitations of GCN-based models. Despite serious efforts, we were not able to fully reproduce the results from the original paper, and after a thorough audit of the code provided by the authors, we concluded that their implementation is different from the architecture described in the paper. In addition, several tricks are required to make the model work and some of them are not very intuitive. We provide an extensive ablation study to quantify the effects these tricks and changes of architecture have on final performance. Furthermore, we examine current evaluation approaches and systematize available benchmark datasets. We believe that people interested in KG matching might profit from our work, as well as novices entering the field.

MCML Authors

Max Berrendorf

Dr.

A3 | Computational Models
→ Group Volker Tresp

* Former Member

Evgeny Faerman

Dr.

A3 | Computational Models
→ Group Matthias Schubert

* Former Member

Valentyn Melnychuk

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management

Volker Tresp

Prof. Dr.

A3 | Computational Models

Database Systems and Data Mining

Thomas Seidl

Prof. Dr.

A3 | Computational Models

Database Systems and Data Mining

[1]

D. Davletshina, V. Melnychuk, V. Tran, H. Singla, M. Berrendorf, E. Faerman, M. Fromm and M. Schubert.
Unsupervised Anomaly Detection for X-Ray Images.
Preprint (Jan. 2020). arXiv GitHub

Abstract

Obtaining labels for medical (image) data requires scarce and expensive experts. Moreover, due to ambiguous symptoms, single images rarely suffice to correctly diagnose a medical condition. Instead, a diagnosis often requires taking additional background information, such as the patient’s medical history or test results, into account. Hence, instead of focusing on uninterpretable black-box systems delivering an uncertain final diagnosis in an end-to-end fashion, we investigate how unsupervised methods trained on images without anomalies can be used to assist doctors in evaluating X-ray images of hands. Our method increases the efficiency of making a diagnosis and reduces the risk of missing important regions. Therefore, we adopt state-of-the-art approaches for unsupervised learning to detect anomalies and show how the outputs of these methods can be explained. To reduce the effect of noise, which often can be mistaken for an anomaly, we introduce a powerful preprocessing pipeline. We provide an extensive evaluation of different approaches and demonstrate empirically that even without labels it is possible to achieve satisfying results on a real-world dataset of X-ray images of hands. We also evaluate the importance of preprocessing, and one of our main findings is that without it, most of our approaches perform no better than random.

MCML Authors

Valentyn Melnychuk

C4 | Computational Social Sciences
→ Group Stefan Feuerriegel

Artificial Intelligence in Management