Posts by Collection

portfolio

External investigation into an algorithm of the Dutch Education Executive Agency

Led the quantitative research for Algorithm Audit into an algorithm used by the Dutch Education Executive Agency (DUO) that was suspected of profiling students with a non-European migration background. Based on the conclusions of our initial report, the Minister of Education apologised for the use of the algorithm. In a second report, after obtaining data from the Central Bureau of Statistics (CBS), we investigated the sources of prejudice in the algorithm. The second report can be found here, and our code to reproduce the findings from the report here. A working paper with our findings can be found here.

publications

Removing Spurious Concepts from Neural Network Representations via Joint Subspace Estimation

International Conference on Machine Learning (ICML) 2024

Out-of-distribution generalization in neural networks is often hampered by spurious correlations. A common strategy is to mitigate this by removing spurious concepts from the neural network representation of the data. Existing concept-removal methods tend to be overzealous by inadvertently eliminating features associated with the main task of the model, thereby harming model performance. We propose an iterative algorithm that separates spurious from main-task concepts by jointly identifying two low-dimensional orthogonal subspaces in the neural network representation. We evaluate the algorithm on benchmark datasets for computer vision (Waterbirds, CelebA) and natural language processing (MultiNLI), and show that it outperforms existing concept-removal methods.
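
The core operation behind the method is linear concept removal: estimate a low-dimensional subspace tied to the spurious concept and project it out of the embeddings. The Python sketch below shows only that basic operation; the subspace here comes from an SVD of class-mean differences, a simplifying assumption rather than the joint, iterative estimation of orthogonal spurious and main-task subspaces proposed in the paper.

```python
import numpy as np

def spurious_subspace(X, spurious_labels, k=1):
    """Estimate a k-dimensional subspace associated with a spurious concept.

    Illustration only: the subspace is spanned by the top singular directions
    of the class-mean differences of the spurious labels, not obtained with
    the joint estimation procedure proposed in the paper.
    """
    means = np.stack([X[spurious_labels == c].mean(axis=0)
                      for c in np.unique(spurious_labels)])
    centered = means - means.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]                            # shape (k, d)

def remove_subspace(X, V):
    """Project embeddings onto the orthogonal complement of span(V)."""
    P = V.T @ V                              # projector onto the spurious subspace
    return X - X @ P                         # strip the spurious component

# Toy usage with random embeddings and a binary spurious attribute.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))
s = rng.integers(0, 2, size=200)
X_clean = remove_subspace(X, spurious_subspace(X, s, k=1))
```

Estimating the spurious and main-task subspaces jointly, as in the paper, is what keeps such a projection from discarding main-task features.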

Recommended citation: Holstege, F., Wouters, B., Giersbergen, N., & Diks, C. (2024). Removing Spurious Concepts from Neural Network Representations via Joint Subspace Estimation. In Proceedings of the 41st International Conference on Machine Learning (pp. 18568–18610). PMLR.
Download Paper

Optimizing importance weighting in the presence of sub-population shifts

International Conference on Learning Representations (ICLR) 2025

A distribution shift between the training and test data can severely harm the performance of machine learning models. Importance weighting addresses this issue by assigning different weights to data points during training. We argue that existing heuristics for determining the weights are suboptimal, as they neglect the increase in the variance of the estimated model due to the finite sample size of the training data. We interpret the optimal weights in terms of a bias-variance trade-off, and propose a bi-level optimization procedure in which the weights and model parameters are optimized simultaneously. We apply this optimization to existing importance weighting techniques for last-layer retraining of deep neural networks in the presence of sub-population shifts and show empirically that optimizing weights significantly improves generalization performance.
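
For context, the PyTorch sketch below implements plain importance-weighted last-layer retraining with group weights fixed by the common inverse-frequency heuristic; this is the kind of fixed heuristic the abstract calls suboptimal, whereas the paper optimizes the weights jointly with the model parameters through a bi-level procedure. The toy data and variable names are illustrative.

```python
import torch

torch.manual_seed(0)
d, n_groups = 32, 4

# Toy frozen features, binary labels and group (sub-population) indices.
feats = torch.randn(512, d)
labels = torch.randint(0, 2, (512,))
groups = torch.randint(0, n_groups, (512,))

# Heuristic importance weights: inversely proportional to group frequency.
counts = torch.bincount(groups, minlength=n_groups).float()
group_weights = counts.sum() / (n_groups * counts)
weights = group_weights[groups]

head = torch.nn.Linear(d, 2)                 # the retrained last layer
opt = torch.optim.SGD(head.parameters(), lr=0.1)

for step in range(200):
    losses = torch.nn.functional.cross_entropy(head(feats), labels, reduction="none")
    loss = (weights * losses).mean()         # importance-weighted training loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```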

Download Paper


Preserving Task-Relevant Information Under Linear Concept Removal

NeurIPS 2025

Modern neural networks often encode unwanted concepts alongside task-relevant information, leading to fairness and interpretability concerns. Existing post-hoc approaches can remove undesired concepts but often degrade useful signals. We introduce SPLICE (Simultaneous Projection for LInear concept removal and Covariance prEservation), which eliminates sensitive concepts from representations while exactly preserving their covariance with a target label. SPLICE achieves this via an oblique projection that “splices out” the unwanted direction yet protects important label correlations. Theoretically, it is the unique solution that removes linear concept predictability and maintains target covariance with minimal embedding distortion. Empirically, SPLICE outperforms baselines on benchmarks such as Bias in Bios and Winobias, removing protected attributes while minimally damaging main-task information.
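
The geometric idea can be illustrated with a rank-one oblique projection built from sample covariances: it zeroes the covariance between the embeddings and the concept label (so an ordinary-least-squares probe recovers nothing) while leaving the covariance with the target label exactly unchanged. The sketch below is only an illustration of a projection satisfying those two constraints and is not claimed to be the estimator from the paper.

```python
import numpy as np

def oblique_concept_removal(X, z, y):
    """Zero the covariance of embeddings X with concept z while exactly
    preserving their covariance with target y (rank-one illustration;
    assumes the two covariance directions are not collinear)."""
    Xc, zc, yc = X - X.mean(axis=0), z - z.mean(), y - y.mean()
    a = Xc.T @ zc                            # covariance direction of the concept
    b = Xc.T @ yc                            # covariance direction of the target
    b_hat = b / np.linalg.norm(b)
    u = a - (a @ b_hat) * b_hat              # component of a orthogonal to b
    # Oblique projection: maps a to zero while leaving b untouched.
    P = np.eye(X.shape[1]) - np.outer(a, u) / (u @ a)
    return Xc @ P.T

# Toy check with random embeddings, a binary concept and a binary target.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))
z = rng.integers(0, 2, size=300).astype(float)
y = rng.integers(0, 2, size=300).astype(float)

Xc = X - X.mean(axis=0)
X_new = oblique_concept_removal(X, z, y)
print(np.abs(X_new.T @ (z - z.mean())).max())                          # ~0: concept covariance gone
print(np.abs(X_new.T @ (y - y.mean()) - Xc.T @ (y - y.mean())).max())  # ~0: target covariance kept
```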

Download Paper

talks

teaching