Mitigating Occupational Gender Bias in CLIP through Direction Loss-Augmented Learnable Prompts

Rahmi Fariza; Kurniawati Azizah

doi:10.58421/misro.v5i2.1674

Authors

Rahmi Fariza Universitas Indonesia https://orcid.org/0009-0000-8074-6010
Kurniawati Azizah Universitas Indonesia

Keywords:

Context Optimization, Vision-Language Models, Gender Direction, Loss Function, CLIP

Abstract

Vision-language models such as CLIP achieve strong zero-shot performance but inherit gender bias from their web-scale pretraining data, which is especially visible when the model is used to retrieve images for occupations. Existing prompt-based debiasing methods rely on manually crafted text prompts, which require extensive trial and error and do not transfer easily across professions. This study proposes CoOp with Direction Loss (CoOp+DL), which augments Context Optimization (CoOp), a learnable-prompt method, with an auxiliary loss that pushes the learned prompt representations away from a gender direction computed from contrasting male- and female-referencing prompts. The framework is evaluated on 500 images covering 10 professions with a balanced gender distribution, using three CLIP backbones (ViT-B/32, ViT-B/16, and OpenCLIP ViT-B/32) and three metrics: Gender Bias Score (GBS), Precision-at-K, and SignedSkew. CoOp+DL reduces GBS by 10.3% on ViT-B/32, 5.9% on ViT-B/16, and 9.7% on OpenCLIP, an average of 8.65% across backbones, with bootstrap confidence intervals (n = 1,000) indicating that the direction loss is an active contributor to this reduction rather than an artifact of additional prompt capacity. Retrieval utility (Precision@K) improves on ViT-B/32 and ViT-B/16 (+6.8% and +4.3%) but decreases on OpenCLIP (−8.2%), indicating a backbone-dependent fairness-utility trade-off. CoOp+DL achieves bias reduction that is statistically comparable to a manually engineered ensemble prompt, without requiring manual prompt design. The findings should be interpreted with caution, given the modest evaluation set (500 images, 10 professions) and the binary gender formulation used to define the direction vector, both of which limit generalization and warrant further validation before deployment.
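
The abstract does not give the exact form of the direction loss, so the following is only a minimal sketch of one plausible formulation, written with NumPy on plain arrays rather than real CLIP embeddings. It assumes the gender direction is the normalized difference between the mean male-prompt and mean female-prompt text embeddings, and that the auxiliary term is the mean squared cosine similarity between the learned prompt embeddings and that direction. The function names (`gender_direction`, `direction_loss`, `total_loss`) and the weight `lam` are hypothetical and are not taken from the paper.

```python
import numpy as np


def unit(v, axis=-1, eps=1e-12):
    """L2-normalize `v` along `axis` (eps avoids division by zero)."""
    return v / (np.linalg.norm(v, axis=axis, keepdims=True) + eps)


def gender_direction(male_emb, female_emb):
    """Assumed definition: unit vector from the mean female-prompt
    embedding to the mean male-prompt embedding."""
    diff = unit(male_emb).mean(axis=0) - unit(female_emb).mean(axis=0)
    return unit(diff)


def direction_loss(prompt_emb, direction):
    """Assumed penalty: mean squared cosine similarity between each
    prompt embedding and the gender direction. It is 0 when every
    prompt embedding is orthogonal to the direction."""
    cos = unit(prompt_emb) @ direction
    return float(np.mean(cos ** 2))


def total_loss(task_loss, prompt_emb, direction, lam=1.0):
    """Task loss (e.g. the CoOp objective) plus the weighted
    direction penalty; `lam` is a hypothetical trade-off weight."""
    return task_loss + lam * direction_loss(prompt_emb, direction)


if __name__ == "__main__":
    # Toy 3-d embeddings standing in for text-encoder outputs.
    male = np.array([[1.0, 0.0, 0.0]])
    female = np.array([[-1.0, 0.0, 0.0]])
    d = gender_direction(male, female)

    aligned = np.array([[1.0, 0.0, 0.0]])     # lies along the direction
    orthogonal = np.array([[0.0, 1.0, 0.0]])  # carries no component along it
    print(direction_loss(aligned, d), direction_loss(orthogonal, d))
```

Under these assumptions, minimizing the penalty drives the prompt embeddings toward orthogonality with the gender direction, which is one way to read "pushes the learned prompt representations away from a gender direction". In an actual training loop the same computation would be done with differentiable tensors so that gradients reach the learnable context vectors.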
