Research

Philosophy and AI

[JH] In recent years, any number of universities have built bridges between computer science and ethics. Students whose future careers involve designing AIs, the thought is, should learn some basic ideas in ethics. Research done at the ValuesLab questions this widely held view. In a slogan, AI needs philosophy, not “only” ethics. Why? Ethics certainly has a role to play in the training of future AI designers. Many of the key questions, however, are studied elsewhere. Here are some examples. Can AIs think? Does it make sense to speak as if AIs held beliefs, made inferences, engaged in reasoning, had intentions, and so on? The relevant notions here are studied in the philosophy of mind. What makes AIs explainable? Understanding, explanation, and justification are topics in epistemology. How do LLMs work, and how can their performance be improved? This involves any number of themes from the philosophy of language. What makes an AI model unfair? Here it matters how we conceive of causation and counterfactuals, key notions in metaphysics and the philosophy of science. The list goes on. The upshot is that core questions in major subfields of philosophy bear directly on challenges in AI. Instead of “embedded ethicists,” we argue, AI courses need embedded philosophers.

Explainable AI and Ancient Values

[KMV] In contrast to modern moral philosophy, ancient Greek ethics holds that values related to knowledge and understanding (in Greek, epistêmê) are fundamental to human life. This project revives this conviction and argues that it speaks to key concerns in AI. AIs decide or help decide who gets an interview for a job, whose loan application is approved, what a patient’s medical diagnosis is and what their best treatment options are, and so forth. Much of the research in the ethics of AI is about the fairness of these decisions. We argue that research should also address epistemic values such as understanding and explainability. We aim to contribute to so-called explainable AI (XAI), which appreciates that it is a basic feature of the human mind to ask and expect answers to why-questions.

Reasons, Causes, and Explainable AI

[TP] It is widely recognized that users need to be able to understand AI. This is the main goal of explainable AI research. But what is it, exactly, that users need to be able to understand? Presumably, a user needs to understand how a system makes a decision; they need to understand the so-called “black box,” that elusive function that takes input (data) and gives output (a decision, such as to reject a loan application). There are importantly different ways to understand a decision, however, that correspond to different ways of understanding and answering “why”-questions. Say you have a friend who believes that God exists and you ask her why she thinks so. She might answer by explaining the various things going on in the brain that figure in forming the belief, while clarifying that this is all based on models and hypotheses; no one in the field really knows precisely what happened in her mind when she formed the belief “God exists.” This would be analogous to the black-box problem. Alternatively, rather than a mechanical cause, she might answer with a social cause: because my parents brought me up to believe that God exists. Or, she might give you a reason in support of her belief: perhaps that God is the best explanation for the existence of our cosmos.

These distinctions could be important for research and development in explainable AI. At minimum, they help define what is at issue. This project aims to clarify the roles of these different “why”-questions and responses. In doing so, it asks whether and why the black box problem is a genuine problem. Would the other kinds of answers to the why-question be sufficient for the individual who is subject to AI decisions? If not, what is missing?

Alignment and Confucian Ethics

[KMV] This project explores the resources of Confucian ethics for AI development, specifically with a view to value alignment. It is widely assumed that ethics should inform AI design. Which ethics are we talking about? Researchers often invoke consequentialism, Kantian ethics, and Aristotle. Thus AI ethics seems to find itself squarely in the Western tradition. By contrast, we ask how Confucian philosophy may contribute to the development of AIs. Confucian ethics offers a framework that attends to (i) roles and relationships, (ii) constraints on what is sayable/doable for a polite person, and (iii) specific situations. For example, a son would not say such-and-such to a father on such-and-such an occasion. It is our hypothesis that this structure of ethical theory is helpful for AI ethics. For example, LLMs need to be trained on what not to say. Some of what counts as unsayable, offensive, and so on, is addressee- and context-dependent. We also consider AI models that are used in medicine, education, and so on. In these contexts, people speak as doctor to patient, as teacher to student, and so on. Presumably, AIs used in such domains should be informed by roles and relationships, conventions of the sayable/doable, and situations. In other words, Confucian ethics may provide a structure that can be modeled in AIs.
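Purely as an illustration of how such a structure might be represented computationally, one could encode role- and context-indexed speech norms as data and use them to check candidate outputs. The sketch below is our own hypothetical example (the names Norm, NORMS, and is_permitted are invented for illustration and are not part of the project or of any existing system):

# Illustrative sketch only: a toy representation of role- and
# context-indexed norms in the spirit of (i)-(iii) above.
from dataclasses import dataclass

@dataclass(frozen=True)
class Norm:
    speaker_role: str      # e.g. "son", "doctor"
    addressee_role: str    # e.g. "father", "patient"
    situation: str         # e.g. "family gathering", "diagnosis consultation"
    forbidden_topic: str   # what is not sayable in this configuration

NORMS = [
    Norm("son", "father", "family gathering", "criticism of elders"),
    Norm("doctor", "patient", "diagnosis consultation", "blunt prognosis without context"),
]

def is_permitted(speaker_role, addressee_role, situation, topic):
    """Return False if some norm marks this topic unsayable for these roles in this situation."""
    for n in NORMS:
        if (n.speaker_role == speaker_role
                and n.addressee_role == addressee_role
                and n.situation == situation
                and n.forbidden_topic == topic):
            return False
    return True

# A filter of this kind could, in principle, sit in front of a model's output.
print(is_permitted("son", "father", "family gathering", "criticism of elders"))  # False

Such a filter is of course only a toy; the project's question is whether the underlying Confucian structure of roles, sayability, and situations can inform value alignment more broadly.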

Videos

Philosophy and AI: An Introduction

Alignment

Lecture Slides

Do AIs have beliefs? Do they have intentions?

AI and Value Alignment

AI and Fairness

Papers

Measure Realism

in progress, co-authored with Jens Haas

Generics and Inference

in progress