Videos

Philosophy and AI: An Introduction

Part 1 of Intro to Philosophy and AI: A Proposal
Part 2 of Intro to Philosophy and AI: The Lazy Argument
Part 3 of Intro to Philosophy and AI: A Case Study

Alignment

Part 4 of Intro to Philosophy and AI: Pluralism and Value Alignment
Part 5 of Intro to Philosophy and AI: Alignment and Constitutional AI
Part 6 of Intro to Philosophy and AI: Alignment and Human Values
Part 7 of Intro to Philosophy and AI: Fairness in AI

Intelligence and Mental States

Part 8 of Intro to Philosophy and AI: Intelligence and Thinking
Part 9 of Intro to Philosophy and AI: The Turing Test
Part 10 of Intro to Philosophy and AI: Resisting Ascriptionism
Part 11 of Intro to Philosophy and AI: The Role of Memory

Lecture Slides

AI and Mental States

The Predictive Brain

AI and Value Alignment

AI and Fairness

Projects

Philosophy and AI

[JH] In recent years, any number of universities have built bridges between computer science and ethics. Students whose future careers involve designing AIs, the thought is, should learn some basic ideas in ethics. Research done at the ValuesLab questions this widely held view. In a slogan, AI needs philosophy, not “only” ethics. Why? Ethics certainly has a role to play in the training of future AI designers. Many of the key questions, however, are studied elsewhere. Here are some examples. Can AIs think? Does it make sense to speak as if AIs hold beliefs, make inferences, engage in reasoning, have intentions, and so on? The relevant notions here are studied in the philosophy of mind. What makes AIs explainable? Understanding, explanation, and justification are topics in epistemology. How do LLMs work, and how can their performance be improved? This involves any number of themes from the philosophy of language. What makes an AI model unfair? Here it matters how we conceive of causation and counterfactuals, key notions in metaphysics and the philosophy of science. The list goes on. The upshot is that core questions in major subfields of philosophy bear directly on challenges in AI. Instead of “embedded ethicists,” we argue, AI courses need embedded philosophers.

Explainable AI and Ancient Values

[KMV] Contrary to modern moral philosophy, ancient Greek ethics argues that values related to knowledge and understanding—in Greek, epistêmê—are fundamental to human life. This project revives this conviction and argues that it speaks to key concerns in AI. AIs decide or help decide who gets an interview for a job, whose loan application is approved, what a patient’s medical diagnosis is and what their best treatment options are, and so forth. Much of the research in the ethics of AI is about the fairness of these decisions. We argue that research should also address epistemic values such as understanding and explainability. We aim to contribute to so-called explainable AI (XAI), which appreciates that it is a basic feature of the human mind to ask and expect answers to why-questions.

Reasons, Causes, and Explainable AI

[TP] It is widely recognized that users need to be able to understand AI. This is the main goal of explainable AI research. But what is it, exactly, that users need to be able to understand? Presumably, a user needs to understand how a system makes a decision; they need to understand the so-called “black box,” that elusive function that takes input (data) and gives output (a decision, such as to reject a loan application). There are importantly different ways to understand a decision, however, that correspond to different ways of understanding and answering “why”-questions. Say you have a friend who believes that God exists and you ask her why she thinks so. She might answer by explaining the various things going on in the brain that figure in forming the belief, while clarifying that this is all based on models and hypotheses; no one in the field really knows what happened in her mind precisely when she formed the belief “God exists.” This would be analogous to the black-box problem. Alternatively, rather than a mechanical cause, she might answer with a social cause: because my parents brought me up to believe that God exists. Or, she might give you a reason in support of her belief: perhaps that God is the best explanation for the existence of our cosmos.

These distinctions could be important for research and development in explainable AI. At minimum, they help define what is at issue. This project aims to clarify the roles of these different “why”-questions and responses. In doing so, it asks whether and why the black-box problem is a genuine problem. Would answers to the other “why”-questions be sufficient for the individual questioner who is subject to AI decisions? If not, what is missing?

Alignment and Confucian Ethics

[KMV] This project explores the resources of Confucian ethics for AI development, specifically with a view to value-alignment. It is widely assumed that ethics should inform AI design. Which ethics are we talking about? Researchers often invoke consequentialism, Kantian ethics, and Aristotle. Thus AI ethics seems to find itself squarely in the Western tradition. By contrast, we ask how Confucian philosophy may contribute to the development of AIs. Confucian ethics offers a framework that attends to (i) roles and relationships, (ii) constraints on what is sayable/doable for a polite person, and (iii) specified situations. For example, a son would not say such-and-such to a father at such-and-such an occasion. It is our hypothesis that this structure of ethical theory is helpful for AI ethics. For example, LLMs need to be trained on what not to say. Some of what counts as unsayable, offensive, and so on, is addressee- and context-dependent. We also consider AI models that are used in medicine, education, and so on. In these contexts, people speak as doctor to patient, as teacher to student, and so on. Presumably, AIs used in such domains should be informed by roles and relationships, conventions of the sayable/doable, and situations. In other words, Confucian ethics may provide a structure that can be modeled in AIs.

Student Papers

Joseph Benjamin Karaganis

A Sentimental Education… For Robots: Incommensurability and the Bounds of Artificial Reason. This paper compares and evaluates two distinct proposals for the ‘alignment’ of sophisticated AI models: Peter Railton’s ‘ethical learning’ approach and Ruth Chang’s argument that humans should be ‘put in the loop’ when AI agents face ‘hard choices’. I suggest that while both of these strategies have their limitations, they each address an important piece of the broader alignment puzzle, and do so in complementary ways. Thus, an alignment approach that combines the two—and preserves their most important insights—is likely to be more successful than either pursued individually.

In Defense of Bibliotechnism: Immoderate Interpretationism in the Philosophy of AI. This paper defends ‘Bibliotechnism’, the view that all LLM-produced text is semantically derivative and, therefore, that LLMs lack intentional attitudes. While some have argued that LLMs can create ‘novel references’, I suggest that these are actually derivative with respect to the model’s original dataset. I further claim that even if LLMs could generate original semantic content, this would not warrant the attribution of intentionality.

Oscar Alexander Lloyd

Normativity in Large Language Models. This project examines whether and how normativity transfers in Large Language Models between the input data and output text. It proposes a relational rather than semantic understanding of the ‘reasoning’ performed by these models, based on their inability to take into account both sense and reference. This means that text generated by an LLM, which we might ordinarily treat as intuitively motivational, lacks normative power, and we should treat it accordingly.

Henry Michaelson

From Panopticon to Protocol: Reimagining Social Contract Theory in the Age of Web3. Through this paper, I hope to explore and establish that the conditions under which the social contract theories were formulated in the pre-digital world no longer apply. Large technology companies have systematically attempted to erode and eradicate the moral and political institutions—first and foremost the state—that were theorized within the framework of a social contract. Rather than argue that the notion of a social contract no longer makes sense, I contend that the rise of Web3 technology marks a shift not back to the previous status quo, but toward a more practical and reinvigorated social contract, both politically and morally.