Research


[Contributors: Qian Cao, Susan Danziger, Jens Haas, Lucas Haugeberg, Anthony Dyson Hejduk, Elliot Blake Hueske, Wooseok Kim, Helen Han Wei Luo, Tahlia Pajaczkowska-Russell, Taylor Pincin, Jonathan Tanaka, Katja Maria Vogt, Albert Wenger]

v2024.07.09


Agency

— Are current models built as if the AI pursues its own ends? If yes, does this call for a radical shift toward models oriented toward human final ends?

— If AIs can have values or posited ends, should there be a built-in, strict dominance of human values to AI values?

— Should AIs try to emulate the way in which human decision-making is fundamentally concerned with sustaining human life and guided by what agents take to be well-lived human lives?

— Who is responsible for decisions that the AI makes? Is the AI itself responsible, or are its creators responsible for the AI’s decisions?

— Is AI as a technology “value neutral,” simply reflective of the values of its creators and the data it is trained on? Alternatively, is it imbued with a bent toward, or away from, particular values?


Ethical Values

— Are there values, for example, related to the survival of humankind, that neither human beings nor intelligent machines should override?

— What notion(s) of fairness do AI researchers employ? 

— When researchers describe AIs as fair or just, do they invoke agential or systemic notions? In other words, should we think of AIs as agents in the world who can have virtues such as justice or fairness, or as components of the social environments that constrain human action?

— Can the special weight of moral considerations in human reasoning be simulated by AIs?

— Ethics is concerned with how human beings should live. Does this mean that AIs should ask “what should a human agent do?” rather than “what should one do?”

— Would it be appropriate for human beings to defer ethical decision-making to AIs? Does this manner of decision-making threaten human autonomy?


Truth and Other Epistemic Values

— What is the role of epistemic norms, for example, norms that request attention to evidence, careful thinking, etc., in AIs?

— Is sensitivity to value an additional, separable dimension of AIs, to be added to existing systems? Alternatively, are “ethical abilities” integrated dimensions of the “thinking abilities” of AIs, such that they improve along with them?

— What is an LLM’s relationship to the truth or falsity of its outputs? What does it mean for an LLM to “tell the truth,” “lie,” “hallucinate,” etc.? Can AIs have virtues such as honesty?

— How can LLMs distinguish between domains where responses to prompts should draw on expertise, and domains such as ethics where it is not immediately obvious what constitutes expertise? In the former, experts tend to agree; in the latter, even experts disagree.

— How do we assess what constitutes good thinking in human beings? Should AIs emulate excellent human thinking, or do they come with their own standards of excellence?


Credences and Risk

— Suppose that AIs should be designed such that they assign probabilistically coherent credences to propositions or to surrogates for propositions. What ought to be the credence thresholds for action or belief reports? 

— What should guide AI designers with regard to credence thresholds? Risk aversion?

— Should AIs be designed primarily to avoid assigning high credences to falsehoods, or should they primarily aim to assign high credences to truths?

— Can we afford to design AIs to make mistakes from time to time, or should all high credence assignments result from extremely good epistemic positions?


Mental States

— We don’t know whether AIs will ever have mental states such as intentions and beliefs. Lying arguably involves both: an intention to deceive and saying something one believes to be false. Is it a mere metaphor to describe AI outputs in terms of truth-telling versus lying?

— Should we, instead of asking whether AIs can have beliefs and intentions, ask whether we must come up with broader notions of belief and intention, which can encompass human states and AI states?

— An AI system may be said to have “information.” Does it “believe” the things it can provide as information? Does it “know” them? What are the functional analogues to mental states such as belief and knowledge?

— How do questions of interpretability bear on ethics? For example, are ethical questions ones where it is especially important to not only have the answer, but also to understand how the answer was generated? 

— Should AI models try to emulate the roles of emotion and desiderative/aversive attitudes in human decision-making? If yes, how?


Flawed Thinking

— Should value integration start from the premise that intelligent machines ought to help us become better thinkers? How does this relate to the frequently asked question of whether intelligent machines are, or will soon become, better thinkers than we are?

— Via the corpora of text, images, etc., that LLMs ingest, they inherit flaws of human thinking, e.g., jumping to conclusions, fallacies. Which dimensions of human reasoning do we want to reproduce? Which dimensions of human reasoning can be improved with the help of machines?

— In human beings, informal fallacies are often treated as “shortcuts” or “fast track” thinking, saving time and mental energy in resource-limited environments. Are we aiming for AIs without any such “shortcuts”? If yes, this would constitute a major difference between human and machine reasoning.

— Does an AI possess “inherent knowledge,” prior to ingesting data? Does the algorithm itself qualify as such? 


Pluralism and Disagreement

— How can computational intelligence recognize value pluralism, disagreement, and historical changes in evaluative outlooks?

— Is there a role for the use of different outputs from different LLMs? Do LLMs that are tailored to particular viewpoints generate “echo chamber” problems?

— How can AIs recognize that human evaluative outlooks tend to be fragmented and inconsistent? Should they seek to correct these inconsistencies?

— What is the effect of adding an extensive curriculum in ethics, including works that defend a range of approaches, to the training of an AI model?