The development of ethical artificial intelligence has attracted intense attention in recent years, with numerous frameworks emerging from governments, non-profits, and private industry. These frameworks aim to reduce or eliminate the harms linked to AI adoption, but translating them into practice has proven challenging. One consequence is the accumulation of “ethical debt”: by analogy with technical debt, ethical consequences are set aside during development and left to manifest only after deployment. As the Ad Hoc Committee on Responsible Computing stated, “The people who design, develop, or deploy a computing artifact are morally responsible for that artifact, and for the foreseeable effects of that artifact.”

Regulatory measures, particularly in Europe, are beginning to codify this responsibility, enabling lawsuits against AI developers for harms or imposing fines on platforms that fail to manage those harms. The United States has introduced an “AI Bill of Rights,” though critics note that, as non-binding guidance, its practical reach is limited. Yet foreseeing harms is complicated by emergent behaviors: properties that arise from interactions within sociotechnical systems and cannot be predicted by examining components in isolation. At the individual level, AI integration can alter trust dynamics and workflow patterns; at the system level, it can reshape the nature of work and collaboration.
Legal doctrines around liability turn on whether harms could reasonably have been foreseen: if emergent harms are deemed unforeseeable, developers may escape accountability. While explainability is often proposed as a solution, scholars such as Selbst argue that individuals cannot be expected to fully understand AI systems, regardless of how interpretable those systems are made. Cofone suggests classifying AI by its level of emergence to better assess predictability.
To improve foresight, principles from Naturalistic Decision Making (NDM) offer a promising approach. NDM studies how experienced professionals make decisions under realistic conditions characterized by complexity, uncertainty, and time pressure. Unlike laboratory-based microcognitive paradigms, NDM examines how experts recognize cues, diagnose anomalies, and mentally simulate outcomes in novel situations. This macrocognitive perspective enables the study of emergent phenomena such as feedback loops and self-organization, and has been applied in aviation, healthcare, firefighting, intelligence analysis, and military contexts.
Among NDM tools, the premortem technique stands out. A premortem reverses the logic of a postmortem: participants assume the project has already failed and quickly list reasons why. The facilitator then solicits one risk at a time from each participant, recording only items not already raised, a process that often uncovers creative, interdisciplinary insights spanning human and technological factors. Such risks might include user behavior patterns or long-term cultural shifts that would otherwise be overlooked. Studies have shown premortems can identify personnel, policy, operational, and organizational risks, making them valuable for predicting ethical harms from AI in complex systems.
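To make the facilitation loop concrete, the sketch below shows one way a premortem session’s output might be captured and deduplicated. The data structure, function names, and category labels are illustrative assumptions rather than an established tool; the categories echo the personnel, policy, operational, and organizational risks noted above.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Risk:
    """One failure reason offered by a premortem participant."""
    participant: str
    description: str
    category: str  # hypothetical labels, e.g. "personnel", "policy",
                   # "operational", "organizational"

def collect_unique_risks(rounds: list[list[Risk]]) -> list[Risk]:
    """Mimic the facilitator's pass: go round by round, recording a
    risk only if its description has not been heard before."""
    seen: set[str] = set()
    recorded: list[Risk] = []
    for round_ in rounds:
        for risk in round_:
            key = risk.description.strip().lower()
            if key not in seen:
                seen.add(key)
                recorded.append(risk)
    return recorded

def risks_per_category(risks: list[Risk]) -> Counter:
    """Tally where the session found exposure (e.g. mostly policy risks)."""
    return Counter(risk.category for risk in risks)
```

Even a simple tally like this can show a team at a glance whether its imagined failures cluster in one area or span the interdisciplinary range that makes premortems valuable.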
Other NDM methods include analytic wargames, which allow experienced participants to explore disruptive technologies and observe emergent behaviors, and the critical incident technique, used to create evidence-based checklists for AI developers. Newer approaches such as Systematic Contributors and Adaptation Diagramming (SCAD) and Joint Activity Monitoring (JAM) aim to identify systemic issues proactively, though they rely on data from existing systems and may not fully anticipate novel AI impacts.
Predictive policing provides a telling case study. Intended to reduce crime and improve policing efficiency, these AI systems have instead generated ethical harms, including disproportionate targeting of certain groups and threats to privacy. Contributing factors include poorly annotated or manipulated data, biased training sets, and unclear guidance for interpreting AI outputs. Premortems could have surfaced many of these risks before deployment by bringing diverse perspectives to bear on potential failure points.
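As one concrete illustration of a failure point a premortem could flag in advance, the sketch below checks whether a predictive model flags some groups far more often than others. The column names and toy data are hypothetical assumptions for illustration, not drawn from any deployed system.

```python
# Hypothetical pre-deployment audit: compare model flag rates across groups.
import pandas as pd

def flag_rate_by_group(predictions: pd.DataFrame,
                       group_col: str = "district",
                       flag_col: str = "flagged") -> pd.Series:
    """Share of records the model flags within each group."""
    return predictions.groupby(group_col)[flag_col].mean()

def disparate_impact_ratio(rates: pd.Series) -> float:
    """Lowest group flag rate divided by the highest; values far below
    1.0 suggest the model targets one group disproportionately."""
    return rates.min() / rates.max()

if __name__ == "__main__":
    # Toy data standing in for model output joined to group membership.
    df = pd.DataFrame({
        "district": ["A", "A", "A", "B", "B", "B"],
        "flagged":  [1,   1,   0,   1,   0,   0],
    })
    rates = flag_rate_by_group(df)
    print(rates)                          # A: 0.67, B: 0.33
    print(disparate_impact_ratio(rates))  # 0.5, well below parity
```

A check like this does not resolve the deeper problems of biased training data or feedback loops, but it is exactly the kind of measurable failure condition a premortem can turn into a requirement before deployment.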
While NDM tools cannot guarantee exhaustive risk identification, they can reveal a reasonable set of emergent harms, helping developers address ethical debt before it accumulates. The challenge remains to integrate these tools into AI development processes in ways that inform requirements, design, and evaluation. NDM’s track record in fostering trustworthy and resilient systems suggests it has significant value for AI ethics, even if it is not a universal solution.
