Regulating LLMs in Warfare: A U.S. Strategy for Military AI Accountability
DOI: https://doi.org/10.60690/g2m7y037

Keywords: AI, defense, military, AI Policy, Large Language Models, Technology Regulation

Abstract
Large language models (LLMs) are rapidly entering military workflows that shape intelligence synthesis, operational planning, logistics, cyber operations, and information activities, yet U.S. governance has not kept pace with their distinct risk profile. This memo argues that existing frameworks remain ill-suited to LLM-enabled decision support: international efforts under the UN Convention on Certain Conventional Weapons focus primarily on lethal autonomous weapons, while U.S. policy relies on high-level ethical principles that have not been operationalized into enforceable requirements for evaluation, monitoring, logging, and lifecycle control. The memo identifies four core risks arising from LLM deployment in high-consequence contexts: inadvertent escalation driven by overconfident or brittle recommendations under uncertainty; scalable information operations and disinformation; expanded security vulnerabilities including data poisoning, prompt injection, and sensitive-data leakage; and accountability gaps when human actors defer responsibility to opaque model outputs. In response, the memo proposes a U.S. regulatory framework organized around four pillars: (1) human decision rights and escalation controls, including documented authorization for crisis-sensitive uses; (2) mandatory human review and traceability for information-operations content; (3) baseline security, data governance, and continuous adversarial testing for training and deployment pipelines; and (4) accountability mechanisms, including auditable logs and incident reporting overseen by an independent Military AI Oversight Committee. The memo concludes that LLM-specific guardrails complement, rather than displace, existing weapons autonomy policy and would strengthen U.S. credibility in shaping international norms for responsible military AI.

This paper was submitted to Dr. Cynthia Bailey's course CS121 Equity and Governance for Artificial Intelligence, Stanford University.