Towards Safe and Ethical AI

Authors

  • Johann Lee, Cornell University
  • Darynne Lee, Stanford University

DOI:

https://doi.org/10.60690/1v7dy054

Keywords:

AI benchmarking, AI safety, MLCommons AI Safety Benchmark, Perspective API, Benchmark-driven evaluation

Abstract

As large pre-trained language models become prevalent, preventing biased and hateful outputs related to race and gender is increasingly critical. Because current initiatives are fragmented, this review outlines the latest methods for measuring safe, ethical AI and discusses their limitations. By spotlighting the proper use and the challenges of state-of-the-art methods, this review seeks to foster continued discourse and innovation among both technical developers and non-technical policymakers.

[Image: a robot measuring models, with a C+ report card that says "getting better"]

Published

2025-04-03

Section

Research and Technology Reviews