Advances in AI Safety and Risk Assessment

The field of artificial intelligence is evolving rapidly, and with it grows the need for robust safety and risk assessment protocols. Recent research has focused on new approaches to evaluating and mitigating the risks of AI systems. One key area is the development of frameworks and standards for assessing AI risk, such as the IEEE P3396 Recommended Practice for AI Risk, Safety, Trustworthiness, and Responsibility. Another line of work applies lessons from traditional personnel security to the emerging domain of AI insider risk. Researchers are also developing semi-automated labeling methods to improve fault detection efficiency in railroad videos, an approach with direct relevance to other safety-critical domains; a sketch of this workflow follows below. Noteworthy papers include 'A First-Principles Based Risk Assessment Framework and the IEEE P3396 Standard', which presents a rigorous, first-principles foundation for assessing generative AI risks, and 'Epistemic Closure and the Irreversibility of Misalignment', which introduces a functional model of epistemic closure and its implications for alignment innovation. Together, these studies demonstrate the progress being made in addressing the complex risks and challenges associated with AI systems.
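
To make the semi-automated labeling idea concrete, the sketch below shows one minimal, hypothetical version of such a workflow, assuming the ultralytics YOLO package: a pretrained detector proposes bounding boxes on video frames, high-confidence detections become draft labels for human verification, and mid-confidence detections are queued for manual review. The checkpoint name and confidence thresholds are illustrative assumptions, not details taken from the cited papers.

```python
# Hypothetical sketch of YOLO-based semi-automated labeling: the detector
# proposes boxes, high-confidence detections are kept as draft labels, and
# borderline detections are routed to a human reviewer. Assumes the
# `ultralytics` package; weights and thresholds are placeholders.
from ultralytics import YOLO

CONF_ACCEPT = 0.80   # above this, keep the detection as a draft label
CONF_REVIEW = 0.30   # between REVIEW and ACCEPT, queue for human review

def propose_labels(video_path: str):
    model = YOLO("yolov8n.pt")  # placeholder weights; a rail-specific model would be used
    drafts, review_queue = [], []
    # stream=True yields one result per frame without loading the whole video
    for frame_idx, result in enumerate(model.predict(video_path, stream=True)):
        for box in result.boxes:
            conf = float(box.conf)
            record = {
                "frame": frame_idx,
                "xyxy": box.xyxy[0].tolist(),        # [x1, y1, x2, y2]
                "class": result.names[int(box.cls)],
                "conf": conf,
            }
            if conf >= CONF_ACCEPT:
                drafts.append(record)        # auto-accepted draft label
            elif conf >= CONF_REVIEW:
                review_queue.append(record)  # needs a human decision
            # below CONF_REVIEW: discarded; frames can still be sampled
            # for fully manual labeling if recall is critical
    return drafts, review_queue
```

Splitting detections by confidence is what drives the efficiency gain in this style of pipeline: annotators verify or correct draft boxes rather than drawing every box from scratch.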

Sources

What Makes an Evaluation Useful? Common Pitfalls and Best Practices

I'm Sorry Dave: How the old world of personnel security can inform the new world of AI insider risk

A First-Principles Based Risk Assessment Framework and the IEEE P3396 Standard

RailGoerl24: Görlitz Rail Test Center CV Dataset 2024

A YOLO-Based Semi-Automated Labeling Approach to Improve Fault Detection Efficiency in Railroad Videos

Who is Responsible When AI Fails? Mapping Causes, Entities, and Consequences of AI Privacy and Ethical Incidents

An Approach to Technical AGI Safety and Security

Exploring the Societal and Economic Impacts of Artificial Intelligence: A Scenario Generation Methodology

Epistemic Closure and the Irreversibility of Misalignment: Modeling Systemic Barriers to Alignment Innovation

How humans evaluate AI systems for person detection in automatic train operation: Not all misses are alike
