Kansas State University

AI Safety Research Initiative

Category: Papers

New Paper: A Psychopathological Approach to Safety Engineering in AI and AGI

The preprint of our new paper, written by Vahid Behzadan and Dr. Arslan Munir in collaboration with Prof. Roman Yampolskiy of the University of Louisville, has been made available to the public. The abstract of the paper, titled “A Psychopathological Approach to Safety Engineering in AI and AGI”, is as follows:

The complexity of dynamics in AI techniques is already approaching that of complex adaptive systems, thus curtailing the feasibility of formal controllability and reachability analysis in the context of AI safety. It follows that the envisioned instances of Artificial General Intelligence (AGI) will also suffer from challenges of complexity. To tackle such issues, we propose the modeling of deleterious behaviors in AI and AGI as psychological disorders, thereby enabling the employment of psychopathological approaches to analysis and control of misbehaviors. Accordingly, we present a discussion on the feasibility of the psychopathological approaches to AI safety, and propose general directions for research on modeling, diagnosis, and treatment of psychological disorders in AGI.

The full text of this paper is available here.

New Paper: Whatever does not kill deep reinforcement learning, makes it stronger

Abstract: Recent developments have established the vulnerability of deep Reinforcement Learning (RL) to policy manipulation attacks via adversarial perturbations. In this paper, we investigate the robustness and resilience of deep RL to training-time and test-time attacks. Through experimental results, we demonstrate that under noncontiguous training-time attacks, Deep Q-Network (DQN) agents can recover and adapt to the adversarial conditions by reactively adjusting the policy. Our results also show that policies learned under adversarial perturbations are more robust to test-time attacks. Furthermore, we compare the performance of ϵ-greedy and parameter-space noise exploration methods in terms of robustness and resilience against adversarial perturbations.
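For readers unfamiliar with the two exploration methods compared in the abstract, the following is a minimal sketch of how they differ, not the code used in the paper. ϵ-greedy injects randomness at the action level, while parameter-space noise perturbs the weights of the Q-function and then acts greedily. The toy linear Q-function, the function names, and the values of `epsilon` and `sigma` are all assumptions made for illustration; the paper itself uses DQN agents.

```python
import random


def linear_q(weights, state):
    """Toy linear Q-function, one weight row per action.

    Hypothetical stand-in for a DQN; returns one Q-value per action.
    """
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


def greedy(q_values):
    """Index of the largest Q-value (first index wins ties)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def epsilon_greedy(q_values, epsilon, rng):
    """Action-space exploration: with probability epsilon, act uniformly at random."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return greedy(q_values)


def perturb(weights, sigma, rng):
    """Return a copy of the weights with zero-mean Gaussian noise added to each entry."""
    return [[w + rng.gauss(0.0, sigma) for w in row] for row in weights]


def param_noise_action(weights, state, sigma, rng):
    """Parameter-space exploration: perturb the weights, then act greedily."""
    return greedy(linear_q(perturb(weights, sigma, rng), state))


if __name__ == "__main__":
    rng = random.Random(0)             # fixed seed so runs are repeatable
    weights = [[1.0, 0.0], [0.0, 1.0]]  # assumed toy parameters, two actions
    state = [0.9, 0.1]
    print(epsilon_greedy(linear_q(weights, state), 0.1, rng))
    print(param_noise_action(weights, state, 0.05, rng))
```

This sketch resamples the weight noise on every call to keep the example short. In practice, parameter-space noise is often held fixed over a whole episode so that exploration stays consistent across time steps.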

Read the preprint draft here.