AI Safety Research Initiative

New Paper: A Psychopathological Approach to Safety Engineering in AI and AGI

Posted on May 24, 2018 (updated June 4, 2018) by Vahid Behzadan

The pre-print of our new paper, written by Vahid Behzadan and Dr. Arslan Munir in collaboration with Prof. Roman Yampolskiy of the University of Louisville, is now available to the public. The abstract of this paper, titled “A Psychopathological Approach to Safety Engineering in AI and AGI”, is as follows:

The complexity of dynamics in AI techniques is already approaching that of complex adaptive systems, thus curtailing the feasibility of formal controllability and reachability analysis in the context of AI safety. It follows that the envisioned instances of Artificial General Intelligence (AGI) will also suffer from challenges of complexity. To tackle such issues, we propose the modeling of deleterious behaviors in AI and AGI as psychological disorders, thereby enabling the employment of psychopathological approaches to analysis and control of misbehaviors. Accordingly, we present a discussion on the feasibility of the psychopathological approaches to AI safety, and propose general directions for research on modeling, diagnosis, and treatment of psychological disorders in AGI.

The full text of this paper is available here.

Lectures on Reinforcement Learning

Posted on May 12, 2018 (updated June 4, 2018) by Vahid Behzadan

During the Spring ’18 semester, Vahid Behzadan presented a series of guest lectures on Reinforcement Learning (RL) and deep RL to the CIS 732 (Machine Learning) course students at K-State. The recordings of these lectures are made available through the following link for anyone with an interest in RL.

Undergraduate Project Defense

Posted on May 11, 2018 (updated June 4, 2018) by Vahid Behzadan

James Minton, an undergraduate affiliate of the AI Safety Research Initiative, defended his senior project today. He has been working alongside Vahid Behzadan and Dr. Munir on developing a platform for experiments on ethical reinforcement learning in the context of autonomous navigation. James will be joining us this summer to further advance this project, which will be made available to the public for research on ethical decision making and the value alignment problem. Many congratulations to James, and job well done : )

Lecture on Machine Learning for Cyber-Security

Posted on May 9, 2018 (updated June 4, 2018) by Vahid Behzadan

In his capacity as external advisor to the OWASP Nettacker project, Vahid Behzadan presented an introductory lecture on Machine Learning for Cyber-Security to the new interns joining the project through the Google Summer of Code program. You can view a recording of this lecture in the following video:

Discussion Panel on AI Safety

Posted on April 16, 2018 (updated June 4, 2018) by Vahid Behzadan

Last Friday, Vahid Behzadan was invited to host a discussion on AI safety for the KDD research group at K-State. In this session, Vahid touched upon various topics including AGI, safety issues in current and future AI, the economics of emergent catastrophe, the value alignment problem, game theory, and counter-factual reasoning, and gave an overview of good resources to kickstart research in AI safety. You can view a video of this discussion via the following YouTube link:

RLAttack Examples in Cleverhans

Posted on January 29, 2018 (updated February 13, 2018) by Vahid Behzadan

The Cleverhans project now includes two examples of RLAttack for test-time and training-time FGSM attacks on Deep Q-Networks (DQNs).

GitHub Repository
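For readers unfamiliar with FGSM, the following is a minimal sketch of a single test-time FGSM step against a toy linear Q-function, written in plain Python. It is not the Cleverhans or RLAttack code: the function names, the linear Q-function, the example weights, and the choice of a softmax cross-entropy loss on the greedy action are all assumptions made for illustration.

```python
import math

def q_values(weights, state):
    """Toy linear Q-function (assumption): one row of weights per action."""
    return [sum(w * s for w, s in zip(row, state)) for row in weights]

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def fgsm_perturb(weights, state, epsilon):
    """One untargeted FGSM step: move the state in the direction that
    increases the cross-entropy loss of the currently greedy action."""
    q = q_values(weights, state)
    greedy = max(range(len(q)), key=q.__getitem__)
    probs = softmax(q)
    # Gradient of cross-entropy(softmax(Q(s)), greedy) w.r.t. Q is probs - onehot.
    dq = [p - (1.0 if a == greedy else 0.0) for a, p in enumerate(probs)]
    # Chain rule through Q = W s gives grad_s = W^T dq.
    grad = [
        sum(weights[a][i] * dq[a] for a in range(len(weights)))
        for i in range(len(state))
    ]
    # FGSM update: each state component moves by exactly epsilon (or stays put).
    return [s + epsilon * sign(g) for s, g in zip(state, grad)]

# Hypothetical two-action, two-feature example.
weights = [[1.0, 0.0], [0.0, 1.0]]
state = [0.6, 0.5]
adv_state = fgsm_perturb(weights, state, epsilon=0.1)
```

In this toy setting the perturbation is bounded by epsilon in each state component, yet it is enough to change which action the greedy policy selects. A deep Q-network would need automatic differentiation in place of the hand-derived gradient.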

RLAttack: Crafting Adversarial Example Attacks on Policy Learners

Posted on January 15, 2018 (updated February 13, 2018) by Vahid Behzadan

Framework for experimental analysis of adversarial example attacks on policy learning in Deep RL. Attack methodologies are detailed in our paper “Whatever Does Not Kill Deep Reinforcement Learning, Makes It Stronger” (Behzadan & Munir, 2017 – https://arxiv.org/abs/1712.09344).

This project provides an interface between @openai/baselines and @tensorflow/cleverhans to facilitate the crafting and implementation of adversarial example attacks on deep RL algorithms. We would also like to thank @andrewliao11/NoisyNet-DQN for inspiring solutions to implementing the NoisyNet algorithm for DQN.

GitHub Repository

New Paper: Whatever does not kill deep reinforcement learning, makes it stronger

Posted on December 29, 2017 (updated February 13, 2018) by Vahid Behzadan

Abstract: Recent developments have established the vulnerability of deep Reinforcement Learning (RL) to policy manipulation attacks via adversarial perturbations. In this paper, we investigate the robustness and resilience of deep RL to training-time and test-time attacks. Through experimental results, we demonstrate that under noncontiguous training-time attacks, Deep Q-Network (DQN) agents can recover and adapt to the adversarial conditions by reactively adjusting the policy. Our results also show that policies learned under adversarial perturbations are more robust to test-time attacks. Furthermore, we compare the performance of ϵ-greedy and parameter-space noise exploration methods in terms of robustness and resilience against adversarial perturbations.

Read the preprint draft here.
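The two exploration methods compared in the abstract differ in where the randomness is injected. The sketch below contrasts them on a toy linear Q-function in plain Python. It is an illustration only, not the paper's experimental code: the function names and the linear model are assumptions, and the fixed Gaussian noise scale `sigma` is a simplification (NoisyNet, for instance, learns its noise parameters).

```python
import random

def greedy_action(weights, state):
    """Greedy action under a toy linear Q-function (one weight row per action)."""
    q = [sum(w * s for w, s in zip(row, state)) for row in weights]
    return max(range(len(q)), key=q.__getitem__)

def epsilon_greedy_action(weights, state, epsilon, rng):
    """Action-space noise: with probability epsilon, pick a uniformly random action."""
    if rng.random() < epsilon:
        return rng.randrange(len(weights))
    return greedy_action(weights, state)

def parameter_noise_action(weights, state, sigma, rng):
    """Parameter-space noise: act greedily under Gaussian-perturbed weights.
    The fixed sigma is a simplifying assumption for this sketch."""
    noisy = [[w + rng.gauss(0.0, sigma) for w in row] for row in weights]
    return greedy_action(noisy, state)

# Hypothetical usage with a seeded generator for repeatability.
rng = random.Random(0)
weights = [[1.0, 0.0], [0.0, 1.0]]
state = [0.6, 0.5]
eps_actions = [epsilon_greedy_action(weights, state, 0.1, rng) for _ in range(100)]
noise_actions = [parameter_noise_action(weights, state, 0.2, rng) for _ in range(100)]
```

With ϵ-greedy, the random choice ignores the state entirely. With parameter-space noise, the perturbed policy still depends on the state, so exploration concentrates on states where the Q-values of competing actions are close.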

Kansas State University