Research

Our research spans six key areas in AI security.

AI Agent Security

Securing autonomous AI agents against adversarial manipulation, prompt injection, and unintended behaviors in real-world deployments.


Interpretable AI Security

Leveraging interpretability and explainability techniques to understand, diagnose, and mitigate vulnerabilities in AI systems.


Usable Security of AI

Designing intuitive security mechanisms and interfaces that help users safely interact with and configure AI systems.


AI Misuse Measurement

Characterizing how AI is misused in the wild, from MCP server poisoning to adversarial exploitation of AI-powered services, and developing methods to measure such misuse at scale.


AI Society Security

Studying the safety implications of deploying AI systems as participants in social networks and other human-like environments, including how they navigate moral dilemmas, social norms, and trust dynamics when acting as autonomous social agents.


AI for Security

Applying AI techniques to strengthen cybersecurity defenses, including automated vulnerability repair, threat detection, and security analysis.