AI Secure (@AI-secure)

UIUC Secure Learning Lab

Popular repositories

  1. DecodingTrust

    A Comprehensive Assessment of Trustworthiness in GPT Models

    Python · 266 stars · 57 forks

  2. DBA

    DBA: Distributed Backdoor Attacks against Federated Learning (ICLR 2020)

    Python · 178 stars · 45 forks

  3. Certified-Robustness-SoK-Oldver

    This repo tracks popular provable training and verification approaches for robust neural networks, including leaderboards on popular datasets and a paper categorization.

    99 stars · 10 forks

  4. VeriGauge

    A unified toolbox for running major robustness verification approaches for DNNs. [S&P 2023]

    C · 88 stars · 7 forks

  5. InfoBERT

    [ICLR 2021] "InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective" by Boxin Wang, Shuohang Wang, Yu Cheng, Zhe Gan, Ruoxi Jia, Bo Li, Jingjing Liu

    Python · 83 stars · 7 forks

  6. AgentPoison

    [NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning"

    Python · 74 stars · 5 forks

Repositories

Showing 10 of 55 repositories
  • RedCode

    [NeurIPS'24] RedCode: Risky Code Execution and Generation Benchmark for Code Agents

    Python · 14 stars · 1 fork · 0 open issues · 0 open PRs · Updated Dec 10, 2024
  • AgentPoison

    [NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning"

    Python · 74 stars · MIT license · 5 forks · 4 open issues · 0 open PRs · Updated Dec 8, 2024
  • aug-pe

    [ICML 2024 Spotlight] Differentially Private Synthetic Data via Foundation Model APIs 2: Text

    Python · 29 stars · Apache-2.0 license · 7 forks · 0 open issues · 0 open PRs · Updated Nov 12, 2024
  • AdvWeb

    Jupyter Notebook · 6 stars · 0 forks · 1 open issue · 0 open PRs · Updated Oct 30, 2024
  • FedGame

    Official implementation for the paper "FedGame: A Game-Theoretic Defense against Backdoor Attacks in Federated Learning" (NeurIPS 2023).

    Python · 5 stars · MIT license · 0 forks · 1 open issue · 0 open PRs · Updated Oct 25, 2024
  • VFL-ADMM

    Improving Privacy-Preserving Vertical Federated Learning by Efficient Communication with ADMM (SaTML 2024)

    Python · 0 stars · Apache-2.0 license · 0 forks · 0 open issues · 0 open PRs · Updated Oct 21, 2024
  • DecodingTrust

    A Comprehensive Assessment of Trustworthiness in GPT Models

    Python · 266 stars · CC-BY-SA-4.0 license · 57 forks · 11 open issues · 2 open PRs · Updated Sep 16, 2024
  • MMDT

    Comprehensive Assessment of Trustworthiness in Multimodal Foundation Models

    Jupyter Notebook · 7 stars · 2 forks · 0 open issues · 0 open PRs · Updated Aug 13, 2024
  • helm (forked from stanford-crfm/helm)

    Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110).

    Python · 0 stars · Apache-2.0 license · 260 forks · 0 open issues · 2 open PRs · Updated Jun 12, 2024
  • DPFL-Robustness

    [CCS 2023] Unraveling the Connections between Privacy and Certified Robustness in Federated Learning Against Poisoning Attacks

    Python · 6 stars · 0 forks · 0 open issues · 0 open PRs · Updated Feb 15, 2024