Repositories - 烈火元天 (flame-yuan-tian)

烈火元天/DBPA

Statistical Hypothesis Testing for Auditing Robustness in Language Models

Python

烈火元天/Open-Prompt-Injection

DataSentinel: A Game-Theoretic Detection of Prompt Injection Attacks

Python

Last updated: 7 months ago

烈火元天/Jailbreak-in-twenty-queries

Black-box jailbreak attack on large language models (PAIR)

Python

Last updated: 9 months ago

烈火元天/RoleLLM-public

RoleLLM: a role-playing large language model

Python

Last updated: 9 months ago

烈火元天/ASTRA

Steering Away from Harm: An Adaptive Approach to Defending Vision Language Model Against Jailbreaks

Python

Last updated: 10 months ago

烈火元天/2025_ICLR_PiF

Perceived-importance Flatten (PiF) Attack: Understanding and Enhancing the Transferability of Jailbreaking Attacks

Python

Last updated: 10 months ago

烈火元天/MeZO

Fine-Tuning Language Models with Just Forward Passes

Python

Last updated: 10 months ago

烈火元天/DPZero

Private Fine-Tuning of Language Models without Backpropagation

Python

Last updated: 10 months ago

烈火元天/image-hijacks

Image Hijacks: Adversarial Images can Control Generative Models at Runtime

Python

Last updated: 10 months ago

烈火元天/VLAttack

VLAttack: Multimodal Adversarial Attacks on Vision-Language Tasks via Pre-trained Models

Python

Last updated: 10 months ago

烈火元天/AdvDiffVLM

Last updated: 10 months ago

烈火元天/AttackVLM

On Evaluating Adversarial Robustness of Large Vision-Language Models

Python

Last updated: 10 months ago

烈火元天/ACT

Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration

Python

Last updated: 10 months ago

烈火元天/Jailbreaking-Attack-against-Multimodal-Large-Language-Model

https://github.com/abc03570128/Jailbreaking-Attack-against-Multimodal-Large-Language-Model

Python

Last updated: 10 months ago

烈火元天/Hallucination-Attack

Hallucination attacks carried out via targeted adversarial attacks

Python

Last updated: 11 months ago

烈火元天/EasyJailbreak

A toolkit for jailbreak attacks and defenses on large language models

Python

Last updated: 11 months ago

烈火元天/FigStep

Jailbreaking multimodal large language models through typographic text layout

Python

Last updated: 11 months ago

烈火元天/RA-LLM

Large language model defense (2024): RA-LLM, Defending Against Alignment-Breaking Attacks via Robustly Aligned LLM

Python

Last updated: 11 months ago

烈火元天/ReNeLLM

A Wolf in Sheep’s Clothing: Generalized Nested Jailbreak Prompts can Fool Large Language Models Easily (Prompt Rewriting + Scenario Nesting)

Python

Last updated: 11 months ago

烈火元天/BackdoorLLM

A library of defense tools for large language models

Python

Last updated: 11 months ago
