The open-source tool to help you harden your GenAI applications



[Open in Google Colab](https://colab.research.google.com/drive/148n5M1wZXp-ojhnh-_KP01OYtUwJwlUl?usp=sharing)
Brought to you by Prompt Security, the Complete Platform for GenAI Security
Table of Contents
- ✨ About
- ⚠️ Features
- 🚀 Installation
- Usage
- Examples
- 🎬 Demo video
- ⚔️ Supported attacks
- 🌈 What’s next on the roadmap?
- 🍻 Contributing
What is the Prompt Fuzzer
The Prompt Fuzzer is an interactive tool designed to assess the security of your GenAI applications’ system prompts against various dynamic LLM-based attacks. Think of it as a security checkpoint for your AI responses, ensuring that they are not only effective but also protected against malicious inputs.
Much like a professional athlete practices in different scenarios to strengthen their game, the Prompt Fuzzer tailors its testing strategies based on the specific configuration and domain of your application. Additionally, it features a Playground chat interface, allowing you to fine-tune your system prompts iteratively, enhancing protection against a wide range of generative AI attacks.
⚠️ **Note:** Running the Prompt Fuzzer consumes LLM tokens.
Installation
Follow these steps to install the Prompt Fuzzer easily:
- 1. Install the Fuzzer package
- a. Using pip
pip install prompt-security-fuzzer
- b. Using the package page on PyPI
You can also visit the package page on PyPI or grab the latest release wheel file from the releases.
- 2. Launch the Fuzzer
export OPENAI_API_KEY=sk-123XXXXXXXXXXXX
prompt-security-fuzzer
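To confirm the installation, you can first list the supported providers (the --list-providers flag is described under Command Line Options below):
prompt-security-fuzzer --list-providers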
🖥️ Usage
Features
The Prompt Fuzzer supports:
- 16 LLM providers
- 15 different attacks
- Interactive mode
- CLI mode
- Multi-threaded testing
Environment Variables:
You need to set an environment variable to hold the access key of your preferred LLM provider. The default is OPENAI_API_KEY. For example, set OPENAI_API_KEY to your API token to use your OpenAI account. Alternatively, create a file named .env in the current directory and set OPENAI_API_KEY there.
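For example, a minimal .env file could look like this (the key below is a placeholder, not a real token):
# .env in the current directory; keep real keys out of version control
OPENAI_API_KEY=sk-123XXXXXXXXXXXX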
Supported LLMs:
The tool is fully LLM agnostic. Check the full configuration list for more details.
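As a rough sketch, pointing the fuzzer at a different target could look like the example below; the provider name, model name, and API-key variable are illustrative assumptions, so consult the configuration list (or --list-providers) for the exact identifiers your version expects:
# Illustrative only: the provider/model names and the key variable are assumptions
export ANTHROPIC_API_KEY=sk-ant-XXXXXXXX
prompt-security-fuzzer --target-provider anthropic --target-model claude-3-haiku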
Command Line Options
- --list-providers – Lists all available providers
- --list-attacks – Lists available attacks and exits
- --attack-provider – Attack provider
- --attack-model – Attack model
- --target-provider – Target provider
- --target-model – Target model
- --num-attempts, -n – Number of different attack prompts
- --num-threads, -t – Number of worker threads
- --attack-temperature, -a – Temperature for the attack model
- --debug-level, -d – Debug level (0-2)
- -batch, -b – Run the fuzzer in unattended (batch) mode, bypassing the interactive steps
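As a sketch, a batch run combining several of these flags might look like the following; the file path mirrors the Quickstart below, and the provider/model identifiers are placeholders (use --list-providers for the exact names):
# Batch run with explicit target settings (provider/model values are placeholders)
prompt-security-fuzzer -b ./system_prompt.examples/medium_system_prompt.txt \
    --num-attempts 3 --num-threads 4 --debug-level 1 \
    --target-provider open_ai --target-model gpt-3.5-turbo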
Examples
Check the system_prompt.examples subdirectory for system prompt examples of varying strength.
Interactive Mode (Default Mode)
Run tests against the system prompt:
prompt-security-fuzzer
Quickstart Single Run
Run tests against the system prompt in non-interactive batch mode:
prompt-security-fuzzer -b ./system_prompt.examples/medium_system_prompt.txt
🎬 Demo Video
Watch a video demo of the Prompt Fuzzer in action!
⚔️ Simulated Attack Details
The tool uses a dynamic testing approach, adapting the fuzzing process based on the necessary context from your system prompt.
Jailbreak Attacks
- AIM Jailbreak: Probes ethical compliance by eliciting illegal instructions.
- Affirmative Suffix: Exploits vulnerabilities to elicit forbidden content.
- Amnesia: Causes the LLM to disregard its instructions.
- Contextual Redirection: Tricks the LLM into roles that bypass restrictions.
- Do Anything Now Jailbreak (DAN): Tests forbidden activity resilience.
- Harmful Behavior: Challenges the LLM with harmful prompts.
- Linguistic Evasion: Uses non-English text to bypass restrictions.
- Self Refine: Attempts to refine responses towards unethical directions.
- UCAR: Assesses content restrictions under ignore directives.
- Base64 Evasion: Evades detection by encoding harmful prompts.
Prompt Injection Attacks
- Authoritative Role Impersonation: Misleads the LLM’s outputs.
- Complimentary Transition: Tests content standards during topic switches.
- Ethical Compliance: Evaluates resistance to harmful topic discussions.
- Typoglycemia Attack: Exploits text-processing vulnerabilities to elicit incorrect responses.
System Prompt Extraction
- System Prompt Stealer: Attempts to extract sensitive internal configurations.
Definitions:
- Broken: Attack types that the LLM succumbed to.
- Resilient: Attack types that the LLM resisted.
- Errors: Attack types that produced inconclusive results.
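To see which of the attacks above are available in your installed version, you can list them from the command line (flag documented under Command Line Options):
prompt-security-fuzzer --list-attacks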
🌈 What’s Next on the Roadmap?
- [X] Google Colab Notebook
- [X] Adjust the output evaluation mechanism for prompt dataset testing
- [ ] Continue adding new GenAI attack types
- [ ] Enhanced reporting capabilities
- [ ] Hardening recommendations
🍻 Contributing
If you’re interested in contributing to the development of our tools, great! Check out the Contributing Guide for your first steps.
Troubleshooting
If you encounter any issues during installation or usage, consider the following tips:
- Ensure you have the latest version of pip installed.
- Double-check your environment variables to confirm your API keys are correctly set.
- If you can’t run the fuzzer, make sure that all dependencies are installed properly.
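For the first two tips, standard commands such as these can help (shown for the default OpenAI setup):
# Upgrade pip, reinstall the fuzzer, then confirm the key is visible in the shell
python -m pip install --upgrade pip
pip install --upgrade prompt-security-fuzzer
echo $OPENAI_API_KEY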