Introduction - cshunor02/sponge-attack GitHub Wiki

The main goal of our project was to design various attack scenarios and reproduce sponge attacks against Large Language Models (LLMs). To carry out these attacks, we used different models (described in later chapters) and settings. The main forms of sponge attack that we analysed are the following:

  • Flooding attacks
  • Denial-of-service (DoS) attacks
  • Energy-latency attacks
  • Adversarial examples
  • Deceptive inputs

To summarise our findings, we created plots and tables that show the success rate of the attacks.