This is your chance to get hands-on with LLM security—we'll be breaking models, finding weaknesses, and fine-tuning defenses.
Here’s what you’ll be diving into:
- Launching Attacks – Running a Greedy Coordinate Gradient (GCG) attack against Llama 3.2 3B and Phi 3.5 Mini Instruct to elicit unsafe responses (a minimal sketch follows this list).
- Exposing Vulnerabilities – Identifying flaws in response generation and safety mechanisms of these models.
- Reinforcing Defenses – Fine-tuning both models on adversarial examples so they resist future attacks (see the second sketch below).
- Testing the Fixes – Re-running the attacks after fine-tuning to measure whether we've actually improved model robustness.
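Curious what the attack step looks like in code? Here's a minimal sketch using the open-source nanoGCG package (an assumption on our part; the workshop may use a different implementation), with placeholder prompt and target strings. GCG greedily swaps tokens in an adversarial suffix to push the model toward a target completion.

```python
# Minimal GCG attack sketch using the open-source nanoGCG package.
# Assumptions: nanogcg is installed (pip install nanogcg) and a CUDA GPU is
# available; the prompt and target strings below are placeholders.
import torch
import nanogcg
from nanogcg import GCGConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.2-3B-Instruct"  # swap in microsoft/Phi-3.5-mini-instruct for the second model
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")
tokenizer = AutoTokenizer.from_pretrained(model_id)

prompt = "<request the model would normally refuse>"
target = "Sure, here is"  # completion the optimized suffix tries to force

config = GCGConfig(num_steps=250, search_width=64, topk=64, seed=42)
result = nanogcg.run(model, tokenizer, prompt, target, config)

print(result.best_loss)    # lower loss means the suffix gets closer to forcing the target
print(result.best_string)  # adversarial suffix to append to the prompt
```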
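And here's a comparable sketch of the defense step: collect the adversarial prompts found above, pair each with a safe refusal, and run supervised fine-tuning. This version uses Hugging Face TRL's SFTTrainer (again an assumption; the workshop materials may differ), and the dataset contents are placeholders.

```python
# Minimal adversarial fine-tuning sketch with TRL's SFTTrainer.
# Assumptions: trl and datasets are installed; the training examples below
# are placeholders standing in for GCG suffixes found during the attack phase.
from datasets import Dataset
from trl import SFTConfig, SFTTrainer

train_data = Dataset.from_list([
    {"messages": [
        {"role": "user", "content": "<harmful request> <GCG suffix from step 1>"},
        {"role": "assistant", "content": "I can't help with that."},
    ]},
    # ...more adversarial prompt/refusal pairs
])

trainer = SFTTrainer(
    model="meta-llama/Llama-3.2-3B-Instruct",  # repeat for Phi 3.5 Mini Instruct
    train_dataset=train_data,
    args=SFTConfig(output_dir="llama-3.2-3b-hardened", num_train_epochs=1),
)
trainer.train()
```

Re-running the first sketch against the fine-tuned checkpoint closes the loop: if no optimized suffix flips the model back to unsafe completions, the defense held.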
When & Where?
Feb 13 & Feb 20
5:30 PM – 6:30 PM
PETR 110
If you’re interested in LLM security or adversarial AI, or you just love breaking and fixing things, this is the workshop for you. No prior experience required—just bring your curiosity!
The Developer Student Club is a registered student organization at Texas Tech University.