
update: rename atkgen probe model to be clear about toxicity #1149

Merged
jmartin-tech merged 1 commit into NVIDIA:main from leondz:update/rename_atkgen.tox_model
Apr 3, 2025

@leondz (Collaborator) commented Apr 3, 2025

Renames the atkgen model so that its name makes the toxicity focus explicit. Hugging Face redirects the prior model name to the new one, and the model has already been renamed on the Hub.

Validation

  • Test the new model location: import transformers; pipeline = transformers.pipeline(task="text-generation", model="garak-llm/attackgeneration-toxicity_gpt2")
  • Test the old model location: import transformers; pipeline = transformers.pipeline(task="text-generation", model="garak-llm/artgpt2tox")
  • Visit the old model URI and observe the redirection: https://huggingface.co/garak-llm/artgpt2tox
  • Read the newly extended model card and check that it clearly states this model may generate unsafe content: https://huggingface.co/garak-llm/attackgeneration-toxicity_gpt2
  • On main, before merge: python -m pytest tests/probes/test_probes_atkgen.py
  • On this branch: python -m pytest tests/probes/test_probes_atkgen.py
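
The first two validation steps could be scripted roughly as follows. This is only a sketch, not part of the PR: the helper names `build_pipeline` and `check_both_locations`, and the `pipeline_factory` parameter, are hypothetical and exist so the check can be exercised without downloading the model. The only real library call is `transformers.pipeline(task=..., model=...)`, as quoted in the steps above.

```python
# Model identifiers taken from the validation steps above.
NEW_MODEL = "garak-llm/attackgeneration-toxicity_gpt2"
OLD_MODEL = "garak-llm/artgpt2tox"  # expected to redirect to NEW_MODEL on the Hub


def build_pipeline(model_name, pipeline_factory=None):
    """Build a text-generation pipeline for model_name.

    pipeline_factory is a hypothetical injection point (an assumption of this
    sketch, not a garak or transformers feature) for offline testing.
    """
    if pipeline_factory is None:
        # Imported lazily so the sketch can be loaded without transformers.
        import transformers

        pipeline_factory = transformers.pipeline
    return pipeline_factory(task="text-generation", model=model_name)


def check_both_locations(pipeline_factory=None):
    """Return a dict mapping each model name to whether it loaded."""
    results = {}
    for name in (NEW_MODEL, OLD_MODEL):
        try:
            build_pipeline(name, pipeline_factory)
            results[name] = True
        except Exception:
            # Any load failure (missing repo, no network, ...) counts as False.
            results[name] = False
    return results
```

Calling `check_both_locations()` with no arguments would attempt the real downloads; if the redirect works as described, both entries should come back `True`.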

@leondz added the probes label (Content & activity of LLM probes) Apr 3, 2025
@leondz requested a review from jmartin-tech April 3, 2025 09:29
@jmartin-tech merged commit a3cbaf3 into NVIDIA:main Apr 3, 2025
10 checks passed
@github-actions bot locked and limited conversation to collaborators Apr 3, 2025