Configurable system prompt #1337
…tors/base.py`. Add system prompt support to `Probe`. Remove system prompt injection in `openai.py`.
…AICompatible` and `CohereGenerator`.
…od for `Turn` to a classmethod. Fix tests with incorrect signatures for `Conversation`
…orrectly on init. Add `initial_user_message` property to avoid issues with system prompt index.
…etector.detect` to raise `NotImplementedError`. Fix `judge` detectors by ensuring proper return types and proper loading of conversation from list of dicts. Update test_nim.py to conform to the expected return value of `_call_model`.
leondz
left a comment
ok this is great, we now have something concrete to talk about. cue .. talking
…er used by `_mint_attempt`.
jmartin-tech
left a comment
Will need to do some testing of various generators, this looks like a pretty clean pass.
…f conversations that already have a system prompt. Add a test for the call to `self._conversation_to_list` in huggingface.py.
leondz
left a comment
There was a problem hiding this comment.
mild refactoring, clarification about where sysprompt comes from - what's the canonical source? how mutable/overridable is it?
``run`` config items
""""""""""""""""""""

* ``system_prompt`` -- If given and not overridden by the probe itself, probes will pass the specified system prompt when possible for generators that support chat modality.
yaml is tricky and escaping is unstable depending on implementation. maybe not needed for PR to land, but how can we afford a more flexible and less painful route to supplying sysprompts? filename?
I think we defer on this and add a system_prompt_file support in a future iteration.
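As a sketch, a run-level system prompt in a garak YAML config might look like the following; the exact `run:` nesting is an assumption based on the ``run`` config items fragment above, and quoting the value sidesteps most of the YAML escaping concerns raised here:

```yaml
run:
  # assumed key name from the docs fragment in this PR;
  # double-quoting keeps colons, dashes, etc. from being
  # misparsed by stricter YAML implementations
  system_prompt: "You are a careful, concise assistant."
```

A `system_prompt_file` key, as proposed for a future iteration, would avoid inline escaping entirely.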
```python
if len(turns) > 0:
    prompt = garak.attempt.Conversation(
        turns=turns,
        notes=notes,
    )
```
when would we want to permit a prompt with empty Conversation? (distinct from prompt of Conversation with Turn of empty string)
I don't think empty turns is a thing; this guard simply avoids changing the input parameter `prompt` when the local helper variable `turns` was never populated. Again, this may be possible to remove if/when we validate that all prompt values are `Conversation` objects.
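To illustrate the guard's intent, here is a minimal runnable sketch; `Conversation` and `normalize_prompt` below are simplified stand-ins for garak's actual types, not the real implementation:

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    # simplified stand-in for garak.attempt.Conversation
    turns: list = field(default_factory=list)
    notes: dict = field(default_factory=dict)


def normalize_prompt(prompt, turns, notes):
    # Only replace the incoming prompt when the helper actually
    # collected turns; otherwise leave the input untouched.
    if len(turns) > 0:
        prompt = Conversation(turns=turns, notes=notes)
    return prompt


# With no collected turns, the original prompt passes through unchanged:
print(normalize_prompt("raw string", [], {}))  # -> raw string
```

If all prompt values were validated as `Conversation` objects up front, as suggested, this guard could indeed go away.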
```python
def conversation_from_list(turns: list[dict]) -> Conversation:
    """Take a list of dicts and return a Conversation object.

    In the future this should be factored out and implemented in the probe.
    """
    return Conversation([Turn.from_dict(msg) for msg in turns])
```
or in garak.attempt (which holds Conversation) as a module function, or even a Conversation @staticmethod (seems kinda du jour). don't think it is perfect here unless the format is specific to resources.red_team.evaluation, which I don't think it is, because the format's the standard one used everywhere else in the PR
This will be removed in a future focused refactor for red_team.evaluation, the usage of fschat in _create_conv is to be removed and converted to have that function output a Conversation.
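For context, the list-of-dicts format in question is the standard role/content message shape. The sketch below shows the conversion pattern end to end; the `Turn` and `Conversation` classes here are simplified stand-ins for garak.attempt's types, assumed to expose `Turn.from_dict` as the PR describes:

```python
from dataclasses import dataclass


@dataclass
class Turn:
    # simplified stand-in for garak.attempt.Turn
    role: str
    content: str

    @classmethod
    def from_dict(cls, msg: dict) -> "Turn":
        return cls(role=msg["role"], content=msg["content"])


@dataclass
class Conversation:
    # simplified stand-in for garak.attempt.Conversation
    turns: list


def conversation_from_list(turns: list[dict]) -> Conversation:
    """Take a list of role/content dicts and return a Conversation."""
    return Conversation([Turn.from_dict(msg) for msg in turns])


conv = conversation_from_list(
    [
        {"role": "system", "content": "Be terse."},
        {"role": "user", "content": "Hi"},
    ]
)
print(conv.turns[0].role)  # -> system
```

Since the conversion depends only on the message format and the attempt types, hosting it next to `Conversation` (as a module function or classmethod) fits naturally once the fschat usage in `_create_conv` is removed.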
Co-authored-by: Jeffrey Martin <jmartin@Op3n4M3.dev>
Co-authored-by: Leon Derczynski <leonderczynski@gmail.com>
Signed-off-by: Erick Galinkin <erick.galinkin@gmail.com>
Allows configuration of the system prompt at the run level
Verification
Existing tests are passing; we still need to add functional tests covering the above and to add docs.
Initially intended to do this in the base `Generator` class, but unfortunately everything gets serialized in `Attempt`, so it would have required much more significant refactoring to log appropriately.