Skip to content

Conversation

@rlundeen2
Copy link
Contributor

@rlundeen2 rlundeen2 commented Dec 5, 2025

Description of Three changes

This should and could be three PRs. But in my first change, I ended up having to make several other changes.

First Change: Scenario Dataset Initialization

This is the problem I set out to tackle. But if we break it up this needs to go in last.

  • Added a scenario abstract method required_datasets for scenarios to define required default dataset names.
  • Updated the CLI to display required_datasets as part of the scenario info
  • Created load_default_datasets, a PyRItInitializer which loads all necessary datasets into memory using required_datasets and DatasetProvider
  • Updated scenarios to always load default datasets from memory (previous this was mostly yaml), which allows more flexibility in choosing.
  • Updated docs to often use required_datasets, since it's a really easy way to get started.

One problem: PyRItInitializer was synchronous, but we needed it to be async to use DatasetProvider and load SeedPrompts into memory

Ideally this would have been a first PR. But it is a change I didn't know we needed until almost done with the first part (I originally wrapped with asyncio, but that was breaking notebooks and other eventloops).

  • This is likely something we want anyway. Many PyRIT functions are async, and this gives us flexibility to use them.
  • I would have rather done this later, but there was no non-kludgy way to make required_datasets without async
  • There are a few layers that also needed to be updated to async to support this. Most logic was in the front end code, non-breaking for a release
  • The biggest breaking change is initialze_pyrit needs to be async to support this. Which changes a lot of docs. (again I like this pre-release)

Second problem: Many notebooks failing with target updates

And when running integration tests, I fixed times when the targets weren't working due to model not being defined.

PR Note: Again, this would be nice as a separate PR, but I was running this to fix my own changes (which essentially breaks every notebook due to initialize_pyrit updates) and stumbled upon this

PR Strategy

For the record, I'd say 80% of the code is just re-executing the jupyter notebooks because nearly every one had to be updated to use asynchronous initialize_pyrit_async. And because all the output is new, that's a lot of files and LOC.

This could be broken into three PRs, but it would go into reverse order and is a bit tough to pull apart. I'd prefer to keep as is because it's a lot less work. I recognize this adds difficult to the reviewer, but if it's too difficult to review, please raise this and I can break apart. But I think a lot of that breaking up would be manual

Tests and Docs

commit: 7a86bf4

  • All integration tests pass.
  • Except 11_harm_categories, but that was previously broken. I am tempted to remove this file until we can fix it, but the fix is non-trivial. Right now it only works if you're sending SeedPrompts (not SeedObjectives) with a harm category. As such, it won't work with most of our prompts or any multi_turn scenarios, so the behavior is fairly unpredictable. Maybe another PR to remove this before release.
  • I ran all scenarios. I want to create integration tests here but will wait for another PR.

@rlundeen2 rlundeen2 changed the title Draft FEAT: Dataset Initializer and Scenario Datasets FEAT Breaking: Dataset Initializer and Scenario Datasets Dec 6, 2025
@rlundeen2 rlundeen2 marked this pull request as ready for review December 6, 2025 22:47
@romanlutz
Copy link
Contributor

I don't think we have integration tests covering this yet. We probably should...

@rlundeen2 rlundeen2 merged commit eba5796 into Azure:main Dec 9, 2025
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants