Great Expectations as an estimator #11385

jpeaceau · 2025-09-17T20:22:22Z


Hi,

I see Great Expectations as a kind of "proof of actions". As such, I think it would work well as an estimator in scikit-learn pipelines: you specify expectations for each column, and whether additional undeclared columns are allowed.

You should be able to declare for each column:

1. Requirements (literal, smaller/greater than, length or even a callable).
2. Whether a warning or error occurs for that particular column if requirements are not met.
3. Response options of: callable (such as a mean, median or model imputation approach), or exclusion of the sample (provided option 2 is "warn" for that column).
4. Whether the response will occur if the value is null (as pipelines already often account for null values).
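
To make the four per-column options above concrete, here is a rough, dependency-free sketch of what such a step could look like. Everything in it is hypothetical: `ColumnExpectation`, `ExpectationTransformer`, and their parameters are names I made up for illustration, not part of Great Expectations or scikit-learn. It only mimics the `fit`/`transform` shape on a list of dicts; a real version would presumably subclass scikit-learn's `BaseEstimator` and `TransformerMixin` and work on DataFrames or arrays.

```python
import warnings
from dataclasses import dataclass
from statistics import median
from typing import Any, Callable, Optional


@dataclass
class ColumnExpectation:
    """Hypothetical per-column declaration (all field names are assumptions)."""
    requirement: Callable[[Any], bool]                 # 1. what a valid value looks like
    on_failure: str = "error"                          # 2. "warn" or "error"
    response: Optional[Callable[[list], Any]] = None   # 3. imputer; None means drop the row
    apply_to_null: bool = False                        # 4. treat None as a failure too?


class ExpectationTransformer:
    """Sketch of a pipeline step that checks rows against declared expectations."""

    def __init__(self, expectations, allow_undeclared=True):
        self.expectations = expectations
        self.allow_undeclared = allow_undeclared

    def fit(self, X, y=None):
        # Learn one replacement value per column from the training values that pass.
        self.fill_values_ = {}
        for col, exp in self.expectations.items():
            if exp.response is not None:
                valid = [row[col] for row in X
                         if row.get(col) is not None and exp.requirement(row[col])]
                self.fill_values_[col] = exp.response(valid)
        return self

    def transform(self, X):
        kept = []
        for row in X:
            if not self.allow_undeclared:
                extra = set(row) - set(self.expectations)
                if extra:
                    raise ValueError(f"undeclared columns: {sorted(extra)}")
            new_row, keep = dict(row), True
            for col, exp in self.expectations.items():
                value = new_row.get(col)
                if value is None and not exp.apply_to_null:
                    continue  # leave nulls for a later imputation step
                if value is not None and exp.requirement(value):
                    continue  # expectation met
                message = f"column {col!r}: value {value!r} failed its expectation"
                if exp.on_failure == "error":
                    raise ValueError(message)
                warnings.warn(message)
                if exp.response is None:
                    keep = False  # exclude the sample (only reachable under "warn")
                    break
                new_row[col] = self.fill_values_[col]
            if keep:
                kept.append(new_row)
        return kept


# Usage: impute out-of-range ages with the median, drop rows with an empty name.
rows = [{"age": 30, "name": "a"}, {"age": -5, "name": "b"}, {"age": 50, "name": ""}]
step = ExpectationTransformer({
    "age": ColumnExpectation(lambda v: 0 <= v <= 120, on_failure="warn", response=median),
    "name": ColumnExpectation(lambda v: len(v) > 0, on_failure="warn"),
})
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cleaned = step.fit(rows).transform(rows)
```

Learning the replacement value in `fit` rather than in `transform` is a deliberate choice in this sketch, so that the value applied to new data comes from the training data, the same way existing imputers in a pipeline behave.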

Imputation is often part of a pipeline as a response to missing values, so I think Great Expectations as an estimator would be a highly appropriate addition.

Let me know your thoughts!

Replies: 0 comments

Select a reply

Uh oh!

Great Expectations as an estimator #11385

Uh oh!

jpeaceau Sep 17, 2025

Replies: 0 comments

jpeaceau
Sep 17, 2025