How to use pandera in conjunction with typeguard and/or as a type checker for the other arguments? #837

acovaci · 2022-04-21T09:22:31Z

Question about pandera

Hi. I am facing the following scenario. Due to ecosystem limitations, I can't use any static type checkers. I got around this by decorating most of my functions with typeguard, which checks the argument types at runtime.

The issue I'm facing now is if I have a function that has both pandera and non-pandera type annotations. Typeguard seems to not like that pandas.core.frame.DataFrame (the type of the arguments) is not pandera.typing.pandas.DataFrame.

@typeguard.typechecked
@pa.check_types
def some_function(df: pa.typing.DataFrame[Schema], some_arg: int) -> None:
    ...

This throws:

TypeError: type of argument "df" must be pandera.typing.pandas.DataFrame; got pandas.core.frame.DataFrame instead

Any ideas how to get around this? Ultimately, I could make a pull request to typeguard that adds a way to ignore some arguments, but I'm trying to see if there's a simpler solution. Can pandera check the other types as well?

Thanks 🐶 😄

Answered by cosmicBboy

acovaci (Author) · 2022-04-21T12:43:21Z

In the meantime, I have switched to using @pydantic.validate_arguments.

cosmicBboy (Maintainer) · 2022-04-21T12:45:21Z

Hi @acovaci!

Is typeguard a requirement for your use case?

Another thing you can do is use @pa.check_types(with_pydantic=True). Under the hood it uses pydantic.validate_arguments to validate the inputs, and it also validates the output.

cosmicBboy (Maintainer) · 2022-04-21T12:49:13Z

If you really care about types, then you can actually use pandera.typing.DataFrame[Schema](data) (where data is some valid data you want for instantiating the dataframe). In this case, the types should match and typeguard should stop complaining.

You can think of pandera.typing.DataFrame[Schema](...) as a schema-typed dataframe. Initializing a dataframe like this basically validates the data coming in at initialization.

@typeguard.typechecked
@pa.check_types
def some_function(df: pa.typing.DataFrame[Schema], some_arg: int) -> None:
    ...

# When you invoke the function, pass in the schema-typed dataframe
# along with the remaining arguments (some_arg=1 is just a placeholder value).
some_function(pa.typing.DataFrame[Schema](...), some_arg=1)

Although you're not using mypy, it is still worth reading pandera's mypy integration guide to understand the gotchas of using typed dataframes.

acovaci (Author) · 2022-04-21T12:51:29Z

@cosmicBboy Thanks so much, I'll give the guide a read.

> Is typeguard a requirement for your use case?

Not really, it was just the easiest library to integrate into my ecosystem.

> Another thing you can do is use @pa.check_types(with_pydantic=True)

Ah, this is awesome, though it seems I can't pass a config to it? This is how I'm currently using pydantic directly:

@pydantic.validate_arguments(config={"arbitrary_types_allowed": True})
def some_function(df1: pa.typing.DataFrame[Schema], df2: pd.DataFrame, some_var: int) -> None:
    ...

Sadly, I need the arbitrary_types_allowed setting because some of my functions take plain pd.DataFrame arguments, again for ecosystem reasons.

If that isn't possible, the explicit type instantiation is definitely an option.
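As a rough sketch of why the config matters: with `arbitrary_types_allowed` set, pydantic falls back to an isinstance check for types it has no validator for, such as `pd.DataFrame`. The `count_rows` function and the sample data below are hypothetical. The example assumes pandas and pydantic are installed; newer pydantic releases deprecate `validate_arguments` in favour of `validate_call`, so treat this as an illustration of the approach used in this thread rather than a recommendation.

```python
import pandas as pd
import pydantic


@pydantic.validate_arguments(config={"arbitrary_types_allowed": True})
def count_rows(df: pd.DataFrame, multiplier: int) -> int:
    # df is only checked with isinstance; multiplier is validated as an int.
    return len(df) * multiplier


frame = pd.DataFrame({"x": [1, 2, 3]})
result = count_rows(frame, 2)


def call_fails(*args) -> bool:
    # Returns True when pydantic rejects the arguments.
    try:
        count_rows(*args)
    except pydantic.ValidationError:
        return True
    return False


list_rejected = call_fails([1, 2, 3], 2)       # a list is not a pd.DataFrame
bad_int_rejected = call_fails(frame, "abc")    # "abc" cannot be parsed as an int
```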

Thanks 👼

cosmicBboy (Maintainer) · 2022-04-21T12:55:19Z

You're welcome! Converting this to a discussion, mind marking my answer as the correct one?

acovaci (Author) · 2022-04-21T13:10:25Z

Do you think it would be worth making a pull request that lets check_types accept a pydantic config object? Currently I'm working around it by setting the config globally, which isn't ideal (though it gets the job done):

pydantic.BaseModel.Config.arbitrary_types_allowed = True

1 reply

cosmicBboy Apr 21, 2022
Maintainer

Yes! Please feel free to make a PR. I'd suggest pydantic_config as the kwarg name, but I'll leave that up to you!

Also, if you can, please open an issue to track progress.
