object dtypes in custom check strategies. #817
-
I have been trying to add a data synthesis strategy for shapely geometries, but I have run into a couple of issues. The first is in Engine.numpy_dtype, which doesn't handle the geometry type. That makes sense, and it should probably be aliased as object. Adding a check for the geometry type in Engine.numpy_dtype and removing the object-to-str cast in to_numpy_dtype allows me to create a custom strategy that returns shapely geometries and get back a GeoSeries with the correct dtype. The thing is, I'm not sure whether I am missing something important here, or even why the objects are being cast to str.
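For readers following along, here is a minimal sketch of the aliasing idea described above. It is not pandera's actual Engine.numpy_dtype; the names `OBJECT_ALIASES` and `numpy_dtype_with_aliases` are hypothetical and only illustrate treating a dtype name that NumPy cannot parse (such as "geometry") as object.

```python
import numpy as np

# Hypothetical alias table: extension dtype names that NumPy cannot parse
# on its own and that should be held in object arrays instead.
OBJECT_ALIASES = {"geometry"}


def numpy_dtype_with_aliases(dtype_name: str) -> np.dtype:
    """Return a NumPy dtype, mapping aliased extension dtypes to object."""
    if dtype_name in OBJECT_ALIASES:
        return np.dtype(object)
    # Anything else is passed to NumPy unchanged.
    return np.dtype(dtype_name)


geometry_dtype = numpy_dtype_with_aliases("geometry")
int_dtype = numpy_dtype_with_aliases("int64")
```

Without the alias, `np.dtype("geometry")` raises a TypeError, which is why some explicit handling is needed somewhere in the conversion.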
Replies: 3 comments 1 reply
-
Long story short, writing custom strategies is currently not well supported in pandera. @bphillips-exos has found a workaround, and there's an issue for adding better support for this: #561
You should check out the You can use this together with the workaround linked above to achieve this.
The main reason is that
import numpy as np
import hypothesis.extra.numpy as numpy_st
st = numpy_st.from_dtype(np.dtype(object))
# *** hypothesis.errors.InvalidArgument: No strategy inference for object
The behavior of
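As a rough illustration of how the inference problem can be sidestepped, the sketch below passes an explicit `elements` strategy to `hypothesis.extra.numpy.arrays`, so hypothesis never has to infer anything for the object dtype. The `Point` dataclass is a hypothetical stand-in for a shapely geometry, and this is not how pandera itself builds strategies.

```python
from dataclasses import dataclass

import hypothesis.extra.numpy as numpy_st
import hypothesis.strategies as st
import numpy as np
from hypothesis import given, settings


@dataclass(frozen=True)
class Point:
    """Hypothetical stand-in for a shapely geometry."""

    x: float
    y: float


# Explicit element strategy: hypothesis is told how to build each object,
# so no dtype-based inference is required.
points = st.builds(
    Point,
    x=st.floats(min_value=-180, max_value=180),
    y=st.floats(min_value=-90, max_value=90),
)
point_arrays = numpy_st.arrays(dtype=np.dtype(object), shape=5, elements=points)


@settings(max_examples=20, database=None, deadline=None)
@given(point_arrays)
def check_point_arrays(arr):
    # Every generated array is an object array holding Point instances.
    assert arr.dtype == np.dtype(object)
    assert arr.shape == (5,)
    assert all(isinstance(p, Point) for p in arr)


check_point_arrays()
```

A strategy that builds real shapely geometries could presumably be swapped in for `points` in the same way.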
-
Thanks for your reply.
Yeah, I'm already using the geopandas types.
OK, that makes more sense: you are essentially mapping dataframe dtypes straight to the data synthesis strategies that hypothesis has already implemented. So the root of the issue is that this mapping should be bypassed for custom strategies, since we do not need to worry about what hypothesis is doing for us; that is on the user.
@bphillips-exos's approach does bypass that logic, so it would work for me, and it (potentially) separates the check from the strategy, which is not needed in some cases. I think I will adopt that approach. This is something I need for a project I'm currently working on, so I could maybe pick this up and create a PR. I think it should be handled globally in some way, but that would require some more investigation into how best to do it.
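To make the "bypass for custom strategies" idea concrete, here is a small sketch. It assumes a hypothetical helper, `element_strategy`, that is not part of pandera or hypothesis: when the user supplies a strategy it is returned untouched, and dtype-based inference via `from_dtype` is only used as a fallback.

```python
from typing import Optional

import hypothesis.extra.numpy as numpy_st
import hypothesis.strategies as st
import numpy as np


def element_strategy(
    dtype, custom: Optional[st.SearchStrategy] = None
) -> st.SearchStrategy:
    """Hypothetical helper: prefer a user-supplied strategy over inference."""
    if custom is not None:
        # The user takes responsibility for generating valid values,
        # so the dtype is never inspected here.
        return custom
    # Fallback: let hypothesis infer a strategy from the NumPy dtype.
    return numpy_st.from_dtype(np.dtype(dtype))


# Custom strategy for an object column; WKT strings stand in for geometries.
wkt_points = st.sampled_from(["POINT (0 0)", "POINT (1 1)"])
geometry_strategy = element_strategy(object, custom=wkt_points)

# Ordinary dtypes still go through inference.
int_strategy = element_strategy("int64")
```

With this shape, the object dtype only becomes a problem when no custom strategy is given, which matches the case where inference has nothing to work with.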
-
Thanks for your help. If you're undergoing a major revision, it's probably best to hold off, and I'll just use this workaround for now.