-
-
Notifications
You must be signed in to change notification settings - Fork 321
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unique keyword arg #580
Unique keyword arg #580
Conversation
hey @fkrull8 would you mind giving me push access to your fork of pandera? https://github.com/fkrull8/pandera Just fixed all the merge conflicts mentioned above. |
Hi @cosmicBboy , apologies for not seeing this earlier! I think I should have just provided you with collaborator access to the repo. The PR was only half finished, as I was having trouble with the property assignment of the new unique keyword in the base class. The tests that I had written should have failed where I was having trouble. If I could get an assist on that, I could pick this PR back up and iron it out. |
c2d6048
to
6e71146
Compare
Co-authored-by: Niels Bantilan <[email protected]>
no worries @fkrull8, and thanks for the write permissions! I'll iron out some of the tests and hand it back over to you |
6e71146
to
4aea6cc
Compare
Codecov Report
@@ Coverage Diff @@
## master #580 +/- ##
==========================================
- Coverage 99.45% 98.77% -0.68%
==========================================
Files 21 29 +8
Lines 2572 3359 +787
==========================================
+ Hits 2558 3318 +760
- Misses 14 41 +27
Continue to review full report at Codecov.
|
Hello Thank you for this PR. This functionality is exactly what we need. |
hey @rbngz yes! this PR is almost ready. @fkrull8 I managed to iron out a bunch of the tests since revamping the type system... I might be missing something, but this PR seems to be feature complete? The only thing blocking is adding test coverage for these non-covered lines of code: https://github.com/pandera-dev/pandera/pull/580/checks?check_run_id=3281386477 @rbngz once this PR is merged would you be open to making a contribution to add support for multi-column uniqueness in |
…unique_keyword_arg � Conflicts: � pandera/io.py � pandera/model.py � pandera/model_components.py � pandera/schema_components.py � pandera/schemas.py � tests/core/test_schemas.py
Hi @cosmicBboy, apologies for the late follow-up on this. Yes! The issues I was having before seem to have been ironed out. The feature that is not covered with this PR is the on the hypothesis side. As far as I can tell, there is no ready-made methods from hypothesis for setting unique-ness over multiple columns. That would have to be custom-made. |
I have to admit I'm a bit mystified by the coverage results -- the uncovered bits don't appear to be different than on master branch. The uncovered cases from #580 (comment) have been addressed with dc88f59 . Sorry about the earlier linting errors, my dev environment wasnt set up properly for tox. |
yeah coverage results can be a little finicky. |
@fkrull8 going to merge this now, thanks and congratulations on your first contribution to pandera 🎉 |
* Unique keyword arg (#580) * add copy button to docs (#448) * Add missing inplace arg to SchemaModel's validate (#450) * link documentation to github (#449) Co-authored-by: Niels Bantilan <[email protected]> * intermediate commit for review by @cosmicBboy * link documentation to github (#449) Co-authored-by: Niels Bantilan <[email protected]> * intermediate commit for review by @cosmicBboy * WIP * fix test errors, re-factor allow_duplicates handling * fix io tests * fix docs, remove _allow_duplicates private var * update unique type signature in strategies * completing tests for setters and lazy evaluation of unique kw * small fix for the linting errors * support dataframe-level uniqueness in strategies * add docs, fix error formatting, add multiindex support Co-authored-by: Jean-Francois Zinque <[email protected]> Co-authored-by: tfwillems <[email protected]> Co-authored-by: fkroll8 <[email protected]> Co-authored-by: fkroll8 <[email protected]> * Add support for timezone-aware datetime strategies (#595) * add support for Any annotation in schema model (#594) * add support for Any annotation in schema model the motivation behind this feature is to support column annotations that can have any type, to support use cases like the one described in #592, where custom checks can be applied to any column except for ones that are explicitly defined in the schema model class attributes * update pylint, fix lint * Docs/scaling - Bring Pandera to Spark and Dask (#588) * scaling.rst * edited conf * finished first pass * removing FugueWorkflow * Update index.rst * Update docs/source/scaling.rst Co-authored-by: Niels Bantilan <[email protected]> * add support for timezone-aware datetime strategies * fix le/ge strategies with datetime * fix mypy errors Co-authored-by: Niels Bantilan <[email protected]> Co-authored-by: Kevin Kho <[email protected]> * support frictionless primary keys with multiple fields Co-authored-by: Jean-Francois Zinque <[email protected]> Co-authored-by: tfwillems <[email protected]> Co-authored-by: fkroll8 <[email protected]> Co-authored-by: fkroll8 <[email protected]> Co-authored-by: Kevin Kho <[email protected]>
* Unique keyword arg (#580) * add copy button to docs (#448) * Add missing inplace arg to SchemaModel's validate (#450) * link documentation to github (#449) Co-authored-by: Niels Bantilan <[email protected]> * intermediate commit for review by @cosmicBboy * link documentation to github (#449) Co-authored-by: Niels Bantilan <[email protected]> * intermediate commit for review by @cosmicBboy * WIP * fix test errors, re-factor allow_duplicates handling * fix io tests * fix docs, remove _allow_duplicates private var * update unique type signature in strategies * completing tests for setters and lazy evaluation of unique kw * small fix for the linting errors * support dataframe-level uniqueness in strategies * add docs, fix error formatting, add multiindex support Co-authored-by: Jean-Francois Zinque <[email protected]> Co-authored-by: tfwillems <[email protected]> Co-authored-by: fkroll8 <[email protected]> Co-authored-by: fkroll8 <[email protected]> * Add support for timezone-aware datetime strategies (#595) * add support for Any annotation in schema model (#594) * add support for Any annotation in schema model the motivation behind this feature is to support column annotations that can have any type, to support use cases like the one described in #592, where custom checks can be applied to any column except for ones that are explicitly defined in the schema model class attributes * update pylint, fix lint * Docs/scaling - Bring Pandera to Spark and Dask (#588) * scaling.rst * edited conf * finished first pass * removing FugueWorkflow * Update index.rst * Update docs/source/scaling.rst Co-authored-by: Niels Bantilan <[email protected]> * add support for timezone-aware datetime strategies * fix le/ge strategies with datetime * fix mypy errors Co-authored-by: Niels Bantilan <[email protected]> Co-authored-by: Kevin Kho <[email protected]> * schemas with multi-index columns correctly report errors (#600) fixes #589 * strategies module supports undefined checks in regex columns (#599) * Add support for empty data type annotation in SchemaModel (#602) * remove artifacts of py3.6 support * add support for empty data type annotation in SchemaModel * fix frictionless version in dev dependencies * fix setuptools version instead of frictionless * fix setuptools pinning * remove frictionless from core pandera deps (#609) * support frictionless primary keys with multiple fields (#608) * fix validation of check raising error without message (#613) * docs/requirements.txt pin setuptools (#611) * bump version 0.7.1 Co-authored-by: Jean-Francois Zinque <[email protected]> Co-authored-by: tfwillems <[email protected]> Co-authored-by: fkroll8 <[email protected]> Co-authored-by: fkroll8 <[email protected]> Co-authored-by: Kevin Kho <[email protected]>
fixes #390
@fkrull8 I'm re-opening a pull request against the current
master
branch and will work on getting this merged