Test failures in test_variants_decompress_into with recent hypothesis versions #201
Thanks for this, I can understand your frustration and wouldn't blame you for orphaning the cramjam project. As I mentioned before, I've also had much less bandwidth to maintain this project, and in my limited capacity I'm trying to reduce its maintenance burden by removing the experimental modules. I also want to remove hypothesis, since it has shown some nondeterministic problems in the past and, almost unsurprisingly, is in the mix here. Is it also reproducible if xdist isn't used? Sometimes I think that xdist with hypothesis does weird things.
What I’m concerned about is that hypothesis may be successfully revealing nondeterministic problems like race conditions in
That’s a reasonable idea. I had already tried removing
Ya, I find myself agreeing that removing hypothesis is probably not a good idea.
As a last debugging idea, if you have interactive access to these, try setting --pdb and checking whether the values do in fact remain different, or whether after a slight delay the variable actually updates to the expected value. But given this isn't happening on
And then, do the failing parameters always include either the
Unfortunately, no. I haven’t detected any pattern. Here are the results of a few successive runs on my workstation (still without
As you can see, the problems really seem to be all over the place with respect to all three parameters.
I can try running the tests manually in the chroot with
Just some notes on reproducing this. I tried running the tests manually in the mock chroot that was left after a build in which the tests had failed:
… and I got a larger number of failures, different from the ones in the original build:
Then, re-running the same command in the same place, the tests always passed after failing in that initial run. If I did a fresh build in a new chroot and then repeated the exercise, … This is fairly similar to the original report in this issue, where the tests failed once in a git checkout and didn’t fail in any subsequent attempts. Maybe this has something to do with hypothesis persisting state, but … Next I will try adding
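To check the persisted-state theory, one approach is to clear Hypothesis's on-disk example database between runs. This is only a sketch, assuming the default Hypothesis configuration, which keeps previously failing examples in a `.hypothesis/` directory at the project root; the helper name and `repo_root` argument are hypothetical:

```python
# Sketch: rule out Hypothesis's persisted example database as the cause of the
# "fails on the first run, passes afterwards" pattern. By default, Hypothesis
# saves failing examples under .hypothesis/ and replays them on later runs.
import shutil
from pathlib import Path

def clear_hypothesis_db(repo_root="."):
    """Delete the default Hypothesis example database, if present.

    Returns True once the directory is confirmed gone.
    """
    db = Path(repo_root) / ".hypothesis"
    if db.is_dir():
        shutil.rmtree(db)
    return not db.exists()
```

If the failures still occur only on the first run even after clearing this directory between runs, the persisted database is probably not the explanation.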
I’m not too handy with pdb. What should I try?
I suggest just typing in … I've also been running the tests many, many times now for the last hour and haven't had a single failure like you're seeing here. Also interesting that it only happens on the first go but not subsequent tries. Is it possible to have a nix shell config for the environment you've got? I'm not sure how far into this we want to go though. 😅 I'll keep trying, but in my mind, once I slim things down a bit more, 2.10 will be the last release for a long while until time opens up again.
Hmm:
I did try this repeatedly on Fedora 41:
…and I was able to get a failure like the one in the original report, with the
Unfortunately, I have never worked with nix, so I’m not sure what this would require.
I think you've banged your head enough on this for now; I appreciate the tips. I have a Fedora 41 Workstation myself, so I'll try to use your steps soon to reproduce.
FYI, using your last command as a reproducer, I can get it to happen every single time on my machine. But same thing: after the first failing run, it's happy after that. Very weird.
After poking at it for some time, I notice that the … I'll try to debug some more, but after more than an hour of testing I haven't been able to pin anything down; I've also removed all … Edit:
Oh, now I see if instead setting …
I’ve noticed that this doesn’t happen in real RPM builds for Fedora 41, where
For the time being, I’m skipping … For Fedora 42/Branched and 43/Rawhide, I’ve updated to a pre-release snapshot of 2.10.0 (61564e7) and disabled the
I might try running the cramjam tests under ASAN and TSAN to see if it spots any issues.
I don't see any issues running the cramjam tests under ASAN on my Mac. Here's the change I made to the makefile to set that up:
I'm also using a Python and NumPy compiled with ASAN, following the TSAN instructions outlined here: https://py-free-threading.github.io/debugging/#compiling-cpython-and-foundational-packages-with-thread-sanitizer-tsan. I'll try again with an environment that looks more like the CI environment where the hypothesis tests fail; I haven't actually been able to reproduce the failure locally.
I don't see any ASAN issues running with
I forgot to mention for anyone who wants to do this at home, you can install numpy with address sanitizer using this incantation in a numpy checkout:
I didn't see any issues under TSAN either. I've also been unable to trigger these failures on my Mac or Ubuntu dev setup. @musicinmybrain is there a way I can get a docker container or similar that I can use to reproduce this issue and try to see if it triggers any of the clang sanitizers?
For the command in #201 (comment), I am working on a very normal Fedora 41 installation (x86_64), so it’s likely that https://hub.docker.com/_/fedora will work for you. If not, you could always try the live ISO from https://fedoraproject.org/workstation/download in a VM.
Hmm, I'm still not able to reproduce the failure in a Fedora 41 docker image. Did you use a virtualenv to install cramjam's dependencies? If not, can you show me how you did that with the python dependencies from Fedora packages?
I read the above discussion more closely and tried the incantation from #201 (comment) to trigger it in the Fedora image, and no dice.
I opened a Hypothesis issue; maybe one of the maintainers will have a clue how to deal with the error: HypothesisWorks/hypothesis#4267. For now I pushed a commit to #200 that pins hypothesis to
Since the Rust toolchain was updated to 1.84 in Fedora, the python-cramjam package has been failing to build from source due to test failures in test_variants_decompress_into. It’s consistent that some parameterizations of this test fail, but the particular ones that fail vary from run to run. The problem still occurs if I update to a snapshot of 61564e7 and ensure experimental codecs are disabled. For example:
I refrained from immediately filing an issue upstream because I couldn’t reproduce the problem in a git checkout, but something changed at some point, and now I can reproduce it, or at least something similar. Working on Fedora 41:
Ok, that’s not exactly the same, but it looks like it could be the same root cause presenting differently due to slightly different versions of pytest, hypothesis, etc.
Repeating the same command shows the failure is flaky:
It seems I got “lucky” catching this in the virtualenv, because I didn’t see any more failures in 10-20 subsequent attempts. I don’t know why the problem is so much easier to reproduce in the Fedora packages – compiler flags? dependency versions? rustc patches? release vs. debug builds?
I’m happy to do any experiments or provide any data that would help.
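For background on the pattern the failing test exercises: test_variants_decompress_into decompresses data into a preallocated output buffer. Below is a minimal, stdlib-only sketch of that round trip, using zlib as a stand-in for cramjam's codecs; the decompress_into function here is hypothetical and only illustrates the buffer-filling pattern, not cramjam's actual API:

```python
# Hypothetical stand-in for the compress/decompress_into round trip that the
# failing test exercises, using stdlib zlib instead of cramjam's codecs.
import zlib

def decompress_into(compressed: bytes, output: bytearray) -> int:
    """Decompress into a caller-provided buffer; return the bytes written."""
    data = zlib.decompress(compressed)
    n = len(data)
    if n > len(output):
        raise ValueError("output buffer too small")
    output[:n] = data  # fill the preallocated buffer in place
    return n

payload = b"the quick brown fox " * 64
compressed = zlib.compress(payload)
buf = bytearray(len(payload))          # preallocated, exactly-sized output
written = decompress_into(compressed, buf)
assert bytes(buf[:written]) == payload
```

If the failures reported in this issue reflect real corruption rather than a test-harness problem, they would correspond to that final comparison intermittently failing for some codec/parameter combinations.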
I’m at my wits’ end with these test failures. I am concerned that they may reflect possible flaky/racy data corruption in the python-cramjam package in Fedora, but I have no idea how to find the root cause. If I can’t find a way to resolve the test failures, I’m reluctantly considering orphaning python-cramjam and related packages sometime well before the Fedora 42 final freeze, which would probably lead to these packages being retired from the distribution.