Mismatch between different implementations of Levenshtein #47
Comments
>>> from rapidfuzz.distance import Levenshtein, Indel
>>> Levenshtein.normalized_similarity(a, b)
0.9375
>>> Levenshtein.normalized_similarity(a, b, weights=(1,1,2))
0.967741935483871
>>> Indel.normalized_similarity(a, b)
0.967741935483871
When processing large amounts of data, I recommend using one of the processing functions in |
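To see where the two numbers come from, here is a minimal pure-Python sketch of weighted Levenshtein with normalization. It is not the library's implementation: the helper names (`weighted_levenshtein`, `max_distance`, `normalized_similarity`) are made up for illustration, the normalization rule is my assumption about how the maximum distance is chosen, and the strings `a` and `b` are inferred from the link in the original question.

```python
def weighted_levenshtein(s1, s2, weights=(1, 1, 1)):
    """Edit distance with (insertion, deletion, substitution) costs."""
    ins, dele, sub = weights
    # prev[j] is the cost of turning the s1 prefix processed so far into s2[:j]
    prev = [j * ins for j in range(len(s2) + 1)]
    for i, c1 in enumerate(s1, start=1):
        curr = [i * dele]  # turning s1[:i] into the empty string
        for j, c2 in enumerate(s2, start=1):
            curr.append(min(
                prev[j] + dele,                          # delete c1
                curr[j - 1] + ins,                       # insert c2
                prev[j - 1] + (0 if c1 == c2 else sub),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def max_distance(len1, len2, weights=(1, 1, 1)):
    """Largest possible distance for two strings of these lengths (assumed rule)."""
    ins, dele, sub = weights
    # Option 1: delete all of s1, then insert all of s2
    full = len1 * dele + len2 * ins
    # Option 2: substitute the overlapping part, then fix the length difference
    if len1 >= len2:
        mixed = len2 * sub + (len1 - len2) * dele
    else:
        mixed = len1 * sub + (len2 - len1) * ins
    return min(full, mixed)


def normalized_similarity(s1, s2, weights=(1, 1, 1)):
    maximum = max_distance(len(s1), len(s2), weights)
    if maximum == 0:
        return 1.0  # two empty strings are identical
    return 1.0 - weighted_levenshtein(s1, s2, weights) / maximum


# Strings inferred from the link in the question (an assumption)
a = "database system"
b = "database systems"

print(normalized_similarity(a, b))                     # 1 - 1/16
print(normalized_similarity(a, b, weights=(1, 1, 2)))  # 1 - 1/31
```

In both cases the distance is 1 (a single insertion). With uniform weights the maximum is the longer length, 16, which gives 0.9375. With a substitution cost of 2 the maximum becomes the sum of both lengths, 31, which gives roughly 0.9677 and is the same as the Indel result.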
Thanks for the clarification and the lead. I will test this one. Thanks again |
Why doesn't the Levenshtein library use Levenshtein for ratio? |
The original author implemented it like this for some unknown reason and I kept it like this for backwards compatibility. I agree it is pretty confusing, so at least I made sure to mention the indel distance in the documentation. |
I think it should be mentioned in the readme too, since that is what people read first. I mean, I would expect the Levenshtein library to use Levenshtein unless specified otherwise. It's not like "Indel" is in the function name. |
Hi there,
thank you for this work. I love this library because it is 100x faster than its competitors (e.g., strsimpy).
However, I have noticed that for the same pair of words, your implementation returns a different similarity value.
as a result I get:
I have checked with online tools, and it seems that the similarity between a and b is 0.9375 (check here https://awsm-tools.com/levenshtein-distance?form%5Bsource%5D=database+system&form%5Btarget%5D=database+systems), which is in line with strsimpy. Do you know why we get different similarity values?
Thanks a lot
Angelo