partial_ratio not using best aligned substring with python-Levenshtein #274

maxbachmann · 2020-04-15T19:10:05Z

When using partial_ratio with python-Levenshtein, it is not guaranteed to use the best-aligned substring, even though that's the purpose of partial_ratio.
As an example:

>>> fuzz.partial_ratio("aaaa", "babaaaab")
75.0

Here the best-aligned substring is an exact match. However, the get_matching_blocks method from python-Levenshtein only finds the alignment aaaa <-> abaa, so partial_ratio calculates a ratio of 75%.
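
To make the claim about the exact match concrete, here is a minimal sketch that brute-forces every window of the longer string that has the length of the shorter one. It uses only difflib from the standard library. The helper name `best_window` is hypothetical and not part of any library; this is an illustration, not how partial_ratio is implemented.

```python
from difflib import SequenceMatcher


def best_window(shorter, longer):
    """Hypothetical helper: score every window of `longer` that has the
    same length as `shorter` and return (ratio, start, window) for the
    highest-scoring one (the first one found in case of ties)."""
    n = len(shorter)
    best = (0.0, 0, "")
    for start in range(len(longer) - n + 1):
        window = longer[start:start + n]
        ratio = SequenceMatcher(None, shorter, window).ratio()
        if ratio > best[0]:
            best = (ratio, start, window)
    return best


# The window starting at index 3 of "babaaaab" is exactly "aaaa".
print(best_window("aaaa", "babaaaab"))  # (1.0, 3, 'aaaa')
```

So an exhaustive search does find a perfectly aligned substring for this input, which is the result one would expect partial_ratio to report.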

In my opinion, one of two things should happen. Either the docstring should explicitly mention that, when python-Levenshtein is used, this function is not guaranteed to use the best-aligned substring. Or partial_ratio should keep using difflib to calculate the matching_blocks, even when python-Levenshtein is available, and use python-Levenshtein only for the final ratio calculation when looping over the matching_blocks.
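
The second option could look roughly like the sketch below. It is written under stated assumptions: the function names `partial_ratio_sketch` and `difflib_ratio` and the `ratio_func` parameter are hypothetical, and the default scorer is difflib so the example runs without extra dependencies. If python-Levenshtein is installed, its `Levenshtein.ratio` could be passed as `ratio_func` instead.

```python
from difflib import SequenceMatcher


def difflib_ratio(s1, s2):
    """Default scorer for this sketch; a faster ratio function could be swapped in."""
    return SequenceMatcher(None, s1, s2).ratio()


def partial_ratio_sketch(s1, s2, ratio_func=difflib_ratio):
    """Hypothetical sketch: always take the matching blocks from difflib,
    and use `ratio_func` only to score each candidate window."""
    shorter, longer = (s1, s2) if len(s1) <= len(s2) else (s2, s1)

    # Matching blocks always come from difflib in this sketch.
    blocks = SequenceMatcher(None, shorter, longer).get_matching_blocks()

    best = 0.0
    for a_start, b_start, _size in blocks:
        # Align the window in `longer` with the start of `shorter`.
        long_start = max(b_start - a_start, 0)
        window = longer[long_start:long_start + len(shorter)]
        best = max(best, ratio_func(shorter, window))
    return round(100 * best)


print(partial_ratio_sketch("aaaa", "babaaaab"))  # 100
```

For the example input, difflib reports the block that aligns "aaaa" with index 3 of the longer string, so the exact match is among the scored windows.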


maxbachmann · 2020-04-15T19:21:06Z

duplicate of #79

maxbachmann closed this as completed Apr 15, 2020
