tornado_mysql: round to the nearest full microsecond #2

Merged
merged 1 commit into from
Apr 12, 2017

Conversation

xs commented Apr 12, 2017

After we convert the microseconds string to a float and multiply by 1e6, the result can land just below the intended whole number, so we can lose a microsecond when truncating back to an integer with int():

Example:

In [1]: usecs = float('0.524226') * 1e6

In [2]: usecs
Out[2]: 524225.99999999994

In [3]: int(usecs)
Out[3]: 524225
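
To make the cause visible, here is a minimal sketch (the variable names are illustrative and not part of the patch). It uses `decimal.Decimal` to print the exact value held by the float, then compares truncation with rounding:

```python
from decimal import Decimal

# The product is the nearest representable double, which need not be
# exactly 524226.
product = float('0.524226') * 1e6

# Decimal(float) converts without rounding, so this prints every digit
# the float actually stores.
exact = Decimal(product)
print('exact value: %s' % exact)

# int() truncates toward zero; round() picks the nearest whole number.
truncated = int(product)
rounded = int(round(product))
print('truncated: %d, rounded: %d' % (truncated, rounded))
```

Because the stored value sits slightly below the intended whole number, truncation drops a full microsecond while rounding recovers it.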

Test code:

old_errors = 0
new_errors = 0

for expected in range(1000000):
    usecs_string = '0.' + str(expected).zfill(6)

    usecs = float(usecs_string) * 1e6

    old_actual = int(usecs)
    new_actual = int(round(usecs))

    if old_actual != expected:
        old_errors += 1

    if new_actual != expected:
        new_errors += 1


# Parenthesized single-argument form runs on both Python 2 and Python 3.
print('old_errors: ' + str(old_errors))
print('new_errors: ' + str(new_errors))

Test output:

old_errors: 11549
new_errors: 0

cc @V0idmain

@@ -123,7 +123,7 @@ def convert_datetime(obj):
     usecs = '0'
     if '.' in hms:
         hms, usecs = hms.split('.')
-        usecs = float('0.' + usecs) * 1e6
+        usecs = round(float('0.' + usecs) * 1e6)

dangnvang commented Apr 12, 2017

I haven't checked yet, but is usecs a whole number of microseconds (before the round bit)? If so, it seems like it could be simpler to just use that directly (and/or convert it to int from string)?

Author

You may be right. I haven't checked either!

I just wanted to leave the smallest footprint that passes for all million possible microsecond test cases.

Collaborator

@dangnvang

I think that any of the changes should be fine. I added a new scenario and it looks OK as well 🙂

This test code:

new2_errors = 0

for expected in range(1000000):
    usecs_string2 = round(float('0.' + str(expected).zfill(6)) * 1e6)

    if usecs_string2 != expected:
        new2_errors += 1

# Parenthesized single-argument form runs on both Python 2 and Python 3.
print('new2_errors: ' + str(new2_errors))

Produces this output:

new2_errors: 0

So I think both cases are OK. What do you think?

@agustinavarela:
I was thinking more like usecs = int(usecs) instead of usecs = round(float('0.' + usecs) * 1e6)

Looking at the rest of the code, this appears to be the conversion from MySQL datetimes to Python. When MySQL returns a datetime with fractional seconds (microseconds look like the only fractional unit supported), it always seems to have 6 digits after the decimal point, so I think we're safe to go with the simpler int(usecs) conversion.

Regardless of which one we go with, this change should be tested with an actual DB where you write a datetime and read it back, ensuring that the microseconds portion of it is correctly converted in and out of MySQL.
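
As a DB-free first step toward that check, here is a rough sketch of a round trip through the text form. It is not a substitute for testing against a real MySQL server. `parse_mysql_datetime` is a hypothetical stand-in that mirrors the rounding logic in the diff; it is not the library's actual converter, and the `'%Y-%m-%d %H:%M:%S.%f'` text format is an assumption about what the server returns:

```python
import datetime


def parse_mysql_datetime(text):
    """Hypothetical stand-in for the converter, using the rounding fix."""
    date_part, time_part = text.split(' ')
    usecs = 0
    if '.' in time_part:
        time_part, frac = time_part.split('.')
        # Same idea as the patch: scale to microseconds, then round.
        usecs = int(round(float('0.' + frac) * 1e6))
    year, month, day = [int(x) for x in date_part.split('-')]
    hour, minute, second = [int(x) for x in time_part.split(':')]
    return datetime.datetime(year, month, day, hour, minute, second, usecs)


# Format a datetime as text, parse it back, and compare.
original = datetime.datetime(2017, 4, 12, 10, 30, 15, 524226)
as_text = original.strftime('%Y-%m-%d %H:%M:%S.%f')
round_tripped = parse_mysql_datetime(as_text)
print('round trip ok: %s' % (round_tripped == original))
```

A real test would replace the `strftime` step with an INSERT followed by a SELECT through the library.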

dangnvang commented Apr 12, 2017

So I did some testing and determined that the MySQL documentation is a little misleading in some places.

In practice, datetimes have a fractional-seconds precision that you can specify, which sets the number of decimal places. Testing actual use of the library shows that the int(usecs) approach will not work, because the fractional part is not always 6 characters long as I previously inferred from other parts of the documentation.

Keeping it as round(float('0.' + usecs) * 1e6) should be ok!

Sorry for the confusion!
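
To illustrate the point about variable precision, here is a small sketch comparing the approaches on fractional strings of different lengths. All three function names are hypothetical. The padding variant is an assumption of mine that was not discussed in this PR; it is shown only as a float-free alternative:

```python
def usecs_via_int(frac):
    # Only correct when frac always has exactly 6 digits.
    return int(frac)


def usecs_via_round(frac):
    # The approach merged in this PR: scale to microseconds, then round.
    return int(round(float('0.' + frac) * 1e6))


def usecs_via_padding(frac):
    # Hypothetical alternative: right-pad with zeros to 6 digits so no
    # float arithmetic is involved (digits beyond 6 are cut off).
    return int(frac[:6].ljust(6, '0'))


for frac in ('5', '52', '524', '524226'):
    print('%-6s int=%-6d round=%-6d padding=%-6d' % (
        frac,
        usecs_via_int(frac),
        usecs_via_round(frac),
        usecs_via_padding(frac),
    ))
```

With a one-digit fraction such as '5' (half a second), `int` yields 5 microseconds, while the other two yield 500000.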

Collaborator

@dangnvang thanks so much for checking that! :)

@agustinavarela agustinavarela merged commit 47d1ed9 into buzzfeed:master Apr 12, 2017
3 participants