ref: Add HTTPReader #406
Conversation
Remark about error conditions: when we send an invalid query to ClickHouse, it replies with HTTP 500 via the HTTP interface. We will have to properly parse the response in these cases as well, otherwise those error messages, which are now visible in Sentry, will be lost.
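To illustrate the concern, here is a minimal sketch of surfacing the server-provided error text instead of dropping it. The `ClickHouseError` class and the function name are hypothetical, not Snuba's actual API; only the behavior of ClickHouse returning a plain-text error body on non-2xx HTTP responses is taken from the discussion.

```python
# Hypothetical sketch: keep the ClickHouse error text so it still shows up in Sentry.
# `ClickHouseError` and `raise_for_clickhouse_error` are illustrative names.
class ClickHouseError(Exception):
    pass


def raise_for_clickhouse_error(status: int, body: bytes) -> None:
    """Raise with the server-provided message on non-2xx responses."""
    if status // 100 != 2:
        # ClickHouse puts a plain-text description (e.g. "Code: 62, ...")
        # in the body of error responses over the HTTP interface.
        message = body.decode("utf-8", errors="replace").strip()
        raise ClickHouseError(f"HTTP {status}: {message}")
```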
This is "ready for review" in the sense that it is ready for some thought about the questions noted inline.
snuba/reader.py (Outdated)

    def __init__(
        self, host: str, port: int, options: Optional[Mapping[str, str]] = None
    ):
        self.__base_url = f"http://{host}:{port}/"
Python 3. 🙌
snuba/reader.py (Outdated)

    response = requests.post(
        urljoin(self.__base_url, "?" + urlencode(parameters)),
        data=query.encode("utf-8"),
    )
This could probably use straight urllib3, but I used requests here for consistency with the HTTPBatchWriter.
Not anymore @tkaemming.
💀
Used 97b36c7 as the pattern for this change. Probably need to take a closer look at request headers.
snuba/util.py (Outdated)

@@ -444,7 +444,11 @@ def raw_query(body, sql, client, timer, stats=None):
     query_settings['max_threads'] = 1

     try:
-        result = NativeDriverReader(client).execute(
+        result = HTTPReader(
We'll probably need to find a way to switch these out at runtime, since this is a bit more touchy than the writer change. The only way I've thought of to test this (short of Matt doing some wacky Kubernetes stuff) is to sample requests (randomly?) based on dynamic configuration. Looking for additional ideas here.
What about doing something like the serviceDelegator and running both for a small number of queries? That would give you more solid validation in a shorter time. The fact that the reader is a class of its own helps:

class DelegatingReader:
    def __init__(self, native, http):
        self.native = native
        self.http = http

    def execute(self, query):
        if random.random() < settings.EXPERIMENT_SAMPLE:
            native_result = self.native.execute(query)
            http_result = self.http.execute(query)
            # compare and log any discrepancy; return the native
            # result if the discrepancy is too big
            return native_result
        elif settings.PROD_READER == 'native':
            return self.native.execute(query)
        else:
            return self.http.execute(query)

You can add a fallback to native in case of an error on HTTP so we do not take down Snuba.
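The fallback mentioned above could look roughly like the following self-contained sketch. The reader objects are assumed to expose a compatible `execute(query)`; the class name `FallbackReader` is made up for illustration.

```python
import logging

logger = logging.getLogger(__name__)


class FallbackReader:
    """Sketch: try the HTTP reader first, fall back to native on any error.

    `native` and `http` are assumed to expose a compatible execute(query).
    """

    def __init__(self, native, http):
        self.__native = native
        self.__http = http

    def execute(self, query):
        try:
            return self.__http.execute(query)
        except Exception:
            # Log and fall back so a broken HTTP path does not take down Snuba.
            logger.exception("HTTP reader failed, falling back to native")
            return self.__native.execute(query)
```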
One issue with this is that we don't run uwsgi with threads enabled. I don't specifically recall why (#312) but I think we had to turn them off for some reason?
If we just want to spot check the results, we could possibly try running some sample of queries that were logged to Kafka with a consumer that runs them with both readers? This would also be outside of the request/response cycle, which I think is a benefit in this case?
We do run with threads now. We explicitly do it in the Kubernetes config, and as part of the mywsgi changes we’re making that the default. It’s silly to run with the GIL disabled.
Well then.
Considering we are already using the HTTP interface for writes, I would try to keep this experiment simple. I can think of three possible issues we are trying to protect against:
- Results are incorrect. For this we need to run both queries and compare, either within the API call or outside it by replaying the queries logged to Kafka. If we are already logging to Kafka, why not write a quick consumer that consumes that topic, runs both queries, and compares? That seems quicker than introducing a DelegatingReader in Snuba.
- Performance is dismal. For this we could (I don't know how much of a pain it is with our Kube deployment) do a canary instead: deploy the API with the HTTP reader on just a few nodes and monitor those?
- The HTTP reader makes Snuba unstable. See "performance is dismal": if we could do a canary, we would not need a DelegatingReader for this either.
What is concerning you the most?
I'm not super worried about performance or stability, since I think that would show up pretty quickly in the metrics or in Sentry and a rollback would be pretty straightforward. My biggest concern is subtle inconsistencies in the data returned by the two drivers that cause weird behavior that isn't easily identified by any specific metric close to the source of the problem.
I think the only drawback of running a canary for performance and stability sanity checking would be the quantity of work required to enable that, which I can't speak to. Maybe @mattrobenolt could provide some context here?
If we are already logging to Kafka, why not write a quick consumer that consumes that topic, runs both queries, and compares? That seems quicker than introducing a DelegatingReader in Snuba
Yeah, I agree this seems like the lowest impact way to do that kind of comparison right now.
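The comparison consumer discussed above could be sketched as follows. The Kafka consumption itself is elided; this only shows the replay-and-compare core, with reader objects and the result shape assumed for illustration.

```python
# Sketch of the offline comparison: feed queries (e.g. consumed from the
# query log topic) through both readers and collect any mismatches.
# Reader objects are assumed to expose execute(query); result equality is
# a naive stand-in for a real structural comparison.
def compare_readers(queries, native, http):
    discrepancies = []
    for query in queries:
        native_result = native.execute(query)
        http_result = http.execute(query)
        if native_result != http_result:
            discrepancies.append((query, native_result, http_result))
    return discrepancies
```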
It wouldn't be hard if it's behind some environment variable that can be toggled. I can then just run 1 or more Pods with this on or whatever.
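The environment-variable toggle could be as small as this sketch. The variable name `SNUBA_USE_HTTP_READER` and the function are made up for illustration; only the idea of flipping individual Pods comes from the comment above.

```python
import os

# Hypothetical sketch: pick the reader implementation from an environment
# variable so a subset of Pods can be switched over independently.
def choose_reader(native, http):
    if os.environ.get("SNUBA_USE_HTTP_READER", "0") == "1":
        return http
    return native
```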
Are you planning to also remove all the readonly accesses to ClickhousePool and replace them with the HTTPReader?
Are you planning to also remove all the readonly accesses to ClickhousePool and replace them with the HTTPReader?
Eventually, yes — not in a huge hurry to get to this but it should be done at some point for consistency, I think.
1e1401f to cc69f00
snuba/reader.py (Outdated)

        # TODO: Distinguish between ClickHouse errors and other HTTP errors.
        if response.status // 100 != 2:
            raise HTTPError(f"{response.status} Unexpected")
I think we should include the error message in the response payload before this goes to production, otherwise we would not be able to see ClickHouse errors in prod. Also, not having the ClickHouse error would make things very hard to troubleshoot if something goes wrong when you roll this out.
snuba/util.py (Outdated)

@@ -444,7 +444,11 @@ def raw_query(body, sql, client, timer, stats=None):
     query_settings['max_threads'] = 1

     try:
-        result = NativeDriverReader(client).execute(
+        result = HTTPReader(
I think the only blocker is the rollout plan.
    def execute(
        self,
        query: str,
        settings: Optional[Mapping[str, str]] = None,
Is there a reason why the writer does not take settings per query (in the write method) while the reader does? If not, I would add them in both for consistency in the API.
I don't think there was a need for it at the call site; I'm fine with adding it, though.
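Threading per-call settings through to the request could look like this sketch, so the writer's signature matches the reader's. The helper name and URL layout are illustrative; only the pattern of encoding settings as query-string parameters (as the diff above does for the reader) is taken from the discussion.

```python
from typing import Mapping, Optional
from urllib.parse import urlencode

# Hypothetical helper: merge optional per-call settings into the request URL,
# mirroring how the reader passes its parameters to the ClickHouse HTTP port.
def build_url(base_url: str, settings: Optional[Mapping[str, str]] = None) -> str:
    parameters = dict(settings or {})
    return base_url + "?" + urlencode(parameters)
```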
snuba/clickhouse/http.py (Outdated)

            body=query.encode("utf-8"),
        )

        handle_clickhouse_response(response)
Nit on naming: since this method either does nothing to the response or throws, why not follow a naming convention like the raise_for_status method? We could rename this to something like assert_successful_response, which implies it will throw.
Makes sense, will come up with something better here.
        handle_clickhouse_response(response)

        result = json.loads(response.data.decode("utf-8"))
        del result["statistics"]
You are removing these just so as not to expose ClickHouse internals, is that correct?
Sort of, yeah — this is just to make the result structure consistent between both implementations (the native driver doesn't contain these fields.)
I think some of them might be helpful to use internally later (and can be fetched through other APIs with the native driver) but I want to avoid changing the return type too much here with this change. I also think we should probably try to avoid exposing them externally if possible.
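The normalization described above can be sketched as pruning the FORMAT JSON payload down to what the native driver also produces. Which keys to drop beyond "statistics" is illustrative; the key names follow ClickHouse's JSON output format.

```python
import json

# Sketch: trim the ClickHouse FORMAT JSON response to a shape consistent with
# the native driver's result. The exact set of pruned keys is illustrative.
def normalize_result(raw: bytes) -> dict:
    result = json.loads(raw.decode("utf-8"))
    for key in ("statistics", "rows", "rows_before_limit_at_least"):
        result.pop(key, None)
    return result
```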
@@ -478,7 +478,7 @@ def raw_query(body, sql, client, timer, stats=None):
     error = str(ex)
     status = 500
     logger.exception("Error running query: %s\n%s", sql, error)
-    if isinstance(ex, ClickHouseError):
+    if isinstance(ex, (NativeDriverClickHouseError, HTTPDriverClickHouseError)):
This is probably a smell that, at this point, the Reader doesn't raise a consistent error type.
        settings.CLICKHOUSE_HOST,
        settings.CLICKHOUSE_HTTP_PORT,
        {'output_format_json_quote_64bit_integers': '0'},
    )
Going to make this switchable, just haven't decided how yet.
This is also another thing that will require per-dataset configuration (since not every dataset is going to use the same host) but that's nothing new.
Also, the motivation for moving this to the module level is that the connection pool should probably only be instantiated once per process; doing this in raw_query probably would have created a lot of unnecessary connections.
Also going to need to make sure this is set to use
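The once-per-process instantiation mentioned above can be sketched with a memoized factory. `PooledReader` is a stand-in for the HTTPReader in this PR, used only so the example is self-contained.

```python
from functools import lru_cache


class PooledReader:
    """Stand-in for the HTTPReader in this PR; would own a connection pool."""

    def __init__(self, host: str, port: int):
        self.base_url = f"http://{host}:{port}/"


@lru_cache(maxsize=None)
def get_reader(host: str, port: int) -> PooledReader:
    # lru_cache makes this a per-process singleton per (host, port), so
    # repeated calls reuse the same reader (and thus the same pool) rather
    # than creating new connections on every request.
    return PooledReader(host, port)
```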
This has been replaced by #819, which is a much cleaner implementation of the same ideas.
No description provided.