-
Notifications
You must be signed in to change notification settings - Fork 119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DEV-11770] [DEV-11771] - Add account_download create and load commands, Add generate spark download command #4265
base: poc/dev-11769-spark-downloads-from-table
Are you sure you want to change the base?
Conversation
…into ftr/dev-11771-spark-account-download-from-table
…park-account-download-from-table [DEV-11771] - Spark account download from unfiltered table
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few more changes on the download query that I don't think will impact your captured metrics, but should be corrected so we don't carry forward that error.
@@ -108,7 +116,7 @@ def download_to_csv( | |||
raise e | |||
finally: | |||
Path(temp_file_path).unlink() | |||
return [destination_path], count | |||
return CSVDownloadMetadata([destination_path], row_count) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not worth fixing as part of this work, but worth calling out so that we remember. This change to return a dataclass will mean that the current download functionality will need to take this new return into account as well.
if self.spark_created_by_command: | ||
self.spark.stop() | ||
append_files_to_zip_file(final_csv_data_file_locations, download_zip_path) | ||
self._logger.info(f"Generated the following data csv files {final_csv_data_file_locations}") | ||
return final_csv_data_file_locations, record_count | ||
return CSVDownloadMetadata(final_csv_data_file_locations, record_count, column_count) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above
( | ||
( | ||
submission_attributes.quarter_format_flag = TRUE | ||
AND submission_attributes.reporting_fiscal_quarter = 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are really close to this SQL being done. This download query should not filter by anything that would be input by a user. In this case the Fiscal Quarter, Period, and Year fall into that category. Similar to how the filtering by Fiscal Year had to be moved into the next query, these fields do as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment applies to anywhere that is filtering by a specific Fiscal Period or Quarter.
Description:
This PR updates the load_query_to_delta command to allow the creation and loading of a table that represents an unfiltered account download.
Technical details: