Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DEV-11770] [DEV-11771] - Add account_download create and load commands, Add generate spark download command #4265

Open
wants to merge 14 commits into
base: poc/dev-11769-spark-downloads-from-table
Choose a base branch
from

Conversation

zachflanders-frb
Copy link
Contributor

Description:
This PR updates the load_query_to_delta command to allow the creation and loading of a table that represents an unfiltered account download.

Technical details:

@zachflanders-frb zachflanders-frb added the do not merge [pr] This PR should not be merged label Jan 31, 2025
@zachflanders-frb zachflanders-frb marked this pull request as ready for review February 7, 2025 15:12
@zachflanders-frb zachflanders-frb added the ready for review Desiring a code review, even if not ready to be merged label Feb 7, 2025
@zachflanders-frb zachflanders-frb changed the base branch from qat to poc/dev-11769-spark-downloads-from-query February 19, 2025 15:51
@zachflanders-frb zachflanders-frb removed the do not merge [pr] This PR should not be merged label Feb 19, 2025
@zachflanders-frb zachflanders-frb changed the title [DEV-11770] - Add account_download create and load commands [DEV-11770] [DEV-11771] - Add account_download create and load commands, Add generate spark download command Feb 19, 2025
@sethstoudenmier sethstoudenmier changed the base branch from poc/dev-11769-spark-downloads-from-query to poc/dev-11769-spark-downloads-from-table February 20, 2025 18:34
Copy link
Contributor

@sethstoudenmier sethstoudenmier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few more changes on the download query that I don't think will impact your captured metrics, but should be corrected so we don't carry forward that error.

@@ -108,7 +116,7 @@ def download_to_csv(
raise e
finally:
Path(temp_file_path).unlink()
return [destination_path], count
return CSVDownloadMetadata([destination_path], row_count)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not worth fixing as part of this work, but worth calling out so that we remember. This change to return a dataclass will mean that the current download functionality will need to take this new return into account as well.

if self.spark_created_by_command:
self.spark.stop()
append_files_to_zip_file(final_csv_data_file_locations, download_zip_path)
self._logger.info(f"Generated the following data csv files {final_csv_data_file_locations}")
return final_csv_data_file_locations, record_count
return CSVDownloadMetadata(final_csv_data_file_locations, record_count, column_count)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above

(
(
submission_attributes.quarter_format_flag = TRUE
AND submission_attributes.reporting_fiscal_quarter = 4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are really close to this SQL being done. This download query should not filter by anything that would be input by a user. In this case the Fiscal Quarter, Period, and Year fall into that category. Similar to how the filtering by Fiscal Year had to be moved into the next query, these fields do as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar comment applies to anywhere that is filtering by a specific Fiscal Period or Quarter.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready for review Desiring a code review, even if not ready to be merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants