Harvester force update option #5096

Open
1 task
jbrown-xentity opened this issue Feb 19, 2025 · 0 comments
Labels
H2.0/Harvest-General General Harvesting 2.0 Issues


User Story

In order to reprocess records after harvester code changes, data.gov admins want a "force update" feature that ignores the metadata source comparison check and updates all datasets from the source.

Acceptance Criteria

[ACs should be clearly demoable/verifiable whenever possible. Try specifying them using BDD.]

  • GIVEN harvest code has been changed
    AND a harvest source needs to be re-processed
    WHEN a manual harvest is initialized with the "force update" flag
    THEN all datasets from the source are processed (regardless of whether they have changed or not)

Background

Normally, the only reason to re-process a dataset is that its metadata has changed. However, if the harvester code or logic changes, there are reasons to bypass that optimization check and simply re-pull the data.
In the past we have sometimes handled this by clearing and re-harvesting a data source. This is inelegant, causes downtime for datasets, and can sometimes change the URLs of dataset pages. Updating in place is much better.
In the future we could even implement a force update across all datasets; manually re-syncing data sources once a year seems like a good practice.

This will also make bug fixes easier to apply as we go live, when code changes may be required.
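
To make the requested behavior concrete, here is a minimal sketch of how a "force update" flag could bypass the change check. It assumes the comparison is a hash of each source record against a hash stored at the last harvest; the issue does not say how the check is implemented. The names `record_hash`, `plan_harvest`, and `force_update` are hypothetical and are not taken from the harvester codebase.

```python
import hashlib
import json


def record_hash(record: dict) -> str:
    """Return a stable hash of a metadata record (hypothetical helper)."""
    # sort_keys makes the serialization independent of key order.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def plan_harvest(source_records: dict, stored_hashes: dict,
                 force_update: bool = False) -> dict:
    """Decide what to do with each record identifier.

    source_records: identifier -> metadata dict from the harvest source
    stored_hashes:  identifier -> hash saved at the previous harvest
    force_update:   assumed flag; when True, the comparison is skipped
                    and every existing record is updated in place
    """
    plan = {"create": [], "update": [], "skip": [], "delete": []}
    for identifier, record in source_records.items():
        if identifier not in stored_hashes:
            plan["create"].append(identifier)
        elif force_update or record_hash(record) != stored_hashes[identifier]:
            # Changed metadata, or the admin asked to re-process everything.
            plan["update"].append(identifier)
        else:
            plan["skip"].append(identifier)
    # Records we know about that no longer appear in the source.
    plan["delete"] = [i for i in stored_hashes if i not in source_records]
    return plan
```

With `force_update=False`, unchanged records land in `skip`. With `force_update=True`, the `skip` list stays empty and every previously harvested record is updated in place, which avoids the clear-and-re-harvest cycle and keeps existing dataset URLs.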

Security Considerations (required)

[Any security concerns that might be implicated in the change. "None" is OK, just be explicit here!]

Sketch

Projects
Status: 📥 Queue