🎉 Add OWID data managers to metadata #3786

Draft
wants to merge 2 commits into
base: master

Marigold
Collaborator

@Marigold Marigold commented Jan 3, 2025

Implements #2465

Add a field owid_data_managers to DatasetMeta. The original proposal suggested setting it in the snapshot and then propagating (and combining) it downstream. This PR doesn't add it to Snapshot and only uses it in the garden / grapher steps. I felt that it's not as useful in snapshot metadata and would unnecessarily complicate things.

The grapher step should map managers to user ids and save them in the new JSON column in the datasets table. Alternatively, we could save just the manager names (since querying on JSON fields is painful anyway).
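To make the mapping step concrete, here is a minimal sketch of how manager names could be turned into user ids and serialized for a JSON column. It is not the code in this PR: `DatasetMetaSketch`, `managers_to_user_ids`, the `users_by_name` lookup and the user id `12` are all hypothetical names and values chosen for illustration.

```python
import json
from dataclasses import dataclass, field


@dataclass
class DatasetMetaSketch:
    """Hypothetical stand-in for DatasetMeta with only the fields needed here."""

    short_name: str
    owid_data_managers: list[str] = field(default_factory=list)


def managers_to_user_ids(managers, users_by_name):
    """Map manager names to user ids.

    Returns (ids, unknown): ids keeps the input order without duplicates,
    unknown lists names that have no matching user.
    """
    ids, unknown = [], []
    for name in managers:
        user_id = users_by_name.get(name)
        if user_id is None:
            unknown.append(name)
        elif user_id not in ids:
            ids.append(user_id)
    return ids, unknown


meta = DatasetMetaSketch("cherry_blossom", owid_data_managers=["Fiona Spooner"])
users_by_name = {"Fiona Spooner": 12}  # made-up id, assumed to come from a users table

ids, unknown = managers_to_user_ids(meta.owid_data_managers, users_by_name)
json_column_value = json.dumps(ids)  # value that would go into the JSON column
```

Returning the unmatched names separately leaves the caller free to warn or fail when a manager in the YAML has no user account. Storing plain names instead would skip this lookup entirely.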

Notes from Data architecture meeting

  • Use cases
    • Can I archive this dataset? → knowing who to ask
    • Reporting an issue on a chart → knowing who to tag
  • What is involved?
    • Version A: add it to snapshots, then propagate ← harder
    • 🏆 Version B: add it late in the YAML ← easier
  • Could we do it historically/automatically?
    • ✅ Historically: we could use Git commit history on the YAML doc for a dataset to populate the field for that dataset
      • …but we won’t be able to do this for fast track, since owidbot did the magic
    • ✅ Automatically: can we do something with the wizard, using who we autodetect you to be (from Tailscale info)
  • Related issues
    • Charts are sometimes “owned” by the ETL user, rather than by a person
      • Could we detect the owner of the indicators used in a chart?
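
The "historically" idea from the notes above could be sketched roughly as follows. This is only an illustration under assumptions: the helper names, the `BOT_AUTHORS` set and the "most frequent committers" heuristic are hypothetical and not part of this PR. The only external call is `git log --follow --format=%an -- <path>`, which prints one author name per commit touching the file.

```python
import subprocess
from collections import Counter

# Assumption: commits by these authors say nothing about who manages a dataset
# (e.g. fast-track commits made by owidbot).
BOT_AUTHORS = {"owidbot"}


def managers_from_log(log_output, max_managers=2):
    """Pick likely managers from `git log --format=%an` output.

    Heuristic (an assumption): the human authors with the most commits.
    """
    authors = [line.strip() for line in log_output.splitlines() if line.strip()]
    counts = Counter(a for a in authors if a not in BOT_AUTHORS)
    return [name for name, _ in counts.most_common(max_managers)]


def managers_from_git(yaml_path, repo_dir="."):
    """Run git log on a dataset's YAML file and derive candidate managers."""
    result = subprocess.run(
        ["git", "log", "--follow", "--format=%an", "--", yaml_path],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return managers_from_log(result.stdout)


# Example with made-up log output rather than a real repository
sample_log = "Alice\nowidbot\nBob\nAlice\n"
candidates = managers_from_log(sample_log)
```

Datasets whose history contains only bot commits come back with an empty list, which matches the caveat above about fast track.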

TODO

  • Add data manager field to dataset metadata
  • Fill data manager automatically when creating steps in Wizard
  • Fill it historically from git history whenever possible
  • Fill it when using fast-track (use IP -> Tailscale)

@owidbot
Contributor

owidbot commented Jan 3, 2025

Quick links (staging server):

Site Dev Site Preview Admin Wizard Docs

Login: ssh owid@staging-site-owid-data-managers

chart-diff: ✅ No charts for review.
data-diff: ❌ Found differences
~ Dataset garden/biodiversity/2024-01-25/cherry_blossom
+   + owid_data_managers:
+   +   - Fiona Spooner
  = Table cherry_blossom
= Dataset garden/who/2024-09-09/flu_test
  = Table flu_test


Legend: +New  ~Modified  -Removed  =Identical  Details
Hint: Run this locally with etl diff REMOTE data/ --include yourdataset --verbose --snippet

Automatically updated datasets matching weekly_wildfires|excess_mortality|covid|fluid|flunet|country_profile|garden/ihme_gbd/2019/gbd_risk are not included

Edited: 2025-01-03 07:53:41 UTC
Execution time: 17.97 seconds


2 participants