Hello!

I love the `SchemaModel` API and how it lets me exploit inheritance in a Pydantic-style way. One use case I'm having trouble finding an elegant solution for: I have a pipeline with a function that transforms data matching an `InputSchema` so that the output keeps only a subset of the upstream columns (and likely adds a few more). I currently have to copy and paste every column that appears in both `InputSchema` and `OutputSchema`. I'd love to have `OutputSchema` inherit from `InputSchema` and then apply modifications such as removing columns. Current implementation of an input and output schema for a function that drops a column:
It's clear from this implementation that `keep_col` is duplicated exactly across both schemas, so it would be nice to find a way to deduplicate that code.
I know that I could call `SchemaModel.to_schema()` and then use the `DataFrameSchema` transformation methods, but that would force me to mix and match the two syntactic representations of a schema, which isn't ideal.
Has anyone else run into a similar problem and found an elegant solution?