-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(Views): multiple joins #1611
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…355-query * commit '61ba3530794fbe2b5739a04c26dc35e704ce69c9': fix(dataset): slug format validation on load (sinaptik-ai#1609) fix(views): transformation using raw sql (sinaptik-ai#1608) Release v3.0.0-beta.10
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Reviewed everything up to 72b0584 in 2 minutes and 44 seconds
More details
- Looked at
176
lines of code in2
files - Skipped
0
files when reviewing. - Skipped posting
8
drafted comments based on config settings.
1. pandasai/query_builders/view_query_builder.py:127
- Draft comment:
Ensure that the target dataset key exists in schema_dependencies_dict before using it. Consider adding error handling for missing keys to prevent KeyError. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%
The comment raises a valid concern about potential KeyErrors. However, schema_dependencies_dict is a core dependency passed in constructor, and to_datasets comes from schema.relations which should be validated at schema creation time. This feels like it should be handled at a higher architectural level, not in this specific method. The comment is asking for defensive programming in what appears to be an internal API.
I might be wrong about schema validation - perhaps invalid relations could slip through. Also, defensive programming can be good practice even for internal APIs.
While defensive programming is good, this particular case feels like it would be masking a more fundamental data integrity issue that should be caught earlier. Adding error handling here would be treating a symptom rather than the cause.
Delete the comment as it suggests error handling at the wrong level of abstraction. Data integrity should be enforced at schema validation time.
2. pandasai/query_builders/view_query_builder.py:131
- Draft comment:
Using ' AND '.join(conditions) is fine; ensure that sanitize_view_column_name fully prevents SQL injection in join conditions. - Reason this comment was not posted:
Confidence changes required:33%
<= threshold50%
None
3. tests/unit_tests/query_builders/test_view_query_builder.py:266
- Draft comment:
Hardcoding expected multi-line SQL strings in tests can be brittle due to formatting. Consider comparing parsed SQL expressions or using a normalization function in tests. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%
The comment raises a valid point about test maintainability - string comparisons of SQL can be fragile if formatting changes. However, the current approach is consistent throughout the test file and seems to be working well. The suggestion would require significant test refactoring and adding new dependencies/utilities for SQL parsing. The current approach, while not perfect, is simple and readable.
The comment identifies a real testing best practice issue. SQL parsing would make tests more robust against formatting changes.
While technically correct, implementing the suggestion would add complexity for minimal benefit since the current string-based tests are working well and are consistent throughout the codebase.
Delete the comment since it suggests a major testing approach change that would add complexity without providing enough benefit to justify the refactoring effort.
4. pandasai/query_builders/view_query_builder.py:118
- Draft comment:
Consider renaming the variable 'to_datasets' to 'target_dataset' for clarity, since it represents a single target dataset extracted from the join relation. - Reason this comment was not posted:
Marked as duplicate.
5. pandasai/query_builders/view_query_builder.py:121
- Draft comment:
In the join condition strings, sanitize_view_column_name is used directly. Consider if using self.normalize_view_column_name would be more consistent with column aliasing. If the intent is to preserve table.column format for join predicates, document this choice. - Reason this comment was not posted:
Marked as duplicate.
6. pandasai/query_builders/view_query_builder.py:126
- Draft comment:
The logic grouping join conditions by target dataset and combining them with 'AND' is clear and DRY. Ensure that tests sufficiently cover scenarios where multiple join conditions exist for the same target table. - Reason this comment was not posted:
Marked as duplicate.
7. tests/unit_tests/query_builders/test_view_query_builder.py:243
- Draft comment:
New tests (test_multiple_joins_same_table and test_three_table_join) provide comprehensive coverage for multi-join scenarios. Consider adding brief inline comments to clarify their intent, and monitor that the pretty-printed SQL output remains stable with sqlglot updates. - Reason this comment was not posted:
Marked as duplicate.
8. tests/unit_tests/query_builders/test_view_query_builder.py:16
- Draft comment:
The helper method _create_mock_loader is effective for creating mock loaders. If it becomes widely used across tests, consider refactoring it as a pytest fixture to promote reuse. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%
The suggestion is about code organization and test maintainability. The method is currently used in two new tests, but it's not clear if it will be widely used elsewhere. The comment is speculative ("if it becomes widely used") rather than addressing a current issue. Converting to a fixture would require additional work without clear immediate benefit.
The comment makes a reasonable suggestion for potential future refactoring, but it's speculative and not based on current widespread usage. The current implementation as a helper method is perfectly valid for its current usage.
While fixtures are good for test reuse, making premature abstractions based on potential future use cases can lead to unnecessary complexity. The current helper method approach is appropriate for its current limited usage.
The comment should be deleted as it's speculative and suggests a refactoring that isn't clearly needed given the current usage pattern.
Workflow ID: wflow_KGNpdW0aG8wfJk34
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Important
Fixes handling of multiple join conditions in
ViewQueryBuilder
by grouping and combining them, with new tests added for verification.ViewQueryBuilder
by grouping and combining conditions with 'AND'.test_multiple_joins_same_table
to verify multiple join conditions are combined correctly.test_three_table_join
to verify joining three different tables with correct conditions.This description was created by
for 72b0584. It will automatically update as commits are pushed.