fix: jsonwriter and recordbatchwriter to respect stats skipping #2989
ACTION NEEDED: delta-rs follows the Conventional Commits specification for release automation. The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.
Force-pushed from 8e15603 to 1d3952e
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2989      +/-   ##
==========================================
+ Coverage   72.44%   72.63%   +0.18%
==========================================
  Files         128      128
  Lines       40974    41199     +225
  Branches    40974    41199     +225
==========================================
+ Hits        29685    29924     +239
+ Misses       9374     9350      -24
- Partials     1915     1925      +10
I am making some modifications to this pull request to make the |
Sounds good, thanks! Also, I'm going to be using JsonWriter for a pretty intensive workload, so I'd be happy to chip in on any improvements or maintenance it might need. If you have anything in mind, feel free to tag me here or on Slack.
This is an update to JsonWriter and RecordBatchWriter that lets them write commit log stats in accordance with delta.dataSkippingNumIndexedCols and delta.dataSkippingStatsColumns when those properties are set on the table. If they are unset, the default behavior of collecting stats for the first 32 columns is preserved. Signed-off-by: Justin Jossick <[email protected]>
The JsonWriter was created before a lot of other code and was in need of a little refactor. The writer does not commit to the Delta table on its own, which can be a benefit for some performance-specific use cases. This change does, however, enforce that it must be initialized with a valid Delta table path, which ensures it can use the table configuration properly. Signed-off-by: R. Tyler Croy <[email protected]>
Force-pushed from bfd3220 to f5ad8ed
Description
This is an update to JsonWriter and RecordBatchWriter that lets them write commit log stats in accordance with delta.dataSkippingNumIndexedCols and delta.dataSkippingStatsColumns when those properties are set on the table. If they are unset, the default behavior of collecting stats for the first 32 columns is preserved.
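For readers unfamiliar with the two table properties, here is a minimal sketch of the selection rule described above. It is not the code from this PR: `stats_columns_for` is a hypothetical helper, it only looks at top-level column names, and it assumes that an explicit delta.dataSkippingStatsColumns list takes precedence over delta.dataSkippingNumIndexedCols and that a negative count means "all columns".

```rust
/// Default number of leading columns that get statistics when the table
/// sets neither property (the "first 32 columns" behavior).
const DEFAULT_NUM_INDEXED_COLS: i32 = 32;

/// Hypothetical helper: pick the top-level columns to collect stats for.
/// `num_indexed_cols` stands in for delta.dataSkippingNumIndexedCols and
/// `stats_columns` for delta.dataSkippingStatsColumns.
fn stats_columns_for(
    schema_columns: &[&str],
    num_indexed_cols: Option<i32>,
    stats_columns: Option<&str>,
) -> Vec<String> {
    // Assumption: an explicit column list wins over the numeric limit.
    if let Some(list) = stats_columns {
        let wanted: Vec<&str> = list
            .split(',')
            .map(str::trim)
            .filter(|name| !name.is_empty())
            .collect();
        // Keep schema order and ignore names that are not in the schema.
        return schema_columns
            .iter()
            .filter(|col| wanted.contains(*col))
            .map(|col| col.to_string())
            .collect();
    }

    let limit = num_indexed_cols.unwrap_or(DEFAULT_NUM_INDEXED_COLS);
    // Assumption: a negative limit (for example -1) means "all columns".
    let take = if limit < 0 {
        schema_columns.len()
    } else {
        limit as usize
    };
    schema_columns
        .iter()
        .take(take)
        .map(|col| col.to_string())
        .collect()
}
```

With a three-column schema and neither property set, all three columns fall under the default limit of 32; setting the count to 1 keeps only the first column, and passing a column list restricts stats to the named columns regardless of the count.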
Related Issue(s)
Tests
Tested by running all unit tests with
cargo test
and by following the instructions in CONTRIBUTING.md.