Transaction log JSON formatting issue when writing data via Python bindings #1017
rtyler added the bug (Something isn't working) and binding/python (Issues for the Python package) labels on Dec 14, 2022
As I mentioned in Slack, I think it's not that big of a deal for us to add some quotes around the |
Code to reproduce:
|
This was referenced Jan 8, 2023
wjones127 pushed a commit that referenced this issue on Jan 17, 2023:
# Description
Currently, the "operationParameters" written in commit info are not aligned with the delta-io connector. [Here](https://github.com/delta-io/delta/blob/36a7edb8cf507e713700ba827c5fb5ad32b9163e/core/src/main/scala/org/apache/spark/sql/delta/actions/actions.scala#L695) is a sample of the structure used in delta-io. The goal of this PR is to align with the delta-io approach, so the PR does two things: it converts all values to strings and deletes keys with null values.
# Related Issue(s)
Closes [issue #1017](#1017)
Co-authored-by: Ilya Moshkov <[email protected]>
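The two steps named in the commit description (stringify values, drop null keys) can be sketched in Python roughly as follows. This is only an illustration, not the code from the commit: the function name `normalize_operation_parameters` is hypothetical, and JSON-encoding non-string values with `json.dumps` is an assumption about what "convert all values to string" means.

```python
import json


def normalize_operation_parameters(params):
    """Return a copy of params with null values dropped and all values as strings.

    Hypothetical helper for illustration only. Assumption: values that are
    already strings are kept as-is, anything else is JSON-encoded.
    """
    normalized = {}
    for key, value in params.items():
        if value is None:
            # Delete keys with null values.
            continue
        # Convert non-string values (lists, numbers, booleans) to JSON text.
        normalized[key] = value if isinstance(value, str) else json.dumps(value)
    return normalized


# The operationParameters from the error message reported in this issue.
sample = {"mode": "Append", "partitionBy": [], "predicate": None}
print(normalize_operation_parameters(sample))
```

With the sample from this issue, the empty `partitionBy` list becomes the string `"[]"` and the null `predicate` key is dropped, so every remaining value is a plain string.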
chitralverma pushed a commit to chitralverma/delta-rs that referenced this issue on Mar 17, 2023:
# Description Currently writing "operationParameters" in commit info is misaligned with delta io connector. [Here](https://github.com/delta-io/delta/blob/36a7edb8cf507e713700ba827c5fb5ad32b9163e/core/src/main/scala/org/apache/spark/sql/delta/actions/actions.scala#L695) the sample of structure which is used in delta io. So the goal of this PR is to align with delta io approach and the PR do two thins: convert all values to string and delete keys with null values. # Related Issue(s) Closes [issue delta-io#1017](delta-io#1017) Co-authored-by: Ilya Moshkov <[email protected]>
Hi there, I am using the deltalake 0.6.4 Python bindings to write Delta tables to S3-compatible storage.
I noticed that the generated transaction log is not entirely the same as the example in https://github.com/delta-io/delta/blob/master/PROTOCOL.md (see the attached image: the left side is the log generated by the deltalake package, the right side is the example on GitHub).
As a result, the query engine Trino (version 397) is not able to read the Delta log properly.
The relevant part of the error message is:
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.io.IOException: com.fasterxml.jackson.databind.exc.MismatchedInputException: Cannot deserialize value of type `java.lang.String` from Array value (token `JsonToken.START_ARRAY`)
 at [Source: (String)"{"commitInfo":{"clientVersion":"delta-rs.0.5.0","operation":"delta-rs.Write","operationParameters":{"mode":"Append","partitionBy":[],"predicate":null},"timestamp":1670980236192}}"; line: 1, column: 131] (through reference chain: io.trino.plugin.deltalake.transactionlog.DeltaLakeTransactionLogEntry["commitInfo"]->io.trino.plugin.deltalake.transactionlog.CommitInfoEntry["operationParameters"]->java.util.LinkedHashMap["partitionBy"])
May I know if there is any way to work around the issue?
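For anyone hitting the same error, a small Python sketch along these lines could show which `operationParameters` entries in a log line are not plain JSON strings. The helper name `non_string_operation_parameters` is made up for illustration, and the sketch only inspects a line; it does not modify the transaction log or fix the table.

```python
import json


def non_string_operation_parameters(log_line):
    """Return the operationParameters entries whose values are not JSON strings.

    Hypothetical helper: lines without a commitInfo action yield an empty dict.
    """
    entry = json.loads(log_line)
    params = entry.get("commitInfo", {}).get("operationParameters", {})
    return {key: value for key, value in params.items() if not isinstance(value, str)}


# The commitInfo line quoted in the error message above.
line = (
    '{"commitInfo":{"clientVersion":"delta-rs.0.5.0","operation":"delta-rs.Write",'
    '"operationParameters":{"mode":"Append","partitionBy":[],"predicate":null},'
    '"timestamp":1670980236192}}'
)
print(non_string_operation_parameters(line))
```

For the line above this reports `partitionBy` (an array) and `predicate` (null). The reference chain in the error message ends at `partitionBy`, the array value.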
Slack Message