[Spark] Resolves #1679 issue glue catalog #2310
base: branch-2.3
Signed-off-by: Felipe Calixto Filho <felipe.calixto>
@calixtofelipe Thanks! Your PR looks very similar to my original PR (#1579), which introduced another issue at the time. I will share what we observed back then soon.
Hey @moomindani, in the original PR you changed the provider from 'delta' to 'parquet', and that caused the other issue because many other places check the provider (e.g., it impacts the time travel capability).
Thanks for clarifying it. I confirmed that your PR won't cause the same issue that I experienced.
Yes, only issue 1 has been resolved.
I agree, Issue 1 is more critical than Issue 2. Thanks for clarifying it.
I totally agree. I added the comment to the PR description and edited my comment in the issue. Thanks again for helping.
Any timeline as to when this will be merged?
Hi @calixtofelipe, which conf are you using to run it on AWS Glue? I mean not only spark.conf
Yes, @lucabem, after you build a delta package from this branch, you should set it as a delta-package via
+1
Which Delta project/connector is this regarding?
Description
spark.databricks.delta.fixSchema.GlueCatalog in DeltaSqlConf.
1 - In the cleanupTableDefinition function, the schema is updated with the table schema.
2 - In the updateCatalog function, after the table is created in the catalog, the table schema is updated using a session catalog function (alterTableDataSchema).
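As a minimal sketch of how the conf above might be switched on for a session: the helper name glue_schema_fix_confs is hypothetical, and the "true"/"false" value is an assumption (the PR text only names the key). The other two entries are the usual Delta Lake session settings.

```python
# Hypothetical helper: collects the session settings in a plain dict so they
# can be inspected before being handed to a Spark session builder.
def glue_schema_fix_confs(enabled: bool = True) -> dict:
    return {
        # Standard Delta Lake session settings.
        "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension",
        "spark.sql.catalog.spark_catalog": "org.apache.spark.sql.delta.catalog.DeltaCatalog",
        # Conf named in this PR; a boolean string value is an assumption.
        "spark.databricks.delta.fixSchema.GlueCatalog": str(enabled).lower(),
    }


confs = glue_schema_fix_confs()

# Print the settings as spark-submit style flags.
for key, value in sorted(confs.items()):
    print(f"--conf {key}={value}")
```

With PySpark, the same dict could be applied by calling builder.config(key, value) for each entry before getOrCreate().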
How was this patch tested?
I created 2 tests in DeltaTableBuilderSuite:
"Test schema external table delta glue catalog conf activated"
"Test schema delta glue catalog conf activated"
These tests only check that managed and external tables are created with the correct schema when the parameter is activated.
The solution was also tested against the AWS Glue catalog by creating the tables, checking in the Glue catalog that each table has the right schema, and checking that Athena can read the table.
These are the 2 ways to create tables with this solution:
Managed table:
The database location needs to be specified in the database catalog configuration.
External table
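A minimal sketch of the difference between the two cases, written as plain Spark SQL rather than the table builder the PR's tests use. The helper create_delta_table_sql, the table names, the columns, and the S3 path are all hypothetical. The managed table omits LOCATION and relies on the database location, while the external table states its own.

```python
from typing import Optional


def create_delta_table_sql(table: str, columns: dict, location: Optional[str] = None) -> str:
    """Build a CREATE TABLE statement for a Delta table (hypothetical helper)."""
    cols = ", ".join(f"{name} {dtype}" for name, dtype in columns.items())
    stmt = f"CREATE TABLE {table} ({cols}) USING delta"
    if location is not None:
        # An explicit LOCATION makes the table external.
        stmt += f" LOCATION '{location}'"
    return stmt


# Hypothetical schema and names, for illustration only.
columns = {"id": "BIGINT", "name": "STRING"}

# Managed table: no LOCATION, so the database location is used.
managed_sql = create_delta_table_sql("my_db.managed_tbl", columns)

# External table: the path is given explicitly.
external_sql = create_delta_table_sql(
    "my_db.external_tbl", columns, "s3://my-bucket/path/external_tbl"
)

print(managed_sql)
print(external_sql)
```

Each statement would then be run with spark.sql(...) in a session that has the conf enabled.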
Does this PR introduce any user-facing changes?
No.