[SPARK-50755][SQL] Pretty plan display for InsertIntoHiveTable
### What changes were proposed in this pull request?

Add `toString` for `HiveFileFormat` and `HiveTempPath` to make the display of the `InsertIntoHiveTable` plan pretty.

### Why are the changes needed?

I found that the current plan-replacing rule does not handle the trailing object hash properly:
https://github.com/apache/spark/blob/36d23eff4b4c3a2b8fd301672e532132c96fdd68/sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestHelper.scala#L62

Instead of fixing the replacing rule (see #49396; please let me know if any reviewer thinks we should fix that too), it seems we can override the `toString` of those classes to make them display pretty.

This is a minor improvement of the plan display for `InsertIntoHiveTable`, making it consistent with DataSource plans such as `InsertIntoHadoopFsRelationCommand`.

`InsertIntoHadoopFsRelationCommand`:
```
-- !query
insert into t6 values (97)
-- !query analysis
InsertIntoHadoopFsRelationCommand file:[not included in comparison]/{warehouse_dir}/t6, false, Parquet, [path=file:[not included in comparison]/{warehouse_dir}/t6], Append, `spark_catalog`.`default`.`t6`, org.apache.spark.sql.execution.datasources.InMemoryFileIndex(file:[not included in comparison]/{warehouse_dir}/t6), [ascii]
+- Project [cast(col1#x as bigint) AS ascii#xL]
   +- LocalRelation [col1#x]
```

`InsertIntoHiveTable`:
```patch
-- !query
insert into table spark_test_json_2021_07_16_01 values(1, 'a')
-- !query analysis
-InsertIntoHiveTable `spark_catalog`.`default`.`spark_test_json_2021_07_16_01`, false, false, [c1, c2], org.apache.spark.sql.hive.execution.HiveFileFormatxxxxxxxx, org.apache.spark.sql.hive.execution.HiveTempPath69beda67
+InsertIntoHiveTable `spark_catalog`.`default`.`spark_test_json_2021_07_16_01`, false, false, [c1, c2], Hive, HiveTempPath(file:[not included in comparison]/{warehouse_dir}/spark_test_json_2021_07_16_01)
 +- Project [cast(col1#x as int) AS c1#x, cast(col2#x as string) AS c2#x]
    +- LocalRelation [col1#x, col2#x]
```

### Does this PR introduce _any_ user-facing change?

It affects `EXPLAIN` output and the plan display in the Spark UI `SQL/DataFrame` tab.

### How was this patch tested?

See the examples above. Spark's SQL query tests do not cover the `hive` module; I identified this issue while porting internal test cases to 4.0. Since all existing SQL tests live in the `sql` module, adding Hive-related tests here is not feasible.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #49400 from pan3793/SPARK-50755.

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
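For illustration, here is a minimal, self-contained Scala sketch of the technique described above. The classes below are stand-ins, not the actual Spark implementation, and the `stagingPath` parameter is an assumption made for the example:

```scala
// Sketch only: override toString so plan arguments render as stable, readable
// strings instead of the default className@hashCode form. These stand-in
// classes are NOT the real Spark classes.
object PrettyToStringSketch {

  // Stand-in for org.apache.spark.sql.hive.execution.HiveFileFormat.
  // Rendering as plain "Hive" mirrors how the fixed plan text reads.
  class FakeHiveFileFormat {
    override def toString: String = "Hive"
  }

  // Stand-in for org.apache.spark.sql.hive.execution.HiveTempPath.
  // The `stagingPath` constructor parameter is an assumption for illustration.
  class FakeHiveTempPath(stagingPath: String) {
    override def toString: String = s"HiveTempPath($stagingPath)"
  }

  def main(args: Array[String]): Unit = {
    val format = new FakeHiveFileFormat
    val tempPath = new FakeHiveTempPath("file:/{warehouse_dir}/spark_test_json_2021_07_16_01")

    // Without the overrides these would print like "FakeHiveFileFormat@1b2c3d4e",
    // similar to the old InsertIntoHiveTable plan string.
    println(format)   // Hive
    println(tempPath) // HiveTempPath(file:/{warehouse_dir}/spark_test_json_2021_07_16_01)
  }
}
```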