Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Optimize When Querying the tables_history Table with the Condition table_id = n #17166

Merged
merged 3 commits into from
Jan 5, 2025

Conversation

TCeason
Copy link
Collaborator

@TCeason TCeason commented Jan 3, 2025

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

This change is Reviewable

@TCeason TCeason marked this pull request as draft January 3, 2025 12:22
Copy link
Contributor

github-actions bot commented Jan 3, 2025

This pull request's title is not fulfill the requirements. @TCeason please update it 🙏.

Valid format:

fix(query): fix group by string bug
  ^         ^---------------------^
  |         |
  |         +-> Summary in present tense.
  |
  +-------> Type: rfc, feat, fix, refactor, ci, docs, chore

Valid types:

  • rfc: this PR proposes a new RFC
  • feat: this PR introduces a new feature to the codebase
  • fix: this PR patches a bug in codebase
  • refactor: this PR changes the code base without new features or bugfix
  • ci: this PR changes build/testing/ci steps
  • docs: this PR changes the documents or websites
  • chore: this PR only has small changes that no need to record

1 similar comment
Copy link
Contributor

github-actions bot commented Jan 3, 2025

This pull request's title is not fulfill the requirements. @TCeason please update it 🙏.

Valid format:

fix(query): fix group by string bug
  ^         ^---------------------^
  |         |
  |         +-> Summary in present tense.
  |
  +-------> Type: rfc, feat, fix, refactor, ci, docs, chore

Valid types:

  • rfc: this PR proposes a new RFC
  • feat: this PR introduces a new feature to the codebase
  • fix: this PR patches a bug in codebase
  • refactor: this PR changes the code base without new features or bugfix
  • ci: this PR changes build/testing/ci steps
  • docs: this PR changes the documents or websites
  • chore: this PR only has small changes that no need to record

@TCeason TCeason changed the title optimize: Optimize When Querying the tables_history Table with the Condition table_id = n refactor: Optimize When Querying the tables_history Table with the Condition table_id = n Jan 3, 2025
@github-actions github-actions bot added the pr-refactor this PR changes the code base without new features or bugfix label Jan 3, 2025
@TCeason TCeason requested review from drmingdrmer and b41sh January 3, 2025 14:12
@TCeason TCeason marked this pull request as ready for review January 3, 2025 14:12
@TCeason TCeason added the ci-benchmark-local Benchmark: run only local test label Jan 3, 2025
Copy link
Contributor

github-actions bot commented Jan 3, 2025

Docker Image for PR

  • tag: pr-17166-385ba4e-1735918577

note: this image tag is only available for internal use,
please check the internal doc for more details.

Copy link
Contributor

github-actions bot commented Jan 3, 2025

Copy link
Member

@drmingdrmer drmingdrmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 13 of 14 files at r1, all commit messages.
Reviewable status: 13 of 14 files reviewed, 3 unresolved discussions (waiting on @b41sh and @TCeason)


src/query/storages/system/src/tables_table.rs line 359 at r1 (raw file):

                            .into_iter()
                            .flatten()
                            .filter(|table| !tables_names.contains(table))

Make tables_names a BTreeSet of HashSet. Otherwise for large dataset the O(n^2) time complexity for linear lookup might hang the query.

And it does not need filter() anymore. just call extend() to add new table names.


src/query/storages/system/src/tables_table.rs line 366 at r1 (raw file):

                    Err(err) => {
                        warn!("Failed to get tables: {}, {}", ctl.name(), err);
                    }

Why just warning it but not returning an error? It could be a severe error.

Code quote:

                    Err(err) => {
                        warn!("Failed to get tables: {}, {}", ctl.name(), err);
                    }

src/query/storages/system/src/tables_table.rs line 451 at r1 (raw file):

                            Err(err) => {
                                let msg = format!("Failed to get tables: {}, {}", ctl.name(), err);
                                warn!("{}", msg);

Return the error instead of just warning about it.

Code quote:

                                let msg = format!("Failed to get tables: {}, {}", ctl.name(), err);
                                warn!("{}", msg);

Copy link
Collaborator Author

@TCeason TCeason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 13 of 14 files reviewed, 3 unresolved discussions (waiting on @b41sh and @drmingdrmer)


src/query/storages/system/src/tables_table.rs line 359 at r1 (raw file):

Previously, drmingdrmer (张炎泼) wrote…

Make tables_names a BTreeSet of HashSet. Otherwise for large dataset the O(n^2) time complexity for linear lookup might hang the query.

And it does not need filter() anymore. just call extend() to add new table names.

Done.


src/query/storages/system/src/tables_table.rs line 366 at r1 (raw file):

Previously, drmingdrmer (张炎泼) wrote…

Why just warning it but not returning an error? It could be a severe error.

system tables needs as many output results as possible, and even though the result set may be empty, the business does not want to report errors. If an unexpected result set appears, it can be screened through the log audit function

Copy link
Member

@drmingdrmer drmingdrmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @b41sh)


src/query/storages/system/src/tables_table.rs line 366 at r1 (raw file):

Previously, TCeason wrote…

system tables needs as many output results as possible, and even though the result set may be empty, the business does not want to report errors. If an unexpected result set appears, it can be screened through the log audit function

Get it


src/query/storages/system/src/tables_table.rs line 451 at r1 (raw file):

Previously, drmingdrmer (张炎泼) wrote…

Return the error instead of just warning about it.

Same as the above conclusion.

Copy link
Member

@drmingdrmer drmingdrmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r5, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @b41sh)

@TCeason TCeason added ci-benchmark-local Benchmark: run only local test and removed ci-benchmark-local Benchmark: run only local test labels Jan 4, 2025
Copy link
Contributor

github-actions bot commented Jan 4, 2025

Docker Image for PR

  • tag: pr-17166-7b08800-1736005066

note: this image tag is only available for internal use,
please check the internal doc for more details.

Copy link
Contributor

github-actions bot commented Jan 4, 2025

@TCeason TCeason added ci-benchmark-local Benchmark: run only local test and removed ci-benchmark-local Benchmark: run only local test labels Jan 4, 2025
Copy link
Contributor

github-actions bot commented Jan 4, 2025

Docker Image for PR

  • tag: pr-17166-7b08800-1736034033

note: this image tag is only available for internal use,
please check the internal doc for more details.

Copy link
Contributor

github-actions bot commented Jan 4, 2025

@TCeason
Copy link
Collaborator Author

TCeason commented Jan 5, 2025

cc @BohuTANG Please merge this pr. All ci pass and was not impact on performance.

Copy link
Member

@drmingdrmer drmingdrmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @TCeason)

@drmingdrmer drmingdrmer added this pull request to the merge queue Jan 5, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 5, 2025
@drmingdrmer drmingdrmer added this pull request to the merge queue Jan 5, 2025
@BohuTANG BohuTANG removed this pull request from the merge queue due to a manual request Jan 5, 2025
@BohuTANG BohuTANG merged commit 8f6713d into databendlabs:main Jan 5, 2025
73 checks passed
@atifiu
Copy link

atifiu commented Jan 8, 2025

I can confirm that issue is fixed in version 1.2.685. Total time to fetch is approx. 0.06 secs as compared to version 1.2.674 where it took 160+ secs

image

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci-benchmark-local Benchmark: run only local test pr-refactor this PR changes the code base without new features or bugfix
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants