Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add error stack to ErrorCode #16343

Merged
merged 3 commits into from
Sep 2, 2024
Merged

Conversation

andylokandy
Copy link
Collaborator

@andylokandy andylokandy commented Aug 29, 2024

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

This change is Reviewable

@github-actions github-actions bot added the pr-feature this PR introduces a new feature to the codebase label Aug 29, 2024
@andylokandy andylokandy marked this pull request as ready for review August 29, 2024 05:48
@andylokandy andylokandy requested a review from sundy-li August 29, 2024 05:49
@BohuTANG
Copy link
Member

How do I decide whether to use std::result::Result or databend_common_exception::exception::Result?

This PR replaces more of databend_common_exception::exception::Result with std::result::Result, but it makes it harder to determine when to use each.

@BohuTANG BohuTANG requested a review from zhang2014 August 29, 2024 07:45
@andylokandy
Copy link
Collaborator Author

How do I decide whether to use std::result::Result or databend_common_exception::exception::Result?

It's easy. When you want Result<T, ErrorCode>, use databend_common_exception::exception::Result, otherwise, use std::result::Result.

@BohuTANG
Copy link
Member

How do I decide whether to use std::result::Result or databend_common_exception::exception::Result?

It's easy. When you want Result<T, ErrorCode>, use databend_common_exception::exception::Result, otherwise, use std::result::Result.

Use any of them is not affect the stack of ErrorCode?

@andylokandy
Copy link
Collaborator Author

andylokandy commented Aug 29, 2024

Only Result<T, ErrorCode> records the error stack.

In practice, std::result::Result is used where

  1. interoperating with other libraries.
  2. intended to be recovered.
  3. be the root cause of an Result<T, ErrorCode>

In either case, the error stack is not needed.

@sundy-li
Copy link
Member

Are there any new tests to show the new error stack format ?

Copy link
Member

@drmingdrmer drmingdrmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 59 files at r1, all commit messages.
Reviewable status: 5 of 59 files reviewed, 1 unresolved discussion (waiting on @andylokandy, @sundy-li, and @zhang2014)


src/common/exception/src/exception.rs line 104 at r1 (raw file):

    pub(crate) stacks: Vec<ErrorFrame>,
    pub(crate) _phantom: PhantomData<C>,
}

Why does ErrorCode require a type parameter C? Where is C actually used?

If this type parameter serves any implicit purpose, please add a doc comment explaining it.

Code quote:

pub struct ErrorCode<C = ()> {
    pub(crate) code: u16,
    pub(crate) name: String,
    pub(crate) display_text: String,
    pub(crate) detail: String,
    pub(crate) span: Span,
    // cause is only used to contain an `anyhow::Error`.
    // TODO: remove `cause` when we completely get rid of `anyhow::Error`.
    pub(crate) cause: Option<Box<dyn std::error::Error + Sync + Send>>,
    pub(crate) backtrace: Option<ErrorCodeBacktrace>,
    pub(crate) stacks: Vec<ErrorFrame>,
    pub(crate) _phantom: PhantomData<C>,
}

@zhang2014
Copy link
Member

Why need to change databend_common_exception::exception::Result to std::result::Result.

Perhaps it is compatible?

pub type databend_common_exception::exception::Result<T, E = ErrorCode> = std::result::Result<T, E>;

Copy link
Member

@drmingdrmer drmingdrmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 10 of 59 files at r1.
Reviewable status: 15 of 59 files reviewed, 2 unresolved discussions (waiting on @andylokandy, @sundy-li, and @zhang2014)

a discussion (no related file):
I recommend using Result for std::result::Result throughout our codebase. For the specialized Result<T, ErrorCode>, consider using a different term, such as my_lib::Result.


@andylokandy
Copy link
Collaborator Author

andylokandy commented Aug 31, 2024

This is an code style change described in #15741.

Since when the issue is written, I've tried multiple ways to bring error-stack into databend, and this PR is the most viable one. Instead of replacing all Result<T, ErrorCode> with Result<T, error_stack::Report<C>>, this PR brings the core functionality of error-stack into ErrorCode, which is adding stacks (aka. context) info to the error when popping up the result thro the call stack.

Example:

async fn send(addr: &str, payload: &[u8]) -> Result<(), ConnError> {
    let make_error = || format!("failed to connect to {addr}");
    let endpoint = conn.connect_to(addr).await.with_context(make_error)?;
    endpoint.send(payload).await.with_context(make_error)?;
    Ok(())
}

when error occurs:

Error:
1. failed to execute query: select * from numbers(100), at /home/databend/src/query/connection.rs:123:34
2. failed to connect to 127.0.0.1:1234, at /home/databend/src/query/connection.rs:145:1
3. Socket or port is occupied (os error 2), at /home/databend/src/query/connection.rs:44:12

There is a problem in this code style: the developer usually just not add context where apropriate, and then just ? to throw away the error. error-stack resolves this problem by giving the Report<C> a phantom type C. It enforces the dev to call .change_context(context) to convert a Report<C> to Report<C2>. In this way, we can give every module a C, and so forth enforces a .change_context(context) at the boundary.

This PR has the exact same design with a slightly different name. Report<C> -> ErrorCode<C> and .change_context(context) to .with_context(context).

Another question is Why need to change databend_common_exception::exception::Result to std::result::Result. It's related to the phantom type C. Because ErrorCode<C> has an extra type C, Result<T, E> should became Result<T, ErrorCode<C>> and Result<T> should be an default context Result<T, ErrorCode<()>>. This brings a sematic change to Result<T, E> which had a meaning of std::result::Result<T, E>. So I explicitly qualified these usage of Result<T, E> in this PR.

Copy link
Member

@sundy-li sundy-li left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I approve this approach as it is an appropriate method to introduce error-stack when necessary.

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Sep 2, 2024
@BohuTANG
Copy link
Member

BohuTANG commented Sep 2, 2024

@andylokandy

Another question:
Is it possible to use with_context at the top level (similar to how it's used in an HTTP API handler) in the following code:
https://github.com/datafuselabs/databend/blob/main/src/query/service/src/servers/http/v1/query/execute_state.rs#L326-L403

to track all internal error stacks?

If so, how would the code look?

@andylokandy
Copy link
Collaborator Author

Is it possible to use with_context at the top level (similar to how it's used in an HTTP API handler) in the following code to track all internal error stacks?

If you mean rendering the stacks produced by with_context at the top level, yes, it can.

@andylokandy andylokandy added this pull request to the merge queue Sep 2, 2024
@BohuTANG BohuTANG removed this pull request from the merge queue due to a manual request Sep 2, 2024
@BohuTANG BohuTANG merged commit a58d8b4 into databendlabs:main Sep 2, 2024
78 of 79 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm This PR has been approved by a maintainer pr-feature this PR introduces a new feature to the codebase
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants