Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose per-query structured evaluator logs #1186

Merged
merged 3 commits into from
Mar 29, 2022

Conversation

edoardopirovano
Copy link
Contributor

Do not merge until the corresponding internal PR to the query server has been merged and the estimated version in this PR has been updated to match when we actually ship that.

This PR adds two new options to the Query History view:

  1. Show Evaluator Log (Raw) displays the raw JSON from the structured logs of a given query run.
  2. Show Evaluator Log (Summary) creates a human-readable summary of the structured logs and displays that.

Checklist

  • CHANGELOG.md has been updated to incorporate all user visible changes made by this pull request.
  • Issues have been created for any UI or other user-facing changes made by this pull request.
  • [Maintainers only] If this pull request makes user-facing changes that require documentation changes, open a corresponding docs pull request in the github/codeql repo and add the ready-for-doc-review label there.

@edoardopirovano edoardopirovano requested a review from a team as a code owner March 4, 2022 16:08
@edoardopirovano edoardopirovano force-pushed the edoardo/per-query-struct-logs branch from 40e3944 to b31b26f Compare March 4, 2022 16:10
/**
* The dataset for which we want to start a new structured log
*/
db: Dataset;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, these logs are specific to a dataset, not a query?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a sense, yes. We rotate the logs when new queries are started, but if there are multiple concurrent queries running then events from the evaluator will end up in the latest query to be started.

multiSelect: QueryHistoryInfo[]
) {
// Local queries only
if (!this.assertSingleQuery(multiSelect) || singleItem?.t !== 'local') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Best if you call const { finalSingleItem, finalMultiSelect } = this.determineSelection(singleItem, multiSelect); before this so that you handle some edge cases where you right-click on an item that is not selected.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, done.

multiSelect: QueryHistoryInfo[]
) {
// Local queries only
if (!this.assertSingleQuery(multiSelect) || singleItem?.t !== 'local') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See previous comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also done.

}

if (singleItem.evalLogLocation) {
const summaryLocation = singleItem.evalLogLocation + '.summary';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to have these file names encapsulated in the LocalQueryInfo object. Can you create a getter called evalLogSummaryLocation? and return the proper path?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, done.

@@ -258,3 +258,7 @@ export class QueryServerClient extends DisposableObject {
export function findQueryLogFile(resultPath: string): string {
return path.join(resultPath, 'query.log');
}

export function findQueryStructLogFile(resultPath: string): string {
return path.join(resultPath, 'evaluator-log.jsonl');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just making sure this isn't a typo and it should be evaluator-log.json.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not a typo - .jsonl is the format for line-delimited JSON (https://jsonlines.org/) which is what this file is.

@@ -657,6 +675,7 @@ export async function compileAndRunQueryAgainstDatabase(
progress: ProgressCallback,
token: CancellationToken,
templates?: messages.TemplateDefinitions,
queryInfo?: LocalQueryInfo,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this value will only be available for queries directly initiated by the user? Contextual queries won't have the queryInfo? That's fine, but I just want to be sure I understand. Could you add a comment here to explain that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's exactly right. There's a few places where queries aren't initiated by the user (as you say, contextual queries, and also test code). In these cases we don't have a sensible place to put the structured logs, nor any reason to store them, so we don't create a log for these cases.

@edoardopirovano
Copy link
Contributor Author

Thanks for the review @aeisenberg! I've believe I have addressed all your comments either with responses above or in the new commit 🙂

Copy link
Contributor

@aeisenberg aeisenberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

Copy link
Contributor

@angelapwen angelapwen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️ just a few comments since we are waiting for the per-query structured logging PR to merge and be released anyway.

Comment on lines +765 to +774
if (!this.assertSingleQuery(finalMultiSelect) || !finalSingleItem || finalSingleItem.t !== 'local') {
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this situation also result in some kind of error message?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, good question. I followed what @aeisenberg did above for other similar drop downs. Any particular reason you went with just a return rather than a more explicit error?

Comment on lines +783 to +792
if (!this.assertSingleQuery(finalMultiSelect) || !finalSingleItem || finalSingleItem.t !== 'local') {
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above.

@edoardopirovano edoardopirovano force-pushed the edoardo/per-query-struct-logs branch from 76cd2fa to 9db40c2 Compare March 29, 2022 08:39
@edoardopirovano
Copy link
Contributor Author

Rebased onto the latest main to check everything is still working since this PR has been open for a while and we're now at a point where we can merge it.

@edoardopirovano edoardopirovano merged commit 3bee290 into main Mar 29, 2022
@edoardopirovano edoardopirovano deleted the edoardo/per-query-struct-logs branch March 29, 2022 09:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants