Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

esbuild plugin's debug ID generation is not deterministic #500

Open
Cldfire opened this issue Mar 8, 2024 · 6 comments
Open

esbuild plugin's debug ID generation is not deterministic #500

Cldfire opened this issue Mar 8, 2024 · 6 comments

Comments

@Cldfire
Copy link

Cldfire commented Mar 8, 2024

I am porting an internal web extension build system from Webpack to esbuild. Since we publish our web extension on addons.mozilla.org, we are required to provide the full source code for our extension in a manner that a Mozilla contractor can use to perform a from-scratch build that will byte-for-byte match the release artifact built from our CI environment.

The Sentry Webpack plugin uses the chunk hash as the input for the debug ID snippet generation function, resulting in the same debug ID for the same build inputs:

const debugId = arg?.chunk?.hash ? stringToUUID(arg.chunk.hash) : uuidv4();
return getDebugIdSnippet(debugId);

The Sentry esbuild plugin, however, uses a unique UUID as the input, resulting in a different debug ID for the same build inputs:

contents: getDebugIdSnippet(uuidv4()),

This breaks the determinism of our web extension build as the resulting build outputs have different debug IDs embedded within them.

I'd like to request that the esbuild plugin produce deterministic debug IDs based on the build inputs.

@getsantry getsantry bot moved this to Waiting for: Product Owner in GitHub Issues with 👀 2 Mar 8, 2024
@lforst
Copy link
Member

lforst commented Mar 8, 2024

Unfortunately, due to limitations of the esbuilds plugin system we currently cannot make it deterministic - unless we modify files after the fact. This is possible but a major PITA because it also would require us to rewrite any related sourcemaps.

(I just thought of a hack where we maybe only insert a placeholder and then after the build is done replace the placeholder with an actual id, based on the files content. That should also keep the source maps intact. 🤔)

We won't tackle this right now and probably also not in the medium term future. If you got other ideas, let us know! Also, feel free to contribute to the package yourself! We don't see too much usage of the esbuild plugin, especially relative to vite and webpack, so I cannot justify working on this right now (even though I would like to see it succeed!).

@lforst
Copy link
Member

lforst commented Mar 8, 2024

As a workaround you can also use the legacy process (which doesn't inject anything) or Sentry CLI, which should be deterministic when injecting.

@Cldfire
Copy link
Author

Cldfire commented Mar 8, 2024

Unfortunately, due to limitations of the esbuilds plugin system we currently cannot make it deterministic - unless we modify files after the fact. This is possible but a major PITA because it also would require us to rewrite any related sourcemaps.

Oof, gotcha.

Also, feel free to contribute to the package yourself! We don't see too much usage of the esbuild plugin, especially relative to vite and webpack, so I cannot justify working on this right now (even though I would like to see it succeed!).

Understandable 🙂 if I find the time I'll see if I can contribute something 👍.

As a workaround you can also use the legacy process (which doesn't inject anything) or Sentry CLI, which should be deterministic when injecting.

The workaround I've implemented right now is pulling the sentry debug IDs out of the build artifacts from our CI system and writing them to a file:

# Get the Sentry debug IDs from our CI build output and store them for later
# use
rg "sentry-dbid-[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}" \
	$(dist) -o -N --no-heading > sentry_debug_ids.txt

and then in our build system when being built for AMO review, I take those debug IDs and inject them back into the build output after bundling is finished:

import { $ } from "execa";
import { readFile } from "fs/promises";
import { fileExists } from "./build-support";

// TODO: remove this when https://github.com/getsentry/sentry-javascript-bundler-plugins/issues/500
// is fixed
export async function replaceSentryDebugIds() {
	// This file contains multiple lines. Each line is of the format:
	// dist/firefox/...:sentry-dbid-{uuid}
	const sentryDebugIdTxt = await readFile("sentry_debug_ids.txt", "utf-8");
	console.log("sentry_debug_ids.txt contents: \n\n" + sentryDebugIdTxt);

	return Promise.all(
		sentryDebugIdTxt
			.trim()
			.split("\n")
			.map((filepathAndDebugId) => filepathAndDebugId.split(":", 2))
			.map(([filepath, debugId]) => [
				filepath,
				debugId.replace("sentry-dbid-", ""),
			])
			.map(async ([filepath, debugId]) => {
				const exists = await fileExists(filepath);
				if (exists) {
					// The safest way to find the Sentry debug ID is where it's
					// prefixed by "sentry-dbid-", but it's also present in the
					// file without that prefix.
					//
					// We find its exact value and then replace that.
					const newIdInFile =
						await $`rg sentry-dbid-[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12} ${filepath} -o -N --no-heading --no-filename`;
					const newIdPrefixStripped = newIdInFile.stdout.replace(
						"sentry-dbid-",
						"",
					);

					console.log(
						`Replacing ${newIdPrefixStripped} in ${filepath} with ${debugId}`,
					);

					// Replace Sentry debug IDs in the build output with the
					// value from our build in CI.
					return $`sed -i s/${newIdPrefixStripped}/${debugId}/g ${filepath}`;
				} else {
					console.warn(
						`File at ${filepath} did not exist, unable to replace Sentry debug ID`,
					);
					return;
				}
			}),
	);
}

This has been working well so far for our needs.

@getsantry getsantry bot moved this to Waiting for: Product Owner in GitHub Issues with 👀 2 Mar 8, 2024
@Cldfire
Copy link
Author

Cldfire commented Mar 8, 2024

Actually, the above workaround isn't quite enough, because the randomly generated Sentry debug IDs end up in the source code before esbuild does minification, affecting esbuild's character frequency analysis: evanw/esbuild#1846 (comment)

For larger bundled outputs this isn't an issue and doesn't affect character frequency enough to cause esbuild's name generation to change (since there are already so many characters in the file). For smaller bundled outputs, however (we have an entrypoint that's < 20 LoC), the random UUID Sentry adds to the source code is very significant and has a large impact on character frequency. This causes esbuild to generate different names in the minified output, and even though we fix the debug ID in the output after it's generated, we can't fix the generated names 😞.

I'll have to give the CLI a shot.

@Cldfire
Copy link
Author

Cldfire commented Mar 9, 2024

In case it helps anyone else facing this issue, I was able to successfully turn this Sentry esbuild plugin config:

sentryEsbuildPlugin({
	disable: !sentryEnabled,
	org: "org",
	project: "project",
	release: {
		name: `project@${await getVersion(commitTag)}-${channel}`,
		inject: false,
		setCommits: undefined,
	},
	telemetry: false,
})

Into this code running after esbuild finishes using the Sentry CLI:

import { $ } from "execa";
import { removeSourcemaps } from "./build-support";

// Handle creating Sentry releases and uploading sourcemaps.
//
// Documentation referenced while putting this together:
//
// * https://docs.sentry.io/product/cli/releases/
// * https://docs.sentry.io/platforms/javascript/sourcemaps/uploading/cli/
if (sentryEnabled && metafile) {
	// Remove CSS sourcemaps, we don't want Sentry doing anything with them
	await removeSourcemaps(metafile, ".css.map");

	const sentryEnv = { SENTRY_ORG: "org", SENTRY_PROJECT: "project" };
	const shellConfig: CommonOptions = {
		env: sentryEnv,
		stdio: "inherit",
		verbose: true,
	};
	const releaseName = `project@${await getVersion(commitTag)}-${channel}`;

	if (hasSentryAuthToken) {
		const pipelineUrl = process.env.CI_PIPELINE_URL;
		await $(shellConfig)`sentry-cli releases new ${releaseName} ${
			pipelineUrl ? ["--url", pipelineUrl] : []
		}`;
	}

	await $(shellConfig)`sentry-cli sourcemaps inject ${dist}`;

	if (hasSentryAuthToken) {
		const jobUrl = process.env.CI_JOB_URL ?? "";
		await $(
			shellConfig,
		)`sentry-cli sourcemaps upload --strict --note ${jobUrl} --release ${releaseName} ${dist}`;

		await $(shellConfig)`sentry-cli releases finalize ${releaseName}`;
	}
}

Build determinism is working well with this setup.

@lforst
Copy link
Member

lforst commented Mar 11, 2024

The workaround I've implemented right now is pulling the sentry debug IDs out of the build artifacts from our CI system and writing them to a file

That's pretty novel. Cool!

Thanks for providing the workaround. Hopefully, people will find it useful. I'll keep this issue in the backlog.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: No status
Development

No branches or pull requests

2 participants