Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(apps): runtime orchestration fixes #34205

Merged
merged 9 commits into from
Dec 19, 2024
22 changes: 22 additions & 0 deletions .changeset/blue-items-raise.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
---
'@rocket.chat/fuselage-ui-kit': patch
'@rocket.chat/instance-status': patch
'@rocket.chat/ui-theming': patch
'@rocket.chat/model-typings': patch
'@rocket.chat/ui-video-conf': patch
'@rocket.chat/uikit-playground': patch
'@rocket.chat/core-typings': patch
'@rocket.chat/rest-typings': patch
'@rocket.chat/apps-engine': patch
'@rocket.chat/ui-composer': patch
'@rocket.chat/ui-contexts': patch
'@rocket.chat/gazzodown': patch
'@rocket.chat/ui-avatar': patch
'@rocket.chat/ui-client': patch
'@rocket.chat/livechat': patch
'@rocket.chat/ui-voip': patch
'@rocket.chat/i18n': patch
'@rocket.chat/meteor': patch
---

Fixes an error where the engine would not retry a subprocess restart if the last attempt failed
22 changes: 22 additions & 0 deletions .changeset/gold-comics-taste.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
---
'@rocket.chat/fuselage-ui-kit': patch
'@rocket.chat/instance-status': patch
'@rocket.chat/ui-theming': patch
'@rocket.chat/model-typings': patch
'@rocket.chat/ui-video-conf': patch
'@rocket.chat/uikit-playground': patch
'@rocket.chat/core-typings': patch
'@rocket.chat/rest-typings': patch
'@rocket.chat/apps-engine': patch
'@rocket.chat/ui-composer': patch
'@rocket.chat/ui-contexts': patch
'@rocket.chat/gazzodown': patch
'@rocket.chat/ui-avatar': patch
'@rocket.chat/ui-client': patch
'@rocket.chat/livechat': patch
'@rocket.chat/ui-voip': patch
'@rocket.chat/i18n': patch
'@rocket.chat/meteor': patch
---

Fixes error propagation when trying to get the status of apps in some cases
22 changes: 22 additions & 0 deletions .changeset/proud-planets-applaud.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
---
'@rocket.chat/fuselage-ui-kit': patch
'@rocket.chat/instance-status': patch
'@rocket.chat/ui-theming': patch
'@rocket.chat/model-typings': patch
'@rocket.chat/ui-video-conf': patch
'@rocket.chat/uikit-playground': patch
'@rocket.chat/core-typings': patch
'@rocket.chat/rest-typings': patch
'@rocket.chat/apps-engine': patch
'@rocket.chat/ui-composer': patch
'@rocket.chat/ui-contexts': patch
'@rocket.chat/gazzodown': patch
'@rocket.chat/ui-avatar': patch
'@rocket.chat/ui-client': patch
'@rocket.chat/livechat': patch
'@rocket.chat/ui-voip': patch
'@rocket.chat/i18n': patch
'@rocket.chat/meteor': patch
---

Fixes wrong data being reported to total failed apps metrics and statistics
8 changes: 3 additions & 5 deletions apps/meteor/app/statistics/server/lib/getAppsStatistics.ts
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,7 @@ async function _getAppsStatistics(): Promise<AppsStatistics> {
totalInstalled++;

const status = await app.getStatus();
const storageItem = await app.getStorageItem();
const storageItem = app.getStorageItem();

if (storageItem.installationSource === AppInstallationSource.PRIVATE) {
totalPrivateApps++;
Expand All @@ -51,12 +51,10 @@ async function _getAppsStatistics(): Promise<AppsStatistics> {
}
}

if (status === AppStatus.MANUALLY_DISABLED) {
totalFailed++;
}

if (AppStatusUtils.isEnabled(status)) {
totalActive++;
} else if (status !== AppStatus.MANUALLY_DISABLED) {
totalFailed++;
}
}),
);
Expand Down
15 changes: 2 additions & 13 deletions packages/apps-engine/deno-runtime/deno.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

15 changes: 13 additions & 2 deletions packages/apps-engine/deno-runtime/lib/metricsCollector.ts
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ import { Queue } from "./messenger.ts";

export function collectMetrics() {
return {
pid: Deno.pid,
queueSize: Queue.getCurrentSize(),
}
};
Expand All @@ -15,6 +16,16 @@ export async function sendMetrics() {
await writeAll(Deno.stderr, encoder.encode(JSON.stringify(metrics)));
}

export function startMetricsReport() {
setInterval(sendMetrics, 5000);
let intervalId: number;

export function startMetricsReport(frequencyInMs = 5000) {
if (intervalId) {
throw new Error('There is already an active metrics report');
}

intervalId = setInterval(sendMetrics, frequencyInMs);
}

export function abortMetricsReport() {
clearInterval(intervalId);
}
11 changes: 11 additions & 0 deletions packages/apps-engine/deno-runtime/lib/parseArgs.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
import { parseArgs as $parseArgs } from "https://jsr.io/@std/cli/1.0.9/parse_args.ts";

export type ParsedArgs = {
subprocess: string;
spawnId: number;
metricsReportFrequencyInMs?: number;
}

export function parseArgs(args: string[]): ParsedArgs {
return $parseArgs(args);
}
5 changes: 4 additions & 1 deletion packages/apps-engine/deno-runtime/main.ts
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ import handleApp from './handlers/app/handler.ts';
import handleScheduler from './handlers/scheduler-handler.ts';
import registerErrorListeners from './error-handlers.ts';
import { startMetricsReport } from "./lib/metricsCollector.ts";
import { parseArgs } from "./lib/parseArgs.ts";

type Handlers = {
app: typeof handleApp;
Expand Down Expand Up @@ -128,8 +129,10 @@ async function main() {
}
}

const mainArgs = parseArgs(Deno.args);

registerErrorListeners();

main();

startMetricsReport();
startMetricsReport(mainArgs.metricsReportFrequencyInMs);
2 changes: 1 addition & 1 deletion packages/apps-engine/src/server/AppManager.ts
Original file line number Diff line number Diff line change
Expand Up @@ -271,7 +271,7 @@ export class AppManager {
const prl = new ProxiedApp(this, item, {
// Maybe we should have an "EmptyRuntime" class for this?
getStatus() {
return AppStatus.COMPILER_ERROR_DISABLED;
return Promise.resolve(AppStatus.COMPILER_ERROR_DISABLED);
},
} as unknown as DenoRuntimeSubprocessController);

Expand Down
4 changes: 2 additions & 2 deletions packages/apps-engine/src/server/ProxiedApp.ts
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
import type { AppManager } from './AppManager';
import type { AppStatus } from '../definition/AppStatus';
import { AppStatus } from '../definition/AppStatus';
import { AppsEngineException } from '../definition/exceptions';
import type { IAppAuthorInfo, IAppInfo } from '../definition/metadata';
import { AppMethod } from '../definition/metadata';
Expand Down Expand Up @@ -79,7 +79,7 @@ export class ProxiedApp {
}

public async getStatus(): Promise<AppStatus> {
return this.appRuntime.getStatus();
return this.appRuntime.getStatus().catch(() => AppStatus.UNKNOWN);
}

public async setStatus(status: AppStatus, silent?: boolean): Promise<void> {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -88,6 +88,11 @@ export class DenoRuntimeSubprocessController extends EventEmitter {

private state: 'uninitialized' | 'ready' | 'invalid' | 'restarting' | 'unknown' | 'stopped';

/**
* Incremental id that keeps track of how many times we've spawned a process for this app
*/
private spawnId = 0;

private readonly debug: debug.Debugger;

private readonly options = {
Expand Down Expand Up @@ -149,6 +154,8 @@ export class DenoRuntimeSubprocessController extends EventEmitter {
denoWrapperPath,
'--subprocess',
this.appPackage.info.id,
'--spawnId',
String(this.spawnId++),
];

// If the app doesn't request any permissions, it gets the default set of permissions, which includes "networking"
Expand Down Expand Up @@ -296,6 +303,7 @@ export class DenoRuntimeSubprocessController extends EventEmitter {
logger.info('Successfully restarted app subprocess');
} catch (e) {
logger.error("Failed to restart app's subprocess", { error: e.message || e });
throw e;
} finally {
await this.logStorage.storeEntries(AppConsole.toStorageEntry(this.getAppId(), logger));
}
Expand Down
13 changes: 11 additions & 2 deletions packages/apps-engine/src/server/runtime/deno/LivenessManager.ts
Original file line number Diff line number Diff line change
Expand Up @@ -7,10 +7,11 @@ import type { ProcessMessenger } from './ProcessMessenger';
const COMMAND_PING = '_zPING';

const defaultOptions: LivenessManager['options'] = {
pingRequestTimeout: 10000,
pingRequestTimeout: 1000,
pingFrequencyInMS: 10000,
consecutiveTimeoutLimit: 4,
maxRestarts: Infinity,
restartAttemptDelayInMS: 1000,
};

/**
Expand All @@ -36,6 +37,9 @@ export class LivenessManager {

// Limit of times we can try to restart a process
maxRestarts: number;

// Time to delay the next restart attempt after a failed one
restartAttemptDelayInMS: number;
};

private subprocess: ChildProcess;
Expand Down Expand Up @@ -198,7 +202,12 @@ export class LivenessManager {
pid: this.subprocess.pid,
});

await this.controller.restartApp();
try {
await this.controller.restartApp();
} catch (e) {
this.debug('Restart attempt failed. Retrying in %dms', this.options.restartAttemptDelayInMS);
setTimeout(() => this.restartProcess('Failed restart attempt'), this.options.restartAttemptDelayInMS);
}

this.pingTimeoutConsecutiveCount = 0;
this.restartCount++;
Expand Down
Loading