RDS AWS Secrets Manager async db config fails randomly on AWS ECS with error: password authentication failed for user "postgres"

System Information
  • Strapi Version: v4.10.5
  • Operating System: node:18.17-alpine docker image
  • Database: postgres
  • Node Version: 18.17.1
  • NPM Version:
  • Yarn Version:

Hi everyone!

I’m currently facing a weird issue when deploying a Strapi instance to AWS ECS and configuring the database connection asynchronously.

This is how I’m configuring it in index.ts:

Async configuration
"use strict";
// Fetch the database credentials from AWS Secrets Manager
const {
  SecretsManagerClient,
  GetSecretValueCommand,
} = require("@aws-sdk/client-secrets-manager");
module.exports = {
  /**
   * An asynchronous register function that runs before
   * your application is initialized.
   *
   * This gives you an opportunity to extend code.
   */
  async register({ strapi }) {
    const secret_name = process.env["AWS_SECRET_NAME"];
    const client = new SecretsManagerClient({
      region: process.env["AWS_REGION"],
    });

    const response = await client.send(
      new GetSecretValueCommand({
        SecretId: secret_name,
      })
    );

    const { username, password } = JSON.parse(response.SecretString);

    strapi.server.app.keys = process.env["APP_KEYS"].split(",");
    strapi.config.set("admin.apiToken.salt", process.env["API_TOKEN_SALT"]);
    strapi.config.set("admin.auth.secret", process.env["ADMIN_JWT_SECRET"]);
    strapi.config.set(
      "admin.transfer.token.salt",
      process.env["TRANSFER_TOKEN_SALT"]
    );

    strapi.config.set("database.connection.client", "postgres");
    strapi.config.set(
      "database.connection.connection.host",
      process.env["DB_HOST"]
    );

    // Debug output: confirm the secret was read without printing the password.
    console.log("USERNAME", username);
    console.log("PASSWORD set:", Boolean(password));
    strapi.config.set("database.connection.connection.port", 5432);
    strapi.config.set("database.connection.connection.database", "strapi");
    strapi.config.set("database.connection.connection.user", username);
    strapi.config.set("database.connection.connection.password", password);
    strapi.config.set("database.connection.connection.ssl", false);

    return strapi;
  },

  /**
   * An asynchronous bootstrap function that runs before
   * your application gets started.
   *
   * This gives you an opportunity to set up your data model,
   * run jobs, or perform some special logic.
   */
  bootstrap(/*{ strapi }*/) {},
};

Everything works fine at first, but then it suddenly throws:

password authentication failed for user "postgres"

Error logs
July 08, 2024 at 13:09 (UTC-7:00) [2024-07-08 20:09:41.482] error: password authentication failed for user "postgres" - -
July 08, 2024 at 13:09 (UTC-7:00) error: password authentication failed for user "postgres" - -
July 08, 2024 at 13:09 (UTC-7:00) at Parser.parseErrorMessage (/opt/node_modules/pg-protocol/dist/parser.js:287:98) - -
July 08, 2024 at 13:09 (UTC-7:00) at Parser.handlePacket (/opt/node_modules/pg-protocol/dist/parser.js:126:29) - -
July 08, 2024 at 13:09 (UTC-7:00) at Parser.parse (/opt/node_modules/pg-protocol/dist/parser.js:39:38) - -
July 08, 2024 at 13:09 (UTC-7:00) at TLSSocket.<anonymous> (/opt/node_modules/pg-protocol/dist/index.js:11:42) - -
July 08, 2024 at 13:09 (UTC-7:00) at TLSSocket.emit (node:events:514:28) - -
July 08, 2024 at 13:09 (UTC-7:00) at TLSSocket.emit (node:domain:489:12) - -
July 08, 2024 at 13:09 (UTC-7:00) at addChunk (node:internal/streams/readable:324:12) - -
July 08, 2024 at 13:09 (UTC-7:00) at readableAddChunk (node:internal/streams/readable:297:9) - -
July 08, 2024 at 13:09 (UTC-7:00) at Readable.push (node:internal/streams/readable:234:10) - -
July 08, 2024 at 13:09 (UTC-7:00) at TLSWrap.onStreamRead (node:internal/stream_base_commons:190:23) - -
July 08, 2024 at 13:09 (UTC-7:00) [2024-07-08 20:09:41.483] error: password authentication failed for user "postgres" - -
July 08, 2024 at 13:09 (UTC-7:00) error: password authentication failed for user "postgres" - -
July 08, 2024 at 13:09 (UTC-7:00) at Parser.parseErrorMessage (/opt/node_modules/pg-protocol/dist/parser.js:287:98) - -
July 08, 2024 at 13:09 (UTC-7:00) at Parser.handlePacket (/opt/node_modules/pg-protocol/dist/parser.js:126:29) - -
July 08, 2024 at 13:09 (UTC-7:00) at Parser.parse (/opt/node_modules/pg-protocol/dist/parser.js:39:38) - -
July 08, 2024 at 13:09 (UTC-7:00) at TLSSocket.<anonymous> (/opt/node_modules/pg-protocol/dist/index.js:11:42) - -
July 08, 2024 at 13:09 (UTC-7:00) at TLSSocket.emit (node:events:514:28) - -
July 08, 2024 at 13:09 (UTC-7:00) at TLSSocket.emit (node:domain:489:12) - -
July 08, 2024 at 13:09 (UTC-7:00) at addChunk (node:internal/streams/readable:324:12) - -
July 08, 2024 at 13:09 (UTC-7:00) at readableAddChunk (node:internal/streams/readable:297:9) - -
July 08, 2024 at 13:09 (UTC-7:00) at Readable.push (node:internal/streams/readable:234:10) - -
July 08, 2024 at 13:09 (UTC-7:00) at TLSWrap.onStreamRead (node:internal/stream_base_commons:190:23) - -
July 08, 2024 at 13:09 (UTC-7:00) [2024-07-08 20:09:17.453] http: POST /graphql (92 ms) 200

I don’t have a config/database.js config file.

This is what the logs look like on ECS:

Logs
2024-07-08T20:20:52.556Z	yarn run v1.22.19
	2024-07-08T20:20:52.643Z	$ strapi start
	2024-07-08T20:21:06.774Z	Project information
	2024-07-08T20:21:06.851Z	┌────────────────────┬──────────────────────────────────────────────────┐
	2024-07-08T20:21:06.851Z	│ Time │ Mon Jul 08 2024 20:21:06 GMT+0000 (Coordinated … │
	2024-07-08T20:21:06.851Z	│ Launched in │ 9690 ms │
	2024-07-08T20:21:06.851Z	│ Environment │ production │
	2024-07-08T20:21:06.851Z	│ Process PID │ 28 │
	2024-07-08T20:21:06.851Z	│ Version │ 4.10.5 (node v18.17.1) │
	2024-07-08T20:21:06.851Z	│ Edition │ Community │
	2024-07-08T20:21:06.851Z	│ Database │ postgres │
	2024-07-08T20:21:06.851Z	└────────────────────┴──────────────────────────────────────────────────┘
	2024-07-08T20:21:06.854Z	Actions available
	2024-07-08T20:21:06.855Z	Welcome back!
	2024-07-08T20:21:06.855Z	To manage your project 🚀, go to the administration panel at:
	2024-07-08T20:21:06.859Z	http://0.0.0.0:8080/admin
	2024-07-08T20:21:06.861Z	To access the server ⚡️, go to:
	2024-07-08T20:21:06.862Z	http://0.0.0.0:8080
	2024-07-08T20:22:58.465Z	[2024-07-08 20:22:58.462] http: GET /api/healthcheck (28 ms) 200
	2024-07-08T20:22:58.510Z	[2024-07-08 20:22:58.509] http: GET /api/healthcheck (10 ms) 200
	2024-07-08T20:23:01.155Z	[2024-07-08 20:23:01.154] http: GET /admin/telemetry-properties (225 ms) 200
	2024-07-08T20:23:02.566Z	[2024-07-08 20:23:02.565] http: POST /graphql (1324 ms) 200
	2024-07-08T20:23:02.665Z	[2024-07-08 20:23:02.663] http: GET /admin/users/me/permissions (1263 ms) 200
	2024-07-08T20:23:02.777Z	[2024-07-08 20:23:02.776] http: GET /i18n/locales (996 ms) 200
	2024-07-08T20:23:03.083Z	[2024-07-08 20:23:03.082] http: GET /content-manager/content-types (43 ms) 200
	2024-07-08T20:24:13.148Z	[2024-07-08 20:24:13.148] http: POST /graphql (647 ms) 200
	2024-07-08T20:24:17.578Z	[2024-07-08 20:24:17.577] http: POST /graphql (247 ms) 200
	2024-07-08T20:24:29.871Z	[2024-07-08 20:24:29.871] http: POST /graphql (31 ms) 200
	2024-07-08T20:24:58.444Z	[2024-07-08 20:24:58.443] http: GET /api/healthcheck (0 ms) 200
	2024-07-08T20:24:58.542Z	[2024-07-08 20:24:58.542] http: GET /api/healthcheck (2 ms) 200

I’m not sure what the issue could be. It seems like it’s not retrieving the password again? But after it fails and retries, it does work and returns the data.
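
To make the "not retrieving the password again" idea concrete, here is a minimal sketch of a connection provider that re-fetches the credentials once they are older than a given age. It assumes the secret may change while the app is running (for example through rotation, which I have not confirmed). It relies on knex accepting an async function as `connection` and consulting an `expirationChecker` returned with the settings before it opens new connections. The names `createConnectionProvider`, `fetchSecret`, `base`, `ttlMs` and `now` are hypothetical, and the fake secret source stands in for the `GetSecretValueCommand` call above.

```javascript
"use strict";

/**
 * Build an async connection provider that hands out fresh credentials.
 *
 * fetchSecret: hypothetical async function resolving to { username, password }
 * base:        static connection settings (host, port, database, ...)
 * ttlMs:       how long a fetched secret is treated as valid
 * now:         clock function, injectable for testing
 */
function createConnectionProvider(fetchSecret, base, ttlMs, now = Date.now) {
  return async function connection() {
    const { username, password } = await fetchSecret();
    const fetchedAt = now();
    return {
      ...base,
      user: username,
      password,
      // knex consults this before creating a new pooled connection; when it
      // returns true, knex calls the provider again for new settings.
      expirationChecker: () => now() - fetchedAt >= ttlMs,
    };
  };
}

// Example with a fake secret source standing in for Secrets Manager.
let secret = { username: "postgres", password: "old-password" };
const provider = createConnectionProvider(
  async () => secret,
  { host: "db.example.internal", port: 5432, database: "strapi" },
  5 * 60 * 1000 // re-fetch after five minutes (arbitrary value)
);

provider().then((settings) => {
  console.log(settings.user, typeof settings.expirationChecker);
});
```

With plain knex this would be passed as `connection: provider`. Whether Strapi v4.10.5 accepts a function for `database.connection.connection` is an assumption I have not tested.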