Cron tasks, database connections and KnexTimeoutError

System Information
  • Strapi Version: 4.7.1
  • Operating System: macOS
  • Database: PostgreSQL
  • Node Version: 18.13.0
  • NPM Version:
  • Yarn Version:

I have a system that processes data about companies.

It starts by importing the public data about the company and then enriches it. Different jobs add information by scraping the company's web page and looking the company up in various registers.
The number of companies that are to be processed is about a million (all organizations in Norway).

All these lookup jobs are done as cron tasks, and there are many of them. Each job is responsible for querying an API and adding data about the company.

The company record has a field, jobName, that defines what job should process the company record next.

When a job starts, it picks the 100 companies that have the jobName that the job is processing and then processes them.

Then the jobName is set to the name of the next job, and this goes on until all the jobs have been processed and the company record has data from all kinds of registers.
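The hand-off between jobs can be sketched as a small helper that, given the current jobName and the ordered pipeline, returns the name of the next job. The pipeline array and job names below are made up for illustration; the real job names live elsewhere in the plugin:

```typescript
// Hypothetical ordered pipeline of enrichment jobs (illustrative names only).
const JOB_PIPELINE = ['import', 'maestro', 'scrapeWebPage', 'registerLookup', 'done'];

// Return the jobName a company record should get after the current job has
// processed it; the last stage maps to itself so finished records stay put.
function nextJobName(current: string, pipeline: string[] = JOB_PIPELINE): string {
  const i = pipeline.indexOf(current);
  if (i === -1) {
    throw new Error(`Unknown job: ${current}`);
  }
  return pipeline[Math.min(i + 1, pipeline.length - 1)];
}
```

A job would then update each processed company with `{ jobName: nextJobName('maestro') }` via `strapi.entityService.update`.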

To prevent cron from starting more than one instance of a job, I have a table, mergeJob, that contains a list of all jobs that can be started.

For each job in mergeJob, there is the status of the job. The field mergeJob.jobStatus can be "Running", "Ready", "Finished", and so on.

This means that if cron should start a job that is already running, it will not start a new instance of the job.

This worked fine during my testing and development. So I know that the code for looking up and storing all kinds of data works.

When testing with all the data, I run into problems with database connections. It runs for a while and then runs out of connections. I have set DATABASE_POOL_MAX=80 so that I have 80 connections to the database. Monitoring the PostgreSQL database, I can see that when all connections are in use I get the error:
"KnexTimeoutError: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?"

It seems that the jobs are not releasing the database connections. One of my jobs is accessing a PostgreSQL database directly using pg. Here I call the function pg.end() to release the database connection.

But how is this done in strapi?

So my question is: where and how do I release the database connections in Strapi?

Below is an extract of my code. I use the job maestro as the example. maestro is the job that moves companies from one job to the next.
The system is implemented as a Strapi plugin.


// From the run_maestro() function, I call strapi.entityService.findMany, strapi.entityService.update, etc.
// Do I need to release the database connection after each of these calls?

    let strapiResponse;
    let mergeJobRecord;
    try {
        strapiResponse = await strapi.entityService.findMany('api::merge-job.merge-job', {
            fields: ['id', 'jobName', 'jobStatus'],
            filters: { jobName: jobName },
        });
        if (strapiResponse.length === 1) {
            mergeJobRecord = strapiResponse[0];
        }
    } catch (e) {
        console.log('Error in findMany mergeJobRecord e=', e);
    }

    if (mergeJobRecord && mergeJobRecord.jobStatus !== 'Running') {
        // Update the mergeJobRecord to Running

        // get all companies that have jobName === 'maestro' and process them
        // ...

        // Update the mergeJobRecord to Finished
    }

This is how I start the cron tasks:

// config/server.ts loads my cron-tasks.ts file, which contains the following code:
export default {
   maestro: {
      task: async ({ strapi }) => {
         let returnString = "";
         returnString = await strapi
            .plugin('urbalurba')
            .service('urbalurbaMergeService')
            .maestro();
      },
      options: {
         rule: '*/2 * * * *', // Every 2 minutes
         tz: 'Europe/Oslo',
      },
   }
};

// src/plugins/urbalurba/services/urbalurbaMergeService.ts
export default ({ strapi }: { strapi: Strapi }) => ({
  async maestro() {
    let returnText = await run_maestro();
    return returnText;
  }
});


I have not found the root cause of the problem, but I have reduced its impact.

First, I discovered that database connections are automatically released. I verified this by adding the following line to my .env file: DEBUG=knex:client

Now I see that a connection has been requested and released.

Using pgadmin I can see the server connections. Here the graph shows the total, active and idle connections. When one of my jobs start, I notice a spike in the total number of connections. Interestingly, the number of idle connections follows the spike in total connections.

After a while, the total and idle connections go down again.

One situation is that my job finishes just fine, but when the next one starts, the idle connections from the previous run have not yet been released. I think the timeout for releasing idle connections can be adjusted.

I have also found that Strapi does not follow the recommendations from Knex. Strapi uses Knex, which in turn uses tarn.js for managing the connection pool. Reading the Knex documentation, I found something interesting about how connections are released. The default value for the pool's min in Strapi is 2. The Knex docs say:

Note that the default value of min is 2 only for historical reasons. It can result in problems with stale connections, despite tarn’s default idle connection timeout of 30 seconds, which is only applied when there are more than min active connections. It is recommended to set min: 0 so all idle connections can be terminated.

So I have set min to 0, as recommended by Knex. I think Strapi should probably set min: 0 as the default as well.
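In Strapi v4, the pool options from config/database.ts are passed straight through to Knex, so the change is a one-liner there. A minimal sketch, with placeholder connection details; idleTimeoutMillis and reapIntervalMillis are shown with tarn's defaults just to make them visible:

```typescript
// config/database.ts (sketch; credentials and host are placeholders)
const databaseConfig = () => ({
  connection: {
    client: 'postgres',
    connection: {
      host: 'localhost',
      port: 5432,
      database: 'strapi',
      user: 'strapi',
      password: 'secret',
    },
    pool: {
      min: 0,                   // as recommended by Knex: let all idle connections be reaped
      max: 80,                  // matches DATABASE_POOL_MAX
      idleTimeoutMillis: 30000, // tarn default: free a connection idle this long
      reapIntervalMillis: 1000, // tarn default: how often idle connections are checked
    },
  },
});

export default databaseConfig;
```

Lowering idleTimeoutMillis would release idle connections sooner between cron runs, at the cost of re-opening them more often.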

The other workaround I have done is to reduce the requests to the database. I’m using a table, mergeJob, to keep track of the status of cron jobs. This prevents several instances of the same job from running. This table is read into a global array variable (defined in ./config/globals.ts).
So now cron jobs just query the global array variable instead of the database.

To keep the global array variable up to date when changes are done to the mergeJob table, I have added lifecycle functions afterCreate and afterUpdate that update the global array variable.