Strapi model schema management

System Information
  • Strapi Version: 4.9.2
  • Operating System:
  • Database: postgres
  • Node Version: 18.x
  • NPM Version:
  • Yarn Version:

While going through the documentation on database migrations and the various options for moving data between systems via the cli tool, I ended up in this discussion about the changes Strapi makes to the db schema when syncing content schema changes via the content-type builder: https://feedback.strapi.io/developer-experience/p/gracefully-handle-renaming-of-content-types-and-fields-in-the-ctb.

What I understand (and have verified by testing a local setup of Strapi 4.9.2 running against PostgreSQL) is that when Strapi runs in development mode (strapi develop), any change to a model schema through the content-type builder will by default delete every column of the related model’s table that is not mentioned in the model’s json schema definition. For example, renaming a field in the content-type builder and applying the change deletes the original column in the model’s table and creates a new one with the new field’s name. If the table contains other columns not mentioned in the model’s json schema, they are deleted at that point too.
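To illustrate, here is a hypothetical article model whose title attribute was renamed to headline in the content-type builder (all names are made up for this example). After applying the change, only what is listed under attributes survives the sync: the title column is dropped and an empty headline column is created, so the old data is lost.

```json
{
  "kind": "collectionType",
  "collectionName": "articles",
  "info": {
    "singularName": "article",
    "pluralName": "articles",
    "displayName": "Article"
  },
  "attributes": {
    "headline": { "type": "string" }
  }
}
```

Any extra column that exists in the articles table but has no entry under attributes is removed by the same sync.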

Apart from the fact that the field-rename functionality should be used with caution, since it results in data loss, I’m more concerned about the effect this schema sync can have on a production system. Admittedly, Strapi would not run in develop mode (with autoReload enabled) in production, and the content-type builder would be disabled. Still, is there any scenario in which a schema sync would kick in and alter the schema of a production system when inconsistencies are found between the deployed json model schema files and the actual schema in the db? I would expect Strapi to fail if e.g. a required column is missing from a table, but not to try to alter the existing db schema, as this could lead to data loss.

Also, given the above concerns, what is the proposed way to promote model schema changes to a production system? I understand that the migrations described at https://docs.strapi.io/dev-docs/database-migrations could cover that, but that page also mentions:

“These migrations are run automatically when the application starts and are executed before the automated schema migrations that Strapi also performs on boot.”

so I understand that even in a production setup (Strapi started with “start” instead of “develop”) there is an automated schema migration on boot that could alter or delete parts of the existing db schema.
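For reference, a migration file as described on that page is a plain module in ./database/migrations/ whose up function receives a Knex instance. The sketch below (table and column names are hypothetical, continuing the rename example) is additive and idempotent, so it should be safe even if an older instance is still running against the db:

```javascript
// ./database/migrations/2023.05.01T00.00.00.add-headline.js
// Hypothetical additive migration: add a nullable `headline` column to
// `articles` and backfill it from the old `title` column.
module.exports = {
  async up(knex) {
    const exists = await knex.schema.hasColumn('articles', 'headline');
    if (!exists) {
      await knex.schema.alterTable('articles', (table) => {
        table.string('headline').nullable();
      });
      // Backfill so both the old and the new read path see the same data.
      await knex('articles')
        .whereNull('headline')
        .update({ headline: knex.ref('title') });
    }
  },
  async down() {
    // Intentionally empty; as far as I can tell, Strapi does not
    // currently execute down migrations.
  },
};
```

Note that the old title column is deliberately left in place here; dropping it would be a separate, later migration.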

In order to deploy content model changes to a production system, you would first apply the related changes in the db and then deploy a new version of Strapi with the aligned json model definitions. However, I want to be sure that any older Strapi instances operating on the new db schema until they are upgraded (e.g. during a rolling deployment to avoid downtime) will still work (assuming backward-compatible schema additions, obviously) and will not alter the db schema in any way to make it match their json model definitions.

So, is anyone using Strapi in a Kubernetes cluster with multiple pods? Is there a strategy for database migrations?

Thanks in advance.


I have many of the same questions about schema and data transfer in Strapi, and I’ve had to resort to experimental changes and diffs of code and SQL dumps to figure this out myself. It’s unfortunate that the Strapi documentation does not really cover any of these topics.

I have very similar questions. Just commenting so I can check back later for an answer.

I have the exact same issues.

The only predictable and safe approach I’ve found so far is to basically follow this one rule: Never rename a column over the span of a single deployment.

If you need to rename a column, do it in two phases:

  1. First create the new column, update the code to read the new column with a fallback to the old one (necessary for any architecture with multiple servers, whether pods in Kubernetes or plain servers behind a load balancer), and deploy. Then run a script (after deployment) or a migration (during deployment) to copy the data from the old column to the new one.
  2. Once the deployment has completed, the script/migration has run, and all pods/servers are guaranteed to read the new column, delete the old column and deploy again.

It’s cumbersome and time-consuming, requires extra planning, and discourages schema updates, but it works.
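As a sketch of the phase-1 read and write paths (all names are hypothetical and not part of any Strapi API, continuing a title-to-headline rename): while both columns coexist, application code prefers the new column and falls back to the old one for rows the copy script has not reached yet, and keeps writing both so older pods still see the value.

```javascript
// Phase-1 read helper for a hypothetical rename of `title` -> `headline`.
// Handles rows written by old pods (only `title` set), rows already
// backfilled (`headline` set), and rows written by new pods.
function readHeadline(row) {
  // Prefer the new column; fall back to the old one if it is missing.
  return row.headline ?? row.title ?? null;
}

// Phase-1 write helper: keep writing the old column too, so old pods
// still see the value until phase 2 removes the column.
function writeHeadline(row, value) {
  return { ...row, headline: value, title: value };
}
```

In phase 2, once every pod reads only the new column, the fallback and the double write are removed along with the old column.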
