Performance issue

Hi Forestadmin team,
I have some performances issues when managing related data due to the way the requests are generated with Sequelize.

For example; between my collections users and crm_user_exchanges (it is the same thing with other related data), I have defined this:

db.users.hasMany(db.crm_user_exchanges);
db.addresses.belongsTo(db.users, { as: 'user', foreignKey: 'user_id' });

When using that model, it is working.
When I access my crm_user_exchanges collection inside related data section of my users collection,
a SELECT request is well generated by joining users LEFT OUTER JOIN crm_user_exchanges ON …
BUT all the columns of my users collection are used in that specific SELECT.
And because my users collection is very large, I am getting very poor performances.

With all my belongsTo relations referring to users, i have a lot of performance issues, because sequelize is generating all the requests with all the columns of each related data PLUS all the columns of the users collection.

Would id be possible to generate that specific SELECT without all the columns of the parent table, but only a selection of them?
Is there a way to do this?

I had tried already the following workaround:
By defining another model t_users which is only an abstract of the native table model, I have changed the model to:

db.users.hasMany(db.crm_user_exchanges);
db.addresses.belongsTo(db.t_users, { as: 'user', foreignKey: 'user_id' });

Note how the belongsTo association is now using t_users.

This works very well when accessing the crm_user_exchanges collection.
But then, I have a new issue:
When trying to create a new record in crm_user_exchanges , the id of the parent collection ( users here) is not automatically populated.

It’s happening when accessing related data of a given user record

To complete that ticket:
When browsing from the crm_user_exchanges root collection (not from users collection), I have a similar problem because of the belongsTo association to users .
When Forest is displaying the table, sequelize is generating a SELECT with all the columns of both collections, while I would need only the reference field of users to be returned by the query, and it is generating performances problems too.

So it would be a big improvement in performances if that could be fixed.
Thanks for letting me know of any update on this.

Hi Louis-Marie,

Thanks for following up on this performance issue. We haven’t been able to tackle this subject yet but we will keep you informed of our progress in this discussion.

Best,
Louis

Hi @Louis-Marie :wave:, I have some good news.

Our team worked on improving performance of queries on HasMany relationships, which should definitely help solve your issue!

:warning: This improvement is available from the version 6.3.4 of our express-sequelize liana (you’re currently running on v6.0.0 in production).

You can run npm install forest-express-sequelize@latest to benefit from our latest version.

Let us know if that helps!

1 Like

Hi @louis
Many thanks for this update.

I will try ASAP and will let you know.
Best

1 Like

Hi,

Unfortunately, it seems that this new version is generating a regression issue.
Please check Regression issue

Thanks and regards

Hi @Louis-Marie,

Apart from the regression use case described in the new post, do you observe the benefits of the new implementation (if you bypass the regression)?

Thanks for your help.

Hi @arnaud

Thank your for your message.
In my test environment, it is still to difficult to say, as our database is very small.
I need to test it in our production environment.

Furthermore, to make it work, I would need to change many different references I used in my collections (I use many smart fields indeed).
So, I need to wait for the fix on the regression issue, and then I hope to be able to make the new test.

I will keep you informed.

Best.

@Louis-Marie as indicated in the other thread, we released a new version of forest-express-sequelize with a fix on the regression.

Hi @GuillaumeGautreau,

The regression issue is now fixed.

However, I can see that the generated SQL command which is generated when displaying my crm_user_exchanges collection, still does include all the users columns.

As a reminder, I have the following relationships defined:

db.users.hasMany(db.crm_user_exchanges);
db.crm_user_exchanges.belongsTo(db.users, { as: ‘user’, foreignKey: ‘user_id’, targetKey: ‘id’ });

When I display a record of my users collection, and then accessing the crm_user_exchanges related data (associated with that record), the generated SQL looks like the following one:

SELECT
“users”.“id”,
“crm_user_exchanges”.“id” AS “crm_user_exchanges.id”, “crm_user_exchanges”.“created_at” AS “crm_user_exchanges.created_at”,
“crm_user_exchanges”.“updated_at” AS “crm_user_exchanges.updated_at”, “crm_user_exchanges”.“deleted_at” AS “crm_user_exchanges.deleted_at”,
“crm_user_exchanges”.“subject” AS “crm_user_exchanges.subject”,
…
… all crm_user_exchanges columns …
…
“crm_user_exchanges.order_col”, “crm_user_exchanges”.“level” AS “crm_user_exchanges.level”,
“crm_user_exchanges->user”.“id” AS “crm_user_exchanges.user.id”,
“crm_user_exchanges->user”.“created_at” AS “crm_user_exchanges.user.created_at”,
“crm_user_exchanges->user”.“updated_at” AS “crm_user_exchanges.user.updated_at”, “crm_user_exchanges->user”.“deleted_at” AS “crm_user_exchanges.user.deleted_at”,
“crm_user_exchanges->user”.“email” AS “crm_user_exchanges.user.email”,
“crm_user_exchanges->user”.“is_email_verified” AS “crm_user_exchanges.user.is_email_verified”,
…
… all users columns … why? … only the columns required in the Reference should be used
…
FROM “vusers” AS “users”
LEFT OUTER JOIN “vcrm_user_exchanges” AS “crm_user_exchanges” ON “users”.“id” = “crm_user_exchanges”.“user_id”
LEFT OUTER JOIN “vusers” AS “crm_user_exchanges->user” ON “crm_user_exchanges”.“user_id” = “crm_user_exchanges->user”.“id”
WHERE
“users”.“id” = ‘1f04fe74-b47f-4d6e-aa83-91973c78997e’
ORDER BY
“crm_user_exchanges”.“order_col” ASC LIMIT 20 OFFSET 0;

All the users columns are still present, this is my issue.

My setup:

“database_type”: “postgres”,
“liana”: “forest-express-sequelize”,
“liana_version”: “6.3.11”,
“engine”: “nodejs”,
“engine_version”: “12.13.1”,
“framework”: “express”,
“framework_version”: “^4.17.1”,
“orm_version”: “4.44.0”

Thanks

Hello,

Are you displaying a smart field as the reference field for your user collection?

Hi @GuillaumeGautreau,

Yes, the reference field of my users collection is a smart field.
Thanks

That’s why all fields from the user table are retrieved. The code does not have any clue on which of the fields are actually used by your smart field.

That’s why for the moment, there is no other way than requesting all the fields.

If you choose to display a real field that is present in your table, then only this field will be requested for each user.

Hi @GuillaumeGautreau,

Yes, I understand.
I made some modifications to replace some reference field by native SQL fields whenever I can. However, it is not always possible.

Our forestadmin production is now running with these last changes and the last release of forest-express-sequelize.
The performances are a bit better but I still have performance problems.

After analyzing the requests that are done when viewing the users collection, I could confirm that all the smart fields of the collections associated with the users collection are computed.
And each one is computed 20 times (because I am displaying 20 lines per page).

Because many of them are requesting themselves information from the DB, I am getting a lot of requests: 20 times per smart field that need access to the DB.

Only for querying the smart fields of the associated collections, I have counted a total of 12 * 20 = 240 SQL requests
while it would need to query only 2 * 20 = 40 requests (because only two smart fields references are needed to query the DB)

So it is still a challenge to optimize Forest and I really do not know how I could do this.
All the native SQL fields + the smart fields of all the associated collections will always generate some performance issues in this context.

Thanks in advance for your help.

Here is my last setup:

  "meta": {
    "database_type": "postgres",
    "liana": "forest-express-sequelize",
    "liana_version": "6.3.11",
    "engine": "nodejs",
    "engine_version": "12.13.1",
    "framework": "express",
    "framework_version": "^4.17.1",
    "orm_version": "4.44.0"
  }

Hello @Louis-Marie. Thanks for your feedback.

Is it possible to start a new thread for this issue, as it seems quite different from the previous one? We will start investigating on this based on your new scenario.

Hi @GuillaumeGautreau

Ok, I started a new thread: Performance issue because of associated relationships.

Thanks in advance

1 Like