LEFT OUTER JOIN on count makes query very slow

Expected behavior

Count a record to be fast and don’t do unnessary joins which bybass indexes

Actual behavior

When browsing our user collection, forest slows down due to a slow count query.

SELECT count("user"."id") AS "count" FROM "public"."user" AS "user" LEFT OUTER JOIN "public"."organization" AS "organization" ON "user"."organization_id" = "organization"."id";

How can we make forest-admin to not join here since it is useless, maybe even useless for all use case and you can change the behaviour on your side?

  • Package Version: forest-express-sequelize 7.6.3
  • Express Version: 4.17.1
  • Sequelize Version: ~5.22.4
  • Database Dialect: pg
  • Database Version: 9.4
  • Forest-Environment-Id: 68385
  • Forest-Project-Id: 54928
  • Forest-Rendering-Id: 82745
  • Forest-Team-Id: 54359

Hi @JonasKman !
Can you share the model for your user table ?

What exactly do you need, since it is a big model ~ 450 LoC ?

Do you have any association, or inluded data in your model definition ?

If you have one, can you try to hide the related data fields as stated in the documentation

Yes it has associations

    User.belongsTo(models.organization, {
      foreignKey: {
        name: 'organizationIdKey',
        field: 'organization_id',
      },
      as: 'organization',
    });

We want this relation visible in the table view so hiding it is not an option.

My current fix is:

// Get a number of Users
router.get('/user/count', permissionMiddlewareCreator.list(), (request, response, next) => {
  const fields = Object.keys(request.query.fields).filter((it) => it != 'user');
  for (const field of fields) {
    delete request.query.fields[field];
  }
  request.query.fields.user = request.query.fields.user
    .split(',')
    .filter((it) => !fields.includes(it))
    .join(',');

  next();
});

But I think the forest-express-sequelize package should remove the JOIN by default as long as there is no filter active.

The method used to retreive the count is the same used to get the data, this might indeed need refactoring. I’ll push your request to our product team !

We have some findings we’d like to share:

  1. Since the id as primarykey is non nullable postgres suggest using count(*)
    image
    PostgreSQL: Documentation: 9.4: Aggregate Functions

  2. When using count(id) and LEFT OUTER JOIN the database (postgres) becomes slow because it uses a sequential scan:

EXPLAIN SELECT count("user"."id") AS "count" FROM "public"."user" AS "user" LEFT OUTER JOIN "public"."organization" AS "organization" ON "user"."organization_id" = "organization"."id";
+----------------------------------------------------------------------+
|QUERY PLAN                                                            |
+----------------------------------------------------------------------+
|Aggregate  (cost=4400144.78..4400144.79 rows=1 width=16)              |
|  ->  Seq Scan on "user"  (cost=0.00..4397730.82 rows=965582 width=16)|
+----------------------------------------------------------------------+

vs

EXPLAIN SELECT count(*) AS "count" FROM "public"."user" AS "user" LEFT OUTER JOIN "public"."organization" AS "organization" ON "user"."organization_id" = "organization"."id";
+-----------------------------------------------------------------------------------------------------------+
|QUERY PLAN                                                                                                 |
+-----------------------------------------------------------------------------------------------------------+
|Aggregate  (cost=1042091.14..1042091.15 rows=1 width=8)                                                    |
|  ->  Index Only Scan using user_organization_id_idx on "user"  (cost=0.55..1039677.19 rows=965582 width=8)|
+-----------------------------------------------------------------------------------------------------------+

1 Like