Project

General

Profile

Actions

Defect #43838

open

User mentions using the @login syntax cause performance issues when rendering content on instances with a large number of users.

Added by Marius BĂLTEANU about 5 hours ago. Updated about 4 hours ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
Performance
Resolution:
Affected version:

Description

The performance issue was first reported by Jean-Philippe last year when he looked for solutions to optimize redmine.org website. One fix was to replace all the mentions made using @login syntax with user:#id syntax in the following news post.

In ApplicationHelper#parse_redmine_links, the lookup for user mentions is currently hardcoded to use a case-insensitive search: User.visible.find_by("LOWER(login) = :s AND type = 'User'", :s => name.downcase)

This causes two major issues on large instances (e.g., redmine.org with ~500,000 users):
1. Missing Index : The users table does not have an index on the login column.
2. Sequential Scans : The use of LOWER(login) prevents the database from using a standard index even if one were present, forcing a sequential table scan.

Benchmark Results (10000 Users):
I have used the attach file to test the performance.

1. The current trunk

root@f42152843160:/work# rails r bench_mentions.rb 
DEPRECATION WARNING: `config.active_support.to_time_preserves_timezone` is deprecated and will be removed in Rails 8.2 (called from <top (required)> at /work/config/environment.rb:16)
ruby=3.4.8 db=Mysql2 users=10000 mentions_per_text=100 iter=10
Inserting users...
.
Users inserted: 10000
Assigning sample users to project...

Benchmarking @login...
login_ms=3838.64 avg_login=383.86

Benchmarking user#id...
user_link_ms=3856.31 avg_user_link=385.63

Benchmarking @login (worst-case non-existent, forces full scan)...
invalid_login_ms=14433.17 avg_invalid_login=1443.32

Running explains:

mysql> EXPLAIN SELECT `users`.* FROM `users` WHERE `users`.`type` IN ('User', 'AnonymousUser') AND `users`.`status` = 1 AND `users`.`id` = 20008 AND `users`.`type` = 'User' LIMIT 1;
+----+-------------+-------+------------+-------+--------------------------------------------------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type  | possible_keys                                          | key     | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+-------+--------------------------------------------------------+---------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | users | NULL       | const | PRIMARY,index_users_on_id_and_type,index_users_on_type | PRIMARY | 4       | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+-------+--------------------------------------------------------+---------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
mysql> EXPLAIN SELECT `users`.* FROM `users` WHERE `users`.`type` IN ('User', 'AnonymousUser') AND `users`.`status` = 1 AND (LOWER(login) = 'bench_user_59' AND type = 'User') LIMIT 1;
+----+-------------+-------+------------+------+---------------------+---------------------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys       | key                 | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------------+---------------------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | users | NULL       | ref  | index_users_on_type | index_users_on_type | 1023    | const | 5046 |    10.00 | Using where |
+----+-------------+-------+------------+------+---------------------+---------------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
mysql> EXPLAIN  SELECT `users`.* FROM `users` WHERE `users`.`type` IN ('User', 'AnonymousUser') AND `users`.`status` = 1 AND (LOWER(login) = 'missing_user_100' AND type = 'User') LIMIT 1;
+----+-------------+-------+------------+------+---------------------+---------------------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys       | key                 | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------------+---------------------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | users | NULL       | ref  | index_users_on_type | index_users_on_type | 1023    | const | 5046 |    10.00 | Using where |
+----+-------------+-------+------------+------+---------------------+---------------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

2. Add index to users.login, results the same:


root@f42152843160:/work# rails r bench_mentions.rb 
DEPRECATION WARNING: `config.active_support.to_time_preserves_timezone` is deprecated and will be removed in Rails 8.2 (called from <top (required)> at /work/config/environment.rb:16)
ruby=3.4.8 db=Mysql2 users=10000 mentions_per_text=100 iter=10
Inserting users...
.
Users inserted: 10000
Assigning sample users to project...

Benchmarking @login...
login_ms=4017.54 avg_login=401.75

Benchmarking user#id...
user_link_ms=3860.56 avg_user_link=386.06

Benchmarking @login (worst-case non-existent, forces full scan)...
invalid_login_ms=14659.24 avg_invalid_login=1465.92

3. Apply the attached patch


DEPRECATION WARNING: `config.active_support.to_time_preserves_timezone` is deprecated and will be removed in Rails 8.2 (called from <top (required)> at /work/config/environment.rb:16)
ruby=3.4.8 db=Mysql2 users=10000 mentions_per_text=100 iter=10
Inserting users...
.
Users inserted: 10000
Assigning sample users to project...

Benchmarking @login...
login_ms=3918.89 avg_login=391.89

Benchmarking user#id...
user_link_ms=3636.01 avg_user_link=363.6

Benchmarking @login (worst-case non-existent, forces full scan)...
invalid_login_ms=2576.47 avg_invalid_login=257.65
mysql> EXPLAIN SELECT `users`.* FROM `users` WHERE `users`.`type` IN ('User', 'AnonymousUser') AND `users`.`status` = 1 AND `users`.`login` = 'missing_user_100' LIMIT 1;
+----+-------------+-------+------------+------+------------------------------------------+----------------------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys                            | key                  | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-------+------------+------+------------------------------------------+----------------------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | users | NULL       | ref  | index_users_on_type,index_users_on_login | index_users_on_login | 1022    | const |    1 |     5.00 | Using where |
+----+-------------+-------+------------+------+------------------------------------------+----------------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
mysql> EXPLAIN SELECT `users`.* FROM `users` WHERE `users`.`type` IN ('User', 'AnonymousUser') AND `users`.`status` = 1 AND `users`.`login` = 'bench_user_92' LIMIT 1;
+----+-------------+-------+------------+------+------------------------------------------+----------------------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys                            | key                  | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-------+------------+------+------------------------------------------+----------------------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | users | NULL       | ref  | index_users_on_type,index_users_on_login | index_users_on_login | 1022    | const |    1 |     5.00 | Using where |
+----+-------------+-------+------------+------+------------------------------------------+----------------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

Running the bench test with 500000 users will just increase the differences between looking by id or by login.

The proposed changes:
  • fix the issues only for the mention feature, method User.find_by_login(username) will still fail over to case-insensitive if none was found which will not use any index.
  • breaks mentioning feature for those instances where the username is not lowercase.

Another solution is to add an index for LOWER(login), but I didn't run a test yet.


Files

Actions

Also available in: Atom PDF