Project

General

Profile

Actions

Defect #44132

open

Project creation returns HTTP 500 (`Mysql2::Error: Record has changed since last read in table 'projects'`) under concurrent project create/destroy on MySQL/MariaDB

Added by Matthias Hörmann 2 days ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
Resolution:
Affected version:

Description

Note: the bug report details were created with Claude Code after I ran into the issue when trying to make some updates to my redmine-api Rust crate and running the test suite.

Environment

  • Redmine: 6.1.2.stable
  • Rails: 7.2.3.1
  • Ruby: 3.4.9
  • Database: MariaDB 11.8.8
  • OS / server: Debian x86_64, Passenger + Apache
  • ActiveJob queue adapter: default (:async -- jobs run in-process, concurrently with web requests)

Summary

Creating a project via POST /projects(.json) intermittently fails with HTTP 500 when another project is being destroyed at the same time. The underlying error is:

ActiveRecord::StatementInvalid (Mysql2::Error: Record has changed since last read
in table 'projects'; try restarting transaction)

(MariaDB error 1020, ER_CHECKREAD.) The exception is not rescued or retried, so the request returns 500 instead of either serializing cleanly or retrying.

Because project deletion runs asynchronously via DestroyProjectJob (ActiveJob), the destroy of one project routinely overlaps the creation of the next -- so this is hit even by a single, sequential client (no parallel clients required).

Steps to reproduce

  1. Redmine on MySQL/MariaDB with the default async job queue.
  2. In quick succession (e.g. an API/integration test that sets up and tears down many projects):
    • DELETE /projects/:id -- enqueues DestroyProjectJob and returns immediately;
    • POST /projects -- create another project while that destroy job is still running.
  3. Observe intermittent HTTP 500 on the create. Which create fails varies run-to-run (it is a race); the immediately following create (after the destroy job finishes) succeeds.

A test suite that creates/deletes a project per test is an extremely reliable trigger: the async destroy of test N overlaps the create of test N+1.

Expected behaviour

Concurrent project create / move / destroy serialize correctly (or transiently conflicting transactions are retried), and the create succeeds.

Actual behaviour

The create transaction aborts with Mysql2::Error: Record has changed since last read in table 'projects'; try restarting transaction and Redmine returns HTTP 500.

Log excerpt (production.log)

Started POST "/projects.json" for ... at ...
Processing by ProjectsController#create as JSON
  Parameters: {"project" => {"name" => "...", "identifier" => "...", "enabled_module_names" => [...]}}
[ActiveJob] [DestroyProjectJob] [...] Performed DestroyProjectJob (...) from Async(default) in 239.19ms   # <-- concurrent destroy of a DIFFERENT project
Completed 500 Internal Server Error in 56ms (ActiveRecord: 37.9ms (21 queries, 2 cached) | GC: 2.0ms)
FATAL -- :
ActiveRecord::StatementInvalid (Mysql2::Error: Record has changed since last read in table 'projects'; try restarting transaction):

Causes:
Mysql2::Error (Record has changed since last read in table 'projects'; try restarting transaction)

The next POST /projects.json (≈0.2s later, after the destroy job completed) returns 201 Created.

Root-cause analysis

Project create and destroy both rewrite the lft/rgt nested-set columns across the whole projects table:

  • before_create :add_to_nested_setUPDATE projects SET lft = ..., rgt = ... WHERE lft >= ? OR rgt >= ? (lib/redmine/nested_set/project_nested_set.rb:58-69)
  • before_update :move_in_nested_set (project_nested_set.rb:71-114)
  • before_destroy :destroy_childrenUPDATE projects SET lft = ..., rgt = ... WHERE lft > ? OR rgt > ? (project_nested_set.rb:116-131)

All of these guard with lock_nested_set, which on MySQL/MariaDB does only a SELECT ... FOR UPDATE of all project rows:

# lib/redmine/nested_set/project_nested_set.rb:150-156
def lock_nested_set
  lock = true
  if /sqlserver/i.match?(self.class.connection.adapter_name)
    lock = "WITH (ROWLOCK HOLDLOCK UPDLOCK)" 
  end
  self.class.order(:id).lock(lock).ids   # => SELECT id FROM projects ORDER BY id FOR UPDATE
end

Relying on MySQL/MariaDB row locks to serialize whole-table nested-set rewrites is exactly the situation the Issue nested set already documents as broken and works around with a global advisory lock:

# lib/redmine/nested_set/issue_nested_set.rb:171-194 (mysql branch)
elsif Redmine::Database.mysql?
  # Use a global lock to prevent concurrent modifications - MySQL row locks are broken,
  # this will run into deadlock errors all the time otherwise.
  Issue.with_advisory_lock!("lock_issues", timeout_seconds: 30) do
    ...
    yield
  end

Redmine::NestedSet::ProjectNestedSet#lock_nested_set has no equivalent advisory-lock branch -- so concurrent project create/move/destroy collide at the storage-engine level on MariaDB, producing error 1020 (ER_CHECKREAD), which bubbles up as an uncaught ActiveRecord::StatementInvalid and a 500.

The window is wide open in normal operation because project deletion is asynchronous:

# app/jobs/destroy_project_job.rb
def self.schedule(project, user: User.current)
  project.self_and_descendants.update_all status: Project::STATUS_SCHEDULED_FOR_DELETION
  perform_later project.id, user.id, user.remote_ip   # runs later, concurrently
end

ProjectsController#destroy (app/controllers/projects_controller.rb:294, DestroyProjectJob.schedule at line 302) returns before the destroy actually runs, so the destroy's nested-set rewrite overlaps any project create that follows.

Suggested fix

Give the Project nested set the same advisory-lock guard the Issue nested set already uses (the with_advisory_lock gem is already a dependency). Convert ProjectNestedSet#lock_nested_set to take a block (as IssueNestedSet#lock_nested_set does) and, on MySQL/MariaDB, hold a global advisory lock across the read + the subsequent nested-set @UPDATE@s. Sketch:

def lock_nested_set
  if /sqlserver/i.match?(self.class.connection.adapter_name)
    self.class.order(:id).lock("WITH (ROWLOCK HOLDLOCK UPDLOCK)").ids
    yield
  elsif Redmine::Database.mysql?
    Project.with_advisory_lock!("lock_projects", timeout_seconds: 30) do
      self.class.order(:id).lock.ids   # for good measure
      yield
    end
  else
    self.class.order(:id).lock.ids     # FOR UPDATE works fine on PostgreSQL
    yield
  end
end

Call sites (add_to_nested_set, move_in_nested_set, destroy_children) would move their nested-set updates into the lock_nested_set { ... } block. Because the project hierarchy is a single global tree (no root_id partitioning like issues), one global advisory lock ("lock_projects") suffices.

Optionally/additionally, rescue the transient "try restarting transaction" class of error and retry the transaction a bounded number of times -- the DB is literally asking the caller to retry.

Workaround for affected installs

Set the ActiveJob queue adapter so DestroyProjectJob runs inline/synchronously (deletion completes within the DELETE request), which closes the create-vs-destroy window. This does not address other concurrent project operations (e.g. two creates, or admin actions racing the API), which the advisory-lock fix would.

Notes

  • PostgreSQL is unaffected: SELECT ... FOR UPDATE serializes correctly there. This is MySQL/MariaDB-specific, consistent with the existing comment in issue_nested_set.rb.
  • Reproduced against the public redmine-api Rust client integration test suite, which creates and deletes a project for many of its tests; 1 - 2 project creates per full run fail with this 500, varying by run.

No data to display

Actions

Also available in: Atom PDF