Patch #10470

Efficiently process new git revisions in a single batch

Added by Jeremy Bopp about 12 years ago. Updated about 12 years ago.

Status: Closed
Priority: Normal
Assignee: Toshi MARUYAMA
Category: SCM
Target version:
Start date:
Due date:
% Done: 100%
Estimated time:

Description

As noted in #8857, I am opening a new issue with patches that make processing new revisions with Git more efficient. I'm including the note that introduced the first pass at these patches below.

Here are 2 patches that apply cleanly to the trunk at r9240. The first modifies the usage of git-log to pass all revision arguments via stdin rather than on the command line. This patch can be applied without the second patch if desirable, as it should not change the functionality exposed by the revisions method that uses git-log. Doing this prepares the way for passing large numbers of revisions to git-log without exceeding the limit on command-line length.
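As a rough illustration of the idea (not the code in the attached patch), the sketch below feeds revisions to git-log on stdin using git's `--stdin` option. The method names `git_log_stdin_payload` and `git_log_via_stdin` and the `repo_path` parameter are hypothetical, chosen only for this example.

```ruby
# Hypothetical helper: builds the text fed to `git log --stdin`.
# One revision per line; excluded revisions carry git's "^" prefix.
def git_log_stdin_payload(includes, excludes = [])
  lines = includes.map(&:to_s) + excludes.map { |rev| "^#{rev}" }
  lines.join("\n") + "\n"
end

# Hypothetical wrapper: runs git-log and writes the revisions to its stdin
# instead of appending them to the command line.
def git_log_via_stdin(repo_path, includes, excludes = [])
  cmd = ["git", "--git-dir", repo_path, "log", "--format=%H", "--stdin"]
  IO.popen(cmd, "r+") do |io|
    io.write(git_log_stdin_payload(includes, excludes))
    io.close_write            # signal EOF so git starts processing
    io.read.split("\n")       # one commit hash per line
  end
end
```

Because the revision list travels through the pipe, its size no longer depends on how long a command line the operating system accepts.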

However, in order to support this new behavior, the shellout method had to be slightly modified so that the write end of the pipe it creates is left open upon request. That change could potentially affect other consumers of that method, but I doubt it will. Running the full scm test suite would be a good idea just in case though. I only had time to test git functionality myself.
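To make the shellout change concrete, here is a minimal sketch of a helper that closes the write end of the pipe by default and leaves it open only when asked. It is an assumption-laden stand-in, not Redmine's actual shellout: the name `shellout_sketch` and the `:write_stdin` option are invented for this example.

```ruby
require "rbconfig"

# Hypothetical stand-in for a shellout-style helper.
# By default the write end of the pipe is closed before the block runs,
# so existing callers see the child's stdin at EOF, as before.
# With :write_stdin => true the write end stays open and the caller
# is responsible for writing to it and closing it.
def shellout_sketch(cmd, options = {})
  IO.popen(cmd, "r+") do |io|
    io.close_write unless options[:write_stdin]
    yield io
  end
end

# Usage: a child process that upcases whatever arrives on stdin.
upcase_cmd = [RbConfig.ruby, "-e", "print STDIN.read.upcase"]
result = shellout_sketch(upcase_cmd, :write_stdin => true) do |io|
  io.write("abc\n")
  io.close_write   # the caller closes the write end when done
  io.read
end
```

Callers that never pass the option keep the old behavior, which is why the change is unlikely to affect other SCM adapters.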

The second patch builds upon the first. It processes, in a single batch, all revisions newly introduced since the last time the repository was processed. Each revision in the batch is processed exactly once. Disjoint branch histories and branch rewrites are supported.
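The selection of "new" revisions can be pictured with a toy commit graph. The sketch below is only an in-memory model of what git computes when given the current heads plus the previously processed heads as exclusions; it is not taken from the patch, and the names `reachable` and `new_revisions` and the `parents` hash are hypothetical.

```ruby
require "set"

# parents: Hash mapping a commit id to the array of its parent ids.
# Returns every commit reachable from `starts`, without walking into `stop`.
def reachable(parents, starts, stop = Set.new)
  seen = Set.new
  stack = starts.to_a.dup
  until stack.empty?
    rev = stack.pop
    next if stop.include?(rev) || seen.include?(rev)
    seen << rev                           # each revision is visited once
    stack.concat(parents.fetch(rev, []))
  end
  seen
end

# Revisions reachable from the current heads but not from the heads
# recorded the last time the repository was processed.
def new_revisions(parents, current_heads, processed_heads)
  known = reachable(parents, processed_heads)
  reachable(parents, current_heads, known)
end

# Toy history: a <- b <- c, a side branch b <- d, and a disjoint x <- y.
parents = {
  "a" => [], "b" => ["a"], "c" => ["b"], "d" => ["b"],
  "x" => [], "y" => ["x"]
}
batch = new_revisions(parents, ["c", "d", "y"], ["b"])
```

Here `batch` contains `c`, `d`, `x` and `y`: the shared ancestor `b` is skipped because it was already processed, and the disjoint history is picked up in the same pass.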

All processing, including updating the last processed heads, occurs within a single transaction in order to ensure integrity of the data in case of concurrent attempts to update the repository. This transaction could potentially block updates for other repositories hosted in the same Redmine instance; however, normal operation of git repositories should rarely introduce so many new revisions as to hold this transaction open for very long. An initial import of a large repository on the order of thousands of commits would likely be the only realistic operation that could be a problem. Given the infrequency of that, it is safe to document that such an import should be scheduled for server downtime.

Importantly, due to the resistance toward introducing a migration in my first patch set for #8857, this patch does not include any migrations. A little extra processing is required to maintain the branch name to head revision relationship for every transaction, but this should be negligible. I'll happily introduce another patch on top of this one though in order to do this head processing in a cleaner way that would require a migration. Just let me know if you would take it.

These patches have been updated since they were submitted for #8857. They have been confirmed to work with the following Rubies:

  • ruby 1.8.7 (2011-06-30 patchlevel 352) [x86_64-linux]
  • ruby 1.9.2p290 (2011-07-09 revision 32553) [x86_64-linux]
  • ruby 1.9.3p0 (2011-10-30 revision 33570) [x86_64-linux]
  • ruby 1.8.7 (2012-02-08 patchlevel 358) [i386-mingw32]
  • ruby 1.9.2p290 (2011-07-09) [i386-mingw32]
  • ruby 1.9.3p0 (2011-10-30) [i386-mingw32]

Files

0001-Pass-revisions-to-git-log-via-stdin.patch (10.4 KB): Minimal changes to pass all revisions to git-log via stdin. Jeremy Bopp, 2012-03-16 21:16
0002-Process-new-git-revisions-all-at-once-rather-than-pe.patch (14.3 KB): Batch processes all new revisions for all branches in a single pass. Jeremy Bopp, 2012-03-16 21:16

Related issues

Related to Redmine - Defect #8857: Git: Too long in fetching repositories after upgrade from 1.1 or new branch at first time (Closed, Toshi MARUYAMA, 2011-07-20)
