| 1 |
|
# redMine - project management software
|
| 2 |
|
# Copyright (C) 2006-2007 Jean-Philippe Lang
|
| 3 |
|
# Visual Source Safe modules: Aruo Miura
|
| 4 |
|
#
|
| 5 |
|
# This program is free software; you can redistribute it and/or
|
| 6 |
|
# modify it under the terms of the GNU General Public License
|
| 7 |
|
# as published by the Free Software Foundation; either version 2
|
| 8 |
|
# of the License, or (at your option) any later version.
|
| 9 |
|
#
|
| 10 |
|
# This program is distributed in the hope that it will be useful,
|
| 11 |
|
# but WITHOUT ANY WARRANTY; without even the implied warranty of
|
| 12 |
|
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
| 13 |
|
# GNU General Public License for more details.
|
| 14 |
|
#
|
| 15 |
|
# You should have received a copy of the GNU General Public License
|
| 16 |
|
# along with this program; if not, write to the Free Software
|
| 17 |
|
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
|
| 18 |
|
|
| 19 |
|
require 'redmine/scm/adapters/abstract_adapter'
|
| 20 |
|
require 'rexml/document'
|
| 21 |
|
require 'win32ole'
|
| 22 |
|
require 'tempfile'
|
| 23 |
|
#require 'nkf'
|
| 24 |
|
|
| 25 |
|
#============copy of diff-lcs, only change names. Diff -> DiffX
|
| 26 |
|
|
| 27 |
|
#lcs.rb
|
| 28 |
|
|
| 29 |
|
module DiffX
|
| 30 |
|
# = Diff::LCS 1.1.2
|
| 31 |
|
# Computes "intelligent" differences between two sequenced Enumerables.
|
| 32 |
|
# This is an implementation of the McIlroy-Hunt "diff" algorithm for
|
| 33 |
|
# Enumerable objects that include Diffable.
|
| 34 |
|
#
|
| 35 |
|
# Based on Mario I. Wolczko's <mario@wolczko.com> Smalltalk version
|
| 36 |
|
# (1.2, 1993) and Ned Konz's <perl@bike-nomad.com> Perl version
|
| 37 |
|
# (Algorithm::Diff).
|
| 38 |
|
#
|
| 39 |
|
# == Synopsis
|
| 40 |
|
# require 'diff/lcs'
|
| 41 |
|
#
|
| 42 |
|
# seq1 = %w(a b c e h j l m n p)
|
| 43 |
|
# seq2 = %w(b c d e f j k l m r s t)
|
| 44 |
|
#
|
| 45 |
|
# lcs = Diff::LCS.LCS(seq1, seq2)
|
| 46 |
|
# diffs = Diff::LCS.diff(seq1, seq2)
|
| 47 |
|
# sdiff = Diff::LCS.sdiff(seq1, seq2)
|
| 48 |
|
# seq = Diff::LCS.traverse_sequences(seq1, seq2, callback_obj)
|
| 49 |
|
# bal = Diff::LCS.traverse_balanced(seq1, seq2, callback_obj)
|
| 50 |
|
# seq2 == Diff::LCS.patch(seq1, diffs)
|
| 51 |
|
# seq2 == Diff::LCS.patch!(seq1, diffs)
|
| 52 |
|
# seq1 == Diff::LCS.unpatch(seq2, diffs)
|
| 53 |
|
# seq1 == Diff::LCS.unpatch!(seq2, diffs)
|
| 54 |
|
# seq2 == Diff::LCS.patch(seq1, sdiff)
|
| 55 |
|
# seq2 == Diff::LCS.patch!(seq1, sdiff)
|
| 56 |
|
# seq1 == Diff::LCS.unpatch(seq2, sdiff)
|
| 57 |
|
# seq1 == Diff::LCS.unpatch!(seq2, sdiff)
|
| 58 |
|
#
|
| 59 |
|
# Alternatively, objects can be extended with Diff::LCS:
|
| 60 |
|
#
|
| 61 |
|
# seq1.extend(Diff::LCS)
|
| 62 |
|
# lcs = seq1.lcs(seq2)
|
| 63 |
|
# diffs = seq1.diff(seq2)
|
| 64 |
|
# sdiff = seq1.sdiff(seq2)
|
| 65 |
|
# seq = seq1.traverse_sequences(seq2, callback_obj)
|
| 66 |
|
# bal = seq1.traverse_balanced(seq2, callback_obj)
|
| 67 |
|
# seq2 == seq1.patch(diffs)
|
| 68 |
|
# seq2 == seq1.patch!(diffs)
|
| 69 |
|
# seq1 == seq2.unpatch(diffs)
|
| 70 |
|
# seq1 == seq2.unpatch!(diffs)
|
| 71 |
|
# seq2 == seq1.patch(sdiff)
|
| 72 |
|
# seq2 == seq1.patch!(sdiff)
|
| 73 |
|
# seq1 == seq2.unpatch(sdiff)
|
| 74 |
|
# seq1 == seq2.unpatch!(sdiff)
|
| 75 |
|
#
|
| 76 |
|
# Default extensions are provided for Array and String objects through
|
| 77 |
|
# the use of 'diff/lcs/array' and 'diff/lcs/string'.
|
| 78 |
|
#
|
| 79 |
|
# == Introduction (by Mark-Jason Dominus)
|
| 80 |
|
#
|
| 81 |
|
# <em>The following text is from the Perl documentation. The only
|
| 82 |
|
# changes have been to make the text appear better in Rdoc</em>.
|
| 83 |
|
#
|
| 84 |
|
# I once read an article written by the authors of +diff+; they said
|
| 85 |
|
# that they hard worked very hard on the algorithm until they found the
|
| 86 |
|
# right one.
|
| 87 |
|
#
|
| 88 |
|
# I think what they ended up using (and I hope someone will correct me,
|
| 89 |
|
# because I am not very confident about this) was the `longest common
|
| 90 |
|
# subsequence' method. In the LCS problem, you have two sequences of
|
| 91 |
|
# items:
|
| 92 |
|
#
|
| 93 |
|
# a b c d f g h j q z
|
| 94 |
|
# a b c d e f g i j k r x y z
|
| 95 |
|
#
|
| 96 |
|
# and you want to find the longest sequence of items that is present in
|
| 97 |
|
# both original sequences in the same order. That is, you want to find a
|
| 98 |
|
# new sequence *S* which can be obtained from the first sequence by
|
| 99 |
|
# deleting some items, and from the second sequence by deleting other
|
| 100 |
|
# items. You also want *S* to be as long as possible. In this case *S*
|
| 101 |
|
# is:
|
| 102 |
|
#
|
| 103 |
|
# a b c d f g j z
|
| 104 |
|
#
|
| 105 |
|
# From there it's only a small step to get diff-like output:
|
| 106 |
|
#
|
| 107 |
|
# e h i k q r x y
|
| 108 |
|
# + - + + - + + +
|
| 109 |
|
#
|
| 110 |
|
# This module solves the LCS problem. It also includes a canned function
|
| 111 |
|
# to generate +diff+-like output.
|
| 112 |
|
#
|
| 113 |
|
# It might seem from the example above that the LCS of two sequences is
|
| 114 |
|
# always pretty obvious, but that's not always the case, especially when
|
| 115 |
|
# the two sequences have many repeated elements. For example, consider
|
| 116 |
|
#
|
| 117 |
|
# a x b y c z p d q
|
| 118 |
|
# a b c a x b y c z
|
| 119 |
|
#
|
| 120 |
|
# A naive approach might start by matching up the +a+ and +b+ that
|
| 121 |
|
# appear at the beginning of each sequence, like this:
|
| 122 |
|
#
|
| 123 |
|
# a x b y c z p d q
|
| 124 |
|
# a b c a b y c z
|
| 125 |
|
#
|
| 126 |
|
# This finds the common subsequence +a b c z+. But actually, the LCS is
|
| 127 |
|
# +a x b y c z+:
|
| 128 |
|
#
|
| 129 |
|
# a x b y c z p d q
|
| 130 |
|
# a b c a x b y c z
|
| 131 |
|
#
|
| 132 |
|
# == Author
|
| 133 |
|
# This version is by Austin Ziegler <diff-lcs@halostatue.ca>.
|
| 134 |
|
#
|
| 135 |
|
# It is based on the Perl Algorithm::Diff by Ned Konz
|
| 136 |
|
# <perl@bike-nomad.com>, copyright © 2000 - 2002 and the Smalltalk
|
| 137 |
|
# diff version by Mario I. Wolczko <mario@wolczko.com>, copyright ©
|
| 138 |
|
# 1993. Documentation includes work by Mark-Jason Dominus.
|
| 139 |
|
#
|
| 140 |
|
# == Licence
|
| 141 |
|
# Copyright © 2004 Austin Ziegler
|
| 142 |
|
# This program is free software; you can redistribute it and/or modify it
|
| 143 |
|
# under the same terms as Ruby, or alternatively under the Perl Artistic
|
| 144 |
|
# licence.
|
| 145 |
|
#
|
| 146 |
|
# == Credits
|
| 147 |
|
# Much of the documentation is taken directly from the Perl
|
| 148 |
|
# Algorithm::Diff implementation and was written originally by Mark-Jason
|
| 149 |
|
# Dominus <mjd-perl-diff@plover.com> and later by Ned Konz. The basic Ruby
|
| 150 |
|
# implementation was re-ported from the Smalltalk implementation, available
|
| 151 |
|
# at ftp://st.cs.uiuc.edu/pub/Smalltalk/MANCHESTER/manchester/4.0/diff.st
|
| 152 |
|
#
|
| 153 |
|
# #sdiff and #traverse_balanced were written for the Perl version by Mike
|
| 154 |
|
# Schilli <m@perlmeister.com>.
|
| 155 |
|
#
|
| 156 |
|
# "The algorithm is described in <em>A Fast Algorithm for Computing Longest
|
| 157 |
|
# Common Subsequences</em>, CACM, vol.20, no.5, pp.350-353, May 1977, with
|
| 158 |
|
# a few minor improvements to improve the speed."
|
| 159 |
|
module LCS
|
| 160 |
|
VERSION = '1.1.2'
|
| 161 |
|
end
|
| 162 |
|
end
|
| 163 |
|
|
| 164 |
|
module DiffX::LCS
|
| 165 |
|
# Returns an Array containing the longest common subsequence(s) between
|
| 166 |
|
# +self+ and +other+. See Diff::LCS#LCS.
|
| 167 |
|
#
|
| 168 |
|
# lcs = seq1.lcs(seq2)
|
| 169 |
|
def lcs(other, &block) #:yields self[ii] if there are matched subsequences:
|
| 170 |
|
DiffX::LCS.LCS(self, other, &block)
|
| 171 |
|
end
|
| 172 |
|
|
| 173 |
|
# Returns the difference set between +self+ and +other+. See
|
| 174 |
|
# Diff::LCS#diff.
|
| 175 |
|
def diff(other, callbacks = nil, &block)
|
| 176 |
|
DiffX::LCS::diff(self, other, callbacks, &block)
|
| 177 |
|
end
|
| 178 |
|
|
| 179 |
|
# Returns the balanced ("side-by-side") difference set between +self+ and
|
| 180 |
|
# +other+. See Diff::LCS#sdiff.
|
| 181 |
|
def sdiff(other, callbacks = nil, &block)
|
| 182 |
|
DiffX::LCS::sdiff(self, other, callbacks, &block)
|
| 183 |
|
end
|
| 184 |
|
|
| 185 |
|
# Traverses the discovered longest common subsequences between +self+ and
|
| 186 |
|
# +other+. See Diff::LCS#traverse_sequences.
|
| 187 |
|
def traverse_sequences(other, callbacks = nil, &block)
|
| 188 |
|
traverse_sequences(self, other, callbacks || DiffX::LCS::YieldingCallbacks,
|
| 189 |
|
&block)
|
| 190 |
|
end
|
| 191 |
|
|
| 192 |
|
# Traverses the discovered longest common subsequences between +self+ and
|
| 193 |
|
# +other+ using the alternate, balanced algorithm. See
|
| 194 |
|
# Diff::LCS#traverse_balanced.
|
| 195 |
|
def traverse_balanced(other, callbacks = nil, &block)
|
| 196 |
|
traverse_balanced(self, other, callbacks || DiffX::LCS::YieldingCallbacks,
|
| 197 |
|
&block)
|
| 198 |
|
end
|
| 199 |
|
|
| 200 |
|
# Attempts to patch a copy of +self+ with the provided +patchset+. See
|
| 201 |
|
# Diff::LCS#patch.
|
| 202 |
|
def patch(patchset)
|
| 203 |
|
DiffX::LCS::patch(self.dup, patchset)
|
| 204 |
|
end
|
| 205 |
|
|
| 206 |
|
# Attempts to unpatch a copy of +self+ with the provided +patchset+.
|
| 207 |
|
# See Diff::LCS#patch.
|
| 208 |
|
def unpatch(patchset)
|
| 209 |
|
DiffX::LCS::unpatch(self.dup, patchset)
|
| 210 |
|
end
|
| 211 |
|
|
| 212 |
|
# Attempts to patch +self+ with the provided +patchset+. See
|
| 213 |
|
# Diff::LCS#patch!. Does no autodiscovery.
|
| 214 |
|
def patch!(patchset)
|
| 215 |
|
DiffX::LCS::patch!(self, patchset)
|
| 216 |
|
end
|
| 217 |
|
|
| 218 |
|
# Attempts to unpatch +self+ with the provided +patchset+. See
|
| 219 |
|
# Diff::LCS#unpatch. Does no autodiscovery.
|
| 220 |
|
def unpatch!(patchset)
|
| 221 |
|
DiffX::LCS::unpatch!(self, patchset)
|
| 222 |
|
end
|
| 223 |
|
end
|
| 224 |
|
|
| 225 |
|
module DiffX::LCS
|
| 226 |
|
class << self
|
| 227 |
|
# Given two sequenced Enumerables, LCS returns an Array containing their
|
| 228 |
|
# longest common subsequences.
|
| 229 |
|
#
|
| 230 |
|
# lcs = Diff::LCS.LCS(seq1, seq2)
|
| 231 |
|
#
|
| 232 |
|
# This array whose contents is such that:
|
| 233 |
|
#
|
| 234 |
|
# lcs.each_with_index do |ee, ii|
|
| 235 |
|
# assert(ee.nil? || (seq1[ii] == seq2[ee]))
|
| 236 |
|
# end
|
| 237 |
|
#
|
| 238 |
|
# If a block is provided, the matching subsequences will be yielded from
|
| 239 |
|
# +seq1+ in turn and may be modified before they are placed into the
|
| 240 |
|
# returned Array of subsequences.
|
| 241 |
|
def LCS(seq1, seq2, &block) #:yields seq1[ii] for each matched:
|
| 242 |
|
matches = DiffX::LCS.__lcs(seq1, seq2)
|
| 243 |
|
ret = []
|
| 244 |
|
matches.each_with_index do |ee, ii|
|
| 245 |
|
unless matches[ii].nil?
|
| 246 |
|
if block_given?
|
| 247 |
|
ret << (yield seq1[ii])
|
| 248 |
|
else
|
| 249 |
|
ret << seq1[ii]
|
| 250 |
|
end
|
| 251 |
|
end
|
| 252 |
|
end
|
| 253 |
|
ret
|
| 254 |
|
end
|
| 255 |
|
|
| 256 |
|
# Diff::LCS.diff computes the smallest set of additions and deletions
|
| 257 |
|
# necessary to turn the first sequence into the second, and returns a
|
| 258 |
|
# description of these changes.
|
| 259 |
|
#
|
| 260 |
|
# See Diff::LCS::DiffCallbacks for the default behaviour. An alternate
|
| 261 |
|
# behaviour may be implemented with Diff::LCS::ContextDiffCallbacks.
|
| 262 |
|
# If a Class argument is provided for +callbacks+, #diff will attempt
|
| 263 |
|
# to initialise it. If the +callbacks+ object (possibly initialised)
|
| 264 |
|
# responds to #finish, it will be called.
|
| 265 |
|
def diff(seq1, seq2, callbacks = nil, &block) # :yields diff changes:
|
| 266 |
|
callbacks ||= DiffX::LCS::DiffCallbacks
|
| 267 |
|
if callbacks.kind_of?(Class)
|
| 268 |
|
cb = callbacks.new rescue callbacks
|
| 269 |
|
callbacks = cb
|
| 270 |
|
end
|
| 271 |
|
traverse_sequences(seq1, seq2, callbacks)
|
| 272 |
|
callbacks.finish if callbacks.respond_to?(:finish)
|
| 273 |
|
|
| 274 |
|
if block_given?
|
| 275 |
|
res = callbacks.diffs.map do |hunk|
|
| 276 |
|
if hunk.kind_of?(Array)
|
| 277 |
|
hunk = hunk.map { |block| yield block }
|
| 278 |
|
else
|
| 279 |
|
yield hunk
|
| 280 |
|
end
|
| 281 |
|
end
|
| 282 |
|
res
|
| 283 |
|
else
|
| 284 |
|
callbacks.diffs
|
| 285 |
|
end
|
| 286 |
|
end
|
| 287 |
|
|
| 288 |
|
# Diff::LCS.sdiff computes all necessary components to show two sequences
|
| 289 |
|
# and their minimized differences side by side, just like the Unix
|
| 290 |
|
# utility <em>sdiff</em> does:
|
| 291 |
|
#
|
| 292 |
|
# old < -
|
| 293 |
|
# same same
|
| 294 |
|
# before | after
|
| 295 |
|
# - > new
|
| 296 |
|
#
|
| 297 |
|
# See Diff::LCS::SDiffCallbacks for the default behaviour. An alternate
|
| 298 |
|
# behaviour may be implemented with Diff::LCS::ContextDiffCallbacks. If
|
| 299 |
|
# a Class argument is provided for +callbacks+, #diff will attempt to
|
| 300 |
|
# initialise it. If the +callbacks+ object (possibly initialised)
|
| 301 |
|
# responds to #finish, it will be called.
|
| 302 |
|
def sdiff(seq1, seq2, callbacks = nil, &block) #:yields diff changes:
|
| 303 |
|
callbacks ||= DiffX::LCS::SDiffCallbacks
|
| 304 |
|
if callbacks.kind_of?(Class)
|
| 305 |
|
cb = callbacks.new rescue callbacks
|
| 306 |
|
callbacks = cb
|
| 307 |
|
end
|
| 308 |
|
traverse_balanced(seq1, seq2, callbacks)
|
| 309 |
|
callbacks.finish if callbacks.respond_to?(:finish)
|
| 310 |
|
|
| 311 |
|
if block_given?
|
| 312 |
|
res = callbacks.diffs.map do |hunk|
|
| 313 |
|
if hunk.kind_of?(Array)
|
| 314 |
|
hunk = hunk.map { |block| yield block }
|
| 315 |
|
else
|
| 316 |
|
yield hunk
|
| 317 |
|
end
|
| 318 |
|
end
|
| 319 |
|
res
|
| 320 |
|
else
|
| 321 |
|
callbacks.diffs
|
| 322 |
|
end
|
| 323 |
|
end
|
| 324 |
|
|
| 325 |
|
# Diff::LCS.traverse_sequences is the most general facility provided by this
|
| 326 |
|
# module; +diff+ and +LCS+ are implemented as calls to it.
|
| 327 |
|
#
|
| 328 |
|
# The arguments to #traverse_sequences are the two sequences to
|
| 329 |
|
# traverse, and a callback object, like this:
|
| 330 |
|
#
|
| 331 |
|
# traverse_sequences(seq1, seq2, Diff::LCS::ContextDiffCallbacks.new)
|
| 332 |
|
#
|
| 333 |
|
# #diff is implemented with #traverse_sequences.
|
| 334 |
|
#
|
| 335 |
|
# == Callback Methods
|
| 336 |
|
# Optional callback methods are <em>emphasized</em>.
|
| 337 |
|
#
|
| 338 |
|
# callbacks#match:: Called when +a+ and +b+ are pointing
|
| 339 |
|
# to common elements in +A+ and +B+.
|
| 340 |
|
# callbacks#discard_a:: Called when +a+ is pointing to an
|
| 341 |
|
# element not in +B+.
|
| 342 |
|
# callbacks#discard_b:: Called when +b+ is pointing to an
|
| 343 |
|
# element not in +A+.
|
| 344 |
|
# <em>callbacks#finished_a</em>:: Called when +a+ has reached the end of
|
| 345 |
|
# sequence +A+.
|
| 346 |
|
# <em>callbacks#finished_b</em>:: Called when +b+ has reached the end of
|
| 347 |
|
# sequence +B+.
|
| 348 |
|
#
|
| 349 |
|
# == Algorithm
|
| 350 |
|
# a---+
|
| 351 |
|
# v
|
| 352 |
|
# A = a b c e h j l m n p
|
| 353 |
|
# B = b c d e f j k l m r s t
|
| 354 |
|
# ^
|
| 355 |
|
# b---+
|
| 356 |
|
#
|
| 357 |
|
# If there are two arrows (+a+ and +b+) pointing to elements of
|
| 358 |
|
# sequences +A+ and +B+, the arrows will initially point to the first
|
| 359 |
|
# elements of their respective sequences. #traverse_sequences will
|
| 360 |
|
# advance the arrows through the sequences one element at a time,
|
| 361 |
|
# calling a method on the user-specified callback object before each
|
| 362 |
|
# advance. It will advance the arrows in such a way that if there are
|
| 363 |
|
# elements <tt>A[ii]</tt> and <tt>B[jj]</tt> which are both equal and
|
| 364 |
|
# part of the longest common subsequence, there will be some moment
|
| 365 |
|
# during the execution of #traverse_sequences when arrow +a+ is pointing
|
| 366 |
|
# to <tt>A[ii]</tt> and arrow +b+ is pointing to <tt>B[jj]</tt>. When
|
| 367 |
|
# this happens, #traverse_sequences will call <tt>callbacks#match</tt>
|
| 368 |
|
# and then it will advance both arrows.
|
| 369 |
|
#
|
| 370 |
|
# Otherwise, one of the arrows is pointing to an element of its sequence
|
| 371 |
|
# that is not part of the longest common subsequence.
|
| 372 |
|
# #traverse_sequences will advance that arrow and will call
|
| 373 |
|
# <tt>callbacks#discard_a</tt> or <tt>callbacks#discard_b</tt>, depending
|
| 374 |
|
# on which arrow it advanced. If both arrows point to elements that are
|
| 375 |
|
# not part of the longest common subsequence, then #traverse_sequences
|
| 376 |
|
# will advance one of them and call the appropriate callback, but it is
|
| 377 |
|
# not specified which it will call.
|
| 378 |
|
#
|
| 379 |
|
# The methods for <tt>callbacks#match</tt>, <tt>callbacks#discard_a</tt>,
|
| 380 |
|
# and <tt>callbacks#discard_b</tt> are invoked with an event comprising
|
| 381 |
|
# the action ("=", "+", or "-", respectively), the indicies +ii+ and
|
| 382 |
|
# +jj+, and the elements <tt>A[ii]</tt> and <tt>B[jj]</tt>. Return
|
| 383 |
|
# values are discarded by #traverse_sequences.
|
| 384 |
|
#
|
| 385 |
|
# === End of Sequences
|
| 386 |
|
# If arrow +a+ reaches the end of its sequence before arrow +b+ does,
|
| 387 |
|
# #traverse_sequence try to call <tt>callbacks#finished_a</tt> with the
|
| 388 |
|
# last index and element of +A+ (<tt>A[-1]</tt>) and the current index
|
| 389 |
|
# and element of +B+ (<tt>B[jj]</tt>). If <tt>callbacks#finished_a</tt>
|
| 390 |
|
# does not exist, then <tt>callbacks#discard_b</tt> will be called on
|
| 391 |
|
# each element of +B+ until the end of the sequence is reached (the call
|
| 392 |
|
# will be done with <tt>A[-1]</tt> and <tt>B[jj]</tt> for each element).
|
| 393 |
|
#
|
| 394 |
|
# If +b+ reaches the end of +B+ before +a+ reaches the end of +A+,
|
| 395 |
|
# <tt>callbacks#finished_b</tt> will be called with the current index
|
| 396 |
|
# and element of +A+ (<tt>A[ii]</tt>) and the last index and element of
|
| 397 |
|
# +B+ (<tt>A[-1]</tt>). Again, if <tt>callbacks#finished_b</tt> does not
|
| 398 |
|
# exist on the callback object, then <tt>callbacks#discard_a</tt> will
|
| 399 |
|
# be called on each element of +A+ until the end of the sequence is
|
| 400 |
|
# reached (<tt>A[ii]</tt> and <tt>B[-1]</tt>).
|
| 401 |
|
#
|
| 402 |
|
# There is a chance that one additional <tt>callbacks#discard_a</tt> or
|
| 403 |
|
# <tt>callbacks#discard_b</tt> will be called after the end of the
|
| 404 |
|
# sequence is reached, if +a+ has not yet reached the end of +A+ or +b+
|
| 405 |
|
# has not yet reached the end of +B+.
|
| 406 |
|
def traverse_sequences(seq1, seq2, callbacks = DiffX::LCS::SequenceCallbacks, &block) #:yields change events:
|
| 407 |
|
matches = DiffX::LCS.__lcs(seq1, seq2)
|
| 408 |
|
|
| 409 |
|
run_finished_a = run_finished_b = false
|
| 410 |
|
string = seq1.kind_of?(String)
|
| 411 |
|
|
| 412 |
|
a_size = seq1.size
|
| 413 |
|
b_size = seq2.size
|
| 414 |
|
ai = bj = 0
|
| 415 |
|
|
| 416 |
|
(0 .. matches.size).each do |ii|
|
| 417 |
|
b_line = matches[ii]
|
| 418 |
|
|
| 419 |
|
ax = string ? seq1[ii, 1] : seq1[ii]
|
| 420 |
|
bx = string ? seq2[bj, 1] : seq2[bj]
|
| 421 |
|
|
| 422 |
|
if b_line.nil?
|
| 423 |
|
unless ax.nil?
|
| 424 |
|
event = DiffX::LCS::ContextChange.new('-', ii, ax, bj, bx)
|
| 425 |
|
event = yield event if block_given?
|
| 426 |
|
callbacks.discard_a(event)
|
| 427 |
|
end
|
| 428 |
|
else
|
| 429 |
|
loop do
|
| 430 |
|
break unless bj < b_line
|
| 431 |
|
bx = string ? seq2[bj, 1] : seq2[bj]
|
| 432 |
|
event = DiffX::LCS::ContextChange.new('+', ii, ax, bj, bx)
|
| 433 |
|
event = yield event if block_given?
|
| 434 |
|
callbacks.discard_b(event)
|
| 435 |
|
bj += 1
|
| 436 |
|
end
|
| 437 |
|
bx = string ? seq2[bj, 1] : seq2[bj]
|
| 438 |
|
event = DiffX::LCS::ContextChange.new('=', ii, ax, bj, bx)
|
| 439 |
|
event = yield event if block_given?
|
| 440 |
|
callbacks.match(event)
|
| 441 |
|
bj += 1
|
| 442 |
|
end
|
| 443 |
|
ai = ii
|
| 444 |
|
end
|
| 445 |
|
ai += 1
|
| 446 |
|
|
| 447 |
|
# The last entry (if any) processed was a match. +ai+ and +bj+ point
|
| 448 |
|
# just past the last matching lines in their sequences.
|
| 449 |
|
while (ai < a_size) or (bj < b_size)
|
| 450 |
|
# last A?
|
| 451 |
|
if ai == a_size and bj < b_size
|
| 452 |
|
if callbacks.respond_to?(:finished_a) and not run_finished_a
|
| 453 |
|
ax = string ? seq1[-1, 1] : seq1[-1]
|
| 454 |
|
bx = string ? seq2[bj, 1] : seq2[bj]
|
| 455 |
|
event = DiffX::LCS::ContextChange.new('>', (a_size - 1), ax, bj, bx)
|
| 456 |
|
event = yield event if block_given?
|
| 457 |
|
callbacks.finished_a(event)
|
| 458 |
|
run_finished_a = true
|
| 459 |
|
else
|
| 460 |
|
ax = string ? seq1[ai, 1] : seq1[ai]
|
| 461 |
|
loop do
|
| 462 |
|
bx = string ? seq2[bj, 1] : seq2[bj]
|
| 463 |
|
event = DiffX::LCS::ContextChange.new('+', ai, ax, bj, bx)
|
| 464 |
|
event = yield event if block_given?
|
| 465 |
|
callbacks.discard_b(event)
|
| 466 |
|
bj += 1
|
| 467 |
|
break unless bj < b_size
|
| 468 |
|
end
|
| 469 |
|
end
|
| 470 |
|
end
|
| 471 |
|
|
| 472 |
|
# last B?
|
| 473 |
|
if bj == b_size and ai < a_size
|
| 474 |
|
if callbacks.respond_to?(:finished_b) and not run_finished_b
|
| 475 |
|
ax = string ? seq1[ai, 1] : seq1[ai]
|
| 476 |
|
bx = string ? seq2[-1, 1] : seq2[-1]
|
| 477 |
|
event = DiffX::LCS::ContextChange.new('<', ai, ax, (b_size - 1), bx)
|
| 478 |
|
event = yield event if block_given?
|
| 479 |
|
callbacks.finished_b(event)
|
| 480 |
|
run_finished_b = true
|
| 481 |
|
else
|
| 482 |
|
bx = string ? seq2[bj, 1] : seq2[bj]
|
| 483 |
|
loop do
|
| 484 |
|
ax = string ? seq1[ai, 1] : seq1[ai]
|
| 485 |
|
event = DiffX::LCS::ContextChange.new('-', ai, ax, bj, bx)
|
| 486 |
|
event = yield event if block_given?
|
| 487 |
|
callbacks.discard_a(event)
|
| 488 |
|
ai += 1
|
| 489 |
|
break unless bj < b_size
|
| 490 |
|
end
|
| 491 |
|
end
|
| 492 |
|
end
|
| 493 |
|
|
| 494 |
|
if ai < a_size
|
| 495 |
|
ax = string ? seq1[ai, 1] : seq1[ai]
|
| 496 |
|
bx = string ? seq2[bj, 1] : seq2[bj]
|
| 497 |
|
event = DiffX::LCS::ContextChange.new('-', ai, ax, bj, bx)
|
| 498 |
|
event = yield event if block_given?
|
| 499 |
|
callbacks.discard_a(event)
|
| 500 |
|
ai += 1
|
| 501 |
|
end
|
| 502 |
|
|
| 503 |
|
if bj < b_size
|
| 504 |
|
ax = string ? seq1[ai, 1] : seq1[ai]
|
| 505 |
|
bx = string ? seq2[bj, 1] : seq2[bj]
|
| 506 |
|
event = DiffX::LCS::ContextChange.new('+', ai, ax, bj, bx)
|
| 507 |
|
event = yield event if block_given?
|
| 508 |
|
callbacks.discard_b(event)
|
| 509 |
|
bj += 1
|
| 510 |
|
end
|
| 511 |
|
end
|
| 512 |
|
end
|
| 513 |
|
|
| 514 |
|
# #traverse_balanced is an alternative to #traverse_sequences. It
|
| 515 |
|
# uses a different algorithm to iterate through the entries in the
|
| 516 |
|
# computed longest common subsequence. Instead of viewing the changes as
|
| 517 |
|
# insertions or deletions from one of the sequences, #traverse_balanced
|
| 518 |
|
# will report <em>changes</em> between the sequences. To represent a
|
| 519 |
|
#
|
| 520 |
|
# The arguments to #traverse_balanced are the two sequences to traverse
|
| 521 |
|
# and a callback object, like this:
|
| 522 |
|
#
|
| 523 |
|
# traverse_balanced(seq1, seq2, Diff::LCS::ContextDiffCallbacks.new)
|
| 524 |
|
#
|
| 525 |
|
# #sdiff is implemented with #traverse_balanced.
|
| 526 |
|
#
|
| 527 |
|
# == Callback Methods
|
| 528 |
|
# Optional callback methods are <em>emphasized</em>.
|
| 529 |
|
#
|
| 530 |
|
# callbacks#match:: Called when +a+ and +b+ are pointing
|
| 531 |
|
# to common elements in +A+ and +B+.
|
| 532 |
|
# callbacks#discard_a:: Called when +a+ is pointing to an
|
| 533 |
|
# element not in +B+.
|
| 534 |
|
# callbacks#discard_b:: Called when +b+ is pointing to an
|
| 535 |
|
# element not in +A+.
|
| 536 |
|
# <em>callbacks#change</em>:: Called when +a+ and +b+ are pointing
|
| 537 |
|
# to the same relative position, but
|
| 538 |
|
# <tt>A[a]</tt> and <tt>B[b]</tt> are
|
| 539 |
|
# not the same; a <em>change</em> has
|
| 540 |
|
# occurred.
|
| 541 |
|
#
|
| 542 |
|
# #traverse_balanced might be a bit slower than #traverse_sequences,
|
| 543 |
|
# noticable only while processing huge amounts of data.
|
| 544 |
|
#
|
| 545 |
|
# The +sdiff+ function of this module is implemented as call to
|
| 546 |
|
# #traverse_balanced.
|
| 547 |
|
#
|
| 548 |
|
# == Algorithm
|
| 549 |
|
# a---+
|
| 550 |
|
# v
|
| 551 |
|
# A = a b c e h j l m n p
|
| 552 |
|
# B = b c d e f j k l m r s t
|