Space after the ":". Also, in the comments, there should be a space after the "#", and each comment should end with a period.
Since we reference table.sidebyside instances many times, we should store the result of $("table.sidebyside") in a variable and reference that variable for all other uses.
This is probably part of the other change, but rather than using inline onclick handlers, I'd prefer that diffviewer.js register these actions for the links. You can give them IDs to reference them.
Detect and display moved parts in a file.
Review Request #926 — Created July 21, 2009 and submitted
This diff builds on my previous whitespace diff, so it also contains those changes, as Mercurial could not keep them apart. It further improves the diff viewer by detecting whole blocks moved inside a file. A move is characterized by a delete in one place and an insert of the same text at another location. The main difficulty is the complexity of the analysis required for this detection, so for now I opted to handle whole blocks only, but the interface and internal representation support an arbitrary technique, so it is possible to build upon them.

Once a move is detected, it is shown as in the screenshots below. Each moved line gets a handle indicating where that line came from or where it went. When the handle is clicked, the page jumps to the chunk containing the corresponding line; the chunk is highlighted, and the line itself is highlighted in bright yellow for 2 seconds.
Tested all sorts of moves: files with only moves, mixed moves and edits, and files where there are several instances of the destination (i.e., when the user cuts a chunk and pastes it several times). In the case of multiple destinations, the last one found is considered the target, while the others are ignored. The effects themselves were tested on Safari 4.0, Chrome 3.0 preview, Firefox 3.5, and Firefox 3.0.
Hey Eduardo,

Can you update this diff now that the whitespace change is in?

I'm trying to think of ways to optimize the algorithm so we don't loop so much at the end. I think we can do this a few different ways. While doing the initial iteration and deciding whether to dump the range into an inserts or deletes bucket, we could generate some sort of hash of the data and store it in a dictionary. The dictionary would map the hash to a list of groups. This may only be needed for the removes bucket.

Then, after putting them into buckets, we can go through the inserts list and see if its generated hash exists in the removes bucket. If it does, loop through the list associated with that hash and do the comparison. This has the benefit of only introducing one mandatory loop, the one for the inserts list. Instead of looping through all the remaining removes on each insert, we only need to do one cheap lookup and then iterate through a small list (probably 1 or 2 items) for comparison.

The hash key could be something as simple as "%d-%d-%s-%s" % (i2 - i1, j2 - j1, differ.a[:10], differ.a[-10:]), which would potentially have more matches to search through, but would be simple and quick to generate. That's probably better than going through all the data and generating a checksum, which would cost more than a couple of extra small loops.

So, something like:

    for ... in differ.get_opcodes():
        ...
        if tag == 'delete':
            key = "%d-%d-%s-%s" % (j2 - j1, i2 - i1, differ.b[:10], differ.b[-10:])

            if key not in removes:
                removes[key] = []

            removes[key].append(groups[-1])
        else:
            key = "%d-%d-%s-%s" % (i2 - i1, j2 - j1, differ.a[:10], differ.a[-10:])

            if key not in inserts:
                inserts[key] = []

            inserts[key].append(groups[-1])

    for insert_key, groups in inserts.iteritems():
        if insert_key in removes:
            # Do comparisons here.

This is just a rough idea at 1 in the morning, so it's probably buggy at best, but try out a version of this and see if you can make it work. I want to be sure this algorithm is as optimal as possible.

Thanks for your hard work on this so far!
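To make the idea above concrete, here is a rough, self-contained sketch of the bucketing approach written directly against difflib.SequenceMatcher opcodes. The function name find_moves and the key format (range length plus the first and last lines of the range) are illustrative only, not Review Board's actual code; the real implementation would work on the diff viewer's existing groups rather than raw ranges.

    from difflib import SequenceMatcher

    def find_moves(a, b):
        """Pair up deleted and inserted ranges that carry identical lines.

        a and b are lists of lines.  Returns (delete_range, insert_range)
        tuples, where each range is a (start, end) pair into a or b.
        """
        differ = SequenceMatcher(None, a, b)
        opcodes = differ.get_opcodes()

        # Bucket every deleted range under a cheap key: its length plus
        # the first and last deleted lines.
        removes = {}

        for tag, i1, i2, j1, j2 in opcodes:
            if tag == 'delete':
                key = (i2 - i1, a[i1], a[i2 - 1])
                removes.setdefault(key, []).append((i1, i2))

        moves = []

        # For each inserted range, look up delete candidates with the
        # same key, and only then compare the actual line data.
        for tag, i1, i2, j1, j2 in opcodes:
            if tag != 'insert':
                continue

            key = (j2 - j1, b[j1], b[j2 - 1])

            for ri1, ri2 in removes.get(key, []):
                if a[ri1:ri2] == b[j1:j2]:
                    moves.append(((ri1, ri2), (j1, j2)))
                    break

        return moves

    if __name__ == '__main__':
        old = ['a', 'b', 'moved line 1', 'moved line 2', 'c']
        new = ['moved line 1', 'moved line 2', 'a', 'b', 'c']
        print(find_moves(old, new))  # [((2, 4), (0, 2))]

This keeps only the one mandatory loop over the inserts; everything else is a dictionary lookup plus a comparison against (usually) one or two candidates.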
As an optimization, I think we can check to see if the length of each range is the same before doing any data comparison.
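For illustration only, that pre-check could look something like the following; ranges_match is an invented helper name, with (i1, i2) being a deleted range in the old file a and (j1, j2) an inserted range in the new file b:

    def ranges_match(a, b, i1, i2, j1, j2):
        """Return True if the deleted lines a[i1:i2] are identical to the
        inserted lines b[j1:j2]."""
        # Cheap length comparison first; only touch the line data if the
        # sizes agree.
        if (i2 - i1) != (j2 - j1):
            return False

        return a[i1:i2] == b[j1:j2]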