git-p4: Fix file renames, deletes and files with whiltespace in name

Review Request #12258 — Created April 25, 2022 and submitted

Information

RBTools
master

Reviewers

Previously, the filename was gathered from the diff --git ... line,
but this is problematic in the case where the filename has spaces. The
parser was simply splitting on space.

The solution is to get the filename from the --- ... line, resetting
when we see the diff ... line. This is simpler as we can assume
anything following the --- is the filename. We also ensure that we
correctly identify file renames by reading the old
filename and new filename from these lines.

Further we include the headers for renames as generated by the
perforce rbtools client (Moved from: and Moved to:). There's one special
case here where previously a 100% rename (with no diff) would
silently fail to send anything to the server. In this case, we now send
something the server recognises as a file rename operation, special
casing like we do in the perforce.py client.

In a git-p4 repo with files containing spaces:
1. edit one file
2. rename one file
3. create a new file with space in it
4. delete a file with space in it

I compared the generated diffs from a pure perforce repo and a git-p4 repo of the same locatoin and ensured that we generate the same for git-p4 as we do for perforce on rbt post.

Here are the outputs from adding, renaming (without changes), renaming (with changes) and deleting files:

Image
Image
Image

The one remaining area that's differnt is that git-p4 diff generation does not handle binary files being added/modified/deleted/etc. but this is enough for one patch.

unit tests

I used a docker conatiner to run them. Here's the dockerfile:

FROM ubi8

ENV YUMARGSBASE="--disablerepo=* --enablerepo=ubi-8-*"
ENV YUMARGS="${YUMARGSBASE} --enablerepo=epel*"

RUN  dnf ${YUMARGSBASE} -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm &&\
  yum ${YUMARGS} update -y && \
  yum ${YUMARGS} install -y git \
                            gcc-c++ \
                            make \
                            python39 \
                            diffutils \
                            apr-devel \
                            apr-util-devel \
                            unzip \
                            zlib-devel \
                            lz4-devel \
                            python39-devel && \
  yum ${YUMARGS} clean all

ARG SUBVERSION=1.14.2
ARG SQLITE=3081101

RUN cd $HOME &&  \
    curl -LO https://dlcdn.apache.org/subversion/subversion-${SUBVERSION}.tar.gz && \
    curl -LO https://www.sqlite.org/2015/sqlite-amalgamation-${SQLITE}.zip && \
    tar zxvf subversion-${SUBVERSION}.tar.gz && \
    rm subversion-${SUBVERSION}.tar.gz && \
    unzip sqlite-amalgamation-${SQLITE}.zip && \
    rm sqlite-amalgamation-${SQLITE}.zip && \
    mv sqlite-amalgamation-${SQLITE} subversion-${SUBVERSION}/sqlite-amalgamation && \
    cd subversion-${SUBVERSION} && \
    ./configure --with-utf8proc=internal && \
    make -j 8 && \
    make install && \
    cd $HOME && \
    rm -rf subversion-${SUBVERSION}

The result was some failures - all for svn failing, though i don't know why this would be:

============================================================================ short test summary info =============================================================================
FAILED rbtools/clients/tests/test_cvs.py::CVSClientTests::test_get_repository_info_with_found - AssertionError: None is not an instance of <class 'rbtools.clients.RepositoryIn...
FAILED rbtools/clients/tests/test_scanning.py::ScanningTests::test_scanning_nested_repos_1 - Exception: Failed to execute command: ['svn', 'co', 'file:///root/rbtools/rbtools/...
FAILED rbtools/clients/tests/test_scanning.py::ScanningTests::test_scanning_nested_repos_2 - Exception: Failed to execute command: ['svn', 'co', 'file:///root/rbtools/rbtools/...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_diff_exclude - Exception: Failed to execute command: ['svn', 'co', 'file:///root/rbtools/rbtools/clients/tests/te...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_diff_exclude_in_subdir - Exception: Failed to execute command: ['svn', 'co', 'file:///root/rbtools/rbtools/client...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_diff_exclude_root_pattern_in_subdir - Exception: Failed to execute command: ['svn', 'co', 'file:///root/rbtools/r...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_diff_non_unicode_characters - Exception: Failed to execute command: ['svn', 'co', 'file:///root/rbtools/rbtools/c...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_diff_non_unicode_filename_repository_url - Exception: Failed to execute command: ['svn', 'co', 'file:///root/rbto...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_get_commit_message_committed_revision - Exception: Failed to execute command: ['svn', 'co', 'file:///root/rbtools...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_get_commit_message_committed_revisions - Exception: Failed to execute command: ['svn', 'co', 'file:///root/rbtool...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_get_commit_message_working_copy - Exception: Failed to execute command: ['svn', 'co', 'file:///root/rbtools/rbtoo...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_history_scheduled_with_commit_nominal - Exception: Failed to execute command: ['svn', 'co', 'file:///root/rbtools...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_history_scheduled_with_commit_special_case_exclude - Exception: Failed to execute command: ['svn', 'co', 'file://...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_history_scheduled_with_commit_special_case_non_local_mods - Exception: Failed to execute command: ['svn', 'co', '...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_parse_revision_spec_invalid_spec - Exception: Failed to execute command: ['svn', 'co', 'file:///root/rbtools/rbto...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_parse_revision_spec_no_args - Exception: Failed to execute command: ['svn', 'co', 'file:///root/rbtools/rbtools/c...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_parse_revision_spec_non_unicode_log - Exception: Failed to execute command: ['svn', 'co', 'file:///root/rbtools/r...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_parse_revision_spec_one_arg_two_revisions - Exception: Failed to execute command: ['svn', 'co', 'file:///root/rbt...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_parse_revision_spec_one_revision - Exception: Failed to execute command: ['svn', 'co', 'file:///root/rbtools/rbto...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_parse_revision_spec_one_revision_changelist - Exception: Failed to execute command: ['svn', 'co', 'file:///root/r...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_parse_revision_spec_one_revision_nonexistant_changelist - Exception: Failed to execute command: ['svn', 'co', 'fi...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_parse_revision_spec_one_revision_url - Exception: Failed to execute command: ['svn', 'co', 'file:///root/rbtools/...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_parse_revision_spec_two_arguments - Exception: Failed to execute command: ['svn', 'co', 'file:///root/rbtools/rbt...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_parse_revision_spec_two_revisions_url - Exception: Failed to execute command: ['svn', 'co', 'file:///root/rbtools...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_rename_diff_mangling_bug_4546 - Exception: Failed to execute command: ['svn', 'co', 'file:///root/rbtools/rbtools...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_same_diff_multiple_methods - Exception: Failed to execute command: ['svn', 'co', 'file:///root/rbtools/rbtools/cl...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_show_copies_as_adds_disabled - Exception: Failed to execute command: ['svn', 'co', 'file:///root/rbtools/rbtools/...
ERROR rbtools/clients/tests/test_svn.py::SVNClientTests::test_show_copies_as_adds_enabled - Exception: Failed to execute command: ['svn', 'co', 'file:///root/rbtools/rbtools/c...
================================================== 3 failed, 221 passed, 42 skipped, 2 warnings, 25 errors in 78.68s (0:01:18) ===================================================

The skiipped tess are all bzr and clearcase (and some mercurial for some reason):

[root@1bda7694ea8b rbtools]# python3 tests/runtests.py 
============================================================================== test session starts ===============================================================================
platform linux -- Python 3.9.6, pytest-7.1.2, pluggy-1.0.0
rootdir: /root/rbtools, configfile: setup.cfg, testpaths: rbtools
plugins: kgb-7.0, env-0.6.2
collected 291 items                                                                                                                                                              

rbtools/api/tests/test_capabilities.py ...                                                                                                                                 [  1%]
rbtools/api/tests/test_factory.py ......                                                                                                                                   [  3%]
rbtools/api/tests/test_http_request.py ........                                                                                                                            [  5%]
rbtools/api/tests/test_resource.py ....................                                                                                                                    [ 12%]
rbtools/clients/tests/test_bzr.py ssssssssssssssssssss                                                                                                                     [ 19%]
rbtools/clients/tests/test_clearcase.py sssssssssssssss                                                                                                                    [ 24%]
rbtools/clients/tests/test_cvs.py F.                                                                                                                                       [ 25%]
rbtools/clients/tests/test_git.py ...............................................                                                                                          [ 41%]
rbtools/clients/tests/test_mercurial.py .....................................sssssss                                                                                       [ 56%]
rbtools/clients/tests/test_p4.py .....................                                                                                                                     [ 63%]
rbtools/clients/tests/test_scanning.py FF                                                                                                                                  [ 64%]
rbtools/clients/tests/test_svn.py ....EEEEEEEEEEEEEEEEEEEEEEEEE                                                                                                            [ 74%]
rbtools/commands/tests/test_alias.py .......                                                                                                                               [ 76%]
rbtools/commands/tests/test_main.py .............                                                                                                                          [ 81%]
rbtools/commands/tests/test_post.py ..............                                                                                                                         [ 86%]
rbtools/commands/tests/test_setup_repo.py .......                                                                                                                          [ 88%]
rbtools/utils/tests/test_aliases.py .............                                                                                                                          [ 93%]
rbtools/utils/tests/test_checks.py ...                                                                                                                                     [ 94%]
rbtools/utils/tests/test_console.py .......                                                                                                                                [ 96%]
rbtools/utils/tests/test_filesystem.py .....                                                                                                                               [ 98%]
rbtools/utils/tests/test_process.py .                                                                                                                                      [ 98%]
rbtools/utils/tests/test_repository.py ....                                                                                                                                [100%]
Summary ID
git-p4: Fix file renames, deletes and files with spaces in name
Previously, the filename was gathered from the `diff --git ... line`, but this is problematic in the case where the filename has spaces. The parser was simply splitting on space. The solution is to get the filename from the `--- ...` line, resetting when we see the `diff ...` line. This is simpler as we can assume anything following the `---` is the filename. We also ensure that we correctly identify file renames by reading the old filename and new filename from these lines. Further we include the headers for renames as generated by the perforce rbtools client (Moved from: and Moved to:). There's one special case here where previously a 100% rename (with no diff) would silently fail to send anything to the server. In this case, we now send something the server recognises as a file rename operation, special casing like we do in the perforce.py client..
3db412b2e4f9b8fa1c88c8f3441544a0ae10cc62
Description From Last Updated

Can you verify if unit tests run? It'd be great to add new tests for these cases, but I also …

chipx86chipx86

Typos in the description: anyting -> anything correcly -> correctly

chipx86chipx86

E202 whitespace before ']'

reviewbotreviewbot

E203 whitespace before ':'

reviewbotreviewbot

E226 missing whitespace around arithmetic operator

reviewbotreviewbot

E203 whitespace before ':'

reviewbotreviewbot

E202 whitespace before ']'

reviewbotreviewbot

E501 line too long (80 > 79 characters)

reviewbotreviewbot

E501 line too long (80 > 79 characters)

reviewbotreviewbot

We aim to keep comments as proper sentences, with trailing periods. Can you add a period to this sentence? (Some …

chipx86chipx86

Just to help document this code and pair with the new comment made above, can you add comments here on …

chipx86chipx86

We can do split(b'\t', 1) to avoid any further splitting if, for some reason, there were multiple tabs.

chipx86chipx86

Unless I'm misreading, changing old_filename and new_filename here is going to impact the filename building for the resulting ---/+++ lines, …

chipx86chipx86

Can you change these comments to be complete sentences, with sentence casing/period?

chipx86chipx86
Checks run (1 failed, 1 succeeded)
flake8 failed.
JSHint passed.

flake8

puremourning
chipx86
  1. 
      
  2. Show all issues

    Can you verify if unit tests run?

    It'd be great to add new tests for these cases, but I also recognize we don't have P4-specific Git diff generation tests right now.

    1. I have run all of the unit tests which all pass (modulo some which were skipped, presumably because i didn't have some dependency installed). Certainly all the git and perforce tests pass.

      I did look at adding unit tests for these, but it's kinda tricky because all the perforce interraction is mocked out and i didn't see an obvious way to create a "git-p4 like" repo within the test suite. The best i thought of was to use the existing git repo and temporarily "git config git-p4.port localhost:1666", and ... somehow magically create refs/remotes/p4/master and edit a commit so that it has the additional metadata that git-p4 adds. At this point i figured that it was unlikely to be robust without an actual p4d. It might be possible to do that, given helix licensing now, but it would certainly require p4d to be installed externally to the test suite and stuf like that.

    2. Yeah it's a bit of work. It's fine. Thanks for verifying the test runs :)

  3. Show all issues

    Typos in the description:

    anyting -> anything

    correcly -> correctly

  4. rbtools/clients/git.py (Diff revision 2)
     
     
    Show all issues

    We aim to keep comments as proper sentences, with trailing periods. Can you add a period to this sentence?

    (Some older code doesn't do this, but we're trying to maintain that going forward.)

  5. rbtools/clients/git.py (Diff revision 2)
     
     
     
     
    Show all issues

    Just to help document this code and pair with the new comment made above, can you add comments here on the format at this point that we're parsing an why we're grabbing it here?

  6. rbtools/clients/git.py (Diff revision 2)
     
     
     
    Show all issues

    We can do split(b'\t', 1) to avoid any further splitting if, for some reason, there were multiple tabs.

  7. rbtools/clients/git.py (Diff revision 2)
     
     
     
     
     
     
     
    Show all issues

    Unless I'm misreading, changing old_filename and new_filename here is going to impact the filename building for the resulting ---/+++ lines, I believe. We want to retain the /dev/null where appropriate.

    1. I should apologise, i realised i have actually conflated 2 fixes in this patch, both of which appear the same to users:

      1. Files with spaces in the name didn't work (they produced partial file names in the diff output) - leading to "Cannot display diff" in reviewboard
      2. File renames, adds and deletes didn't work - they produced file names of /dev/null which is incompatible with the way p4 diff produces its output - leading to "Cannot dispolay diff" in reviewboard.

      So this is intentional, in order for the diff output to match the equivalent behaviour of p4 diff, which is:

      • For an add, old file = new file
      • for a delete (p4 delete), new file = old file
      • for a rename (p4 integ or p4 move), new file = move-add file, old file = move-delete file (as expected).

      I'll get some sample outputs from p4 diff and post them on the test notes section.

    2. In testing this, I foudn that my fix was incomplete, and there is special casing required for a rename that doesn't modify any file lines. I've added a fix for this and included testing.
      Sorry that makes this patch rather large, but hopefully it's ok.

    3. Okay, yeah, /dev/null is standard for most diff implementations but for Perforce it's not used. The revision is what matters here.

      Eventually we're going to start trying to move away from things like vendor-specific diffs and instead move to DiffX, which is our extensible, portable diff format. It'll solve a lot of problems (particularly in the case of Perforce, which has historically had a lot of broken, contradictory logic).

  8. rbtools/clients/git.py (Diff revision 2)
     
     
     
     
     
    Show all issues

    Can you change these comments to be complete sentences, with sentence casing/period?

  9. 
      
puremourning
puremourning
puremourning
puremourning
  1. Is anything further required from me on this?

  2. 
      
chipx86
  1. Ship It!
    1. Thanks once again for your patience on this. I know this is long overdue. RBTools has been getting a substantial rewrite of all the diff code, and an increase in unit testing, for RBTools 4. I had to meet some deadlines there, and then add new testing to verify these patches. That's all looking good. We'll probably put out a RBTools 3.x release soon, but will be releasing 4.0 in the coming month.

    2. Just as a follow-up, and something we'll fix for 4.0 specifically as part of the transition to the new diffing code, is that adds and deletes aren't represented with the ====A==== and ====D==== markers. Deletes still contain the Git deleted file mode 100644 line.

    3. Sorry, scratch that, those are specifically for binary files. I'll take care of the extra deleted file mode.

      Can't wait to move everything onto DiffX.

  2. 
      
puremourning
Review request changed
Status:
Completed
Change Summary:
Pushed to release-3.x (66e9d1d)