Adding in a smarter parent-base finding algorithm
Review Request #6797 — Created Jan. 17, 2015 and discarded
Main idea
Given a tree like this one:
A - * - * - * - * origin/master \ * - B - * - * origin/feature \- C - D topic
when we
rbt post
, it would like to have three things,
'base': A revision to use as the base of the resulting diff.
'tip': A revision to use as the tip of the resulting diff.
'parent_base': (optional) The revision to use as the base of a
parent diff.
Now,parent_base
has to be in the remote repository, in other words, a ancestor of a commit of any branchorigin/*
. We would also likeparent_base
to be as young an ancestor oftip
as possible, given the first constraint.I've worked out a
git-rev-list
query to find this.git rev-list $TIP --not --remotes=origin | tail -1
should give the oldest ancestor that is not reachable from a public branch, so it's parent will be the best parent base.
Now with tests!
test_diff_finding_parent
is a new test, and I had to modifytest_parse_revision_spec_no_args_parent
to pass, since the parent base is no longer the same after my modifications.
Description | From | Last Updated |
---|---|---|
Col: 80 E501 line too long (98 > 79 characters) |
reviewbot | |
Col: 17 E265 block comment should start with '# ' |
reviewbot | |
Col: 17 E113 unexpected indentation |
reviewbot | |
I need to fix this to listen to the upstream_branch if it's explicitly set. |
TB tbelaire | |
Col: 50 E261 at least two spaces before inline comment |
reviewbot | |
Col: 80 E501 line too long (91 > 79 characters) |
reviewbot | |
Going to remove this hardcoded thing later. |
TB tbelaire | |
local variable 'ri' is assigned to but never used |
reviewbot | |
Col: 80 E501 line too long (84 > 79 characters) |
reviewbot | |
Col: 17 E265 block comment should start with '# ' |
reviewbot | |
Col: 17 E113 unexpected indentation |
reviewbot | |
Col: 50 E261 at least two spaces before inline comment |
reviewbot | |
Col: 80 E501 line too long (91 > 79 characters) |
reviewbot | |
local variable 'ri' is assigned to but never used |
reviewbot | |
Col: 80 E501 line too long (84 > 79 characters) |
reviewbot | |
local variable 'ri' is assigned to but never used |
reviewbot | |
local variable '_ri' is assigned to but never used |
reviewbot | |
Can we change the terminology here a little bit? I'd like to use: online -> remote offline -> local feature … |
david | |
We should put blank lines around this block, since that's the way Christian likes it. |
david | |
local variable '_' is assigned to but never used |
reviewbot | |
Col: 36 E127 continuation line over-indented for visual indent |
reviewbot | |
_ is often used for gettext. Best to call this unused, or just fetch get_origin()[0]. |
chipx86 | |
Can you add a docstring covering exactly what this does and how it works? |
chipx86 | |
This can pretty easily spit out a large number of commits, particularly for large repositories. I'd give this a try … |
chipx86 | |
Should just be: if not local_commits: |
chipx86 | |
"a commit" "that's also" "available" Should end with a period. |
chipx86 | |
Can use self.assertIsInstance |
VT VTL-Developer | |
Can use self.assertIn |
VT VTL-Developer |
- Groups:
- Description:
-
It's still not very smart, and does no error checking.
But it looks like it's working.
+ + Main idea
+ + Given a tree like this one:
+ + A - * - * - * - * origin/master
+ \ * - B - * - * origin/feature
+ \- C - D topic
+ + + when we
rbt post
, it would like to have three things,+ 'base': A revision to use as the base of the resulting diff. + 'tip': A revision to use as the tip of the resulting diff. + 'parent_base': (optional) The revision to use as the base of a + parent diff. + Now, parent_base
has to be in the remote repository, in other words, a ancestor of a commit of any branchorigin/*
. We would also likeparent_base
to be as young an ancestor oftip
as possible, given the first constraint.+ + I've worked out a
git-rev-list
query to find this.+ + git rev-list $TIP --not --remotes=origin | tail -1
+ + + should give the oldest ancestor that is not reachable from a public branch, so it's parent will be the best parent base.
- Change Summary:
-
Now with tests!
- Testing Done:
-
~ I need to learn how to test properly.
~ Now with tests!
- It kinda works on github.com/tbelaire/sandbox2.git, but I will make some tests once I learn how. - Commit:
-
f5a63b01645728ed69b303a75a3f5016f6f92d52cf99a05cadbeb55d56d87f25155102a8e46c5e26
- Diff:
-
Revision 2 (+41 -4)
-
Tool: Pyflakes Processed Files: rbtools/clients/tests.py rbtools/clients/git.py Tool: PEP8 Style Checker Processed Files: rbtools/clients/tests.py rbtools/clients/git.py
-
-
-
-
-
-
-
What's happened to this patch? We haven't seen a code drop in a while. Theo, can you please update this today?
- Commit:
-
cf99a05cadbeb55d56d87f25155102a8e46c5e269983dd039d857797a268d6a03ed5efb1fe8569d5
- Diff:
-
Revision 3 (+54 -6)
-
Tool: Pyflakes Processed Files: rbtools/clients/tests.py rbtools/clients/git.py Tool: PEP8 Style Checker Processed Files: rbtools/clients/tests.py rbtools/clients/git.py
-
-
-
-
-
-
- Change Summary:
-
Fixed ReviewBot Issues.
- Commit:
-
9983dd039d857797a268d6a03ed5efb1fe8569d574e01b5d67ef539bf607ab591cec52526242924c
- Diff:
-
Revision 4 (+55 -6)
- Commit:
-
74e01b5d67ef539bf607ab591cec52526242924c42667578d32de1e6ff9b1fb272c6f7f5d5f8f86d
- Diff:
-
Revision 5 (+55 -6)
- Summary:
-
WIP: Adding in a smarter parent-base finding algorithmAdding in a smarter parent-base finding algorithm
- Description:
-
- It's still not very smart, and does no error checking.
- - But it looks like it's working.
- Main idea
Given a tree like this one:
A - * - * - * - * origin/master
\ * - B - * - * origin/feature
\- C - D topic
when we
rbt post
, it would like to have three things,'base': A revision to use as the base of the resulting diff. 'tip': A revision to use as the tip of the resulting diff. 'parent_base': (optional) The revision to use as the base of a parent diff. Now, parent_base
has to be in the remote repository, in other words, a ancestor of a commit of any branchorigin/*
. We would also likeparent_base
to be as young an ancestor oftip
as possible, given the first constraint.I've worked out a
git-rev-list
query to find this.git rev-list $TIP --not --remotes=origin | tail -1
should give the oldest ancestor that is not reachable from a public branch, so it's parent will be the best parent base.
- Testing Done:
-
Now with tests!
+ + + + test_diff_finding_parent
is a new test, and I had to modifytest_parse_revision_spec_no_args_parent
to pass, since the parent base is no longer the same after my modifications. - Commit:
-
42667578d32de1e6ff9b1fb272c6f7f5d5f8f86df3e2320da11bcdb032cb9a9671eb17a613f66983
- Diff:
-
Revision 6 (+57 -6)
- Commit:
-
f3e2320da11bcdb032cb9a9671eb17a613f669838b95ef2362922fd4aeb37854149b9c570c25ca30
- Diff:
-
Revision 7 (+57 -6)
-
Tool: Pyflakes Processed Files: rbtools/clients/tests.py rbtools/clients/git.py Tool: PEP8 Style Checker Processed Files: rbtools/clients/tests.py rbtools/clients/git.py
- Commit:
-
8b95ef2362922fd4aeb37854149b9c570c25ca30a59c4e8e53a903cb4ce576b19dca9d73c0eb2c6b
- Diff:
-
Revision 8 (+55 -6)
-
Tool: Pyflakes Processed Files: rbtools/clients/tests.py rbtools/clients/git.py Tool: PEP8 Style Checker Processed Files: rbtools/clients/tests.py rbtools/clients/git.py
-
Good work so far!
Since this is a pretty fundamental, hard-to-get-right change, I want to see a lot more than one test. I can think of several more that need to be added:
1. Simple case: Posting a feature branch off a tracking branch aligned with the remote.
This is the basic case of:
o [feature-branch] | o [origin/remote-branch] [remote-branch] | | o [origin/master] [master] | |
It should detect
origin/remote-branch
.2. Posting a feature branch off a tracking branch with changes since the remote.
o [feature-branch] | o [remote-branch] | o [origin/remote-branch] | | o [origin/master] [master] | |
It should still detect
origin/remote-branch
.3. Posting a feature branch off a branch not properly tracking the remote.
Same as #1 above, but with
remote-branch
not being set to trackorigin/remote-branch
(meaning it's a regular branch and not set to track in the git config).It should still detect
origin/remote-branch
.4. Posting a feature branch with a merged tracking branch off another tracking branch.
o [feature-branch] |\ | \ | o [origin/merged-branch] | | | ... | o [origin/remote-branch] [remote-branch] |
This is harder, because it could go either way. However, I think the user would expect that it would show the merged changes, meaning that
origin/remote-branch
would be the right branch to detect.5. Posting a feature branch off a tracking branch that merged another tracking branch.
o [feature-branch] | o [origin/remote-branch] [remote-branch] |\ | \ | o [origin/merged-branch] | |
Should be fairly straight-forward, but good to have for testing. Should be the same as #1 above, really.
6. Posting a feature branch off a remote branch without any tracking branches.
o [feature-branch] | o [origin/remote-branch] |
The tracking branch shouldn't matter so much. It should still return
origin/remote-branch
.7. Posting a feature branch off a remote branch aligned to the same commit as another remote branch.
o [feature-branch] | o [origin/remote-branch1] [origin/remote-branch2] |
Obviously either one of these is fine, but it should reliably choose one of these in a predictable way.
8. Posting a feature branch not up-to-date with the remote branch.
o [feature-branch] | | o [origin/remote-branch] [remote-branch] | / | / o | o [origin/master] |
I wish there was a good way to have this detect
origin/remote-branch
, but that's unlikely to be possible. It should returnorigin/master
if auto-detecting.However, if the user has hard-coded
TRACKING_BRANCH = 'origin/remote-branch'
, then it should be smart and use the common commit they share, to avoid a lot of the problems people hit today where the diff gets all screwed up with things reversed fromorigin/remote-branch
.9. Posting a feature branch with branches from different remotes in the path.
o [feature-branch] | o [other-remote/remote-branch] | o [origin/remote-branch] |
The other remotes should be ignored. It should only fetch
origin/remote-branch
.Others?
There may be other situations that come up, but this is a good set of cases that all need to be handled properly in order to not break things.
We also need to ensure that performance is fine with large (multi-gigabyte) repositories.
-
-
-
This can pretty easily spit out a large number of commits, particularly for large repositories. I'd give this a try against some known large repositories, like the KDE or Linux kernel repositories, and see how it handles.
When run against the Review Board repository with a commit on master, I get over 6,300 commits back.
-
-