This change has been marked as completed.

Describe the completed change (optional):

Summary

CVS scmtool

Review Request #50 — Created June 4, 2007 and submitted

Information

Owner

lobbin*

Repository

Review Board SVN (deprecated)

Branch

trunk

Bugs

Depends On

Reviewers

Groups

reviewboard

People

Description*

Update:
Added 100% unittests
Removed extrapath dependency

====
Update:
Back from vacation, fixed the remaining code format issues.
Interface now lists new files with correct name
Open problem: sort out and remove dependency of extrapath and add tests.

====
Update:
Uses the new diffparser, code gets a lot better now.
Open problem: interface lists new files as "/dev/null" (file.origName), not the real name found in file.newName, is this changes recently?
Also, removes dependency on external lib for CVS, needs to think a smart way to use other methods than pserver.

====
A first draft of the CVS scmtool. Whats needs to be done is to figure out how to store and display the complete filename when viewing the diff and change to a better cvs lib.

Testing Done

100% unit-tested (included)

Description	From	Last Updated
There are no open issues

```
Got some comments for you.
```

trunk/reviewboard/scmtools/cvs.py (Diff revision 1)

I'm not sure about hardcoding the pserver method here.  Perhaps it'll do for now.

trunk/reviewboard/scmtools/cvs.py (Diff revision 1)

This seems really fragile, given that there's a datestring in there.  Can you convert this to a regular expression, or at least split on whitespace?

lobbin 18 years, 1 month ago

Ok, new diff uploaded which fixes all current issues. Added compiled regex for getting full rcs file and also added a (tested) regex for getting the revision from revision_str.

trunk/reviewboard/scmtools/cvs.py (Diff revision 1)

Instead of the (truly ugly) isfloat method below, can you just do this?

if not re.match('\d\\.\d', revision):
    raise SCMException('Unable to parse diff revision header "%s"' % \
                       revision_str)

trunk/reviewboard/scmtools/cvs.py (Diff revision 1)

This is a lot more complicated than it needs to be.  Here's what I recommend:

for line in file.splitlines():
    m = re.match('^RCS file: %s/(.*),v$' % self.repopath, line) or \
        re.match('^RCS file: (.+)$', line)
    if m:
        self.extrapath.append(m.group(1))
        return
    if line[:10] == 'RCS file: ':
        raise SCMException('Unable to parse RCS line "%s"' % line)

I'm not sure those last two lines are even necessary.  Will
cvs diff -u ever do that?

chipx86 18 years, 1 month ago

or even better:

if line.startswith('RCS file: '):

Might be nice to compile those two regexs first, but probably doesn't matter as we should hit those lines quickly in all correct diffs.

chipx86 18 years, 1 month ago

David: We don't want to return there, do we? We're parsing an entire diff file, not just one filediff. We'd need to store all the RCS File paths in extrapath, not just the first one.

Making sure I didn't misunderstand this.

david 18 years, 1 month ago

Nope it shouldn't return, that was dumb of me.

Compiling the regexes once in the tool constructor would be great.

As far as .cvspass is concerned, maybe this could read the file in CVSTool.__init__ and add the necessary line yourself?

cuviper 18 years, 1 month ago

.cvspass doesn't contain the raw password, but rather a hashed/encrypted variant.

Also, you can include the password in CVSROOT, so the .cvspass isn't strictly necessary.

  :pserver:username:password@server:/path

```
Some more comments.
```

trunk/reviewboard/scmtools/cvs.py (Diff revision 1)

Might be time for us to come up with some sort of authmethod field in the database.

david 18 years, 1 month ago

Maybe.  I wonder if we couldn't invent some kind of magical format
string for the repository path (so you could do something like
':pserver:%u@cvs.foo.com').

Of course, any of this can wait until later.  All of the CVS servers
I care about have at least an anonymous read-only version using
pserver.

trunk/reviewboard/scmtools/cvs.py (Diff revision 1)

Excess Whitespace. I'm only going to mark this one, but it'd be nice to go through the diff and remove the others.

trunk/reviewboard/scmtools/cvs.py (Diff revision 1)

I can't remember.. Is there a case where we actually pass in a path that's None?

lobbin 18 years, 1 month ago

I just used the same signature as the core scmtool uses. As it seems from CVS it's probably a valid revision or PRE-CREATION I guess.

trunk/reviewboard/scmtools/cvs.py (Diff revision 1)

Is there any exception we can catch that would give us a FileNotFound?

lobbin 18 years, 1 month ago

Probably a lot, but as I stated I want to change cvs module. When that is done I add the appropriate exceptions.

trunk/reviewboard/scmtools/cvs.py (Diff revision 1)

You can condense this down to:

return file_str, PRE_CREATION

trunk/reviewboard/scmtools/cvs.py (Diff revision 1)

This can be condensed to:

return self.extrapath.pop(0), revision

lobbin 18 years, 1 month ago

Yes, I was going to add extra safety to check that the filenames actually matched.

trunk/reviewboard/diffviewer/forms.py (Diff revision 1)

I've been thinking about this parse_extra_info_headers thing a bit more. Passing the entire diff and parsing it a second time is kind of excessive.

Perhaps we should go a slightly different route. Parse the file diff, store everything we don't recognize, and pass that to the SCMTool? Of course, this possibly opens up additional problems where we have to suddenly understand every kind of diff. This is food for thought, not something that would have to be done for this change :)

This is looking pretty good!  Just a couple comments on the code:

trunk/reviewboard/scmtools/cvs.py (Diff revision 3)

OK, very minor style comment.  Can you write these lines like this:

import re

import pycvs

from reviewboard.scmtools.core import SCMException, FileNotFoundException, \
                                      SCMTool, HEAD, PRE_CREATION


This is how PEP-8 recommends formatting imports.

trunk/reviewboard/scmtools/cvs.py (Diff revision 3)

Don't you need \d+ here?  This regex won't match "1.15" or "10.2"

What more needs to be done with this before it can go into SVN?

lobbin 18 years, 1 month ago

How does it look? All I want now is to change the cvs module, otherwise it seems good.

david 18 years ago

Sounds good to me.  Some unit tests would be great, but
not necessary to start with.

trunk/reviewboard/scmtools/cvs.py (Diff revision 3)

As David mentioned, this won't match multiple digits.  It also needs to match branch revisions (1.2.3.4), and you should make sure that the .* doesn't get carried away.

You can use r'' to avoid the extra escapes.

My suggestion: r'^.*?(\d+(\.\d+)+)$'

david 18 years, 1 month ago
```
This looks great.
```
LO

lobbin 18 years, 1 month ago
```
Didn't think of branches, thanks!
```

trunk/reviewboard/scmtools/cvs.py (Diff revision 3)

I don't like this general FIFO mechanism you're using, because it assumes that parse_diff_revision will always be called in the exact same order of the diff file.  This may be true now, but that's a dangerous assumption.

Can you at least throw in an assertion that the popped value .endswith(file_str)?

lobbin 18 years, 1 month ago

I don't like it either, but at the moment the cvstool isn't able to get this extra information from the diff parser. I've added an extra check to be sure.

trunk/reviewboard/scmtools/cvs.py (Diff revision 4)

This should be:

  raise SCMException("File '%s' doesn't match the parsed file '%s'" %\
  (file_str, fullfile))

trunk/reviewboard/scmtools/cvs.py (Diff revision 4)

The CVS server may be accessed using something other than pserver (ext, for example). There should be a way to configure that, I suppose.

trunk/reviewboard/scmtools/cvs.py (Diff revision 4)

This regex doesn't work with diffs created on a Windows machine. I managed to fix that with the following:

  self.regex_rev   = re.compile(r'^.*?(\d+(\.\d+)+)\r?$')

```
Got some more comments for you on this.
```
trunk/reviewboard/scmtools/cvs.py (Diff revision 5)
```
Add a blank line after this as per PEP-8
```

trunk/reviewboard/scmtools/cvs.py (Diff revision 5)

1 space between each token of variable = X.  Aligning them up like this is useful in some cases but for disparate data like this, just makes it hard to maintain.  Notice that you've already got one that ended up misaligned during your revisions ;)

trunk/reviewboard/scmtools/cvs.py (Diff revision 5)

We don't want to compile this regex each time an instance is initialized, since SCMTools are short-lived.  Make this class data like this:

class CVSTool(SCMTool):
    regex_rev = re.compile(...)

    def __init__(self, repository):
        ...

That way the regexs are compiled once for the entire program.

trunk/reviewboard/scmtools/cvs.py (Diff revision 5)
```
Two blank lines between classes.
```

trunk/reviewboard/scmtools/cvs.py (Diff revision 5)

Same deal as above with class data.

cuviper 18 years ago

regex_full depends on a bit of instance data (repo), so it can't be moved to class data as-is.

david 18 years ago

Oh, right.  OK.  Does repo need to be escaped at all for this?

cuviper 18 years ago

Good point -- yes, repo should go through re.escape().

trunk/reviewboard/scmtools/cvs.py (Diff revision 5)

One way you could fix the problem with /dev/null is to have this rewrite origFile, origInfo as something like (file.newFile, 'PRE-CREATION') if origFile == '/dev/null'.  Then above in parse_diff_revision, check that the info string is 'PRE-CREATION'

trunk/reviewboard/scmtools/cvs.py (Diff revision 5)
```
Add another blank line here.
```
trunk/reviewboard/scmtools/cvs.py (Diff revision 5)
```
Please wrap these to 80 columns.
```

This is looking very good -- I'm already using it for our internal site.  I have a few more comments, but I hope to see it committed soon.

trunk/reviewboard/scmtools/cvs.py (Diff revision 5)

This member variable is never used.

LO

lobbin 17 years, 12 months ago
```
It's used internally by DiffParser
```

cuviper 17 years, 12 months ago

Ok, I see -- but I think you should just call DiffParser.__init__(self, data) to let it setup self.data and self.lines.

trunk/reviewboard/scmtools/cvs.py (Diff revision 5)

It's a bit painful to scan every line for RCS info.  I think extrapath is not needed anymore -- see below.

trunk/reviewboard/scmtools/cvs.py (Diff revision 5)

Rather than relying on extrapath, which assumes things happen in an exact order, I think you can now just parse for the RCS line directly.  Just scan for a few lines before linenum for the matching RCS.  My patches always seem to have RCS at (linenum-3), but you can add a little slop to that.

lobbin 17 years, 12 months ago

Yes, it should be at a fixed linenum, 2 or 3 depending on whether it's a new or an existing file, I'll see if I've got time tomorrow

trunk/reviewboard/scmtools/cvs.py (Diff revision 5)

Rather than format-printing '-d%s', just pass it as two separate arguments: '-d', self.repository.  Same for '-r%s'.

Ship it!

This looks great.  Is it ready to be committed?

lobbin 17 years, 11 months ago

Well, it works for me(tm) :-). I'm not yet running this in production so there might be cornercases, etc. Also, support for other methods than pserver, but that's up to someone else as I'm only using pserver at the moment.

cuviper 17 years, 11 months ago

I agree, this is good.  Let's get it in -- unseen cornercases can be worked out as more people try it.

david 17 years, 11 months ago

I just committed the patch to SVN, and will close out this review.  Thanks for your hard work!