Starting point for the Python API

Review Request #1847 — Created Oct. 19, 2010 and discarded




Starting point for the Python API, created by myself and Brendan Curran-Johnson.  The project is divided into three parts, API, Clients, and Commands.
The API deals with resources and RESTful communications to a reviewboard server.  Also, it deals with OS calls and generic configuration data.

The Clients deal specifically with the different SCMs available to be used with reviewboard.  In a nutshell - generating diffs and grabbing repository information.

The Commands are a set of scripts which use the API.  Most of the functionality found in post-review can be found in the current commands.
Able to run the scripts:
rb-close     (close a review request as submitted or discarded)
rb-config    (configure the server_info for rb_scripts)
rb-create    (create a new review request)
rb-get-diff  (generate a diff)
rb-info      (get the information for any base resource list or base resource list's child)
rb-open      (open a review request)
rb-publish   (publish a review request)
rb-upload    (upload a diff or screenshot to a review request)
  1. Lindsey / Brendan:
    Thanks for posting this up!  I've broken my review down into two sections - Style, and Architecture:
    So, style-wise, here are the big ones:
    1)  Extraneous whitespace has to go.
    2)  Lines must be less than 80 characters
    3)  Formatting for docstrings needs some fixing:  see for examples
    4)  Code blocks outside of classes are spaced with two blank lines.
    A lot of this style stuff can be caught using the pep8 script, which you can install with easy_install.  Very useful.
    So I didn't highlight everything style-wise, otherwise there'd be a ton of comments for you to go through.  Suffice it to say, pep8 will help you nail a huge chunk of them.
    It's a good start.  A suggestion (and Christian/David might override me on this, but I thought I'd toss in):
    We might want to organize these parts into different modules, like:
    rbtools.api.client (for the client...maybe the REST interface, too)
    rbtools.api.resources (all of the RESTful resources)
    rbtools.commands (this could maybe have a bunch of modules in it - one for each command.  This is how the Django framework does it.)
    rbtools.utils (maybe have a separate module for the SCM utils...)
    I don't know.  Hugely open to discussion on that.  Christian/David/others might have a better idea about how to organize something like this.
    What I'm pretty sure we'll want to do, though, is do imports somewhat like the following:
    from rbtools.api.resources import ReviewRequest, ReviewRequestDraft,...
    This means that we can instantiate ReviewRequests without the Resources prefix.
    I'm curious:  why does get_repository in RBUtilities have so much flexibility in terms of what SCMClients it looks for?  Just wanted to know.  :)
    Anyhow, that's what I've got from a first glance.  Might follow up on this with more once I get my midterm done.
    1. "I'm curious:  why does get_repository in RBUtilities have so much flexibility in terms of what SCMClients it looks for?  Just wanted to know.  :)"
      postreview checked all those scmclients, and when I was first writing that code, I got it in my head that all should be checked (lest I remove functionality). The resulting function ends up looking a bit weird and unwieldy. I'll have to put a bit more thought into it.
  2. rbtools/ (Diff revision 1)
    Should explicitly inherit from object, like
    class RBUtilities(object)
  3. rbtools/ (Diff revision 1)
    No space needed here, or on line 27.
  4. rbtools/ (Diff revision 1)
    It might be better to have a list of SCMCLIENTS (or a dict, if you want to keep the keys) instead of writing this out statically.
    The original postreview code did this, and it seemed to work pretty well.
  5. rbtools/ (Diff revision 1)
    import RBUtilities should be put after import os.
    We do imports alphabetically.  If there's an unordered list (like imports, or css attributes), we *tend* to put things in alphabetical order.
  6. rbtools/ (Diff revision 1)
  7. rbtools/ (Diff revision 1)
    If you're outside of a class, there are two lines between code blocks.
    So we'd insert 2 lines between the imports and the try/catch.
  8. rbtools/ (Diff revision 1)
    extra line here.
  9. rbtools/ (Diff revision 1)
    add another extra line here.
  10. rbtools/ (Diff revision 1)
    and add another extra line here.
  11. rbtools/ (Diff revision 1)
    No need for this line.
  12. rbtools/ (Diff revision 1)
    alphabetical order, and two lines before try/catch
  13. rbtools/ (Diff revision 1)
    add another line here
  14. rbtools/ (Diff revision 1)
    And add another extra line here.
  15. rbtools/ (Diff revision 1)
  1. Hi guys,
    This is going to be a long review. You knew that was coming, but don't worry :)
    There are a lot of stylistic things that still need a lot of love. I listed some examples of this but I didn't go through every line to comment on each occurrence. Please go through your code and make sure that everything follows the styles as expected. This will make it *much* easier to review logic and implementation rather than style.
    As for the structure of the code, I really want to see different types of classes segmented into namespaces. There should be the following:
    * rbtools.api -- All API-related things should go somewhere in here. Resources, for instance, should go in rbtools.api.resources (or rbtools/api/
    * rbtools.clients -- All support for clients (git, svn, etc.) should go in here. This is what you currently have as the repository classes. I want to see one file per type of repository. So, rbtools.clients.git (or rbtools/clients/ would have GitRepository and GitClient.
    * rbtools.commands -- All scripts would move into here, and you'd use the Python setuptools scripting stuff to wrap these. See RBTool's for how this is done with post-review.
    On that note, filenames should always be lowercase in Python. The mixed-case style of Java, C++, etc. isn't used here.
    I want to see some changes to the way Resources are done. Basically, I want to be able to treat a Resource as a sort of structure I can get/create, populate, and save, and have it work sort of like the Django database API. Something like:
    review_request_list = ... # Pretend we have this at this point
    review_request = review_request_list.create()
    review_request.submit_as = some_user
    review_request.repository = repository_id
    In a way, it's like the Django database API. I should also be able to fetch a resource, change some stuff on it, and save(). If the review request is newly created, do an HTTP POST. If it's one fetched from a server, do a PUT. Should also provide a delete() method. Etc.
    So, basically, there would be two main types of resources: Lists, and objects.
    List resources would have create() and get() methods.
    create() would create a new instance of the type of child resource, which could be populated and saved.
    I'd like a get() that would get the correct resource from the list. The only way this can be reliably done (since an object doesn't necessarily live in that resource, as it may be linked to another place) is to query the root resource (which should only be done once in the lifecycle of the app), find the uri_template for the resource name (say, 'review_requests'), and plug values into the URI template (review_request_id, etc.). Those values should come from values passed to get() and from the list resource's values (and its parents, recursively up the chain).
    So, I could do 'review_request_list.get(42)' and it would plug 42 into review_request_id. It would know this from, say, an 'uri_id_field' key in the resource subclass. It would get any info from the list, its parent, etc., and then once it has all this, parse the URI template and get the URL. Then fetch that and return the resource type.
    Resource lists would also need to be able to return their results in a nice, Pythonic way involving iterators. I want to be able to do:
        for resource in resource_list:
        resources = resource_list[4:10]
    Resource objects would have save() and delete() methods. Fairly self-explanatory I think.
    For all this to work, you need two classes (they should have a base parent class that has most of the logic for this, though), and then a subclass for each thing that we care about.
    ResourceList subclasses would probably need:
    * child_resource_type (reference to a class, such as ReviewRequest).
    * create()
    * to be iterable (there should be lots of examples of this on the web).
    Resource subclasses would need:
    * uri_id_field
    * resource_name
    * save()
    * delete()
    That's probably a lot to process. Feel free to discuss it further over e-mail on reviewboard-dev@.
  2. rbtools/ (Diff revision 2)
    Imports are always in three groups, each separated by a blank line:
    1) System imports (such as sys, mimetools, socket, etc.)
    2) Third-party imports (would be things such as djblets, though that's not needed here).
    3) Local imports (in this case, Repository).
    For local imports, use the full import path. In this case, 'from rbtools import Repository'.
  3. rbtools/ (Diff revision 2)
    Classes should always inherit from object, like so:
    class RBUtilities(object):
  4. rbtools/ (Diff revision 2)
    No blank lines on the insides of the """
    Make sure everything fits under 80 columns.
  5. rbtools/ (Diff revision 2)
    No blank line.
  6. rbtools/ (Diff revision 2)
    Functions should always be more like:
    def __init__(self, log_file='rbproblems.log'):
    Note the lack of spaces within the parens, and the lack of spaces around the '='. This is true only for keyword arguments, not general assignment in code.
  7. rbtools/ (Diff revision 2)
    Docs in Python shouldn't contain the function header (the __init__(...)) or list of parameters.
    The very first line should always be the summary, like:
    """Briefly describes the function.
    More detail that can span paragraphs here.
    Though, for init functions, don't bother with docs.
  8. rbtools/ (Diff revision 2)
    No blank line immediately after the docs.
  9. rbtools/ (Diff revision 2)
    Only one blank line between functions within a class.
  10. rbtools/ (Diff revision 2)
    Don't define a long list of things in the header. This should go outside of the function somewhere.
    Always wrap to < 80 chars.
  11. rbtools/ (Diff revision 2)
    It would be better to have a lookup map somewhere outside the function, like:
    repository_types = {
        'svn': Repository.SVNRepository,
        'cvs': Repository.CVSRepository,
    Then you can do:
    if type in repository_types:
        possible_repos.append(repository_types[tpe](url, self))
  12. rbtools/ (Diff revision 2)
    No blank line here.
  13. rbtools/ (Diff revision 2)
    Parameters must always be aligned.
  14. rbtools/ (Diff revision 2)
    No blank line right after the function declaration.
  15. rbtools/ (Diff revision 2)
    You should never need this. It's deprecated. All string operations can be performed on strings themselves.
  16. rbtools/ (Diff revision 2)
    We can dictate the return type, so there should never be a need to handle anything but json.
  17. rbtools/ (Diff revision 2)
    Why the 1? This should be documented.
  18. rbtools/ (Diff revision 2)
    Rather than doing this, why not just set self.url so we can access resource.url?
  19. rbtools/ (Diff revision 2)
    When comparing against None, use "is":
    if field is None:
  20. rbtools/ (Diff revision 2)
    Actually, do this instead:
    .. note:: This method MUST blah blah
              blah blah
    It'll appear nicer in generated docs.
  21. rbtools/ (Diff revision 2)
    No need for \ when inside parens.
  22. rbtools/ (Diff revision 2)
    Why the couple random spaces? What happens if e.msg doesn't end with a period?
  23. rbtools/ (Diff revision 2)
    You can make a lot of this much easier.
    Require that subclasses define the name of the resource field within, like so:
    class ReviewRequest(Resource):
        resource_name = 'review_request'
    Then have Resource be smart enough to do the get_field.
    Rather than things like draft_url, just provide an easy way to get a named URL. I think having get_link jut take the name and return the href instead of requiring the ['href'] would do that nicely.
  24. rbtools/ (Diff revision 2)
    Align these, and no need for \.
    Also. there's an easier way here. Define this as:
    def __init__(self, method, *args, **kwargs):
    Then, for the parent constructor call, do:
    super(RequestWithMethod, self).__init__(self, *args, **kwargs)
    That way, you never have to keep the params in sync with urllib2.Request.
  25. rbtools/ (Diff revision 2)
    Align these, and no need for \
  26. rbtools/ (Diff revision 2)
    Should use:
    return super(RequestWithMethod, self).get_method()
  27. rbtools/ (Diff revision 2)
    No need for \
    The final ] should appear on its own line.
  28. rbtools/ (Diff revision 2)
    No need for \, and align params.
  29. rbtools/ (Diff revision 2)
    I don't think there's much point in this. We should always just use json.
  30. rbtools/ (Diff revision 2)
    Align params.
  31. rbtools/ (Diff revision 2)
    No need for \, and align params.
  32. rbtools/ (Diff revision 2)
    Login in the new API shouldn't use this. It should use HTTP Basic Auth.
  33. rbtools/ (Diff revision 2)
    Paths should never be hard-coded with the REST API. They should be "discovered" or "browsed" from the root resource.
  34. rbtools/ (Diff revision 2)
    No need for \, and align params.
  35. rbtools/ (Diff revision 2)
    Don't use \ in the strings. Instead, terminate a string and resume it on the next line.
  36. rbtools/ (Diff revision 2)
    These methods (the ones operating on review requests and such) don't really belong in here. The act of creating or updating an object should happen solely on the Resource for it.
  37. rbtools/ (Diff revision 2)
    Multi-line if statements should have the conditional wrapped in parens instead of using \.
    In this case, you can do:
    if not isinstance(self.current, (Resource.ReviewRequest, Resource.DraftReviewRequest):
  1. More stylistic stuff, and some code simplification/reorganization.
    Remember, the filenames should be all lowercase, not mixed case.
    There are still things from previous reviews that I think are unfixed. When I'm modifying my change based on review feedback, I'll usually respond to each item with "Fixed" or something, so that the reviewers and I all know what was done and what wasn't.
  2. rbtools/api/ (Diff revision 5)
    Going to want David's take on this, but I generally would say that you should have a base ResourceError exception, and then simple subclasses for each error type, instead of error codes. So:
    class ResourceError(Exception):
    class UnloadedResourceError(ResourceError):
    Also, any exceptions should go in
  3. rbtools/api/ (Diff revision 5)
    This should be:
    Exception.__init__(self, *args, **kwargs)
  4. rbtools/api/ (Diff revision 5)
    Instructions probably don't belong in here. I'd do:
    return self.resource_string or "Unloaded resource"
  5. rbtools/api/ (Diff revision 5)
    First line of any doc string should be a one-sentence summary, ideally fitting on one line. It should be on the same line as the starting """. Then, a blank line, then the remaining docs.
  6. rbtools/api/ (Diff revision 5)
    Should be consistent with where the space is on multi-line strings. I always put it before the end quote mark.
    Also, any time you're breaking apart a string, all parts should be on the same indentation level. That includes "The resource"
  7. rbtools/api/ (Diff revision 5)
    Blank line.
  8. rbtools/api/ (Diff revision 5)
    I'm not convinced this is safe. You're using field both as the result and the thing you're fetching from. So what happens after the first iteration?
    1. More clearly documented what this is doing.
  9. rbtools/api/ (Diff revision 5)
    if field is None
  10. rbtools/api/ (Diff revision 5)
    "if field is None"
  11. rbtools/api/ (Diff revision 5)
    Same here about the indentation of the string.
  12. rbtools/api/ (Diff revision 5)
    Same here with the error string.
    Also, any others.
  13. rbtools/api/ (Diff revision 5)
    Shouldn't print. This may be used from non-console apps. Callers should be able to get something meaningful so they can handle it their own way.
  14. rbtools/api/ (Diff revision 5)
    Should use:
    super(ResourceBase, self).__init__(server_interface, ...)
    You'll notice that's different from what I said to do about the Exception above. The reason is that new-style objects (those inheriting from 'object' somewhere in the chain) can use super(). Exception, however, isn't a new-style object on older versions of Python that we support.
  15. rbtools/api/ (Diff revision 5)
    Space after the '#'
    Same with the other comments.
  16. rbtools/api/ (Diff revision 5)
    Kind of an odd way to do it. You can just do: if elem != 'stat':
  17. rbtools/api/ (Diff revision 5)
    Same here with the prints.
  18. rbtools/api/ (Diff revision 5)
    The ".. note:" is only for doc strings. Just use "NOTE: " here
    Also, this seems very inconsistent with the other functions. We should be consistent, one way or another. The fewer types of exceptions the caller has to care about, though, the better.
  19. rbtools/api/ (Diff revision 5)
    You should make url_field_ids set to [] by default, and then just append the get_field('id') to the list always.
    1. Fixed - url_field_ids is no longer required.
  20. rbtools/api/ (Diff revision 5)
  21. rbtools/api/ (Diff revision 5)
    Space after the #
  22. rbtools/api/ (Diff revision 5)
    We should normalize the value passed to get_field() and then call it only once.
  23. rbtools/api/ (Diff revision 5)
    Space after #.
    Same with all other comments.
  24. rbtools/api/ (Diff revision 5)
    Shouldn't print.
  25. rbtools/api/ (Diff revision 5)
    Where does is_resource_list come from?
    1. Fixed - is_resource_list changed to _is_resource_list.
  26. rbtools/api/ (Diff revision 5)
    Shouldn't print.
  27. rbtools/api/ (Diff revision 5)
    Should have some docs. Otherwise, just remove the """
  28. rbtools/api/ (Diff revision 5)
    use super()
  29. rbtools/api/ (Diff revision 5)
    Blank line between these.
  30. rbtools/api/ (Diff revision 5)
    Two blank lines between classes.
  31. rbtools/api/ (Diff revision 5)
    Same comment about the docs.
  32. rbtools/api/ (Diff revision 5)
    Same about super().
    Should go through and make sure it's used right everywhere.
  33. rbtools/api/ (Diff revision 5)
    Comments like these aren't very useful. When code is self-explanatory, comments aren't needed.
  34. rbtools/api/ (Diff revision 5)
    Should be 'reopen'
  35. rbtools/api/ (Diff revision 5)
    There are two ways to close a review request -- to mark it submitted, and to discard it. You should define constants for PENDING, SUBMITTED, CLOSED, and use those. They can be defined to "pending", "submitted", "closed", but that way strings aren't assumed to be used.
  36. rbtools/api/ (Diff revision 5)
    ResourceList is a pretty fundamental base class. I'd put it before any specific resource classes.
  37. rbtools/api/ (Diff revision 5)
    Same about the comment.
  38. rbtools/api/ (Diff revision 5)
  39. rbtools/api/ (Diff revision 5)
    Blank line.
  40. rbtools/api/ (Diff revision 5)
    These comments aren't that useful either.
  41. rbtools/api/ (Diff revision 5)
    Parameters should always be aligned.
    Maybe do:
        self.child_resource_url = \
    1. Fixed - Child Resource Url is now determined from the 'self' url.
  42. rbtools/api/ (Diff revision 5)
    Shouldn't print.
  43. rbtools/api/ (Diff revision 5)
    Should be one statement.
  44. rbtools/api/ (Diff revision 5)
    Can do:
    reverse_url_field_ids = reverse(self.url_field_ids)
  45. rbtools/api/ (Diff revision 5)
    Shouldn't print.
  46. rbtools/api/ (Diff revision 5)
    Should do '%s{some_id_field}%s' % (temp[0], temp[2])
    I don't really know what all this is for, though.
    1. As we discussed, with the next release of the ReviewBoard server you plan on including a "uri_templates" field in each resource.  Since currently this isn't there I'm reverse engineering this so that the part of the code which fills in the id fields up the chain can be tested.
      This isn't 100% percent, of course.  It will only work when the child resource template url is on the same path as the parent.  However, I didn't know what else to do for now so that we could get things at least partially operational.
    2. Okay, in this case, I'd rather see the effort made to introduce this to Djblets. The resource code actually lives there, and so it can be updated and tested, and then RB 1.5.1 would use that. I'll look at taking a stab at it, though, but ideally this workaround wouldn't be committed upstream.
    3. Fixed - Child Resource Url is now determined from the 'self' url and this is completely unnecessary.
  47. rbtools/api/ (Diff revision 5)
    Shouldn't print.
    1. Child Resource Url is now determined from the 'self' url.
    2. Fixed.
  48. rbtools/api/ (Diff revision 5)
    At this point I'm going to ignore any further prints.
  49. rbtools/api/ (Diff revision 5)
    I'd go through and remove all comments that don't explain otherwise confusing, complicated code.
  50. rbtools/api/ (Diff revision 5)
    Build the values to pass first, then call build_resource_url and Resource()
  51. rbtools/api/ (Diff revision 5)
    Should use a comment here, since it's not a function or class's documentation.
  52. rbtools/api/ (Diff revision 5)
    What's the reason for this? Shouldn't __next__ suffice?
    1. Fixed.  For some reason I thought both were required.
  53. rbtools/api/ (Diff revision 5)
    == len(self)
  54. rbtools/api/ (Diff revision 5)
    Can collapse this:
    elif self.is_root:
  55. rbtools/api/ (Diff revision 5)
    Should also be a comment.
  56. rbtools/api/ (Diff revision 5)
    You can do without the 'contains' variable.
    for n in self:
        if n == key:
            return True
    return False
  57. rbtools/api/ (Diff revision 5)
    Since the root resource is special, it should probably just provide special versions of these iterator functions, instead of having every resource checking if it's the root.
    1. Work in progress.
      So do you mean a RootResourceList object with the necessary methods overwritten?  If so, what happens when a 3rd party client uses a ResourceList for the root instead of the RootResourceList?
    2. Fixed.
  58. rbtools/api/ (Diff revision 5)
    No need for this blank line.
  59. rbtools/api/ (Diff revision 5)
    Same here about using a comment.
  60. rbtools/api/ (Diff revision 5)
    Are these meant to be used outside of this file? If not, then the names should have a _ prefix.
    1. Fixed - Removed because its no longer necessary.
  61. rbtools/api/ (Diff revision 5)
    out isn't used until later, so no need to declare it yet.
  62. rbtools/api/ (Diff revision 5)
    I'm really not sure what any of this is supposed to do. This would be where a nice explanatory comment would be helpful. My gut is that there's an easier way to write it, but a comment would help me figure that out :)
  63. rbtools/api/ (Diff revision 5)
    Should use format strings for this. You can get it down to one statement.
  64. rbtools/api/ (Diff revision 5)
    field_id_index -= 1
  65. rbtools/api/ (Diff revision 5)
    Sorry to make you kill off this function, but Python has what you probably need here.
    Python's 're' module (regular expressions) lets you do pattern-based splitting easily.
    You can do:
    import re
    left, center, right = re.split('{([^}]+)}', 'foo{abc}def')
    The pattern there matches something starting with { and anything up to (but not including) the }.
    Result in that case would be: ['foo', 'abc', 'def']
    1. Excellent, that sounds a lot better :D
    2. Fixed.
  66. rbtools/api/ (Diff revision 5)
    Ah, this is what I was wondering before.
    Isn't data a dictionary? If so, you can just do:
    return 'total_results' in data
  67. rbtools/api/ (Diff revision 5)
    When importing, you want to first have a group of built-in Python modules, then a blank line, then 3rd party modules + blank line (if any -- none in this case), then imports from this module (in this case, the rbtools ones).
  68. rbtools/api/ (Diff revision 5)
    This can be simplified to:
    return self._method or super(RequestWithMethod, self).get_method()
  69. rbtools/api/ (Diff revision 5)
    This is only used for the old-style API.
    1. Work in progress.
    2. Fixed - Now using HTTPBasicAuthHandler for authenticating.
  70. rbtools/api/ (Diff revision 5)
    This should probably be split off into some little utility thing in the rbtools.commands API. A graphical third-party app would never want the library prompting for input.
  71. rbtools/clients/ (Diff revision 5)
    Two blank lines.
  72. rbtools/clients/ (Diff revision 5)
    We shouldn't use a log file here. We should use python's logging module.
  73. rbtools/commands/ (Diff revision 5)
    This doesn't have to be done in the first change, but it really shouldn't hard-code the names. What it should be doing is looking for a program with the name of sys.argv[1] with 'rb-' prepended. That way, third parties can make their own and it'll Just Work.
    But again, that's a future task, which isn't a priority over getting this change in.
  74. rbtools/commands/ (Diff revision 5)
    Two blank lines.
  75. rbtools/commands/ (Diff revision 5)
    Two blank lines.
  76. rbtools/commands/ (Diff revision 5)
    Two blank lines.
  77. rbtools/commands/ (Diff revision 5)
    Two blank lines.
Review request changed