Fix ReviewRequest.get_last_activity() to return the last update activity for a review request

Review Request #8800 - Created March 7, 2017 and updated

Anni Cao
Review Board
release-2.5.x
3445
2b7aac3...
reviewboard, students

ReviewRequest.get_last_activity() always returns the ReviewRequest itself. The bug is caused by timestamp discrepancy between ReviewRequest.last_updated and review.timestamp. There is a minor difference in millisecond between these two timestamps although they are logically supposed to be same. What's really happening is that, when we publish/save an update with a timestamp of "now," and then when we save the review request, we allow it to re-calculate what "now" is. The fix should be forcing the two to be in sync.
After change, ReviewRequest.get_last_activity() returns the last update activity for a review request.

Unit tests for following update type have been added and passed in ReviewRequestTests:
- Reviews
- Diffsets
- ChangeDescription

  • 0
  • 0
  • 8
  • 2
  • 10
Description From Last Updated
Review Bot
  1. Tool: Pyflakes
    Processed Files:
        reviewboard/reviews/tests/test_review_request.py
        reviewboard/reviews/models/review_request.py
    
    
    
    Tool: PEP8 Style Checker
    Processed Files:
        reviewboard/reviews/tests/test_review_request.py
        reviewboard/reviews/models/review_request.py
    
    
  2. 
      
Mike Conley
  1. 
      
  2. Why is the scmtools fixture needed, out of curiosity?

    1. It is for creating repository for new diffset, otherwise the test case cannot run

    2. Ah, I see. I see other tests in that file creating diffsets, and they do it by decorating the test with

      @add_fixtures(['test_scmtools'])

      I imagine we probably want to do the same thing here?

    3. I see the annotation now. Do you know what's the difference between these two approaches?

    4. In webapi tests, it is included in fixtures variable instead of annotation. Does it depend on how often the tools will be used?

    5. Yes, I think so. If the fixture is added at the top level, I think that means each test gets them, which slows down running the whole suite. If we can be more selective, we can only load the necessary fixtures and not get bogged down unnecessarily. I believe that's the rationale, anyhow.

  3. 
      
Anni Cao
Anni Cao
Review Bot
  1. Tool: Pyflakes
    Processed Files:
        reviewboard/reviews/tests/test_review_request.py
        reviewboard/reviews/models/review_request.py
    
    
    
    Tool: PEP8 Style Checker
    Processed Files:
        reviewboard/reviews/tests/test_review_request.py
        reviewboard/reviews/models/review_request.py
    
    
  2. 
      
Simon Zhang
  1. For a bit of background, ReviewRequest.get_last_activity() is actually used in review_details() in reviewboard/reviews/views.py. So, I am trying to understand the reasoning behind the changes.

    Some more detail about the changes would be appreciated in the description. Why are the changes required?

    Correct me if I am wrong--are this changes required because the comparison of timestamps (eg. on line 505 and 517 of the original file) is not working? Thus, you're using string comparisons instead?

  2. In the changes above, you are setting

    timestamp = diffset_timestamp
    

    and

    timestamp = review_timestamp
    

    but you never end up setting server_timestamp, which should hold the timestamp of the latest activity of the review request.

    Right now, server_timestamp will always be set to self.last_updated regardless of whether there is a new diffset or review newer than self.last_updated.

    1. Forgive me, I didn't read the bug description at https://hellosplat.com/s/beanbag/tickets/3445/. Perhaps add some of the information there into your review request?

      What I don't understand now is how converting last_updated to a string, and comparing them then, will fix anything.

    2. Sorry for the confustion. I have added some explanation in the above Description section for what causes the bug, and why I convert the timestamps, which is just for comparison purpose, not as return value of the function. The return timestamp should be the original server_timestamp.

  3. 
      
Anni Cao
David Trowbridge
  1. 
      
  2. Instead of converting all of these to a string and then comparing strings, how about just squashing out the microseconds part of the datetimes?

    timestamp = server_timestamp.replace(microsecond=0)
    
    ...
    
    diffset_timestamp = diffset.timestamp.replace(microsecond=0)
    if diffset_timestamp >= timestamp:
       ...
    

    This should be faster and more conceptually correct than comparing string representations.

  3. 
      
Anni Cao
Review Bot
  1. Tool: Pyflakes
    Processed Files:
        reviewboard/reviews/tests/test_review_request.py
        reviewboard/reviews/models/review_request.py
    
    
    
    Tool: PEP8 Style Checker
    Processed Files:
        reviewboard/reviews/tests/test_review_request.py
        reviewboard/reviews/models/review_request.py
    
    
  2. 
      
Anni Cao
Anni Cao
Christian Hammond
  1. 
      
  2. This change is going to appear to fix the bug most of the time, but it will only appear that way. In practice, it's still broken.

    What this change is doing is compensating for the case where review.timestamp's microsecond is, say, 100000, and review_request.last_updated is, say, 100001, and is a result of the review being saved. We're knocking the microseconds down to 0 so the review wins (and it wins because the review request's timestamp is the fallback, in the event that any other is newer or equal).

    I say it only appears to fix it because there are still two cases where things will go wrong (one that is just a manifestation of the current bug, and another that is introduced by this change).

    1) If review.timestamp is 999999 (or thereabouts) and review_request.last_updated is 1, the review request will still win. It's the same problem as before, but the difference isn't contained within a microsecond range of 0-999999. It overlapped into the next second. We'd have to then chase this into the seconds, knocking those to 0, but I think you'll agree that opens the door easily to false positives.
    2) It's possible that the review request is newer than the review. If the review was posted at microsecond=1, and the review request was updated by another user at microsecond=999999, the review request should win! This is the same problem as the seconds issue I describe above, but in a smaller window of time (but on a busy server, this isn't unlikely).

    This change is what we call a "bandaid." It's trying to address the symptom of the problem without solving the core issue. If you get cut by sharp splinter of wood every time you open a drawer, a bandaid might keep blood from a cut on your finger from dripping, but it doesn't do anything to prevent that splinter from cutting you or someone else next time. Ideally, it's best to solve that core problem so the bandaid isn't ever necessary.

    What's really happening with this bug is that, at some point, we publish a review with a timestamp of "now," and then when we save the review request, we allow it to re-calculate what "now" is. What we probably should be doing is forcing the two to be in sync.

    Here's where this is happening, in Review.publish:

    # Update the last_updated timestamp and the last review activity
    # timestamp on the review request.
    self.review_request.last_review_activity_timestamp = self.timestamp
    self.review_request.save(
        update_fields=['last_review_activity_timestamp', 'last_updated'])
    

    last_updated is a special field that, left to its own devices, will update itself to "now" on save. You can see this code in djblets/db/fields.py (ModificationTimestampField). What needs to happen is we need to actually set last_updated the same way we're setting last_review_activity_timestamp, so the two are in sync, and bypass ModificationTimestampField's special settings of this.

    The good news is, it's really easy to bypass the special "pre-save" logic that ModificationTimestampField is doing that's causing it to set the latest timestamp. We can use save_base(raw=True, ...) instead of save(...).

    Model.save() does some special stuff to prepare a model for saving, and ReviewRequest.save() does some other special stuff. In this case, though, none of that special stuff matters, at all. We're only changing two timestamps, and this is safe and reliable and simple. So you can do:

    self.review_request.last_updated = self.timestamp
    self.review_request.save(
        raw=True,
        update_fields=[...])
    

    With that, you shouldn't need any of the bandaids to adjust microseconds, and it should be reliable every time (unless of course some signal handler intercepts the save and makes other modifications, but we can't do anything about that really).

    Give that a try and adjust the unit tests, and see if you can make the problem go away.

    1. Sorry, that should have been:

      self.review_request.last_updated = self.timestamp
      self.review_request.save_base(
          raw=True,
          update_fields=[...])
      
    2. Totally agree that we need to solve the core issue instead of only address the sympton of the problem. Thank you so much for your detailed explanation. I've made changes as you suggested.

  3. You should also add a unit test to ensure that saving a review results in the proper timestamps on the review request.

    1. An unit test has been added and passed.

  4. There should be one of these per update type/result, rather than one big test. That way, if one part of all this regresses later, it'll help in diagnosing things.

    You're also going to want to manually insert timestamps for the review and review request in these tests. (The create_* methods you use here, which live in reviewboard/testing/testcase.py, will need to be updated to optionally take last_updated values). Unit tests that test timestamps without explicitly defining them often lead to their own time precision bugs.

    1. I have created unit test cases for Reviews, Diffsets and ChangeDescription separately. They all pass now.

  5. 
      
Anni Cao
Review Bot
  1. Tool: Pyflakes
    Processed Files:
        reviewboard/reviews/tests/test_review.py
        reviewboard/reviews/tests/test_review_request.py
        reviewboard/reviews/models/review.py
        reviewboard/reviews/models/review_request_draft.py
        reviewboard/reviews/models/review_request.py
    
    
    
    Tool: PEP8 Style Checker
    Processed Files:
        reviewboard/reviews/tests/test_review.py
        reviewboard/reviews/tests/test_review_request.py
        reviewboard/reviews/models/review.py
        reviewboard/reviews/models/review_request_draft.py
        reviewboard/reviews/models/review_request.py
    
    
  2. 
      
Anni Cao
Anni Cao
Review request changed

Description:

~  

ReviewRequest.get_last_activity() always returns the ReviewRequest itself. The bug is caused by timestamp discrepancy between ReviewRequest.last_updated and review.timestamp. There is a minor difference in millisecond between these two timestamps although they are logically supposed to be same. I use replace(microsecond=0) to trim the microsecond part of the timestamps to remove the discrepancy.

  ~

ReviewRequest.get_last_activity() always returns the ReviewRequest itself. The bug is caused by timestamp discrepancy between ReviewRequest.last_updated and review.timestamp. There is a minor difference in millisecond between these two timestamps although they are logically supposed to be same. What's really happening is that, when we publish/save an update with a timestamp of "now," and then when we save the review request, we allow it to re-calculate what "now" is. The fix should be forcing the two to be in sync.

    After change, ReviewRequest.get_last_activity() returns the last update activity for a review request.

Simon Zhang
  1. 
      
  2. I'm a little confused. I wrote the following but later noticed that you have unit tests written and you said that they passed. Is this because last_updated is updated somewhere else?

    "In the docstring, you should be returning 'the last object updated, along with the timestamp of that object'.

    On this line, you are returning server_timestamp and it is still assigned to self.last_updated."

    1. That's the whole purpose of this function, review_request.last_updated should store the same timestamp of the last update object. Yes, the last_updated is updated whenever a Review is publish or some other update change happens.

  3. reviewboard/reviews/tests/test_review.py (Diff revision 4)
     
     
     
     

    I think your test could use a better name here. Perhaps something more specific like test_sync_of_review_timestamp_and_review_request_last_updated?

    1. Thanks for good suggestion

    2. But it might be a bit long, also the following comment gives the detail. Maybe something like "test_sync_of_last_updated_timestamp"

  4. reviewboard/reviews/tests/test_review.py (Diff revision 4)
     
     
     

    Why are you checking this twice?

    1. Just want to make sure that when the second review is published, the last_updated field will be updated to overwrite the first one.

  5. reviewboard/reviews/tests/test_review.py (Diff revision 4)
     
     
     
     
     
     

    You might want to put this into its own unit test.

    1. I actually put them together on purpose to test if the last_updated will be updated when a new type of review is created and published

  6. 
      
Loading...