Keep ReviewRequest timestamps in sync with Reviews and DiffSets

Review Request #8800 — Created March 7, 2017 and submitted

Information

Review Board
release-3.0.x
fa23ae9...

Reviewers

When publishing a Review, we let the ReviewRequest compute its own
last updated time, which ends up being mere milliseconds after
Review.timestamp, which makes it always appear that the review request
is always the last updated object returned by
ReviewRequest.get_last_activity(). The same is true for DiffSets
during publishing of review requests -- their timestamp is always
behind.

We now force the timestamp of ReviewRequest to be in sync with
Review.timestamp when publishing a review and DiffSet.timestamp when
publishing a review request that contains a new diff. With in-sync
timestamps, ReviewRequest.get_last_activity() now returns the correct
models.

Co-authored-by: Anni Cao anni@sfu.ca.

Ran unit tests.

Built the docs and went through them.

Manually verified the following before applying the patch
(via ./reviewboard/manage.py shell):

  • ReviewRequest.get_last_activity() returns the review request
    after a review is published.
  • ReviewRequest.get_last_activity() returns the review request
    after a revision with a diffset is published.

After applying the patch this is no longer the case.

Description From Last Updated

You should also add a unit test to ensure that saving a review results in the proper timestamps on the …

chipx86chipx86

This change is going to appear to fix the bug most of the time, but it will only appear that …

chipx86chipx86

Prior to this change, it was possible to call draft.publish() and have it kinda sorta do the right thing. It …

chipx86chipx86

Can you detail what manual testing you did to verify bug 3445?

daviddavid

Why is the scmtools fixture needed, out of curiosity?

mike_conleymike_conley

Instead of converting all of these to a string and then comparing strings, how about just squashing out the microseconds …

daviddavid

In the changes above, you are setting timestamp = diffset_timestamp and timestamp = review_timestamp but you never end up setting …

szhangszhang

There should be one of these per update type/result, rather than one big test. That way, if one part of …

chipx86chipx86

I'm a little confused. I wrote the following but later noticed that you have unit tests written and you said …

szhangszhang

I think your test could use a better name here. Perhaps something more specific like test_sync_of_review_timestamp_and_review_request_last_updated?

szhangszhang

Why are you checking this twice?

szhangszhang

You might want to put this into its own unit test.

szhangszhang

F401 'django.utils.six' imported but unused

reviewbotreviewbot

F401 'reviewboard.diffviewer.differ.DiffCompatVersion' imported but unused

reviewbotreviewbot

F401 'django.utils.six' imported but unused

reviewbotreviewbot

F821 undefined name 'timezone'

reviewbotreviewbot

Blank line before comment blocks.

chipx86chipx86

So I think what I'd like to see actually is a change to ModificationTimestampField that prevents overriding custom values. It …

chipx86chipx86

Needs to be the full module path.

chipx86chipx86

This sentence seems off.

chipx86chipx86

This may not link correctly now.

chipx86chipx86

Needs to be the full module path.

chipx86chipx86

This can be grouped together.

chipx86chipx86

Group together.

chipx86chipx86

Group together.

chipx86chipx86

This seeems... unrelated

daviddavid

Needs a period.

daviddavid

Do we still want to do this? Does the modification timestamp work fix this? If not, the comment above needs …

chipx86chipx86

This should be undone. The order was right before.

chipx86chipx86

, optional

chipx86chipx86
reviewbot
  1. Tool: Pyflakes
    Processed Files:
        reviewboard/reviews/tests/test_review_request.py
        reviewboard/reviews/models/review_request.py
    
    
    
    Tool: PEP8 Style Checker
    Processed Files:
        reviewboard/reviews/tests/test_review_request.py
        reviewboard/reviews/models/review_request.py
    
    
  2. 
      
mike_conley
  1. 
      
  2. Show all issues

    Why is the scmtools fixture needed, out of curiosity?

    1. It is for creating repository for new diffset, otherwise the test case cannot run

    2. Ah, I see. I see other tests in that file creating diffsets, and they do it by decorating the test with

      @add_fixtures(['test_scmtools'])

      I imagine we probably want to do the same thing here?

    3. I see the annotation now. Do you know what's the difference between these two approaches?

    4. In webapi tests, it is included in fixtures variable instead of annotation. Does it depend on how often the tools will be used?

    5. Yes, I think so. If the fixture is added at the top level, I think that means each test gets them, which slows down running the whole suite. If we can be more selective, we can only load the necessary fixtures and not get bogged down unnecessarily. I believe that's the rationale, anyhow.

  3. 
      
AN
AN
reviewbot
  1. Tool: Pyflakes
    Processed Files:
        reviewboard/reviews/tests/test_review_request.py
        reviewboard/reviews/models/review_request.py
    
    
    
    Tool: PEP8 Style Checker
    Processed Files:
        reviewboard/reviews/tests/test_review_request.py
        reviewboard/reviews/models/review_request.py
    
    
  2. 
      
szhang
  1. For a bit of background, ReviewRequest.get_last_activity() is actually used in review_details() in reviewboard/reviews/views.py. So, I am trying to understand the reasoning behind the changes.

    Some more detail about the changes would be appreciated in the description. Why are the changes required?

    Correct me if I am wrong--are this changes required because the comparison of timestamps (eg. on line 505 and 517 of the original file) is not working? Thus, you're using string comparisons instead?

  2. Show all issues

    In the changes above, you are setting

    timestamp = diffset_timestamp
    

    and

    timestamp = review_timestamp
    

    but you never end up setting server_timestamp, which should hold the timestamp of the latest activity of the review request.

    Right now, server_timestamp will always be set to self.last_updated regardless of whether there is a new diffset or review newer than self.last_updated.

    1. Forgive me, I didn't read the bug description at https://hellosplat.com/s/beanbag/tickets/3445/. Perhaps add some of the information there into your review request?

      What I don't understand now is how converting last_updated to a string, and comparing them then, will fix anything.

    2. Sorry for the confustion. I have added some explanation in the above Description section for what causes the bug, and why I convert the timestamps, which is just for comparison purpose, not as return value of the function. The return timestamp should be the original server_timestamp.

  3. 
      
AN
david
  1. 
      
  2. Show all issues

    Instead of converting all of these to a string and then comparing strings, how about just squashing out the microseconds part of the datetimes?

    timestamp = server_timestamp.replace(microsecond=0)
    
    ...
    
    diffset_timestamp = diffset.timestamp.replace(microsecond=0)
    if diffset_timestamp >= timestamp:
       ...
    

    This should be faster and more conceptually correct than comparing string representations.

  3. 
      
AN
reviewbot
  1. Tool: Pyflakes
    Processed Files:
        reviewboard/reviews/tests/test_review_request.py
        reviewboard/reviews/models/review_request.py
    
    
    
    Tool: PEP8 Style Checker
    Processed Files:
        reviewboard/reviews/tests/test_review_request.py
        reviewboard/reviews/models/review_request.py
    
    
  2. 
      
AN
AN
chipx86
  1. 
      
  2. Show all issues

    You should also add a unit test to ensure that saving a review results in the proper timestamps on the review request.

    1. An unit test has been added and passed.

  3. Show all issues

    This change is going to appear to fix the bug most of the time, but it will only appear that way. In practice, it's still broken.

    What this change is doing is compensating for the case where review.timestamp's microsecond is, say, 100000, and review_request.last_updated is, say, 100001, and is a result of the review being saved. We're knocking the microseconds down to 0 so the review wins (and it wins because the review request's timestamp is the fallback, in the event that any other is newer or equal).

    I say it only appears to fix it because there are still two cases where things will go wrong (one that is just a manifestation of the current bug, and another that is introduced by this change).

    1) If review.timestamp is 999999 (or thereabouts) and review_request.last_updated is 1, the review request will still win. It's the same problem as before, but the difference isn't contained within a microsecond range of 0-999999. It overlapped into the next second. We'd have to then chase this into the seconds, knocking those to 0, but I think you'll agree that opens the door easily to false positives.
    2) It's possible that the review request is newer than the review. If the review was posted at microsecond=1, and the review request was updated by another user at microsecond=999999, the review request should win! This is the same problem as the seconds issue I describe above, but in a smaller window of time (but on a busy server, this isn't unlikely).

    This change is what we call a "bandaid." It's trying to address the symptom of the problem without solving the core issue. If you get cut by sharp splinter of wood every time you open a drawer, a bandaid might keep blood from a cut on your finger from dripping, but it doesn't do anything to prevent that splinter from cutting you or someone else next time. Ideally, it's best to solve that core problem so the bandaid isn't ever necessary.

    What's really happening with this bug is that, at some point, we publish a review with a timestamp of "now," and then when we save the review request, we allow it to re-calculate what "now" is. What we probably should be doing is forcing the two to be in sync.

    Here's where this is happening, in Review.publish:

    # Update the last_updated timestamp and the last review activity
    # timestamp on the review request.
    self.review_request.last_review_activity_timestamp = self.timestamp
    self.review_request.save(
        update_fields=['last_review_activity_timestamp', 'last_updated'])
    

    last_updated is a special field that, left to its own devices, will update itself to "now" on save. You can see this code in djblets/db/fields.py (ModificationTimestampField). What needs to happen is we need to actually set last_updated the same way we're setting last_review_activity_timestamp, so the two are in sync, and bypass ModificationTimestampField's special settings of this.

    The good news is, it's really easy to bypass the special "pre-save" logic that ModificationTimestampField is doing that's causing it to set the latest timestamp. We can use save_base(raw=True, ...) instead of save(...).

    Model.save() does some special stuff to prepare a model for saving, and ReviewRequest.save() does some other special stuff. In this case, though, none of that special stuff matters, at all. We're only changing two timestamps, and this is safe and reliable and simple. So you can do:

    self.review_request.last_updated = self.timestamp
    self.review_request.save(
        raw=True,
        update_fields=[...])
    

    With that, you shouldn't need any of the bandaids to adjust microseconds, and it should be reliable every time (unless of course some signal handler intercepts the save and makes other modifications, but we can't do anything about that really).

    Give that a try and adjust the unit tests, and see if you can make the problem go away.

    1. Sorry, that should have been:

      self.review_request.last_updated = self.timestamp
      self.review_request.save_base(
          raw=True,
          update_fields=[...])
      
    2. Totally agree that we need to solve the core issue instead of only address the sympton of the problem. Thank you so much for your detailed explanation. I've made changes as you suggested.

  4. Show all issues

    There should be one of these per update type/result, rather than one big test. That way, if one part of all this regresses later, it'll help in diagnosing things.

    You're also going to want to manually insert timestamps for the review and review request in these tests. (The create_* methods you use here, which live in reviewboard/testing/testcase.py, will need to be updated to optionally take last_updated values). Unit tests that test timestamps without explicitly defining them often lead to their own time precision bugs.

    1. I have created unit test cases for Reviews, Diffsets and ChangeDescription separately. They all pass now.

  5. 
      
AN
reviewbot
  1. Tool: Pyflakes
    Processed Files:
        reviewboard/reviews/tests/test_review.py
        reviewboard/reviews/tests/test_review_request.py
        reviewboard/reviews/models/review.py
        reviewboard/reviews/models/review_request_draft.py
        reviewboard/reviews/models/review_request.py
    
    
    
    Tool: PEP8 Style Checker
    Processed Files:
        reviewboard/reviews/tests/test_review.py
        reviewboard/reviews/tests/test_review_request.py
        reviewboard/reviews/models/review.py
        reviewboard/reviews/models/review_request_draft.py
        reviewboard/reviews/models/review_request.py
    
    
  2. 
      
AN
AN
szhang
  1. 
      
  2. Show all issues

    I'm a little confused. I wrote the following but later noticed that you have unit tests written and you said that they passed. Is this because last_updated is updated somewhere else?

    "In the docstring, you should be returning 'the last object updated, along with the timestamp of that object'.

    On this line, you are returning server_timestamp and it is still assigned to self.last_updated."

    1. That's the whole purpose of this function, review_request.last_updated should store the same timestamp of the last update object. Yes, the last_updated is updated whenever a Review is publish or some other update change happens.

  3. reviewboard/reviews/tests/test_review.py (Diff revision 4)
     
     
     
     
    Show all issues

    I think your test could use a better name here. Perhaps something more specific like test_sync_of_review_timestamp_and_review_request_last_updated?

    1. Thanks for good suggestion

    2. But it might be a bit long, also the following comment gives the detail. Maybe something like "test_sync_of_last_updated_timestamp"

  4. reviewboard/reviews/tests/test_review.py (Diff revision 4)
     
     
     
    Show all issues

    Why are you checking this twice?

    1. Just want to make sure that when the second review is published, the last_updated field will be updated to overwrite the first one.

  5. reviewboard/reviews/tests/test_review.py (Diff revision 4)
     
     
     
     
     
     
    Show all issues

    You might want to put this into its own unit test.

    1. I actually put them together on purpose to test if the last_updated will be updated when a new type of review is created and published

  6. 
      
brennie
brennie
Review request changed
Change Summary:

Cleanups to Anni's patch.

Summary:
Fix ReviewRequest.get_last_activity() to return the last update activity for a review request
Keep ReviewRequest timestamps in sync with Reviews and DiffSets
Description:
~  

ReviewRequest.get_last_activity() always returns the ReviewRequest itself. The bug is caused by timestamp discrepancy between ReviewRequest.last_updated and review.timestamp. There is a minor difference in millisecond between these two timestamps although they are logically supposed to be same. What's really happening is that, when we publish/save an update with a timestamp of "now," and then when we save the review request, we allow it to re-calculate what "now" is. The fix should be forcing the two to be in sync.

~   After change, ReviewRequest.get_last_activity() returns the last update activity for a review request.

  ~

When publishing a Review, we let the ReviewRequest compute its own

  ~ last updated time, which ends up being mere milliseconds after
  + Review.timestamp, which makes it always appear that the review request
  + is always the last updated object returned by
  + ReviewRequest.get_last_activity(). The same is true for DiffSets
  + during publishing of review requests -- their timestamp is always
  + behind.

  +
  +

We now force the timestamp of ReviewRequest to be in sync with

  + Review.timestamp when publishing a review and DiffSet.timestamp when
  + publishing a review request that contains a new diff. With in-sync
  + timestamps, ReviewRequest.get_last_activity() now returns the correct
  + models.

  +
  +

Additionally, while I was updating ReviewRequestDraft.publish, I

  + removed the send_notification parameter that was unused except by
  + ReviewRequest.publish. Now only ReviewRequest will emit the
  + review_request_published singal.

  +
  +

The majority of this change was contributed by Anni Cao anni@sfu.ca.

Testing Done:
~  

Unit tests for following update type have been added and passed in ReviewRequestTests:

  ~

Ran unit tests.

-   - Reviews
-   - Diffsets
-   - ChangeDescription

Commit:
2b7aac36d92e86980a0f71b0b1e26178bddfb4c1
9ea37fb4e466bb924178744bbfcc756c1194427d

Checks run (1 failed, 1 succeeded)

flake8 failed.
JSHint passed.

flake8

brennie
Review request changed
Change Summary:

flake8

Commit:
9ea37fb4e466bb924178744bbfcc756c1194427d
ca44a255a218795f6ba5917406fd2d267c71404a

Checks run (1 failed, 1 succeeded)

flake8 failed.
JSHint passed.

flake8

brennie
chipx86
  1. 
      
  2. Show all issues

    Prior to this change, it was possible to call draft.publish() and have it kinda sorta do the right thing. It wasn't consistent, though, and posed a problem. Looking through the test suite, we have places where we're publishing the draft, and not the review request. If we're going to change any semantics and function arguments in this change, I'd like to address this once and for all.

    I think what we should do is rename the existing publish() on drafts to something like apply_changes(). We may then want a draft.publish() that in turn calls review_request.publish() and returns the change description (matching the existing return type for the method).

    This would help clarify what the publishing logic on the draft does, and help avoid problems if code (including existing extension code out in the wild) calls the wrong publish method.

    1. I REALLY don't want to roll that into this change. I'll undo the modifications that I did to this.

  3. Show all issues

    Blank line before comment blocks.

  4. reviewboard/reviews/models/review_request.py (Diff revision 7)
     
     
     
     
     
     
     
    Show all issues

    So I think what I'd like to see actually is a change to ModificationTimestampField that prevents overriding custom values. It would entail tracking the loaded value and seeing if a new, explicit value was set. If so, it wouldn't override.

    This would also let us avoid calling save() and then save_base(), reducing the number of database writes and avoiding incorrect state if something looks up the row in-between these writes.

  5. Show all issues

    Needs to be the full module path.

    1. No, it doesn't. It links just fine.

    2. It won't going forward. We use the full paths in all new code.

  6. Show all issues

    This sentence seems off.

  7. Show all issues

    This may not link correctly now.

  8. Show all issues

    Needs to be the full module path.

    1. Do you mean remove the ~? reviewboard.changedescs.models has no sub-modules.

  9. reviewboard/reviews/tests/test_review_request.py (Diff revision 7)
     
     
     
     
    Show all issues

    This can be grouped together.

  10. reviewboard/reviews/tests/test_review_request.py (Diff revision 7)
     
     
     
     
     
    Show all issues

    Group together.

  11. reviewboard/reviews/tests/test_review_request.py (Diff revision 7)
     
     
     
     
    Show all issues

    Group together.

  12. 
      
brennie
david
  1. 
      
  2. Show all issues

    Can you detail what manual testing you did to verify bug 3445?

  3. docs/manual/_ext/webapidocs.py (Diff revision 8)
     
     
     
     
     
     
     
     
    Show all issues

    This seeems... unrelated

    1. Oops, that was me just getting docs to build. Didn't mean to commit that.

  4. Show all issues

    Needs a period.

  5. 
      
brennie
brennie
chipx86
  1. 
      
  2. reviewboard/reviews/models/review.py (Diff revision 10)
     
     
     
    Show all issues

    Do we still want to do this? Does the modification timestamp work fix this?

    If not, the comment above needs to say exactly why we're doing this.

  3. Show all issues

    This should be undone. The order was right before.

  4. Show all issues

    , optional

  5. 
      
brennie
david
  1. Ship It!
  2. 
      
brennie
brennie
Review request changed
Status:
Completed
Change Summary:
Pushed to release-3.0.x (43c6fad)