Fix Unicode parsing issues with diff fragment and entry update payloads.

Review Request #9450 — Created Dec. 18, 2017 and submitted

Information

Review Board
release-3.0.x
995ff33...

Reviewers

Review Board 3.0 introduced a more efficient way of transferring diff
fragments for comments, and a mechanism for providing updates to entries
on the review request page. These used a streamable format that encoded
byte lengths followed by content.

The problem is, JavaScript doesn't deal with strings as binary data,
they deal with them as Unicode data. This meant that our byte counts
would be wildly off whenever any multi-byte characters were introduced
in any of the content sections, causing various breakages.

A long-term (and in-progress) solution is to switch to interpreting
these payloads as binary content by using JavaScript's ArrayBuffer and
DataView objects, which would allow any form of content to be made
available in the payload. For now, though, since these are all UTF-8
string content, we can resolve this issue by storing character counts
and not byte counts.

Python and JavaScript unit tests pass on Chrome, Firefox, and IE.

Manually tested Unicode content in updates and diff fragments. They
loaded correctly.

Description From Last Updated

Seems a little silly to define variables for these. Why not just payload.write(metadata.encode('utf-8'))?

daviddavid

Same here re: defining a single-use variable.

daviddavid
chipx86
david
  1. 
      
  2. reviewboard/reviews/views.py (Diff revision 1)
     
     
     
    Show all issues

    Seems a little silly to define variables for these. Why not just payload.write(metadata.encode('utf-8'))?

    1. Had it during some debugging.

  3. reviewboard/reviews/views.py (Diff revision 1)
     
     
    Show all issues

    Same here re: defining a single-use variable.

  4. 
      
chipx86
david
  1. Ship It!
  2. 
      
chipx86
Review request changed
Status:
Completed
Change Summary:
Pushed to release-3.0.x (4f73586)