Fix Unicode parsing issues with diff fragment and entry update payloads.

Review Request #9450 — Created Dec. 18, 2017 and submitted

Christian Hammond
Review Board

Review Board 3.0 introduced a more efficient way of transferring diff
fragments for comments, and a mechanism for providing updates to entries
on the review request page. These used a streamable format that encoded
byte lengths followed by content.

The problem is, JavaScript doesn't deal with strings as binary data,
they deal with them as Unicode data. This meant that our byte counts
would be wildly off whenever any multi-byte characters were introduced
in any of the content sections, causing various breakages.

A long-term (and in-progress) solution is to switch to interpreting
these payloads as binary content by using JavaScript's ArrayBuffer and
DataView objects, which would allow any form of content to be made
available in the payload. For now, though, since these are all UTF-8
string content, we can resolve this issue by storing character counts
and not byte counts.

Python and JavaScript unit tests pass on Chrome, Firefox, and IE.

Manually tested Unicode content in updates and diff fragments. They
loaded correctly.

  • 0
  • 0
  • 2
  • 0
  • 2
Description From Last Updated
Christian Hammond
David Trowbridge
  2. reviewboard/reviews/ (Diff revision 1)

    Seems a little silly to define variables for these. Why not just payload.write(metadata.encode('utf-8'))?

    1. Had it during some debugging.

  3. reviewboard/reviews/ (Diff revision 1)

    Same here re: defining a single-use variable.

Christian Hammond
David Trowbridge
  1. Ship It!
Christian Hammond
Review request changed

Status: Closed (submitted)

Change Summary:

Pushed to release-3.0.x (4f73586)