Fix Unicode parsing issues with diff fragment and entry update payloads.

Review Request #9450 — Created Dec. 18, 2017 and submitted — Latest diff uploaded

Information

Review Board
release-3.0.x
995ff33...

Reviewers

Review Board 3.0 introduced a more efficient way of transferring diff
fragments for comments, and a mechanism for providing updates to entries
on the review request page. These used a streamable format that encoded
byte lengths followed by content.

The problem is that JavaScript doesn't treat strings as binary data; it
treats them as Unicode data. This meant that our byte counts no longer
matched the string offsets whenever any multi-byte characters appeared
in any of the content sections, causing various breakages.
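
As a rough illustration of the mismatch (not the actual Review Board
payload format), the sketch below assumes a hypothetical layout where
each section is a decimal count, a newline, and then the content. The
function names are made up for this example. The counts are written as
UTF-8 byte lengths, but the parser slices a decoded string, which is
what a client working with Unicode strings ends up doing.

```python
def encode_with_byte_counts(sections):
    """Build a payload where each section is prefixed by its UTF-8 byte length.

    The "<count>\\n<content>" layout is an assumption for illustration only.
    """
    return "".join(f"{len(s.encode('utf-8'))}\n{s}" for s in sections)


def parse_as_text(payload):
    """Parse the payload as decoded text, slicing by the stored count.

    Each stored count is used as a number of characters, even though the
    encoder above wrote it as a number of bytes.
    """
    sections = []
    pos = 0

    while pos < len(payload):
        newline = payload.index("\n", pos)
        count = int(payload[pos:newline])
        start = newline + 1
        sections.append(payload[start:start + count])
        pos = start + count

    return sections
```

With pure ASCII content the two counts agree and parsing works. With
["héllo", "world"], the first section is stored with a count of 6
(since "é" takes two bytes in UTF-8), so the parser reads six characters
and swallows the start of the next section's header, then fails on the
remainder.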

A long-term (and in-progress) solution is to switch to interpreting
these payloads as binary content by using JavaScript's ArrayBuffer and
DataView objects, which would allow any form of content to be made
available in the payload. For now, though, since all of this content is
UTF-8 string data, we can resolve the issue by storing character counts
instead of byte counts.
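
A minimal sketch of both approaches, again assuming a hypothetical
"count, newline, content" layout and made-up function names rather than
the real payload format. The first pair stores character counts and
parses text; parse_bytes shows the general idea behind the long-term
option of keeping byte counts and parsing the raw bytes. Note that
Python's len() counts code points while JavaScript's String.length
counts UTF-16 code units, so this sketch says nothing about how the
actual change handles characters outside the Basic Multilingual Plane.

```python
def encode_with_char_counts(sections):
    """Prefix each section with its length in characters, not bytes."""
    return "".join(f"{len(s)}\n{s}" for s in sections)


def parse_text(payload):
    """Parse a decoded string whose counts are character counts."""
    sections = []
    pos = 0

    while pos < len(payload):
        newline = payload.index("\n", pos)
        count = int(payload[pos:newline])
        start = newline + 1
        sections.append(payload[start:start + count])
        pos = start + count

    return sections


def parse_bytes(payload):
    """Parse raw bytes whose counts are byte counts, decoding each section."""
    sections = []
    pos = 0

    while pos < len(payload):
        newline = payload.index(b"\n", pos)
        count = int(payload[pos:newline])
        start = newline + 1
        sections.append(payload[start:start + count].decode("utf-8"))
        pos = start + count

    return sections
```

In both cases the unit of the stored count matches the unit the parser
slices by, which is the property the byte-count-on-text approach lacked.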

Python and JavaScript unit tests pass on Chrome, Firefox, and IE.

Manually tested Unicode content in updates and diff fragments. They
loaded correctly.
