Fix diffing of rendered text file attachments on Python 3.

Review Request #10709 — Created Sept. 10, 2019 and submitted — Latest diff uploaded

Information

Review Board
release-4.0.x
9947e50...

Reviewers

Rendered text file attachments (such as Markdown) are rendered as
Unicode, for output to the client. These were getting passed directly to
the differ, which itself takes byte strings. On Python 2, this all
worked out fine. On Python 3, this blew up pretty early, but in a way
that was hard to diagnose due any errors coming from subclasses just
being logged (with a massively-huge log message) and then the exception
ignored.

The Markdown review UI specifically had another issue where we were
passing in a file handle containing byte strings, which was then getting
written back as byte strings but to a StringIO expecting Unicode
strings. While the core issue lives in Djblets (and there's another
patch up that aims to fix this), it's better if we just read the file
content and convert to Unicode first, since the Markdown library doesn't
stream the file pointer anyway (it just grabs all the data at once).

This change most of these issues by doing the following:

1) Ensures we're always passing the right string types around (both to
Markdown and the diff chunk generators).

2) Switches the Markdown rendering call to render_markdown(), passing
it a Unicode string instead of a file handle.

3) Adds new unit tests covering all this.

It does not address the overzealous exception sandboxing. That will be
taken care of separately.

Unit tests passed.

Verified I could view Markdown files in both rendered and source mode,
with diffs.