Improve HTML safety for both mimetype text thumbnails and Markdown rendering.
Review Request #9303 — Created Oct. 21, 2017 and submitted — Latest diff uploaded
The old mimetype text thumbnail generation code returned plain Unicode
strings for any rendered HTML. This was fine, since we were escaping
user content appropriately (mostly -- see the next point), but manually
working with HTML-safe Unicode strings isn't ideal. This change switches
to usingformat_html_join
andmark_safe
in more places to be
explicit about what kinds of strings we're working with, helping prevent
subclasses and new code down the road from accidentally injecting
HTML-unsafe code into the thumbnails.The Markdown thumbnails now use our standard Markdown rendering code and
settings, creating consistency between the Markdown Review UI and the
thumbnail and giving us only one path to manage for handling HTML safety
for user-provided Markdown content.While working on this, I discovered that the new Markdown support
shipping in 3.0 was allowing HTML attributes to be embedded, which
wasn't ideal and could potentially lead to vulnerabilities. This was due
to switching fromsafe_mode='escape'
in the rendering process to using
a new extension that ensures safety (sincesafe_mode
) is going away in
the next major Python-Markdown release. Previously, settingsafe_mode
would setenable_attributes=False
(meaning all pre-3.0 releases are
fine), but the new extension doesn't do this. We now manually set that
setting ourselves for all rendering.
Unit tests pass.