Summary

Fix get_markdown_element_tree() with bad escaping in e-mail addresses.

Review Request #11618 — Created May 25, 2021 and submitted June 2, 2021, 6:47 p.m.

Information

Owner

chipx86

Repository

Djblets

Branch

release-1.0.x

Bugs

Depends On

Reviewers

Groups

djblets

People

Description

We already had some code in place to ensure that some entity names were
converted to entity codes, and some other code to convert invalid raw
character codes to a ?. This change combines the two, updating the
entity name conversion to also perform numeric entity value conversion
for any illegal characters.
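
As a rough illustration of the combined conversion described above, here is a minimal Python 3 sketch. It is not the Djblets implementation: `sanitize_entities`, `is_legal_xml_char`, `ENTITY_RE`, and `XML_PREDEFINED` are hypothetical names, and leaving unknown entity names untouched is an assumption made for brevity.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# Matches named (&nbsp;), decimal (&#64;), and hex (&#x40;) references.
ENTITY_RE = re.compile(r'&(#[xX][0-9A-Fa-f]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);')

# Entity names that XML predefines, so a parser accepts them as-is.
XML_PREDEFINED = {'amp', 'lt', 'gt', 'quot', 'apos'}


def is_legal_xml_char(code):
    """Return whether a code point is allowed by the XML 1.0 Char rule."""
    return (code in (0x9, 0xA, 0xD) or
            0x20 <= code <= 0xD7FF or
            0xE000 <= code <= 0xFFFD or
            0x10000 <= code <= 0x10FFFF)


def sanitize_entities(markup):
    """Rewrite entities so an XML parser can load rendered HTML.

    Known HTML entity names become numeric references, and any
    reference to a character that is illegal in XML becomes "?".
    """
    def _replace(match):
        ref = match.group(1)

        if ref.startswith('#'):
            # Numeric reference: hex (&#x..;) or decimal (&#..;).
            if ref[1] in 'xX':
                code = int(ref[2:], 16)
            else:
                code = int(ref[1:])
        elif ref in XML_PREDEFINED or ref not in name2codepoint:
            # Predefined names are fine; unknown names are left untouched
            # (an assumption of this sketch).
            return match.group(0)
        else:
            code = name2codepoint[ref]

        if is_legal_xml_char(code):
            return '&#%d;' % code

        return '?'

    return ENTITY_RE.sub(_replace, markup)


if __name__ == '__main__':
    html = '<p>&nbsp;&#2;&#x40;&amp;</p>'
    cleaned = sanitize_entities(html)
    print(cleaned)
    print(repr(ET.fromstring(cleaned).text))
```

Handling names and numeric values in a single pass means a named entity that maps to an illegal code point gets the same `?` fallback as a raw numeric one.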

Testing Done

Unit tests passed.

Verified that this fixed a reported issue on Review Board when diffing
fields containing e-mail addresses that end in a backslash escape (like
`<test@example.com\\>`).

Tested this fix against Python 3 as well on release-2.0.x.

Commits

Summary

Fix get_markdown_element_tree() with bad escaping in e-mail addresses.

When Python Markdown renders an e-mail address, it escapes each character in order to make it harder for addresses to be scraped. This can generate some bad entities when adding backslash escaping at the end of the e-mail address (like `<test@example.com\\>`), which browsers will deal with fine, but Python's XML parser won't. We already had some code in place to ensure that some entity names were converted to entity codes, and some other code to convert invalid raw character codes to a `?`. This change combines the two, updating the entity name conversion to also perform numeric entity value conversion for any illegal characters.

aa219443d57c31831c287c4d771d9b47519a30b5

flake8 passed.

JSHint passed.

Ship it!

Status: Completed
Change Summary: Pushed to release-1.0.x (e93e526)