Fix output of binary diff data in python 3
Review Request #11279 — Created Nov. 13, 2020 and submitted
Fix output of non-UTF8 diff data in python 3
In Python 3, "binary" and "unicode" (i.e. string) data are
distinct types, and only UTF-8 unicode data can be printed
to standard out. In diff.py currently, we attempt to decode
diff data as UTF-8, and we fail if this fails.Diff data from many SCMs (Git as an example) is not guaranteed
to be UTF-8 data; though this is often a safe assumption, it
is not always the case, and Git will happily output binary
data to the terminal if it does not detect that binary data
as being binary data.As an example, Chinese language data encoded as Big5 or GB2312
will not be detected as binary data, even though it is also not
UTF-8 data. This means that changing a file encoded in Big5
(in our case, Chinese localization data) makes it impossible
to runrbt diff
to visually validate your changes.In our use case, we also have a workflow of
rbt diff
to output
to a file, process and validate that diff, and then runrbt post
to submit the diff, so this breaks our workflow in Python 3.The documentation for Python 3 sys.stdin/out/err notes that
if writing binary data to these streams is required, the stream's
buffer object should be used, so we apply this change when printing
the diff object.This patch was made against master, but the same change applies
for both the 1.x and 2.x branches.How SCMs detect binary data:
SImple testing:
- Add a Bit5-encoded file to a Git repository
git --no-pager diff
the repository; this should show that the diff contains binary datarbt diff
the repository; this previously failed, but now succeeds
Summary | ID |
---|---|
8542ddcfe8e96bad4637baf685fac6238d6b80b0 | |
6312d90b7a4ffd7edae1c8c760da0d2c538b5de4 | |
9cdec9006e6e1b5e54f7286b5aa0a5d0df93a836 | |
3a6bcf149cf1e44b1fe2b9b970744261290d8e58 | |
3a58e1651c414d7d806e8e888f4361530dc95c64 | |
1dcdb0aca9ba40299575bdab457dfbf264bb2723 |
Description | From | Last Updated |
---|---|---|
Worth noting that this only allows to push binary data to ReviewBoard; there are significant bugs preventing rbtools from actually … |
danudey | |
F821 undefined name 'sys' |
reviewbot | |
One difference being introduced here is that print(...) will always add a trailing newline (even if the content ends with … |
chipx86 | |
Can you add a blank line before the if and between else's block and the print(), like: print() if six.PY2: … |
chipx86 | |
E501 line too long (86 > 79 characters) |
reviewbot | |
E261 at least two spaces before inline comment |
reviewbot | |
E501 line too long (86 > 79 characters) |
reviewbot | |
E261 at least two spaces before inline comment |
reviewbot | |
E501 line too long (90 > 79 characters) |
reviewbot | |
E261 at least two spaces before inline comment |
reviewbot | |
E501 line too long (83 > 79 characters) |
reviewbot |
- Change Summary:
-
Add binary output handling from git during rbt patch
- Commits:
-
Summary ID 19bec3472954aa41fb17dd2a9571a6e9b6ceca02 c7b32bf6f570a9f3edfc4110e4092f313590b43f c88ed549640b1270376ae5e701b3a1a763af2c6d
- Change Summary:
-
Fix regression
- Commits:
-
Summary ID c7b32bf6f570a9f3edfc4110e4092f313590b43f c88ed549640b1270376ae5e701b3a1a763af2c6d c7b32bf6f570a9f3edfc4110e4092f313590b43f c88ed549640b1270376ae5e701b3a1a763af2c6d 7d03ce7319fae982d22c4bd4944194278ef3eecf
Checks run (2 succeeded)
- Change Summary:
-
Add support for non-UTF-8 content in patches when using
rbt patch --print
- Commits:
-
Summary ID c7b32bf6f570a9f3edfc4110e4092f313590b43f c88ed549640b1270376ae5e701b3a1a763af2c6d 7d03ce7319fae982d22c4bd4944194278ef3eecf 8542ddcfe8e96bad4637baf685fac6238d6b80b0 6312d90b7a4ffd7edae1c8c760da0d2c538b5de4 9cdec9006e6e1b5e54f7286b5aa0a5d0df93a836 3a6bcf149cf1e44b1fe2b9b970744261290d8e58
Checks run (2 succeeded)
-
This looks great! And good timing: We're releasing RBTools 2.0 basically this weekend. I have a couple changes that need to be made to ensure we don't regress on some output, but otherwise this looks very solid.
-
One difference being introduced here is that
print(...)
will always add a trailing newline (even if the content ends with one), butsys.stdout.buffer.write(...)
will only add one if the content ends with one. We probably want to follow this statement up with aprint()
.This applies to the other lines that move to
sys.stdout.buffer
as well. -
Can you add a blank line before the
if
and betweenelse
's block and theprint()
, like:print() if six.PY2: ... else: ... print()
We separate out statements from new blocks, to help visually distinguish them.
- Change Summary:
-
Re-add newlines to
sys.stdout.buffer.write
instances. - Commits:
-
Summary ID 8542ddcfe8e96bad4637baf685fac6238d6b80b0 6312d90b7a4ffd7edae1c8c760da0d2c538b5de4 9cdec9006e6e1b5e54f7286b5aa0a5d0df93a836 3a6bcf149cf1e44b1fe2b9b970744261290d8e58 8542ddcfe8e96bad4637baf685fac6238d6b80b0 6312d90b7a4ffd7edae1c8c760da0d2c538b5de4 9cdec9006e6e1b5e54f7286b5aa0a5d0df93a836 3a6bcf149cf1e44b1fe2b9b970744261290d8e58 72abe839d0653fbf2ba037bd78472d435cee1418
Checks run (1 failed, 1 succeeded)
flake8
- Change Summary:
-
Fix Flake8 regressions.
- Commits:
-
Summary ID 8542ddcfe8e96bad4637baf685fac6238d6b80b0 6312d90b7a4ffd7edae1c8c760da0d2c538b5de4 9cdec9006e6e1b5e54f7286b5aa0a5d0df93a836 3a6bcf149cf1e44b1fe2b9b970744261290d8e58 72abe839d0653fbf2ba037bd78472d435cee1418 8542ddcfe8e96bad4637baf685fac6238d6b80b0 6312d90b7a4ffd7edae1c8c760da0d2c538b5de4 9cdec9006e6e1b5e54f7286b5aa0a5d0df93a836 3a6bcf149cf1e44b1fe2b9b970744261290d8e58 3a58e1651c414d7d806e8e888f4361530dc95c64
Checks run (2 succeeded)
- Change Summary:
-
Fix up
svn diff
parsing/output for Python 3 - Commits:
-
Summary ID 8542ddcfe8e96bad4637baf685fac6238d6b80b0 6312d90b7a4ffd7edae1c8c760da0d2c538b5de4 9cdec9006e6e1b5e54f7286b5aa0a5d0df93a836 3a6bcf149cf1e44b1fe2b9b970744261290d8e58 3a58e1651c414d7d806e8e888f4361530dc95c64 8542ddcfe8e96bad4637baf685fac6238d6b80b0 6312d90b7a4ffd7edae1c8c760da0d2c538b5de4 9cdec9006e6e1b5e54f7286b5aa0a5d0df93a836 3a6bcf149cf1e44b1fe2b9b970744261290d8e58 3a58e1651c414d7d806e8e888f4361530dc95c64 f15e987d6031351fb2482dab891a72b5996f1252
- Change Summary:
-
Flake8 regression.
- Commits:
-
Summary ID 8542ddcfe8e96bad4637baf685fac6238d6b80b0 6312d90b7a4ffd7edae1c8c760da0d2c538b5de4 9cdec9006e6e1b5e54f7286b5aa0a5d0df93a836 3a6bcf149cf1e44b1fe2b9b970744261290d8e58 3a58e1651c414d7d806e8e888f4361530dc95c64 f15e987d6031351fb2482dab891a72b5996f1252 8542ddcfe8e96bad4637baf685fac6238d6b80b0 6312d90b7a4ffd7edae1c8c760da0d2c538b5de4 9cdec9006e6e1b5e54f7286b5aa0a5d0df93a836 3a6bcf149cf1e44b1fe2b9b970744261290d8e58 3a58e1651c414d7d806e8e888f4361530dc95c64 1dcdb0aca9ba40299575bdab457dfbf264bb2723