This change has been marked as completed.

Describe the completed change (optional):

Pushed to release-4.0.x (7321fbd)

Summary

Fix data truncation, blocking I/O, and performance issues in rbssh.

Review Request #12574 — Created Sept. 2, 2022 and submitted 2 years, 9 months ago — Latest diff uploaded 2 years, 10 months ago

Information

Owner

chipx86*

Repository

Review Board

Branch

release-4.0.x

Bugs

Depends On

Reviewers

Groups

reviewboard

People

Description*

rbssh has been in use for many, many years, and has worked in the cases
that people most often use it for (Git, Subversion, etc.). However, a
new integration that remote-controls a server over SSH has unveiled a
problem where we can stop processing the channel before we finish
handling all stdout/stderr content.

This will happen if the other end sends a large payload of data
(extending a buffer beyond our read limit of 4096) without blocking, and
then closes the stream with an EOF.

Paramiko will see the EOF and then mark the channel as "closed". This is
sort of a lie. "closed" does not mean "stop reading", as we've
interpreted it, but rather "no more data will be come in." There's a
difference there, as the buffers may still contain content.

This was one issue. Another is that we'd try to output data and,
sometimes this would result in a Resource Not Available error. This
depends on the version of Python and how the data is being consumed
(e.g., outputting to iTerm on macOS using Python 2.7 can trigger this,
but redirecting to a file does not).

These have been solved by doing two things:

We no longer assume we're done reading if the channel is marked as
closed. Instead, we only close if we've received an exit code, we did
not receive new data, and we don't have any data left in the buffers.
stdout/stderr are now non-blocking, like stdin. This ensures data
gets sent immediately to the caller, and forces us to handle writes
accordingly (which could be an issue if a caller were to set our
blocking flags for those streams).

With these fixed, we're also able to increase the buffer sizes to
improve performance.

For stdout/stderr data, we now read up to 16KB at a time (it will not
wait for a full 16KB, it'll just read up to that).

For stdin, we also read up to 16KB, instead of 1 byte. This is far more
efficient. We use the same blocking-aware logic we now have for working
with stdout/stderr streams.

Testing Done

Unit tests pass.

Tested the new integration that occasionally triggered the buffer
truncation issue. With this fix, the issue no longer occurred across
many dozens of repeated attempts.

Ran many tests using Python 2.7 and various 3.x versions where I ran
commands on multiple remote machines, checking for blocking I/O issues
and truncated data. These included using cat to transfer data to a
remote server and to transfer from a remote server, verifying hashes.

Tested performance and verified a pretty substantial (several seconds)
decrease in time when transferring 100MB+ files.

Diff Revision 3 (Latest)

orig

Commits

First	Last	Summary	ID	Author
		Fix data truncation, blocking I/O, and performance issues in rbssh. rbssh has been in use for many, many years, and has worked in the cases that people most often use it for (Git, Subversion, etc.). However, a new integration that remote-controls a server over SSH has unveiled a problem where we can stop processing the channel before we finish handling all stdout/stderr content. This will happen if the other end sends a large payload of data (extending a buffer beyond our read limit of 4096) without blocking, and then closes the stream with an EOF. Paramiko will see the EOF and then mark the channel as "closed". This is sort of a lie. "closed" does not mean "stop reading", as we've interpreted it, but rather "no more data will be come in." There's a difference there, as the buffers may still contain content. This was one issue. Another is that we'd try to output data and, sometimes this would result in a Resource Not Available error. This depends on the version of Python and how the data is being consumed (e.g., outputting to iTerm on macOS using Python 2.7 can trigger this, but redirecting to a file does not). These have been solved by doing two things: 1. We no longer assume we're done reading if the channel is marked as closed. Instead, we only close if we've received an exit code, we did not receive new data, and we don't have any data left in the buffers. 2. stdout/stderr are now non-blocking, like stdin. This ensures data gets sent immediately to the caller, and forces us to handle writes accordingly (which could be an issue if a caller were to set our blocking flags for those streams). With these fixed, we're also able to increase the buffer sizes to improve performance. For stdout/stderr data, we now read up to 16KB at a time (it will not wait for a full 16KB, it'll just read up to that). For stdin, we also read up to 16KB, instead of 1 byte. This is far more efficient. We use the same blocking-aware logic we now have for working with stdout/stderr streams.	73df92c5b545a66c2440dcb9dcb4c1c93100dad8	Christian Hammond

Files

reviewboard/cmdline/rbssh.py

reviewboard/cmdline/rbssh.py