Fix issues in rbssh that could lead to race conditions or bad logging.

Review Request #12834 — Created Feb. 20, 2023 and submitted — Latest diff uploaded


Review Board


A few problems were encountered during some testing with a company that
resulted in missing results from rbssh and problems logging.

The logging issue was a combination of Django overwriting our logging
setup, and logging not always flushing. We now set up Django as the
first thing in main(), and we explicitly shut down before exit.

The data loss issue could occur if the stdin stream was opened,
selected for new data, yet empty. This was tricky to reproduce, but we
hit this on CentOS 7.9.2009 with a custom-built Python 3.7.9 running
within Apache 2.4.6 and mod_wsgi in embedded mode.

When stdin was closed, we finished selecting for data. This was
definitely a bug. We hadn't seen this come up in production before, but
had a reliable test case to verify this now.

Now, when closed, we simply exclude it from future selects, and only
close out the stream when the remote process itself ends.

The loop for reading data has been made a bit less expensive. We now
wait 0.01 seconds between reads, using Paramiko's own constant and
following its logic.

There are a few other small fixes/improvements:

  1. Errors are now outputted to stderr, and not just the log, helping to
    catch issues.

  2. Setting streams to non-blocking now uses some handy Python helpers,
    rather than going through fcntl.

  3. More conditions and data are now logged, to help with debugging,
    and log messages have been cleaned up.

Locally tested rbssh with shell mode and remote execution with
short-running processes, long-running processes, and streaming

Locally tested within a web server, interfacing with Cliosoft SOS
over SSH mode.

Tested remotely on the CentOS system that exhibited the data loss
issue. Verified that all interaction worked correctly under reliable
failure conditions.