Add cross-platform file path conversion/normalization.

Review Request #12071 — Created Feb. 23, 2022 and submitted

Information

ReviewBot
release-3.0.x

Reviewers

File paths in diffs can come in any format. Windows paths, POSIX paths,
or UNC-like paths (or similar, such as for Perforce).

Review Bot has been assuming the file paths match the local environment,
which is generally safe, as POSIX is most common. However, it's an
assumption that could result in problems.

There are also problems with assuming the paths can be trusted. They may
be absolute, or may include a number of .. elements that could
result in issues modifying other files.

This change introduces some normalization and safeguards for working
with paths. A new get_path_platform() method inspects a path, trying
to determine if it's a Windows or POSIX path, and returns an enum value
representing it. Callers can then get the appropriate
os.path-equivalent module (ntpath or posixpath) from that value.

normalize_platform_path() makes use of this to convert a path to
something safe for use in patching files. It will split out any drive
letters or UNC hosts from a path, normalize any relative references
within the path, and then split/re-join it for the local system's
path separator.

It can optionally append this to a target directory as well, for
building an absolute path.

The resulting path is checked to ensure it's not referencing something
outside of the target directory. If it is, an exception is raised. This
mirrors similar behavior from GNU patch, which denies absolute or
relative references below the patch target directory.

This will be used for an upcoming replacement for the patching logic.

Unit tests pass.

Summary ID
Add cross-platform file path conversion/normalization.
File paths in diffs can come in any format. Windows paths, POSIX paths, or UNC-like paths (or similar, such as for Perforce). Review Bot has been assuming the file paths match the local environment, which is generally safe, as POSIX is most common. However, it's an assumption that could result in problems. There are also problems with assuming the paths can be trusted. They may be absolute, or may include a number of ``..`` elements that could result in issues modifying other files. This change introduces some normalization and safeguards for working with paths. A new `get_path_platform()` method inspects a path, trying to determine if it's a Windows or POSIX path, and returns an enum value representing it. Callers can then get the appropriate `os.path`-equivalent module (`ntpath` or `posixpath`) from that value. `normalize_platform_path()` makes use of this to convert a path to something safe for use in patching files. It will split out any drive letters or UNC hosts from a path, normalize any relative references within the path, and then split/re-join it for the local system's path separator. It can optionally append this to a target directory as well, for building an absolute path. The resulting path is checked to ensure it's not referencing something outside of the target directory. If it is, an exception is raised. This mirrors similar behavior from GNU patch, which denies absolute or relative references below the patch target directory. This will be used for an upcoming replacement for the patching logic.
a719e4a16e0e07edffca27b800e78bbb7544280e
david
  1. Ship It!
  2. 
      
chipx86
Review request changed
Status:
Completed
Change Summary:
Pushed to release-3.0.x (f945d49)