Fix several GitHub code linking problems and add unit tests.

Review Request #12532 — Created Aug. 14, 2022 and submitted

Information

beanbag-docutils
master

Reviewers

We had a Python 3 incompatibility issue where a byte string was ending
up in the format string for the final URL when linking to code on
GitHub, resulting in bad links. The source of the problem turned out to
be in _get_git_doc_ref(), which accepted the resulting byte string
from _run_git() and passed it along.

This issue was first found by Simon Willison, who helpfully provided a
pull request (https://github.com/beanbaginc/beanbag-docutils/pull/2) for
this problem.

The lack of test coverage caused us to miss this issue. In the
development of new tests, additional problems were found. Due to the way
we attempted to generate links to code, we failed to link properly to
variables, and could end up linking to the wrong page if a variable was
assigned to an object, class, or function reference.

The reason is that we were walking an object path (a.b.c.d), and
assuming that the last item (d) was something that resided owned by
the module. Instead, for variables, fetching d really just fetched the
value of the variable. That caused the resulting link to point to the
wrong place entirely, and that place could be a 404, due to how links
are generated.

The better solution is to avoid the inspect machinery we were using,
which can't deal with this situation, and instead parse the AST of the
file. Through this, we can reliably look up the identifier belonging to
the module and get the resulting line number.

This change incorporates that along with the fix for byte strings,
bringing Python 3.6-3.11 compatibility for this module (we will be
dropping 2.7 from beanbag-docutils 2.0).

All unit tests pass.

Built the Review Board docs, and verified that all the broken links I
could reproduce (including ones pointing to the wrong places) were now
correctly pointing to the correct files and line numbers on GitHub.

Summary ID
Fix several GitHub code linking problems and add unit tests.
We had a Python 3 incompatibility issue where a byte string was ending up in the format string for the final URL when linking to code on GitHub, resulting in bad links. The source of the problem turned out to be in `_get_git_doc_ref()`, which accepted the resulting byte string from `_run_git()` and passed it along. This issue was first found by Simon Willison, who helpfully provided a pull request (https://github.com/beanbaginc/beanbag-docutils/pull/2) for this problem. The lack of test coverage caused us to miss this issue. In the development of new tests, additional problems were found. Due to the way we attempted to generate links to code, we failed to link properly to variables, and could end up linking to the wrong page if a variable was assigned to an object, class, or function reference. The reason is that we were walking an object path (`a.b.c.d`), and assuming that the last item (`d`) was something that resided owned by the module. Instead, for variables, fetching `d` really just fetched the value of the variable. That caused the resulting link to point to the wrong place entirely, and that place could be a 404, due to how links are generated. The better solution is to avoid the `inspect` machinery we were using, which can't deal with this situation, and instead parse the AST of the file. Through this, we can reliably look up the identifier belonging to the module and get the resulting line number. This change incorporates that along with the fix for byte strings, bringing Python 3.6-3.11 compatibility for this module (we will be dropping 2.7 from beanbag-docutils 2.0).
8d42a37256e2d9f6ad28517cc1e9f6f915fbc201
Description From Last Updated

Neat - learned about https://pypi.org/project/kgb/ from this! Thanks.

simonwsimonw

undefined name 'e' Column: 38 Error code: F821

reviewbotreviewbot

undefined name 'e' Column: 42 Error code: F821

reviewbotreviewbot

expected 2 blank lines after class or function definition, found 1 Column: 1 Error code: E305

reviewbotreviewbot
Checks run (1 failed, 1 succeeded)
flake8 failed.
JSHint passed.

flake8

chipx86
chipx86
Review request changed
Change Summary:

Added missing unit test coverage.

Commits:
Summary ID
Fix several GitHub code linking problems and add unit tests.
We had a Python 3 incompatibility issue where a byte string was ending up in the format string for the final URL when linking to code on GitHub, resulting in bad links. The source of the problem turned out to be in `_get_git_doc_ref()`, which accepted the resulting byte string from `_run_git()` and passed it along. This issue was first found by Simon Willison, who helpfully provided a pull request (https://github.com/beanbaginc/beanbag-docutils/pull/2) for this problem. The lack of test coverage caused us to miss this issue. In the development of new tests, additional problems were found. Due to the way we attempted to generate links to code, we failed to link properly to variables, and could end up linking to the wrong page if a variable was assigned to an object, class, or function reference. The reason is that we were walking an object path (`a.b.c.d`), and assuming that the last item (`d`) was something that resided owned by the module. Instead, for variables, fetching `d` really just fetched the value of the variable. That caused the resulting link to point to the wrong place entirely, and that place could be a 404, due to how links are generated. The better solution is to avoid the `inspect` machinery we were using, which can't deal with this situation, and instead parse the AST of the file. Through this, we can reliably look up the identifier belonging to the module and get the resulting line number. This change incorporates that along with the fix for byte strings, bringing Python 3.6-3.11 compatibility for this module (we will be dropping 2.7 from beanbag-docutils 2.0).
1e8070fd24eeb2c07a25b4a0f821fa348206311a
Fix several GitHub code linking problems and add unit tests.
We had a Python 3 incompatibility issue where a byte string was ending up in the format string for the final URL when linking to code on GitHub, resulting in bad links. The source of the problem turned out to be in `_get_git_doc_ref()`, which accepted the resulting byte string from `_run_git()` and passed it along. This issue was first found by Simon Willison, who helpfully provided a pull request (https://github.com/beanbaginc/beanbag-docutils/pull/2) for this problem. The lack of test coverage caused us to miss this issue. In the development of new tests, additional problems were found. Due to the way we attempted to generate links to code, we failed to link properly to variables, and could end up linking to the wrong page if a variable was assigned to an object, class, or function reference. The reason is that we were walking an object path (`a.b.c.d`), and assuming that the last item (`d`) was something that resided owned by the module. Instead, for variables, fetching `d` really just fetched the value of the variable. That caused the resulting link to point to the wrong place entirely, and that place could be a 404, due to how links are generated. The better solution is to avoid the `inspect` machinery we were using, which can't deal with this situation, and instead parse the AST of the file. Through this, we can reliably look up the identifier belonging to the module and get the resulting line number. This change incorporates that along with the fix for byte strings, bringing Python 3.6-3.11 compatibility for this module (we will be dropping 2.7 from beanbag-docutils 2.0).
8d42a37256e2d9f6ad28517cc1e9f6f915fbc201

Checks run (1 failed, 1 succeeded)

flake8 failed.
JSHint passed.

flake8

david
  1. Ship It!
  2. 
      
simonw
  1. 
      
  2. Show all issues

    Neat - learned about https://pypi.org/project/kgb/ from this! Thanks.

    1. It's one of my favorite things I've written :) Lots of advantages over mocks, a key thing being that it leaves the objects and function instances intact, so fewer chances of side effects. We use it heavily.

  3. 
      
chipx86
Review request changed
Status:
Completed
Change Summary:
Pushed to master (3e80042)