Add the initial documentation and spec for DiffX.

Review Request #8914 - Created April 22, 2017 and updated

Christian Hammond
DiffX
master
40c1b90...
diffx
This goes over the rationale for why we need a new diff file format,
answers questions about the whys and hows, and includes a specification
for the format itself.

The spec is not complete. It's an iteration on what we had in
Hackpad/Notion, but additional changes will be coming to address things
like the metadata format and to include more examples.

Built the docs and read through them. This is still in the early stages,
so a lot needs to be verified during review and after.

Checks run (1 failed, 1 succeeded, 1 failed with error)
JSHint passed.
PEP8 Style Checker internal error.
Pyflakes failed.

David Trowbridge
  1. I'll probably have a lot more comments as I digest this further, but here's a start.

  2. docs/problems-with-diffs.rst (Diff revision 1)
     
     

    zh_CN is a language code, not an encoding. Perhaps say "Shift-JIS" or "UTF-16LE" (which are both relatively common)?
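The distinction can be checked against Python's codec registry. This is only an illustrative sketch: `is_known_encoding` is a hypothetical helper name, and Python's registry is used here as a convenient stand-in for "is this an encoding", not as anything the DiffX docs prescribe.

```python
import codecs


def is_known_encoding(name):
    """Return True if Python's codec registry recognizes ``name``.

    Hypothetical helper for illustration only.
    """
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


# "Shift-JIS" and "UTF-16LE" resolve to codecs; "zh_CN" is a locale
# (language + region) identifier and does not.
for candidate in ("Shift-JIS", "UTF-16LE", "zh_CN"):
    print(candidate, is_known_encoding(candidate))
```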

  3. docs/spec/index.rst (Diff revision 1)
     
     
     
     

    I'm not sure what "the file is treated as 8-bit binary data" means here, since this seems to be explaining the encoding of the section itself. I'm also not sure what it means for parsing if a section defines its own encoding somewhere in the middle.

    1. I believe all section headings are interpreted as ASCII and this just refers to the body (i.e., the content of the sections) and will be used as a default encoding if any child section does not declare an encoding.

    2. Correct. It's not a valid diffx file if the headers and such are in something like UTF-16. These are all ASCII, and the encoding has to do with the section.

      Having a content length solves parsing problems here for sure.

    3. In that case, the "Section Format" part should specify that all diffx lines are ASCII-only.
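A rough sketch of the rule discussed in this thread, assuming header lines are strictly ASCII and a section body is decoded with its own declared encoding or, failing that, its parent's. The `encoding` option name, the comma-separated option syntax, and the `decode_section` function are assumptions made for illustration; the spec under review may define these differently.

```python
def decode_section(header_bytes, body_bytes, parent_encoding="utf-8"):
    """Decode one section (hypothetical helper, not the spec's API).

    The header must be pure ASCII. The body is decoded with the section's
    own ``encoding`` option if present, else with the parent's encoding.
    """
    # Raises UnicodeDecodeError if the header is not ASCII (e.g. UTF-16).
    header = header_bytes.decode("ascii")

    name, _, options = header.partition(":")
    opts = {}
    for item in options.split(","):  # assumed option separator
        if "=" in item:
            key, value = item.split("=", 1)
            opts[key.strip()] = value.strip()

    encoding = opts.get("encoding", parent_encoding)
    return name.strip(), encoding, body_bytes.decode(encoding)


# A section that declares its own encoding.
name, encoding, text = decode_section(
    b"#..meta: encoding=utf-16-le", "héllo".encode("utf-16-le"))

# A section with no declaration falls back to the parent's encoding.
_, inherited, _ = decode_section(b"#..meta:", b"abc", parent_encoding="ascii")
```

Because only the body is decoded with the declared encoding, a parser never needs to know the encoding before it can read the header, which sidesteps the chicken-and-egg concern.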

  4. docs/spec/index.rst (Diff revision 1)
     
     

    How do we parse this if the encoding is something like UTF-16? It seems like there's kind of a chicken and egg problem. I'd be way more comfortable if we specified the encoding for the main header explicitly, and maybe even for all sections.

  5. docs/spec/index.rst (Diff revision 1)
     
     
     
     

    What happens if a commit message has a line that looks like a diffx header?

    Perhaps far-fetched, but consider commits to a diffx tool.

    1. I think it would be great to have content-length=512 for, e.g., 512 bytes of content, so we know exactly how much to parse. That would avoid the issue altogether.

    2. To clarify, I mean:

      #..somesection: content-length=512
      /* 512 bytes */
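A minimal sketch of how a length-prefixed reader might behave, assuming the header form shown above (`#..somesection: content-length=N`) followed by exactly N bytes of body. The `read_section` function and the comma-separated option syntax are hypothetical, not taken from the spec.

```python
import io


def read_section(stream):
    """Read one ASCII header line, then exactly content-length body bytes.

    Hypothetical helper for illustration only.
    """
    header = stream.readline().decode("ascii").rstrip("\n")
    _, _, options = header.partition(":")

    length = None
    for item in options.split(","):  # assumed option separator
        key, _, value = item.partition("=")
        if key.strip() == "content-length":
            length = int(value)
    if length is None:
        raise ValueError("header has no content-length")

    body = stream.read(length)
    if len(body) != length:
        raise ValueError("truncated body")
    return header, body


# A commit message whose second line looks like a DiffX header.
body = b"Fix parser\n#..fake: content-length=3\n"
data = (b"#..somesection: content-length=%d\n" % len(body)
        + body
        + b"#..next: content-length=0\n")

stream = io.BytesIO(data)
first = read_section(stream)
second = read_section(stream)
```

The body is consumed by byte count rather than scanned for markers, so the header-like line inside the commit message is never interpreted, and the next real header is found immediately after it.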
      
  6. docs/spec/index.rst (Diff revision 1)
     
     
     
     

    Can we make this a comma-separated list? It's weird to have copy-modify and move-modify be their own things, and we may want to add additional types of operations (permissions change, import from some external repository, etc.), some of which might be usable in conjunction with others.
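One way the comma-separated form could be handled, sketched under assumptions: the operation names in `KNOWN_OPS` and the `parse_ops` function are hypothetical (only copy, move, and modify are implied by the combined names mentioned above), and the spec may end up choosing different names or rules.

```python
# Hypothetical set of operation names, for illustration only.
KNOWN_OPS = {"create", "delete", "modify", "copy", "move"}


def parse_ops(value):
    """Parse a comma-separated operation list into a set of names."""
    ops = {part.strip() for part in value.split(",") if part.strip()}
    unknown = ops - KNOWN_OPS
    if unknown:
        raise ValueError(
            "unknown operations: %s" % ", ".join(sorted(unknown)))
    return ops


# "move-modify" becomes two independent flags instead of one fused value.
ops = parse_ops("move, modify")
is_move = "move" in ops
is_modify = "modify" in ops
```

With this shape, adding a new operation means adding one name to the known set rather than defining every combination it can appear in.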

Barret Rennie
  1. docs/spec/index.rst (Diff revision 1)

    You are missing the author_date field!