Add the initial documentation and spec for DiffX.
Review Request #8914 — Created April 22, 2017 and updated
This goes over the rationale for why we need a new diff file format, answers questions about the whys and hows, and includes a specification for the format itself. The spec is not complete. It's an iteration over what we had on Hackpad/Notion, but additional changes will be coming to address things like the metadata format and to include more examples.
Built the docs and read through them. This is still early stages, so a
lot needs to be verified during review and after.
Description | From | Last Updated |
---|---|---|
'sys' imported but unused | reviewbot | |
'os' imported but unused | reviewbot | |
'shlex' imported but unused | reviewbot | |
zh_CN is a language code, not an encoding. Perhaps say "Shift-JIS" or "UTF-16LE" (which are both relatively common)? | david | |
I'm not sure what "the file is treated as 8-bit binary data" means here, since this seems to be explaining … | david | |
How do we parse this if the encoding is something like UTF-16? It seems like there's kind of a chicken … | david | |
What happens if a commit message has a line that looks like a diffx header? Perhaps far-fetched, but consider commits … | david | |
You are missing the author_date field! | brennie | |
Can we make this a comma-separated list? It's weird to have copy-modify and move-modify be their own things, and we … | david | |
-
I'll probably have a lot more comments as I digest this further, but here's a start.
-
zh_CN is a language code, not an encoding. Perhaps say "Shift-JIS" or "UTF-16LE" (which are both relatively common)?
-
I'm not sure what "the file is treated as 8-bit binary data" means here, since this seems to be explaining the encoding of the section itself. I'm also not sure what it means for parsing if a section defines its own encoding somewhere in the middle.
-
How do we parse this if the encoding is something like UTF-16? It seems like there's kind of a chicken and egg problem. I'd be way more comfortable if we specified the encoding for the main header explicitly, and maybe even for all sections.
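To make the chicken-and-egg concern concrete, here is a minimal sketch. The header text `#diffx: encoding=utf-16-le` and the helper `sniff_header` are hypothetical and only stand in for whatever the spec ends up defining. The sketch shows that a byte-level scan for an ASCII marker succeeds on an ASCII-encoded header but fails once the same header is written in UTF-16LE, so the parser cannot read the encoding declaration without already knowing the encoding.

```python
# Hypothetical header line; the real DiffX header syntax may differ.
HEADER = "#diffx: encoding=utf-16-le"


def sniff_header(raw: bytes) -> bool:
    """Return True if the stream starts with the ASCII marker bytes."""
    return raw.startswith(b"#diffx:")


ascii_bytes = HEADER.encode("ascii")
utf16_bytes = HEADER.encode("utf-16-le")

print(sniff_header(ascii_bytes))  # True
print(sniff_header(utf16_bytes))  # False
print(utf16_bytes[:6])            # b'#\x00d\x00i\x00'
```

In UTF-16LE every ASCII character is followed by a zero byte, which is why the marker no longer matches.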
-
What happens if a commit message has a line that looks like a diffx header? Perhaps far-fetched, but consider commits to a diffx tool.
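The ambiguity can be sketched as follows. The `#.commit:` and `#.file:` section names and the `find_headers` helper are assumptions made for illustration, not the spec's actual syntax. A scanner that treats every line with the header prefix as a section boundary cannot tell a quoted header inside a commit message from a real one.

```python
def find_headers(text: str) -> list[str]:
    """Naively treat every line starting with '#.' as a section header."""
    return [line for line in text.splitlines() if line.startswith("#.")]


# Hypothetical section names and header syntax, for illustration only.
diff = "\n".join([
    "#.commit:",
    "Fix the header parser.",
    "",
    "Lines such as",
    "#.file:",            # part of the commit message, not a real header
    "were being skipped.",
    "#.file:",            # the real file section
])

print(find_headers(diff))
# ['#.commit:', '#.file:', '#.file:']
```

The scanner reports two file sections although only the last one is real, so the format would need some rule for telling section content apart from section headers.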
-
Can we make this a comma-separated list? It's weird to have copy-modify and move-modify be their own things, and we may want to add additional types of operations (permissions change, import from some external repository, etc), some of which might be able to be used in conjunction with others.
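As a rough sketch of the comma-separated suggestion, assuming a field whose value looks like `copy, modify` (the value syntax and the `parse_ops` helper are hypothetical), a parser could turn the field into a set of independent operations, so new operation types combine freely without needing compound names:

```python
def parse_ops(value: str) -> frozenset[str]:
    """Split a comma-separated operation field into a set of operation names."""
    return frozenset(part.strip() for part in value.split(",") if part.strip())


ops = parse_ops("copy, modify")
print("copy" in ops and "modify" in ops)               # True
print(parse_ops("move,modify") == {"move", "modify"})  # True
```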