Wait until you realize that the difference between path and query string is entirely arbitrary and decided by the server. Query strings should never have existed. They are an implementation detail of CGI webservers that leaked all over everything and now smells really bad.
I dunno, it seems like the fact that we arrived at a fairly standard structure for URL paths that works pretty well is not a bad outcome.
Seems a lot better than the other potential world we could have lived in, where paths were a black box and every web server/framework invented their own structure for them.
In my current project I use URIs to refer to absolutely any entity in a git(-ish) repo. Files, branches, revisions, diffs, anything. URI turns out to be a really good addressing scheme for everything. Surprise. But the most used and abused element is always the path. Query takes a lot of that mess away. Might have been unmanageable otherwise.
Grouping data by user is common and normal in computing: /home set the precedent decades ago.
Project directories are an extremely common grouping within a user’s work sets. Yeah, some of us just dump random files in $HOME, but this is still a sensible tier two path component.
The choice to make ‘view metadata-wrapped content in browser HTML output’ the default, rather than ‘view raw file contents’, is legitimate for their usage. One could argue that using custom HTTP headers would be preferable to a path element (to the exclusion of JavaScript being able to access them, iirc?), or that the path element blob should be moved into the domain component, or that it should prefix rather than suffix the operands; all valid choices, but none inherently better or worse here.
Object hash is obviously mandatory for git permalinks, and is perhaps the only mandatory component here. (But notably, that’s not the same as a commit hash.) However, such paths could arguably be interpreted as maximally user-hostile.
File path, interestingly enough, is completely disposable if one refers to a specific result object hash within a commit, but if the prior object hash was required to be a commit, then this is a valid unique identifier for the filesystem-tree contents of that commit. You could use the object hash instead of the full path within the commit hash, but that’s a pretty user-hostile way to go about this.
So, then, which part of the ordering and path selections do you consider indiscriminate, and why?
actually, instead of the object hash, you could also use the commit-hash. then the filename would be mandatory, but the url would be more readable and usable: give me the file VERBS.md as it is at commit <hash>
Which target audience of github needs extra verbosity in the commit hash, though? Once you know it you know it; if you don’t know git you aren’t the target audience; etc. Saying /user=foo is no better than ?user=foo if your audience can work it out without confusion from your unadorned paths. We have a great deal of history with filesystems showing that people are capable of keeping up with paths that lack key names if exposed to and familiar with them, and if the filesystem isn’t being constantly randomized.
Of course there's nothing to stop you using URIs like this (I think Angular does, or did at one point?) but I don't think the rules for relative matrix URIs were ever figured out and standardised, so browsers don't do anything useful with them.
what would be a better way of doing that? i am not disagreeing, but i just can't think of any way to improve on this. put everything into the query part? i prefer to use the query only for optional arguments. in this example the blob argument is the only thing that doesn't fit in my opinion.
Every object in git (commit, tree, revision of a single file) has a hash that is guaranteed unique within a repository (otherwise many more things than a web UI would break) and likely also globally. I can understand wanting to isolate repositories to prevent hash collisions from causing problems, but within a repo everything has a universally unique ID.
edit: for instance, that specific VERBS.md is represented by the blob 3b9a46854589abb305ea33360f6f6d8634649108.
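For what it's worth, a blob ID is derived from the file contents alone, which is why it can identify the content without any path. A minimal sketch, assuming the classic SHA-1 object format; `git_blob_oid` is a name made up for this example:

```python
import hashlib


def git_blob_oid(content: bytes) -> str:
    """Return the SHA-1 object ID git would assign to a blob with this content."""
    # Git hashes a small header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Two files with identical bytes share one blob, whatever their names or paths.
print(git_blob_oid(b""))  # the well-known ID of the empty blob
```

The file name never enters the hash, so the same blob can sit under many paths and in many commits.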
> this should be sufficient to represent the file.
Except it's not, because the oid can be a short hash (https://github.com/gritzko/beagle/blob/a7e172/VERBS.md) and that means you're at risk of colliding with every other top-level entry in the repository, so you're restricting the naming of those toplevel entries, for no reason.
So namespacing git object lookups is perfectly sensible, and doing so with the type you're looking for (rather than e.g. `git` to indicate traversal of the git db) probably simplifies routing, and to the extent that it is any use makes the destination clearer for people reading the link.
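To make the collision concrete, here is a rough sketch of a hypothetical router that drops the `blob` segment and has to guess whether the first component after the repo is an abbreviated object ID. `looks_like_oid` and the 4-to-40 hex digit rule are assumptions for illustration, not GitHub's actual routing:

```python
import re

# Assumption: an abbreviated object ID is 4 to 40 lowercase hex digits.
OID_RE = re.compile(r"[0-9a-f]{4,40}")


def looks_like_oid(segment: str) -> bool:
    """True if a path segment could be read as a (possibly short) object ID."""
    return OID_RE.fullmatch(segment) is not None


# Ordinary top-level names, some of which are also valid hex strings.
for name in ["a7e172", "cafe", "deadbeef", "docs", "face"]:
    print(name, looks_like_oid(name))
```

Any top-level directory named `cafe` or `face` would be indistinguishable from a short hash, which is the naming restriction the namespace segment avoids.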
turns out that "blob", "raw" and "commit" have nothing to do with the hash itself, but are functions to describe how the object in question is to be presented. so what i said above about blob being redundant is false, the problem is rather that it is in a weird place. it should be at the end, like a kind of extension because it signifies the format of the output. except i think putting it at the end makes handling relative paths more difficult as it would have to be appended to every link to other files.
the roxen webserver has an interesting solution for that. they call it prestates and it's placed at the beginning of a url: https://github.com/(commit)/gritzko/beagle/a7e172/VERBS.md . it sets the format value visually apart, and you could have multiple prestate values separated by a comma. i have used that feature extensively on my own sites. i even expanded on the concept in custom modules.
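a rough sketch of how such a prefix could be split off, assuming only the syntax described above (a parenthesized, comma-separated list as the first path segment). `split_prestates` is a hypothetical helper, not roxen's code:

```python
from urllib.parse import urljoin


def split_prestates(path: str) -> tuple[set[str], str]:
    """Split a leading "/(a,b)" segment off a URL path.

    Returns the set of prestate names and the remaining path.
    """
    if path.startswith("/("):
        end = path.find(")/")
        if end != -1:
            names = {n for n in path[2:end].split(",") if n}
            return names, path[end + 1:]
    return set(), path


print(split_prestates("/(commit)/gritzko/beagle/a7e172/VERBS.md"))

# A relative link resolves against the same prefix, so the prestate is kept.
base = "https://github.com/(commit)/gritzko/beagle/a7e172/VERBS.md"
print(urljoin(base, "OTHER.md"))
```

because the value sits in front of the path, relative links to sibling files carry it along without anything being appended to each link.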
They are following the /key/value/key/value pattern, but the first two pairs in a GitHub URL are fixed to user and project, which lets them omit the key names. I could see them not being willing to hardcode the third pair to blob.
Back when GitHub URLs were kind of cool, github.com/user/gritzko/project/beagle would have been much less cool than just github.com/gritzko/beagle.
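A small sketch of that reading, assuming the first keys are implied by position and the rest of the path alternates key and value. `parse_path` and its `implied_keys` parameter are invented for the example, and the trailing file path is left out so the pairs stay even:

```python
def parse_path(path: str, implied_keys=("user", "project")) -> dict[str, str]:
    """Read a path as key/value pairs, with the leading keys implied by position."""
    segments = [s for s in path.split("/") if s]
    result = dict(zip(implied_keys, segments))
    rest = segments[len(implied_keys):]
    # Remaining segments alternate key, value, key, value...
    for key, value in zip(rest[0::2], rest[1::2]):
        result[key] = value
    return result


# The short form and the fully spelled-out form carry the same information.
print(parse_path("/gritzko/beagle/blob/a7e172"))
print(parse_path("/user/gritzko/project/beagle/blob/a7e172", implied_keys=()))
```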
Not entirely arbitrary - forms that use the GET method instead of POST will append form values as query params.
For sites without JavaScript, it's great for things like search boxes, tables with sorting/filtering, etc. instead of POST, since it preserves your query in the URL.
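Roughly what the browser does on submit, as a sketch: the fields are form-urlencoded and appended after `?`. `form_get_url` and the example.com search form are made up here, and the case of an action URL that already has a query is ignored:

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def form_get_url(action: str, fields: dict[str, str]) -> str:
    """Build the URL a browser requests when a GET form is submitted."""
    # Form fields are encoded as application/x-www-form-urlencoded
    # and appended after "?".
    return action + "?" + urlencode(fields)


url = form_get_url("https://example.com/search", {"q": "query strings", "sort": "date"})
print(url)

# The server (or anyone holding the link) can recover the same fields.
print(parse_qs(urlsplit(url).query))
```

The whole search state lives in the link, so bookmarking, sharing and the back button work with no script involved.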
It has always amazed me how much trouble the SPA folks are willing to go to in order to slowly rebuild just normal boring URLs with querystrings because users demand deep linking and back buttons and the like.
Or you could accept that you're probably going to need a round trip to the server and use a normal URL and it's fine.
For all but the absolute biggest websites in the world, anyhow. At Facebook or Google scale yeah it's needed.
Query strings existed before CGI did, and the way they're defined to be filled in from web forms is quite useful; I wouldn't want to need JavaScript to fit that into path format. There's nothing wrong about having things decided by the server; I don't get that part of your argument at all.
Maybe dumb question: how does the server “decide” anything other than what file to serve? Today we have many choices but back in the day CGI was the first standard way to do it.
So yes query parameters existed before CGI but to use them you had to hack your server to do something with them (iirc NCSA web servers had some magic hacks for queries). CGI drove standardization.
I'm not sure what point you're trying to make. Here it is in C, so you can run it on your computer in 1995? Because servers could make decisions in 1995.
int main() {
    int s = socket(AF_INET, SOCK_STREAM, 0);
    setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &(int){1}, sizeof(int));
A post claimed CGI led to bad standards around query parameter formatting and parsing. I was merely pointing out that, prior to the advent of CGI, if you wanted to actually do anything with those parameters on the server, you had to extend whatever primitive HTTP server you were running, write some custom code and invent your own “standard”. There were no server side frameworks or standards.
TCP has been around a long time. Listen, read, send, you're good to go. It's just software so you can make it do anything.
But you're asking about the relationship between popular, primarily file-serving servers like Apache and the high-level code that creates custom responses? Yeah, CGI was the first big standard there that I remember, though it was a bit before my time. But that's only one possible architecture.
These days, most web apps have the web server built in, and so the custom code you're writing works with the full request directly. There may be a lightweight web server in front (or multiple), like nginx, to manage connections, but they will largely just proxy the whole thing through.
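As an illustration of the built-in-server style, here is a minimal sketch using Python's standard `http.server`. The `/hello` route, the `name` parameter and the helper names `decide` and `make_server` are all invented for the example:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit, parse_qs


def decide(target: str) -> tuple[int, str]:
    """Pick a status and body from the request target; no files involved."""
    parts = urlsplit(target)
    params = parse_qs(parts.query)
    if parts.path == "/hello":
        name = params.get("name", ["world"])[0]
        return 200, f"hello, {name}\n"
    return 404, "not found\n"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # self.path holds the full request target, path and query together.
        status, body = decide(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def make_server(port: int = 8000) -> HTTPServer:
    """Create (but do not start) a server; call .serve_forever() to run it."""
    return HTTPServer(("127.0.0.1", port), Handler)
```

Running `make_server().serve_forever()` and requesting `/hello?name=hn` would exercise it. The split between path and query is whatever `decide` makes of it.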
> Query strings existed before CGI did… There's nothing wrong about having things decided by the server
Sure, but there is also no standard for how to format/parse the query string. And also no server plugin frameworks. So you are inventing your own standard and extending some HTTP server for which you have source. That lasts until CGI forces a standard; bad as it might be, it’s common ground.
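For reference, the common ground CGI provides is small: the server puts the raw text after `?` into the `QUERY_STRING` environment variable and runs the program. A sketch of the receiving side, where `cgi_params` is a made-up helper and the environment is simulated rather than coming from a real server:

```python
import os
from urllib.parse import parse_qs


def cgi_params(environ=os.environ) -> dict[str, list[str]]:
    """Parse the query string the way a CGI program receives it."""
    # CGI hands the raw text after "?" to the program in QUERY_STRING.
    return parse_qs(environ.get("QUERY_STRING", ""))


# Simulated environment for a request to /cgi-bin/search?q=url&page=2
print(cgi_params({"QUERY_STRING": "q=url&page=2"}))
```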
It's arbitrary to a degree like the difference between using an attribute or child element in XML, but it's not entirely arbitrary. If you want to include data in the URL that's not part of the hierarchy of the path, query strings are good for that.