
I think the reason everyone jumped ship from XML to JSON was that JSON is comparatively very dumb - and dumb is quick to grok.

Then some of us realised JSON is actually too dumb for many things, and instead of going back to XML we made JSON Schema, OpenAPI, etc.

Others of us thought that the main problem with JSON is that it's not human readable and writeable enough. So we came up with new formats like YAML. [EDIT: My timing is wrong here, sorry.] Unfortunately being human we could not resist making it much more complicated again, thus increasing cognitive load.

There have been many times when in order to really understand a YAML file (full of anchors and aliases, etc) I've had to turn it into JSON. This is ridiculous.
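To illustrate with a made-up config (merge keys are a YAML 1.1 feature most parsers support): a file like

```yaml
defaults: &defaults
  retries: 3
  timeout: 30

production:
  <<: *defaults
  timeout: 60
```

only shows what `production` actually contains once expanded to JSON:

```json
{
  "defaults": {"retries": 3, "timeout": 30},
  "production": {"retries": 3, "timeout": 60}
}
```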



The timing doesn't quite work out for this explanation, unfortunately. YAML is about the same age as JSON and started finding a niche as a configuration language in parallel with JSON finding use as a serialization format. Ruby on Rails 1.0 was using YAML for configuration in 2005, and it didn't even have JSON support out of the box at that point.


Serves me right for not checking! I certainly became aware of YAML long after I started using JSON. But I do think people are choosing it over JSON for its alleged improved read/write friendliness.


Indeed, back then it was "Yet Another Markup Language" (https://yaml.org/spec/history/2001-12-10.html). I remember using it to write blog posts with static site generators, like webgen, around 2004.


Interesting, I'm surprised in the opposite direction from the others replying -- I thought YAML was much older than JSON. We all encounter things at different times, I guess.


This is lovely, I didn't know. I guess this is what Kuhn was talking about: we write history in retrospect, sorting it out and preferring narrative over fact.


> Then some of us realised JSON is actually too dumb for many things, and instead of going back to XML we made JSON Schema, OpenAPI, etc.

This take doesn't make any sense at all.

JSON Schema is a validation tool for languages based on JSON. You build your subset language based on JSON, and you want to validate input data to check whether it complies with your format. Instead of writing your own parser, you specify your language with a high-level language and run the validation as a post-parsing step. Nothing in this use case involves anything resembling "too dumb".

OpenAPI is a language to specify your APIs. At most, it is orthogonal to JSON. Again, nothing in this use case involves anything resembling "too dumb".

JSON is just one of those tools that is a major success, and thus people come out of the woodwork projecting their frustrations and misconceptions onto a scapegoat. "JSON is too dumb because I developed a JSON-based format that I need to validate." "JSON is too dumb because I need to document how my RESTful API actually works." What does JSON have to do with anything?


You're making an interesting distinction between "JSON" and "languages based on JSON" there, which I don't. JSON and XML in isolation are just a bunch of punctuation and not useful. They're only useful when we know the structure of the data within them. XML already had schemas, and we were able to easily (YMMV!) validate the correctness of a document.

JSON was simpler because we would just say to each other "I'll send you a {"foo":{"bar":1234,"ok":true}}", "Cool thx" - and there wasn't a way to formalise that even if we wanted to. That doesn't scale well though. We needed a (machine-readable) way to define what the data should actually look like, thus OpenAPI etc.
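For instance, the machine-readable definition of that made-up payload might be a JSON Schema along these lines (a sketch, not any particular real API):

```json
{
  "type": "object",
  "required": ["foo"],
  "properties": {
    "foo": {
      "type": "object",
      "required": ["bar", "ok"],
      "properties": {
        "bar": {"type": "integer"},
        "ok": {"type": "boolean"}
      }
    }
  }
}
```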


> You're making an interesting distinction between "JSON" and "languages based on JSON" there, which I don't.

That's basically the root cause of your misconceptions. A document format encoded in JSON is a separate format, which happens to be a subset of JSON. A byte stream can be a valid JSON document but it's broken with regards to the document format you specified. Tools like JSONPath bridge the gap between your custom document format and JSON. This is not a JSON shortcoming or design flaw.

> They're only useful when we know the structure of the data within them.

They are only useful to you because you're complaining that you still need to parse your custom format. This is not a problem with JSON or any other document format. This is only a misconception you have with regards to what you're doing.


The "too dumb" part is that it was too machine-oriented; IMO JSONC (and now, later, JSON5) strikes a perfect compromise.

JSON5 adds back quote-less identifiers, trailing commas in objects/arrays and, most importantly, comments (which JSONC already added).

With those additions, there is little pain in writing JSON as configuration, without losing anything in terms of strictness.
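A sketch of what that looks like in practice (illustrative config, not any real tool's format):

```json5
{
  // comments (JSONC already allowed these)
  host: "localhost",      // quote-less identifiers
  ports: [8080, 8081,],   // trailing commas
}
```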


>There have been many times when in order to really understand a YAML file (full of anchors and aliases, etc) I've had to turn it into JSON. This is ridiculous.

There's no reason why an editor couldn't inline those things in YAML to help you see what's going on locally. I can't code without code navigation and stuff like type hints for inferred types.

As YAML gets used for more complex stuff I think the tooling needs to catch up.


> There's no reason why editor couldn't inline those things in YAML to help see what's going on locally.

There's no reason why an editor couldn't present whichever YAML style (block style or the JSON-like flow style) the user prefers, and save in whichever one is preferred for storage.
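For example (illustrative data), block style:

```yaml
server:
  host: a.example
  port: 80
```

and the same document in flow style:

```yaml
server: {host: a.example, port: 80}
```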


The multitude of IDE plugins and editor modes cannot (and my God should not) solve for the fundamental weaknesses of a data specification format.


Referencing common data is not a weakness, but it does introduce tradeoffs - I'd take that over having to play "have I updated every instance" whack-a-mole.


I recently wrote about the JSON/YAML limitations in the context of OpenAPI and JSON Schema:

https://ebastien.name/posts/api-design-language/

Also sharing a humble attempt at an alternative language:

https://www.oxlip-lang.org/


> There have been many times when in order to really understand a YAML file (full of anchors and aliases, etc) I've had to turn it into JSON. This is ridiculous.

Spitballing here. If underlying data is identical in JSON or YAML or whatever, why not introduce a view layer that is structure agnostic provided that the syntax can be translated without modifying the data?

I'm imagining a VSCode plugin or some view that parses the data into whatever format you'd like when you open it, then when you write it serializes it into the file format specified in the filename. You could do the same with your code review system.

Ultimately the specific syntax is for humans only, so as tooling improves why not add that next layer of abstraction? Is it because there are so many format-specific idiosyncrasies that can't translate well, due to the complex nature of a lot of these config files (gitlab-yaml, etc.)?

Just wondering, without having the time to think through the language specs properly, why we haven't seen this yet when it seems like such a huge quality of life improvement.


> Then some of us realised JSON is actually too dumb for many things, and instead of going back to XML we made JSON Schema

XML has numerous different schema languages made for it, outside of the XML standard, because XML itself is just as “dumb” as JSON in this regard, and apparently no one got schemas exactly right for it. The holy wars over XML schema languages only faded when XML’s dominance did.

> OpenAPI

OpenAPI uses JSON and JSON Schema much as SOAP uses XML, but it doesn't prove JSON is “too dumb” any more than SOAP proves that XML is.


I don't know if SOAP caused it, but since it became popular, every single tool just assumes your XML is specified by a DTD, even though that's the one schema language that nobody ever liked.

I never saw a war. AFAIK nobody ever wanted to use DTDs, but that's what everybody used because it was what everybody used.


One thing about YAML that I think conflicts with 'human readable' is the indents mattering. On long files, even with a small number of indents, it can be tough to tell whether something is in one section or another. For whatever reason, lining up braces/brackets/parentheses makes it easier for me to tell.


Encoding meaning in whitespace not only makes it difficult to verify correctness by eye (for nontrivial cases), but also is very fragile.

You're lucky if it survives a cut and paste across tools. Almost every tool ever written treats whitespace as formatting, not meaningful content, and many tools therefore assume that "fixing" whitespace will improve readability.


I'm still baffled at how semantic whitespace 'won' with YAML and Python.


Something that baffles me is that I find Python's semantic whitespace more comprehensible than YAML's. Haven't figured out why, though.


Python's use of whitespace is, dare I say, perfect.

It makes everything more readable. I've never seen cases where indentation is ambiguous.

I think this is because in a programming language indentation happens in a small set of specific situations like `for` loops and such. Either you indent right or you get an error. On the rare occasion, the wrong indentation level can assign the wrong value to a variable, but that is a rookie mistake.

Whereas in YAML, everything starts out ambiguous. Everything is context sensitive. Indents change the meanings of things in slight, unclear ways. It's a constant source of confusion.
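A made-up example of the kind of thing that bites: one level of indentation decides whether a key is nested or top-level, and nothing else in the document tells you which was intended.

```yaml
# "other" is a top-level key:
key:
  nested: 1
other: 2
```

versus

```yaml
# "other" belongs to "key":
key:
  nested: 1
  other: 2
```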


The legibility argument.

What if I told you that C-style syntax can be formatted to use indentation exactly as you prefer?


Because Python only has one semantic meaning for whitespace. YAML’s whitespace is also mixed with other symbols that change what the whitespace means, and it differs further depending on how much whitespace follows the previous indent and precedes the following symbol, if any. Or if the previous line ends with a pipe, then the only semantic meaning is “none of this whitespace matters, so much so that it’s trimmed… but the indentation is preserved after the first line”. I’m probably wrong about some or all of this! It’s been a whole day since I had to write YAML, so the YAML gnomes have rightly reclaimed the part of my brain that was sure I knew what a given bit of whitespace actually means.


In Python you split your code into functions when it grows too many indentation levels to be comprehensible.

In YAML, you have the whole application logic in one "function". Fine for small things, problematic when it grows.


Amen. My current theory: data science. Data science has become much more prevalent in the last decade, and I suspect data scientists prefer English-like syntax because they're not engineers.

Engineers use language like mechanical components. It should be precise, neat, and functional. We're designing machines. I have a feeling that data scientists don't want any of that. They want to use language as a way to describe data transformations in a format that resembles a log rather than functional instructions. Much like SQL.


The problem with braces is that as a human, you'll still rely on whitespace, and when the whitespace isn't enforced, it can be misleading.

    if (launch_button_pressed)
      prepare_missile();
      launch_missile();


That should fail a lint check in your CI and your editor/IDE should autoformat that indentation away.

You also claim that is a "problem with braces", but you're not using braces, and languages like Rust no longer allow brace-less single statements like that.


Yeah, if I had my druthers, brace-less shorthand syntax like that would never be allowed. I never use it in my own code.

Beautify takes care of all questions about what an indent means. C-syntax devs never assume an indent has syntactic meaning. We use it as a helpful hint. We innately understand that it's the braces that matter, and it's really easy to Beautify a file if the original author was sloppy.

C-syntax devs read differently. We're not reading a book. We're reading code.

And we generally strongly prefer correctness. Braces avoid all unintended bugs related to where the instructions are located on the screen. Pretty is nice. Structure and correctness are better.


gcc will give a warning for this.

  <source>:6:17: note: ...this statement, but the latter is 
  misleadingly indented as if it were guarded by the 'if'
      6 |                 launch_missile();
        |                 ^~~~~~~~~~~~~~
https://godbolt.org/z/cYqhhqz47


Which is why you configure your editor to auto format, and then those errors can't exist once you press save.


It works much better for Python than for YAML. The details matter, and YAML has all of the worst details.

(And if you want proof that the details matter, just look at Haskell, where it works perfectly well.)



> Others of us thought that the main problem with JSON is that it's not human readable and writeable enough. So we came up with new formats like YAML. [EDIT: My timing is wrong here, sorry.] Unfortunately being human we could not resist making it much more complicated again, thus increasing cognitive load.

One of the odd things about the progression is how user-hostile it is. JSON lacks support for comments, YAML has syntactically meaningful indentation and frequently deep nesting, etc.


There was a combination of reasons.

XML has plenty of problems of its own which legitimately generated a lot of hate for the format. JSON, at least superficially, didn't have many of those because it lacked (and still lacks) a lot of features. So, for a reasonable person it wouldn't be a proper comparison, but... there's the reason number 2.

JSON's rise to prominence coincided with Flash dying and the JavaScript hype train gathering momentum. Flash made a bet on XML (think E4X in the latest AS3 spec, MXML, XML namespaces in the code, etc.) Those who hated Flash for reasons unrelated to its technological merits hated everything associated with it. In particular, that hate would come from people doing JavaScript. HTML5, which was supposed to replace Flash but never did, was fueling this hype train even more.

At the time, JavaScript programmers felt inferior to every other kind of programmer. Well, nobody considered JavaScript programmers to be a thing. If you were doing something on the Web, you'd try hard to work with some other technology that compiled to JavaScript, but god forbid you actually write JavaScript. But people like Steve Yegge and Douglas Crockford worked on popularizing the technology, backed by big companies who wanted to take Adobe out of the standardization game. And, gradually, the Web migrated from Flash as a technology for Web applications to JavaScript. JSON was a side-effect of this change. JavaScript desperately needed tools to match the various abilities of Flash, and XHR seemed like it wouldn't be part of JavaScript and was in general a (browser-dependent) mess, especially when it came to parsing, but also security. JSON had the potential to exploit a hole in the Web security model by being immediately interpreted as JavaScript data, and this was yet another selling point.

To expand on the last claim: one of the common ways to work with JSON was through dynamically appending a `script` element to HTML document, then extracting the data from that element, which side-stepped XHR. There was also a variant of pJSON (I think this is what it was called, but don't quote me, it was a long time ago), where thus loaded JSON would be sent as this:

    $callback({ ... some JSON ... })
Where the `$callback` was supplied by the caller. I'm not entirely sure what this was actually trying to accomplish besides dealing with the asynchronous nature of JavaScript networking, but I vaguely remember hearing about some security benefits of doing this.
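A minimal sketch of the mechanism, with illustrative names (a real page would inject a `script` tag whose response body is the string below):

```javascript
// The caller requests e.g. /data?callback=handleData, and the server
// wraps the JSON payload in a call to that caller-supplied function:
const jsonpResponse = 'handleData({"user": "alice", "count": 3})';

// The page defines the callback as a global before injecting the tag:
let received = null;
function handleData(data) { received = data; }

// The browser executing the returned script amounts to:
eval(jsonpResponse);

console.log(received.user); // "alice"
```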

Anyways, larger point being: JSON came to life in a race to dislodge one of the dominant forces on the Web. Speed of designing the language and the speed of onboarding of new language users was of paramount importance in this race, where quality, robustness and other desirable engineering qualities played a secondary role, if at all.


As someone else said, JSONP, the callback thing you showed, is/was a same-origin workaround: script tags are legacy and can be loaded cross-origin and executed, but that doesn't expose the content of the script to you.

So you pass along the callback that you want it to execute in a query string param, the script comes back as a call to your callback, and you can then get at the data even though it's coming from a different origin. The remote side has to opt in to looking at the callback query param and giving you back JSONP, so it's kind of a poor man's CORS where the remote side is declaring that this data is safe to expose like this. Of course on the flipside, you're just getting and executing whatever Javascript the remote chooses to send, so you're trusting them more than you have to with a modern fetch/xhr using CORS.


Oh, yeah, thanks for the refresher. It's been a while!


Before CORS, JSONP was one of the few ways to work around the same-origin policy.


> HTML5 that was supposed to replace Flash, but never did was fueling this hype train even more.

My impression of what happened was that the iPhone replaced Flash, and therefore HTML5 couldn't replace Flash because Flash was already gone.

There were some nasty things about Flash, but in retrospect mobile applications are so much worse. We used to have things better than we do now.


> There were some nasty things about Flash, but in retrospect mobile applications are so much worse.

Not really. Flash was terrible on phones that supported it. There were SDKs that turned Flash applications into native iOS ones, IIRC, but otherwise Flash was a dead end once mobile started to grow.


Not really... people made these claims without any testing, as per usual.

I was on Adobe's community advisory board at the time of the iPhone fight against Flash and I'd work with all sorts of things that were supposed to be for the phones.

Flex had problems with phones. Macromedia and later Adobe built this GUI framework on top of Flash with multiple problems... performance being one of them, but the other, and perhaps more important problem was that Flex was created by people who wanted to copy Swing. In no way was it any good for making typical smartphone UI.

So, Adobe tried, but with very little commitment, to produce some sort of Flex-based "extension" for smartphones... and that thing never went beyond prototype. Also, at that exact time, Flex was transitioning to a new text rendering engine, which, while it offered more typographically pleasing options, was really, really slow to render.

People behind Flash player had some good ideas. Eg. Flash Alchemy: a GCC-based backend for ActionScript that made Flash very competitive performance-wise (but never really went beyond prototype). Around that same time a new UI framework appeared in Flash aiming to utilize GPU for rendering, which was a big step in the right direction, especially considering how "native 3D" in Flash failed (it was all on CPU, operating very heavy display objects).

None of these ideas saw much traction, in particular because Adobe's management responsible for the product lived under an illusion of invincibility. They did a little bit of something, just enough to keep the lights on, but they didn't realize they were being side-tracked until it was too late. And even then, they made some really bad choices. Instead of open-sourcing the player, they started a feud with people who wanted to maintain Apache Flex (the Adobe-abandoned Flex) because of some irrelevant IP rights on the Flash player core API. They never officially recognized Haxe. And, generally, they undermined a bunch of other projects that targeted their platform (Digrafa comes to mind).

They didn't come clean with major users of their technology, repeating "Adobe is eternally committed to supporting Flex" until they left it in the ditch and forgot all about it. They made it very, very hard to support them in whatever they were doing.

----

Bottom line, Flash could've been made to perform well on smartphones. It ran OK on what would today be called "feature phones" before smartphones existed (eg. it was available on Symbian), if you knew what you were doing.

It died because of piss-poor management on one side and monopolistic desires of mega-corporations on the other side.


> Not really... people made these claims without any testing, as per usual.

Come on… I was around at the time and played with it on phones that supported it. From a user’s perspective it was very bad. Some people desperately wanted it to happen and they had their reasons, but saying that I am negative without having tested it is pointless speculation, and wrong.


I worked for a shop that made Flash games for Symbian phones (i.e. old Nokias). That's a far more resource-constrained environment than the iPhone or Android ever were. And it ran fine, if you knew what you were doing.

When Android just appeared on the market, I worked for a company that was making a video chat Facebook app. It was written in AS3 and one of the main features was to apply various effects to video. We tested it on Android, and it worked fine, even though that's a very memory and CPU intensive app.

Really, Flash player was not the problem. It couldn't go toe-to-toe with native code, but optimized AS3 code would beat unoptimized native code.

It was some form of code-golf to write Base64 encoding in AS3 and benchmark it. Usually comparing to the implementation in Flex. When Flash Alchemy came out, I wrote a version of Base64 encoding that beat it something like 100:1. A friend of mine who was known by his forum / Github user name "bloodhound" (here's some of his stuff: https://github.com/blooddy/blooddy_crypto/tree/master/src/by... ) wrote a bunch of encoding / decoding libraries for various formats (he also improved upon my Base64 code). And these were used all over the place for things like image uploads / image generation online. This stuff would beat similar Java libraries for example.

Not sure if you remember this, but at one point in the past Facebook had a Java applet that they used to manage image uploads to your "albums". Later they replaced it with a Flash applet. It didn't work any worse, that's for sure.

----

The performance problems were in Adobe AS3 code, not the player. Flex was a very inefficiently written framework. And so were AS components. But if you take AS3 3D engines, even those that were fully on CPU... you had plenty of proper 3D games. Eg. Tanki Online (a Russian game made with Alternativa3D Flash engine) was a huge hit. Even if the phone could handle a fraction of that, you'd still have plenty of room for less complex UI.


The iPhone didn't replace Flash: the intent was to be a smartphone, not a data distribution format...

The iPhone browser, like the macOS browser, dropped support for Flash (and most plugins in fact, but Flash was the most noticeable). On the other hand, HTML5 was adopted quickly by Apple. So we can say that HTML5 replaced Flash (not the iPhone per se: first, it didn't come with a specific replacement, and second, an alternative was already there).

However, I wouldn't say that HTML5 is a drop-in replacement for Flash. It did later cover some common use cases with the video and audio tags and the standardisation of formats (which also killed the use of QuickTime/WindowsMedia/RealMedia plugins).


What do you think happened to flashgamelicense.com?


I dunno what it is. After a little search, I think you wanted to point out that game developers lost interest in it and migrated to the smartphone stores?


The reason is ease of use. XML means SAX (difficult to work with for most people) or DOM (an insane API).

JSON was just: var settings = eval("(" + jsonString + ")"); Done, nothing else, just simple access.

It's all about API design. This is also why Stripe even exists
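The difference in feel, sketched with made-up data (using the modern, safe `JSON.parse` in place of `eval`):

```javascript
// XML via DOM (browser-only, shown as comments for comparison):
//   const doc = new DOMParser().parseFromString(xml, "text/xml");
//   const port = doc.querySelector("settings > port").textContent;

// JSON: parse once, then plain property access.
const json = '{"settings": {"port": 8080, "debug": true}}';
const settings = JSON.parse(json).settings;
console.log(settings.port); // 8080
```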



