Short answer: yes.

Crawlers are based on consuming text.

HTML is text. Sites that optimize for SEO also use JavaScript to provide SEO context. The specific standard is called JSON-LD; pretty much any site you use where SEO matters has JSON-LD, RDFa, or Microdata embedded in the HTML.

You can see these structures if you use the Schema.org validator: https://validator.schema.org/

Try plugging in a URL like Reddit.com and see for yourself. On e-commerce websites, it's a *must have*. For example, try this Amazon page: https://www.amazon.com/dp/B09V3GZD32.

TL;DR: crawlers are parsing RDFa and Microdata in the HTML, or JSON-LD embedded in `<script type="application/ld+json">` tags.

You can learn more about it here: https://developers.google.com/search/docs/appearance/structu...
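
To make the TL;DR above concrete, here is a minimal sketch of how a crawler could pull JSON-LD out of raw HTML text. The helper name `extractJsonLd` and the sample page are made up for illustration, and the regex is only there to keep the example dependency-free; a real crawler would more likely use a proper HTML parser.

```javascript
// Hypothetical helper: collect every JSON-LD block found in raw HTML text.
// Assumption: a regex is good enough for a sketch; real crawlers parse the DOM.
function extractJsonLd(html) {
  const pattern =
    /<script\b[^>]*type\s*=\s*["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const results = [];
  for (const match of html.matchAll(pattern)) {
    try {
      results.push(JSON.parse(match[1]));
    } catch (err) {
      // Skip blocks whose body is not valid JSON.
    }
  }
  return results;
}

// Made-up sample page, not taken from any real site.
const sampleHtml = `
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Example Widget"}
</script>
<script>window.tracking = 1;</script>
</head><body></body></html>`;

const blocks = extractJsonLd(sampleHtml);
console.log(blocks.length, blocks[0]["@type"], blocks[0].name);
```

Note that the ordinary `<script>` block is ignored: the metadata is read as data, without executing anything on the page.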



Here's an excerpt of some Javascript found on the Amazon link:

    window.ue_ihb = (window.ue_ihb || window.ueinit || 0) + 1;
    if (window.ue_ihb === 1) {

        var ue_csm = window,
            ue_hob = +new Date();
        (function(d) {
            var e = d.ue = d.ue || {},
                f = Date.now || function() {
                    return +new Date
                };
            e.d = function(b) {
                return f() - (b ? 0 : d.ue_t0)
            };
            e.stub = function(b, a) {

Feel free to visit it to find the entire script; it is much too large to post here. What is a crawler learning from that program that would be lost if the equivalent code were bundled as WASM instead? Why couldn't its WASM parser pull out the same information? The JS/WASM runtime in the browser has to produce the same result regardless of which encoding is chosen, so everything will be encoded in there somehow.


> Why couldn't its WASM parser pull out the same information

There's currently no standard. If there's a will, there's a way.

JSON-LD is the standard for JavaScript-based metadata.


I don't get it. JSON-LD is not Javascript. It's not even spelled the same? If you mean that your Javascript is able to read JSON-LD, so too could your WASM in this hypothetical world we're talking about.


JSON is literally JavaScript Object Notation, my friend.


Which, humorously, isn't compatible with Javascript object notation. { foo: "bar" } is a valid Javascript object, but not valid JSON.
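
A quick sketch of that mismatch, using only the built-in `JSON.parse` (the variable names are just for illustration):

```javascript
// The object literal from the comment above is valid JavaScript...
const obj = { foo: "bar" };

// ...but the same text is rejected by a JSON parser, because JSON
// requires keys to be double-quoted strings.
let strictError = null;
try {
  JSON.parse('{ foo: "bar" }');
} catch (err) {
  strictError = err;
}

// With the key quoted, the text is valid JSON and parses fine.
const parsed = JSON.parse('{ "foo": "bar" }');

console.log(obj.foo, strictError !== null && strictError.name, parsed.foo);
```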

Regardless, I don't get what you are trying to say. Pretty much every language still in existence is able to work with JSON (even SQL!). JSON is not Javascript. It's not clear why moving code from the Javascript runtime to the WASM runtime would magically make JSON-LD inoperable, or whatever it is you are trying to say.



