> Crawlers are based on consuming text.

HTML is text. Sites that optimize for SEO also use JavaScript to provide SEO context. The specific standard is called JSON-LD; pretty much any site you use where SEO matters has JSON-LD, RDFa, or Microdata embedded in the HTML. You can see these structures with the Schema.org validator: https://validator.schema.org/

Try plugging in a URL like Reddit.com and see for yourself. On e-commerce websites, it's a *must have*. For example, try this Amazon page: https://www.amazon.com/dp/B09V3GZD32. You can learn more about it here: https://developers.google.com/search/docs/appearance/structu...
TL;DR: crawlers are parsing RDFa and Microdata in the HTML, or JSON-LD embedded in `<script/>` tags.
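For illustration, here is a minimal JSON-LD block of the kind crawlers look for; the product values are made up:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Example Widget",
  "offers": {
    "@type": "Offer",
    "price": "19.99",
    "priceCurrency": "USD"
  }
}
</script>
```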
Here's an excerpt of some JavaScript found on the Amazon link:
```javascript
window.ue_ihb = (window.ue_ihb || window.ueinit || 0) + 1;
if (window.ue_ihb === 1) {
    var ue_csm = window,
        ue_hob = +new Date();
    (function(d) {
        var e = d.ue = d.ue || {},
            f = Date.now || function() {
                return +new Date
            };
        e.d = function(b) {
            return f() - (b ? 0 : d.ue_t0)
        };
        e.stub = function(b, a) {
            // … (excerpt truncated)
```
Feel free to visit the page to find the entire script; it is much too large to post here. What is a crawler learning from that program that would be lost if the equivalent code were bundled as WASM instead? Why couldn't its WASM parser pull out the same information? The JS/WASM runtime in the browser has to produce the same result regardless of which encoding is chosen, so everything will be encoded in there somehow.
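To make that concrete: whichever runtime produces the structured data, the glue that puts it into the page looks the same, and a crawler that renders the DOM sees the same final tag. A minimal sketch, where `wasmExports.productJson()` is a hypothetical export invented for illustration:

```javascript
// Inject a JSON-LD tag into the page. A crawler reading the rendered
// DOM sees the same <script type="application/ld+json"> tag regardless
// of where the string came from.
// `wasmExports.productJson()` is a hypothetical WASM export.
const json = wasmExports.productJson(); // could just as well be a JS string literal
const tag = document.createElement('script');
tag.type = 'application/ld+json';
tag.textContent = json;
document.head.appendChild(tag);
```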
I don't get it. JSON-LD is not JavaScript. It's not even spelled the same. If you mean that your JavaScript is able to read JSON-LD, then so too could your WASM in this hypothetical world we're talking about.
Which, humorously, isn't compatible with JavaScript object notation: `{ foo: "bar" }` is a valid JavaScript object literal, but not valid JSON, because JSON requires keys to be double-quoted strings.
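A quick way to see the difference in any JS console:

```javascript
const obj = { foo: "bar" };       // fine: JavaScript object literal
JSON.parse('{ "foo": "bar" }');   // fine: keys are double-quoted strings
JSON.parse('{ foo: "bar" }');     // SyntaxError: not valid JSON
```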
Regardless, I don't get what you are trying to say. Pretty much every language still in existence can work with JSON (even SQL!). JSON is not JavaScript. It's not clear why moving code from the JavaScript runtime to the WASM runtime would magically make JSON-LD inoperable, or whatever it is you are trying to say.
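And a crawler doesn't even need a runtime to get at it. A minimal sketch, assuming Node.js and a regex-based extraction that is illustrative rather than production-grade:

```javascript
// Pull JSON-LD out of raw HTML as plain text; no JS (or WASM)
// execution involved, which is essentially what a crawler does.
const html = [
  '<html><head>',
  '<script type="application/ld+json">',
  '{"@context": "https://schema.org", "@type": "Product", "name": "Example Widget"}',
  '</script>',
  '</head><body></body></html>',
].join('\n');

const re = /<script type="application\/ld\+json">([\s\S]*?)<\/script>/g;
for (const [, body] of html.matchAll(re)) {
  console.log(JSON.parse(body)); // structured data, parsed straight from text
}
```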