My 10,000 ft layperson's view, to which I invite corrections, is broadly:

- The semantic web set off with extraordinarily ambitious goals, which were largely impractical

- The entire field was trumped by Deep Learning, which takes as its premise that you can infer relationships from the exabytes of human rambling on the internet, rather than having to laboriously encode them explicitly

- Deep Learning turns out not to be a panacea after all, but more like a very clever parlour trick; put otherwise, intelligence is more than linear algebra, and "real" intelligences aren't completely fooled by a single pixel changing colour in an image, etc.

- Hence, we have come back round to point 1 again

?



>The entire field was trumped by Deep Learning, which takes as its premise that you can infer relationships from the exabytes of human rambling on the internet, rather than having to laboriously encode them explicitly

I don't think machine learning can ever replace data modeling, because data modeling is often creative and/or normative. If we want to express what data must look like and which relationships there should be, then machine learning doesn't help and we have no choice but to laboriously encode our designs. And as long as we model data we will have a need for data exchange formats.

You could categorise data exchange formats as follows:

a) Ad-hoc formats with ill-defined syntax and ill-defined semantics. That would be something like the CSV family of formats or the many ad-hoc mini formats you find in database text fields.

b) Well-defined syntax with externally defined, often informal, semantics. XML and JSON are examples of that.

c) Well-defined syntax with some well-defined formal semantics. That's where I see Semantic Web standards such as RDF (in its various notations), RDFS and OWL. (A rough sketch of the contrast between (b) and (c) follows below.)
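
To make the contrast between (b) and (c) concrete, here is a rough sketch in Python. The json module and the rdflib package are just my choice of tools for illustration, and the names and IRIs are made up:

    # (b): well-defined syntax, but what "name" and "knows" mean lives in
    # out-of-band documentation; nothing in the payload itself says so.
    import json
    doc = json.loads('{"name": "Alice", "knows": "bob"}')

    # (c): the same statements as RDF in Turtle notation; the predicates are
    # globally identified IRIs (here FOAF terms), so their meaning is shared
    # across producers and consumers rather than agreed per application.
    from rdflib import Graph
    g = Graph()
    g.parse(data="""
        @prefix foaf: <http://xmlns.com/foaf/0.1/> .
        <http://example.org/alice> foaf:name "Alice" ;
                                   foaf:knows <http://example.org/bob> .
    """, format="turtle")
    for s, p, o in g:
        print(s, p, o)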

So if the task is to reliably merge, cleanse and interpret data from different sources, then we can achieve that with less code on the basis of (c)-type data exchange formats.
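
As a hedged illustration of that "less code" point, again with rdflib and invented IRIs: parsing several sources into one graph is already a well-defined RDF merge, and a single SPARQL query then spans all of them:

    from rdflib import Graph

    g = Graph()
    # Triples about the same IRI from different sources simply accumulate
    # on the same node; no bespoke join/merge code is needed.
    g.parse(data="""
        @prefix foaf: <http://xmlns.com/foaf/0.1/> .
        <http://example.org/alice> foaf:name "Alice" .
    """, format="turtle")
    g.parse(data="""
        @prefix foaf: <http://xmlns.com/foaf/0.1/> .
        <http://example.org/alice> foaf:mbox <mailto:alice@example.org> .
    """, format="turtle")

    # One query now sees the combined view of both sources.
    for row in g.query("""
        PREFIX foaf: <http://xmlns.com/foaf/0.1/>
        SELECT ?name ?mbox WHERE {
            ?person foaf:name ?name ;
                    foaf:mbox ?mbox .
        }
    """):
        print(row.name, row.mbox)

Getting the same behaviour out of two arbitrary JSON feeds would mean hand-writing the join and cleanup logic for each pair of sources.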

But it seems we're stuck with (b). I understand some of the reasons. The Semantic Web standards are rather complex and at the same time not powerful enough to express all the things we need. But that is a different issue than what you are talking about.



I think you are spot on.

I think what we'll see is Deep Learning/Human Editor "Teams".

DL will do the bulk of the relationship encoding, but human domain experts will do "code reviews" on the commits made by DL agents.

Over time fewer and fewer commits will need to be reviewed, because each one trains the agent a bit more.



