Hacker News

You know, I just can't shake the 90's from my architectural perspective. Back then, we tried making domain models for everything; using a relational database is great, but you have it as a backing store for the domain model, and that model is your business contract that's exposed to the world. The domain model is what you publish: that's what your web services see, that's what your GUIs see, and if you're publishing replication events out to totally other kinds of systems that aren't relational databases, that's the model you publish there as well. Back then, one of the dragons we were slaying was the "stored procedure wall on top of SQL Server" API model, where relational databases had half-broken APIs exposed as crappy SPs and views that the whole world would consume directly from an ODBC connection over many network hops. It was a great way to kill any database, and schema changes were enormous beasts since the whole world was hardcoded to your schema.

The article here, as well as the Yelp one, seems to be interested in streaming out the MySQL database schema itself: not just data changes but schema changes as well. If you're doing this as part of a bigger MySQL replication scheme, e.g. the stream is just consumed by other MySQL databases that agree to share an identical schema, that's great. But the context here seems to be more along the lines of a swarm of microservices, which we would expect are each dealing with their own private set of MySQL data, are not sending INSERT/UPDATE/DELETE statements to each other, and instead are speaking more in terms of the domain model. My 90's-esque sense is that if you've built up lots of services, the messages you want flying around the network should be in terms of the domain models of those services, not your raw SQL / DDL.
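A minimal sketch of that distinction, with all names invented for illustration (this is not any real CDC library's API): the service keeps its schema private and publishes domain events, so consumers depend on the event contract rather than on raw binlog rows.

```python
from dataclasses import dataclass

@dataclass
class RowChange:
    """A raw change event as a CDC tool might emit it: table, op, column values."""
    table: str
    op: str      # "INSERT", "UPDATE", "DELETE"
    row: dict

@dataclass
class OrderPlaced:
    """A domain event: consumers depend on this contract, not on the schema."""
    order_id: str
    total_cents: int

def translate(change: RowChange):
    """Map a private schema change onto the published domain model.
    Returns None for rows that are internal bookkeeping."""
    if change.table == "orders" and change.op == "INSERT":
        return OrderPlaced(order_id=change.row["id"],
                           total_cents=change.row["amount_cents"])
    return None  # internal tables never leave the service

event = translate(RowChange("orders", "INSERT",
                            {"id": "o-1", "amount_cents": 995}))
```

A column rename in the private schema only touches `translate()`; the `OrderPlaced` contract the other services consume stays stable.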

Disclaimer: this is an "I'm an idiot" comment because I think I'm missing something

Disclaimer: I'm the author of SQLAlchemy so yes, I get the SQL/DDL/ORM thing



I was reading your comment and getting ready to tell you that I agreed, and that SQLAlchemy helps do pretty much what you're suggesting. Then I got to the bottom. Thanks for what you do!

But yeah, I might be missing something too, but by and large I've lived through a lot of code that depends on a specific database schema, rather than a domain model or some other abstraction. Even when database changes don't break things, there is so. much. regression. testing. before anyone is confident.


I think asserting the primacy of a domain model over a relational model - in particular, an OO domain model - is a recipe for pain, especially in a polyglot world. OO is for modelling communicating agents where coordinated behaviour is the desired outcome. Databases are for storing facts, not serialized objects; the ideal relational model would be pure enough that rows are almost Horn clauses.
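The "rows are almost Horn clauses" point can be made concrete with a toy example (the family data here is invented): base tables are sets of ground facts, and a derived relation is just a rule body joined over them.

```python
# Base table: the fact parent(X, Y), stored as a set of tuples (rows).
parent = {("alice", "bob"), ("bob", "carol")}

# Horn-clause style rule: grandparent(X, Z) :- parent(X, Y), parent(Y, Z).
# The comprehension is the rule body; the join variable Y must match.
grandparent = {(x, z) for (x, y1) in parent
                      for (y2, z) in parent if y1 == y2}
```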

I think OO is particularly poor at modeling the kinds of problems I've been involved with - all our perspectives are probably path dependent. My domains have involved lots of data transfer, manipulation, ETL, and transmitting tuples from one language to another; the interface between different modules is at the data layer, not the API layer, because the business of our business is working with the customer's data. In so far as there are core objects, they're things like schemas and transformation rules - configuration, metadata, information about the data and its transformations.

I particularly like the way it's possible to express global constraints over sets of data with relational algebra in a declarative style. Databases aren't (usually) built to evaluate these kinds of constraints, and I don't want to build stored procedure APIs or triggers that emulate them, but the conceptual model is powerful, much more powerful and expressive to me than anything I see in OO.
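One way to sketch such a global constraint, with an invented schema: "every order's total must equal the sum of its line items." Most databases won't enforce this as a CHECK across tables, but it can be stated declaratively as a single query whose result must be empty.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, total INTEGER);
    CREATE TABLE items  (order_id INTEGER, price INTEGER);
    INSERT INTO orders VALUES (1, 30), (2, 99);
    INSERT INTO items  VALUES (1, 10), (1, 20), (2, 50);
""")

# The constraint, stated once, over whole sets of rows: any order whose
# claimed total disagrees with the sum of its items is a violation.
violations = con.execute("""
    SELECT o.id FROM orders o
    LEFT JOIN items i ON i.order_id = o.id
    GROUP BY o.id
    HAVING o.total <> COALESCE(SUM(i.price), 0)
""").fetchall()
# Order 2 claims 99 but its items sum to 50, so it is the lone violation.
```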

Wirth once wrote: Algorithms + Data Structures = Programs. I'd put Data Structures first; and in fact when I want to understand a big program, it's almost always the data structures and data flow which reveal more than the control flow does.


> I think asserting the primacy of a domain model over a relational model - in particular, an OO domain model - is a recipe for pain, especially in a polyglot world.

It's unfortunate that, in common usage, "domain model" basically means "OO domain model". A relational model, one which properly reflects your domain, is certainly as worthy of that name as an OO one.

We can, of course, debate whether simply sharing a relational database among multiple applications is good architecture. It very well might not be, due to the missing intermediate layer that can absorb schema changes, provide caching, etc., or due to the crappiness of the languages used to write stored procedures and databases generally not being great application servers (although PostgreSQL seems to be getting there...).

But that is not because the relational model isn't a "domain model".
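The "intermediate layer that can absorb schema changes" can be as thin as a database view; a minimal sketch with invented table names, using SQLite for brevity. Consumers query the view, so renaming a base column only requires redefining the view, not changing every application.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- The private, current schema: the column was renamed to full_name.
    CREATE TABLE customers_v2 (id INTEGER PRIMARY KEY, full_name TEXT);
    INSERT INTO customers_v2 VALUES (1, 'Ada Lovelace');
    -- The published contract: old consumers still see a column called name.
    CREATE VIEW customers AS
        SELECT id, full_name AS name FROM customers_v2;
""")

# A consumer written against the old contract keeps working unchanged.
rows = con.execute("SELECT id, name FROM customers").fetchall()
```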


Totally agree!

We once came very close to implementing a very similar scenario (stream of events, some lightweight processing, Kafka for distribution, HBase/Solr for storage). If you go for such a thing, make sure your business analysts are on the same page as you. Otherwise, instead of a clean event sourcing model, you may end up with some sort of message protocol that spans the schema, data format, business logic, and weird requirements. It is just that this thing will become your new "model".
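For contrast, the "clean event sourcing model" can be stated in a few lines (event names invented): state is rebuilt purely by folding domain events, so the events themselves are the contract, not an ad-hoc message protocol.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Deposited:
    amount: int

@dataclass(frozen=True)
class Withdrawn:
    amount: int

def apply(balance: int, event) -> int:
    """Fold one domain event into the current state."""
    if isinstance(event, Deposited):
        return balance + event.amount
    if isinstance(event, Withdrawn):
        return balance - event.amount
    raise TypeError(f"unknown event: {event!r}")

def replay(events) -> int:
    """Rebuild state from the full event log, starting from zero."""
    balance = 0
    for e in events:
        balance = apply(balance, e)
    return balance

log = [Deposited(100), Withdrawn(30), Deposited(5)]
balance = replay(log)  # 75
```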


Exactly this. This WePay post is well written and interesting, and DBZ sounds pretty neat. However, I think the use case they present is an antipattern that incentivizes leaky abstractions and greases their wheels. Hell is having ETLs break because someone refactored the OLTP...



