For a book, I would keep the text itself as something very simple, such as a big old list of lines. A chapter is a range of lines. A section is also a range of lines. Adding subsections is a matter of adding a field to Book and it is (surprise) a range of lines, too.
You could implement this as a byte offset if you wanted, but you get the idea. If the text of the book is data, the rest is metadata. IMO it should live with the data, preferably in as lightweight a fashion as possible. Adding more metadata is a matter of adding fields, but as long as you stick to a basic form of representation, you should be fine.
Your helper methods help connect interrelated pieces of data. If you want all the chapters in a set of pages, you check for which chapters either overlap or are a subset of the page range you got.
If you want to add footnotes to pages, for example, you might define a footnotes field. That field has the text of the footnote and the line that it points to, perhaps with an offset to denote specifically which word it wants. (Maybe this is an argument for using byte offsets in the first place.)
Is this roughly what you had in mind, or have I completely missed the mark?
> Is this roughly what you had in mind, or have I completely missed the mark?
Maybe, but I'm not sure. If you don't encapsulate the data behind an API, then I suspect that you will eventually be sorry. And if you do encapsulate the data behind an API, then you're doing OO.
You may be right that your design for the representation of the book may be better than the typical naive OO representation, but there's nothing that forces an internal OO representation to be naive, or for an FP one to be sophisticated.
I guess what I'm saying is that I still don't understand how the FP approach is supposed to be different from the OO approach using the Law of Demeter. With the law of Demeter, methods only return data, not other objects ("two dots: bad"), so it's not as object-filled as a more naive OO approach. And in a case such as your FP representation for a book, where the representation is bound to be rather complicated, it seems that you will surely want to encapsulate it, in which case, the FP approach has adopted some OO. I.e., the FP approach and the OO approach have met in the middle and have become the same.
Incidentally I'm about done watching one of Rich Hickey's talks about complexity, the one called Simple Made Easy. Thanks for the recommendation; it has given a me a lot to think about. Are there more, or was that the one you were referring to originally?
"You may be right that your design for the representation of the book may be better than the typical naive OO representation, but there's nothing that forces an internal OO representation to be naive, or for an FP one to be sophisticated."
Well, there is, in that OO is inherently more complicated, in the most trivial sense; compare an idiomatic implementation of storing, incrementing, and displaying an integer in Java versus C. At a minimum, for Java, you're going to want an object. For C, you neither need nor want any such thing; a function or two and an integer are sufficient.
If the only thing that exists is a bag of data and code which pulls out that data, you have a much simpler system. I don't think this is controversial; an object is the same thing, in essence, except that due to language implementations and semantics, there are additional layers of complexity that are inherent and inextricable. Hickey goes on about this better than I could, although one of my favorite recent discoveries from C++ is the explicit keyword. The point is that with purity and referential transparency, it is far far easier to reason about you program. Pure functions are orthogonal, no?
I have heard of the Law of Demeter, but in practice, I'm not sure I see it obeyed that often if ever. APIs tend to return objects, or assume you'll chain method calls, or what have you. It's a fact of life, at least in my experience with Android development as well as vanilla Java. Maybe there's some subtlety that I'm missing out on? Otherwise it seems like a dud, if only because nobody wants to maintain those niggling chains of accessor calls.
I think that's really the crux of the issue: culturally, OO tends to value taxonomy in the form of object hierarchies; polymorphism; inheritance; and the like. All of that adds complexity. No, there's no reason why OO has to rely on mutable state and hierarchies. But the conventional wisdom about best practices is a powerful influence. You could compare theoretical implementations, but honestly, taking a look around would probably give you a better idea of how this theoretical scenario is likely to play out. That's the basis for comparison, here: Hickey is contrasting a simple approach with software development as it happens now, typically, not as it might or ought to happen.
Also, for what it's worth, I don't think encapsulation is irretrievably the province of OO. APIs are not a concept over which OO has exclusive ownership, right? The idea that you might want to sequester state goes hand in hand with concepts of purity and referential transparency. Or the idea that you should use a function which implements an algorithm as an entry-point to a data structure rather than muck about in it yourself (think of a heap).
> Are there more, or was that the one you were referring to originally?
That's the one. Though there are two with that name, I believe. One for Strange Loop and one for a Ruby conference. The two talks are mostly the same, but it's worth watching both. Then watch all other talks by Rich Hickey too, as they are all excellent.
> At a minimum, for Java, you're going to want an object
Java, if you ask me is a degenerate case. It's a language that pushes consistency at all costs, in the face of any common sense. Most other OO languages don't pressure you into programming this way. I.e., they give you normal functions in addition to methods.
Though when programming in Java, I am personally not averse at all to just putting functions into an empty utility class that exists solely for holding nothing but static methods. Scala actively encourages doing this, as it provides for a type of object called a "package object" that exists for this particular purpose.
> I have heard of the Law of Demeter, but in practice, I'm not sure I see it obeyed that often if ever.
Maybe people in the trenches don't typically obey "best practices", but the OO best practices gurus all seem to push the Law of Demeter these days. I'm sure it's difficult to figure out how to do well. But then again, so is figuring out how to apply Rich Hickey's advice on simplicity!
> nobody wants to maintain those niggling chains of accessor calls.
Nobody wants to be locked into an inflexible API either, but I agree that people are lured by what is easy!
Hickey is contrasting a simple approach with software development as it happens now, typically, not as it might or ought to happen.
It would be interesting to know how what Hickey would do is different from what OO best practices gurus recommend. Indeed it's clear that in the real world, people don't often follow either advice.
> Also, for what it's worth, I don't think encapsulation is irretrievably the province of OO.
For the most part, it's usually considered to be. API's surely existed before OO, but they were typically for one program to talk to another, not for communication within a program. Before OO, different parts of programs typically revealed all their data structures, and you were allowed to go to town on them. Though, it's true that for certain things you probably weren't supposed to do that. E.g., as you said, on the malloc heap in C.
I believe that encapsulation as we know it today was invented by the CLU programming language. It was certainly the first language to enforce encapsulation. CLU isn't necessarily what we would consider an to be OO language today, as it didn't have inheritance, but it considered itself an OO language at the time.
You could implement this as a byte offset if you wanted, but you get the idea. If the text of the book is data, the rest is metadata. IMO it should live with the data, preferably in as lightweight a fashion as possible. Adding more metadata is a matter of adding fields, but as long as you stick to a basic form of representation, you should be fine.
Your helper methods help connect interrelated pieces of data. If you want all the chapters in a set of pages, you check for which chapters either overlap or are a subset of the page range you got.
If you want to add footnotes to pages, for example, you might define a footnotes field. That field has the text of the footnote and the line that it points to, perhaps with an offset to denote specifically which word it wants. (Maybe this is an argument for using byte offsets in the first place.)
Is this roughly what you had in mind, or have I completely missed the mark?