A thermometer doesn't measure temperature any better than a meterstick measures length. And we all know what Einstein had to say about the relativity of metersticks.
To paraphrase from the paper I linked in another reply to you, a thermometer is just a heat bath equipped with a pointer which reads its average energy, whose scale is calibrated to give the temperature T, defined by 1/T = dS/d<E>.
You can read the thermometer if you like, but if you know the exact microstate of the water to begin with, the thermometer reading will tell you much less than you already knew about the water. And precise knowledge of the water's microstate will (theoretically) allow you to extract much more work from that water than you would be able to with only the thermometer reading.
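A back-of-the-envelope version of that last claim (this is the standard Szilard/Landauer bound, stated here as an illustration, not a result quoted from the linked paper): each extra bit of microstate information you hold beyond the thermometer reading is worth at most

```latex
W_{\max} = N \, k_B T \ln 2
```

of additional extractable work, where N is the number of extra bits known, k_B is Boltzmann's constant, and T is the bath temperature.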
You seem pretty convinced. Let me see if you're talking about the same pedantic distinction that oh_my_goodness was.
A: "an urn containing either a white ball or a black ball".
B: "I notice that the ball in urn A is white".
I would say that initially the entropy of our ball-urn system is 1 bit, and that with observation B, we have reduced the entropy of our ball-urn system to 0 bits.
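Those two numbers fall straight out of the Shannon formula. A minimal sketch (the function name `entropy_bits` is my own, not from the thread):

```python
from math import log2

def entropy_bits(probs):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Before observation B: white and black are equally likely -> 1 bit.
before = entropy_bits([0.5, 0.5])

# After observation B: the ball is known to be white -> 0 bits.
after = entropy_bits([1.0])

assert before == 1.0 and after == 0.0
```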
But if you are going to take the view that even knowing the ball in this particular urn is actually white doesn't change the fact that the entropy of "an urn containing either a white ball or a black ball" is 1 bit and not 0, and that that's the entropy that we're discussing, then I won't argue about it any further.
The entropy of the system was always 0 bits. Knowledge is irrelevant.
If the urn actually contained nothing and would materialize a black or a white ball randomly, then this can occur with or without your knowledge. When the ball materializes and nothing more can be done, THEN the entropy has changed, because there are no more possible microstates.
You not having knowledge about microstates DOES NOT change available microstates. You seem to think that if you don't know about something, anything goes.
You're really arguing abstract philosophy. Did a tree in the forest fall if no one was around to see it? Yes it did dude. Your knowledge of it has nothing to do with whether it fell. Same with entropy. And if you deny that the tree in the forest fell, then you're the one going off onto a pedantic tangent.
I'm surprised to hear that "the entropy of the system was always 0 bits."
Let's say that the urn contains a ball that changes its color from black to white and vice versa every thousand years (relative to January 1, 1970 midnight UTC).
Given that "macrostate" there are two possible and equally probable "microstates". The entropy is positive. If I look into the urn and find that the ball is white, was the entropy of the system always zero? Or is it always positive in this example?
What I'm seeing in your other reply is you not even reading my reply. You're making statements about things I already touched upon.
It makes you seem unintelligent. But that would be a rude thing to say, wouldn't it? There's no point. If you want to have a discussion, it's probably smart to say things that will keep the other person engaged rather than pissed off.
This is actually bad enough that I demand an apology. Say you're sorry, genuinely, or this thread won't continue and you're just admitting you're mean.
I really mean what I wrote: that answer is consistent with other things that you wrote indicating that you view the macrostate as a theoretical collection of microstates which is defined under some assumptions which may be detached from what is known about the state of the system.
So you think that it's still somehow meaningful to refer to the macrostate "the ball may be black or white" and the associated entropy even if the color of the ball is known.
The coherence is in discarding the knowledge of the microstate / the future outcomes of the die / the color of the ball and claiming the original models are still valid (which they may be for some purposes - when that additional knowledge is irrelevant - but not for others).
If you had said "the macrostate I originally chose becomes meaningless because I know the microstate and the entropy is zero now [or was zero all along, as in your previous comment]" it would have been less coherent with the rest of your discourse - and it would have merited a longer reply.
>So you think that it's still somehow meaningful to refer to the macrostate "the ball may be black or white" and the associated entropy even if the color of the ball is known.
Yes. A probability distribution can still be derived from a series of events EVEN when the outcomes of those exact events are known prior to their actual occurrence.
I believe this is the frequentist viewpoint as opposed to the Bayesian viewpoint.
The argument comes down to this, since entropy is really just derived from probability and the law of large numbers.
I would say that - in this particular example - knowing that the ball is white and will remain white for the next 950 years and will then be black for the next 1000 years, etc. makes the macrostate "the ball may be black or white" irrelevant.
However, I agree that one can still define the macrostate as if the color was unknown - and make calculations from it. (It's just that I don't see the point - it doesn't seem a good or useful description of the system.)
If I can look at data collected from the past and generate a probability distribution, why can't I look to the future and do the same thing? Even with knowledge of the future you can still count each event and build up a probability from future data.
Think of it this way. Nobody looks at a past event and says that the probability of that past event was 100%. Same with a future event that you already know is going to occur. The probability number is communicating frequency of occurrence across a large sample size. Probability is ALSO independent of knowledge from the frequentist viewpoint. Entropy is based off of probability, so it is based off this concept. Knowledge of a system doesn't suddenly reduce entropy because of this.
That's not the only definition of probability - and it's a very limiting one.
Statistical mechanics is based on the probability of the physical state now and not on the frequency of physical states over time. Sometimes we can assume a hypothetical infinite time evolution and use averages over that, but a) it's just a way of arriving at the averages over ensembles that are really the object of interest, b) the theoretical justification is controversial and in some cases invalid, and c) it doesn't make sense at all in non-equilibrium statistical mechanics.
Don't be offended, but "Nobody looks at a future event that he already knows is going to occur and says that the probability of that future event is 100%" is a very strange thing to say. Everybody does that! Ask astronomers what's the probability that there is a total solar eclipse in 2024 and they answer 100%, for example.
>Don't be offended, but "Nobody looks at a future event that he already knows is going to occur and says that the probability of that future event is 100%" is a very strange thing to say.
Not offended by your words but I am offended by the way you think. Maybe rather than assuming I'm wrong and you're superior, why don't you just ask questions and dig deeper into what I'm talking about.
Probability is a mathematical concept separate from reality. Like any other mathematical field it has a series of axioms, and the theorems built off those axioms are independent of our typical usage of it in applications.
Just because it's rare for probability to be used on future events that are already known doesn't mean the math doesn't work. We tend to use applied math, specifically probability, to predict events that haven't occurred yet, but the actual math behind probability is time invariant. It can be applied to events that have ZERO concept of time. See Ulam spirals. In Ulam spirals prime numbers have a higher probability of appearing at certain coordinates. This probability is determined independent of time. We have deterministic algorithms for calculating ALL primes, yet we can still write out a probability distribution. Probability still has meaning EVEN when the output is already known.
That means I can look at a series of known past events and calculate a probability distribution from there. I can also look at a series of known future events and do the same thing. I can also look at events not involving time like where prime numbers appear on a spiral and calculate a distribution. Just look at the math. All you need are events.
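The Ulam-spiral statistics take a fair amount of code to reproduce, so here is a simpler sketch in the same spirit (my own choice of example, not from the thread): primes are fully deterministic, yet a frequency distribution over their last digits is still perfectly well defined.

```python
from collections import Counter

def primes_up_to(n):
    """Simple sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

primes = primes_up_to(10_000)

# Every prime is known in advance, yet a frequency distribution
# over their last digits is still perfectly well defined.
counts = Counter(p % 10 for p in primes if p > 10)
total = sum(counts.values())
dist = {d: counts[d] / total for d in sorted(counts)}
print(dist)  # roughly uniform over the digits 1, 3, 7, 9
```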
English and traditional intuitions around probability are distractions from the actual logic. You used an English sentence to help solidify your point, but obviously our arguments are way past surface intuitions and typical applications of probability.
Look up the frequentist and Bayesian interpretations of probability. That is the root of our argument. You are arguing for the Bayesian side, I am arguing for the frequentist.
I'm aware that there are different interpretations of probability. I said as much in the first line of my previous message!
You may be content with an interpretation restricted to talking about frequencies. I prefer a more general interpretation which can also - but not exclusively - refer to frequencies.
Even from a frequentist point of view, I find your suggestion perplexing that nobody says the probability of something is 100% when they are able to predict the outcome with certainty.
Probability may be a mathematical concept separate from reality, but when it's applied to say things about the real world not all probability statements are equally good - just like the "moon made of cheese" model is not as good as any other even if it's mathematically flawless. This has nothing to do with Bayesian vs frequentist, by the way; empirical frequencies are not mathematical concepts separated from reality.
It's a perfectly frequentist thing to do to compare a sequence of probabilistic predictions to the realised outcomes to see how well-calibrated they are.
The astronomer that predicts P(total solar eclipse in 2023)=0%, P(t.s.e. 2024)=100%, P(t.s.e. 2025)=0%, P(t.s.e. 2026)=100%, etc. will score better than one who predicts P(t.s.e. 2023)=P(t.s.e. 2024)=P(t.s.e. 2025)=P(t.s.e. 2026)=2/3 or whatever is the long run frequency.
A weather forecaster that looks at satellite images will score better than one that predicts every day the average global rainfall. Being a frequentist doesn't prevent you from trying to do as well as you can.
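That calibration comparison can be made concrete with a Brier score (my choice of scoring rule; the eclipse years are the ones from the example above):

```python
def brier(preds, outcomes):
    """Mean squared error of probabilistic predictions; lower is better."""
    return sum((p - o) ** 2 for p, o in zip(preds, outcomes)) / len(outcomes)

# Eclipses actually occur in 2024 and 2026, as in the example above.
outcomes = [0, 1, 0, 1]                    # 2023, 2024, 2025, 2026

informed = brier([0, 1, 0, 1], outcomes)   # 0.0: certain and correct
long_run = brier([2/3] * 4, outcomes)      # ~0.278: long-run frequency only
assert informed < long_run
```

Nothing about computing the informed forecast requires Bayesianism; it is just the better-scoring frequentist prediction.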
I did already agree that you _can_ keep your model where the millenary-change ball may be either white or black and make calculations from it. (It just doesn't seem to me a good or useful description of that system once the precise state is known. You _can_ also change the model when you have more information and the updated model is objectively better. I think we will agree that from frequentist point of view predicting a white ball with 100% probability and getting it right every time is more accurate than a series of 50%/50% predictions. And the refined model can calculate the loooooong-term frequency of colours just as well.)
>It's a perfectly frequentist thing to do to compare a sequence of probabilistic predictions to the realised outcomes to see how well-calibrated they are.
Yeah, but it's a philosophical point of view. The Bayesian sees this calibration process as the probability changing with more knowledge. The frequentist sees it as exactly what you described: a calibration... a correction of what was previously a more wrong probability.
>A weather forecaster that looks at satellite images will score better than one that predicts every day the average global rainfall. Being a frequentist doesn't prevent you from trying to do as well as you can.
Look at this way. Let's say I know that the next 100 days there will be sunny weather every day except for the 40th day, the 23rd day, the 12th day, the 16th day, the 67th day, the 21st day, the 19th day, the 98th day, and the 20th day. On those days there will be rain.
Is there any practicality if I say that in the next 100 days there's a 9% chance of rain? I'd rather summarize that fact than regurgitate that mouthful. The statement and usage of the worse model is still relevant and has semantic meaning.
This is an example of deliberately choosing a model that is objectively worse than the previous one, because it is more practical. We use approximate models in the same way; entropy is the same thing.
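The rain example reduces to a single frequency computed from complete knowledge of the schedule (a tiny sketch; the day numbers are the ones listed above):

```python
# The exact, fully known forecast: True on the nine rainy days.
rain_days = {12, 16, 19, 20, 21, 23, 40, 67, 98}
forecast = [day in rain_days for day in range(1, 101)]

# The coarse summary: one number, deliberately discarding the detail.
p_rain = sum(forecast) / len(forecast)
print(p_rain)  # 0.09
```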
Personally I think this is irrelevant to the argument. I bring it up because associating the mathematical definitions of these concepts with practical daily intuition seems to help your understanding.
> The bayesian sees this […] as the probability changing with more knowledge. The frequentist sees it as […] a correction on what was previously a more wrong probability.
Ok, so the Bayesian starts with one probability (the best available) and ends with another probability (the best available). While the frequentist ends with two probabilities - one more right and one more wrong. Good enough for this discussion, it makes clear that frequentists are also able to use the “more right” probability instead of the “more wrong” probability when they know more. (Of course they can also keep using the “more wrong” probability - we agree that both options exist.)
> The statement and usage of the worse model is still relevant and has semantic meaning.
Sure. But the meaning no longer includes “as far as we know” as it would be in the absence of other knowledge. It’s still relevant but not as much as before. And I still wonder if you really claimed that _nobody_ would say “there is 0% probability of rain in the next ten days” if they knew it with certainty - or maybe I misread your comment.
> it makes clear that frequentists are also able to use the “more right” probability instead of the “more wrong” probability when they know more.
I shouldn't have used the word "more wrong." A more accurate model is the better term. Similar to how general relativity is more accurate than classical mechanics.
>Sure. But the meaning no longer includes “as far as we know”
The definition of any model, including entropy, doesn't include "as far as we know." And the usage of any model, less accurate or more accurate, isn't exclusively based on that phrase.
Less accurate models are used all the time, and they summarize details that the more accurate models fail to show. In fact all of analytics is based off aggregations, which essentially lose detail and accuracy as data is processed. But the output, however less fine-grained, summarizes an "average" that is otherwise invisible in the more detailed model.
Entropy is the same thing, it is a summary of the system. Greater knowledge of the system doesn't mean you discard the summary as useless.