> I thought entropy (in the Shannon sense) was a property of discrete and finite probability distributions. It's essentially a measure of how random a sample from such a probability distribution is. Notably, continuous probability distributions don't have meaningful entropy (or in some sense, their entropy is always infinite).
True, but for continuous distributions you can use the KL divergence against a uniform distribution :)
One of the properties of entropy H(X) of a random variable X is that if f is a bijective function then H(f(X)) = H(X).
For relative entropy (or "KL divergence" as some people call it), we have that H(X||Y) = H(f(X)||f(Y)). But if you fix Y to have a continuous uniform distribution, then you lose this critical property because f(Y) may no longer have a continuous uniform distribution.
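A minimal numerical sketch of the discrete case (helper names `entropy` and `kl` are mine, not from the thread): a bijection on a discrete outcome space merely permutes the probability values, so both H(X) and the relative entropy are unchanged.

```python
# Discrete entropy and KL divergence are invariant under a bijective
# relabeling of outcomes: the probability values are only permuted.
import math

def entropy(p):
    """Shannon entropy in nats of a dict {outcome: probability}."""
    return -sum(q * math.log(q) for q in p.values() if q > 0)

def kl(p, q):
    """Relative entropy D(p || q) in nats; assumes support(p) is in support(q)."""
    return sum(pv * math.log(pv / q[x]) for x, pv in p.items() if pv > 0)

p = {0: 0.5, 1: 0.3, 2: 0.2}
q = {0: 0.2, 1: 0.5, 2: 0.3}

f = lambda x: (x + 1) % 3                 # a bijection on the outcome space
fp = {f(x): pv for x, pv in p.items()}
fq = {f(x): qv for x, qv in q.items()}

print(abs(entropy(p) - entropy(fp)) < 1e-12)   # H(X) == H(f(X)) -> True
print(abs(kl(p, q) - kl(fp, fq)) < 1e-12)      # D(X||Y) == D(f(X)||f(Y)) -> True
```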
Apparently this "critical property" is not so important to all the people who use relative entropy as a generalization to a continuous distribution defined on a space with an underlying measure.
Why would they care about arbitrary transformations mapping points in the space to other points in the space?
What I think it means is that if you take two different parametrizations of the same physical phenomenon, then you get two different entropy values.
E.g. suppose you have a bunch of particles with fixed mass. You could look at the distribution of speeds and get one entropy, then at the distribution of kinetic energy (basically speed squared) and get another. A uniform distribution of speeds means a non-uniform distribution of squared speeds, so the entropies would disagree.
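A small Monte Carlo sketch of this point (numbers and setup are mine, for illustration): take speed v uniform on (0, 10) and E = v^2, and the differential entropies h(v) and h(E) come out different, even though the map is bijective here.

```python
# Differential entropy is not invariant under reparametrization:
# v ~ Uniform(0, 10) vs. E = v^2 give different values.
import math, random

random.seed(0)
N = 200_000
vs = [random.uniform(0.0, 10.0) for _ in range(N)]

# h(v) for Uniform(0, 10) is exactly log(10).
h_v = math.log(10.0)

# Change of variables: if v ~ Uniform(0, 10) and E = v^2, then
# p_E(e) = 1 / (20 * sqrt(e)) on (0, 100).  Monte Carlo estimate of
# h(E) = E[-log p_E(E)]; at e = v^2 this is log(20 * v).
h_E = sum(math.log(20.0 * v) for v in vs) / N

print(round(h_v, 3))   # 2.303
print(h_E)             # analytic value is log(200) - 1 ~ 4.298
```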
Physical entropy is defined from the probability distribution over states. Velocities or squared velocities are not states; they are derived quantities. Points in a phase space would describe states. Physical states are discrete anyway when you consider quantum physics :-)
As for the entropy of probability distributions in general, I think relative entropy is invariant under reparametrizations because both the probability of interest and the reference probability transform in the same way [1]. But I don't remember what that means exactly. [And I am not sure if that makes ogogmad wrong; I may not have understood his comment well.]
([Edit: forget this aside. You probably were talking about speeds as positive magnitudes.] By the way, using an example analogous to yours, discrete entropy wouldn't be invariant either: if you have a distribution on {-1, 1} and square it, it collapses to a zero-entropy singleton {1}.)
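A quick check of this aside (the helper `entropy_bits` is mine): squaring is not bijective on {-1, 1}, and the uniform distribution there collapses to a point mass, dropping the discrete entropy from 1 bit to 0.

```python
# Squaring a uniform distribution on {-1, 1} collapses it to {1},
# so the discrete entropy falls from 1 bit to 0 bits.
import math
from collections import Counter

def entropy_bits(p):
    """Shannon entropy in bits of a dict {outcome: probability}."""
    return sum(-q * math.log2(q) for q in p.values() if q > 0)

p = {-1: 0.5, 1: 0.5}
sq = Counter()
for x, prob in p.items():
    sq[x * x] += prob          # -1 and 1 both map to 1

print(entropy_bits(p))         # 1.0
print(entropy_bits(sq))        # 0.0 (all mass on the single outcome 1)
```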
+1. The commenter above also cared about bijective mappings, and squaring a random variable in [-1, 1] is not bijective. Squaring a random variable defined over positive real numbers would be a bijective mapping and the distribution would still remain uniform.
Actually, I find it hard to come up with a bijective mapping that leads to a non-uniform distribution that's useful for anything practical.
Ok, so first, to have a uniform distribution we have to have a bounded set. Maybe you can do something clever with limits, but let's not overcomplicate things. Let's say we have 0 <= v < 10 and define E = v^2. Then 0 <= E < 100.
Uniformity of v would mean that p(0 <= v < 1) = 1 / 10
Uniformity of E would mean that p(0 <= E < 1) = 1 / 100
But by construction p(0 <= v < 1) = p(0 <= E < 1). So it's not possible for both to be uniform.
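The argument above can be checked numerically (a throwaway sketch, variable names mine): sampling v uniformly on [0, 10), the events {0 <= v < 1} and {0 <= E < 1} coincide exactly, so both have probability 1/10, whereas uniformity of E on [0, 100) would require 1/100.

```python
# Monte Carlo check: v uniform on [0, 10), E = v^2.  The events
# {0 <= v < 1} and {0 <= E < 1} are the same event, with probability
# 1/10 -- incompatible with E being uniform on [0, 100).
import random

random.seed(1)
N = 1_000_000
hits_v = hits_E = 0
for _ in range(N):
    v = random.uniform(0.0, 10.0)
    E = v * v
    hits_v += (0.0 <= v < 1.0)
    hits_E += (0.0 <= E < 1.0)

print(hits_v == hits_E)        # True: same event by construction
print(round(hits_v / N, 2))    # 0.1, not the 0.01 uniformity of E would need
```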
It's not necessary to have p(0 <= v < 1) = p(0 <= E < 1), only that f(X) be uniformly distributed.
But this does bring up a good point. H(X||Y) = H(f(X)||f(Y)) for any bijective f if the distributions are discrete. When they are continuous this is not true, even with a bijective f. For example f(x) = x^2 doesn't work even though it is a bijection (on the nonnegative reals). Interestingly, however, affine transformations do work.
Yeah, you also have to transform the "reference" function, and then the entropy stays the same. I prefer to think of it as the "density of states" -- it's necessary to make the argument of the logarithm dimensionless, after all.
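This cancellation can be demonstrated with a small Monte Carlo sketch (the densities here are mine, chosen for illustration): transform *both* densities with the Jacobian of y = x^2 and the relative entropy is unchanged, because the |dy/dx| factors cancel inside the logarithm.

```python
# KL divergence between two continuous densities is preserved under a
# smooth bijection if the reference density is transformed too: the
# Jacobian factors cancel inside the log, term by term.
import math, random

random.seed(2)
N = 100_000

p = lambda x: 2.0 * x                # density of X on (0, 1)
q = lambda x: 3.0 * x * x            # reference density on (0, 1)

# Under y = x^2 (bijective on (0, 1)), dx/dy = 1 / (2*sqrt(y)), so:
pY = lambda y: 1.0                   # p transforms to Uniform(0, 1)
qY = lambda y: 1.5 * math.sqrt(y)    # q transforms to (3/2)*sqrt(y)

xs = [math.sqrt(random.random()) for _ in range(N)]   # inverse-CDF samples of p

kl_x = sum(math.log(p(x) / q(x)) for x in xs) / N
kl_y = sum(math.log(pY(x * x) / qY(x * x)) for x in xs) / N

print(abs(kl_x - kl_y) < 1e-9)       # True: identical up to rounding
print(kl_x)                          # analytic value: log(2/3) + 1/2 ~ 0.095
```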