Huffman encoding should be a mandatory topic you can explain before you are allowed to write a web API.
I hate it when people output JSON that isn't self-documenting enough because they don't understand that aReallyNiceLongNameThatDescribesTheAttribute and nam1 compress to functionally the same size once you turn on compression.
I recently saw an API that had a few single-letter names. I could still figure it out, but it was figuring it out instead of just reading the name.
Technically your example is incorrect unless the same attribute name is used quite a few times. Regardless, premature optimisation is the root of all evil, especially in places that are highly unlikely to be a bottleneck, and even more so when it's used to remove valuable documentation.
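For what it's worth, repetition is exactly what real payloads tend to have: a list of objects repeats every key once per element. A quick Python sketch (the payload shape here is made up for illustration) shows the verbose and terse keys ending up close in compressed size once gzip's LZ77 stage turns the repeats into back-references:

```python
import gzip
import json

# Hypothetical payload: 100 records keyed by a verbose name vs. a terse one.
long_payload = json.dumps(
    [{"aReallyNiceLongNameThatDescribesTheAttribute": i} for i in range(100)]
).encode()
short_payload = json.dumps([{"nam1": i} for i in range(100)]).encode()

# Raw sizes differ by kilobytes; compressed sizes differ by a few dozen bytes,
# because each repeated key becomes a short back-reference after the first use.
print(len(long_payload), len(gzip.compress(long_payload)))
print(len(short_payload), len(gzip.compress(short_payload)))
```

The gap doesn't vanish entirely (the key still has to appear once in the stream, and back-references aren't free), but it shrinks to the point where readability is clearly worth it.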
Huffman coding alone wouldn't do a lot in that example. Huffman coding lets you build a prefix code to pack symbols into variable length sequences of bits, depending on the frequency of the symbol. A dictionary encoder, like LZ77, would replace repeated sequences by references, and that would make the two variable names equivalent. Deflate uses Huffman coding to pack the LZ77 compressed data using as few bits as possible, making it more efficient, but LZ77 is the part removing the redundancy.
I agree with the general sentiment though. Compression isn't magic, and understanding how it works helps you to work with compressed formats.
Writing your own implementation of some compression algorithms is a lot of fun too. I learned a lot from implementing Huffman coding and an LZ77 variant a few years ago.
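Implementing it yourself really is instructive. Here's a toy sketch of just the code-table-building step in Python, using a heap to repeatedly merge the two lowest-frequency nodes (bit-packing and tree serialization, which a real format needs, are left out):

```python
import heapq
from collections import Counter

def huffman_codes(data: bytes) -> dict:
    """Build a Huffman code table mapping each byte value to a bit string."""
    freq = Counter(data)
    # Heap entries are (frequency, tiebreak, node); a node is either a byte
    # value (leaf) or a (left, right) tuple (internal node).
    heap = [(f, i, sym) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:  # degenerate input with one distinct symbol
        return {heap[0][2]: "0"}
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # two least frequent nodes...
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, tiebreak, (left, right)))  # ...merged
        tiebreak += 1
    codes = {}
    def walk(node, prefix=""):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix
    walk(heap[0][2])
    return codes

codes = huffman_codes(b"abracadabra")
# 'a' is by far the most frequent symbol, so it gets the shortest code.
```

The resulting code is prefix-free by construction: symbols only ever sit at the leaves, so no code can be a prefix of another.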
>Huffman coding alone wouldn't do a lot in that example. Huffman coding lets you build a prefix code to pack symbols into variable length sequences of bits, depending on the frequency of the symbol. A dictionary encoder, like LZ77, would replace repeated sequences by references, and that would make the two variable names equivalent. Deflate uses Huffman coding to pack the LZ77 compressed data using as few bits as possible, making it more efficient, but LZ77 is the part removing the redundancy.
That's how gzip works, but all you have to know is that the output WILL be packed by a compression algorithm like gzip, and from that you can conclude that a far longer variable name comes at essentially no cost.
The actual dictionary-coding algorithm is a far less commonly taught topic than Huffman coding.
While I do agree that web APIs should keep compression in mind, as the user of the API I don't necessarily want to type aReallyNiceLongNameThatDescribesTheAttribute either. I'd rather see a word or two than a letter, but I'd rather see a letter than a sentence.
Just stop using the word index and start using the word offset.
Problem solved.
The notion that the first brick in a run has an offset of 0, as does the first upright stud in a frame, is something that's been basic to bricklayers and carpenters for a very long time.
Sometimes it's natural to refer to elements by their position, or offset; other times it's natural to talk about sequence numbers or indexes.