This makes me super happy that my C array types are only 16 bytes.

flohofwoe · on Jan 5, 2024

This will still spill on the stack when compiling with MSVC (the cutoff size is 8 bytes there).

gavinhoward · on Jan 5, 2024

Oh, thank you for letting me know.

saagarjha · on Jan 5, 2024

Where do you put the capacity?

alcover · on Jan 5, 2024

  struct Array{void *ptr; uint64_t cap;};
  sizeof(struct Array) //16

norir · on Jan 5, 2024

I suspect the OP expected something like:

  struct Array{void *ptr; uint64_t len; uint64_t cap;};
  sizeof(struct Array) //24

At any rate, a simple way to get to 16 bytes is the following:

  struct Array{void *ptr; uint32_t len; uint32_t cap;};

How often does one really need a general array with more than 4B elements?
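A minimal sketch for checking both sizes at compile time, assuming a typical 64-bit target with 8-byte pointers. The names `Array64` and `Array32` are made up for illustration:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Hypothetical names; the layouts follow the two variants discussed above. */
struct Array64 { void *ptr; uint64_t len; uint64_t cap; };
struct Array32 { void *ptr; uint32_t len; uint32_t cap; };

/* These hold on common 64-bit ABIs (8-byte pointers); other targets may differ. */
static_assert(sizeof(struct Array64) == 24, "pointer plus two 8-byte fields");
static_assert(sizeof(struct Array32) == 16, "pointer plus two 4-byte fields");
```

The 32-bit variant caps both length and capacity at 2^32 - 1 elements.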

alcover · on Jan 5, 2024

You are right. My reply was a bit silly.

Now thanks to alignment you could use spare bits in the pointer to afford bigger lengths.

  struct Array{int64_t ptr:60; uint64_t len:34, cap:34;};
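Bit-field layout is implementation-defined, and on common ABIs a bit-field may not straddle its storage unit, so the struct above can end up larger than 16 bytes. A rough sketch of the same idea with explicit shifts and masks, assuming 64-bit pointers that are 16-byte aligned (so the low 4 bits are free). All names here are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 16-byte layout built from two 64-bit words:
 *   w0: pointer (low 4 bits free) | top 4 bits of len
 *   w1: low 30 bits of len in the high bits | 34-bit cap in the low bits */
struct PackedArray { uint64_t w0; uint64_t w1; };

#define FIELD_BITS 34
#define FIELD_MAX  ((UINT64_C(1) << FIELD_BITS) - 1)

static struct PackedArray packed_make(void *ptr, uint64_t len, uint64_t cap) {
    uint64_t p = (uint64_t)(uintptr_t)ptr;
    assert((p & 0xF) == 0);                 /* assumes 16-byte alignment */
    assert(len <= FIELD_MAX && cap <= FIELD_MAX);
    struct PackedArray a;
    a.w0 = p | (len >> 30);                 /* top 4 bits of len go in the tag bits */
    a.w1 = ((len & 0x3FFFFFFF) << 34) | cap;
    return a;
}

static void *packed_ptr(struct PackedArray a) {
    return (void *)(uintptr_t)(a.w0 & ~UINT64_C(0xF));
}

static uint64_t packed_len(struct PackedArray a) {
    return ((a.w0 & 0xF) << 30) | (a.w1 >> 34);
}

static uint64_t packed_cap(struct PackedArray a) {
    return a.w1 & FIELD_MAX;
}
```

The cost is that every access goes through an accessor, and the pointer can no longer be read directly from the struct.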

sesuximo · on Jan 6, 2024

1) Maybe you could store the bitwise-or of size and capacity on top of each other, if you restrict capacity to a power of two.

2) You could store the capacity before the array payload.
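A minimal sketch of option 2, assuming `malloc` returns 16-byte-aligned memory (typical on 64-bit platforms). The header type and function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header stored immediately before the element storage.
 * Padded to 16 bytes so the payload keeps malloc's alignment. */
struct ArrayHeader { uint64_t cap; uint64_t pad; };

/* The handle itself stays at 16 bytes; ptr points at the payload. */
struct Array { void *ptr; uint64_t len; };

static struct Array array_alloc(uint64_t cap, size_t elem_size) {
    struct Array a = { NULL, 0 };
    /* Reject sizes whose byte count would overflow size_t. */
    if (elem_size != 0 && cap > (SIZE_MAX - sizeof(struct ArrayHeader)) / elem_size)
        return a;
    struct ArrayHeader *h = malloc(sizeof *h + cap * elem_size);
    if (h != NULL) {
        h->cap = cap;
        a.ptr = h + 1;          /* payload starts right after the header */
    }
    return a;
}

static uint64_t array_cap(struct Array a) {
    return ((struct ArrayHeader *)a.ptr - 1)->cap;
}

static void array_free(struct Array a) {
    if (a.ptr != NULL)
        free((struct ArrayHeader *)a.ptr - 1);
}
```

Reading the capacity costs an extra memory access, but it is only needed when growing.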

gavinhoward · on Jan 5, 2024

My array types are not meant to be changed, so no capacity is needed. I have separate resizable ones.

tialaramex · on Jan 5, 2024

When you say "array types" it seems like people expected you meant a growable array (C++ std::vector, Rust's Vec, or say Java's ArrayList).

But if you meant an actual array as in Rust's [T; N] then it's weird to talk about them as if they've got some specific size, their size is a parameter (N) of the type.

The size of Rust's [u8; 16] or C's unsigned char [16] is 16 bytes but like, duh. And there's no magic here, [u8; 24] or unsigned char [24] is 24 bytes.

gavinhoward · on Jan 5, 2024

In my code, my array types are a pointer and a length, regardless of the length of the array.

This is so I can use regular C pointers to pass them to system functions that expect pointers. Because C still uses just pointers. But I have the length for bounds checking in my own code.
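A rough sketch of that pattern, not the actual code. The type and function names are hypothetical, and the bounds check uses plain `assert`:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pointer+length pair: 16 bytes on typical 64-bit targets. */
struct IntArray { int *ptr; size_t len; };

/* Bounds-checked element access. Aborts on an out-of-range index
 * (unless NDEBUG is defined, which disables assert). */
static int *int_array_at(struct IntArray a, size_t i) {
    assert(i < a.len);
    return &a.ptr[i];
}

/* The raw pointer is still available for functions that expect one. */
static void int_array_zero(struct IntArray a) {
    memset(a.ptr, 0, a.len * sizeof *a.ptr);
}
```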

But my resizable types are much bigger. Probably 32 bytes because they store a destructor too. I pass them by pointer because plain pointers are almost always just one item, meaning no bounds checking is necessary.

Yes, that does mean the actual array is two indirections away, but that style gives me a lot of safety because array indexing is a code smell.

tialaramex · on Jan 5, 2024

This seems like a lot of lost optimization opportunities compared to a language which gets this right out of the box.

Also in most cases people's destructors don't have associated local state, so they needn't take up space in each object. All C objects have non-zero size, so an object representing the destructor takes up space even if it has no state. In C++ there's a hack to avoid paying this price, but in C there is not.

gavinhoward · on Jan 5, 2024

> This seems like a lot of lost optimization opportunities compared to a language which gets this right out of the box.

Well, yeah, but I hate all other languages [1], and I'm willing to pay the price in C.

That same sort of thing allowed me to implement RAII in C, though.

[1]: https://gavinhoward.com/2023/02/why-i-use-c-when-i-believe-i...

avgcorrection · on Jan 5, 2024

Yeah, it looks like this would be a problem for Rust’s `String` (owned) although not `&str` (borrowed).

tialaramex · on Jan 5, 2024

Inside Rust it doesn't matter, because Rust gets to pick its own ABI rules.

So this only matters for the FFI case and it's probably usually a bad idea to give a mutable String (as opposed to a read-only &str) to foreign language code.

Likewise for Vec<T> and &[T]. Indeed, underneath, Rust's String is literally Vec<u8> and &str is literally &[u8], but in both cases with the explicit requirement that the bytes are valid UTF-8 text.

chlorion · on Jan 5, 2024

I'm a little bit confused on what this means.

By array I guess you mean a dynamically sized, but not dynamically-resizable pointer and size/length pair? So you aren't storing the "capacity" like Rust's Vec would be doing.

You can do the same thing in Rust with Box<[T]> if I am understanding you correctly. Box<[T]> in this specific case is a fat pointer, which is a pointer to the underlying allocation and a length.

One of the issues with Box though, at least as of the last time I looked at it, is that the only way you could create a Box<[T]> without unsafe or nightly was to create a Vec, and then call into_boxed_slice on it. The conversion from Vec to Box actually causes a new allocation to be created if the size and capacity fields in the Vec are not equal. In C it would be possible to reuse the Vec's buffer, but dealloc (and all other alloc related functions it seems) in Rust requires passing in information about the layout of the underlying allocation, and size is part of the layout!

In C++ I guess you would want to use std::unique_ptr<T[]>, and manually store the length, which is not great but still works. I'm not sure if unique_ptr is guaranteed to be the size of a pointer or not, so this may or may not work.

Regular statically-sized arrays are the same size in each language ofc. In Rust and C++ statically-sized arrays do have one benefit though, you get bounds checking on them "for free". Since the length is baked into the type, the accessor methods know the bounds at compile time, so no runtime length information is required to do runtime bounds checking! You can do this in C by hand for each array you define but that is impractical, you could probably use a macro to do this though.
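A minimal sketch of such a macro in C. The names `ARRAY_LEN` and `ARRAY_AT` are made up, and it only works on true arrays, not on pointers:

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a true array, taken from its type at compile time. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Bounds-checked lvalue access. No length is stored at run time.
 * Note: the index expression is evaluated twice, and the check
 * disappears when NDEBUG is defined. */
#define ARRAY_AT(a, i) (*(assert((size_t)(i) < ARRAY_LEN(a)), &(a)[(i)]))
```

Passing a pointer instead of an array silently gives the wrong length, which is one reason this is less robust than having the length in the type.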

If I did understand what you meant, I think this brings up an interesting topic. Having a first-class dynamically-sized non-resizable array type can be pretty useful! Rust already can do this awkwardly with Box, but C++ currently doesn't have a way to do this without manually storing the length and doing manual bounds checking that I know of. It would not be terribly difficult to implement this and there probably are libraries that exist for it, but I still think it's interesting.

# comment for Vec::into_boxed_slice https://doc.rust-lang.org/src/alloc/vec/mod.rs.html#1075

# rust std::alloc::dealloc requires std::alloc::Layout https://doc.rust-lang.org/std/alloc/fn.dealloc.html

gavinhoward · on Jan 5, 2024

I meant static arrays.

I have to use a pointer+length pair because many system functions in C require a pointer, but I want bounds for bounds checking.

Yes, I do my own arrays in C.

I also have dynamic arrays, but those are not 16 bytes, and I treat them differently.

chlorion · on Jan 5, 2024

Do you have your own array library that you use in your projects? If so that's pretty cool and it would be interesting to see.

Edit: I looked at your profile's information and have discovered lots of fun stuff.

gavinhoward · on Jan 5, 2024

Thank you!

As you have probably discovered, my array stuff is just part of a monorepo. But yes. :)

Edit: The code at the Yzena one is not up to date because I've had to keep commits local. But I have 1200+ commits since, and I plan to make them public in April.