r0bin.xyz

C is Still Fun: Part I

Oct 18, 2024 | 6 min | Tags: tech, c, coding, library, liblcu, utils

Here goes my first attempt at a two-part post. No pressure, right? Not so recently, I wrote a C library called liblcu, and it hit me: C is still pretty fun. Don’t get me wrong, C is old. When you’re writing stuff in C, you can practically feel it asking you to turn down the music and get off its lawn. But it’s not the syntax that’s the issue. I’m one of those weirdos who gets a little too excited about C syntax (don’t judge). It’s more that C comes with its own special brand of programmer anxiety. You know, those moments when you’re writing code, thinking, “Wait, am I supposed to be freeing this buffer? Better triple check cuz I don’t remember.” That kind of stuff gives C a certain… let’s call it “vintage charm.” But despite all that, it’s still fun to use.

C lets you do wacky stuff like this:

#include <stdio.h>

union {
    long x;
    void (*func)();
    char str[8];
} a;

void wacky() { printf("wacky stuff\n"); }

int main() {
    // Section 1
    a.x = (long)&wacky;
    (*a.func)();

    // Section 2
    for (long i = 0, p = (long)&printf; i < sizeof(void *); i++)
        a.str[i] = p >> (i * 8);

    (*a.func)("%p says: what...\n", a.x);

    // Section 3
    a.func = &wacky;
    //a.func = &printf;

    return 0;
}

And don’t worry, you can compile it with warnings turned on and still sleep soundly at night:

gcc -Wall -Wextra -Wpedantic test.c

Here’s the output from running the program:

wacky stuff
0x7f72eceb81c0 says: what...

Because of ASLR, the address of printf will differ from run to run, but otherwise that’s the output you can expect to see on most 64-bit Linux systems.

That code snippet is a great example of what makes C both fun and terrifying. Fun in the sense that the language and its ecosystem do not hold your hand. It lets you wander off into the forest with nothing but your red cape and wicker basket. The same qualities that make C fun can lead to some serious headaches in production. Let’s break down why that snippet is cursed.

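We’ve got a union with three fields. One of those fields, which we’ve named func, is a pointer to a function that returns void. Note the empty parentheses: before C23, void (*func)() means the arguments are unspecified, not that there are none (that would be void (*func)(void)). Keep that in mind, because it matters later.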

If you’re familiar with unions in C, you’d know that a union is at least as large as its largest member (plus any padding needed for alignment). In this case, our union is 8 bytes on a typical 64-bit Linux system, where long and pointers are both 8 bytes wide. You’d also know that all the members of the union share the same region in memory.
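To make those two union facts concrete, here is a minimal sketch. The tag name wacky_union and both helper functions are made up for illustration; only the three members come from the post. It assumes long is no wider than 8 bytes, which holds on common platforms but is not promised by the standard.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same members as the union in the post; the tag name is made up here. */
union wacky_union {
    long x;
    void (*func)();
    char str[8];
};

/* Size of the biggest member; the union can never be smaller than this. */
size_t largest_member_size(void) {
    size_t largest = sizeof(long);
    if (sizeof(void (*)()) > largest) largest = sizeof(void (*)());
    if (sizeof(char[8]) > largest) largest = sizeof(char[8]);
    return largest;
}

/* Returns 1 if a value written through x can be read back through str. */
int members_share_storage(void) {
    union wacky_union u;
    long copy = 0;

    assert(sizeof copy <= sizeof u.str); /* assumption: long fits in 8 bytes */
    u.x = 0x01020304L;                   /* write through one member... */
    memcpy(&copy, u.str, sizeof copy);   /* ...read the same bytes via another */
    return copy == 0x01020304L;
}
```

Comparing sizeof(union wacky_union) against largest_member_size() is a quick way to see whether your platform adds padding on top of the largest member.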

In main(), we do this:

// Section 1
a.x = (long)&wacky;
(*a.func)();

Now, this is strange, but not too strange. As we just discussed, all members of a union share the same region in memory. We’re grabbing the address of wacky(), casting it to long, and assigning it to a.x. Notice that a.func was never assigned, but because it shares the same memory region as a.x, it’s OK to do (*a.func)();. Since we know that the number stored in a.x is the address of wacky(), it’s virtually the same as doing this:

//...
a.func = &wacky;
(*a.func)();
// ...
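If you want the same address-as-a-number trick without leaning on the union, a cast round trip is one way to sketch it. Everything named here (void_fn, hit, hits, round_trip_call) is hypothetical, and I’m swapping long for uintptr_t. Note that ISO C only promises uintptr_t works for object pointers, so using it with a function pointer is implementation-defined. It works on the usual POSIX-style platforms, but treat that as an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical typedef: a pointer to a function taking no arguments. */
typedef void (*void_fn)(void);

int hits = 0;

/* Hypothetical stand-in for wacky(): counts calls instead of printing. */
void hit(void) { hits++; }

/* Converts fn to an integer and back, calls the result, and
 * returns 1 if the pointer survived the round trip unchanged. */
int round_trip_call(void_fn fn) {
    /* Assumption: a function pointer fits in uintptr_t on this platform. */
    assert(sizeof(void_fn) <= sizeof(uintptr_t));

    uintptr_t raw = (uintptr_t)fn; /* address as a plain number, like a.x */
    void_fn back = (void_fn)raw;   /* number back to a callable pointer */

    back();
    return back == fn;
}
```

Calling round_trip_call(hit) runs hit() once through the reconstructed pointer, which is roughly what Section 1 does through the union.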

If you’re familiar with function pointers in C, you’d know that they must match the signature of the functions they point to. This means the return type and the parameter types must be the same. We want a.func to point to functions with the signature void name(), so naturally, the above snippet is fine since wacky() fits the bill.
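For comparison, here is a small sketch of a function pointer whose type spells out its parameter list. The names no_arg_fn, greet, greetings, and call_both_ways are made up for this example. Writing (void) instead of empty parentheses gives the compiler a real prototype to check calls against.

```c
#include <assert.h>
#include <stddef.h>

/* "(void)" spells out "no arguments"; before C23, "()" only means
 * "arguments unspecified". */
typedef void (*no_arg_fn)(void);

int greetings = 0;

/* Hypothetical stand-in for wacky(): bumps a counter instead of printing. */
void greet(void) { greetings++; }

/* Calls fn twice to show that both call syntaxes do the same thing. */
void call_both_ways(no_arg_fn fn) {
    assert(fn != NULL);
    fn();    /* implicit dereference */
    (*fn)(); /* explicit dereference, the style used in the post */

    /* fn("oops"); would be rejected at compile time, because the
     * prototype says this function takes no arguments. */
}
```

After call_both_ways(greet), the counter reads 2.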

Yeah, maybe the compiler isn’t smart enough to warn us about calling a.func without ever assigning it, but maybe it’s not as clueless as we think. It might realize that we grabbed the address of wacky() and that’s now sitting in the same memory space as a.func, so there’s technically no harm done. Still, I think it should probably give us a heads up, especially when I’m compiling with -Wextra -Wpedantic. Something like, “Hey, you see this function pointer you’re about to call? Sure, it’s in a union, but you never assigned it. Be careful!”

And now the cursed part:

// Section 2
for (long i = 0, p = (long)&printf; i < sizeof(void *); i++)
    a.str[i] = p >> (i * 8);

(*a.func)("%p says: what...\n", a.x);

So, what’s happening here? Since a.str is a char array with a size of 8, we can access each byte individually. Here, we’re just copying each byte of the address of printf() into it, which, of course, is in the same memory space that a.x and a.func share. Simple. Not too different from when we assigned the address of wacky() to a.x, and called a.func.
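One detail worth spelling out: the loop stores the least significant byte first, so the bytes only line up with a real pointer in memory on a little-endian machine. Here is a rough sketch of the same shifting, using uintptr_t and unsigned char instead of long and char. The function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Splits value into bytes, least significant first (same order as the
 * loop in the post). */
void split_bytes(uintptr_t value, unsigned char out[sizeof(uintptr_t)]) {
    assert(out != NULL);
    for (size_t i = 0; i < sizeof(uintptr_t); i++)
        out[i] = (unsigned char)(value >> (i * 8));
}

/* Reverses split_bytes: rebuilds the number from its bytes. */
uintptr_t join_bytes(const unsigned char in[sizeof(uintptr_t)]) {
    uintptr_t value = 0;

    assert(in != NULL);
    for (size_t i = 0; i < sizeof(uintptr_t); i++)
        value |= (uintptr_t)in[i] << (i * 8);
    return value;
}

/* Returns 1 when the shifted bytes match how value is laid out in memory.
 * For a non-symmetric value that is only true on little-endian machines. */
int matches_memory_layout(uintptr_t value) {
    unsigned char shifted[sizeof value];

    split_bytes(value, shifted);
    return memcmp(shifted, &value, sizeof value) == 0;
}
```

split_bytes and join_bytes round-trip on any machine. matches_memory_layout is the part that depends on byte order, and it’s the property Section 2 quietly relies on.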

Hold up, what’s the function signature of printf() again? Isn’t it int printf(const char *__restrict__, ...)? Didn’t we say that function pointers must match the signature of the functions they point to? Yup, we’re cheating, and the compiler is just letting it slide: because func was declared with empty parentheses, there’s no parameter list to check the call against. Nothing is throwing a fit, or slapping us hard on the wrists. First, we called a function through a.func, which was void wacky(), and then we changed to int printf(const char *__restrict__, ...), without issues! Isn’t that fun? And isn’t it horrifying that the only way to catch this is by looking through the code?

Earlier I mentioned that Section 1 was fine because it’s basically the same as doing a.func = &wacky;, but things have taken a turn. If you try uncommenting the second line in Section 3:

// Section 3
a.func = &wacky;
//a.func = &printf;

You’ll run into this error:

test.c: In function ‘main’:
test.c:21:12: error: assignment to ‘void (*)()’ from incompatible pointer type ‘int (*)(const char * restrict, ...)’ [-Wincompatible-pointer-types]
   21 |     a.func = &printf;
      |            ^

Yet, we were able to call printf() from a.func in Section 2. Weird… or what the cool kids call “undefined behavior”.
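If you ever need one slot that can hold different kinds of function pointers without the cursed parts, one common pattern is a tagged union: give each signature its own properly typed member and keep a tag that says which one is live. This is only a sketch under my own assumptions, and every name in it (callback, callback_kind, the helpers) is invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

enum callback_kind { CALLBACK_PLAIN, CALLBACK_FORMAT };

/* The tag records which union member is currently valid. */
struct callback {
    enum callback_kind kind;
    union {
        void (*plain)(void);              /* wacky()-style functions */
        int (*format)(const char *, ...); /* printf()-style functions */
    } as;
};

struct callback make_plain(void (*fn)(void)) {
    struct callback cb;
    cb.kind = CALLBACK_PLAIN;
    cb.as.plain = fn;
    return cb;
}

struct callback make_printf(void) {
    struct callback cb;
    cb.kind = CALLBACK_FORMAT;
    cb.as.format = printf; /* types match, so no cast is needed */
    return cb;
}

/* Returns 0 on success, -1 if cb does not hold a plain function. */
int call_plain(const struct callback *cb) {
    assert(cb != NULL);
    if (cb->kind != CALLBACK_PLAIN)
        return -1;
    cb->as.plain();
    return 0;
}

/* Returns the number of characters printed, or -1 on a kind mismatch. */
int call_format(const struct callback *cb, const char *message) {
    assert(cb != NULL);
    if (cb->kind != CALLBACK_FORMAT)
        return -1;
    return cb->as.format("%s\n", message);
}
```

The tag check happens at run time, so it’s not free, but a mismatch now returns an error instead of wandering into undefined behavior.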

Yeah, C is a lot of fun. I’m working on a Part II for this post that should be just as entertaining, where we’ll dive into liblcu and some of the wacky issues I ran into while writing it.

Stay tuned by subscribing via RSS!