19 October 2015

Parameterize by data member in C++

(For the C++ programmers in the audience.)

This is the story of me finding an unused tool at the bottom of my C++ toolbox and figuring out what it did.

I started with some structures that looked like this. (And, of course, changing the structures was not an option.)

struct Foo {
    int a;
    int b;
    int c;
    int d;
    /*etc*/
};

const int foo_count = 10;

struct Bar {
    Foo foo[foo_count];
    /*etc*/
};

I needed to write some code that looked something like this.

int foo_calc_a(const Bar& bar1, const Bar& bar2)
{
    int result = 0;
    for (int i = 0; i < foo_count; ++i)
        result += bar1.foo[i].a - bar2.foo[i].a;
    return result;
}

But I needed a function like that for multiple members of Foo, and I didn’t want to copy & paste that function for each data member.

I could do it the old school way using offsetof. (I’ve fancied it up with C++-style casts and a C++11 lambda.) And then we have an example call for member a.

int foo_calc(const Bar& bar1, const Bar& bar2, std::size_t offset)
{
    // Read the int that lives `offset` bytes into a Foo.
    auto f = [&offset](const Foo& foo) {
        return *(reinterpret_cast<const int*>(
            reinterpret_cast<const char*>(&foo) + offset));
    };
    int result = 0;
    for (int i = 0; i < foo_count; ++i)
        result += f(bar1.foo[i]) - f(bar2.foo[i]);
    return result;
}

auto result = foo_calc(bar1, bar2, offsetof(Foo, a));

Of course, we get no help from the compiler here since we’ve told it, “Trust me!”. (While writing this version, I did write two bugs that the compiler missed. The improved version of the function—which I’ll show you later—gave the right answer as soon as I got it to compile.)

That works, but surely we can do better! I tried a few other solutions, but we’ll skip to the one I settled on.

While I was reading something else, I noticed a mention of “pointer to data member” that suggested it was more than what I thought it was at face value. This led me to...

The operators ->* and .* are arguably the most specialized and least used C++ operators.

—Bjarne Stroustrup, The C++ Programming Language (4th edition), §20.6

The funny part was that I immediately recognized that I’d read this section before, though clearly I never fully grokked it.

You get a PMD (pointer to member data) with the ampersand operator.

&Foo::a

What type is it? It is an “int Foo::*”: a pointer to an int member of Foo.

int Foo::* pmd = &Foo::a;

The thing is, this isn’t really a pointer. In practice it behaves like an offset, although a typed one. Like any offset, we need a pointer to an instance to “add it to” to make it a pointer. How do we do that? With those specialized operators mentioned above.

Foo foo;
Foo* p_foo = &foo;
foo.*pmd;
p_foo->*pmd;
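To make those two operators concrete, here is a minimal sketch of reading and writing through a pointer to data member. The trimmed-down Foo and the helper names (read_via_object, read_via_pointer, write_member) are made up for illustration; they are not from my real code.

```cpp
// Stand-in for Foo with just two members (the real one has more).
struct Foo {
    int a;
    int b;
};

// Read whichever int member `pmd` designates, starting from an object...
int read_via_object(const Foo& foo, int Foo::* pmd)
{
    return foo.*pmd;
}

// ...or starting from a pointer to an object.
int read_via_pointer(const Foo* p_foo, int Foo::* pmd)
{
    return p_foo->*pmd;
}

// The result of .* on a non-const object is an lvalue, so it can be assigned to.
void write_member(Foo& foo, int Foo::* pmd, int value)
{
    foo.*pmd = value;
}
```

The same pmd value works with any Foo instance, which is what makes it usable as a function parameter.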

...and thus, foo_calc becomes...

int foo_calc(const Bar& bar1, const Bar& bar2,
    int Foo::* pmd)
{
    int result = 0;
    for (int i = 0; i < foo_count; ++i)
        result += bar1.foo[i].*pmd - bar2.foo[i].*pmd;
    return result;
}

auto result = foo_calc(bar1, bar2, &Foo::d);
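If you want to try it yourself, here is a sketch that puts the pieces together in one compilable chunk. The structures are cut down to the four int members shown earlier, and fill is a hypothetical helper I added only to set up some data.

```cpp
struct Foo {
    int a;
    int b;
    int c;
    int d;
};

const int foo_count = 10;

struct Bar {
    Foo foo[foo_count];
};

// Sum the differences of one member, chosen by the caller, across two Bars.
int foo_calc(const Bar& bar1, const Bar& bar2, int Foo::* pmd)
{
    int result = 0;
    for (int i = 0; i < foo_count; ++i)
        result += bar1.foo[i].*pmd - bar2.foo[i].*pmd;
    return result;
}

// Hypothetical helper: set the chosen member of every Foo in a Bar.
void fill(Bar& bar, int Foo::* pmd, int value)
{
    for (int i = 0; i < foo_count; ++i)
        bar.foo[i].*pmd = value;
}
```

Filling member a with 5 in one Bar and 2 in the other, then calling foo_calc with &Foo::a, should give 10 * (5 - 2) = 30.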

Voilà!

But at what cost? You should, of course, measure for your own environment. For my code, this was no slower than using offsetof. There’s really no reason for it to be, since it is essentially the same thing. Just with some different syntax and the compiler checking your work more.
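As for “the compiler checking your work more”, here is a small sketch of what that checking looks like. Sample and read_int are hypothetical names; the point is only that a pointer to a double member will not pass for a pointer to an int member, whereas offsetof hands back a plain std::size_t either way.

```cpp
#include <type_traits>

// Hypothetical structure with members of two different types.
struct Sample {
    int count;
    double ratio;
};

// The member's type is part of the pointer-to-member type.
static_assert(std::is_same<decltype(&Sample::count), int Sample::*>::value,
              "&Sample::count is an int Sample::*");
static_assert(std::is_same<decltype(&Sample::ratio), double Sample::*>::value,
              "&Sample::ratio is a double Sample::*");

// There is no conversion between the two, so mixing them up is a compile error.
static_assert(!std::is_convertible<double Sample::*, int Sample::*>::value,
              "a double member cannot be passed where an int member is expected");

int read_int(const Sample& s, int Sample::* pmd)
{
    return s.*pmd;
}

// read_int(s, &Sample::ratio);  // would be rejected by the compiler
```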

So, if you’re tempted to use offsetof, use a pointer to member data instead.
