Switch on strings with C++11

Many programmers new to C++, especially those coming from languages that focus on programmer productivity instead of performance per Watt, are surprised by the fact that one cannot use the switch statement with anything other than constant integers, enums or classes that have a single non-explicit integer or enum conversion operator. It’s a fairly reasonable concern – after all, there are lots of use cases for such a feature.

With the introduction of constexpr in C++11 last year, one might have expected constant expressions of any type to become legitimate case labels, but alas, they did not. That being said, it’s not impossible to do – at least in some form – or I wouldn’t be writing this post. With the use of two very nifty C++11 features, a very similar result can be achieved.

Before C++11’s constexpr there was no way to reliably enforce compile-time processing of strings or floating point numbers (it was possible for integers with fairly complicated template magic, but that’s a topic for another post). Now it’s possible, but constexpr functions come with some restrictions, the most important being that the body can contain only one return statement, whose expression may use only literals, constexpr variables and other constexpr functions. The function cannot have any logic besides that statement either. That being said, the ternary operator can be used, which, along with recursion, is all that is needed to implement any advanced logic.
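As a small illustration of that style, the sketch below computes the length of a C string at compile time. The function name cstr_length is made up for this example; the body is a single return statement that branches with the ternary operator and loops via recursion.

```cpp
#include <cstddef>

// C++11-style constexpr function: one return statement, the ternary
// operator for branching and recursion in place of a loop.
constexpr std::size_t cstr_length(char const* str)
{
	return *str ? 1 + cstr_length(str + 1) : 0;
}

// A static_assert needs a constant expression, so these calls have to
// be evaluated by the compiler rather than at run time.
static_assert(cstr_length("") == 0, "empty string has length 0");
static_assert(cstr_length("switch") == 6, "length computed at compile time");
```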

The second new feature that will be used – user-defined literals – is not strictly necessary, because it’s basically syntactic sugar over functions, but it makes everything much more convenient. They allow the programmer to express their intent in an easier and, in my opinion, clearer way. Instead of code like measurements::kilometers(21.5) or even km(21.5), it’s possible to write 21.5_km.
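A minimal sketch of how such a literal could be declared follows. The Meters type and the conversion to meters are assumptions made for the example, not part of any real library.

```cpp
// Hypothetical unit type, used only for illustration.
struct Meters
{
	long double value;
};

// Plain function spelling of the conversion.
constexpr Meters km(long double v)
{
	return Meters{v * 1000.0L};
}

// Literal operator: lets the caller write 21.5_km instead of km(21.5).
// For floating point literals the parameter type must be long double.
constexpr Meters operator""_km(long double v)
{
	return km(v);
}

static_assert((21.5_km).value == km(21.5L).value, "both spellings agree");
```

The literal operator simply forwards to the ordinary function, which is the sense in which the feature is only sugar.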

The case expressions must be convertible to integers or enums at compile time; this is set in stone by the standard and (excluding compiler-specific extensions) unavoidable. Fortunately, converting a string to a reasonably unique integer is fairly simple using one of the modern hashing algorithms. For the purpose of this post I used FNV-1a, which is simple to implement, seems to offer good enough hash distribution – I tested it on all combinations of printable characters in strings of length up to 5 and found no collisions – and, most importantly, I already knew it.

The algorithm itself is really simple. Its official home page gives the following pseudo code:

hash = offset_basis
for each octet_of_data to be hashed
	hash = hash xor octet_of_data
	hash = hash * FNV_prime
return hash

My implementation is almost identical, although instead of taking the string length as a parameter, it’ll process byte after byte until the null-terminator is found:

hash_t hash(char const* str)
{
	hash_t ret{basis};
 
	while(*str){
		ret ^= *str;
		ret *= prime;
		str++;
	}
 
	return ret;
}

I think it’s fairly straightforward. Unfortunately, the above function doesn’t meet the requirements for being executed at compile time. To achieve that, all the logic needs to move into a single return statement. The conversion should be fairly simple to understand as well:

constexpr hash_t hash_compile_time(char const* str, hash_t last_value = basis)
{
	return *str ? hash_compile_time(str+1, (*str ^ last_value) * prime) : last_value;
}

The only restriction here is recursion depth: the standard merely recommends (in Annex B) that implementations support at least 512 nested constexpr function invocations. Any code that expects to be portable cannot rely on more. In this case that means that strings can’t be longer than 511 characters (plus the null terminator).
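As a sanity check that the loop and the recursive version compute the same thing, the sketch below puts both side by side and compares the compile-time result against known values. The constants are the standard 64-bit FNV prime and offset basis, and the expected hash of "a" is the published FNV-1a 64-bit test vector; the namespace name is borrowed from the full listing further down.

```cpp
#include <cstdint>

namespace fnv1a_64
{

typedef std::uint64_t hash_t;

// Standard 64-bit FNV parameters.
constexpr hash_t prime = 0x100000001B3ull;
constexpr hash_t basis = 0xCBF29CE484222325ull;

// Run-time version: a plain loop over the bytes.
inline hash_t hash(char const* str)
{
	hash_t ret{basis};
	while(*str){
		ret ^= *str;
		ret *= prime;
		str++;
	}
	return ret;
}

// Compile-time version: the same steps expressed as recursion.
constexpr hash_t hash_compile_time(char const* str, hash_t last_value = basis)
{
	return *str ? hash_compile_time(str+1, (*str ^ last_value) * prime) : last_value;
}

}

// An empty string never enters the loop, so it hashes to the basis.
static_assert(fnv1a_64::hash_compile_time("") == fnv1a_64::basis,
	"empty input yields the offset basis");

// Published FNV-1a 64-bit test vector for the one-byte input "a".
static_assert(fnv1a_64::hash_compile_time("a") == 0xAF63DC4C8601EC8Cull,
	"matches the reference value");
```

Because both assertions are static, a mismatch would show up as a compile error rather than as a silently wrong case label.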

With the above it’s already possible to write a working switch statement, but it will look like this:

switch(hash(string)){
case hash_compile_time("first option"): // code
case hash_compile_time("second option"): // code
case hash_compile_time("third option"): // code
}

It doesn’t look pretty, or even readable, does it? Luckily, with the help of the previously mentioned user-defined literals it can look like this:

switch(hash(string)){
case "first option"_hash: // code
case "second option"_hash: // code
case "third option"_hash: // code
}

It’s readable, short and easy to write. There are several ways to declare a user-defined literal (for more information see ISO/IEC 14882:2011 §13.5.8), but since this one will deal with strings, the following will be the best:

constexpr unsigned long long operator "" _hash(char const* p, size_t)
{
	return hash_compile_time(p);
}

I don’t use the size parameter (although the implementation could be changed to use it), but it’s required in the function signature by the standard.
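For comparison, here is a rough sketch of what a variant that does use the size parameter could look like. The names hash_n and _hash_n are hypothetical; the only behavioural difference is that exactly len bytes are consumed, so embedded null characters become part of the hash.

```cpp
#include <cstddef>
#include <cstdint>

typedef std::uint64_t hash_t;

constexpr hash_t prime = 0x100000001B3ull;
constexpr hash_t basis = 0xCBF29CE484222325ull;

// Hypothetical length-based variant: recursion stops when len reaches
// zero instead of when a null terminator is found.
constexpr hash_t hash_n(char const* str, std::size_t len, hash_t last_value = basis)
{
	return len ? hash_n(str + 1, len - 1, (*str ^ last_value) * prime) : last_value;
}

// The literal operator receives the length without the terminator.
constexpr hash_t operator""_hash_n(char const* p, std::size_t n)
{
	return hash_n(p, n);
}

// Same result as the null-terminated version for ordinary strings
// (this is the published FNV-1a 64-bit value for "a").
static_assert("a"_hash_n == 0xAF63DC4C8601EC8Cull, "matches the reference value");

// An embedded null is hashed like any other byte, so "a\0" (two bytes)
// no longer collapses to the hash of "a".
static_assert("a\0"_hash_n != "a"_hash_n, "embedded null changes the hash");
```

The recursion depth limit applies to this variant just the same, since it still makes one call per byte.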

Here’s the full working code (tested under gcc 4.7.1):

#include <iostream>
#include <iomanip>
#include <type_traits>
 
#include <cstdint>
 
namespace fnv1a_64
{
 
typedef std::uint64_t hash_t;
 
constexpr hash_t prime = 0x100000001B3ull;
constexpr hash_t basis = 0xCBF29CE484222325ull;
 
constexpr hash_t hash_compile_time(char const* str, hash_t last_value = basis)
{
	return *str ? hash_compile_time(str+1, (*str ^ last_value) * prime) : last_value;
}
 
hash_t hash(char const* str)
{
	hash_t ret{basis};
 
	while(*str){
		ret ^= *str;
		ret *= prime;
		str++;
	}
 
	return ret;
}
 
}
 
constexpr unsigned long long operator "" _hash(char const* p, size_t)
{
	return fnv1a_64::hash_compile_time(p);
}
 
void simple_switch(char const* str)
{
	using namespace std;
	switch(fnv1a_64::hash(str)){
	case "first"_hash:
		cout << "1st one" << endl;
		break;
	case "second"_hash:
		cout << "2nd one" << endl;
		break;
	case "third"_hash:
		cout << "3rd one" << endl;
		break;
	default:
		cout << "Default..." << endl;
	}
}
 
int main()
{
	using namespace std;
	constexpr auto const_test{ integral_constant<fnv1a_64::hash_t, "anything"_hash>::value };
 
	//auto const_test2{ integral_constant<fnv1a_64::hash_t, fnv1a_64::hash("anything")>::value };
 
	simple_switch("first");
	simple_switch("uh uh");
	simple_switch("second");
	simple_switch("third");
	simple_switch("another wrong one");
}

If compiled with a C++11-compliant compiler, the above code will produce the following output:

1st one
Default...
2nd one
3rd one
Default...

I don’t think it will be very useful, and it’s not the best idea to include that in production code, but I have to say, exploring the new standard has been real fun so far.
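One reason for caution is that a matching hash does not prove that the strings are equal. Two colliding case labels would be caught by the compiler as duplicate case values, but a run-time input that happens to collide with a label would silently take that branch. A possible guard, sketched below with a made-up helper named option_index, is to confirm the match with std::strcmp inside each case.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

namespace fnv1a_64
{

typedef std::uint64_t hash_t;

constexpr hash_t prime = 0x100000001B3ull;
constexpr hash_t basis = 0xCBF29CE484222325ull;

constexpr hash_t hash_compile_time(char const* str, hash_t last_value = basis)
{
	return *str ? hash_compile_time(str+1, (*str ^ last_value) * prime) : last_value;
}

inline hash_t hash(char const* str)
{
	hash_t ret{basis};
	while(*str){
		ret ^= *str;
		ret *= prime;
		str++;
	}
	return ret;
}

}

constexpr unsigned long long operator""_hash(char const* p, std::size_t)
{
	return fnv1a_64::hash_compile_time(p);
}

// Hypothetical helper: returns 1 or 2 for the known options and 0 for
// anything else, including inputs whose hash merely collides.
inline int option_index(char const* str)
{
	switch(fnv1a_64::hash(str)){
	case "first"_hash:
		return std::strcmp(str, "first") == 0 ? 1 : 0;
	case "second"_hash:
		return std::strcmp(str, "second") == 0 ? 2 : 0;
	default:
		return 0;
	}
}
```

The extra comparison costs one string compare per successful lookup, which keeps the switch as the fast path while removing the dependence on the hash being collision-free.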

5 thoughts on “Switch on strings with C++11”

  1. New Clang doesn’t like this:

    constexpr auto const_test{ integral_constant<fnv1a_64::hash_t, "anything"_hash>::value };

    integral_constant is not constexpr for some reason. Dunno if that can be done otherwise, or this is simply a bug in clang (clang 4.2).

    1. I can reproduce it with my clang 3.2.

      I’ll look it up. I’m rather hesitant to call “compiler bug!”

    2. you can change it to:
      constexpr auto const_test = integral_constant<fnv1a_64::hash_t, "anything"_hash>::value;

      Both work with g++ 4.7.2, but clang 3.2 only seems to accept the second version.

  2. // not sure if your character buffer allows, but here is my version, thought you might enjoy it

    #include <iostream>
    #include <type_traits>
    #include <chrono>
    #include <cstdint>
    #include <cstdio>
    #include <cstdlib>
    #include <cstring>

    namespace fnv1a_64
    {

    typedef std::uint64_t hash_t;

    constexpr hash_t prime = 0x100000001B3ull;
    constexpr hash_t basis = 0xCBF29CE484222325ull;

    constexpr hash_t hash_compile_time(char const* str, hash_t last_value = basis)
    {
    return *str ? hash_compile_time(str+1, (*str ^ last_value) * prime) : last_value;
    }

    hash_t hash(char const* str)
    {
    hash_t ret{basis};

    while(*str){
    ret ^= *str;
    ret *= prime;
    str++;
    }

    return ret;
    }

    }

    constexpr unsigned long long operator "" _hash(char const* p, size_t)
    {
    return fnv1a_64::hash_compile_time(p);
    }

    char *
    swap_random_chars ( char * result_buffer, const char * word ) {

    using namespace std::chrono;

    time_point<system_clock> now = system_clock::now();
    int word_length = strlen( word )
    , swap_char;
    long epoch = duration_cast<seconds>( now.time_since_epoch() ).count() // duration type was lost in the original; seconds is an assumption
    , offset_1 = epoch % word_length
    , offset_2 = ( epoch / word_length ) % ( word_length - 1 );

    if ( offset_1 <= offset_2 ) offset_2++;

    strcpy( result_buffer, word );
    swap_char = *( result_buffer + offset_1 );
    *( result_buffer + offset_1 ) = *( result_buffer + offset_2 );
    *( result_buffer + offset_2 ) = swap_char;

    return result_buffer;
    }

    int
    usage ( int return_value )
    {
    const char program_name[] = "switch-on-strings-with-c11";
    char swap_buffer[sizeof(program_name) + 1];

    printf( "Usage: %s FORMAT\n", swap_random_chars( swap_buffer, program_name ) );

    return return_value;
    }

    void simple_switch(char const* str)
    {
    using namespace std;
    switch(fnv1a_64::hash(str)){
    case "first"_hash:
    cout << "1st one" << endl;
    break;
    case "second"_hash:
    cout << "2nd one" << endl;
    break;
    case "third"_hash:
    cout << "3rd one" << endl;
    break;
    case "-?"_hash:
    case "-h"_hash:
    case "-u"_hash:
    case "--help"_hash:
    case "--usage"_hash:
    usage( 0 );
    break;
    default:
    exit( usage( 1 ) );
    }
    }

    int main( int argc, char * argv[] )
    {
    using namespace std;
    constexpr auto const_test{ integral_constant<fnv1a_64::hash_t, "anything"_hash>::value };
    int argi = 0;

    //auto const_test2{ integral_constant<fnv1a_64::hash_t, fnv1a_64::hash("anything")>::value };

    while ( ++argi < argc ) simple_switch(argv[argi]);
    simple_switch("first");
    simple_switch("uh uh");
    simple_switch("second");
    simple_switch("third");
    simple_switch("another wrong one");
    }
