# Presenter's manual

Pressing **C** will open a cloned view of the current slideshow in a new browser window.

Pressing **P** will toggle presenter mode.

---
class: center, middle

# Switch for Strings

##### _

Sven Johannsen
sven@sven-johannsen.de
www.sven-johannsen.de

---
class: center, middle

# Switch for Strings

##### with constexpr hash functions

Sven Johannsen
sven@sven-johannsen.de
www.sven-johannsen.de

---

# The Problem

```cpp
std::string foo();
...
switch(foo())   // error: std::string is not an integral or enumeration type
{
case "Hello":
    cout << "Hello\n";
    break;
case "World":
    cout << "World\n";
    break;
default:
    cout << "something else...\n";
}
```

---

# switch statement (from cppreference)

![switch](media/cppreference-switch.png)

***condition*** - any expression of integral or enumeration type, or of a class type contextually implicitly convertible to an integral or enumeration type, or a declaration of a single non-array variable of such type with a brace-or-equals initializer.

***constant_expression*** - a **constant expression** of the same type as the type of **condition** after conversions and **integral promotions**

---

# How to convert a string to:

### an **expression** of **integral** or enumeration type?

---

# How to convert a string to:

### an **expression** of **integral** or enumeration type?

with a hash function!

---

# How to convert a string to:

### an **expression** of **integral** or enumeration type?

with a hash function!

### a **constant expression** of the same type?

---

# How to convert a string to:

### an **expression** of **integral** or enumeration type?

with a hash function!

### a **constant expression** of the same type?

with a constexpr hash function!

---

# The Idea

A very simple approach (needs some tweaks ;-) )

```cpp
std::string foo();
std::size_t constexpr myHash(std::string_view);
...
switch(myHash(foo()))   // hashes the string at run time; the result defines the switch type
{
case myHash("Hello"):   // hashes "Hello" at compile time
    cout << "Hello\n";
    break;
case myHash("World"):
    cout << "World\n";
    break;
default:
    cout << "something else...\n";
}
```

---

# Simple constexpr hash function

```cpp
size_t constexpr myHash(std::string_view val)
{
    size_t hash = 0;
    for (const auto c : val)
    {
        hash = 31 * hash + static_cast<size_t>(c);
    }
    return hash;
}
```

The hash function used by Java and Qt, written with C++14 constexpr (not copied from the original sources).

---

# Nicer syntax (but still needs some tweaks)

```cpp
std::size_t constexpr myHash(std::string_view);

size_t constexpr operator""_myHash(const char* val, size_t len)
{
    return myHash(std::string_view(val, len));
}
...
switch(myHash(foo()))   // hashes the string at run time; the result defines the switch type
{
case "Hello"_myHash:    // hashes "Hello" at compile time
    cout << "Hello\n";
    break;
case "World"_myHash:
    cout << "World\n";
    break;
default:
    cout << "something else...\n";
}
```

---

# Hash Collision

Hash collision: two different strings are represented by the same hash value.

##### Why?

A hash function reduces arbitrarily long strings to integer values.

More than 2^32 combinations are possible for strings longer than 4 characters.

(More than 2^64 combinations are possible for strings longer than 8/10 characters.)

Example:

```
4 character long strings => 256^4 combinations. (256^4 == 2^32)
```

An imperfect hash function makes collisions more likely.

---

# Hash Collision

```cpp
size_t constexpr myHash(std::string_view val)
{
    size_t hash = 0;
    for (const auto c : val)
    {
        hash = 31 * hash + static_cast<size_t>(c);
    }
    return hash;
}
```

```cpp
"BB"
31 *  0 + 66 =   66 // B
31 * 66 + 66 = 2112 // BB

"Aa"
31 *  0 + 65 =   65 // A
31 * 65 + 97 = 2112 // Aa
```

---

# Hash Collision (A)

```cpp
switch(myHash("Aa"))
{
case myHash("BB"):
    cout << "BB\n";
    break;
default:
    cout << "something else...\n";
}
```

```
> BB
```

---

# Fix Hash Collision (A)

```cpp
switch(myHash(val))
{
case myHash("BB"):
    if (val != "BB") goto hash_default;   // same hash, different string
    cout << "BB\n";
    break;
case myHash("World"):
    if (val != "World") goto hash_default;
    ...
hash_default:
default:
    cout << "something else...\n";
}
```

---

# Reduce boilerplate code (A)

```cpp
#define STRCASE(str, val) case myHash(str): if (val != str) goto hash_default;
#define STRDEFAULT hash_default: default

switch(myHash(val))
{
STRCASE("BB", val)
    cout << "BB\n";
    break;
STRCASE("Hello", val)
    cout << "Hello\n";
    break;
STRDEFAULT:
    cout << "something else...\n";
}
```

---

# Hash Collision (B)

```cpp
switch(myHash(val))
{
case myHash("Aa"):
    cout << "Aa\n";
    break;
case myHash("BB"):
    cout << "BB\n";
    break;
default:
    cout << "something else...\n";
}
```

```
.../main03.cpp:23:10: error: duplicate case value '2112'
    case myHash("BB"):
```

---

# Fix Hash Collision (B)

```cpp
switch(myHash(val))
{
case myHash("Aa"):
    static_assert(myHash("Aa") == myHash("BB"));   // both strings share this case label
    if (val == "Aa")
        cout << "Aa\n";
    else if (val == "BB")
        cout << "BB\n";
    else
        goto hash_default;
    break;
case myHash("Hello"):
    ...
hash_default:
default:
    cout << "something else...\n";
}
```

This case is very unlikely in practice.

---

# Find a better hash function?

Long strings require additional checks.

It is more important to have a fast hash function.

The additional checks compensate for an imperfect hash function.
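---

# Putting it together (sketch)

The pieces from the previous slides can be combined into one compilable unit. This is only a minimal sketch, assuming C++17: `classify` is a hypothetical helper that is not part of the talk, and it uses `break` instead of the `goto hash_default` shown earlier, so that the function can itself stay `constexpr`.

```cpp
#include <cstddef>
#include <string_view>

// Java-style hash from the slides.
constexpr std::size_t myHash(std::string_view val)
{
    std::size_t hash = 0;
    for (const char c : val)
    {
        hash = 31 * hash + static_cast<std::size_t>(c);
    }
    return hash;
}

// User-defined literal, so that case labels read "Hello"_myHash.
constexpr std::size_t operator""_myHash(const char* val, std::size_t len)
{
    return myHash(std::string_view(val, len));
}

// Hypothetical helper (not from the talk): maps a string to a label.
// Every case re-checks the string, so a colliding input leaves the switch
// and ends up in the "something else..." handling.
constexpr std::string_view classify(std::string_view val)
{
    switch (myHash(val))
    {
    case "BB"_myHash:
        if (val != "BB") break;      // hash collision, e.g. "Aa"
        return "BB";
    case "Hello"_myHash:
        if (val != "Hello") break;
        return "Hello";
    default:
        break;
    }
    return "something else...";
}

// "Aa" and "BB" collide (both hash to 2112), yet "Aa" is not reported as "BB".
static_assert(myHash("Aa") == myHash("BB"));
static_assert(classify("Aa") == "something else...");
```

Because everything is `constexpr`, the `static_assert` lines check the collision handling at compile time; at run time `classify(foo())` would be called in the same way.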
---

# Alternative constexpr hash algorithms

* Java & Java (C++11)
* djb2 & djb2a (Dan Bernstein)
* sdbm
* fnv1a (Fowler, Noll and Vo) with specializations for 32 and 64 bit

[hash.h](./hash.h)

---

# Alternative use case for constexpr hash functions

```cpp
enum struct UIEventType : uint32_t
{
    unknown = 0,
    enter   = id32("enter"),
    exit    = id32("exit"),
    press   = id32("press"),
    release = id32("release"),
    scroll  = id32("scroll"),
    move    = id32("move"),
    key     = id32("key"),
    text    = id32("text")
};
```

Idea from Sven Bergström. (https://notes.underscorediscovery.com/constexpr-fnv1a/)

---

# A few words about std::hash

* The specializations for `std::string` and `std::string_view` generate the same results
* No specialization for `char*`! -> falls back to the pointer hash (like `void*`), so the address is hashed, not the characters
* No `noexcept` (see `unordered_???::erase()`)
* Const but **not `constexpr`**

---

# Links

* [SO: Compile time string hashing / Java and Qt hash functions https://stackoverflow.com/questions/2111667/compile-time-string-hashing](https://stackoverflow.com/questions/2111667/compile-time-string-hashing)
* [SE: Which hashing algorithm is best for uniqueness and speed? https://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed](https://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed)
* [oz: djb2, sdbm, lose lose http://www.cse.yorku.ca/~oz/hash.html](http://www.cse.yorku.ca/~oz/hash.html)
* [oz blog http://nextbit.blogspot.com/](http://nextbit.blogspot.com/)
* [Sven Bergström - underscorediscovery - fnv1a https://notes.underscorediscovery.com/constexpr-fnv1a/](https://notes.underscorediscovery.com/constexpr-fnv1a/)
* [Sven Bergström gist https://gist.github.com/underscorediscovery/81308642d0325fd386237cfa3b44785c](https://gist.github.com/underscorediscovery/81308642d0325fd386237cfa3b44785c)
* [Sven Johannsen Switch für Strings http://sven-johannsen.de/slides/string-switch20181108/string-switch.html#2](http://sven-johannsen.de/slides/string-switch20181108/string-switch.html#23)

---

# Questions?

```cpp
hash_Java( "Questions")=72396190474541
hash_djb2( "Questions")=249859238103114832
hash_djb2a("Questions")=249846534941830808
hash_sdbm( "Questions")=14239328396920791565
hash_fnv1a("Questions")=3484669616083914402
std::hash( "Questions")=16616952683238146753

hash_Java( "?")=63
hash_djb2( "?")=177636
hash_djb2a("?")=177626
hash_sdbm( "?")=63
hash_fnv1a("?")=12638141021067257134
std::hash( "?")=5353594561807755293
```
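
---

# Backup: constexpr FNV-1a (sketch)

The `id32("enter")` enum slide relies on a constexpr FNV-1a hash. The following is a minimal sketch of that idea, assuming C++17. The names `fnv1a32`, `fnv1a64` and `DemoEvent` are hypothetical and are not taken from `hash.h` or from Sven Bergström's gist; the constants are the published FNV offset bases and primes.

```cpp
#include <cstdint>
#include <string_view>

// 32-bit FNV-1a: XOR the byte into the hash, then multiply by the FNV prime.
constexpr std::uint32_t fnv1a32(std::string_view s)
{
    std::uint32_t hash = 2166136261u;              // 32-bit offset basis
    for (const char c : s)
    {
        hash ^= static_cast<unsigned char>(c);
        hash *= 16777619u;                         // 32-bit FNV prime
    }
    return hash;
}

// 64-bit variant of the same algorithm.
constexpr std::uint64_t fnv1a64(std::string_view s)
{
    std::uint64_t hash = 14695981039346656037ull;  // 64-bit offset basis
    for (const char c : s)
    {
        hash ^= static_cast<unsigned char>(c);
        hash *= 1099511628211ull;                  // 64-bit FNV prime
    }
    return hash;
}

// Hypothetical enum in the style of the UIEventType slide.
enum struct DemoEvent : std::uint32_t
{
    unknown = 0,
    enter   = fnv1a32("enter"),
    exit    = fnv1a32("exit")
};

// An empty string yields the offset basis, and different names yield
// different enumerator values here.
static_assert(fnv1a32("") == 2166136261u);
static_assert(DemoEvent::enter != DemoEvent::exit);
```

Casting each character to `unsigned char` before the XOR keeps the result independent of whether plain `char` is signed on the target platform.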