Posted by Joe Chu in cpp
Performance Gains with Union
The power of Unions for efficient memory usage and performance improvement.
1. Union Basics
In C++, a union is a special data structure that allows multiple members to share the same memory location. All members of a union start at the same memory address, meaning that at any given time, only one member of the union can hold a value. This contrasts with a struct, where each member has its own memory location.
The memory allocated for a union is at least the size of its largest member (padded if needed for alignment), and all members share that same memory space.
Only one member, the active member, holds a valid value at any time. Writing to a different member makes that member active, and reading a member other than the active one is generally undefined behavior in C++.
2. Use Case
Here is a typical scenario where a union can make the code more efficient. Our goal is to store values of several different types: int, float, bool, and a string. In the less efficient implementation, we dynamically allocate memory for each value. In the optimized version, we use a union to hold all the required variable types. This lets the different types share the same memory inside the object, eliminating the separate memory allocations.
    // Requires <chrono>, <iostream>, and <vector>.
    // RegularData and numOperations are defined elsewhere (not shown).
    void performanceTestRegular() {
        std::vector<RegularData> data;
        auto start = std::chrono::high_resolution_clock::now();
        // Perform operations using the regular (non-union) type
        for (int i = 0; i < numOperations; ++i) {
            switch (i % 4) {
            case 0: // Use int data
                data.emplace_back(1);
                break;
            case 1: // Use float data
                data.emplace_back(1.f);
                break;
            case 2: // Use bool data
                data.emplace_back(true);
                break;
            case 3: // Use string data
                data.emplace_back("hello");
                break;
            default:
                break;
            }
        }
        auto end = std::chrono::high_resolution_clock::now();
        auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
        std::cout << "Performance Test (Regular):\n";
        std::cout << "Time taken for " << numOperations << " operations: "
                  << duration.count() << " milliseconds\n";
    }
    // Requires <chrono>, <iostream>, and <vector>.
    // OptimizedData and numOperations are defined elsewhere (not shown).
    void performanceTestOptimized() {
        std::vector<OptimizedData> data;
        auto start = std::chrono::high_resolution_clock::now();
        // Perform operations using the union
        for (int i = 0; i < numOperations; ++i) {
            switch (i % 4) {
            case 0: // Use int data
                data.emplace_back(1);
                break;
            case 1: // Use float data
                data.emplace_back(1.f);
                break;
            case 2: // Use bool data
                data.emplace_back(true);
                break;
            case 3: // Use string data
                data.emplace_back("hello");
                break;
            default:
                break;
            }
        }
        auto end = std::chrono::high_resolution_clock::now();
        auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
        std::cout << "Performance Test (Optimized):\n";
        std::cout << "Time taken for " << numOperations << " operations: "
                  << duration.count() << " milliseconds\n";
    }
    Performance Test (Optimized):
    Time taken for 1000000 operations: 74 milliseconds
    Performance Test (Regular):
    Time taken for 1000000 operations: 96 milliseconds
In this run the union version took about 23% less time: 74 ms versus 96 ms, and (96 - 74) / 96 ≈ 0.23.
3. Conclusion
The optimized version using a union is faster because:
The regular version uses new, so every element requires a separate dynamic allocation on the heap. The optimized version stores the value in a union inside the object itself, so constructing an element needs no allocation beyond the vector's own buffer. Each heap allocation is a call into the allocator, which costs far more than writing into storage that already exists.
Dynamically allocated memory can be scattered across the heap, making memory access less predictable and more prone to cache misses. The optimized version stores data within the object, so the elements of the vector are contiguous in memory, which improves cache locality.
In the regular version, every access dereferences a pointer, which adds overhead. Copying or moving a RegularData object involves duplicating or reassigning pointers, potentially leading to more heap operations. The union stores the data inline, so no pointer dereference is required, and copying or moving an OptimizedData object only involves copying the fixed-size union.