In this section, we measure and compare the delays that several garbage collectors impose on the application program.
A traditional garbage collector suspends your application threads, either all at once or one by one, during garbage collection. The pause times vary with the application environment: the depth of the execution stack, the number of application threads, the workload to be done, and so on. For this test, we chose Microsoft .NET Framework C++ with the CLR (Common Language Runtime) as a representative state-of-the-art collector in the industry, and the Boehm-Demers-Weiser garbage collector as a popular conservative collector for C/C++, to compare with our HnxGC.
By design, HnxGC performs garbage collection either in a dedicated thread, or in the current thread when that thread explicitly invokes an HnxGC system service, e.g. using gcnew to create an object. So, if an application thread is *not* inside an HnxGC system service, it should not be disturbed by garbage collection. The undisturbed operations include all native operations, as well as pointer assignments that change the reference graph.
During a pointer assignment, the target object may become zero-referenced and ready for reclamation. Destructing that object may cause more objects, and their descendants, to become ready for reclamation, and this cascade of nested destructions may take an unbounded amount of time. HnxGC provides an API to postpone these destructions until the next explicit HnxGC service call of the same thread, or to have them done at a global level. Therefore, a time-critical thread that has declared such a postponement request can run without interference, and is free to change reference relationships without worrying about delays caused by zero-referenced objects.
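The postponement idea can be illustrated with a minimal, portable C++ sketch. It is not HnxGC's API: `Node`, `make_tree`, `DeferredQueue`, `release_deferred`, and `drain` are hypothetical names, and plain `std::unique_ptr` ownership stands in for HnxGC's reference tracking. The time-critical path only appends to a pre-reserved list, so no destructor runs there; the possibly long cascade of destructions happens later, at a point the thread chooses.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

int g_live_nodes = 0;  // count of live nodes, kept only for illustration

// Hypothetical node type: destroying one node destroys its whole subtree.
struct Node {
    std::unique_ptr<Node> left, right;
    Node() { ++g_live_nodes; }
    ~Node() { --g_live_nodes; }
};

// Builds a complete binary tree of the given depth (depth 0 is a single node).
std::unique_ptr<Node> make_tree(int depth) {
    auto n = std::make_unique<Node>();
    if (depth > 0) {
        n->left = make_tree(depth - 1);
        n->right = make_tree(depth - 1);
    }
    return n;
}

// Hypothetical per-thread queue of objects whose destruction is postponed.
class DeferredQueue {
public:
    explicit DeferredQueue(std::size_t capacity) { pending_.reserve(capacity); }

    // Time-critical path: give up the reference without running destructors.
    // No allocation happens while the reserved capacity is not exceeded.
    void release_deferred(std::unique_ptr<Node> obj) {
        assert(pending_.size() < pending_.capacity());
        pending_.push_back(std::move(obj));
    }

    // Non-critical path: run all postponed destructions and return how many
    // top-level objects were reclaimed.
    std::size_t drain() {
        std::size_t n = pending_.size();
        pending_.clear();  // destructors and their cascades run here
        return n;
    }

private:
    std::vector<std::unique_ptr<Node>> pending_;
};
```

In this sketch, a critical thread would call `release_deferred` whenever it drops a reference, and `drain` would be called from a point where an unbounded delay is acceptable.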
Note: HnxGC does not guarantee that an HnxGC service returns within a bounded time. For example, every time you use gcnew to create an object, keep in mind that the call may block indefinitely if the operating system runs short of physical memory and needs to swap data to disk. Therefore, we do not recommend creating objects in a time-critical thread; create them in other, less critical threads and pass them to the time-critical thread.
To measure the delays caused by garbage collection, we created the following test program. It uses timeSetEvent to start a multimedia timer that pulses a specific event object every 5 milliseconds with 1 millisecond resolution (the highest resolution we can get from our off-the-shelf computer). A time-critical thread with the highest real-time priority waits on that event object; when the timer fires, the critical thread records the time in a large internal data buffer that can be checked later, and returns.
The time-critical thread avoids any operation that may block, such as input/output or file operations. It just writes some data to a pre-allocated memory buffer (in the hope that no page faults or disk swaps occur). The process is also set to the highest priority class, REALTIME_PRIORITY_CLASS, to minimize the impact of other processes on the same system. All of this code is defined in the class CDelayTester (see the source code: benchmarks / Pauseless / DelayTester.cpp).
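The shape of such a recorder can be sketched in portable C++. This is not the code of CDelayTester: `record_ticks` is a hypothetical name, `std::this_thread::sleep_until` stands in for waiting on the event pulsed by the multimedia timer, and the sketch sets no thread or process priority. What it does keep is the essential structure: the buffer is allocated before the loop, and the loop itself only waits and stores a timestamp.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

using Clock = std::chrono::steady_clock;

// Waits for `count` periodic deadlines and records the time at which each
// wake-up actually happened.
std::vector<Clock::time_point> record_ticks(std::chrono::microseconds period,
                                            std::size_t count) {
    assert(period.count() > 0);
    // Allocated and value-initialized up front, so the memory is already
    // touched before the timing loop starts.
    std::vector<Clock::time_point> ticks(count);
    Clock::time_point deadline = Clock::now();
    for (std::size_t i = 0; i < count; ++i) {
        deadline += period;                       // absolute deadlines avoid drift
        std::this_thread::sleep_until(deadline);  // stand-in for the timer event
        ticks[i] = Clock::now();                  // store only, no I/O
    }
    return ticks;
}
```

Calling `record_ticks(std::chrono::milliseconds(5), n)` mirrors the 5 millisecond period used in the tests; the differences between consecutive entries are the intervals plotted in the graphs.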
Now, let us begin the tests.
(1) IDEAL GC --- First, we built and tested an idle program that just sleeps and waits for user input, and does *nothing* related to garbage collection. The result represents what an ideal garbage collector can achieve in the current testing environment. (source code: benchmarks / pauseless / ideal.cpp)
(Figure: a portion of the ideal test result. Full data and a full graph are also available.)
Explanation: the height of each green bar represents the length of one timer interval, which in theory should be 5 milliseconds. The maximum value of the Y-axis in this graph is 100 milliseconds. The peak (maximum) interval length is 5.999007 milliseconds, which is attributable to the 1 millisecond timer resolution and to measurement inaccuracy.
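For readers who want to reproduce the numbers behind such a graph, the following sketch shows how the interval lengths and their peak could be derived from a list of recorded tick times. `IntervalStats` and `interval_stats` are hypothetical names, and the tick times are assumed to have already been converted to milliseconds.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct IntervalStats {
    double peak_ms;       // longest interval between consecutive ticks
    double worst_dev_ms;  // largest absolute deviation from the nominal period
};

// Computes interval statistics from tick times given in milliseconds.
IntervalStats interval_stats(const std::vector<double>& tick_times_ms,
                             double nominal_ms) {
    assert(tick_times_ms.size() >= 2);
    IntervalStats s{0.0, 0.0};
    for (std::size_t i = 1; i < tick_times_ms.size(); ++i) {
        double interval = tick_times_ms[i] - tick_times_ms[i - 1];
        s.peak_ms = std::max(s.peak_ms, interval);
        s.worst_dev_ms = std::max(s.worst_dev_ms,
                                  std::fabs(interval - nominal_ms));
    }
    return s;
}
```

With a nominal period of 5 milliseconds, a peak just under 6 milliseconds corresponds to a worst deviation of just under one resolution step of the timer.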
(2) .NET GC --- We let the .NET CLR program keep creating trees of different depths and discarding them, as GCBench does. The result shows that the impact of the .NET collector is obvious and cannot be ignored. (source code: benchmarks / pauseless / msclr.cpp)
(Figure: a portion of the Microsoft .NET CLR test result. Full data and a full graph are also available.)
(3) BDW GC --- We let the Boehm-Demers-Weiser test program do the same work as the .NET CLR program. The result shows that the impact of the garbage collector is still observable. (source code: benchmarks / pauseless / bdwgc.cpp)
(Figure: a portion of the Boehm-Demers-Weiser GC test result. Full data and a full graph are also available.)
(4) Finally, we tested our HnxGC.
We added more workload to the test program. One thread ran GCBench, and two other threads kept creating objects for the time-critical thread to consume. At every timer event, the time-critical thread consumed these objects in the CriticalConsumer function by releasing its references to them. Thus, the time-critical thread was changing the reference graph and turning some objects into garbage.
To simplify the communication between the real-time critical thread and the normal low-priority threads, we use a lock-free technique: the threads share a global pointer, which they exchange with the InterlockedExchangePointer Windows API function.
Allowing for the current measurement inaccuracy, the result looks nearly perfect, almost the same as the ideal test graph. We tested different types of managed HnxGC objects and found no significant difference. We post only one of the results here to save web space, since they are almost the same.
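A portable analog of this hand-off can be sketched with `std::atomic`, whose `exchange` member plays the role that InterlockedExchangePointer plays on Windows. The single-slot design and the names `Item`, `g_slot`, `publish`, and `take` are assumptions made for illustration; the benchmark's actual data structure is not described here.

```cpp
#include <atomic>
#include <cassert>

struct Item { int value; };  // hypothetical payload

std::atomic<Item*> g_slot{nullptr};  // the shared global pointer

// Producer side: publish a new item. Returns whatever was still in the slot
// (nullptr if the consumer had already taken the previous item).
Item* publish(Item* fresh) {
    assert(fresh != nullptr);
    return g_slot.exchange(fresh, std::memory_order_acq_rel);
}

// Consumer side: take whatever is in the slot and leave it empty.
// This is a single atomic exchange and never waits on a lock.
Item* take() {
    return g_slot.exchange(nullptr, std::memory_order_acq_rel);
}
```

In this sketch, whoever receives a non-null pointer from `publish` or `take` becomes responsible for it. The time-critical thread would call `take` at each timer event and simply drop the reference it gets back.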
(Figure: a portion of the HnxGC test result. Full data and a full graph are also available, along with the full data and full graph of the optimized version.)
You can download the test source code, modify it as you like, and run it on your own machine to get a feel for the pauseless feature of HnxGC.
See Also:

| About HnxGC |
|---|
| Overview |
| Download HnxGC |
| Installation |
| "Hello World" |
| Programming Guide |