Today’s microprocessors are incredibly dense, with billions of transistors packed into a single design, and that design must be verified to ensure that it is correct. Surprisingly to many, most of a processor’s development time is spent not on design but on verification. Bugs that are not caught before a product is released can cost companies billions of dollars. Even when a bug is detected, tracking down its source consumes a great deal of time and resources. Bugs tend to hide in extremely complex corner cases and are often detected long after they occur. Engineers then engage in a time-consuming, mostly ad hoc triaging process to identify the design units responsible for a bug before carrying out a unit-level analysis to find its root cause.
To accurately pinpoint the root cause of bugs, a research team including Thurnau Professor Valeria Bertacco, Prof. Scott Mahlke, and CSE graduate students Biruk Mammo and Daya S. Khudia has proposed BugMD, an automatic bug triaging solution that collects multiple architectural-level mismatches and employs a classifier to pinpoint buggy design units. BugMD compares a design’s architected state with a golden state from an instruction set simulator to collect multiple symptoms of a single bug in a single test run. Together, these manifestations form a bug signature, which is passed to a machine learning backend that predicts the most likely bug sites.
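The comparison step described above can be illustrated with a short sketch. It is not BugMD's actual implementation: the trace format (one dictionary of state-element values per retired instruction), the function names `collect_mismatches` and `bug_signature`, and the choice of per-element mismatch counts as the signature are all assumptions made for illustration.

```python
from collections import Counter


def collect_mismatches(design_trace, golden_trace):
    """Compare the design's architected state with the golden model's state.

    Each trace is a list with one dict per retired instruction, mapping a
    state-element name (e.g. "pc", "r1") to its value. This format is a
    hypothetical stand-in for whatever the real tool flow records.
    """
    mismatches = []
    for step, (dut, gold) in enumerate(zip(design_trace, golden_trace)):
        for name in sorted(gold):
            if dut.get(name) != gold[name]:
                mismatches.append((step, name))
    return mismatches


def bug_signature(mismatches):
    """Summarize mismatches as a count per state element (illustrative features)."""
    return dict(Counter(name for _, name in mismatches))


# Toy traces: the design diverges from the golden model at the second instruction.
golden = [{"pc": 4, "r1": 1}, {"pc": 8, "r1": 2}, {"pc": 12, "r1": 3}]
design = [{"pc": 4, "r1": 1}, {"pc": 8, "r1": 0}, {"pc": 16, "r1": 0}]

mismatches = collect_mismatches(design, golden)
signature = bug_signature(mismatches)
print(mismatches)  # [(1, 'r1'), (2, 'pc'), (2, 'r1')]
print(signature)   # {'r1': 2, 'pc': 1}
```

In this sketch, a feature vector like `signature` is what would be handed to a classifier; the features BugMD actually uses may differ.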
To train the machine-learning classifier, the researchers developed a synthetic bug injection framework that generates large training datasets when real, previously diagnosed bug signatures are unavailable or insufficient. Although BugMD relies only on architectural-level mismatches and uses no microarchitectural knowledge, the team's experiments show that it identifies the correct location over 70% of the time on the first try. When its three most likely candidates are considered, the buggy design unit is among them in over 90% of cases.
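The "first try" and "top three" figures correspond to what is usually called top-1 and top-3 accuracy. The sketch below shows how such a metric can be computed from ranked predictions. The function name `top_k_accuracy`, the design-unit names, and the toy data are hypothetical and do not reproduce the researchers' results.

```python
def top_k_accuracy(ranked_predictions, true_units, k):
    """Return the fraction of bugs whose true unit is among the top-k candidates.

    ranked_predictions: one list per bug, ordered from most to least likely unit.
    true_units: the design unit that actually contained each bug.
    """
    if not true_units or len(ranked_predictions) != len(true_units):
        raise ValueError("need one non-empty ranking per labeled bug")
    hits = sum(
        truth in ranked[:k]
        for ranked, truth in zip(ranked_predictions, true_units)
    )
    return hits / len(true_units)


# Toy data for four injected bugs (made up for illustration only).
ranked = [
    ["fetch", "decode", "lsu"],
    ["alu", "lsu", "decode"],
    ["decode", "fetch", "alu"],
    ["lsu", "alu", "fetch"],
]
truth = ["fetch", "lsu", "alu", "decode"]

print(top_k_accuracy(ranked, truth, k=1))  # 0.25
print(top_k_accuracy(ranked, truth, k=3))  # 0.75
```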