PhD Thesis

« Defining and using virtual platforms traces captured for debugging MPSoCs ».

Author: M. Cunha
Advisor: F. Pétrot
President of jury: T. Risset
Thesis reviewer(s): S. Pillement, Abdoulaye Gamatié
Thesis examiner(s): Luis-Miguel Santana-Ormeno
Doctoral thesis, Université Grenoble Alpes
Speciality: Computer Science
Defense: January 29, 2016
ISBN: 978-2-11-129208-6


The increasing complexity of Multiprocessor Systems on Chip (MPSoCs) makes engineers' lives harder, as bugs and inefficiencies can have a very broad range of sources. Hardware/software interactions are one of these sources, and their early identification and resolution is a priority for rapid system integration. However, due to the huge number of possible execution interleavings, reproducing the conditions under which a given error or performance issue occurs is very difficult. One solution to this problem consists of tracing an execution for later analysis. Obtaining the traces from real platforms goes against recent development processes, now broadly adopted by industry and academia, which rely on simulation to anticipate hardware/software integration. Multi/many-core systems on chip tend to have specific memory hierarchies that make the hardware simpler and more predictable, at the cost of having hardware details percolate towards the higher levels of the software stack. Despite developers' efforts, it is hard to make sure that all preventive measures are taken to ensure a given property, such as data coherency or the absence of race conditions. In this context, the debugging process is particularly tedious as it involves analyzing parallel execution flows. Executing a program many times is an integral part of conventional debugging, but the non-determinism due to parallel execution often leads to different execution paths and different behaviors. This thesis details the challenges and issues behind the production and exploitation of "well formed" traces in a transaction-accurate virtual prototyping environment that uses dynamic binary translation as its processor simulation technology. These traces contain causality relations among events, which firstly simplify the analysis and secondly avoid relying on timestamps. We propose a formalism to define the traces and detail an implementation that produces them in a non-intrusive manner.
We use these traces to help identify and correct bugs on multi/many-core platforms. We first introduce a method to identify potential cache coherence violations on non-cache-coherent platforms. By analyzing the traces, our method identifies potential violations that may occur during a given execution, for both write-through and write-back cache policies. We then focus on easing the debugging of parallel software running on MPSoCs using traces. To that end, we propose a debugging process that replays a faulty execution using traces. We detail a strategy for providing forward and reverse execution features to avoid long simulation times during a debug session. We conducted experiments on MPSoCs using parallel applications to quantify our proposal, and overall show that complex analysis and debug strategies can be implemented over traces, leading to deterministic results in less time than simulation alone.
