Theses and Dissertations

Permanent URI for this collection

https://hdl.handle.net/10217/100415


Open Access
Extending and validating the stencil processing unit
(Colorado State University. Libraries, 2016) Rajasree, Revathy, author; Rajopadhye, Sanjay, advisor; Pasricha, Sudeep, committee member; Anderson, Charles W., committee member
Stencils are an important class of programs that appear in the core of many scientific and general-purpose applications. These compute-intensive kernels can benefit heavily from the massive compute power of accelerators like the GPGPU. However, because the coarse-grain processors on a GPU have no form of on-chip communication, any data transfer or synchronization between dependent tiles in a stencil computation has to happen through the off-chip (global) memory, which is quite energy-expensive. On the road to exascale computing, energy is becoming an important cost metric, and the need for hardware and software that work together to reduce a system's energy consumption is growing. To make the execution of dense stencils more energy efficient, Rajopadhye et al. proposed a GPGPU-based accelerator called the Stencil Processing Unit (SPU), which introduces simple neighbor-to-neighbor communication between the Streaming Multiprocessors (SMs) on the GPU, thereby allowing some restricted data sharing between consecutive threadblocks. The SPU includes special storage units, called Communication Buffers, to orchestrate this data transfer and also provides an explicit mechanism for inter-threadblock synchronization by way of a special instruction. It claims to achieve energy efficiency, compared to GPUs, by reducing the number of off-chip accesses in stencils, which in turn reduces the dynamic energy overhead. Uguen developed a cycle-accurate performance simulator for the SPU, called SPU-Sim, and evaluated it using a matrix multiplication kernel, which was not suitable for this accelerator. This work focuses on extending SPU-Sim and evaluating the SPU architecture using a more insightful benchmark. We introduce a producer-consumer based inter-block synchronization approach on the SPU, which is more efficient than the previous global synchronization, and an overlapped multi-pass execution model in the SPU runtime system.
These optimizations have been implemented in SPU-Sim. Furthermore, the existing GPUWattch power model in the simulator has been refined to provide better power estimates for the SPU architecture. The improved architecture has been evaluated using a simple 2-D stencil benchmark, and we observe an average of 16% savings in dynamic energy on the SPU compared to a fairly close GPU platform. Nonetheless, the total energy consumption on the SPU is still comparatively high due to the static energy component. This high static energy is a direct consequence of the increased leakage power of the platform, which results from the inclusion of special load/store (L/S) units. Our conservative estimates indicate that replacing the current design of these L/S units with DMA engines can bring about a 15% decrease in the current leakage power of the SPU, which can help the SPU outperform the GPU in terms of energy.
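For readers unfamiliar with the workload, the following is a minimal sketch of a 2-D stencil and of the tile-boundary data that makes inter-tile communication necessary. It is not the benchmark used in the thesis: the 5-point Jacobi update, the function names `jacobi_step` and `tile_halo_reads`, and the rectangular tiling are all assumptions chosen for illustration.

```python
def jacobi_step(grid):
    """Apply one 5-point Jacobi sweep (assumed stencil); boundary cells stay fixed."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]  # copy so every read sees the old values
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = 0.25 * (
                grid[i - 1][j] + grid[i + 1][j] + grid[i][j - 1] + grid[i][j + 1]
            )
    return new


def tile_halo_reads(tile_rows, tile_cols):
    """Cells an interior tile must read from its four neighbor tiles per sweep.

    Hypothetical helper: a 5-point stencil needs one row or column from each
    side of the tile and no corner cells.
    """
    return 2 * tile_rows + 2 * tile_cols


if __name__ == "__main__":
    grid = [[1.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
    print(jacobi_step(grid)[1][1])
    print(tile_halo_reads(4, 8))
```

On a conventional GPU, the halo cells counted by `tile_halo_reads` are exchanged between threadblocks through global memory on every sweep. According to the abstract, this is the off-chip traffic that the SPU's Communication Buffers are intended to reduce.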
Open Access
Perfect tracking for non-minimum phase systems with applications to biofuels from microalgae
(Colorado State University. Libraries, 2010) Buehner, Michael R., author; Young, Peter M., advisor; Chong, Edwin Kah Pin, committee member; Scharf, Louis L., committee member; Anderson, Charles W., committee member
In a causal setting, a closed-loop control system receives reference inputs (with no a priori knowledge) that it must track. For this setting, controllers are designed that provide both stability and performance (e.g., to meet tracking and disturbance rejection requirements). Often, feedback controllers are designed to satisfy weighted optimization criteria (e.g., weighted tracking error) that are later validated using test signals such as step responses and frequency sweeps. Feedforward controllers may be used to improve the response to measurable external disturbances (e.g., reference inputs). In this way, they can improve the closed-loop response; however, these approaches do not directly specify the closed-loop response. Two controller architectures are developed that allow for directly designing the nominal closed-loop response of non-minimum phase systems. These architectures classify both the signals that may be perfectly tracked by a non-minimum phase plant and the control signals that provide this perfect tracking. For these architectures, perfect tracking means that the feedback error is zero (for all time) in the nominal case (i.e., the plant model is exact) when there are no external disturbances. For the controllers presented here, parts of the feedforward controllers are based on the plant model, while a separate piece is designed to provide the desired level of performance. One of the potential limitations to these designs is that the actual performance will depend upon the quality of the model used. Robustness tools are developed that may be used to determine the expected performance for a given level of model uncertainty. These robustness tools may also be used to design the piece of the feedforward controller that provides performance. There is a tradeoff between model uncertainty and achievable performance. In general, more model uncertainty will result in less achievable performance. 
Another way to approach the issue of performance is to consider that a good model must either be known a priori or learned via adaptation. In cases where a good model is difficult to determine a priori, adaptation may be used to improve the models in the feedforward controllers, which will, in turn, improve the performance of the overall control system. We show how adaptive feedforward architectures can improve performance for systems where the model is of limited accuracy. An example application of growing microalgae for biofuel production is presented. Microalgae have the potential to produce enough biofuels to meet current US fuel demands; however, progress has been limited, in part, by a lack of appropriate models and controllers. In the work presented here, models are developed that may be used to monitor the productivity of microalgae inside a photobioreactor and to develop control algorithms. We use experimental data from a functional prototype photobioreactor to validate these models and to demonstrate the advantages of the advanced controller architectures developed here.
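The difficulty with non-minimum phase plants can be illustrated with a small discrete-time example. This is a sketch under stated assumptions and not one of the architectures developed in the thesis: the plant y[k] = u[k-1] - 2u[k-2] (a zero at z = 2, outside the unit circle) and the function names are hypothetical. Exactly inverting this plant yields perfect tracking of a step, but only with a control signal that grows without bound.

```python
def simulate_plant(u):
    """Assumed plant y[k] = u[k-1] - 2*u[k-2], zero initial conditions."""
    y = []
    for k in range(len(u)):
        u1 = u[k - 1] if k >= 1 else 0.0
        u2 = u[k - 2] if k >= 2 else 0.0
        y.append(u1 - 2.0 * u2)
    return y


def naive_inverse_control(reference):
    """Exact plant inversion with one-step preview: u[k] = r[k+1] + 2*u[k-1].

    The recursion has a pole at z = 2, so u diverges for a step reference.
    """
    u = []
    prev = 0.0
    for k in range(len(reference) - 1):
        prev = reference[k + 1] + 2.0 * prev
        u.append(prev)
    return u


if __name__ == "__main__":
    r = [1.0] * 8
    u = naive_inverse_control(r)
    y = simulate_plant(u)
    print("control:", u)
    print("output: ", y)
```

In this sketch the output equals the reference from the first reachable sample onward while the control roughly doubles each step. This is why the set of signals that can be tracked perfectly with a bounded control has to be characterized rather than assumed.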
Open Access
Using locally observed swarm behaviors to infer global features of harsh environments
(Colorado State University. Libraries, 2021) Emmons, Megan R., author; Maciejewski, Anthony A., advisor; Chong, Edwin K. P., advisor; Anderson, Charles W., committee member; Young, Peter M., committee member
Robots in a swarm are programmed with individual behaviors, but interactions with the environment and other robots then produce more complex, emergent swarm behaviors. A partial differential equation (PDE) can be used to accurately quantify the distribution of robots throughout the environment at any given time if the robots have simple individual behaviors and there is a finite number of potential environments. A least mean square algorithm can then be used to compare a given observation of the swarm distribution to the potential models to accurately identify the environment being explored. This technique affirms that there is a correlation between the individual robot behaviors, the robot distribution, and the environment being explored. For more complex behaviors and environments, there is no closed-form model for the emergent behavior, but there is still a correlation that can be used to infer one property if the other two are known. A simple, single-layer neural network can replace the PDE and be trained to correlate local observations of the robot distribution to the environment being explored. The neural network approach allows for more sophisticated robot behaviors and more varied environments, and it is robust to variations in environment type and number of robots. By replacing the neural network with a simulated human rescuer who uses only locally observed velocity information to navigate a disaster scenario, the impact of fundamental swarm properties can be systematically explored. Further, the baseline swarm resilience can be quantified. Collectively, this development lays a foundation for using minimalist swarms, where robots have simple motions and no communication, to achieve collective sensing, which can be leveraged in a variety of applications where no other robotic solutions currently exist.
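The model-matching step described in this abstract can be sketched as picking the candidate environment whose predicted robot distribution has the smallest mean squared error against the observation. This is an assumption about the general idea, not the thesis's algorithm: the function names, the environment labels, and the four-region distributions below are invented for illustration.

```python
def mean_squared_error(observed, predicted):
    """Mean squared difference between two equal-length distributions."""
    if len(observed) != len(predicted):
        raise ValueError("distributions must have the same length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def identify_environment(observed, candidate_models):
    """Return the label of the candidate model closest to the observation.

    candidate_models maps a hypothetical environment label to the robot
    distribution that a model (for example, a PDE solution) predicts for it.
    """
    return min(
        candidate_models,
        key=lambda name: mean_squared_error(observed, candidate_models[name]),
    )


if __name__ == "__main__":
    # Illustrative fractions of the swarm in four regions (made-up numbers).
    models = {
        "open_room": [0.25, 0.25, 0.25, 0.25],
        "corridor": [0.70, 0.10, 0.10, 0.10],
    }
    observation = [0.60, 0.20, 0.10, 0.10]
    print(identify_environment(observation, models))
```

This only works when the set of candidate environments is finite and each has a model, which matches the condition stated in the abstract for the PDE-based approach.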


Browsing Theses and Dissertations by Author "Anderson, Charles W., committee member"
