
Parallel programming on GPU using CUDA / Christopher Umahaeyo; Supervisor: Öykü Akaydın

Author:

Umahaeyo, Christopher

Contributor(s):

Supervisor: Akaydın, Öykü

Language: English
Publication details: Nicosia, Cyprus International University, 2015
Description: XI, 42 p.; table, color figure, figure; 30.5 cm; CD
Content type:

text

Media type:

unmediated

Carrier type:

volume

Subject(s):

Incomplete contents:

1 CHAPTER 1

1 INTRODUCTION

3 CHAPTER 2

3 Introduction

3 LITERATURE REVIEW

3 Implementation of GPU

5 CHAPTER 3

5 GRAPHICAL PROCESSING UNIT

5 Introduction

5 Parallelism

7 Mainstream Computer with GPU

8 Architecture of Graphical Processing Unit

8 General Architecture

9 Today's GPU architecture

11 GeForce GT 440 and its "Fermi Architecture"

11 Streaming Multiprocessor

12 Memory

12 Memory types

13 Memory interactions

14 Kernel Scheduler

14 Multithreaded Instruction Unit

14 Streaming Processor

14 Load/Store Units

14 Special Function Units

15 Warp Scheduler

15 General Purpose Programming with the GPU (GPGPU)

16 CHAPTER 4

16 IMPLEMENTATION USING CUDA

16 Introduction

16 Parallelism with CUDA

17 Single Instruction, Multiple Threads (SIMT)

17 Single Instruction, Multiple Registers

17 Single Instruction, Multiple Addresses

18 Single Instruction, Multiple Flow Path

18 CUDA Program Structure

18 CUDA Threads, Blocks, Grid and Kernel

20 Kernel Function Call and Dimension

21 CUDA Memory

22 Heterogeneous Data Transfer

23 CUDA Software Stack

23 Matrix Multiplication Algorithm

24 Sequential Implementation

24 Sequential Pseudocode

24 Sequential Algorithm

25 Parallel Implementation

25 Parallel Pseudocode

25 Parallel Algorithm

26 CHAPTER 5

26 SIMULATION STUDY

26 Introduction

26 Simulation Environment

27 Simulation Results

27 Simulation Results for Sequential Algorithm

28 Simulation Results of Parallel Algorithm in 2D

31 Simulation Results of Parallel Algorithm with Dimension and Topology

31 Simulation Results for Dimension

33 Simulation Results for Topology

34 Performance Analysis

34 Speedup

36 CHAPTER 6

36 CONCLUSION

37 BIBLIOGRAPHY

Abstract: The true internal working of a parallel algorithm depends on the method of exploitation, as well as on the hardware capability and the environment in which it is exploited, whether for data-intensive or scientific purposes. In this thesis, we perform parallel programming on matrix multiplication using CUDA on a GPU and compare the results obtained with the sequential execution results on the CPU. Tests are also carried out on varied topologies, with the parallel algorithm executed in its best dimension, to determine suitability. It has been observed that, for large computational domains, the parallel implementation of the matrix multiplication provides a significant reduction. Keywords: GPGPU, CUDA, Parallel Computing, Topology, Dimension, Speedup.

Material type:

Thesis


Holdings
Item type	Current library	Collection	Call number	Status	Notes	Date due	Barcode	Item holds
Thesis	CIU LIBRARY Thesis Collection	Thesis Collection	YL 532 U43 2015	Available	Computer Engineering Department		T588

Total holds: 0


Includes CD

Includes references (p. 37-39)

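Illustrative note: the abstract compares a sequential CPU matrix multiplication with a parallel one in which the work is divided across GPU threads. The Python sketch below is not taken from the thesis and does not run on a GPU. It only emulates, on the CPU, the common decomposition in which each thread computes one output element, and it assumes the usual definition of speedup as sequential time divided by parallel time. The function names and the `block` parameter are hypothetical and chosen for illustration.

```python
def matmul_sequential(a, b):
    """Sequential baseline: triple loop over rows, columns, and the shared dimension."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                c[i][j] += a[i][p] * b[p][j]
    return c


def matmul_element(a, b, row, col):
    """Work of one hypothetical GPU thread: a single output element."""
    return sum(a[row][p] * b[p][col] for p in range(len(b)))


def matmul_grid(a, b, block=(2, 2)):
    """CPU emulation of a 2D launch with one 'thread' per output element.

    block is an assumed (rows, cols) thread-block shape; the grid is sized
    by ceiling division so that it covers the whole output matrix.
    """
    n, m = len(a), len(b[0])
    grid = (-(-n // block[0]), -(-m // block[1]))  # ceiling division
    c = [[0] * m for _ in range(n)]
    for by in range(grid[0]):
        for bx in range(grid[1]):
            for ty in range(block[0]):
                for tx in range(block[1]):
                    row = by * block[0] + ty
                    col = bx * block[1] + tx
                    if row < n and col < m:  # skip threads past the matrix edge
                        c[row][col] = matmul_element(a, b, row, col)
    return c


def speedup(t_sequential, t_parallel):
    """Speedup as the ratio of sequential to parallel run time (assumed definition)."""
    return t_sequential / t_parallel
```

Changing `block` (for example `(1, 4)` instead of `(2, 2)`) changes only how the output elements are grouped, not the result, which is one plausible reading of what the abstract calls varying the topology.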