8 changes: 8 additions & 0 deletions .idea/.gitignore

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions .idea/.name

2 changes: 2 additions & 0 deletions .idea/Homework-3.iml

247 changes: 247 additions & 0 deletions .idea/editor.xml

Large diffs are not rendered by default.

7 changes: 7 additions & 0 deletions .idea/misc.xml

8 changes: 8 additions & 0 deletions .idea/modules.xml

6 changes: 6 additions & 0 deletions .idea/vcs.xml

15 changes: 15 additions & 0 deletions README.md
@@ -205,3 +205,18 @@ git push origin student-name

Good luck, and enjoy accelerating matrix multiplication with CUDA!

Here is my results table, which also includes numbers from my table in Assignment 2:

| Test Case | Dimensions (\( m \times n \times p \)) | Naive CPU (s) | Blocked CPU (s) | Parallel CPU (s) | Naive CUDA (s) | Tiled CUDA (s) | Tiled CUDA Speedup (vs. Naive CUDA) | Parallel CPU Speedup (vs. Naive CPU) |
|-----------|----------------------------------------|---------------|-----------------|------------------|----------------|----------------|-------------------------------------|---------------------------------------|
| 0 | 64x64x64 | 0.000999928 | 0.00200009 | 0.00200009 | 0.000183 | 0.000108 | 1.686373x | 0.49994x |
| 1 | 128x64x128 | 0.00399995 | 0.00500011 | 0.000999928 | 0.000145 | 0.000090 | 1.611844x | 4.00024x |
| 2 | 100x128x56 | 0.00300002 | 0.00300002 | 0.00300002 | 0.000139 | 0.000079 | 1.749597x | 1x |
| 3 | 128x64x128 | 0.00600004 | 0.00399995 | 0.00200009 | 0.000266 | 0.000190 | 1.404657x | 2.99988x |
| 4 | 32x128x32 | 0.00100017 | 0.00099993 | 0 | 0.000190 | 0.000161 | 1.176062x | infx |
| 5 | 200x100x256 | 0.0210001 | 0.0209999 | 0.00700021 | 0.000264 | 0.000133 | 1.986526x | 2.99993x |
| 6 | 256x256x256 | 0.0650001 | 0.066 | 0.017 | 0.000335 | 0.000200 | 1.672361x | 3.82354x |
| 7 | 256x300x256 | 0.069 | 0.0839999 | 0.0180001 | 0.000440 | 0.000432 | 1.019646x | 3.83331x |
| 8 | 64x128x64 | 0.00300002 | 0.00300002 | 0.00300002 | 0.000194 | 0.000112 | 1.727454x | 1x |
| 9 | 256x256x257 | 0.810001 | 0.0669999 | 0.0669999 | 0.000446 | 0.000240 | 1.859331x | 5.39997x |
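
Each speedup in the table is a ratio of two measured times, and the `infx` entry for test case 4 comes from dividing by a measured time of 0. The sketch below shows one way to compute such a ratio while guarding against that case. The `speedup` helper is hypothetical and not part of the assignment code; it assumes both times are in seconds, and returning NaN for a zero reading is my own choice, on the assumption that a 0 reading means the run was shorter than the timer's resolution.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper (not from the assignment): speedup of an optimized
// run over a baseline run, with both times given in seconds.
// Returns NaN when the optimized time is not positive, because a timer
// reading of 0 usually means the run was too short to measure, not that
// it was infinitely fast.
double speedup(double baseline_s, double optimized_s) {
    assert(baseline_s >= 0.0 && optimized_s >= 0.0);
    if (!(optimized_s > 0.0)) {
        return std::numeric_limits<double>::quiet_NaN();
    }
    return baseline_s / optimized_s;
}
```

For example, `speedup(0.000335, 0.000200)` uses the rounded CUDA times from test case 6 and gives about 1.675, slightly off the 1.672361x in the table, which was presumably computed from unrounded timings.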

65 changes: 65 additions & 0 deletions data/0/result.raw

Large diffs are not rendered by default.

65 changes: 65 additions & 0 deletions data/0/result_naive.raw
129 changes: 129 additions & 0 deletions data/1/result.raw