niklas-pihl: Implemented CUDA matrix multiplication #11

Open
pihlnikl wants to merge 1 commit into parallelcomputingabo:main from pihlnikl:niklas-pihl

@pihlnikl

Implemented both naive and tiled multiplication.
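
For readers unfamiliar with the two approaches, here is a rough sketch of what a naive and a tiled kernel typically look like. It is not the code from this commit: the names `matmul_naive` and `matmul_tiled`, the tile width `TILE`, the row-major layout, and the test sizes are all assumptions made for illustration.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define TILE 16  // assumed tile width, not taken from the PR

// Naive: one thread per output element, every operand read from global memory.
// A is M x K, B is K x N, C is M x N, all row-major.
__global__ void matmul_naive(const float* A, const float* B, float* C,
                             int M, int N, int K) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < M && col < N) {
        float acc = 0.0f;
        for (int k = 0; k < K; ++k)
            acc += A[row * K + k] * B[k * N + col];
        C[row * N + col] = acc;
    }
}

// Tiled: each block stages TILE x TILE pieces of A and B in shared memory,
// so each global value is loaded once per block instead of once per thread.
__global__ void matmul_tiled(const float* A, const float* B, float* C,
                             int M, int N, int K) {
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];
    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.0f;
    for (int t = 0; t < (K + TILE - 1) / TILE; ++t) {
        int aCol = t * TILE + threadIdx.x;
        int bRow = t * TILE + threadIdx.y;
        // Out-of-range entries are padded with zeros so edge tiles stay correct.
        As[threadIdx.y][threadIdx.x] =
            (row < M && aCol < K) ? A[row * K + aCol] : 0.0f;
        Bs[threadIdx.y][threadIdx.x] =
            (bRow < K && col < N) ? B[bRow * N + col] : 0.0f;
        __syncthreads();  // tile fully loaded before anyone reads it
        for (int k = 0; k < TILE; ++k)
            acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        __syncthreads();  // everyone done reading before the next load
    }
    if (row < M && col < N) C[row * N + col] = acc;
}

int main() {
    const int M = 100, N = 70, K = 50;  // deliberately not multiples of TILE
    std::vector<float> hA(M * K), hB(K * N), hNaive(M * N), hTiled(M * N);
    for (int i = 0; i < M * K; ++i) hA[i] = (i % 7) * 0.5f;
    for (int i = 0; i < K * N; ++i) hB[i] = (i % 5) - 2.0f;

    float *dA, *dB, *dC;
    cudaMalloc(&dA, hA.size() * sizeof(float));
    cudaMalloc(&dB, hB.size() * sizeof(float));
    cudaMalloc(&dC, hNaive.size() * sizeof(float));
    cudaMemcpy(dA, hA.data(), hA.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB.data(), hB.size() * sizeof(float), cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);
    dim3 grid((N + TILE - 1) / TILE, (M + TILE - 1) / TILE);

    matmul_naive<<<grid, block>>>(dA, dB, dC, M, N, K);
    cudaMemcpy(hNaive.data(), dC, hNaive.size() * sizeof(float),
               cudaMemcpyDeviceToHost);

    matmul_tiled<<<grid, block>>>(dA, dB, dC, M, N, K);
    cudaMemcpy(hTiled.data(), dC, hTiled.size() * sizeof(float),
               cudaMemcpyDeviceToHost);

    // Compare the two results element by element.
    float maxDiff = 0.0f;
    for (int i = 0; i < M * N; ++i)
        maxDiff = fmaxf(maxDiff, fabsf(hNaive[i] - hTiled[i]));
    printf("max |naive - tiled| = %g\n", maxDiff);

    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    return maxDiff < 1e-3f ? 0 : 1;
}
```

The sketch launches the tiled kernel with `TILE` x `TILE` blocks, which the shared-memory indexing relies on.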

The measured results are not great. I don't have an NVIDIA GPU, so I had to run the code in Google Colab, which skewed the measurements somewhat, and many runs produced no measurable time at all. I suspect the tiled multiplication was too fast for the timer, because most runs showed 0 seconds even with the output precision set to 20. This led to some odd results when comparing Colab's speed to the speed of my own CPU (Parallel CPU).
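
One thing that can produce near-zero readings in general: kernel launches are asynchronous with respect to the host, so a host-side timer that stops before the device has finished may capture only the launch overhead. Whether that applies here is an assumption, since the timing code is not shown in this description. A minimal sketch of timing with CUDA events instead, where `scale_kernel`, the buffer size, and the repetition count `reps` are hypothetical stand-ins for the real multiplication kernels and inputs:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the kernel being measured.
__global__ void scale_kernel(float* x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] = x[i] * 2.0f + 1.0f;
}

int main() {
    const int n = 1 << 20;
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    const int reps = 100;  // assumed repetition count, used to average out noise

    float* d = nullptr;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemset(d, 0, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Warm-up launch so one-time setup cost is not included in the measurement.
    scale_kernel<<<blocks, threads>>>(d, n);
    cudaDeviceSynchronize();

    // Events are recorded in the same stream as the kernels, so they bracket
    // the actual device execution rather than the host-side launch calls.
    cudaEventRecord(start);
    for (int r = 0; r < reps; ++r)
        scale_kernel<<<blocks, threads>>>(d, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);  // wait until the stop event has completed

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);  // elapsed time in milliseconds
    printf("average kernel time: %.6f ms over %d runs\n", ms / reps, reps);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d);
    return 0;
}
```

Averaging over several launches also helps when a single run is shorter than the timer resolution.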
