Igli Balla: Implemented optimized matrix multiplication #16

Open
Igli333 wants to merge 5 commits into parallelcomputingabo:main from Igli333:igli-balla


Igli333 commented May 7, 2025

In this assignment, I optimized the naive matrix multiplication in two ways: cache blocking (tiling) and multithreading.
The approach was straightforward. In the first case, the matrices are split into tiles small enough to stay in cache; the tiles are multiplied and their partial products accumulated to produce the full result. For parallelization, I used OpenMP to split the work across threads that compute different parts of the product at the same time, which gives a further performance gain.
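
For readers unfamiliar with the two techniques, here is a minimal sketch of how they can be combined. It is not the code from this PR: the function name `matmul_tiled`, the `TILE` size of 32, the `float` element type, and the square row-major layout are all assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical tile size; the best value depends on the target cache. */
#define TILE 32

/* Computes C = A * B for square n x n matrices stored in row-major order. */
void matmul_tiled(const float *A, const float *B, float *C, size_t n)
{
    for (size_t idx = 0; idx < n * n; ++idx)
        C[idx] = 0.0f;

    /* Each thread gets its own bands of rows of C, so no two threads
     * ever write to the same element and no locking is needed. */
    #pragma omp parallel for schedule(static)
    for (size_t ii = 0; ii < n; ii += TILE) {
        size_t i_end = (ii + TILE < n) ? ii + TILE : n;
        for (size_t kk = 0; kk < n; kk += TILE) {
            size_t k_end = (kk + TILE < n) ? kk + TILE : n;
            for (size_t jj = 0; jj < n; jj += TILE) {
                size_t j_end = (jj + TILE < n) ? jj + TILE : n;
                /* Multiply one tile of A by one tile of B and
                 * accumulate the partial product into a tile of C. */
                for (size_t i = ii; i < i_end; ++i) {
                    for (size_t k = kk; k < k_end; ++k) {
                        float a = A[i * n + k];
                        for (size_t j = jj; j < j_end; ++j)
                            C[i * n + j] += a * B[k * n + j];
                    }
                }
            }
        }
    }
}
```

With GCC or Clang the pragma takes effect when compiling with `-fopenmp`; without that flag it is ignored and the function runs on a single thread with the same result.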

The challenge was finding the small details that reduce performance, such as unnecessary casts and unoptimized memory access patterns.
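
One common example of an unoptimized memory access pattern, shown here as a sketch rather than as the specific change made in this PR, is the loop order of the naive kernel. Assuming row-major storage, the textbook i-j-k order reads `B` down a column with stride `n`, while the i-k-j order walks `B` and `C` along contiguous rows. The function names `matmul_ijk` and `matmul_ikj` are hypothetical.

```c
#include <stddef.h>

/* Naive order: the inner loop reads B[k * n + j] with stride n. */
void matmul_ijk(const float *A, const float *B, float *C, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (size_t k = 0; k < n; ++k)
                sum += A[i * n + k] * B[k * n + j];
            C[i * n + j] = sum;
        }
}

/* Reordered: the inner loop reads B and writes C contiguously. */
void matmul_ikj(const float *A, const float *B, float *C, size_t n)
{
    for (size_t idx = 0; idx < n * n; ++idx)
        C[idx] = 0.0f;

    for (size_t i = 0; i < n; ++i)
        for (size_t k = 0; k < n; ++k) {
            float a = A[i * n + k];  /* loaded once per inner loop */
            for (size_t j = 0; j < n; ++j)
                C[i * n + j] += a * B[k * n + j];
        }
}
```

Both functions compute the same product; only the order in which memory is touched differs.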
