std::mdspan is a revolutionary multidimensional array view that provides zero-cost abstractions for accessing
contiguous memory as multidimensional structures. This pattern emerged from the Linear Algebra Special Interest
Group's proposal P0009, spearheaded by researchers at Sandia National Laboratories, NVIDIA, and other high-performance
computing organizations. The design addresses critical limitations in scientific computing, machine learning, and
image processing applications where efficient multidimensional data access is essential. Unlike traditional
array-of-arrays approaches that suffer from poor cache locality and indirection overhead, mdspan provides a unified
interface that abstracts memory layout while maintaining optimal performance characteristics. The standardization
process, which began in 2015 and culminated in C++23, involved extensive collaboration with major HPC framework
developers including the Kokkos and RAJA communities, ensuring real-world applicability and performance requirements
were met.
std::mdspan solves fundamental problems in multidimensional data processing by providing layout-agnostic views that
work seamlessly with existing memory allocations. Traditional approaches require separate implementations for
row-major, column-major, and strided layouts, leading to code duplication and maintenance complexity. The mdspan
pattern eliminates these issues through compile-time layout policies that enable the same algorithm to work with
different memory arrangements without performance penalty. This capability is crucial for interoperability between
libraries using different conventions (such as C's row-major and Fortran's column-major layouts), scientific
computing applications requiring specific memory access patterns, and high-performance computing scenarios where
memory layout directly impacts vectorization and cache efficiency. Additionally, mdspan's accessor policies allow
custom element access patterns without breaking the abstraction, enabling specialized implementations for compressed
matrices, bit-packed data, and other domain-specific requirements. The extents system provides both compile-time
and runtime dimension specification, allowing optimal code generation when dimensions are known at compile time
while supporting dynamic sizing when necessary.
This implementation leverages your compiler's native C++23 mdspan support where available, with automatic fallback to the high-quality reference implementation from the Kokkos project (https://github.com/kokkos/mdspan) for maximum compatibility across different compiler versions and platforms.
This comprehensive example showcases all major mdspan capabilities:
- 1D, 2D, and 3D mdspan creation with dynamic and static extents
- Layout policies:
layout_right(row-major),layout_left(column-major),layout_stride - Element access patterns using multidimensional subscript operators
- Extent queries and dimensional introspection
- Subspan operations for extracting views of subregions
- Custom accessor policies for specialized element access patterns
- Performance comparison against direct array access
- Algorithm integration with STL containers and iterators
- Template parameter constraints using modern C++20 concepts
- Railway-oriented programming with
std::expectedfor error handling - Comprehensive testing with assertion-based validation
- Memory layout demonstrations showing cache-friendly access patterns
std::vector<int> data{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
// Row-major layout (default)
std::mdspan<int, std::extents<size_t, std::dynamic_extent, std::dynamic_extent>>
row_major{data.data(), 3, 4};
// Column-major layout
std::mdspan<int, std::extents<size_t, std::dynamic_extent, std::dynamic_extent>,
std::layout_left> col_major{data.data(), 3, 4};
// Access patterns demonstrate different memory orderings
// row_major[1, 2] accesses data[6] (1*4 + 2)
// col_major[1, 2] accesses data[7] (2*3 + 1)std::mdspan<int, std::extents<size_t, std::dynamic_extent, std::dynamic_extent>>
matrix{data.data(), 4, 5};
// Extract entire row
auto row_view = std::submdspan(matrix, 1, std::full_extent);
// Extract submatrix
auto sub_matrix = std::submdspan(matrix, std::pair{1uz, 3uz}, std::pair{1uz, 4uz});template<std::copyable ElementType>
struct ScalingAccessor {
using element_type = ElementType;
using reference = ElementType&;
using data_handle_type = ElementType*;
ElementType m_scale_factor{};
constexpr explicit ScalingAccessor(ElementType scale = ElementType{2})
: m_scale_factor{scale} {}
constexpr auto access(data_handle_type p, size_t i) const noexcept -> reference {
return p[i]; // Can apply scaling logic here
}
constexpr auto offset(data_handle_type p, size_t i) const noexcept -> data_handle_type {
return p + i;
}
};std::vector<double> data{1.0, 2.0, 3.0, 4.0, 5.0};
std::mdspan<double, std::extents<size_t, std::dynamic_extent>> span{data.data(), 5};
for (size_t i = 0; i < span.extent(0); ++i) {
std::cout << span[i] << " ";
}std::vector<float> matrix_data(rows * cols);
std::iota(matrix_data.begin(), matrix_data.end(), 1.0f);
std::mdspan<float, std::extents<size_t, std::dynamic_extent, std::dynamic_extent>>
matrix{matrix_data.data(), rows, cols};
// Matrix element access
for (size_t row = 0; row < matrix.extent(0); ++row) {
for (size_t col = 0; col < matrix.extent(1); ++col) {
std::cout << matrix[row, col] << " ";
}
std::cout << "\n";
}std::vector<int> tensor_data(depth * height * width);
std::mdspan<int, std::extents<size_t, std::dynamic_extent, std::dynamic_extent, std::dynamic_extent>>
tensor{tensor_data.data(), depth, height, width};
// Tensor element access
for (size_t d = 0; d < tensor.extent(0); ++d) {
for (size_t h = 0; h < tensor.extent(1); ++h) {
for (size_t w = 0; w < tensor.extent(2); ++w) {
tensor[d, h, w] = compute_value(d, h, w);
}
}
}-
Use static extents when dimensions are known at compile time:
std::mdspan<double, std::extents<size_t, 3, 4>> fixed_matrix{data.data()};
-
Prefer range-based for loops when possible:
for (const auto& element : underlying_container) { // Process element }
-
Use const-correctness:
void process_matrix(std::mdspan<const double, std::extents<size_t, std::dynamic_extent, std::dynamic_extent>> matrix);
-
Leverage layout policies for performance:
// Choose layout based on access patterns std::mdspan<double, std::extents<size_t, std::dynamic_extent, std::dynamic_extent>, std::layout_left> col_major_matrix{data.data(), rows, cols};
-
Don't manage memory within mdspan - it's a view, not an owner:
// Wrong std::mdspan<int> span{new int[100], 10, 10}; // Memory leak risk // Right std::vector<int> data(100); std::mdspan<int, std::extents<size_t, std::dynamic_extent, std::dynamic_extent>> span{data.data(), 10, 10};
-
Don't ignore layout implications:
// Inefficient - row-major access on column-major layout for (size_t row = 0; row < span.extent(0); ++row) { for (size_t col = 0; col < span.extent(1); ++col) { process(col_major_span[row, col]); // Poor cache locality } }
-
Don't assume subspan availability:
// Not all implementations support subspan yet #ifdef __cpp_lib_mdspan_subspan auto sub = std::submdspan(matrix, 1, std::full_extent); #else // Provide fallback implementation #endif
The mdspan implementation demonstrates excellent performance characteristics:
- Zero-cost abstraction: mdspan access overhead typically < 3% of direct array access
- Optimal vectorization: Compilers can vectorize mdspan loops as effectively as raw arrays
- Cache-friendly patterns: Layout policies enable optimal memory access patterns
- Compile-time optimization: Static extents allow aggressive compiler optimizations
Benchmark results from this implementation:
Performance comparison for 1,000,000 iterations:
Direct array access: 2,226 microseconds
mdspan access: 2,240 microseconds
Overhead ratio: 1.01x (essentially zero-cost)
- "Effective Modern C++" by Scott Meyers - Covers modern template patterns that mdspan exemplifies
- "C++ Templates: The Complete Guide" by Vandevoorde, Josuttis, and Gregor - Deep dive into template metaprogramming techniques used in mdspan
- "Programming Massively Parallel Processors" by Kirk and Hwu - Context for multidimensional array importance in HPC
- "High Performance Computing: Modern Systems and Practices" by Sterling, Anderson, and Brodowicz - Background on scientific computing requirements that drove mdspan development
- P0009R18: "MDSPAN" - The original C++ standardization proposal
- P1684R0: "MDARRAY: An Owning Multidimensional Array Analog of MDSPAN" - Companion proposal
- "Kokkos: Enabling Many-Core Performance Portability" - Real-world mdspan usage in production HPC codes
- "RAJA Performance Portability Layer" - Another major framework using mdspan-like abstractions
- C++ Reference Documentation: https://en.cppreference.com/w/cpp/container/mdspan
- Kokkos mdspan Reference Implementation: https://github.com/kokkos/mdspan (used as fallback in this project)
- ISO C++ Papers: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/
- CppCon Talks: Search for "mdspan", "multidimensional arrays", and "HPC C++"
- mdspan Tutorial: https://kokkos.github.io/mdspan/tutorial/
- C++23-compliant compiler (GCC 12+, Clang 16+, MSVC 19.35+)
- CMake 3.25+
- Internet connection (for automatic fallback library download if needed)
This project first attempts to use your compiler's native C++23 std::mdspan support. If native support is not
available, it automatically downloads and uses the reference implementation from the
Kokkos mdspan library via CMake FetchContent. This ensures compatibility
across different compiler versions while providing access to the full mdspan feature set.
mkdir build && cd build
cmake ..
make -j$(nproc)# Run comprehensive test suite
ctest --verbose
# Run main demonstration
./cpp23_mdspan
# Run individual test executable
./test_mdspanThe test suite includes:
- ✅ 1D, 2D, and 3D mdspan functionality
- ✅ Layout policy validation (row-major, column-major, strided)
- ✅ Subspan operations (where available)
- ✅ Custom accessor patterns
- ✅ Performance benchmarking
- ✅ Algorithm integration testing
- ✅ Static and dynamic extent verification
- ✅ Error handling with
std::expected
All tests pass with comprehensive validation using assertions and detailed logging output.
This code is provided under the MIT License. Feel free to use, modify, and distribute as needed.
Contributions are welcome! Please feel free to submit a Pull Request.