
Introduce the flash attention library via an aten adapter #1034

Open
PanZezhong1725 wants to merge 5 commits into main from issue/1033

Conversation

@PanZezhong1725
Collaborator

No description provided.

@PanZezhong1725
Collaborator Author

PanZezhong1725 commented Feb 28, 2026

Outdated (applies to commit d2aa36d).
Steps to reproduce:

  1. Pull the cutlass and flash_attn sources into the third_party directory (example commands just below)
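A minimal sketch of fetching the sources (the upstream repository URLs are assumptions, not taken from this PR; no specific revisions are pinned):

cd third_party
git clone https://github.com/NVIDIA/cutlass.git cutlass
git clone https://github.com/Dao-AILab/flash-attention.git flash-attention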
  2. Add the following CMakeLists.txt in third_party/flash-attention/csrc/flash_attn, making sure to replace the paths with your own
cmake_minimum_required(VERSION 3.18)
project(flash_attn LANGUAGES CXX CUDA)

set(CMAKE_CXX_STANDARD 17)
set(CMAKE_POSITION_INDEPENDENT_CODE ON)

# Include dirs: flash-attention sources, LibTorch, Python, and CUTLASS (replace with your own paths)
set(TORCH_INCLUDE_DIRS
    /home/panzezhong/Projects/InfiniCore/third_party/flash-attention/csrc/flash_attn/src
    /home/panzezhong/.conda/envs/myenv/lib/python3.13/site-packages/torch/include/torch/csrc/api/include
    /home/panzezhong/.conda/envs/myenv/lib/python3.13/site-packages/torch/include/
    /home/panzezhong/.conda/envs/myenv/include/python3.13/
    /home/panzezhong/Projects/InfiniCore/third_party/cutlass/include
)

# LibTorch libraries
set(TORCH_LIBS
    /home/panzezhong/.conda/envs/myenv/lib/python3.13/site-packages/torch/lib/libtorch.so
    /home/panzezhong/.conda/envs/myenv/lib/python3.13/site-packages/torch/lib/libtorch_cuda.so
    /home/panzezhong/.conda/envs/myenv/lib/python3.13/site-packages/torch/lib/libtorch_cpu.so
    /home/panzezhong/.conda/envs/myenv/lib/python3.13/site-packages/torch/lib/libc10.so
    /home/panzezhong/.conda/envs/myenv/lib/python3.13/site-packages/torch/lib/libc10_cuda.so
    /home/panzezhong/.conda/envs/myenv/lib/python3.13/site-packages/torch/lib/libtorch_python.so
)



# Collect all CUDA source files
file(GLOB FLASH_CU_SOURCES
    "${CMAKE_CURRENT_SOURCE_DIR}/flash_attn/src/*.cu"
)

add_library(flash_attn SHARED
    ${CMAKE_CURRENT_SOURCE_DIR}/flash_attn/flash_api.cpp
    ${FLASH_CU_SOURCES}
)
target_include_directories(flash_attn PRIVATE
    ${TORCH_INCLUDE_DIRS}
    ${CMAKE_CURRENT_SOURCE_DIR}/flash_attn
)

target_link_libraries(flash_attn PRIVATE
    ${TORCH_LIBS}
    /home/panzezhong/.conda/envs/myenv/lib/libpython3.13.so
    /home/panzezhong/.conda/envs/myenv/lib/libpython3.so
)

target_link_options(flash_attn PRIVATE "-Wl,--no-undefined")

set_target_properties(flash_attn PROPERTIES
    CUDA_ARCHITECTURES "80;86;90"
)

target_compile_options(flash_attn PRIVATE
    $<$<COMPILE_LANGUAGE:CUDA>:--expt-relaxed-constexpr>
    $<$<COMPILE_LANGUAGE:CUDA>:--use_fast_math>
)

# Match the C++ ABI used by the libtorch build
target_compile_definitions(flash_attn PRIVATE _GLIBCXX_USE_CXX11_ABI=1)

  3. Build flash_attn (run in the directory containing the CMakeLists.txt added above)
mkdir build
cd build 
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j8

After a successful build, libflash_attn.so is produced in the build directory; the key step is linking it into infinicore.
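Before linking it into infinicore, a quick sanity check on the artifact can help; the commands below are a sketch (the symbol name comes from flash_attn's flash_api.cpp, and exposing the library via LD_LIBRARY_PATH is only one possible approach):

cd build
ldd libflash_attn.so | grep -E 'torch|c10'     # dependencies should resolve against the libtorch listed in TORCH_LIBS
nm -D libflash_attn.so | grep mha_varlen_fwd   # varlen forward entry point exercised by the mha_varlen operator test
export LD_LIBRARY_PATH=$PWD:$LD_LIBRARY_PATH   # let infinicore find libflash_attn.so at runtime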

  4. Build and install infinicore as usual, then run the mha_varlen operator test

@PanZezhong1725
Collaborator Author

PanZezhong1725 commented Mar 5, 2026

Updated the automated build flow (a combined sketch follows the list):

  1. Set the CUTLASS_ROOT environment variable to the cutlass path
  2. At configuration time, enable the --aten switch and set the --flash-attn library location:
    xmake f --nv-gpu=y --ccl=y --cuda=$CUDA_HOME --aten=y --flash-attn=/home/panzezhong/Projects/InfiniCore/third_party/flash-attention -cv
  3. The flash attention library is built and installed together with infinicore_cpp_api
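Putting the steps together, an end-to-end run might look like this (the CUTLASS_ROOT value and the final build/install commands are assumptions based on a plain xmake workflow, not taken from this PR):

export CUTLASS_ROOT=/home/panzezhong/Projects/InfiniCore/third_party/cutlass
xmake f --nv-gpu=y --ccl=y --cuda=$CUDA_HOME --aten=y --flash-attn=/home/panzezhong/Projects/InfiniCore/third_party/flash-attention -cv
xmake           # builds infinicore_cpp_api; with --flash-attn set, the flash attention library is built alongside it
xmake install   # installs both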
