• [gentoo-dev] [PATCH 1/2] rocm.eclass: Fix the xnack feature for gfx90a

    From Yiyang Wu@21:1/5 to All on Fri Nov 17 04:30:01 2023
    MI210 has xnack disabled by default. Compiling with :xnack+ produces
    GPU kernels that cannot run on MI210.

    Two targets, gfx90a_xnack and gfx90a_noxnack, are introduced to replace
    gfx90a, for the following reason:

    Upstream usually ships two versions: gfx90a:xnack- and gfx90a:xnack+.
    Although a single gfx90a target should have maximum compatibility,
    according to [1,2], compiling with xnack+/xnack- may give better
    performance on xnack-enabled/disabled GPUs. Therefore we ship both
    targets, in line with upstream. gfx900 is also appended with :xnack- to
    match the upstream default.

    However, some ROCm libraries, such as sci-libs/miopen and
    sci-libs/rocFFT, do not need the xnack feature flag, so a
    "--no-xnack-flag" option is introduced to stop appending the xnack
    feature. This option is demonstrated in the second example ebuild.
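
    As a rough illustration of the mapping described above, the sketch
    below turns USE-style target names into compiler target strings. It is
    not the eclass code: sketch_amdgpu_flags is a hypothetical stand-in
    for get_amdgpu_flags, and the ";" separator without a trailing
    delimiter is an assumption made for the example.

```shell
# Hypothetical helper for illustration only; NOT the real
# get_amdgpu_flags from rocm.eclass.
sketch_amdgpu_flags() {
	local strip_xnack=0 target flags=""

	# Optional first argument: do not emit any :xnack+/:xnack- suffix.
	if [ "${1:-}" = "--no-xnack-flag" ]; then
		strip_xnack=1
		shift
	fi

	for target in "$@"; do
		case ${target} in
			*_noxnack) target="${target%_noxnack}:xnack-" ;;
			*_xnack)   target="${target%_xnack}:xnack+" ;;
			gfx900)    target="gfx900:xnack-" ;;  # upstream default per the commit message
		esac
		# Strip everything from the first ":" when the flag is unwanted.
		if [ "${strip_xnack}" -eq 1 ]; then
			target=${target%%:*}
		fi
		# Join with ";" (assumed separator), no trailing delimiter.
		flags="${flags:+${flags};}${target}"
	done
	printf '%s\n' "${flags}"
}

sketch_amdgpu_flags gfx90a_xnack gfx90a_noxnack gfx900
# prints gfx90a:xnack+;gfx90a:xnack-;gfx900:xnack-
sketch_amdgpu_flags --no-xnack-flag gfx90a_xnack gfx1030
# prints gfx90a;gfx1030
```

    Note that this sketch does not deduplicate: with --no-xnack-flag,
    gfx90a_xnack and gfx90a_noxnack both collapse to gfx90a.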

    Update description file profiles/desc/amdgpu_targets.desc accordingly.

    GitHub PR: https://github.com/gentoo/gentoo/pull/33861

    [1] https://llvm.org/docs/AMDGPUUsage.html#target-features
    [2] https://docs.olcf.ornl.gov/systems/crusher_quick_start_guide.html#compiling-hip-kernels-for-specific-xnack-modes

    Signed-off-by: Yiyang Wu <xgreenlandforwyy@gmail.com>
    ---
    eclass/rocm.eclass | 31 ++++++++++++++++++++++---------
    profiles/desc/amdgpu_targets.desc | 5 ++++-
    2 files changed, 26 insertions(+), 10 deletions(-)

    diff --git a/eclass/rocm.eclass b/eclass/rocm.eclass
    index b78dfea1cc31..3209c003e323 100644
    --- a/eclass/rocm.eclass
    +++ b/eclass/rocm.eclass
    @@ -60,7 +60,8 @@
    #
    # Examples for packages depend on ROCm libraries -- a package which depends on
    # rocBLAS, uses comma separated ${HCC_AMDGPU_TARGET} to determine GPU
    -# architectures, and requires ROCm version >=5.1
    +# architectures which does not accept :xnack feature flag, and requires ROCm
    +# version >=5.1
    # @CODE
    # ROCM_VERSION=5.1
    # inherit rocm
    @@ -71,7 +72,7 @@
    #
    # src_configure() {
    # if use rocm; then
    -# local amdgpu_flags=$(get_amdgpu_flags)
    +# local amdgpu_flags=$(get_amdgpu_flags --no-xnack-flag)
    # export HCC_AMDGPU_TARGET=${amdgpu_flags//;/,}
    # fi
    # default
    @@ -143,7 +144,7 @@ _rocm_set_globals() {
    gfx
  • From Yiyang Wu@21:1/5 to All on Fri Nov 17 04:30:01 2023
    Signed-off-by: Yiyang Wu <xgreenlandforwyy@gmail.com>
    ---
    profiles/desc/amdgpu_targets.desc | 6 +++---
    1 file changed, 3 insertions(+), 3 deletions(-)

    diff --git a/profiles/desc/amdgpu_targets.desc b/profiles/desc/amdgpu_targets.desc
    index 77ffa43399f1..46513023c3a9 100644
    --- a/profiles/desc/amdgpu_targets.desc
    +++ b/profiles/desc/amdgpu_targets.desc
    @@ -18,6 +18,6 @@ gfx1011 - RDNA GPU, codename navi12, including Radeon Pro 5600M/V520
    gfx1012 - RDNA GPU, codename navi14, including Radeon RX 5500XT/5500/5500M/5500XTB/5300/5300M, Radeon Pro 5500XT/5500M/5300/5300M, Radeon Pro W5500X/W5500/W5500M/W5300M
    gfx1030 - RDNA2 GPU, codename navi21/sienna cichlid, including Radeon RX 6950XT/6900XT/6800XT/6800, Radeon Pro W6800
    gfx1031 - RDNA2 GPU, codename navi22/navy flounder, including Radeon RX 6750XT/6700XT/6800M/6700M
    -gfx1100 - RDNA3 GPU, codename navi31/plum bonito, including Radeon RX 7900XTX/7900XT
    -gfx1101 - RDNA3 GPU, codename navi32
    -gfx1102 - RDNA3 GPU, codename navi33
    +gfx1100 - RDNA3 GPU, codename navi31/plum bonito, including Radeon RX
  • From Benda Xu@21:1/5 to Yiyang Wu on Mon Nov 27 01:10:01 2023
    Hi Yiyang,

    Yiyang Wu <xgreenlandforwyy@gmail.com> writes:

    Upstream usually ships two versions: gfx90a:xnack- and gfx90a:xnack+.
    Although a single gfx90a target should have maximum compatibility,
    according to [1,2], compiling with xnack+/xnack- may give better
    performance on xnack-enabled/disabled GPUs. Therefore we ship both
    targets, in line with upstream. gfx900 is also appended with :xnack- to
    match the upstream default.

    [1] https://llvm.org/docs/AMDGPUUsage.html#target-features
    [2] https://docs.olcf.ornl.gov/systems/crusher_quick_start_guide.html#compiling-hip-kernels-for-specific-xnack-modes

    So, you want to add a new ABI to gfx90a for the experimental xnack
    feature. I suggest making it gfx908a with gfx908a_xnack, instead of
    "gfx908a_noxnack", for consistency with the existing naming scheme.

    With this minimal modification, the remaining cards such as gfx906 and
    gfx908 that support xnack could be updated incrementally.

    Benda
