Examples

- Basic GEMM
  - Declaring and running a GEMM
  - Changing operation modes
  - Running cached kernels
  - Running non-default GEMMs
  - Handling errors
- Epilogue
  - Run a GEMM with an identity activation function
  - Run a GEMM with a ReLU element-wise activation function
  - Other element-wise activation functions
- PyTorch Extension
  - Background on grouped GEMM
  - Declaring a grouped GEMM via the CUTLASS Python interface
  - Exporting the CUTLASS kernel to a PyTorch CUDA extension