CUTLASS

Subpackages

Epilogue

Registry of elementwise epilogues

Elementwise epilogues can be added to many CUTLASS kernels through the CUTLASS Python interface. For example, for GEMM:

import cutlass

plan = cutlass.op.Gemm(element=cutlass.DataType.f32, layout=cutlass.LayoutType.RowMajor)
plan.activation = cutlass.epilogue.relu  # apply ReLU in the GEMM epilogue

cutlass.epilogue.get_activation_epilogue(activation, element_output, elements_per_access, element_accumulator, element_compute)

Return an epilogue corresponding to the activation function, data types, and alignment used in the kernel

Parameters:
  • activation – elementwise activation function to use

  • element_output – data type of the output

  • elements_per_access (int) – alignment of operand C of the kernel

  • element_accumulator – data type of the accumulated output C

  • element_compute – data type in which compute operations should be performed

Returns:

epilogue functor
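As a rough illustration of what an activation epilogue computes, the sketch below applies a linear combination followed by an elementwise activation to accumulator values in plain Python. It is not CUTLASS code: `apply_epilogue`, `alpha`, `beta`, and the list-based tensors are hypothetical stand-ins, and the form D = activation(alpha * accumulator + beta * C) is assumed here as the usual shape of such an epilogue.

```python
def relu(x):
    """Elementwise ReLU on a single value."""
    return x if x > 0 else 0.0


def apply_epilogue(accumulator, c, activation, alpha=1.0, beta=0.0):
    """Hypothetical stand-in for an elementwise epilogue.

    Computes activation(alpha * acc + beta * src) for each element.
    """
    return [activation(alpha * acc + beta * src) for acc, src in zip(accumulator, c)]


acc = [-2.0, 0.5, 3.0]  # values produced by the GEMM main loop (accumulator)
c = [1.0, 1.0, 1.0]     # source operand C
d = apply_epilogue(acc, c, relu, alpha=1.0, beta=1.0)
print(d)
```

In a real kernel, the arithmetic would be carried out in element_compute and the result converted to element_output; the sketch ignores data types and only shows the order of operations.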

cutlass.epilogue.get_activations()

Returns a list of available activation functions

Returns:

list of available activation functions

Return type:

list

Library Defaults

Classes containing valid operations for a given compute capability and data types.

class cutlass.library_defaults.ArchOptions(target_cc, kernel_cc, operation_kind, gemm_kinds, allowed_math_operations=[MathOperation.multiply_add, MathOperation.multiply_add_saturate])

Bases: object

Structure for keeping track of kernels available on a given compute capability

Parameters:
  • target_cc (int) – compute capability of the device on which kernels will be run

  • kernel_cc (int) – compute capability of the kernels to generate

  • operation_kind (cutlass.OperationKind) – type of operation to register

  • gemm_kinds (list) – types of GEMM operations that can be included

  • allowed_math_operations (list) – types of primitive math operations allowed

opclass_supports_combination(op_class, datatype_comb, layout_comb)

Returns whether the provided operation class supports the provided data type and layout combination

Parameters:
  • op_class (cutlass.OpcodeClass) – operation class to consider

  • datatype_comb (tuple[cutlass.DataType]) – tuple of data types for (element_A, element_B, element_accumulator)

  • layout_comb (tuple[cutlass.LayoutType]) – tuple of layouts for (layout_A, layout_B)

Returns:

whether the provided operation class supports the provided data type and layout combination

Return type:

bool

operations(op_class, element_a, element_b, element_accumulator, layout_a, layout_b)

Returns the kernels of the provided operation class that support the provided data type and layout combination

Parameters:
  • op_class (cutlass.OpcodeClass) – operation class to consider

  • element_a (cutlass.DataType) – data type of operand A

  • element_b (cutlass.DataType) – data type of operand B

  • element_accumulator (cutlass.DataType) – data type of accumulator

  • layout_a (cutlass.LayoutType) – layout of operand A

  • layout_b (cutlass.LayoutType) – layout of operand B

Returns:

container of kernels by alignment supported by the provided combination of parameters

Return type:

KernelsForDataType

supporting_opclasses(element_a, element_b, element_accumulator, layout_a, layout_b)

Returns a set of operation classes that support the provided data type combination

Parameters:
  • element_a (cutlass.DataType) – data type of operand A

  • element_b (cutlass.DataType) – data type of operand B

  • element_accumulator (cutlass.DataType) – data type of accumulator

  • layout_a (cutlass.LayoutType) – layout of operand A

  • layout_b (cutlass.LayoutType) – layout of operand B

Returns:

set of operation classes that support the provided data type combination

Return type:

set
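The lookups described above can be pictured as a table keyed by operation class and by a (data types, layouts) pair. The following is a minimal sketch under that assumption, not the library's implementation: the dictionary `operations_by_opclass`, the helper names, and the string values standing in for cutlass.OpcodeClass, cutlass.DataType, and cutlass.LayoutType are all hypothetical.

```python
# Hypothetical registry: operation class -> set of supported
# ((element_a, element_b, element_accumulator), (layout_a, layout_b)) keys.
operations_by_opclass = {
    "TensorOp": {
        (("f16", "f16", "f32"), ("RowMajor", "ColumnMajor")),
        (("f16", "f16", "f32"), ("RowMajor", "RowMajor")),
    },
    "Simt": {
        (("f32", "f32", "f32"), ("RowMajor", "RowMajor")),
        (("f16", "f16", "f32"), ("RowMajor", "RowMajor")),
    },
}


def opclass_supports(op_class, datatype_comb, layout_comb):
    """True if op_class has an entry for this data type and layout combination."""
    return (datatype_comb, layout_comb) in operations_by_opclass[op_class]


def supporting(element_a, element_b, element_accumulator, layout_a, layout_b):
    """Set of operation classes with an entry for the given combination."""
    datatype_comb = (element_a, element_b, element_accumulator)
    layout_comb = (layout_a, layout_b)
    return {
        op_class
        for op_class in operations_by_opclass
        if opclass_supports(op_class, datatype_comb, layout_comb)
    }


both = supporting("f16", "f16", "f32", "RowMajor", "RowMajor")
tensor_only = supporting("f16", "f16", "f32", "RowMajor", "ColumnMajor")
none = supporting("f64", "f64", "f64", "RowMajor", "RowMajor")
print(sorted(both), sorted(tensor_only), sorted(none))
```

A combination that no operation class supports yields an empty set rather than an error in this sketch.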

class cutlass.library_defaults.KernelsForDataType(datatype_comb, layout_comb)

Bases: object

Container class for keeping track of kernels that correspond to a particular combination of data types for operands A, B, and accumulator

Parameters:
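  • datatype_comb (tuple) – tuple of data types for (element_A, element_B, element_accumulator)

  • layout_comb (tuple) – tuple of layouts for (layout_A, layout_B)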

add(operation)

Add an operation to the list of supported kernels

property alignments

Returns an unsorted list of alignments supported by this data type combination

Returns:

unsorted list of alignments supported by this data type combination

Return type:

list

property all_operations

Returns a list of all operations supported by this data type combination

Returns:

list of all operations supported by this data type combination

Return type:

list

find_alignment(shape, layout)

Returns the most preferable alignment for a given shape and layout

Parameters:
  • shape (tuple) – extent of each dimension of the tensor

  • layout (cutlass.LayoutType) – layout of the tensor

Returns:

maximum alignment supported by the data type combination and tensor size

Return type:

int
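One plausible reading of "maximum alignment supported by the data type combination and tensor size" is: the largest supported alignment that evenly divides the contiguous dimension of the tensor. The sketch below implements that reading in plain Python. The function name `find_alignment_sketch`, the `supported_alignments` argument, and the string layouts are hypothetical; the actual method may differ in details.

```python
def find_alignment_sketch(shape, layout, supported_alignments):
    """Pick the largest supported alignment that divides the contiguous extent.

    Hypothetical stand-in: layout is a plain string and
    supported_alignments is a list of ints.
    """
    if layout == "RowMajor":
        contiguous = shape[-1]  # elements within a row are adjacent in memory
    elif layout == "ColumnMajor":
        contiguous = shape[-2]  # elements within a column are adjacent in memory
    else:
        raise ValueError(f"unsupported layout: {layout}")

    # Try the most preferable (largest) alignment first.
    for alignment in sorted(supported_alignments, reverse=True):
        if contiguous % alignment == 0:
            return alignment
    return 1  # an alignment of 1 divides every extent


a = find_alignment_sketch((128, 64), "RowMajor", [1, 2, 4, 8])
b = find_alignment_sketch((128, 66), "RowMajor", [1, 2, 4, 8])
c = find_alignment_sketch((127, 64), "ColumnMajor", [1, 2, 4, 8])
print(a, b, c)
```

Larger alignments generally allow wider vectorized memory accesses, which is why the search starts from the largest candidate.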

operations(alignment)

Returns operations satisfying the alignment constraint indicated by alignment

Parameters:

alignment (int) – alignment constraint of operations to return

Returns:

list of operations

Return type:

list

sort()

Sorts each list of kernels in kernels_by_alignment in descending order of threadblock shape
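The container behavior described above (grouping by alignment, listing alignments, and sorting by threadblock shape) can be sketched in a few lines of plain Python. This is an illustrative stand-in, not the class itself: `KernelsByAlignmentSketch`, the dictionary-based operations, and the choice to raise KeyError for an unknown alignment are assumptions made for the example.

```python
from collections import defaultdict


class KernelsByAlignmentSketch:
    """Hypothetical stand-in that groups operations by alignment."""

    def __init__(self):
        self.kernels_by_alignment = defaultdict(list)

    def add(self, operation):
        # Each operation is a dict with an "alignment" and a "threadblock_shape".
        self.kernels_by_alignment[operation["alignment"]].append(operation)

    @property
    def alignments(self):
        # Insertion order, i.e. not sorted by value.
        return list(self.kernels_by_alignment.keys())

    @property
    def all_operations(self):
        return [op for ops in self.kernels_by_alignment.values() for op in ops]

    def operations(self, alignment):
        if alignment not in self.kernels_by_alignment:
            raise KeyError(f"no operations with alignment {alignment}")
        return self.kernels_by_alignment[alignment]

    def sort(self):
        # Descending by threadblock shape, compared as an (M, N, K) tuple.
        for ops in self.kernels_by_alignment.values():
            ops.sort(key=lambda op: op["threadblock_shape"], reverse=True)


kernels = KernelsByAlignmentSketch()
kernels.add({"alignment": 8, "threadblock_shape": (64, 64, 32)})
kernels.add({"alignment": 8, "threadblock_shape": (128, 128, 32)})
kernels.add({"alignment": 4, "threadblock_shape": (128, 64, 32)})
kernels.sort()
shapes = [op["threadblock_shape"] for op in kernels.operations(8)]
print(shapes)
```

After sorting, the first entry for each alignment is the kernel with the largest threadblock shape, which makes it a natural default choice.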

class cutlass.library_defaults.OptionRegistry(target_cc)

Bases: object

Container of all architecture-specific options

Parameters:

target_cc (int) – compute capability of the device on which operations will be run

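options_for_cc(cc)

Returns the architecture-specific options for the provided compute capability

Parameters:

cc (int) – compute capability for which to look up options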

Return type:

ArchOptions

Swizzle

Registry of swizzling functions

cutlass.swizzle.get_swizzling_functors()