triton_wrapper

class coffea.ml_tools.triton_wrapper(model_url: str, client_args: Dict | None = None, batch_size=-1)

Bases: nonserializable_attribute, numpy_call_wrapper

Wrapper for running triton inference.

The goal of this class is to wrap all triton-specific operations and abstract them away from the users. Users should then only need to handle awkward-level operations to mangle the arrays into the input format expected by the model of interest.

Attributes Summary

batch_size

Getting the batch size to be used for array splitting.

batch_size_fallback

client_args

Function for adding default arguments to the client constructor kwargs.

http_client_concurrency

pmod

Getting the protocol module based on the url protocol string.

Methods Summary

numpy_call(output_list, input_dict)

Run inference for the outputs named in output_list, using the numpy arrays in input_dict as the model inputs.

validate_numpy_input(output_list, input_dict)

tritonclient can return the expected input array dimensions and available output values.

Attributes Documentation

batch_size

Getting the batch size to be used for array splitting. If it is explicitly set by the users, use that; otherwise, extract from the model configuration hosted on the server.
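
The precedence described above can be sketched as a small standalone function. This is an illustration, not the wrapper's actual code: resolve_batch_size is a hypothetical helper, the model configuration is assumed to be a plain dictionary that may carry Triton's max_batch_size field, and any non-positive user value (such as the default -1) is assumed to mean "not set".

```python
from typing import Any, Dict

BATCH_SIZE_FALLBACK = 10  # mirrors the batch_size_fallback attribute


def resolve_batch_size(user_value: int, model_config: Dict[str, Any]) -> int:
    """Pick a batch size: explicit user value first, then the server config."""
    # Assumption: a non-positive value (such as the default -1) means "not set".
    if user_value > 0:
        return user_value
    # Assumption: the server-side configuration is a dict that may contain
    # the max_batch_size field of a Triton model configuration.
    server_value = model_config.get("max_batch_size", 0)
    if server_value > 0:
        return server_value
    # Neither source gave a usable value, so use the class-level fallback.
    return BATCH_SIZE_FALLBACK


print(resolve_batch_size(64, {"max_batch_size": 128}))  # user value wins
print(resolve_batch_size(-1, {"max_batch_size": 128}))  # taken from the config
print(resolve_batch_size(-1, {}))                       # fallback value
```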

batch_size_fallback = 10

client_args

Function for adding default arguments to the client constructor kwargs.
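
One way to picture "adding default arguments" is a dictionary merge in which user-supplied values override the defaults. The sketch below is only an assumption about the general pattern: build_client_args is a hypothetical helper, and the specific default keys (url, concurrency) are illustrative choices rather than a statement of what the wrapper sets.

```python
from typing import Any, Dict, Optional

HTTP_CLIENT_CONCURRENCY = 12  # mirrors the http_client_concurrency attribute


def build_client_args(
    protocol: str,
    address: str,
    user_args: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Merge illustrative defaults with user-supplied client kwargs."""
    # Hypothetical defaults: every client needs to know the server address.
    defaults: Dict[str, Any] = {"url": address}
    # Assumption: only the HTTP client takes a concurrency setting.
    if protocol == "http":
        defaults["concurrency"] = HTTP_CLIENT_CONCURRENCY
    # Values passed by the user take precedence over the defaults.
    return {**defaults, **(user_args or {})}


print(build_client_args("http", "localhost:8000"))
print(build_client_args("http", "localhost:8000", {"concurrency": 4}))
```

Keeping the defaults in one place means a user only has to pass the keys they want to change.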

http_client_concurrency = 12

pmod

Getting the protocol module based on the url protocol string.
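
As a rough illustration of selecting a protocol from the URL, the sketch below extracts the protocol name from the URL scheme. The URL layout used here (a scheme such as triton+grpc or triton+http in front of "://") and the helper name protocol_from_url are assumptions made for the example; the example address is hypothetical as well. The returned name would then correspond to the tritonclient.grpc or tritonclient.http submodule.

```python
def protocol_from_url(model_url: str) -> str:
    """Return "grpc" or "http" based on the scheme of the model URL."""
    scheme, separator, _ = model_url.partition("://")
    if not separator:
        raise ValueError(f"no protocol found in URL: {model_url!r}")
    # Assumption: the scheme looks like "triton+grpc", so the protocol is
    # whatever follows the last "+" (or the whole scheme if there is none).
    protocol = scheme.rsplit("+", 1)[-1]
    if protocol not in ("grpc", "http"):
        raise ValueError(f"unsupported protocol: {protocol!r}")
    return protocol


# Hypothetical server address, model name, and version.
print(protocol_from_url("triton+grpc://localhost:8001/my_model/1"))
print(protocol_from_url("triton+http://localhost:8000/my_model/1"))
```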

Methods Documentation

numpy_call(output_list: List[str], input_dict: Dict[str, array]) → Dict[str, array]

Parameters:
  • output_list (List[str]) – Names of the outputs of interest. These strings will be automatically translated into the required tritonclient.InferRequestedOutput objects.

  • input_dict (Dict[str, array]) – Dictionary with the model input names as keys and the appropriate numpy array as the dictionary value. This dictionary is automatically translated into a list of tritonclient.InferInput objects.

Returns:

  • A dictionary of numpy arrays that has the output_list arguments as keys.
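
The input and output contract above, combined with the batch size used for array splitting, can be sketched without a running server. In the sketch below, batched_call and fake_infer are hypothetical stand-ins: fake_infer replaces the real request to the server, and the names "features" and "score" are made up for the example. It assumes every input array has the same, non-zero number of rows.

```python
from typing import Callable, Dict, List

import numpy as np

InferFn = Callable[[List[str], Dict[str, np.ndarray]], Dict[str, np.ndarray]]


def batched_call(
    infer: InferFn,
    output_list: List[str],
    input_dict: Dict[str, np.ndarray],
    batch_size: int,
) -> Dict[str, np.ndarray]:
    """Split the inputs into batches, run each batch, and merge the outputs."""
    # Assumption: all inputs share the same (non-zero) length along axis 0.
    n_rows = len(next(iter(input_dict.values())))
    chunks: Dict[str, List[np.ndarray]] = {name: [] for name in output_list}
    for start in range(0, n_rows, batch_size):
        stop = start + batch_size
        batch = {name: arr[start:stop] for name, arr in input_dict.items()}
        result = infer(output_list, batch)
        for name in output_list:
            chunks[name].append(result[name])
    # Stitch the per-batch results back together, keyed by output name.
    return {name: np.concatenate(parts, axis=0) for name, parts in chunks.items()}


def fake_infer(output_list, batch):
    # Stand-in for the server: sums each row of the "features" input.
    row_sum = batch["features"].sum(axis=1)
    return {name: row_sum for name in output_list}


features = np.arange(12, dtype=np.float32).reshape(6, 2)
outputs = batched_call(fake_infer, ["score"], {"features": features}, batch_size=4)
print(outputs["score"].shape)  # one value per input row
```

The returned dictionary has exactly the requested output names as keys, and each array has as many entries along the first axis as the inputs had rows.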

validate_numpy_input(output_list: List[str], input_dict: Dict[str, array]) → None

tritonclient can return the expected input array dimensions and available output values.
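
A check of this kind could look roughly like the sketch below, which compares the requested output names and the shapes of the supplied inputs against values that would come from the server. The function check_request and its arguments are hypothetical, and the rule that -1 in an expected shape matches any size is an assumption borrowed from how Triton marks variable dimensions. Like validate_numpy_input, the sketch returns None on success and signals problems by raising.

```python
from typing import Dict, List, Sequence


def check_request(
    output_list: List[str],
    input_shapes: Dict[str, Sequence[int]],
    expected_inputs: Dict[str, Sequence[int]],
    available_outputs: Sequence[str],
) -> None:
    """Raise ValueError if the request does not match the model description."""
    unknown = [name for name in output_list if name not in available_outputs]
    if unknown:
        raise ValueError(f"unknown outputs requested: {unknown}")
    for name, expected in expected_inputs.items():
        if name not in input_shapes:
            raise ValueError(f"missing input: {name}")
        actual = input_shapes[name]
        if len(actual) != len(expected):
            raise ValueError(f"{name}: expected {len(expected)} dimensions")
        # Assumption: -1 in the expected shape means "any size".
        for got, want in zip(actual, expected):
            if want != -1 and got != want:
                raise ValueError(f"{name}: shape {tuple(actual)} != {tuple(expected)}")


# Hypothetical model description: one input with 2 features, one output.
check_request(["score"], {"features": (6, 2)}, {"features": (-1, 2)}, ["score"])
print("request is consistent with the model description")
```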