triton_wrapper

class coffea.ml_tools.triton_wrapper(model_url: str, client_args: Dict | None = None, batch_size=-1)

Bases: nonserializable_attribute, numpy_call_wrapper

Wrapper for running triton inference.

The goal of this class is to wrap all triton-specific operations and abstract them away from users. Users then only need to handle awkward-level operations to mangle the arrays into the input format expected by the model of interest.

Attributes Summary

`batch_size`	Getting the batch size to be used for array splitting.
`batch_size_fallback`
`client_args`	Function for adding default arguments to the client constructor kwargs.
`http_client_concurrency`
`pmod`	Getting the protocol module based on the url protocol string.

Methods Summary

numpy_call(output_list, input_dict)

Run inference for the outputs named in output_list, using the numpy arrays in input_dict as the model inputs.

validate_numpy_input(output_list, input_dict)

Validate the request against the expected input array dimensions and available output values, which tritonclient can return.

Attributes Documentation

batch_size: Getting the batch size to be used for array splitting. If it is explicitly set by the users, use that; otherwise, extract from the model configuration hosted on the server.

batch_size_fallback = 10
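
The resolution order described for batch_size can be sketched as follows. This is a minimal illustration, not the coffea implementation: resolve_batch_size and batch_slices are hypothetical helpers, and both the treatment of non-positive values as "not set" and the use of the fallback when the server reports no usable value are assumptions.

```python
from typing import Iterator, Optional


def resolve_batch_size(
    user_value: int, server_value: Optional[int], fallback: int = 10
) -> int:
    """Pick a batch size: explicit user value, then server value, then fallback."""
    # Assumption: a non-positive value (such as the default -1) means "not set".
    if user_value > 0:
        return user_value
    # Assumption: the server may report nothing usable, hence the fallback.
    if server_value is not None and server_value > 0:
        return server_value
    return fallback


def batch_slices(n_entries: int, batch_size: int) -> Iterator[slice]:
    """Yield slices covering range(n_entries) in chunks of at most batch_size."""
    for start in range(0, n_entries, batch_size):
        yield slice(start, min(start + batch_size, n_entries))


# 25 entries, nothing set by the user, nothing usable from the server.
size = resolve_batch_size(-1, None)
entries = list(range(25))
chunks = [entries[s] for s in batch_slices(len(entries), size)]
```

The same slices can be applied to numpy arrays along their first axis, so each chunk can be sent as a separate request.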

client_args: Function for adding default arguments to the client constructor kwargs.

http_client_concurrency = 12

pmod: Getting the protocol module based on the url protocol string.

Methods Documentation

numpy_call(output_list: List[str], input_dict: Dict[str, array]) → Dict[str, array]

Parameters:

output_list (List[str]) – Names of the outputs of interest. These strings will be automatically translated into the required tritonclient.InferRequestedOutput objects.
input_dict (Dict[str, array]) – Dictionary with the model's input names as keys and the appropriate numpy array as the dictionary value. This dictionary is automatically translated into a list of tritonclient.InferInput objects.

Returns:

A dictionary of numpy arrays, keyed by the entries of output_list.

validate_numpy_input(output_list: List[str], input_dict: Dict[str, array]) → None: Validate the request against the expected input array dimensions and available output values, which tritonclient can return.
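
The kind of check this implies can be sketched as below. This is not the coffea implementation: check_request and shape_matches are hypothetical helpers, and the example model (an input named points, an output named scores) is made up. The metadata layout, with inputs and outputs listed by name and shape and -1 marking a variable dimension, follows the model metadata that a Triton server reports. Any object with a .shape attribute, such as a numpy array, can be used as an input value.

```python
from types import SimpleNamespace
from typing import Any, Dict, List, Sequence


def shape_matches(actual: Sequence[int], expected: Sequence[int]) -> bool:
    """Compare shapes; -1 in the expected shape marks a variable dimension."""
    return len(actual) == len(expected) and all(
        e == -1 or a == e for a, e in zip(actual, expected)
    )


def check_request(
    metadata: Dict[str, Any], output_list: List[str], input_dict: Dict[str, Any]
) -> None:
    """Raise ValueError if the request does not fit the model metadata."""
    expected = {i["name"]: i["shape"] for i in metadata["inputs"]}
    known_outputs = {o["name"] for o in metadata["outputs"]}

    # Input names must match the model's inputs exactly.
    if set(input_dict) != set(expected):
        raise ValueError(
            f"expected inputs {sorted(expected)}, got {sorted(input_dict)}"
        )
    # Each array must have the expected rank and fixed dimensions.
    for name, array in input_dict.items():
        if not shape_matches(array.shape, expected[name]):
            raise ValueError(
                f"input {name!r}: shape {tuple(array.shape)} "
                f"does not match {expected[name]}"
            )
    # Every requested output must be one the model provides.
    unknown = [o for o in output_list if o not in known_outputs]
    if unknown:
        raise ValueError(f"unknown outputs requested: {unknown}")


# Made-up metadata for a hypothetical model.
metadata = {
    "inputs": [{"name": "points", "datatype": "FP32", "shape": [-1, 100, 3]}],
    "outputs": [{"name": "scores", "datatype": "FP32", "shape": [-1, 2]}],
}
good_input = {"points": SimpleNamespace(shape=(8, 100, 3))}
result = check_request(metadata, ["scores"], good_input)
```

Running such a check before sending a request turns a server-side failure into an early, readable error on the client.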