Models

class inseq.models.AttributionModel(**kwargs)[source]

Base class for all attribution models.

model

The wrapped model to be attributed.

model_name

The name of the model.

Type:

str

is_encoder_decoder

Whether the model is an encoder-decoder model.

Type:

bool

pad_token

The pad token used by the model.

Type:

str

embed_scale

Value used to scale the embeddings.

Type:

float

device

The device on which the model is located.

Type:

str

attribution_method

The attribution method used alongside the model.

Type:

FeatureAttribution

is_hooked

Whether the model is currently hooked by the attribution method.

Type:

bool

default_attributed_fn_id

The identifier of the default step function used as the attribution target.

Type:

str
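
A minimal sketch of how these attributes can be inspected after loading a model. The model identifier and attribution method below are illustrative assumptions, not requirements, and the printed values are indicative:

    import inseq

    # Load a wrapped model (identifiers are illustrative).
    model = inseq.load_model("gpt2", "saliency")

    print(model.model_name)                # e.g. "gpt2"
    print(model.is_encoder_decoder)        # False for a causal LM
    print(model.device)                    # e.g. "cpu" or "cuda"
    print(model.default_attributed_fn_id)  # id of the default attribution target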

attribute(input_texts: Union[str, Sequence[str]], generated_texts: Optional[Union[str, Sequence[str]]] = None, method: Optional[str] = None, override_default_attribution: Optional[bool] = False, attr_pos_start: Optional[int] = None, attr_pos_end: Optional[int] = None, show_progress: bool = True, pretty_progress: bool = True, output_step_attributions: bool = False, attribute_target: bool = False, step_scores: List[str] = [], include_eos_baseline: bool = False, attributed_fn: Optional[Union[str, Callable[[...], Tensor[Tensor]]]] = None, device: Optional[str] = None, batch_size: Optional[int] = None, **kwargs) FeatureAttributionOutput[source]

Perform sequential attribution of input texts for every token in generated texts using the specified method.

Parameters:
  • input_texts (str or list(str)) – One or more input texts to be attributed.

  • generated_texts (str or list(str), optional) – One or more generated texts to be used as targets for the attribution. Must match the number of input texts. If not provided, the model will be used to generate the texts from the input texts (default behavior). Specifying this argument enables attribution for constrained decoding, which should be interpreted carefully in the presence of distributional shifts compared to natural generations (Vamvas and Sennrich, 2021).

  • method (str, optional) – The identifier associated to the attribution method to use. If not provided, the default attribution method specified when initializing the model will be used.

  • override_default_attribution (bool, optional) – Whether to permanently override the default attribution method specified when initializing the model, or to use the method specified above only for the current attribution.

  • attr_pos_start (int, optional) – The starting position of the attribution. If not provided, the whole input text will be attributed. Allows for span-targeted attribution of generated texts.

  • attr_pos_end (int, optional) – The ending position of the attribution. If not provided, the whole input text will be attributed. Allows for span-targeted attribution of generated texts.

  • show_progress (bool) – Whether to show a progress bar for the attribution. default: True.

  • pretty_progress (bool, optional) – Whether to show a pretty progress bar for the attribution. Automatically set to False for IPython environments due to visualization issues. If False, a simple tqdm progress bar will be used. default: True.

  • output_step_attributions (bool, optional) – Whether to fill the step_attributions field in FeatureAttributionOutput with step-wise attributions for each generated token. default: False.

  • attribute_target (bool, optional) – Specific to encoder-decoder models. Whether to attribute the target prefix alongside the input text. default: False. Note that an encoder-decoder attribution not accounting for the target prefix does not correctly reflect the overall input importance, since part of the input is not included in the attribution.

  • step_scores (list(str)) – A list of step function identifiers specifying the step scores to be computed alongside the attribution process. Available step functions are listed in list_step_functions().

  • include_eos_baseline (bool, optional) – Whether to include the EOS token in attributed tokens when using an attribution method requiring a baseline. default: False.

  • attributed_fn (str or Callable, optional) – The identifier associated to the step function to be used as attribution target. If not provided, the one specified in default_attributed_fn_id (model default) will be used. If the provided string is not a registered step function, an error will be raised. If a callable is provided, it must be a function matching the requirements for a step function.

  • device (str, optional) – The device to use for the attribution. If not provided, the default model device will be used.

  • batch_size (int, optional) – The batch size used to split the attribution computation over the set of inputs. If no batch size is provided, the full set of input texts will be attributed at once.

  • **kwargs – Additional keyword arguments. These can include keyword arguments for the attribution method, for the generation process or for the attributed function. Generation arguments can be provided explicitly as a dictionary named generation_args.

Returns:

The attribution output object containing the attribution scores, step scores, optionally step-wise attributions, and general information concerning the attributed texts and the attribution process.

Return type:

FeatureAttributionOutput
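
A minimal usage sketch of attribute(). The model identifier and attribution method are illustrative, and "probability" is assumed to be a registered step function id (see list_step_functions()):

    import inseq

    model = inseq.load_model("Helsinki-NLP/opus-mt-en-fr", "integrated_gradients")

    # Attribute a single input; the target text is generated by the model itself.
    out = model.attribute(
        input_texts="Hello everyone, hope you're enjoying the tutorial!",
        attribute_target=True,        # also attribute the target prefix (encoder-decoder only)
        step_scores=["probability"],  # assumed step function id
    )
    out.show()  # visualize the attribution maps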

configure_interpretable_embeddings(**kwargs) None[source]

Configure the model with interpretable embeddings for gradient attribution.

This method needs to be defined for models that cannot receive embeddings directly from their forward method parameters, to allow the usage of interpretable embeddings as a surrogate for feature attribution methods. Models that support precomputed embedding inputs by design can skip this method.

abstract forward(attributed_fn: Callable[[...], Tensor[Tensor]], attributed_fn_argnames: Optional[List[str]] = None, *args, **kwargs) Tensor[Tensor][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

remove_interpretable_embeddings(**kwargs) None[source]

Removes interpretable embeddings used for gradient attribution.

If the configure_interpretable_embeddings method is defined, this method needs to be defined to allow restoring original embeddings for the model. This is required for methods using the decorator @unhooked since they require the original model capabilities.

setup(device: Optional[str] = None, attribution_method: Optional[str] = None, **kwargs) None[source]

Moves the model to the specified device and sets it in eval mode.

Framework Classes

class inseq.models.HuggingfaceModel(model: Union[str, PreTrainedModel], attribution_method: Optional[str] = None, tokenizer: Optional[Union[str, PreTrainedTokenizer]] = None, device: Optional[str] = None, **kwargs)[source]

Model wrapper for any ForCausalLM or ForConditionalGeneration model on the HuggingFace Hub, used to enable feature attribution. Corresponds to the AutoModelForCausalLM and AutoModelForSeq2SeqLM auto classes.

_autoclass

The HuggingFace model class to use for initialization. Must be defined in subclasses.

Type:

Type[transformers.AutoModel]

model

the model on which attribution is performed.

Type:

transformers.AutoModelForCausalLM or transformers.AutoModelForSeq2SeqLM

tokenizer

the tokenizer associated to the model.

Type:

transformers.AutoTokenizer

device

the device on which the model is run.

Type:

str

encoder_int_embeds

the interpretable embedding layer for the encoder, used for layer attribution methods in Captum.

Type:

captum.InterpretableEmbeddingBase

decoder_int_embeds

the interpretable embedding layer for the decoder, used for layer attribution methods in Captum.

Type:

captum.InterpretableEmbeddingBase

embed_scale

scale factor for embeddings.

Type:

float, optional

tokenizer_name

The name of the tokenizer on the HuggingFace Hub. Default: the model name.

Type:

str, optional

abstract configure_embeddings_scale() None[source]

Configure the scale factor for embeddings.

encode(texts: Union[str, Sequence[str]], as_targets: bool = False, return_baseline: bool = False, include_eos_baseline: bool = False, max_input_length: int = 512) BatchEncoding[source]

Encode one or multiple texts, producing a BatchEncoding.

Parameters:
  • texts (str or list of str) – the texts to tokenize.

  • return_baseline (bool, optional) – if True, baseline token ids are returned.

Returns:

An encoding object containing token ids and attention masks.

Return type:

BatchEncoding
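
A short sketch of encode(). The input_ids and attention_mask field names are assumptions inferred from the description above, and the as_targets flag is assumed to tokenize texts as decoder targets for encoder-decoder models:

    # Encode a pair of source texts.
    enc = model.encode(["The cat sat on the mat.", "Hello world!"])
    print(enc.input_ids)       # token ids, one row per input text (assumed field)
    print(enc.attention_mask)  # 1 for real tokens, 0 for padding (assumed field)

    # For encoder-decoder models, tokenize texts on the target side.
    tgt = model.encode(["Le chat est assis sur le tapis."], as_targets=True)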

generate(inputs: Union[str, Sequence[str], BatchEncoding], return_generation_output: bool = False, **kwargs) Union[List[str], Tuple[List[str], ModelOutput]][source]

Wrapper around model.generate handling tokenization and decoding.

Parameters:
  • inputs (Union[TextInput, BatchEncoding]) – Inputs to be provided to the model for generation.

  • return_generation_output (bool, optional, defaults to False) – If True, generation outputs are returned alongside the generated text.

Returns:

Generated text or a tuple of generated text and generation outputs.

Return type:

Union[List[str], Tuple[List[str], ModelOutput]]
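
A short sketch of the generation wrapper; max_new_tokens is assumed to be a keyword argument forwarded to the underlying HuggingFace generate() call:

    # Plain generation: returns a list of decoded strings.
    texts = model.generate("Hello everyone,", max_new_tokens=10)

    # Also return the raw ModelOutput produced by the underlying generate() call.
    texts, gen_out = model.generate(
        "Hello everyone,", return_generation_output=True, max_new_tokens=10
    )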

static load(model: Union[str, PreTrainedModel], attribution_method: Optional[str] = None, tokenizer: Optional[Union[str, PreTrainedTokenizer]] = None, device: Optional[str] = None, **kwargs) HuggingfaceModel[source]

Loads a HuggingFace model and tokenizer and wraps them in the appropriate AttributionModel.
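
A minimal loading sketch. The top-level inseq.load_model() helper is assumed to dispatch to this static method for HuggingFace models; identifiers are illustrative:

    import inseq
    from inseq.models import HuggingfaceModel

    # Equivalent ways of obtaining a wrapped, attribution-ready model.
    model = inseq.load_model("gpt2", attribution_method="saliency")
    model = HuggingfaceModel.load("gpt2", attribution_method="saliency")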

class inseq.models.HuggingfaceEncoderDecoderModel(model: Union[str, PreTrainedModel], attribution_method: Optional[str] = None, tokenizer: Optional[Union[str, PreTrainedTokenizer]] = None, device: Optional[str] = None, **kwargs)[source]

Model wrapper for any ForConditionalGeneration model on the HuggingFace Hub, used to enable feature attribution. Corresponds to the AutoModelForSeq2SeqLM auto class in HF transformers.

model

the model on which attribution is performed.

Type:

transformers.AutoModelForSeq2SeqLM

configure_embeddings_scale()[source]

Configure the scale factor for embeddings.

class inseq.models.HuggingfaceDecoderOnlyModel(model: Union[str, PreTrainedModel], attribution_method: Optional[str] = None, tokenizer: Optional[Union[str, PreTrainedTokenizer]] = None, device: Optional[str] = None, **kwargs)[source]

Model wrapper for any ForCausalLM or LMHead model on the HuggingFace Hub, used to enable feature attribution. Corresponds to the AutoModelForCausalLM auto class in HF transformers.

model

the model on which attribution is performed.

Type:

transformers.AutoModelForCausalLM

configure_embeddings_scale()[source]

Configure the scale factor for embeddings.
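
A small sketch showing that the concrete wrapper class is selected based on the architecture of the loaded model (model identifiers are illustrative):

    import inseq

    enc_dec = inseq.load_model("Helsinki-NLP/opus-mt-en-fr", "saliency")
    dec_only = inseq.load_model("gpt2", "saliency")

    print(type(enc_dec).__name__)   # expected: HuggingfaceEncoderDecoderModel
    print(type(dec_only).__name__)  # expected: HuggingfaceDecoderOnlyModel
    print(enc_dec.is_encoder_decoder, dec_only.is_encoder_decoder)  # True False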

Architectural Classes

class inseq.models.EncoderDecoderAttributionModel(**kwargs)[source]

AttributionModel class for attributing encoder-decoder models.

static enrich_step_output(step_output: FeatureAttributionStepOutput, batch: EncoderDecoderBatch, target_tokens: Sequence[Sequence[str]], target_ids: Tensor[Tensor]) FeatureAttributionStepOutput[source]

Enriches the attribution output with token information, producing the finished FeatureAttributionStepOutput object.

Parameters:
  • step_output (FeatureAttributionStepOutput) – The output produced by the attribution step, with missing batch information.

  • batch (EncoderDecoderBatch) – The batch on which attribution was performed.

  • target_ids (torch.Tensor) – Target token ids of size (batch_size, 1) corresponding to tokens for which the attribution step was performed.

Returns:

The enriched attribution output.

Return type:

FeatureAttributionStepOutput

forward(encoder_tensors: Union[Tensor[Tensor], Tensor[Tensor]], decoder_input_embeds: Union[Tensor[Tensor], Tensor[Tensor]], encoder_input_ids: Tensor[Tensor], decoder_input_ids: Tensor[Tensor], target_ids: Tensor[Tensor], attributed_fn: Callable[[...], Tensor[Tensor]], encoder_attention_mask: Optional[Tensor[Tensor]] = None, decoder_attention_mask: Optional[Tensor[Tensor]] = None, use_embeddings: bool = True, attributed_fn_argnames: Optional[List[str]] = None, *args) Tensor[Tensor][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

prepare_inputs_for_attribution(inputs: Tuple[Union[str, Sequence[str], BatchEncoding, Batch], Union[str, Sequence[str], BatchEncoding, Batch]], include_eos_baseline: bool = False) EncoderDecoderBatch[source]

Prepares sources and targets to produce an EncoderDecoderBatch. There are two stages of preparation:

  1. Raw text sources and target texts are encoded by the model.

  2. The encoded sources and targets are embedded to produce the tensors used in the forward pass.

This method is agnostic of the preparation stage of sources and targets. If they are both raw text, they will undergo both steps. If they are already encoded, they will only be embedded. If the feature attribution method works on layers, the embedding step is skipped and embeddings are set to None. The final result will be consistent in both cases.

Parameters:
  • sources (FeatureAttributionInput) – The sources provided to the prepare() method.

  • targets (FeatureAttributionInput) – The targets provided to the prepare() method.

  • include_eos_baseline (bool, optional) – Whether to include the EOS token in the baseline for attribution. By default the EOS token is not used for attribution. Defaults to False.

Returns:

An EncoderDecoderBatch object containing sources and targets in encoded and embedded formats for all inputs.

Return type:

EncoderDecoderBatch

class inseq.models.DecoderOnlyAttributionModel(**kwargs)[source]

AttributionModel class for attributing decoder-only models.

static enrich_step_output(step_output: FeatureAttributionStepOutput, batch: DecoderOnlyBatch, target_tokens: Sequence[Sequence[str]], target_ids: Tensor[Tensor]) FeatureAttributionStepOutput[source]

Enriches the attribution output with token information, producing the finished FeatureAttributionStepOutput object.

Parameters:
  • step_output (FeatureAttributionStepOutput) – The output produced by the attribution step, with missing batch information.

  • batch (DecoderOnlyBatch) – The batch on which attribution was performed.

  • target_ids (torch.Tensor) – Target token ids of size (batch_size, 1) corresponding to tokens for which the attribution step was performed.

Returns:

The enriched attribution output.

Return type:

FeatureAttributionStepOutput

forward(forward_tensor: Union[Tensor[Tensor], Tensor[Tensor]], input_ids: Tensor[Tensor], target_ids: Tensor[Tensor], attributed_fn: Callable[[...], Tensor[Tensor]], attention_mask: Optional[Tensor[Tensor]] = None, use_embeddings: bool = True, attributed_fn_argnames: Optional[List[str]] = None, *args) Tensor[Tensor][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.