Inseq CLI

The Inseq CLI is a command-line interface for the Inseq library. It enables repeated attribution of individual examples, and even entire 🤗 datasets, directly from the console. See the available options by typing inseq -h in the terminal after installing the package.

Three commands are supported:

  • inseq attribute: A wrapper enabling the use of model.attribute from the console.

  • inseq attribute-dataset: Extends attribute to entire datasets using the Hugging Face datasets.load_dataset API.

  • inseq attribute-context: Detects and attributes context dependence in generation tasks using the approach of Sarti et al. (2023).

attribute

The attribute command enables attribution of individual examples directly from the console. The command takes the following arguments:

class inseq.commands.attribute.attribute_args.AttributeWithInputsArgs(model_name_or_path: str | None = None, attribution_method: str | None = 'saliency', device: str = 'cpu', attributed_fn: str | None = None, attribution_selectors: list[int] | None = None, attribution_aggregators: list[str] | None = None, normalize_attributions: bool = False, model_kwargs: dict = <factory>, tokenizer_kwargs: dict = <factory>, generation_kwargs: dict = <factory>, attribution_kwargs: dict = <factory>, attribute_target: bool = False, generate_from_target_prefix: bool = False, step_scores: list[str] = <factory>, output_step_attributions: bool = False, include_eos_baseline: bool = False, batch_size: int = 8, aggregate_output: bool = False, hide_attributions: bool = False, save_path: str | None = None, viz_path: str | None = None, start_pos: int | None = None, end_pos: int | None = None, verbose: bool = False, very_verbose: bool = False, input_texts: list[str] | None = None, generated_texts: list[str] | None = None)

Attributes:

model_name_or_path (str): The name or path of the model on which attribution is performed.

attribution_method (Optional[str]): The attribution method used to perform feature attribution.

device (str): The device used for inference with PyTorch. Multi-GPU is not supported.

attributed_fn (Optional[str]): The attribution target used for the attribution method. Default: probability. If a step function requiring additional arguments is used (e.g. contrast_prob_diff), they should be specified using the attribution_kwargs argument.

attribution_selectors (Optional[list[int]]): The indices of the attribution scores to be used for the attribution aggregation. If specified, the aggregation function is applied only to the selected scores, and the other scores are discarded. If not specified, the aggregation function is applied to all scores.

attribution_aggregators (list[str]): The aggregators used to aggregate the attribution scores for each context. The outcome should produce one score per input token.

normalize_attributions (bool): Whether to normalize the attribution scores for each context. If True, the attribution scores for each context are normalized to sum up to 1, providing a relative notion of input salience.

model_kwargs (dict): Additional keyword arguments passed to the model constructor, in JSON format.

tokenizer_kwargs (dict): Additional keyword arguments passed to the tokenizer constructor, in JSON format.

generation_kwargs (dict): Additional keyword arguments passed to the generation method, in JSON format.

attribution_kwargs (dict): Additional keyword arguments passed to the attribution method, in JSON format.

attribute_target (bool): Performs the attribution procedure including the generated target prefix at every step.

generate_from_target_prefix (bool): Whether the generated_texts should be used as target prefixes for the generation process. If False, the generated_texts are used as full targets. This option is only available for encoder-decoder models, since for decoder-only models it is sufficient to prepend the prefix to the input string. Default: False.

step_scores (list[str]): Adds the specified step scores to the attribution output.

output_step_attributions (bool): Adds step-level feature attributions to the output.

include_eos_baseline (bool): Whether the EOS token should be included in the baseline, used by some attribution methods.

batch_size (int): The batch size used for the attribution computation. Default: 8.

aggregate_output (bool): If specified, the attribution output is aggregated using its default aggregator before saving.

hide_attributions (bool): If specified, the attribution visualization is not shown in the output.

save_path (Optional[str]): Path where the attribution output should be saved in JSON format.

viz_path (Optional[str]): Path where the attribution visualization should be saved in HTML format.

start_pos (Optional[int]): Start position for the attribution. Default: first token.

end_pos (Optional[int]): End position for the attribution. Default: last token.

verbose (bool): If specified, use INFO as the logging level for the attribution.

very_verbose (bool): If specified, use DEBUG as the logging level for the attribution.

input_texts (list[str]): One or more input texts used for generation.

generated_texts (Optional[list[str]]): If specified, constrains the decoding procedure to the specified outputs.
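
For illustration, a single-example run saving both the raw scores and an HTML visualization could look like the following. This is a minimal sketch: the model, input text, and JSON kwargs are placeholders, and the flag spellings are assumed to mirror the argument names listed above.

  inseq attribute \
    --model_name_or_path gpt2 \
    --attribution_method saliency \
    --input_texts "Hello ladies and" \
    --model_kwargs '{"low_cpu_mem_usage": true}' \
    --save_path attribution.json \
    --viz_path attribution.html

Since no generated_texts are provided, the model first generates a continuation of the input, and the attribution is computed over that generation.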

attribute-dataset

The attribute-dataset command extends the attribute command to entire datasets using the Hugging Face datasets.load_dataset API. The command takes the following arguments:

class inseq.commands.attribute_dataset.attribute_dataset_args.LoadDatasetArgs(dataset_name: str, input_text_field: str | None, generated_text_field: str | None = None, dataset_config: str | None = None, dataset_dir: str | None = None, dataset_files: list[str] | None = None, dataset_split: str | None = 'train', dataset_revision: str | None = None, dataset_auth_token: str | None = None, dataset_kwargs: dict | None = <factory>)

Attributes:

dataset_name (str): The type of dataset to be loaded for attribution.

input_text_field (Optional[str]): Name of the field containing the input texts used for attribution.

generated_text_field (Optional[str]): Name of the field containing the generated texts used for constrained decoding.

dataset_config (Optional[str]): The name of the Hugging Face dataset configuration.

dataset_dir (Optional[str]): Path to the directory containing the data files.

dataset_files (Optional[list[str]]): Paths to the dataset files.

dataset_split (Optional[str]): The dataset split to use. Default: train.

dataset_revision (Optional[str]): The Hugging Face dataset revision.

dataset_auth_token (Optional[str]): The authentication token used to access the Hugging Face dataset.

dataset_kwargs (Optional[dict]): Additional keyword arguments passed to the dataset constructor, in JSON format.

class inseq.commands.attribute.attribute_args.AttributeExtendedArgs(model_name_or_path: str | None = None, attribution_method: str | None = 'saliency', device: str = 'cpu', attributed_fn: str | None = None, attribution_selectors: list[int] | None = None, attribution_aggregators: list[str] | None = None, normalize_attributions: bool = False, model_kwargs: dict = <factory>, tokenizer_kwargs: dict = <factory>, generation_kwargs: dict = <factory>, attribution_kwargs: dict = <factory>, attribute_target: bool = False, generate_from_target_prefix: bool = False, step_scores: list[str] = <factory>, output_step_attributions: bool = False, include_eos_baseline: bool = False, batch_size: int = 8, aggregate_output: bool = False, hide_attributions: bool = False, save_path: str | None = None, viz_path: str | None = None, start_pos: int | None = None, end_pos: int | None = None, verbose: bool = False, very_verbose: bool = False)

Attributes:

model_name_or_path (str): The name or path of the model on which attribution is performed.

attribution_method (Optional[str]): The attribution method used to perform feature attribution.

device (str): The device used for inference with PyTorch. Multi-GPU is not supported.

attributed_fn (Optional[str]): The attribution target used for the attribution method. Default: probability. If a step function requiring additional arguments is used (e.g. contrast_prob_diff), they should be specified using the attribution_kwargs argument.

attribution_selectors (Optional[list[int]]): The indices of the attribution scores to be used for the attribution aggregation. If specified, the aggregation function is applied only to the selected scores, and the other scores are discarded. If not specified, the aggregation function is applied to all scores.

attribution_aggregators (list[str]): The aggregators used to aggregate the attribution scores for each context. The outcome should produce one score per input token.

normalize_attributions (bool): Whether to normalize the attribution scores for each context. If True, the attribution scores for each context are normalized to sum up to 1, providing a relative notion of input salience.

model_kwargs (dict): Additional keyword arguments passed to the model constructor, in JSON format.

tokenizer_kwargs (dict): Additional keyword arguments passed to the tokenizer constructor, in JSON format.

generation_kwargs (dict): Additional keyword arguments passed to the generation method, in JSON format.

attribution_kwargs (dict): Additional keyword arguments passed to the attribution method, in JSON format.

attribute_target (bool): Performs the attribution procedure including the generated target prefix at every step.

generate_from_target_prefix (bool): Whether the generated_texts should be used as target prefixes for the generation process. If False, the generated_texts are used as full targets. This option is only available for encoder-decoder models, since for decoder-only models it is sufficient to prepend the prefix to the input string. Default: False.

step_scores (list[str]): Adds the specified step scores to the attribution output.

output_step_attributions (bool): Adds step-level feature attributions to the output.

include_eos_baseline (bool): Whether the EOS token should be included in the baseline, used by some attribution methods.

batch_size (int): The batch size used for the attribution computation. Default: 8.

aggregate_output (bool): If specified, the attribution output is aggregated using its default aggregator before saving.

hide_attributions (bool): If specified, the attribution visualization is not shown in the output.

save_path (Optional[str]): Path where the attribution output should be saved in JSON format.

viz_path (Optional[str]): Path where the attribution visualization should be saved in HTML format.

start_pos (Optional[int]): Start position for the attribution. Default: first token.

end_pos (Optional[int]): End position for the attribution. Default: last token.

verbose (bool): If specified, use INFO as the logging level for the attribution.

very_verbose (bool): If specified, use DEBUG as the logging level for the attribution.
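
Combining the dataset-loading arguments with the attribution arguments above, a run over a slice of a 🤗 dataset could look like the following sketch, in which the dataset, text field, split slice, and model are illustrative placeholders:

  inseq attribute-dataset \
    --model_name_or_path gpt2 \
    --attribution_method saliency \
    --dataset_name stanfordnlp/sst2 \
    --input_text_field sentence \
    --dataset_split "validation[:10]" \
    --batch_size 4 \
    --save_path sst2_attributions.json

Here the sentence field of each example is used as the input text, and generation plus attribution proceed as in the single-example attribute case.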

attribute-context

The attribute-context command detects and attributes context dependence for generation tasks using the approach of Sarti et al. (2023). The command takes the following arguments:

class inseq.commands.attribute_context.attribute_context_args.AttributeContextArgs(show_intermediate_outputs: bool = False, save_path: str | None = None, add_output_info: bool = True, viz_path: str | None = None, show_viz: bool = True, model_name_or_path: str | None = None, attribution_method: str | None = 'saliency', device: str = 'cpu', attributed_fn: str | None = None, attribution_selectors: list[int] | None = None, attribution_aggregators: list[str] | None = None, normalize_attributions: bool = False, model_kwargs: dict = <factory>, tokenizer_kwargs: dict = <factory>, generation_kwargs: dict = <factory>, attribution_kwargs: dict = <factory>, context_sensitivity_metric: str = 'kl_divergence', handle_output_context_strategy: str = 'manual', contextless_output_next_tokens: list[str] = <factory>, prompt_user_for_contextless_output_next_tokens: bool = False, special_tokens_to_keep: list[str] = <factory>, decoder_input_output_separator: str = ' ', context_sensitivity_std_threshold: float = 1.0, context_sensitivity_topk: int | None = None, attribution_std_threshold: float = 1.0, attribution_topk: int | None = None, input_current_text: str = '', input_context_text: str | None = None, input_template: str | None = None, output_context_text: str | None = None, output_current_text: str | None = None, output_template: str | None = None, contextless_input_current_text: str | None = None, contextless_output_current_text: str | None = None)

Attributes:

show_intermediate_outputs (bool): If specified, the intermediate outputs produced by the Inseq library for context-sensitive target identification (CTI) and contextual cues imputation (CCI) are shown during the process.

save_path (Optional[str]): If present, the output of the two-step process is saved in JSON format at the specified path.

add_output_info (bool): If specified, additional information about the attribution process is added to the saved output.

viz_path (Optional[str]): If specified, the visualization produced from the output is saved in HTML format at the specified path.

show_viz (bool): If specified, the visualization produced from the output is shown in the terminal.

model_name_or_path (str): The name or path of the model on which attribution is performed.

attribution_method (Optional[str]): The attribution method used to perform feature attribution.

device (str): The device used for inference with PyTorch. Multi-GPU is not supported.

attributed_fn (Optional[str]): The attribution target used for the attribution method. Default: probability. If a step function requiring additional arguments is used (e.g. contrast_prob_diff), they should be specified using the attribution_kwargs argument.

attribution_selectors (Optional[list[int]]): The indices of the attribution scores to be used for the attribution aggregation. If specified, the aggregation function is applied only to the selected scores, and the other scores are discarded. If not specified, the aggregation function is applied to all scores.

attribution_aggregators (list[str]): The aggregators used to aggregate the attribution scores for each context. The outcome should produce one score per input token.

normalize_attributions (bool): Whether to normalize the attribution scores for each context. If True, the attribution scores for each context are normalized to sum up to 1, providing a relative notion of input salience.

model_kwargs (dict): Additional keyword arguments passed to the model constructor, in JSON format.

tokenizer_kwargs (dict): Additional keyword arguments passed to the tokenizer constructor, in JSON format.

generation_kwargs (dict): Additional keyword arguments passed to the generation method, in JSON format.

attribution_kwargs (dict): Additional keyword arguments passed to the attribution method, in JSON format.

context_sensitivity_metric (str): The contrastive metric used to detect context-sensitive tokens in output_current_text.

handle_output_context_strategy (str): Specifies how output context should be handled when it is produced together with the output current text and the two need to be separated for context sensitivity detection. Options:

  • manual: The user is prompted to verify an automatic context detection attempt, and optionally to provide the correct context separation manually.

  • auto: Attempts an automatic detection of the context using an alignment with the source context (assuming an MT-like task).

  • pre: If context is required but not pre-defined by the user via the output_context_text argument, execution fails instead of prompting the user for the output context.

contextless_output_next_tokens (list[str]): If specified, provides a list with one token per CCI output, indicating the next token that should be force-decoded as contextless output instead of the natural output produced by get_contextless_output. This is ignored if the attributed_fn used is not contrastive.

prompt_user_for_contextless_output_next_tokens (bool): If specified, the user is prompted to provide the next token that should be force-decoded as contextless output instead of the natural output produced by get_contextless_output. This is ignored if the attributed_fn used is not contrastive.

special_tokens_to_keep (list[str]): Special tokens to preserve in the generated string, e.g. a <brk> separator between context and current text.

decoder_input_output_separator (str): The separator used to split the input and output of the decoder. If not specified, the separator is a whitespace character.

context_sensitivity_std_threshold (float): Parameter controlling the selection of output_current_text tokens considered context-sensitive for moving onwards with attribution. Corresponds to the number of standard deviations above or below the mean context_sensitivity_metric score required for tokens to be considered context-sensitive.

context_sensitivity_topk (Optional[int]): If set, after selecting the salient context-sensitive tokens with context_sensitivity_std_threshold, only the top-K remaining tokens are used. By default, no top-k selection is performed.

attribution_std_threshold (float): Parameter controlling the selection of input_context_text and output_context_text tokens considered salient as a result of the attribution process. Corresponds to the number of standard deviations above or below the mean attribution_method score required for tokens to be considered salient. CCI scores for all context tokens are saved in the output, but this parameter controls which tokens are used in the visualization of context reliance.

attribution_topk (Optional[int]): If set, after selecting the most salient tokens with attribution_std_threshold, only the top-K remaining tokens are used. By default, no top-k selection is performed.

input_current_text (str): The input text used for generation. If the model is a decoder-only model, the input text is a prompt used for language modeling. If the model is an encoder-decoder model, the input text is the source text provided as input to the encoder. It will be formatted as {current} in the input_template.

input_context_text (Optional[str]): Additional input context influencing the generation of output_current_text. If the model is a decoder-only model, the input context is a prefix to the input_current_text prompt. If the model is an encoder-decoder model, the input context is part of the source text provided as input to the encoder. It will be formatted as {context} in the input_template.

input_template (Optional[str]): The template used to format model inputs. The template must contain at least the {current} placeholder, which will be replaced by input_current_text. If {context} is also specified, input-side context will be used. Can be modified for models requiring special tokens or formatting in the input text (e.g. <brk> tags to separate context and current inputs). Defaults to '{context} {current}' if input_context_text is provided, and to '{current}' otherwise.

output_context_text (Optional[str]): An output context for which context sensitivity should be detected. For encoder-decoder models, this is a target-side prefix to the output_current_text used as input to the decoder. For decoder-only models, this is a portion of the model generation that should be considered as additional context (e.g. a chain-of-thought sequence). It will be formatted as {context} in the output_template. If not provided but specified in the output_template, the output context will be generated along with the output current text, and user validation might be required to separate the two.

output_current_text (Optional[str]): The output text generated by the model when all available contexts are provided. Tokens in output_current_text will be tested for context sensitivity, and their generation will be attributed to input/target contexts (if present) in case they are found to be context-sensitive. If specified, this output is force-decoded. Otherwise, it is generated by the model using the infilled input_template and output_template. It will be formatted as {current} in the output_template.

output_template (Optional[str]): The template used to format model outputs. The template must contain at least the {current} placeholder, which will be replaced by output_current_text. If {context} is also specified, output-side context will be used. Can be modified for models requiring special tokens or formatting in the output text (e.g. <brk> tags to separate context and current outputs). Defaults to '{context} {current}' if output_context_text is provided, and to '{current}' otherwise.

contextless_input_current_text (Optional[str]): The input current text or template to use in the contrastive comparison with the contextual input. By default it is the same as input_current_text, but it can be useful in cases where the context is nested inside the current text (e.g. for an input_template like <user> {context} {current} <assistant>, this parameter can be used to format the contextless version as <user> {current} <assistant>). If it contains the tag {current}, it will be infilled with input_current_text. Otherwise, it will be used as-is for the contrastive comparison, enabling comparison with different inputs.

contextless_output_current_text (Optional[str]): The output current text or template to use in the contrastive comparison with the contextual output. By default it is the same as output_current_text, but it can be useful in cases where the context is nested inside the current text (e.g. for an output_template like <user> {context} {current} <assistant>, this parameter can be used to format the contextless version as <user> {current} <assistant>). If it contains the tag {current}, it will be infilled with output_current_text. Otherwise, it will be used as-is for the contrastive comparison, enabling comparison with different outputs.
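
Putting these arguments together, a minimal run with a decoder-only model could look like the following sketch (model and texts are illustrative, and flag spellings are assumed to mirror the argument names listed above). Using contrast_prob_diff as the attributed function makes the CCI step contrast the contextual and contextless predictions directly:

  inseq attribute-context \
    --model_name_or_path gpt2 \
    --attributed_fn contrast_prob_diff \
    --input_context_text "George was sick yesterday." \
    --input_current_text "His colleagues asked him if"

With these inputs, the command first identifies tokens in the generated continuation whose prediction changes when the input context is removed (CTI), and then attributes those tokens back to the context tokens driving the change (CCI).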