【2】Interpreting BERT Models with the Captum Library
Captum is a model interpretability and understanding library for PyTorch. Captum means "comprehension" in Latin, and it contains general-purpose implementations of Integrated Gradients, saliency maps, SmoothGrad, VarGrad and more for PyTorch models. It can quickly integrate with models built using domain-specific libraries such as torchvision and torchtext.
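To give a feel for the API, here is a minimal, self-contained sketch (our own illustration, not part of the tutorial) that computes Integrated Gradients attributions for a toy model:

import torch
import torch.nn as nn
from captum.attr import IntegratedGradients

# toy regression model: 3 input features -> 1 output
toy_model = nn.Sequential(nn.Linear(3, 8), nn.ReLU(), nn.Linear(8, 1))
toy_input = torch.rand(1, 3)
baseline = torch.zeros(1, 3)  # an all-zero reference input

ig = IntegratedGradients(toy_model)
# attributions have the same shape as the input; delta measures how far they are
# from satisfying the completeness axiom
attributions, delta = ig.attribute(toy_input, baselines=baseline, return_convergence_delta=True)
print(attributions, delta)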
Interpreting BERT Models (Part 2)
https://github.com/pytorch/captum/blob/master/tutorials/Bert_SQUAD_Interpret2.ipynb
In the second part of interpreting BERT models we look into attention matrices, their importance scores and vector norms, and compare them with the results that we found in Part 1.
Similar to Part 1, we use a BERT Question Answering model fine-tuned on the SQUAD dataset using the transformers library from Hugging Face: https://huggingface.co/transformers/
In order to be able to use the same setup and reproduce the results from Part 1, we will redefine the same setup and helper functions in this tutorial as well.
In this tutorial we compare attention matrices with the importance scores obtained when we attribute them to a particular class, and with the vector norms proposed in the paper: https://arxiv.org/pdf/2004.10102.pdf
We show that the importance scores computed for the attention matrices and a specific class are more meaningful than the attention matrices alone or the different norm vectors computed for different input activations.
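For reference, the idea in the paper linked above is, as we understand it, to analyze the norm of the weighted, transformed vector, ||α_ij f(x_j)||, rather than the raw attention weight α_ij alone, where α_ij is the attention weight from token i to token j and f(x_j) is the transformed representation of token j.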
Note: Before running this tutorial, please install the seaborn, pandas, matplotlib and transformers (from Hugging Face) Python packages in addition to the Captum and torch libraries. This tutorial was built using transformers version 4.3.0.
import os
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

import torch
import torch.nn as nn

from transformers import BertTokenizer, BertForQuestionAnswering, BertConfig

from captum.attr import visualization as viz
from captum.attr import IntegratedGradients, LayerConductance, LayerIntegratedGradients, LayerActivation
from captum.attr import configure_interpretable_embedding_layer, remove_interpretable_embedding_layer
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
The first step is to fine-tune the BERT model on the SQUAD dataset. This can be easily accomplished by following the steps described on Hugging Face's official website: https://github.com/huggingface/transformers#run_squadpy-fine-tuning-on-squad-for-question-answering
Note that the fine-tuning is done on a bert-base-cased pre-trained model.
After we fine-tune the model, we can load the tokenizer and the fine-tuned BERT model using the commands described below.
# replace <PATH-TO-SAVED-MODEL> with the real path of the saved model
model_path = 'bert-base-cased-squad2'

# load model
model = BertForQuestionAnswering.from_pretrained(model_path, output_attentions=True)
model.to(device)
model.eval()
model.zero_grad()

# load tokenizer
tokenizer = BertTokenizer.from_pretrained(model_path, do_lower_case=False)
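If you prefer not to run the fine-tuning yourself, a publicly available fine-tuned checkpoint can stand in for the saved model; for example (our assumption, not part of the original tutorial), the deepset/bert-base-cased-squad2 checkpoint on the Hugging Face Hub matches the path above:

model_path = 'deepset/bert-base-cased-squad2'  # hypothetical alternative, downloaded from the Hub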
A helper function to perform the forward pass of the model and make predictions. Note that it also returns output.attentions, which is available because the model was loaded with output_attentions=True.
def predict(inputs, token_type_ids=None, position_ids=None, attention_mask=None):
    output = model(inputs,
                   token_type_ids=token_type_ids,
                   position_ids=position_ids,
                   attention_mask=attention_mask)
    return output.start_logits, output.end_logits, output.attentions
Defining a custom forward function that will allow us to access the start and end positions of our prediction using the position input argument; position=0 selects the start logits and position=1 the end logits, and the maximum logit over the sequence is returned as the scalar that attributions are computed against.

def squad_pos_forward_func(inputs, token_type_ids=None, position_ids=None, attention_mask=None, position=0):
    pred = model(inputs_embeds=inputs,
                 token_type_ids=token_type_ids,
                 position_ids=position_ids,
                 attention_mask=attention_mask)
    pred = pred[position]
    return pred.max(1).values
Let's define some variables and functions that will help us compute the attribution of attention matrices for a specific output, such as the start or end position of the prediction.
To do so, we need to define baselines / references and numericalize both the baselines and the inputs. We will define helper functions to achieve that.
The cell below defines numericalized special tokens that will later be used for constructing inputs and the corresponding baselines / references.
ref_token_id = tokenizer.pad_token_id  # A token used for generating token reference
sep_token_id = tokenizer.sep_token_id  # A token used as a separator between question and text; it is also added to the end of the text.
cls_token_id = tokenizer.cls_token_id  # A token used for prepending to the concatenated question-text word sequence
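As a quick check, these ids can be printed; for the bert-base-cased vocabulary they typically resolve to 0 ([PAD]), 102 ([SEP]) and 101 ([CLS]), though the exact values depend on the tokenizer:

print(ref_token_id, sep_token_id, cls_token_id)  # e.g. 0 102 101 for bert-base-cased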
Below we define a set of helper functions for constructing references / baselines for word tokens, token types and position ids.
def construct_input_ref_pair(question, text, ref_token_id, sep_token_id, cls_token_id):
    question_ids = tokenizer.encode(question, add_special_tokens=False)
    text_ids = tokenizer.encode(text, add_special_tokens=False)

    # construct input token ids
    input_ids = [cls_token_id] + question_ids + [sep_token_id] + text_ids + [sep_token_id]

    # construct reference token ids
    ref_input_ids = [cls_token_id] + [ref_token_id] * len(question_ids) + [sep_token_id] + \
        [ref_token_id] * len(text_ids) + [sep_token_id]

    return torch.tensor([input_ids], device=device), torch.tensor([ref_input_ids], device=device), len(question_ids)

def construct_input_ref_token_type_pair(input_ids, sep_ind=0):
    seq_len = input_ids.size(1)
    token_type_ids = torch.tensor([[0 if i <= sep_ind else 1 for i in range(seq_len)]], device=device)
    ref_token_type_ids = torch.zeros_like(token_type_ids, device=device)
    return token_type_ids, ref_token_type_ids

def construct_input_ref_pos_id_pair(input_ids):
    seq_length = input_ids.size(1)
    position_ids = torch.arange(seq_length, dtype=torch.long, device=device)
    # we could potentially also use random permutation with `torch.randperm(seq_length, device=device)`
    ref_position_ids = torch.zeros(seq_length, dtype=torch.long, device=device)

    position_ids = position_ids.unsqueeze(0).expand_as(input_ids)
    ref_position_ids = ref_position_ids.unsqueeze(0).expand_as(input_ids)
    return position_ids, ref_position_ids

def construct_attention_mask(input_ids):
    return torch.ones_like(input_ids)

def construct_whole_bert_embeddings(input_ids, ref_input_ids,
                                    token_type_ids=None, ref_token_type_ids=None,
                                    position_ids=None, ref_position_ids=None):
    input_embeddings = interpretable_embedding.indices_to_embeddings(input_ids)
    ref_input_embeddings = interpretable_embedding.indices_to_embeddings(ref_input_ids)
    return input_embeddings, ref_input_embeddings
Let's define the question - text pair that we'd like to use as an input for our BERT model, and interpret what the model was focusing on when predicting an answer to the question from the given input text.

question, text = "What is important to us?", "It is important to us to include, empower and support humans of all kinds."
Let's numericalize the question and the input text, and generate corresponding baselines / references for all three sub-embedding types (word, token type and position embeddings) using the helper functions defined above.
input_ids, ref_input_ids, sep_id = construct_input_ref_pair(question, text, ref_token_id, sep_token_id, cls_token_id)
token_type_ids, ref_token_type_ids = construct_input_ref_token_type_pair(input_ids, sep_id)
position_ids, ref_position_ids = construct_input_ref_pos_id_pair(input_ids)
attention_mask = construct_attention_mask(input_ids)

indices = input_ids[0].detach().tolist()
all_tokens = tokenizer.convert_ids_to_tokens(indices)
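To verify the construction, it can be helpful to decode the ids back into tokens (a purely illustrative check, not part of the original tutorial):

print(all_tokens)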
Also, let's define the ground truth for the prediction's start and end positions.
ground_truth = 'to include, empower and support humans of all kinds'

ground_truth_tokens = tokenizer.encode(ground_truth, add_special_tokens=False)
ground_truth_end_ind = indices.index(ground_truth_tokens[-1])
ground_truth_start_ind = ground_truth_end_ind - len(ground_truth_tokens) + 1
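A quick sanity check (our addition) that the recovered indices actually span the answer:

print(' '.join(all_tokens[ground_truth_start_ind : ground_truth_end_ind + 1]))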
Now let’s make predictions using input, token type, position id and a default attention mask.
start_scores, end_scores, output_attentions = predict(input_ids,
                                                      token_type_ids=token_type_ids,
                                                      position_ids=position_ids,
                                                      attention_mask=attention_mask)

print('Question: ', question)
print('Predicted Answer: ', ' '.join(all_tokens[torch.argmax(start_scores) : torch.argmax(end_scores)+1]))
Question: What is important to us?
Predicted Answer: to include , em ##power and support humans of all kinds
Visualizing Attention Matrices
output_attentions represents the attention matrices, aka attention probabilities, for all 12 layers and all 12 heads. It is the softmax-normalized dot product between the key and query vectors. In the literature (https://www.aclweb.org/anthology/W19-4828.pdf) it has been used as an indicator of how much a token attends / relates to another token in the text. In the case of translation, for example, it is a good indicator of how much a token in one language attends to the corresponding translation in another language. In the case of a Question Answering model, it indicates which tokens attend / relate to each other in the question, text or answer segment.

Since output_attentions contains the layers in a list, we will stack them in order to move everything into one tensor.

# shape -> layer x batch x head x seq_len x seq_len
output_attentions_all = torch.stack(output_attentions)
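As a quick sanity check (our addition), the stacked tensor should follow the shape in the comment above; for a bert-base model both the layer and head dimensions are 12:

print(output_attentions_all.shape)  # e.g. torch.Size([12, 1, 12, seq_len, seq_len])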
A helper function for visualizing Token-To-Token matrices
The helper function below will be used for visualizing token-to-token relation / attention scores for all heads in a given layer, or for all layers across all heads.
def visualize_token2token_scores(scores_mat, x_label_name='Head'):
    fig = plt.figure(figsize=(20, 20))

    for idx, scores in enumerate(scores_mat):
        scores_np = np.array(scores)
        ax = fig.add_subplot(4, 3, idx+1)
        # plot the attention weights
        im = ax.imshow(scores_np, cmap='viridis')

        fontdict = {'fontsize': 10}

        ax.set_xticks(range(len(all_tokens)))
        ax.set_yticks(range(len(all_tokens)))

        ax.set_xticklabels(all_tokens, fontdict=fontdict, rotation=90)
        ax.set_yticklabels(all_tokens, fontdict=fontdict)
        ax.set_xlabel('{} {}'.format(x_label_name, idx+1))

        fig.colorbar(im, fraction=0.046, pad=0.04)
    plt.tight_layout()
    plt.show()
A helper function for visualizing Token-To-Head matrices
The helper function below will be used for visualizing the importance scores for tokens across all heads in all layers.
def visualize_token2head_scores(scores_mat):
    fig = plt.figure(figsize=(30, 50))

    for idx, scores in enumerate(scores_mat):
        scores_np = np.array(scores)
        ax = fig.add_subplot(6, 2, idx+1)
        # plot the per-head scores for each token
        im = ax.matshow(scores_np, cmap='viridis')

        fontdict = {'fontsize': 20}

        ax.set_xticks(range(len(all_tokens)))
        ax.set_yticks(range(len(scores)))

        ax.set_xticklabels(all_tokens, fontdict=fontdict, rotation=90)
        # label each row with its head index
        ax.set_yticklabels(range(len(scores)), fontdict=fontdict)
        ax.set_xlabel('Layer {}'.format(idx+1))

        fig.colorbar(im, fraction=0.046, pad=0.04)
    plt.tight_layout()
    plt.show()
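Since this section does not show an invocation of this helper, here is a hypothetical one (our own, not from the original tutorial): summarizing each attention row by its norm yields a layers x heads x tokens matrix of the shape the helper expects:

visualize_token2head_scores(torch.norm(output_attentions_all, dim=-1).squeeze().detach().cpu().numpy())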
Let's examine a specific layer. For that we will define a fixed layer id that will be used for visualization purposes. Users are free to change this layer id if they want to examine a different one; bert-base has 12 layers, indexed 0 through 11.
layer = 11
Visualizing attention matrices for the selected layer, layer.
visualize_token2token_scores(output_attentions_all[layer].squeeze().detach().cpu().numpy())
Based on the visualizations above we observe that there is high attention along the diagonals and on uninformative tokens such as [SEP]. This has been observed in previous papers as well, and it indicates that attention matrices aren't always a good indicator of which tokens are more important or of which token is related to which. We observe a similar pattern when we examine other layers.

In the cell below we compute and visualize the L2 norm across the head axis for all 12 layers. This provides a summary of each layer across all heads.
Defining the norm function depending on the PyTorch version (torch.linalg.norm was introduced in PyTorch 1.7).
# parse the version numerically; naive string comparison would mis-order versions such as 1.10
major, minor = (int(v) for v in torch.__version__.split('.')[:2])
if (major, minor) >= (1, 7):
    norm_fn = torch.linalg.norm
else:
    norm_fn = torch.norm
visualize_token2token_scores(norm_fn(output_attentions_all, dim=2).squeeze().detach().cpu().numpy(), x_label_name='Layer')
Based on the visualization above we can convince ourselves that attention scores aren't trustworthy measures of importance for token-to-token relations across all layers. We see a strong signal along the diagonal and for the [SEP] and [CLS] tokens. These signals, however, aren't true indicators of the semantics the model learns.

Visualizing attribution / importance scores
In the cells below we visualize the attribution scores of the attention matrices for the start and end position predictions, and compare them with the actual attention matrices. To do so, we first compute the attribution scores using the LayerConductance algorithm, similar to Part 1.
A helper function to summarize attributions for each word token in the sequence.
def summarize_attributions(attributions):
    attributions = attributions.sum(dim=-1).squeeze(0)
    attributions = attributions / norm_fn(attributions)
    return attributions
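To connect the pieces above, here is a minimal sketch (our reconstruction following the Part 1 pattern, not a verbatim excerpt from this tutorial) of how the attention-matrix attributions for a single layer might be computed with LayerConductance. Because the model was loaded with output_attentions=True, the encoder layer returns its attention matrix as a second output, so the layer attributions come back as a tuple:

# wrap the word embedding layer so that precomputed embeddings can be fed in
interpretable_embedding = configure_interpretable_embedding_layer(model, 'bert.embeddings.word_embeddings')
input_embeddings, ref_input_embeddings = construct_whole_bert_embeddings(input_ids, ref_input_ids)

lc = LayerConductance(squad_pos_forward_func, model.bert.encoder.layer[layer])
# position=0 attributes with respect to the start position prediction
layer_attributions = lc.attribute(inputs=input_embeddings,
                                  baselines=ref_input_embeddings,
                                  additional_forward_args=(token_type_ids, position_ids, attention_mask, 0))
# layer_attributions[0]: attributions of the layer's hidden states
# layer_attributions[1]: attributions of the layer's attention matrix
start_token_attributions = summarize_attributions(layer_attributions[0])

# restore the original embedding layer when done
remove_interpretable_embedding_layer(model, interpretable_embedding)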