(EMNLP2019) Heterogeneous Graph Attention Networks for Semi-supervised Short Text Classification

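From a team at BUPT (Beijing University of Posts and Telecommunications). The paper proposes a heterogeneous information network (HIN) that can integrate several kinds of additional information, plus a dual-level graph attention network (GAT), and uses the additional information to help semi-supervised short text classification (STC). On AGNews, accuracy rises from 67.67 (TextGCN) to 72.10.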

Methods

HIN for STC

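Topic Memory Networks for Short Text Classification proposes generating latent topics, and some other methods use a knowledge base (KB) to obtain extra knowledge, hoping to enrich the semantics of short texts. These methods overlook the important entity information in short texts.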

The example HIN proposed in the paper uses topic and entity information:

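topics are extracted with LDA; an edge is built when a text contains a topic
entities are extracted with the TAGME tool and first embedded with w2v; an edge is built when $\cos(word_i, entity)>\delta$; node embeddings are initialized with w2v + TF-IDF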
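The similarity-threshold rule above can be sketched in a few lines. This is a minimal illustration only, assuming embeddings are already available as plain lists; the helper names `cosine` and `build_edges` and the toy 2-d vectors are hypothetical and are not taken from the paper's code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_u * norm_v)


def build_edges(word_vecs, entity_vecs, delta):
    """Return (word, entity) pairs whose cosine similarity exceeds delta."""
    edges = []
    for word, wv in word_vecs.items():
        for entity, ev in entity_vecs.items():
            if cosine(wv, ev) > delta:
                edges.append((word, entity))
    return edges


# Toy 2-d vectors (hypothetical values, not real word2vec output).
word_vecs = {"apple": [1.0, 0.0], "river": [0.0, 1.0]}
entity_vecs = {"Apple_Inc.": [0.9, 0.1]}
edges = build_edges(word_vecs, entity_vecs, delta=0.5)
```

With these toy vectors only the pair ("apple", "Apple_Inc.") clears the threshold; a larger $\delta$ gives a sparser graph.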

GCN

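In the topic-text-entity graph, nodes of different types have embeddings of different dimensions, so they cannot be aggregated directly. Concatenating the feature spaces (with zeros on the axes belonging to other types) ignores the differences between types, so the paper instead uses a heterogeneous graph convolution:

$H^{(l+1)}=\sigma\left(\sum_{\tau\in\Tau}\widetilde{A}_\tau \cdot H_\tau^{(l)} \cdot W_\tau^{(l)}\right),\quad \text{where } \widetilde{A}_\tau\in\mathbb{R}^{|V|\times|V_\tau|} \text{ is the submatrix of } \widetilde{A}$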

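The rows of the sub-adjacency matrix $\widetilde{A}_\tau$ correspond to all nodes, and its columns correspond to the neighbor nodes of type $\tau$.

A separate transformation matrix $W_\tau$ per type projects the node embeddings of each type into a common space, so that they can be aggregated into $H^{(l+1)}$.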
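As a rough sketch of the heterogeneous convolution, the pure-Python snippet below applies one layer to a toy graph with two text nodes and one topic node. ReLU is assumed for $\sigma$, the adjacency values are hypothetical and unnormalized, and the function names are illustrative rather than the authors' implementation.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def relu(m):
    """Element-wise ReLU on a matrix."""
    return [[max(0.0, x) for x in row] for row in m]


def hetero_gcn_layer(adj_by_type, feats_by_type, weights_by_type):
    """Compute ReLU(sum over types of A_tau @ H_tau @ W_tau)."""
    total = None
    for tau in adj_by_type:
        # Project type-tau features into the shared space: |V_tau| x d_out.
        projected = matmul(feats_by_type[tau], weights_by_type[tau])
        # Route the projected features to every node: |V| x d_out.
        message = matmul(adj_by_type[tau], projected)
        if total is None:
            total = message
        else:
            total = [[x + y for x, y in zip(r1, r2)]
                     for r1, r2 in zip(total, message)]
    return relu(total)


# Toy graph: nodes 0 and 1 are texts (2-d features), node 2 is a topic (3-d).
adj = {"text": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],   # |V| x |V_text|
       "topic": [[1.0], [1.0], [1.0]]}                  # |V| x |V_topic|
feats = {"text": [[1.0, 2.0], [3.0, 4.0]],
         "topic": [[1.0, 0.0, -1.0]]}
weights = {"text": [[1.0, 0.0], [0.0, 1.0]],            # 2 -> 2
           "topic": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]}  # 3 -> 2
out = hetero_gcn_layer(adj, feats, weights)
```

Although the text and topic features have different input dimensions (2 and 3), each $W_\tau$ maps its type to the same output dimension, which is what makes the sum well-defined.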

dual-level attention

type-level

For a node $v$, compute $h_\tau$, the sum of the embeddings of all its neighbors of type $\tau$:

$h_\tau=\sum_{v'}\widetilde{A}_{vv'}h_{v'}$

Use $h_\tau$ to compute the attention score for type $\tau$:

$a_\tau=\sigma(\mu_\tau^\top \cdot [h_v\|h_\tau])$

Normalize across types to obtain the type-level attention weights:

$\alpha_\tau=\frac{\exp(a_\tau)}{\sum_{\tau'\in\Tau}\exp(a_{\tau'})}$
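The type-level computation can be illustrated as follows. This is a sketch under assumptions: LeakyReLU is used for $\sigma$, all vectors are plain lists of the same dimension, and the names `type_level_attention`, `h_by_type` and `mu_by_type` are hypothetical.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU, assumed here as the activation sigma."""
    return x if x > 0 else slope * x


def type_level_attention(h_v, h_by_type, mu_by_type):
    """Softmax over types of sigma(mu_tau^T [h_v || h_tau])."""
    scores = {}
    for tau, h_tau in h_by_type.items():
        concat = h_v + h_tau  # list concatenation plays the role of [h_v || h_tau]
        dot = sum(m * x for m, x in zip(mu_by_type[tau], concat))
        scores[tau] = leaky_relu(dot)
    max_score = max(scores.values())  # subtract the max for numerical stability
    exps = {tau: math.exp(s - max_score) for tau, s in scores.items()}
    z = sum(exps.values())
    return {tau: e / z for tau, e in exps.items()}


# Toy inputs (hypothetical values).
h_v = [1.0, 0.0]
h_by_type = {"topic": [1.0, 1.0], "entity": [0.0, 0.0]}
mu_by_type = {"topic": [1.0, 1.0, 1.0, 1.0], "entity": [1.0, 1.0, 1.0, 1.0]}
alpha = type_level_attention(h_v, h_by_type, mu_by_type)
```

Here the topic neighborhood scores 3 and the entity neighborhood scores 1 before the softmax, so the topic type receives the larger weight.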

node-level

For a node $v$, compute its attention score with each neighbor $v'$ of type $\tau'$: $b_{vv'}=\sigma(\nu^\top \cdot \alpha_{\tau'}[h_v\|h_{v'}])$, where $\nu$ is the attention vector.

Normalize over the neighbors of $v$ to obtain the node-level attention weights:

$\beta_{vv'}=\frac{\exp(b_{vv'})}{\sum_{i\in\mathcal{N}_v}\exp(b_{vi})}$
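A small sketch of the node-level step for a single node, showing how the type-level weight scales each neighbor's score before the softmax. It assumes all embeddings have already been projected to a common dimension, that LeakyReLU is used for $\sigma$, and that the type-level weights are given; the function and variable names are hypothetical.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU, assumed here as the activation sigma."""
    return x if x > 0 else slope * x


def node_level_attention(h_v, neighbors, alpha_by_type, nu):
    """Softmax over neighbors of sigma(nu^T (alpha_tau' * [h_v || h_v'])).

    neighbors: list of (node_id, node_type, embedding) tuples.
    """
    scores = {}
    for node_id, tau, h_n in neighbors:
        # Scale the concatenated pair by the neighbor type's attention weight.
        weighted = [alpha_by_type[tau] * x for x in h_v + h_n]
        scores[node_id] = leaky_relu(sum(n * x for n, x in zip(nu, weighted)))
    max_score = max(scores.values())  # subtract the max for numerical stability
    exps = {k: math.exp(s - max_score) for k, s in scores.items()}
    z = sum(exps.values())
    return {k: e / z for k, e in exps.items()}


# Toy inputs (hypothetical values): two neighbors with identical embeddings
# but different types, so only the type-level weight separates them.
h_v = [1.0, 0.0]
neighbors = [("t1", "topic", [1.0, 0.0]), ("e1", "entity", [1.0, 0.0])]
alpha_by_type = {"topic": 0.75, "entity": 0.25}
nu = [1.0, 1.0, 1.0, 1.0]
beta = node_level_attention(h_v, neighbors, alpha_by_type, nu)
```

Because the two neighbors share the same embedding, the difference in their weights comes entirely from the type-level attention, which is the coupling between the two levels.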

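Propagation:

$H^{(l+1)}=\sigma\left(\sum_{\tau\in\Tau}\mathcal{B}_\tau\cdot H_\tau^{(l)}\cdot W_\tau^{(l)}\right)$

Here $\mathcal{B}_\tau$ is the attention matrix whose entries are the node-level attention weights; it takes the place of $\widetilde{A}_\tau$ in the heterogeneous graph convolution.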
