deepwalk

图嵌入/图表示学习

Graph Embedding将图中的节点以低维稠密向量的形式进行表达，要求在原始图中相似(不同的方法对相似的定义不同)的节点其在低维表达空间也接近。得到的表达向量可以用来进行下游任务，如节点分类，链接预测，可视化或重构原始图等。

deepwalk原理

https://zhuanlan.zhihu.com/p/56380812

在NLP任务中，word2vec是一种常用的word embedding方法，word2vec通过语料库中的句子序列来描述词与词的共现关系，进而学习到词语的向量表示。

DeepWalk的思想类似word2vec，使用图中节点与节点的共现关系来学习节点的向量表示。那么关键的问题就是如何来描述节点与节点的共现关系，DeepWalk给出的方法是使用随机游走(RandomWalk)的方式在图中进行节点采样。

RandomWalk是一种可重复访问已访问节点的DFS算法。给定当前访问起始节点，从其邻居中随机采样节点作为下一个访问节点，重复此过程，直到访问序列长度满足预设条件。

获取足够数量的节点访问序列后（这是得到的训练数据），使用skip-gram model 进行向量学习。

deepwalk伪代码

https://arxiv.org/abs/1403.6652

SkipGram is a language model that maximizes the cooccurrence probability among the words that appear within a window, w, in a sentence.

Lines 3-9 in Algorithm 1 shows the core of our approach. The outer loop specifies the number of times, γ, which we should start random walks at each vertex. We think of each iteration as making a ‘pass’ over the data and sample one walk per node during this pass. At the start of each pass we generate a random ordering to traverse the vertices. This is not strictly required, but is well-known to speed up the convergence of stochastic gradient descent.

node2vec中解释了为啥每轮都要从所有顶点出发做随机游走：
in any random walk, there is an implicit bias due to the choice of the start node u. Since we learn representations for all nodes, we offset this bias by simulating r random walks of fixed length l starting from every node.

In the inner loop, we iterate over all the vertices of the graph. For each vertex vi we generate a random walk |Wvi | = t, and then use it to update our representations (Line 7).

Algorithm 2 iterates over all possible collocations in random walk that appear within the window w (lines 1-2). For each, we map each vertex vj to its current representation vector Φ(vj ) ∈ R d. Given the representation of vj , we would like to maximize the probability of its neighbors in the walk (line 3; 传进来的Wvi是从vi出发随机走t步经过的点的序列，认为在点vj前后宽度w内的点都是离vj很近的点（和vj结构上共现的概率预期要大）).

优化

其中Algorithm 2 的 line 3的概率计算：