1. Motivations
Groups exist!
- Cohesive subgroup / cluster / community…
- In examining network data, we would first try to detect the cohesive subgroups and then, by looking at common attributes, see if there was some underlying principle that could explain why they identify with each other.
Need for data reduction
- Can analyze each group separately
- Can aggregate subgroups into “super-nodes” and visualize network of groups - simplification
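The super-node idea can be sketched in a few lines. This is only an illustration under assumptions: the edge list, the group labels, and the helper name `collapse_to_supernodes` are all hypothetical, and group membership is taken as already known.

```python
from collections import Counter


def collapse_to_supernodes(edges, membership):
    """Count ties between groups; ties inside a group are dropped."""
    weights = Counter()
    for u, v in edges:
        gu, gv = membership[u], membership[v]
        if gu != gv:
            # Sort the pair so (G1, G2) and (G2, G1) count as the same tie.
            weights[tuple(sorted((gu, gv)))] += 1
    return dict(weights)


# Hypothetical undirected edge list and group assignment.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("b", "d")]
membership = {"a": "G1", "b": "G1", "c": "G2", "d": "G2", "e": "G3"}

print(collapse_to_supernodes(edges, membership))
# {('G1', 'G2'): 2, ('G2', 'G3'): 1}
```

The result is a weighted network of groups: five nodes reduce to three super-nodes, and the weights record how many original ties run between each pair of groups.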
Groups affect social processes we care about (groups as the cause)
- Groups are powerful influencers/constrainers of members
- E.g., groups create homogeneity in attitudes and behaviors, i.e., contagion
- People in groups become more similar to each other
Groups are the outcomes of social processes we care about (groups as the outcome)
- Emergent patterns from local rules
- Homophily
- Propinquity provides opportunity
Using visualization for community detection is not very reliable; a dedicated algorithm is needed.
2. Hierarchical clustering
Johnson’s Hierarchical Clustering - HiClus: Multivariate techniques in SNA
- Output is a set of nested partitions, starting with the identity partition (each node is its own cluster) and ending with the complete partition (all nodes form a single cluster)
- Conditions of applicability
- Undirected network (i.e. symmetric data)
- Doesn’t work well with a binary matrix of 0s and 1s – not enough variation to play with
- Geo distance matrix
- Reciprocal distance
- ……
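The nested-partition output can be illustrated with a small agglomerative sketch. This is not HiClus itself: it assumes a symmetric distance matrix with valued entries, uses single-link merging by default (pass `linkage=max` for complete-link), and the function name `nested_partitions` and the matrix `D` are hypothetical.

```python
from itertools import combinations


def nested_partitions(dist, linkage=min):
    """Agglomerative clustering on a symmetric distance matrix.

    Returns the partitions in order, from the identity partition (every
    node alone) to the complete partition (all nodes together).
    linkage=min is single-link; linkage=max is complete-link.
    """
    clusters = [frozenset([i]) for i in range(len(dist))]
    levels = [sorted(sorted(c) for c in clusters)]
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest linkage distance.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: linkage(
                dist[i][j] for i in pair[0] for j in pair[1]
            ),
        )
        # Merge that pair and record the new, coarser partition.
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        levels.append(sorted(sorted(c) for c in clusters))
    return levels


# Hypothetical distances among four nodes (symmetric, zero diagonal).
D = [
    [0, 1, 4, 5],
    [1, 0, 3, 6],
    [4, 3, 0, 2],
    [5, 6, 2, 0],
]

for level in nested_partitions(D):
    print(level)
# [[0], [1], [2], [3]]
# [[0, 1], [2], [3]]
# [[0, 1], [2, 3]]
# [[0, 1, 2, 3]]
```

Each level is obtained from the previous one by merging exactly two clusters, which is why the partitions are nested. With a 0/1 matrix most pairs would tie at the same distance, so the merge order would be largely arbitrary.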