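gds(2) Community Detection Hands-On - sfjun/neo4j GitHub Wiki
===========================================================
Community detection algorithms
Community detection algorithms are used to evaluate how groups of nodes are clustered or partitioned, and their tendency to strengthen or break apart.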
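Production-quality
- Louvain
- Label Propagation
- Weakly Connected Components
- Triangle Count
- Local Clustering Coefficient
Beta
- K-1 Coloring
- Modularity Optimization
Alpha
- Strongly Connected Components
- Speaker-Listener Label Propagation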
Louvain algorithm
An algorithm for detecting communities in large networks.
Example:
CREATE (nAlice:User {name: 'Alice', seed: 42}), (nBridget:User {name: 'Bridget', seed: 42}), (nCharles:User {name: 'Charles', seed: 42}), (nDoug:User {name: 'Doug'}), (nMark:User {name: 'Mark'}), (nMichael:User {name: 'Michael'}),
(nAlice)-[:LINK {weight: 1}]->(nBridget), (nAlice)-[:LINK {weight: 1}]->(nCharles), (nCharles)-[:LINK {weight: 1}]->(nBridget),
(nAlice)-[:LINK {weight: 5}]->(nDoug),
(nMark)-[:LINK {weight: 1}]->(nDoug), (nMark)-[:LINK {weight: 1}]->(nMichael), (nMichael)-[:LINK {weight: 1}]->(nMark);
Added 6 labels, created 6 nodes, set 16 properties, created 7 relationships, completed after 505 ms.
# Check the graph catalog
CALL gds.graph.list()
# Drop the graph
CALL gds.graph.drop('myGraph')
# Create the graph in the catalog
CALL gds.graph.create( 'myGraph', 'User', { LINK: { orientation: 'UNDIRECTED' } }, { nodeProperties: 'seed', relationshipProperties: 'weight' } )
nodeProjection relationshipProjection graphName nodeCount relationshipCount createMillis
{ "User": { "properties": { "seed": { "property": "seed", "defaultValue": null } }, "label": "User" } } { "LINK": { "orientation": "UNDIRECTED", "aggregation": "DEFAULT", "type": "LINK", "properties": { "weight": { "property": "weight", "aggregation": "DEFAULT", "defaultValue": null } } } } "myGraph" 6 14 31
3.1. Memory estimation
CALL gds.louvain.write.estimate('myGraph', { writeProperty: 'community' }) YIELD nodeCount, relationshipCount, bytesMin, bytesMax, requiredMemory
nodeCount  relationshipCount  bytesMin  bytesMax  requiredMemory
6          14                 5321      580096    "[5321 Bytes ... 566 KiB]"
3.2. Stream
CALL gds.louvain.stream('myGraph') YIELD nodeId, communityId, intermediateCommunityIds RETURN gds.util.asNode(nodeId).name AS name, communityId, intermediateCommunityIds ORDER BY name ASC
name       communityId  intermediateCommunityIds
"Alice"    2            null
"Bridget"  2            null
"Charles"  2            null
"Doug"     5            null
"Mark"     5            null
"Michael"  5            null
--> The algorithm returns a community ID for each node.
3.3. Stats
CALL gds.louvain.stats('myGraph') YIELD communityCount
communityCount
2
3.4. Mutate
CALL gds.louvain.mutate('myGraph', { mutateProperty: 'communityId' }) YIELD communityCount, modularity, modularities
communityCount  modularity          modularities
2               0.3571428571428571  [0.3571428571428571]
3.5. Write
CALL gds.louvain.write('myGraph', { writeProperty: 'community' }) YIELD communityCount, modularity, modularities
communityCount  modularity          modularities
2               0.3571428571428571  [0.3571428571428571]
"n"
{"name":"Alice","seed":42,"community":2}
{"name":"Bridget","seed":42,"community":2}
{"name":"Charles","seed":42,"community":2}
{"name":"Doug","community":5}
{"name":"Mark","community":5}
{"name":"Michael","community":5}
--> Each node now has a community property holding the algorithm's result.
3.6. Weighted
The Louvain algorithm can also run on weighted graphs, taking the given relationship weights into account when calculating modularity.
CALL gds.louvain.stream('myGraph', { relationshipWeightProperty: 'weight' }) YIELD nodeId, communityId, intermediateCommunityIds RETURN gds.util.asNode(nodeId).name AS name, communityId, intermediateCommunityIds ORDER BY name ASC
name       communityId  intermediateCommunityIds
"Alice"    3            null
"Bridget"  2            null
"Charles"  2            null
"Doug"     3            null
"Mark"     5            null
"Michael"  5            null
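The modularity figures in this section can be checked by hand. The following is a minimal Python sketch, not GDS code: the `modularity` helper and the edge list are illustrative assumptions transcribed from the example graph, with every LINK treated as undirected and both Mark-Michael relationships kept. It evaluates the standard modularity formula for the unweighted two-community result and then compares the two partitions under the weight property.

```python
from collections import defaultdict

# Example graph as (source, target, weight); direction is ignored.
EDGES = [
    ("Alice", "Bridget", 1), ("Alice", "Charles", 1), ("Charles", "Bridget", 1),
    ("Alice", "Doug", 5),
    ("Mark", "Doug", 1), ("Mark", "Michael", 1), ("Michael", "Mark", 1),
]

def modularity(edges, partition, weighted=True):
    """Q = sum over communities of (internal weight / m) - (total degree / 2m)^2."""
    community_of = {node: i for i, group in enumerate(partition) for node in group}
    total = 0.0
    internal = defaultdict(float)
    degree = defaultdict(float)
    for u, v, w in edges:
        w = w if weighted else 1.0
        total += w
        degree[community_of[u]] += w
        degree[community_of[v]] += w
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += w
    return sum(internal[c] / total - (degree[c] / (2 * total)) ** 2 for c in degree)

# Partition from the unweighted runs and partition from the weighted stream run.
unweighted_result = [{"Alice", "Bridget", "Charles"}, {"Doug", "Mark", "Michael"}]
weighted_result = [{"Alice", "Doug"}, {"Bridget", "Charles"}, {"Mark", "Michael"}]

q_unweighted = modularity(EDGES, unweighted_result, weighted=False)
q_weighted_old = modularity(EDGES, unweighted_result)
q_weighted_new = modularity(EDGES, weighted_result)
print(q_unweighted, q_weighted_old, q_weighted_new)
```

Under these assumptions the unweighted value agrees with the modularity reported by the mutate and write runs, and with weights the three-community split scores higher, which is consistent with Alice and Doug (weight 5) ending up together.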
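3.7. Seeded
The Louvain algorithm can be run incrementally by providing a seed property. With the seed property, an initial community mapping can be supplied for a subset of the loaded nodes. The algorithm tries to keep the seeded community IDs.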
CALL gds.louvain.stream('myGraph', { seedProperty: 'seed' }) YIELD nodeId, communityId, intermediateCommunityIds RETURN gds.util.asNode(nodeId).name AS name, communityId, intermediateCommunityIds ORDER BY name ASC
name       communityId  intermediateCommunityIds
"Alice"    42           null
"Bridget"  42           null
"Charles"  42           null
"Doug"     47           null
"Mark"     47           null
"Michael"  47           null
3.8. Streaming intermediate communities
# Create the example graph
CREATE (a:Node {name: 'a'}) CREATE (b:Node {name: 'b'}) CREATE (c:Node {name: 'c'}) CREATE (d:Node {name: 'd'}) CREATE (e:Node {name: 'e'}) CREATE (f:Node {name: 'f'}) CREATE (g:Node {name: 'g'}) CREATE (h:Node {name: 'h'}) CREATE (i:Node {name: 'i'}) CREATE (j:Node {name: 'j'}) CREATE (k:Node {name: 'k'}) CREATE (l:Node {name: 'l'}) CREATE (m:Node {name: 'm'}) CREATE (n:Node {name: 'n'}) CREATE (x:Node {name: 'x'})
CREATE (a)-[:TYPE]->(b) CREATE (a)-[:TYPE]->(d) CREATE (a)-[:TYPE]->(f) CREATE (b)-[:TYPE]->(d) CREATE (b)-[:TYPE]->(x) CREATE (b)-[:TYPE]->(g) CREATE (b)-[:TYPE]->(e) CREATE (c)-[:TYPE]->(x) CREATE (c)-[:TYPE]->(f) CREATE (d)-[:TYPE]->(k) CREATE (e)-[:TYPE]->(x) CREATE (e)-[:TYPE]->(f) CREATE (e)-[:TYPE]->(h) CREATE (f)-[:TYPE]->(g) CREATE (g)-[:TYPE]->(h) CREATE (h)-[:TYPE]->(i) CREATE (h)-[:TYPE]->(j) CREATE (i)-[:TYPE]->(k) CREATE (j)-[:TYPE]->(k) CREATE (j)-[:TYPE]->(m) CREATE (j)-[:TYPE]->(n) CREATE (k)-[:TYPE]->(m) CREATE (k)-[:TYPE]->(l) CREATE (l)-[:TYPE]->(n) CREATE (m)-[:TYPE]->(n);
Added 15 labels, created 15 nodes, set 15 properties, created 25 relationships, completed after 86 ms
CALL gds.louvain.stream({ nodeProjection: 'Node', relationshipProjection: { TYPE: { type: 'TYPE', orientation: 'undirected', aggregation: 'NONE' } }, includeIntermediateCommunities: true }) YIELD nodeId, communityId, intermediateCommunityIds RETURN gds.util.asNode(nodeId).name AS name, communityId, intermediateCommunityIds ORDER BY name ASC
name  communityId  intermediateCommunityIds
"a"   14           [6, 14]
"b"   14           [6, 14]
"c"   14           [14, 14]
"d"   14           [6, 14]
"e"   14           [14, 14]
"f"   14           [14, 14]
"g"   4            [4, 4]
"h"   4            [4, 4]
"i"   4            [4, 4]
"j"   12           [12, 12]
"k"   12           [12, 12]
"l"   12           [12, 12]
"m"   12           [12, 12]
"n"   12           [12, 12]
"x"   14           [14, 14]
--> In this example graph, four clusters appear after the first iteration, and the second iteration reduces them to three.
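The cluster counts per iteration can be read straight off the intermediateCommunityIds column. A small Python sketch, with the values copied from the stream output above (the `communities_per_level` helper is a hypothetical name used only here):

```python
# intermediateCommunityIds per node, copied from the stream output.
INTERMEDIATE = {
    "a": [6, 14], "b": [6, 14], "c": [14, 14], "d": [6, 14], "e": [14, 14],
    "f": [14, 14], "g": [4, 4], "h": [4, 4], "i": [4, 4], "j": [12, 12],
    "k": [12, 12], "l": [12, 12], "m": [12, 12], "n": [12, 12], "x": [14, 14],
}

def communities_per_level(intermediate):
    """Return the number of distinct community IDs at each Louvain level."""
    levels = zip(*intermediate.values())
    return [len(set(level)) for level in levels]

counts = communities_per_level(INTERMEDIATE)
print(counts)  # [4, 3]
```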
Label Propagation algorithm (LPA)
Create the example graph
CREATE (alice:User {name: 'Alice', seed_label: 52}), (bridget:User {name: 'Bridget', seed_label: 21}), (charles:User {name: 'Charles', seed_label: 43}), (doug:User {name: 'Doug', seed_label: 21}), (mark:User {name: 'Mark', seed_label: 19}), (michael:User {name: 'Michael', seed_label: 52}),
(alice)-[:FOLLOW {weight: 1}]->(bridget), (alice)-[:FOLLOW {weight: 10}]->(charles), (mark)-[:FOLLOW {weight: 1}]->(doug), (bridget)-[:FOLLOW {weight: 1}]->(michael), (doug)-[:FOLLOW {weight: 1}]->(mark), (michael)-[:FOLLOW {weight: 1}]->(alice), (alice)-[:FOLLOW {weight: 1}]->(michael), (bridget)-[:FOLLOW {weight: 1}]->(alice), (michael)-[:FOLLOW {weight: 1}]->(bridget), (charles)-[:FOLLOW {weight: 1}]->(doug)
Added 6 labels, created 6 nodes, set 22 properties, created 10 relationships, completed after 31 ms.
Register the graph in the catalog
CALL gds.graph.create( 'myGraph', 'User', 'FOLLOW', { nodeProperties: 'seed_label', relationshipProperties: 'weight' } )
nodeProjection relationshipProjection graphName nodeCount relationshipCount createMillis
{ "User": { "properties": { "seed_label": { "property": "seed_label", "defaultValue": null } }, "label": "User" } } { "FOLLOW": { "orientation": "NATURAL", "aggregation": "DEFAULT", "type": "FOLLOW", "properties": { "weight": { "property": "weight", "aggregation": "DEFAULT", "defaultValue": null } } } } "myGraph" 6 10
3.1. Memory estimation
CALL gds.labelPropagation.write.estimate('myGraph', { writeProperty: 'community' }) YIELD nodeCount, relationshipCount, bytesMin, bytesMax, requiredMemory
nodeCount  relationshipCount  bytesMin  bytesMax  requiredMemory
6          10                 1608      1608      "1608 Bytes"
3.2. Stream
CALL gds.labelPropagation.stream('myGraph') YIELD nodeId, communityId AS Community RETURN gds.util.asNode(nodeId).name AS Name, Community ORDER BY Community, Name
Name       Community
"Alice"    1
"Bridget"  1
"Michael"  1
"Charles"  4
"Doug"     4
"Mark"     4
3.3. Stats
CALL gds.labelPropagation.stats('myGraph') YIELD communityCount, ranIterations, didConverge
communityCount  ranIterations  didConverge
2               3              true
3.4. Mutate
CALL gds.labelPropagation.mutate('myGraph', { mutateProperty: 'community' }) YIELD communityCount, ranIterations, didConverge
communityCount  ranIterations  didConverge
2               3              true
3.5. Write
CALL gds.labelPropagation.write('myGraph', { writeProperty: 'community' }) YIELD communityCount, ranIterations, didConverge
communityCount  ranIterations  didConverge
2               3              true
--> The community property and its values are created.
"n"
{"name":"Alice","seed_label":52,"community":1}
{"name":"Bridget","seed_label":21,"community":1}
{"name":"Charles","seed_label":43,"community":4}
{"name":"Doug","seed_label":21,"community":4}
{"name":"Mark","seed_label":19,"community":4}
{"name":"Michael","seed_label":52,"community":1}
3.6. Weighted
CALL gds.labelPropagation.stream('myGraph', { relationshipWeightProperty: 'weight' }) YIELD nodeId, communityId AS Community RETURN gds.util.asNode(nodeId).name AS Name, Community ORDER BY Community, Name
Name       Community
"Bridget"  2
"Michael"  2
"Alice"    4
"Charles"  4
"Doug"     4
"Mark"     4
3.7. Seeded communities
CALL gds.labelPropagation.stream('myGraph', { seedProperty: 'seed_label' }) YIELD nodeId, communityId AS Community RETURN gds.util.asNode(nodeId).name AS Name, Community ORDER BY Community, Name
Name       Community
"Charles"  19
"Doug"     19
"Mark"     19
"Alice"    21
"Bridget"  21
"Michael"  21
Weakly Connected Components (WCC)
Example:
CREATE (nAlice:User {name: 'Alice'}), (nBridget:User {name: 'Bridget'}), (nCharles:User {name: 'Charles'}), (nDoug:User {name: 'Doug'}), (nMark:User {name: 'Mark'}), (nMichael:User {name: 'Michael'}),
(nAlice)-[:LINK {weight: 0.5}]->(nBridget), (nAlice)-[:LINK {weight: 4}]->(nCharles), (nMark)-[:LINK {weight: 1.1}]->(nDoug), (nMark)-[:LINK {weight: 2}]->(nMichael);
Added 6 labels, created 6 nodes, set 10 properties, created 4 relationships, completed after 18 ms.
# Register the graph in the catalog
CALL gds.graph.create( 'myGraph', 'User', 'LINK', { relationshipProperties: 'weight' } )
nodeProjection relationshipProjection graphName nodeCount relationshipCount createMillis
{ "User": { "properties": {}, "label": "User" } } { "LINK": { "orientation": "NATURAL", "aggregation": "DEFAULT", "type": "LINK", "properties": { "weight": { "property": "weight", "aggregation": "DEFAULT", "defaultValue": null } } } } "myGraph" 6 4 12
3.1. Memory estimation
CALL gds.wcc.write.estimate('myGraph', { writeProperty: 'component' }) YIELD nodeCount, relationshipCount, bytesMin, bytesMax, requiredMemory
nodeCount  relationshipCount  bytesMin  bytesMax  requiredMemory
6          4                  176       176       "176 Bytes"
3.2. Stream
CALL gds.wcc.stream('myGraph') YIELD nodeId, componentId RETURN gds.util.asNode(nodeId).name AS name, componentId ORDER BY componentId, name
name       componentId
"Alice"    0
"Bridget"  0
"Charles"  0
"Doug"     3
"Mark"     3
"Michael"  3
3.3. Stats
CALL gds.wcc.stats('myGraph') YIELD componentCount
componentCount
2
3.4. Mutate
CALL gds.wcc.mutate('myGraph', { mutateProperty: 'componentId' }) YIELD nodePropertiesWritten, componentCount;
3.5. Write
CALL gds.wcc.write('myGraph', { writeProperty: 'componentId' }) YIELD nodePropertiesWritten, componentCount;
nodePropertiesWritten  componentCount
6                      2
-->
"n"
{"name":"Alice","componentId":0}
{"name":"Bridget","componentId":0}
{"name":"Charles","componentId":0}
{"name":"Doug","componentId":3}
{"name":"Mark","componentId":3}
{"name":"Michael","componentId":3}
3.6. Weighted
CALL gds.wcc.stream('myGraph', { relationshipWeightProperty: 'weight', threshold: 1.0 }) YIELD nodeId, componentId RETURN gds.util.asNode(nodeId).name AS Name, componentId AS ComponentId ORDER BY ComponentId, Name
Name       ComponentId
"Alice"    0
"Charles"  0
"Bridget"  1
"Doug"     3
"Mark"     3
"Michael"  3
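Why Bridget drops into her own component becomes clear when the threshold is applied by hand. Below is a rough Python sketch using union-find; it is not how GDS is implemented, and it assumes that a relationship only counts when its weight is strictly greater than the threshold. The helper name and the edge list are illustrative, transcribed from the example graph.

```python
NODES = ["Alice", "Bridget", "Charles", "Doug", "Mark", "Michael"]
LINKS = [("Alice", "Bridget", 0.5), ("Alice", "Charles", 4.0),
         ("Mark", "Doug", 1.1), ("Mark", "Michael", 2.0)]

def weakly_connected_components(nodes, links, threshold=None):
    """Union-find over links, ignoring direction; skip links with weight <= threshold."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for u, v, w in links:
        if threshold is not None and w <= threshold:
            continue  # assumption: only weights above the threshold are used
        parent[find(u)] = find(v)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return sorted(groups.values(), key=min)

all_links = weakly_connected_components(NODES, LINKS)
above_one = weakly_connected_components(NODES, LINKS, threshold=1.0)
print(all_links)
print(above_one)
```

With no threshold the sketch yields two components; with threshold 1.0 the Alice-Bridget relationship (weight 0.5) is skipped and Bridget is left alone. The component IDs GDS assigns are not reproduced here, only the grouping.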
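3.7. Seeded components
- Run the algorithm and write the results to Neo4j.
- Then add another node to the graph. This node does not have the property computed in step 1.
- Create a new in-memory graph that includes the result from step 1 as a nodeProperty.
- Then run the algorithm again, this time in stream mode, using the seedProperty configuration parameter.
Step 1: Write the results of the run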
CALL gds.wcc.write('myGraph', { writeProperty: 'componentId', relationshipWeightProperty: 'weight', threshold: 1.0 }) YIELD nodePropertiesWritten, componentCount;
nodePropertiesWritten  componentCount
6                      3
Step 2: Add a node
MATCH (b:User {name: 'Bridget'}) CREATE (b)-[:LINK {weight: 2.0}]->(new:User {name: 'Mats'})
Added 1 label, created 1 node, set 2 properties, created 1 relationship, completed after 23 ms.
Step 3: Create a second in-memory graph (myGraph-seeded)
The following creates a new graph that contains the previously computed component IDs.
CALL gds.graph.create( 'myGraph-seeded', 'User', 'LINK', { nodeProperties: 'componentId', relationshipProperties: 'weight' } )
nodeProjection relationshipProjection graphName nodeCount relationshipCount createMillis
{ "User": { "properties": { "componentId": { "property": "componentId", "defaultValue": null } }, "label": "User" } } { "LINK": { "orientation": "NATURAL", "aggregation": "DEFAULT", "type": "LINK", "properties": { "weight": { "property": "weight", "aggregation": "DEFAULT", "defaultValue": null } } } } "myGraph-seeded" 7 5 14
Step 4: Run the algorithm again in stream mode with seedProperty
CALL gds.wcc.stream('myGraph-seeded', { seedProperty: 'componentId', relationshipWeightProperty: 'weight', threshold: 1.0 }) YIELD nodeId, componentId RETURN gds.util.asNode(nodeId).name AS name, componentId ORDER BY componentId, name
name       componentId
"Alice"    0
"Charles"  0
"Bridget"  1
"Mats"     1
"Doug"     3
"Mark"     3
"Michael"  3
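The seeded run can be pictured as starting from the stored component IDs and only working out where the new node belongs. A hedged Python sketch of that idea follows; the `seeded_components` helper, the fresh-ID rule for unseeded nodes, and the "smaller ID wins" merge rule are assumptions made for illustration, not the GDS implementation.

```python
# componentId values written in step 1; Mats was added later and has no seed.
SEEDS = {"Alice": 0, "Charles": 0, "Bridget": 1, "Doug": 3, "Mark": 3, "Michael": 3}
NODES = list(SEEDS) + ["Mats"]
LINKS = [("Alice", "Bridget", 0.5), ("Alice", "Charles", 4.0), ("Mark", "Doug", 1.1),
         ("Mark", "Michael", 2.0), ("Bridget", "Mats", 2.0)]

def seeded_components(nodes, links, seeds, threshold):
    """Start from seeded component IDs, then merge along links above the threshold."""
    next_free = max(seeds.values()) + 1
    component = {}
    for n in nodes:
        if n in seeds:
            component[n] = seeds[n]
        else:  # unseeded nodes get a fresh, unused ID (assumption)
            component[n] = next_free
            next_free += 1
    changed = True
    while changed:  # spread the smaller ID across each qualifying link until stable
        changed = False
        for u, v, w in links:
            if w > threshold and component[u] != component[v]:
                component[u] = component[v] = min(component[u], component[v])
                changed = True
    return component

result = seeded_components(NODES, LINKS, SEEDS, threshold=1.0)
print(result)
```

In this sketch Mats joins Bridget's component 1 through the weight-2.0 relationship while every seeded ID stays as it was, matching the stream output.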
3.8. Writing seeded components
CALL gds.wcc.write('myGraph-seeded', { seedProperty: 'componentId', writeProperty: 'componentId', relationshipWeightProperty: 'weight', threshold: 1.0 }) YIELD nodePropertiesWritten, componentCount;
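nodePropertiesWritten  componentCount
1                      3
Triangle Count algorithm
Used to detect communities and to measure how cohesive those communities are. Triangle counts and clustering coefficients are useful as features for classifying a given website as spam or non-spam content.
Example: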
CREATE (alice:Person {name: 'Alice'}), (michael:Person {name: 'Michael'}), (karin:Person {name: 'Karin'}), (chris:Person {name: 'Chris'}), (will:Person {name: 'Will'}), (mark:Person {name: 'Mark'}),
(michael)-[:KNOWS]->(karin), (michael)-[:KNOWS]->(chris), (will)-[:KNOWS]->(michael), (mark)-[:KNOWS]->(michael), (mark)-[:KNOWS]->(will), (alice)-[:KNOWS]->(michael), (will)-[:KNOWS]->(chris), (chris)-[:KNOWS]->(karin)
Added 6 labels, created 6 nodes, set 6 properties, created 8 relationships, completed after 35 ms.
CALL gds.graph.create( 'myGraph', 'Person', { KNOWS: { orientation: 'UNDIRECTED' } } )
nodeProjection relationshipProjection graphName nodeCount relationshipCount createMillis
{ "Person": { "properties": {}, "label": "Person" } } { "KNOWS": { "orientation": "UNDIRECTED", "aggregation": "DEFAULT", "type": "KNOWS", "properties": {} } } "myGraph" 6 16 10
3.1. Memory estimation
CALL gds.triangleCount.write.estimate('myGraph', { writeProperty: 'triangleCount' }) YIELD nodeCount, relationshipCount, bytesMin, bytesMax, requiredMemory
nodeCount  relationshipCount  bytesMin  bytesMax  requiredMemory
6          16                 144       144       "144 Bytes"
3.2. Stream
CALL gds.triangleCount.stream('myGraph') YIELD nodeId, triangleCount RETURN gds.util.asNode(nodeId).name AS name, triangleCount ORDER BY triangleCount DESC
name       triangleCount
"Michael"  3
"Chris"    2
"Will"     2
"Karin"    1
"Mark"     1
"Alice"    0
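The per-node counts can be reproduced outside Neo4j with a few lines of Python. This is a plain sketch for checking the numbers: the edge list is transcribed from the example graph and the helper names are made up for illustration. A node's triangle count is taken as the number of pairs of its neighbors that are themselves connected.

```python
from itertools import combinations

KNOWS = [("Michael", "Karin"), ("Michael", "Chris"), ("Will", "Michael"),
         ("Mark", "Michael"), ("Mark", "Will"), ("Alice", "Michael"),
         ("Will", "Chris"), ("Chris", "Karin")]

def build_neighbors(edges):
    """Undirected adjacency sets."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    return neighbors

def triangle_counts(edges):
    """For each node, count pairs of its neighbors that are connected to each other."""
    neighbors = build_neighbors(edges)
    return {
        node: sum(1 for a, b in combinations(sorted(adj), 2) if b in neighbors[a])
        for node, adj in neighbors.items()
    }

counts = triangle_counts(KNOWS)
# Every triangle is counted once at each of its three corners.
global_triangles = sum(counts.values()) // 3
print(counts, global_triangles)
```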
3.3. Stats
CALL gds.triangleCount.stats('myGraph') YIELD globalTriangleCount, nodeCount
globalTriangleCount  nodeCount
3                    6
3.4. Mutate
CALL gds.triangleCount.mutate('myGraph', { mutateProperty: 'triangles' }) YIELD globalTriangleCount, nodeCount
globalTriangleCount  nodeCount
3                    6
3.5. Write
CALL gds.triangleCount.write('myGraph', { writeProperty: 'triangles' }) YIELD globalTriangleCount, nodeCount
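globalTriangleCount  nodeCount
3                    6
3.6. Max degree
The Triangle Count algorithm supports a maxDegree configuration parameter that can be used to exclude nodes from processing when their degree is greater than the configured value. Nodes excluded from the computation are assigned a triangle count of -1.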
CALL gds.triangleCount.stream('myGraph', { maxDegree: 4 }) YIELD nodeId, triangleCount RETURN gds.util.asNode(nodeId).name AS name, triangleCount ORDER BY name ASC
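name       triangleCount
"Alice"    0
"Chris"    0
"Karin"    0
"Mark"     0
"Michael"  -1
"Will"     0
--> Michael has degree 5 (number of relationships), which exceeds maxDegree, so he is assigned -1. Every triangle in this graph includes Michael, so all other nodes report 0.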
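The effect of maxDegree can be imitated in a short Python sketch. It assumes, as the output above suggests, that an excluded node both receives -1 and no longer contributes to its neighbors' triangles; the function name and edge list are illustrative, not GDS internals.

```python
from itertools import combinations

KNOWS = [("Michael", "Karin"), ("Michael", "Chris"), ("Will", "Michael"),
         ("Mark", "Michael"), ("Mark", "Will"), ("Alice", "Michael"),
         ("Will", "Chris"), ("Chris", "Karin")]

def triangle_counts_with_max_degree(edges, max_degree):
    """Triangle counts where nodes with degree > max_degree are excluded and get -1."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    excluded = {n for n, adj in neighbors.items() if len(adj) > max_degree}
    counts = {}
    for node, adj in neighbors.items():
        if node in excluded:
            counts[node] = -1
            continue
        kept = sorted(adj - excluded)  # ignore excluded neighbors entirely
        counts[node] = sum(1 for a, b in combinations(kept, 2) if b in neighbors[a])
    return counts

capped = triangle_counts_with_max_degree(KNOWS, max_degree=4)
print(capped)
```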
- Listing triangles
CALL gds.alpha.triangles('myGraph') YIELD nodeA, nodeB, nodeC RETURN gds.util.asNode(nodeA).name AS nodeA, gds.util.asNode(nodeB).name AS nodeB, gds.util.asNode(nodeC).name AS nodeC
nodeA    nodeB    nodeC
"Will"   "Mark"   "Michael"
"Chris"  "Will"   "Michael"
"Karin"  "Chris"  "Michael"
Local Clustering Coefficient
Reuses the catalog graph from the Triangle Count section as-is.
3.1. Memory estimation
CALL gds.localClusteringCoefficient.write.estimate('myGraph', { writeProperty: 'localClusteringCoefficient' }) YIELD nodeCount, relationshipCount, bytesMin, bytesMax, requiredMemory
nodeCount  relationshipCount  bytesMin  bytesMax  requiredMemory
6          16                 288       288       "288 Bytes"
3.2. Stream
CALL gds.localClusteringCoefficient.stream('myGraph') YIELD nodeId, localClusteringCoefficient RETURN gds.util.asNode(nodeId).name AS name, localClusteringCoefficient ORDER BY localClusteringCoefficient DESC
name       localClusteringCoefficient
"Karin"    1.0
"Mark"     1.0
"Chris"    0.6666666666666666
"Will"     0.6666666666666666
"Michael"  0.3
"Alice"    0.0
3.3. Stats
CALL gds.localClusteringCoefficient.stats('myGraph') YIELD averageClusteringCoefficient, nodeCount
averageClusteringCoefficient  nodeCount
0.6055555555555555            6
--> On average, about 60% of the neighbor pairs of each node in the example graph are connected to each other.
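The coefficient for a node with degree d and T triangles is 2T / (d(d - 1)), and the average above is the mean over all six nodes. A minimal Python sketch to verify the numbers, with the edge list transcribed from the example graph and a hypothetical `local_clustering` helper:

```python
KNOWS = [("Michael", "Karin"), ("Michael", "Chris"), ("Will", "Michael"),
         ("Mark", "Michael"), ("Mark", "Will"), ("Alice", "Michael"),
         ("Will", "Chris"), ("Chris", "Karin")]

def local_clustering(edges):
    """Local clustering coefficient 2T / (d * (d - 1)) per node; 0.0 when d < 2."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    coefficients = {}
    for node, adj in neighbors.items():
        degree = len(adj)
        if degree < 2:
            coefficients[node] = 0.0
            continue
        # a < b makes sure each neighbor pair is counted once
        triangles = sum(1 for a in adj for b in adj if a < b and b in neighbors[a])
        coefficients[node] = 2 * triangles / (degree * (degree - 1))
    return coefficients

coefficients = local_clustering(KNOWS)
average = sum(coefficients.values()) / len(coefficients)
print(coefficients, average)
```

For Michael this gives 2 * 3 / (5 * 4) = 0.3, in line with the stream output.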
3.4. Mutate
CALL gds.localClusteringCoefficient.mutate('myGraph', { mutateProperty: 'localClusteringCoefficient' }) YIELD averageClusteringCoefficient, nodeCount
averageClusteringCoefficient  nodeCount
0.6055555555555555            6
3.5. Write
CALL gds.localClusteringCoefficient.write('myGraph', { writeProperty: 'localClusteringCoefficient' }) YIELD averageClusteringCoefficient, nodeCount
averageClusteringCoefficient  nodeCount
0.6055555555555555            6
"n"
{"triangles":1,"triangles11":1,"name":"Karin","localClusteringCoefficient":1.0}
{"triangles":2,"triangles11":2,"name":"Chris","localClusteringCoefficient":0.6666666666666666}
{"triangles":2,"triangles11":2,"name":"Will","localClusteringCoefficient":0.6666666666666666}
{"triangles":1,"triangles11":1,"name":"Mark","localClusteringCoefficient":1.0}
{"triangles":0,"triangles11":0,"name":"Alice","localClusteringCoefficient":0.0}
{"triangles":3,"triangles11":3,"name":"Michael","localClusteringCoefficient":0.3}
3.6. Pre-computed counts
CALL gds.triangleCount.mutate('myGraph', { mutateProperty: 'triangles' })
Failed to invoke procedure gds.triangleCount.mutate: Caused by: java.lang.IllegalArgumentException:
Node property triangles already exists in the in-memory graph.
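--> The error occurs because the triangles property was already added to the in-memory graph by the mutate call in the Triangle Count section, so it can be reused directly.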
CALL gds.localClusteringCoefficient.stream('myGraph', { triangleCountProperty: 'triangles' }) YIELD nodeId, localClusteringCoefficient RETURN gds.util.asNode(nodeId).name AS name, localClusteringCoefficient ORDER BY localClusteringCoefficient DESC
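name       localClusteringCoefficient
"Karin"    1.0
"Mark"     1.0
"Chris"    0.6666666666666666
"Will"     0.6666666666666666
"Michael"  0.3
"Alice"    0.0
K-1 Coloring
Assigns a color to every node so that all neighbors of a given node have a different color from the node itself, while trying to use as few colors as possible.
Example: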
CREATE (alice:User {name: 'Alice'}), (bridget:User {name: 'Bridget'}), (charles:User {name: 'Charles'}), (doug:User {name: 'Doug'}),
(alice)-[:LINK]->(bridget),
(alice)-[:LINK]->(charles),
(alice)-[:LINK]->(doug),
(bridget)-[:LINK]->(charles)
Added 4 labels, created 4 nodes, set 4 properties, created 4 relationships, completed after 10 ms.
CALL gds.graph.create( 'myGraph', 'User', { LINK : { orientation: 'UNDIRECTED' } } )
nodeProjection relationshipProjection graphName nodeCount relationshipCount createMillis
{ "User": { "properties": {}, "label": "User" } } { "LINK": { "orientation": "UNDIRECTED", "aggregation": "DEFAULT", "type": "LINK", "properties": {} } } "myGraph" 4 8 8
CALL gds.graph.create('myGraph', 'Person', 'LIKES')
Failed to invoke procedure gds.graph.create:
Caused by: java.lang.IllegalArgumentException: A graph with name 'myGraph' already exists.
CALL gds.beta.k1coloring.stream('myGraph') YIELD nodeId, color RETURN gds.util.asNode(nodeId).name AS name, color ORDER BY name
name       color
"Alice"    0
"Bridget"  1
"Charles"  2
"Doug"     1
CALL gds.beta.k1coloring.write('myGraph', {writeProperty: 'color'}) YIELD nodeCount, colorCount, ranIterations, didConverge
nodeCount colorCount ranIterations didConverge 4 3 1 true
CALL gds.beta.k1coloring.mutate('myGraph', {mutateProperty: 'color'}) YIELD nodeCount, colorCount, ranIterations, didConverge
nodeCount colorCount ranIterations didConverge 4 3 1 true
CALL gds.beta.k1coloring.stats('myGraph') YIELD nodeCount, colorCount, ranIterations, didConverge
nodeCount colorCount ranIterations didConverge 4 3 1 true
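To see what a valid coloring looks like on this graph, here is a simple sequential greedy coloring in Python. It is only a sketch of the constraint (no two linked nodes share a color); GDS's K-1 Coloring is a different, parallel algorithm, and the `greedy_coloring` helper and node order are assumptions chosen for illustration.

```python
NODES = ["Alice", "Bridget", "Charles", "Doug"]
LINKS = [("Alice", "Bridget"), ("Alice", "Charles"),
         ("Alice", "Doug"), ("Bridget", "Charles")]

def greedy_coloring(nodes, links):
    """Give each node the smallest color not already used by a colored neighbor."""
    neighbors = {n: set() for n in nodes}
    for u, v in links:
        neighbors[u].add(v)
        neighbors[v].add(u)
    color = {}
    for node in nodes:
        used = {color[n] for n in neighbors[node] if n in color}
        c = 0
        while c in used:
            c += 1
        color[node] = c
    return color

colors = greedy_coloring(NODES, LINKS)
print(colors, len(set(colors.values())))
```

On this small graph the greedy pass happens to land on the same three-color assignment as the stream output; that should not be expected in general.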
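Modularity Optimization
Modularity is a measure of graph structure that quantifies the density of connections within modules or communities. A graph with a high modularity score has many connections inside each community but few that point to other communities. The algorithm examines every node to see whether changing its community to that of one of its neighbors would increase the modularity score.
Example: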
CREATE (a:Person {name:'Alice'}) , (b:Person {name:'Bridget'}) , (c:Person {name:'Charles'}) , (d:Person {name:'Doug'}) , (e:Person {name:'Elton'}) , (f:Person {name:'Frank'}) , (a)-[:KNOWS {weight: 0.01}]->(b) , (a)-[:KNOWS {weight: 5.0}]->(e) , (a)-[:KNOWS {weight: 5.0}]->(f) , (b)-[:KNOWS {weight: 5.0}]->(c) , (b)-[:KNOWS {weight: 5.0}]->(d) , (c)-[:KNOWS {weight: 0.01}]->(e) , (f)-[:KNOWS {weight: 0.01}]->(d)
Added 6 labels, created 6 nodes, set 13 properties, created 7 relationships, completed after 16 ms.
CALL gds.graph.create( 'myGraph', 'Person', { KNOWS: { type: 'KNOWS', orientation: 'UNDIRECTED', properties: ['weight'] } })
nodeProjection relationshipProjection graphName nodeCount relationshipCount createMillis
{ "Person": { "properties": {}, "label": "Person" } } { "KNOWS": { "orientation": "UNDIRECTED", "aggregation": "DEFAULT", "type": "KNOWS", "properties": { "weight": { "property": "weight", "aggregation": "DEFAULT", "defaultValue": null } } } } "myGraph" 6 14 7
CALL gds.beta.modularityOptimization.stream('myGraph', { relationshipWeightProperty: 'weight' }) YIELD nodeId, communityId RETURN gds.util.asNode(nodeId).name AS name, communityId ORDER BY name
name       communityId
"Alice"    1
"Bridget"  3
"Charles"  3
"Doug"     3
"Elton"    1
"Frank"    1
CALL gds.beta.modularityOptimization.write('myGraph', { relationshipWeightProperty: 'weight', writeProperty: 'community' }) YIELD nodes, communityCount, ranIterations, didConverge
nodes communityCount ranIterations didConverge 6 2 3 true
"n"
{"name":"Frank","community":1}
{"name":"Alice","community":1}
{"name":"Bridget","community":3}
{"name":"Charles","community":3}
{"name":"Doug","community":3}
{"name":"Elton","community":1}
When using write mode, the procedure returns information about the algorithm execution. In this example it returns the number of nodes processed, the number of communities assigned to the nodes in the graph, the number of iterations, and whether the algorithm converged.
When the algorithm is run without specifying relationshipWeightProperty, all relationship weights default to 1.0.
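The "move a node to a neighbor's community if modularity goes up" idea can be tried on the example graph with a naive Python sketch. This is not the GDS implementation: it recomputes modularity from scratch for every candidate move, visits nodes in a fixed order, and starts with every node in its own community. The helper names and the edge list (transcribed from the example, treated as undirected) are illustrative.

```python
NAMES = ["Alice", "Bridget", "Charles", "Doug", "Elton", "Frank"]
KNOWS = [("Alice", "Bridget", 0.01), ("Alice", "Elton", 5.0), ("Alice", "Frank", 5.0),
         ("Bridget", "Charles", 5.0), ("Bridget", "Doug", 5.0),
         ("Charles", "Elton", 0.01), ("Frank", "Doug", 0.01)]

def modularity(edges, community):
    """Weighted modularity of the assignment node -> community ID."""
    total = sum(w for _, _, w in edges)
    internal, degree = {}, {}
    for u, v, w in edges:
        for n in (u, v):
            degree[community[n]] = degree.get(community[n], 0.0) + w
        if community[u] == community[v]:
            internal[community[u]] = internal.get(community[u], 0.0) + w
    return sum(internal.get(c, 0.0) / total - (d / (2 * total)) ** 2
               for c, d in degree.items())

def optimize(nodes, edges):
    """Repeat greedy single-node moves until no move improves modularity."""
    community = {n: i for i, n in enumerate(nodes)}  # every node starts alone
    neighbors = {n: [] for n in nodes}
    for u, v, _ in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    improved = True
    while improved:
        improved = False
        for node in nodes:
            original = community[node]
            best_q, best_c = modularity(edges, community), original
            for other in neighbors[node]:
                community[node] = community[other]  # try the neighbor's community
                q = modularity(edges, community)
                if q > best_q + 1e-12:
                    best_q, best_c = q, community[other]
            community[node] = best_c  # keep the best option (possibly no move)
            if best_c != original:
                improved = True
    return community

community = optimize(NAMES, KNOWS)
print(community, modularity(KNOWS, community))
```

On this graph the sketch settles on the same two groups as the write output, Alice/Elton/Frank and Bridget/Charles/Doug, because the weight-5.0 relationships dominate the weight-0.01 ones. The community IDs themselves differ from those GDS reports.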
Strongly Connected Components (SCC) algorithm
Finds maximal sets of connected nodes in a directed graph. A set is considered a strongly connected component if there is a directed path between every pair of nodes in the set.
Example:
CREATE (nAlice:User {name:'Alice'}) CREATE (nBridget:User {name:'Bridget'}) CREATE (nCharles:User {name:'Charles'}) CREATE (nDoug:User {name:'Doug'}) CREATE (nMark:User {name:'Mark'}) CREATE (nMichael:User {name:'Michael'})
CREATE (nAlice)-[:FOLLOW]->(nBridget) CREATE (nAlice)-[:FOLLOW]->(nCharles) CREATE (nMark)-[:FOLLOW]->(nDoug) CREATE (nMark)-[:FOLLOW]->(nMichael) CREATE (nBridget)-[:FOLLOW]->(nMichael) CREATE (nDoug)-[:FOLLOW]->(nMark) CREATE (nMichael)-[:FOLLOW]->(nAlice) CREATE (nAlice)-[:FOLLOW]->(nMichael) CREATE (nBridget)-[:FOLLOW]->(nAlice) CREATE (nMichael)-[:FOLLOW]->(nBridget);
Added 6 labels, created 6 nodes, set 6 properties, created 10 relationships, completed after 16 ms.
CALL gds.alpha.scc.write({ nodeProjection: 'User', relationshipProjection: 'FOLLOW', writeProperty: 'componentId' }) YIELD setCount, maxSetSize, minSetSize;
setCount maxSetSize minSetSize 3 3 1
"n"
{"name":"Alice","componentId":0}
{"name":"Bridget","componentId":0}
{"name":"Charles","componentId":2}
{"name":"Michael","componentId":0}
{"name":"Doug","componentId":4}
{"name":"Mark","componentId":4}
CALL gds.alpha.scc.stream({ nodeProjection: 'User', relationshipProjection: 'FOLLOW' }) YIELD nodeId, componentId RETURN gds.util.asNode(nodeId).name AS Name, componentId AS Component ORDER BY Component DESC
Name       Component
"Doug"     4
"Mark"     4
"Charles"  2
"Alice"    0
"Bridget"  0
"Michael"  0
The first and largest component has the members Alice, Bridget, and Michael, and the second component has Doug and Mark. Charles ends up in his own component because he has no outgoing relationship to any other node.
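The grouping can be double-checked with a textbook SCC routine. The Python sketch below uses Kosaraju's two-pass approach on the FOLLOW relationships transcribed from the example; it is an independent illustration, not the algorithm GDS runs, and it does not reproduce the component IDs.

```python
NODES = ["Alice", "Bridget", "Charles", "Doug", "Mark", "Michael"]
FOLLOW = [("Alice", "Bridget"), ("Alice", "Charles"), ("Mark", "Doug"),
          ("Mark", "Michael"), ("Bridget", "Michael"), ("Doug", "Mark"),
          ("Michael", "Alice"), ("Alice", "Michael"), ("Bridget", "Alice"),
          ("Michael", "Bridget")]

def strongly_connected_components(nodes, edges):
    """Kosaraju: order nodes by DFS finish time, then DFS on the reversed graph."""
    forward = {n: [] for n in nodes}
    backward = {n: [] for n in nodes}
    for u, v in edges:
        forward[u].append(v)
        backward[v].append(u)

    def dfs(start, graph, seen, out):
        """Iterative DFS that appends nodes to `out` in finish order."""
        stack = [(start, iter(graph[start]))]
        seen.add(start)
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:  # all successors handled: node is finished
                stack.pop()
                out.append(node)

    order, seen = [], set()
    for n in nodes:
        if n not in seen:
            dfs(n, forward, seen, order)

    components, seen = [], set()
    for n in reversed(order):
        if n not in seen:
            members = []
            dfs(n, backward, seen, members)
            components.append(set(members))
    return components

components = strongly_connected_components(NODES, FOLLOW)
print(components)
```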
MATCH (u:User) RETURN u.componentId AS Component, count(*) AS ComponentSize ORDER BY ComponentSize DESC LIMIT 1
Component ComponentSize 0 3
- Cypher projection
CALL gds.alpha.scc.stream({ nodeQuery: 'MATCH (u:User) RETURN id(u) AS id', relationshipQuery: 'MATCH (u1:User)-[:FOLLOW]->(u2:User) RETURN id(u1) AS source, id(u2) AS target' }) YIELD nodeId, componentId RETURN gds.util.asNode(nodeId).name AS Name, componentId AS Component ORDER BY Component DESC
Name       Component
"Doug"     4
"Mark"     4
"Charles"  2
"Alice"    0
"Bridget"  0
"Michael"  0
Speaker-Listener Label Propagation Algorithm (SLLPA)
Example:
CREATE (a:Person {name: 'Alice'}), (b:Person {name: 'Bob'}), (c:Person {name: 'Carol'}), (d:Person {name: 'Dave'}), (e:Person {name: 'Eve'}), (f:Person {name: 'Fredrick'}), (g:Person {name: 'Gary'}), (h:Person {name: 'Hilda'}), (i:Person {name: 'Ichabod'}), (j:Person {name: 'James'}), (k:Person {name: 'Khalid'}),
(a)-[:KNOWS]->(b), (a)-[:KNOWS]->(c), (a)-[:KNOWS]->(d), (b)-[:KNOWS]->(c), (b)-[:KNOWS]->(d), (c)-[:KNOWS]->(d),
(b)-[:KNOWS]->(e), (e)-[:KNOWS]->(f), (f)-[:KNOWS]->(g), (g)-[:KNOWS]->(h),
(h)-[:KNOWS]->(i), (h)-[:KNOWS]->(j), (h)-[:KNOWS]->(k), (i)-[:KNOWS]->(j), (i)-[:KNOWS]->(k), (j)-[:KNOWS]->(k);
Added 11 labels, created 11 nodes, set 11 properties, created 16 relationships, completed after 24 ms.
CALL gds.graph.create( 'myGraph', 'Person', { KNOWS: { orientation: 'UNDIRECTED' } } );
nodeProjection relationshipProjection graphName nodeCount relationshipCount createMillis
{ "Person": { "properties": {}, "label": "Person" } } { "KNOWS": { "orientation": "UNDIRECTED", "aggregation": "DEFAULT", "type": "KNOWS", "properties": {} } } "myGraph" 11 32 16
3.2. Stream
CALL gds.alpha.sllpa.stream('myGraph', {maxIterations: 100, minAssociationStrength: 0.1}) YIELD nodeId, values RETURN gds.util.asNode(nodeId).name AS Name, values.communityIds AS communityIds ORDER BY Name ASC
Name        communityIds
"Alice"     [0]
"Bob"       [0]
"Carol"     [0]
"Dave"      [0]
"Eve"       [0]
"Fredrick"  [0]
"Gary"      [1, 0]
"Hilda"     [1]
"Ichabod"   [1]
"James"     [1]
"Khalid"    [1]
Because of the algorithm's randomness, results tend to vary from run to run.