Spark Components: GraphX Study Notes 12 -- Summary of Common GraphX Operations (SimpleGraphX)

2016-05-04 16:56
For more code, see: https://github.com/xubo245/SparkLearning

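1. Explanation

A summary of common GraphX operations, including building a graph, finding the maximum degree, and map and join operations, among others.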

2. Code (the `Edge` case class from `org.apache.spark.graphx`):

/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.spark.graphx

import org.apache.spark.util.collection.SortDataFormat

/**
 * A single directed edge consisting of a source id, target id,
 * and the data associated with the edge.
 *
 * @tparam ED type of the edge attribute
 *
 * @param srcId The vertex id of the source vertex
 * @param dstId The vertex id of the target vertex
 * @param attr The attribute associated with the edge
 */
case class Edge[@specialized(Char, Int, Boolean, Byte, Long, Float, Double) ED] (
    var srcId: VertexId = 0,
    var dstId: VertexId = 0,
    var attr: ED = null.asInstanceOf[ED])
  extends Serializable {

  /**
   * Given one vertex in the edge return the other vertex.
   *
   * @param vid the id of one of the two vertices on the edge.
   * @return the id of the other vertex on the edge.
   */
  def otherVertexId(vid: VertexId): VertexId =
    if (srcId == vid) dstId else { assert(dstId == vid); srcId }

  /**
   * Return the relative direction of the edge to the corresponding
   * vertex.
   *
   * @param vid the id of one of the two vertices in the edge.
   * @return the relative direction of the edge to the corresponding
   * vertex.
   */
  def relativeDirection(vid: VertexId): EdgeDirection =
    if (vid == srcId) EdgeDirection.Out else { assert(vid == dstId); EdgeDirection.In }
}

object Edge {
  // Orders edges by source id first, then by destination id; the attribute is ignored.
  private[graphx] def lexicographicOrdering[ED] = new Ordering[Edge[ED]] {
    override def compare(a: Edge[ED], b: Edge[ED]): Int = {
      if (a.srcId == b.srcId) {
        if (a.dstId == b.dstId) 0
        else if (a.dstId < b.dstId) -1
        else 1
      } else if (a.srcId < b.srcId) -1
      else 1
    }
  }

  // Describes how to read, swap, copy, and allocate elements of an Array[Edge[ED]] for sorting.
  private[graphx] def edgeArraySortDataFormat[ED] = new SortDataFormat[Edge[ED], Array[Edge[ED]]] {
    override def getKey(data: Array[Edge[ED]], pos: Int): Edge[ED] = {
      data(pos)
    }

    override def swap(data: Array[Edge[ED]], pos0: Int, pos1: Int): Unit = {
      val tmp = data(pos0)
      data(pos0) = data(pos1)
      data(pos1) = tmp
    }

    override def copyElement(
        src: Array[Edge[ED]], srcPos: Int,
        dst: Array[Edge[ED]], dstPos: Int): Unit = {
      dst(dstPos) = src(srcPos)
    }

    override def copyRange(
        src: Array[Edge[ED]], srcPos: Int,
        dst: Array[Edge[ED]], dstPos: Int, length: Int): Unit = {
      System.arraycopy(src, srcPos, dst, dstPos, length)
    }

    override def allocate(length: Int): Array[Edge[ED]] = {
      new Array[Edge[ED]](length)
    }
  }
}
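
The ordering in the listing above compares source ids first and destination ids second. Because `lexicographicOrdering` is `private[graphx]`, it cannot be called from user code, so the following is only a minimal sketch of the same idea in plain Scala. `SimpleEdge` and `EdgeSketch` are hypothetical names introduced for illustration, and `Long` stands in for `VertexId`.

```scala
object EdgeSketch {
  // Hypothetical stand-in for GraphX's Edge; Long plays the role of VertexId.
  final case class SimpleEdge[ED](srcId: Long, dstId: Long, attr: ED) {
    // Same logic as otherVertexId in the listing: return the opposite endpoint.
    def otherVertexId(vid: Long): Long =
      if (srcId == vid) dstId else { assert(dstId == vid); srcId }
  }

  // Compare by source id, then by destination id; the attribute is ignored.
  def lexicographic[ED]: Ordering[SimpleEdge[ED]] =
    Ordering.by((e: SimpleEdge[ED]) => (e.srcId, e.dstId))

  val unsorted: Seq[SimpleEdge[Int]] = Seq(
    SimpleEdge(5L, 3L, 8), SimpleEdge(2L, 4L, 2),
    SimpleEdge(5L, 2L, 2), SimpleEdge(2L, 1L, 7))

  val sorted: Seq[SimpleEdge[Int]] = unsorted.sorted(lexicographic[Int])
}
```

With this input, the sorted sequence starts with the two edges leaving vertex 2, followed by the two edges leaving vertex 5.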


3. Results:

**********************************************************
Property demonstration
**********************************************************
Graph:
vertices:
(4,(David,42))
(1,(Alice,28))
(6,(Fran,50))
(3,(Charlie,65))
(5,(Ed,55))
(2,(Bob,27))
edges:
Edge(2,1,7)
Edge(2,4,2)
Edge(3,2,4)
Edge(3,6,3)
Edge(4,1,1)
Edge(5,2,2)
Edge(5,3,8)
Edge(5,6,3)
triplets:
((2,(Bob,27)),(1,(Alice,28)),7)
((2,(Bob,27)),(4,(David,42)),2)
((3,(Charlie,65)),(2,(Bob,27)),4)
((3,(Charlie,65)),(6,(Fran,50)),3)
((4,(David,42)),(1,(Alice,28)),1)
((5,(Ed,55)),(2,(Bob,27)),2)
((5,(Ed,55)),(3,(Charlie,65)),8)
((5,(Ed,55)),(6,(Fran,50)),3)

Vertices with age greater than 30, method 1:
David is 42
Fran is 50
Charlie is 65
Ed is 55
Vertices with age greater than 30, method 2:
David is 42
Fran is 50
Charlie is 65
Ed is 55

Edges with attribute greater than 5:
2 to 1 att 7
5 to 3 att 8

All triplets:
Bob likes Alice
Bob likes David
Charlie likes Bob
Charlie likes Fran
David likes Alice
Ed likes Bob
Ed likes Charlie
Ed likes Fran

Triplets with edge attribute > 5:
Bob likes Alice
Ed likes Charlie
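
A triplet pairs each edge with the attributes of its two endpoints, which is why the "likes" lines can print names rather than ids. The following is a minimal plain-Scala sketch of that pairing on the data shown above. It does not use the GraphX API, and `TripletSketch` and its member names are assumptions made for illustration.

```scala
object TripletSketch {
  // Vertex id -> (name, age), copied from the output above.
  val vertices: Map[Long, (String, Int)] = Map(
    (1L, ("Alice", 28)), (2L, ("Bob", 27)), (3L, ("Charlie", 65)),
    (4L, ("David", 42)), (5L, ("Ed", 55)), (6L, ("Fran", 50)))

  // (srcId, dstId, attr), copied from the output above.
  val edges: Seq[(Long, Long, Int)] = Seq(
    (2L, 1L, 7), (2L, 4L, 2), (3L, 2L, 4), (3L, 6L, 3),
    (4L, 1L, 1), (5L, 2L, 2), (5L, 3L, 8), (5L, 6L, 3))

  // Attach the source and destination attributes to every edge.
  val triplets: Seq[((Long, (String, Int)), (Long, (String, Int)), Int)] =
    edges.map { case (src, dst, attr) =>
      ((src, vertices(src)), (dst, vertices(dst)), attr)
    }

  // Keep only triplets whose edge attribute exceeds 5 and format them.
  val strongLikes: Seq[String] = triplets.collect {
    case ((_, (srcName, _)), (_, (dstName, _)), attr) if attr > 5 =>
      s"$srcName likes $dstName"
  }
}
```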

Maximum out-degree, in-degree, and degree in the graph:
max of outDegrees:(5,3) max of inDegrees:(2,2) max of Degrees:(2,4)
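
The degree figures can be checked by counting edges per endpoint. The sketch below does this with plain Scala collections rather than GraphX, and `DegreeSketch` is a hypothetical name. Note that vertices 1, 2, and 6 all have in-degree 2, so which of them a maximum reduction reports depends on evaluation order; only the value 2 is determined by the data.

```scala
object DegreeSketch {
  // (srcId, dstId) pairs, copied from the edge list above.
  val edges: Seq[(Long, Long)] = Seq(
    (2L, 1L), (2L, 4L), (3L, 2L), (3L, 6L),
    (4L, 1L), (5L, 2L), (5L, 3L), (5L, 6L))

  // Out-degree: number of edges leaving a vertex.
  val outDegrees: Map[Long, Int] =
    edges.groupBy(_._1).map { case (v, es) => (v, es.size) }

  // In-degree: number of edges entering a vertex.
  val inDegrees: Map[Long, Int] =
    edges.groupBy(_._2).map { case (v, es) => (v, es.size) }

  // Total degree: sum of in-degree and out-degree per vertex.
  val degrees: Map[Long, Int] =
    (outDegrees.toSeq ++ inDegrees.toSeq)
      .groupBy(_._1)
      .map { case (v, ds) => (v, ds.map(_._2).sum) }

  val maxOut: (Long, Int) = outDegrees.maxBy(_._2)
  val maxDegree: (Long, Int) = degrees.maxBy(_._2)
  val maxInValue: Int = inDegrees.values.max
}
```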

**********************************************************
Transformation operations
**********************************************************
Vertex transformation, vertex age + 10:
4 is (David,52)
1 is (Alice,38)
6 is (Fran,60)
3 is (Charlie,75)
5 is (Ed,65)
2 is (Bob,37)

Edge transformation, edge attribute * 2:
2 to 1 att 14
2 to 4 att 4
3 to 2 att 8
3 to 6 att 6
4 to 1 att 2
5 to 2 att 4
5 to 3 att 16
5 to 6 att 6
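
Both transformations change attributes only and leave ids and graph structure untouched. A minimal sketch of the same mapping on plain Scala collections follows; it is not the GraphX API, and `TransformSketch` is a name assumed for illustration.

```scala
object TransformSketch {
  val vertices: Map[Long, (String, Int)] = Map(
    (1L, ("Alice", 28)), (2L, ("Bob", 27)), (3L, ("Charlie", 65)),
    (4L, ("David", 42)), (5L, ("Ed", 55)), (6L, ("Fran", 50)))

  val edges: Seq[(Long, Long, Int)] = Seq(
    (2L, 1L, 7), (2L, 4L, 2), (3L, 2L, 4), (3L, 6L, 3),
    (4L, 1L, 1), (5L, 2L, 2), (5L, 3L, 8), (5L, 6L, 3))

  // Add 10 to every age; ids and names are unchanged.
  val olderVertices: Map[Long, (String, Int)] =
    vertices.map { case (id, (name, age)) => (id, (name, age + 10)) }

  // Double every edge attribute; endpoints are unchanged.
  val doubledEdges: Seq[(Long, Long, Int)] =
    edges.map { case (src, dst, attr) => (src, dst, attr * 2) }
}
```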

**********************************************************
Structural operations
**********************************************************
Subgraph of vertices with age > 30:
All vertices of the subgraph:
David is 42
Fran is 50
Charlie is 65
Ed is 55

All edges of the subgraph:
3 to 6 att 3
5 to 3 att 8
5 to 6 att 3
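
Only three edges survive because an edge stays in the subgraph only when both of its endpoints pass the vertex predicate. For example, edge 4 to 1 is dropped since Alice is 28. A minimal plain-Scala sketch of that rule, with `SubgraphSketch` as an assumed name and no GraphX dependency:

```scala
object SubgraphSketch {
  val vertices: Map[Long, (String, Int)] = Map(
    (1L, ("Alice", 28)), (2L, ("Bob", 27)), (3L, ("Charlie", 65)),
    (4L, ("David", 42)), (5L, ("Ed", 55)), (6L, ("Fran", 50)))

  val edges: Seq[(Long, Long, Int)] = Seq(
    (2L, 1L, 7), (2L, 4L, 2), (3L, 2L, 4), (3L, 6L, 3),
    (4L, 1L, 1), (5L, 2L, 2), (5L, 3L, 8), (5L, 6L, 3))

  // Vertex predicate: keep people older than 30.
  val keptVertices: Map[Long, (String, Int)] =
    vertices.filter { case (_, (_, age)) => age > 30 }

  // An edge is kept only if both endpoints were kept.
  val keptEdges: Seq[(Long, Long, Int)] =
    edges.filter { case (src, dst, _) =>
      keptVertices.contains(src) && keptVertices.contains(dst)
    }
}
```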

**********************************************************
Join operations
**********************************************************
Attributes of the joined graph:
David inDeg: 1  outDeg: 1
Alice inDeg: 2  outDeg: 0
Fran inDeg: 2  outDeg: 0
Charlie inDeg: 1  outDeg: 2
Ed inDeg: 0  outDeg: 3
Bob inDeg: 2  outDeg: 2

People with equal out-degree and in-degree:
David
Bob
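
The join attaches degree counts to every vertex and falls back to 0 for vertices that have no entry in the degree table, such as Ed (no incoming edges) or Alice and Fran (no outgoing edges). The sketch below reproduces that outer-join behaviour with plain Scala maps. `User` and `DegreeJoinSketch` are hypothetical names, and this is not the GraphX join API.

```scala
object DegreeJoinSketch {
  // Hypothetical record holding the original attributes plus both degrees.
  final case class User(name: String, age: Int, inDeg: Int, outDeg: Int)

  val vertices: Map[Long, (String, Int)] = Map(
    (1L, ("Alice", 28)), (2L, ("Bob", 27)), (3L, ("Charlie", 65)),
    (4L, ("David", 42)), (5L, ("Ed", 55)), (6L, ("Fran", 50)))

  val edges: Seq[(Long, Long)] = Seq(
    (2L, 1L), (2L, 4L), (3L, 2L), (3L, 6L),
    (4L, 1L), (5L, 2L), (5L, 3L), (5L, 6L))

  val inDeg: Map[Long, Int] = edges.groupBy(_._2).map { case (v, es) => (v, es.size) }
  val outDeg: Map[Long, Int] = edges.groupBy(_._1).map { case (v, es) => (v, es.size) }

  // Outer join: vertices missing from a degree table get degree 0.
  val users: Map[Long, User] = vertices.map { case (id, (name, age)) =>
    (id, User(name, age, inDeg.getOrElse(id, 0), outDeg.getOrElse(id, 0)))
  }

  // Names of people whose in-degree equals their out-degree, sorted for stable output.
  val balanced: Seq[String] =
    users.values.filter(u => u.inDeg == u.outDeg).map(_.name).toSeq.sorted
}
```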

**********************************************************
Aggregation operations
**********************************************************
Find the oldest follower of each person:
Bob is the oldest follower of David.
David is the oldest follower of Alice.
Charlie is the oldest follower of Fran.
Ed is the oldest follower of Charlie.
Ed does not have any followers.
Charlie is the oldest follower of Bob.

Average age of each person's followers:
The average age of David's followers is 27.0.
The average age of Alice's followers is 34.5.
The average age of Fran's followers is 60.0.
The average age of Charlie's followers is 55.0.
Ed does not have any followers.
The average age of Bob's followers is 60.0.
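
Both aggregations follow the same pattern: each edge delivers the source vertex's (name, age) to the destination, and the destination combines what it receives. Ed receives nothing, which is why he has no followers. The following is a minimal plain-Scala sketch of that pattern, not the GraphX aggregation API; `FollowerSketch` and its members are names assumed for illustration.

```scala
object FollowerSketch {
  val vertices: Map[Long, (String, Int)] = Map(
    (1L, ("Alice", 28)), (2L, ("Bob", 27)), (3L, ("Charlie", 65)),
    (4L, ("David", 42)), (5L, ("Ed", 55)), (6L, ("Fran", 50)))

  val edges: Seq[(Long, Long)] = Seq(
    (2L, 1L), (2L, 4L), (3L, 2L), (3L, 6L),
    (4L, 1L), (5L, 2L), (5L, 3L), (5L, 6L))

  // "Send" the source's (name, age) along each edge and group by destination.
  val followersOf: Map[Long, Seq[(String, Int)]] =
    edges.groupBy(_._2).map { case (dst, es) => (dst, es.map(e => vertices(e._1))) }

  // Combine by keeping the follower with the greatest age.
  val oldestFollower: Map[Long, String] =
    followersOf.map { case (id, fs) => (id, fs.maxBy(_._2)._1) }

  // Combine by summing ages and dividing by the follower count.
  val averageAge: Map[Long, Double] =
    followersOf.map { case (id, fs) => (id, fs.map(_._2).sum.toDouble / fs.size) }
}
```

For Alice (vertex 1), the followers are Bob (27) and David (42), giving David as the oldest and (27 + 42) / 2 = 34.5 as the average, which matches the output above.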

**********************************************************
Aggregation operations
**********************************************************
Shortest distance from vertex 5 to each vertex:
(4,4.0)
(1,5.0)
(6,3.0)
(3,8.0)
(5,0.0)
(2,2.0)
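
These distances treat edge attributes as weights and follow edge direction. Vertex 1, for instance, is reached via 5 to 2 to 4 to 1 at cost 2 + 2 + 1 = 5, which beats 5 to 2 to 1 at cost 2 + 7 = 9. The sketch below recomputes the distances with repeated edge relaxation (Bellman-Ford style) in plain Scala. It is a swapped-in technique for checking the numbers, not the code that produced the output, and `ShortestPathSketch` is an assumed name.

```scala
import scala.annotation.tailrec

object ShortestPathSketch {
  // (srcId, dstId, weight), with the edge attributes above used as weights.
  val edges: Seq[(Long, Long, Double)] = Seq(
    (2L, 1L, 7.0), (2L, 4L, 2.0), (3L, 2L, 4.0), (3L, 6L, 3.0),
    (4L, 1L, 1.0), (5L, 2L, 2.0), (5L, 3L, 8.0), (5L, 6L, 3.0))

  val source: Long = 5L

  // Start with distance 0 at the source and infinity everywhere else.
  val initial: Map[Long, Double] =
    (1L to 6L).map(id => (id, if (id == source) 0.0 else Double.PositiveInfinity)).toMap

  // Relax every edge once per round; stop when a round changes nothing.
  @tailrec
  def relax(dist: Map[Long, Double]): Map[Long, Double] = {
    val next = edges.foldLeft(dist) { case (d, (src, dst, w)) =>
      if (d(src) + w < d(dst)) d.updated(dst, d(src) + w) else d
    }
    if (next == dist) dist else relax(next)
  }

  val distances: Map[Long, Double] = relax(initial)
}
```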


References

[1] http://spark.apache.org/docs/1.5.2/graphx-programming-guide.html

[2] https://github.com/xubo245/SparkLearning

[3] 炼数成金 video course