Adapting Gremlin Server/Console to Atlas JanusGraph

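Atlas uses JanusGraph as its underlying store. Because I was not very familiar with Atlas's underlying data structures, I wanted to operate on Atlas's JanusGraph through Gremlin Console: run more flexible queries in the Gremlin Query Language and inspect the data structures directly. The approach follows the docker-apache-atlas project.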

Gremlin Server

  1. Download the matching version of Gremlin Server (3.4.6 in the paths below), unpack it, and add the Atlas dependency jars
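ATLAS_HOME=/home/anchor/apache-atlas-2.1.0/
GREMLIN_SERVER_HOME=/home/anchor/gremlin/apache-tinkerpop-gremlin-server-3.4.6
# Link the Atlas webapp jars into the Gremlin Server lib directory;
# error messages (for example for names that already exist there) are discarded
ln -s ${ATLAS_HOME}/server/webapp/atlas/WEB-INF/lib/*.jar ${GREMLIN_SERVER_HOME}/lib 2>/dev/null
# Remove the Atlas webapp jar and the Netty jars that came from Atlas
rm -f ${GREMLIN_SERVER_HOME}/lib/atlas-webapp-2.1.0.jar
rm -f ${GREMLIN_SERVER_HOME}/lib/netty-3.10.5.Final.jar
rm -f ${GREMLIN_SERVER_HOME}/lib/netty-all-4.0.52.Final.jar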
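After linking, it can help to confirm which Netty jars are actually left in the lib directory. A minimal sketch, assuming a hypothetical helper named list_jars (it is not part of Gremlin Server or Atlas) that lists jars by name prefix:

```shell
# Hypothetical helper: list jar files in a directory whose names start
# with the given prefix. -L makes find follow the symlinks created by ln -s.
list_jars() {
  local lib_dir="$1"
  local prefix="$2"
  find -L "$lib_dir" -maxdepth 1 -type f -name "${prefix}*.jar" \
    -exec basename {} \; | LC_ALL=C sort
}
```

For example, list_jars "${GREMLIN_SERVER_HOME}/lib" netty prints the remaining Netty jars, so a leftover conflicting version is easy to spot.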
  2. Configure Gremlin Server

gremlin-server-atlas-wshttp.yaml configuration

host: 0.0.0.0
port: 8182
scriptEvaluationTimeout: 30000
#channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
channelizer: org.apache.tinkerpop.gremlin.server.channel.WsAndHttpChannelizer
graphs: {
  graph: conf/janusgraph-hbase-es.properties
}
scriptEngines: {
  gremlin-groovy: {
    plugins: { org.apache.tinkerpop.gremlin.server.jsr223.GremlinServerGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.tinkergraph.jsr223.TinkerGraphGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.jsr223.ImportGremlinPlugin: {classImports: [java.lang.Math], methodImports: [java.lang.Math#*]},
               org.apache.tinkerpop.gremlin.jsr223.ScriptFileGremlinPlugin: {files: [scripts/empty-sample.groovy]}}}}
# JanusGraph sets default serializers. You need to uncomment the following lines, if you require any custom serializers.
#
# serializers:
# - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
# - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1, config: { serializeResultToString: true }}
# - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
# - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { serializeResultToString: true }}
# - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
# # Older serialization versions for backwards compatibility:
# - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
# - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0, config: {ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
# - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
# - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV2d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
# - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
# - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
processors:
- { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
- { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor, config: { cacheExpirationTime: 600000, cacheMaxSize: 1000 }}
metrics: {
  consoleReporter: {enabled: true, interval: 180000},
  csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
  jmxReporter: {enabled: true},
  slf4jReporter: {enabled: true, interval: 180000},
  graphiteReporter: {enabled: false, interval: 180000}}
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536

janusgraph-hbase-es.properties configuration

gremlin.graph=org.janusgraph.core.JanusGraphFactory
storage.backend=hbase
storage.hostname=127.0.0.1:2181
cache.db-cache = true
cache.db-cache-clean-wait = 20
cache.db-cache-time = 180000
cache.db-cache-size = 0.5
storage.hbase.table=apache_atlas_janus
storage.hbase.ext.hbase.security.authentication=kerberos
storage.hbase.ext.hbase.security.authorization=true

index.search.backend=elasticsearch
index.search.hostname=127.0.0.1:9200

Add HBASE_CONF_DIR to the gremlin-server classpath (this avoids Kerberos authentication errors, connection failures, and similar problems)

# vim gremlin-server.sh
CP="$GREMLIN_HOME/conf/:$HBASE_CONF_DIR"
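Since the classpath change only helps if HBASE_CONF_DIR really points at the HBase client configuration, a quick check before starting the server may save some debugging. A minimal sketch, assuming a hypothetical helper named check_hbase_conf and assuming the directory is expected to contain hbase-site.xml:

```shell
# Hypothetical helper: verify that the HBase config directory is set and
# contains hbase-site.xml. Takes the directory as an optional argument and
# falls back to $HBASE_CONF_DIR.
check_hbase_conf() {
  local dir="${1:-${HBASE_CONF_DIR:-}}"
  if [ -z "$dir" ]; then
    echo "HBASE_CONF_DIR is not set" >&2
    return 1
  fi
  if [ ! -f "$dir/hbase-site.xml" ]; then
    echo "no hbase-site.xml found in $dir" >&2
    return 1
  fi
  echo "using HBase config from $dir"
}
```

Running check_hbase_conf in the same shell that will launch gremlin-server.sh shows whether the variable is visible to the start script.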
  3. Start Gremlin Server
nohup bin/gremlin-server.sh conf/gremlin-server-atlas-wshttp.yaml > gremlin-server.log 2>&1 &
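Because the configuration above uses WsAndHttpChannelizer, the server also accepts Gremlin scripts over plain HTTP, which gives a simple smoke test once it is running. A minimal sketch, assuming curl is installed and the server is reachable at http://localhost:8182; gremlin_payload and gremlin_http are hypothetical helper names, and the escaping only covers backslashes and double quotes in single-line scripts:

```shell
# Hypothetical helper: build the JSON request body {"gremlin":"<script>"},
# escaping backslashes and double quotes in the script.
gremlin_payload() {
  local escaped
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"gremlin":"%s"}' "$escaped"
}

# Hypothetical helper: POST a script to the Gremlin Server HTTP endpoint.
# The URL defaults to the host and port assumed above.
gremlin_http() {
  local script="$1"
  local url="${2:-http://localhost:8182}"
  curl -s -X POST -H 'Content-Type: application/json' \
    -d "$(gremlin_payload "$script")" "$url"
}
```

Calling gremlin_http '1+1' should return a JSON document; a status code of 200 in it suggests the server is up and evaluating scripts.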

Gremlin Console

  1. Download and unpack Gremlin Console, then start it
bin/gremlin.sh
  2. Connect to Gremlin Server
:remote connect tinkerpop.server conf/remote.yaml session
:remote console

Gremlin Console operations

Query a hive_table vertex

// hive_table where qualifiedName = "aaa.test@test"
g = graph.traversal()
g.V().has("__typeName", "hive_table").has("Referenceable.qualifiedName", "aaa.test@test").values()
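The same lookup works for other entity types by changing the two values. A minimal sketch, assuming a hypothetical shell helper named atlas_entity_query that prints the Gremlin text for a given type name and qualifiedName; it uses valueMap() in place of values() so that property keys are shown next to their values, and it refers to graph, the name bound in the graphs section of the server configuration:

```shell
# Hypothetical helper: print a Gremlin query that looks up an Atlas entity
# vertex by type name and qualifiedName. The property keys are the ones
# used in the query above.
atlas_entity_query() {
  local type_name="$1"
  local qualified_name="$2"
  printf 'graph.traversal().V().has("__typeName", "%s").has("Referenceable.qualifiedName", "%s").valueMap()\n' \
    "$type_name" "$qualified_name"
}
```

The printed text can be pasted into the remote console, for example the output of atlas_entity_query hive_table 'aaa.test@test'.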

View some information about the graph

mgmt = graph.openManagement()

mgmt.printVertexLabels() // print the vertex labels
mgmt.printEdgeLabels() // print the edge labels
mgmt.printPropertyKeys() // print the property keys
mgmt.printIndexes() // print all indexes

mgmt.printSchema() // print the schema, which includes everything above

index = mgmt.getGraphIndex("vertex_index"); // get the index object

Query Patch vertices

g.V().has("patch.type", "TYPEDEF_PATCH")

g.V().has("patch.type", "JAVA_PATCH")

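Installing Graphexp

Graphexp is a front-end project that works with Gremlin Server to visualize graph data. Project page: https://github.com/bricaud/graphexp

Clone the code from GitHub, install Nginx, and configure it as follows. When that is done, open http://localhost:9990/graphexp.html
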
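# vim /etc/nginx/conf.d/graphexp-9990.conf
server {
    keepalive_requests 120; # maximum number of requests per connection
    listen 9990; # listening port
    location ~* ^.+$ { # URL filter by regex; ~ is case-sensitive, ~* is case-insensitive
        root /data/nginx/graphexp; # directory holding the Graphexp files
    }
}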
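Before reloading Nginx it is worth confirming that the root directory really contains the Graphexp checkout. A minimal sketch, assuming a hypothetical helper named check_graphexp_root and the root path from the configuration above:

```shell
# Hypothetical helper: check that graphexp.html exists under the Nginx root.
# The default path is the root directory used in the server block above.
check_graphexp_root() {
  local root="${1:-/data/nginx/graphexp}"
  if [ -f "$root/graphexp.html" ]; then
    echo "ok: $root/graphexp.html"
  else
    echo "missing: $root/graphexp.html" >&2
    return 1
  fi
}
```

One way to use it is check_graphexp_root && nginx -t && nginx -s reload, so the configuration is only tested and reloaded when the page is in place.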

Other issues

  1. Max frame length of 65536 has been exceeded.
# io.netty.handler.codec.http.websocketx.CorruptedWebSocketFrameException: Max frame length of 65536 has been exceeded.
# vim conf/remote.yaml
connectionPool: {maxContentLength: 655360}
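The 65536 in the error matches the driver's default connectionPool maxContentLength, so the line above raises the client-side limit. A minimal sketch for applying it without opening an editor, assuming a hypothetical helper named ensure_max_content_length and a remote.yaml that has no connectionPool entry yet:

```shell
# Hypothetical helper: append the connectionPool setting to a remote.yaml
# unless the file already has a top-level connectionPool key, in which case
# it refuses and leaves the file for manual editing.
ensure_max_content_length() {
  local file="$1"
  local value="${2:-655360}"
  if [ ! -f "$file" ]; then
    echo "no such file: $file" >&2
    return 1
  fi
  if grep -q '^connectionPool:' "$file"; then
    echo "connectionPool already set in $file; edit it by hand" >&2
    return 1
  fi
  printf 'connectionPool: {maxContentLength: %s}\n' "$value" >> "$file"
}
```

For example, ensure_max_content_length conf/remote.yaml adds the setting once; reconnect from the console with :remote connect afterwards so the new limit takes effect.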