Apache Griffin is a model-driven data quality solution for modern data systems. It provides a standard process to define data quality measures, execute them, and report results, as well as a unified dashboard across multiple data systems. You can access our home page here, our wiki page here, and our JIRA issues page here.
To try Griffin quickly, you can run our pre-built Docker image:
- Install Docker.
- Pull our pre-built Docker image.
```
docker pull bhlx3lyx7/griffin_demo:0.0.1
```
- Increase vm.max_map_count on your local machine; Elasticsearch requires it.
```
sysctl -w vm.max_map_count=262144
```
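This setting does not survive a reboot. To make it persistent you can also record it in /etc/sysctl.conf (a minimal sketch, assuming a standard Linux host):
```
# Persist the setting across reboots (Linux; file location may differ per distro)
echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p    # reload settings from /etc/sysctl.conf
```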
- Run the Docker image; after about one minute, Griffin is ready.
```
docker run -it -h sandbox --name griffin_demo -m 8G --memory-swap -1 \
  -p 32122:2122 -p 37077:7077 -p 36066:6066 -p 38088:8088 -p 38040:8040 \
  -p 33306:3306 -p 39000:9000 -p 38042:8042 -p 38080:8080 -p 37017:27017 \
  -p 39083:9083 -p 38998:8998 -p 39200:9200 bhlx3lyx7/griffin_demo:0.0.1
```
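Rather than waiting a fixed minute, you can follow the container's log output to see when its services have started (the exact messages depend on the image):
```
# Follow the container's startup output; Ctrl-C stops following
docker logs -f griffin_demo
```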
- Now you can visit the UI through your browser; log in with account "test" and password "test" if required.
```
http://<your local IP address>:38080/
```
You can also follow the steps using the UI here.
To deploy and run Griffin locally, prepare the environment below first.
- Install JDK (1.8 or later).
- Install MySQL.
- Install npm (version 6.0.0+).
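You can confirm the installed versions with each tool's own flag (a quick check; adjust to how the tools are installed on your machine):
```
java -version      # expect 1.8 or later
mysql --version
npm -v             # expect 6.0.0 or later
```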
- Install Hadoop (2.6.0 or later); you can get some help here.
- Install Spark (version 1.6.x; Griffin does not currently support 2.0.x). If you want to install a Pseudo-Distributed/Single-Node Cluster, you can get some help here.
- Install Hive (version 1.2.1 or later); you can get some help here. You need to make sure that your Spark cluster can access your HiveContext.
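One common way to give Spark access to the Hive metastore (and thus HiveContext) is to put Hive's configuration on Spark's classpath; a sketch, assuming $HIVE_HOME and $SPARK_HOME point at your installations:
```
# Let Spark pick up the Hive metastore configuration
cp $HIVE_HOME/conf/hive-site.xml $SPARK_HOME/conf/
```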
- Install Livy; you can get some help here.
Griffin needs to schedule Spark jobs from the server; we use Livy to submit our jobs.
Due to some issues of Livy with HiveContext, we need to download 3 files and put them into HDFS:
```
datanucleus-api-jdo-3.2.6.jar
datanucleus-core-3.2.10.jar
datanucleus-rdbms-3.2.9.jar
```
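For example, the jars can be fetched from Maven Central and uploaded like this (a sketch; `<datanucleus path>` is whatever HDFS directory you choose, and it must match sparkJob.properties below):
```
# Download the three DataNucleus jars from Maven Central
wget https://repo1.maven.org/maven2/org/datanucleus/datanucleus-api-jdo/3.2.6/datanucleus-api-jdo-3.2.6.jar
wget https://repo1.maven.org/maven2/org/datanucleus/datanucleus-core/3.2.10/datanucleus-core-3.2.10.jar
wget https://repo1.maven.org/maven2/org/datanucleus/datanucleus-rdbms/3.2.9/datanucleus-rdbms-3.2.9.jar

# Upload them to the HDFS directory that sparkJob.properties will reference
hdfs dfs -mkdir -p <datanucleus path>
hdfs dfs -put datanucleus-*.jar <datanucleus path>/
```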
- Install Elasticsearch. Elasticsearch works as a metrics collector: Griffin produces metrics to it, and our default UI gets metrics from it; you can use your own metrics store as well.
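A quick way to confirm Elasticsearch is reachable (assuming the default HTTP port 9200):
```
# A healthy node answers with a JSON banner including its version
curl http://<your IP>:9200
```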
- Modify the configuration for your environment.
You need to modify the configuration files below to make Griffin work well in your environment.
```
service/src/main/resources/application.properties

spring.datasource.url = jdbc:mysql://<your IP>:3306/quartz?autoReconnect=true&useSSL=false
spring.datasource.username = <user name>
spring.datasource.password = <password>
hive.metastore.uris = thrift://<your IP>:9083
hive.metastore.dbname = <hive database name>    # default is "default"
```
```
service/src/main/resources/sparkJob.properties

sparkJob.file = hdfs://<griffin measure path>/griffin-measure.jar
sparkJob.args_1 = hdfs://<griffin env path>/env.json
sparkJob.jars_1 = hdfs://<datanucleus path>/datanucleus-api-jdo-3.2.6.jar
sparkJob.jars_2 = hdfs://<datanucleus path>/datanucleus-core-3.2.10.jar
sparkJob.jars_3 = hdfs://<datanucleus path>/datanucleus-rdbms-3.2.9.jar
sparkJob.uri = http://<your IP>:8998/batches
```
```
ui/js/services/services.js

ES_SERVER = "http://<your IP>:9200"
```
Configure measure/measure-batch/src/main/resources/env.json for your environment, and put it into HDFS.
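The datasource URL above points at a MySQL database named quartz; if it does not exist yet, you can create it before starting the server (a minimal sketch, assuming you have sufficient MySQL privileges):
```
# Create the quartz database referenced by spring.datasource.url
mysql -u <user name> -p -e "CREATE DATABASE IF NOT EXISTS quartz;"
```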
- Build the whole project and deploy. (npm should be installed; on a Mac you can try 'brew install node'.)
```
mvn install
```
Create a directory in HDFS, and put our measure package into it.
```
cp /measure/target/measure-0.1.3-incubating-SNAPSHOT.jar /measure/target/griffin-measure.jar
hdfs dfs -put /measure/target/griffin-measure.jar <griffin measure path>/
```
After all our environment services have started, we can start our server.
```
java -jar service/target/service.jar
```
After a few seconds, we can visit the default UI of Griffin (by default the Spring Boot port is 8080).
```
http://<your IP>:8080
```
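You can then verify that the measure jar landed in HDFS and that the service answers (paths follow the placeholders used above):
```
# Confirm the measure jar is where sparkJob.file points
hdfs dfs -ls <griffin measure path>/

# Confirm the service is answering on the Spring Boot port
curl -I http://<your IP>:8080
```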
- Follow the steps using the UI here.
Note: The front-end UI is still under development; only some basic features are accessible for now.
See CONTRIBUTING.md for details on how to contribute code, documentation, etc.