dromara/dataCompare

dataCompare

EN doc CN doc

Introduction

dataCompare is a database comparison and data profiling platform.

(1) Supports data comparison for Hive tables, MySQL, and Doris. Comparison tasks are configured automatically, so comparison SQL does not have to be written by hand each time.

(2) Supports data profiling with simple configuration.

Features

data-compare

(1) Comparison tasks are configured interactively in the UI; with low code and a small amount of configuration, a comparison task can be generated quickly

(2) Magnitude comparison, consistency comparison, and automatic discovery of difference cases

(3) JDBC data sources such as MySQL, Apache Hive, and Apache Doris are currently supported

(4) Comparison results can be sent automatically as email alert reports
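
As a rough illustration of what magnitude and consistency comparison might boil down to, the sketch below builds the kind of SQL such a task could generate. It is not dataCompare's actual implementation: the class, method, table, and column names are all hypothetical, and it assumes that magnitude comparison means comparing row counts and that consistency comparison means joining two tables on a primary key and looking for differing column values.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch, not part of dataCompare. All names are made up.
// Identifiers are concatenated directly, so only pass trusted names.
class CompareSqlSketch {

    // Magnitude check (assumed to mean row counts): one COUNT(*) per table.
    static String countSql(String table) {
        return "SELECT COUNT(*) AS cnt FROM " + table;
    }

    // Consistency check: join on the key, keep rows where any compared column differs.
    static String diffSql(String left, String right, String key, List<String> columns) {
        String mismatch = columns.stream()
                .map(c -> "a." + c + " != b." + c)
                .collect(Collectors.joining(" OR "));
        return "SELECT a." + key + " FROM " + left + " a JOIN " + right + " b"
                + " ON a." + key + " = b." + key
                + " WHERE " + mismatch;
    }

    public static void main(String[] args) {
        System.out.println(countSql("orders_src"));
        System.out.println(diffSql("orders_src", "orders_dst", "id",
                Arrays.asList("amount", "status")));
    }
}
```

Running the two count queries and comparing the results gives the magnitude check; the join query returns the keys of rows whose values differ. A real task would also need to handle NULL values (a plain `!=` is not true when either side is NULL) and rows that exist on only one side.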

data-profiling

(1) Data profiling can be completed with low code and a small amount of configuration

(2) Primary key, enumeration value, and null value detection
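
The three checks listed above can each be expressed as a single aggregate query. The sketch below shows one plausible form of those queries; it is an assumption about how such checks are commonly written, not dataCompare's own code, and the class, method, table, and column names are hypothetical.

```java
// Hypothetical sketch, not part of dataCompare. All names are made up.
// Identifiers are concatenated directly, so only pass trusted names.
class ProfilingSqlSketch {

    // Null value detection: how many rows have no value in the column.
    static String nullCountSql(String table, String column) {
        return "SELECT COUNT(*) AS null_cnt FROM " + table
                + " WHERE " + column + " IS NULL";
    }

    // Primary key detection: key values that appear more than once.
    static String duplicateKeySql(String table, String key) {
        return "SELECT " + key + ", COUNT(*) AS cnt FROM " + table
                + " GROUP BY " + key + " HAVING COUNT(*) > 1";
    }

    // Enumeration value detection: each distinct value and how often it occurs.
    static String enumValuesSql(String table, String column) {
        return "SELECT " + column + ", COUNT(*) AS cnt FROM " + table
                + " GROUP BY " + column;
    }

    public static void main(String[] args) {
        System.out.println(nullCountSql("orders", "amount"));
        System.out.println(duplicateKeySql("orders", "id"));
        System.out.println(enumValuesSql("orders", "status"));
    }
}
```

An empty result from the duplicate-key query means the column is unique and can serve as a primary key for comparison.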

Software Architecture

[Image: software architecture diagram]

Technology stack:

Back end: Spring Boot + MyBatis

Database: MySQL

Parsing engine: ANTLR

Big data: Hive, Spark

System flowchart

[Images: system flowchart (img_1.png, img.png)]

Demonstration of system functionality

Home: [image]

data-compare:

DbConfig

MySQL config: [image]

Hive config: [image]

Job config: [image]

Comparison results display: [images]

data-profiling:

Job config: [image: img_2.png]

Profiling result: [images: img_3.png, img_4.png]

System runtime environment

Java: JDK 8

MySQL: 5.7.36

Running config

(1) Run the SQL files in the SQL directory against your MySQL database to create the database and tables

(2) Build the JAR from the project source: mvn clean package

(3) Edit the database connection settings in application.yml

(4) Run java -jar -Dspring.config.location=application.yml dataCompare.jar (application.yml and the JAR must be in the same directory)

(5) Visit http://127.0.0.1/ (username: admin, password: admin123)

Environment installation configuration

(1) To compare Hive data, install a Hive environment first (for a quick Docker-based Hive installation, see https://blog.csdn.net/ifenggege/article/details/107860477)

(2) After installation, when creating a new data source connection, select Hive and set the address to jdbc:hive2://ip:10000
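
Before entering the address in the UI, it can help to confirm that the HiveServer2 endpoint is reachable over JDBC. The sketch below is a minimal, hypothetical check that is separate from dataCompare: the class and method names are made up, the host, user, and password are placeholders, and the connection attempt assumes the Hive JDBC driver is on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical sketch, not part of dataCompare. Names and values are placeholders.
class HiveUrlSketch {

    // Builds the address in the jdbc:hive2://ip:10000 form used above.
    static String hiveUrl(String host, int port) {
        return "jdbc:hive2://" + host + ":" + port;
    }

    // Returns true if a JDBC connection can be opened and closed.
    // Returns false on any SQLException, including a missing Hive driver.
    static boolean canConnect(String url, String user, String password) {
        try (Connection ignored = DriverManager.getConnection(url, user, password)) {
            return true;
        } catch (SQLException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Replace the host with the IP of your Hive server.
        String url = hiveUrl("127.0.0.1", 10000);
        System.out.println(url);
        System.out.println("reachable: " + canConnect(url, "hive", ""));
    }
}
```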

Thanks

Thanks to ruoyi for providing the front end.