improving the documentation to reflect the latest codebase
stheppi committed Aug 27, 2016
1 parent 7bd540e commit a79e108
Showing 1 changed file with 66 additions and 32 deletions: README.md

# Kafka Connect Query Language

**KCQL** (**K**afka **C**onnect **Q**uery **L**anguage) is a SQL-like syntax allowing streamlined configuration of Kafka Connect Sinks and Sources. It is built using the <a href="https://github.com/antlr/grammars-v4">`antlr4`</a> API.

# Why ?

While working on our sinks and sources we ended up producing quite complex configurations in order to support the required functionality. Imagine a sink where you read from different topics and from each topic you want to cherry-pick the payload fields or even rename them. Furthermore, you might want the storage structure to be created automatically and/or to evolve, or you might add support for the likes of bucketing (Riak TS has one such scenario). Imagine a JDBC sink with a table which needs to be linked to two different topics, where the fields have to be aligned with the table column names - you could deal with the complex configuration involved... or you can just write this:

```bash
routes.query = "INSERT INTO transactions SELECT field1 as column1, field2 as column2, field3 FROM topic_A;
INSERT INTO transactions SELECT fieldA1 as column1, fieldA2 as column2, fieldC FROM topic_B;"
```

# Compile and Build
This project uses the Gradle build system, so to build you simply run:
```bash
gradle clean build
```
If you modify the grammar you need to compile again before the changes are reflected in the code. The ANTLR Gradle plugin runs first and produces the Java classes for the parser and lexer.
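
For reference, a minimal `build.gradle` sketch using the standard Gradle ANTLR plugin would look roughly like this; the plugin wiring and the ANTLR version below are assumptions for illustration, and the actual build file in this repository may differ:

```bash
// minimal sketch, assuming the standard Gradle ANTLR plugin (actual build file may differ)
apply plugin: 'antlr'
apply plugin: 'java'

repositories {
    mavenCentral()
}

dependencies {
    antlr 'org.antlr:antlr4:4.5.3'   // ANTLR version is an assumption
}

// The plugin adds a generateGrammarSource task that compileJava depends on,
// so the lexer/parser classes are regenerated before the Java sources compile.
```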

# Using KCQL in your project
To include it in your project, add the dependency to your connector:

Maven
```bash
<dependency>
    <groupId>com.datamountaineer</groupId>
    <artifactId>kcql</artifactId>
    <version>0.8.3</version>
</dependency>
```

SBT
```bash
libraryDependencies += "com.datamountaineer" % "kcql" % "0.8.3"
```
Gradle
```bash
compile 'com.datamountaineer:kcql:0.8.3'
```

Check <a href="http://search.maven.org/#search%7Cga%7C1%7Ca%3A%22kcql%22">Maven</a> for the latest release.

# Kafka Connect Query Language

KCQL makes a lot of sense when you need to define mappings between Kafka topics (with Avro records) and external systems acting as _sinks_ or _sources_.

There are two paths supported by this DSL. The first is INSERT (UPSERT is also allowed; the primary keys are worked out from the target database), which takes the following form:
```bash
INSERT INTO $TARGET
SELECT *|columns
FROM $TOPIC_NAME
[IGNORE columns]
[AUTOCREATE]
[PK columns]
[AUTOEVOLVE]
[BATCH = N]
[CAPITALIZE]
[PARTITIONBY cola[,colb]]
[DISTRIBUTEBY cola[,colb]]
[CLUSTERBY cola[,colb]]
[TIMESTAMP cola|sys_current]
[STOREAS AVRO|JSON|BYTE]
```
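
As an illustration, here is a hypothetical route combining several of these options; the target, topic and field names are made up, and which options actually apply depends on the specific connector:

```bash
INSERT INTO transactions
SELECT field1 AS column1, field2
FROM topic_A
AUTOCREATE
PK field1
BATCH = 500
STOREAS AVRO
```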
If you follow our connectors at DataMountaineer you will find that, depending on the Connect Sink, only some of the options are used. You will find all our documentation <a href="https://github.com/datamountaineer/docs/tree/master/source">here</a>.

The second path is SELECT only. We have the <a href="https://github.com/datamountaineer/stream-reactor">Socket Streamer</a> which allows you to peek into Kafka via WebSocket and receive the payloads in real time!
```bash
SELECT *|columns
FROM $TOPIC_NAME
[IGNORE columns]
WITHFORMAT JSON|AVRO|BYTE
[WITHGROUP $YOUR_CONSUMER_GROUP]
[WITHPARTITION (partition)[,(partition, offset)]]
[SAMPLE $RECORDS_NUMBER EVERY $SLIDE_WINDOW]
```
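
For example, here is a hypothetical subscription that streams a topic as JSON for a given consumer group, starting from an explicit partition and offset; all names and numbers are illustrative:

```bash
SELECT field1, field2
FROM topic_A
WITHFORMAT JSON
WITHGROUP my_consumer_group
WITHPARTITION (0, 100)
SAMPLE 10 EVERY 1000
```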
### Examples of SELECT
```bash
.. SELECT field1 FROM mytopic                  // Project one avro field named field1
.. SELECT field1.subfield1 FROM mytopic        // Project one avro field from a complex message
.. SELECT field1 AS newName FROM mytopic       // Project and rename a field
.. SELECT * FROM mytopic                       // Select everything - perfect for avro evolution
.. SELECT *, field1 AS newName FROM mytopic    // Select all & rename a field - excellent for avro evolution
.. SELECT * FROM mytopic IGNORE badField       // Select all & ignore a field - excellent for avro evolution
.. SELECT * FROM mytopic PK field1,field2      // Select all & with primary keys (for the sources where primary keys are required)
.. SELECT * FROM mytopic AUTOCREATE            // Select all and create the target source (table for databases)
.. SELECT * FROM mytopic AUTOEVOLVE            // Select all & reflect the new fields added to the avro payload into the target
```

### Other operators

```bash
.. AUTOCREATE                   // AUTOCREATE TABLE
.. AUTOCREATE PK field1,field2  // AUTOCREATE with primary keys
.. BATCH 5000                   // Set batching to 5000 records
.. AUTOEVOLVE
.. AUTOCREATE AUTOEVOLVE
```

### Future options

```bash
.. NOOP | THROW | RETRY   // Define the error policy
.. WHERE ..               // Add filtering rules
.. CAPITALIZE | TOLOWER   // Force TABLE and COLUMN names to uppercase or lowercase
```

## Building

Clone this repository and run:

```bash
gradle clean compile test
```

The Java files for the parser and lexer are generated under the `generated-sources/antlr4` folder.
