Commit bf721f3: 1.4.0 feature update
aiceflower committed Jul 19, 2023 (1 parent: 5595fbd)
Showing 28 changed files with 1,036 additions and 872 deletions.
4 changes: 2 additions & 2 deletions docs/deployment/deploy-quick.md
@@ -235,7 +235,7 @@ HADOOP_KEYTAB_PATH=/appcom/keytab/
>
> Note: Linkis has not yet adapted permissions for S3, so authorization cannot be granted on S3 paths.
- `vim linkis.properties`
+ `vim $LINKIS_HOME/conf/linkis.properties`
```shell script
# s3 file system
linkis.storage.s3.access.key=xxx
@@ -245,7 +245,7 @@
linkis.storage.s3.region=xxx
linkis.storage.s3.bucket=xxx
```
- `vim linkis-cg-entrance.properties`
+ `vim $LINKIS_HOME/conf/linkis-cg-entrance.properties`
```shell script
wds.linkis.entrance.config.log.path=s3:///linkis/logs
wds.linkis.resultSet.store.path=s3:///linkis/results
```
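After updating both files and restarting the affected services, a quick smoke test can confirm that results land in S3. The sketch below is illustrative only: it assumes `linkis-cli` with the `shell-1` engine and an AWS CLI configured for the same bucket, none of which this section strictly requires.

```shell script
# Submit a trivial job (engine type/version and user are placeholder assumptions)
sh $LINKIS_HOME/bin/linkis-cli -submitUser hadoop \
 -engineType shell-1 -codeType shell -code 'echo linkis-s3-smoke-test'
# If the AWS CLI can reach the same bucket, list the configured result path
aws s3 ls s3://linkis/results/ --recursive | tail
```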
76 changes: 26 additions & 50 deletions docs/engine-usage/impala.md
@@ -1,42 +1,32 @@
---
title: Impala
- sidebar_position: 15
+ sidebar_position: 12
---

This article mainly introduces the installation, usage and configuration of the `Impala` engine plugin in `Linkis`.


## 1. Pre-work

- ### 1.1 Engine installation
+ ### 1.1 Environment installation

- If you want to use the `Impala` engine on your `Linkis` service, you need to prepare the Impala service and provide connection information, such as the connection address of the Impala cluster, SASL username and password, etc.
+ If you want to use the Impala engine on your server, you need to prepare the Impala service and provide connection information, such as the connection address of the Impala cluster, SASL username and password, etc.

- ### 1.2 Service Verification
+ ### 1.2 Environment verification

- ```shell
- # prepare trino-cli
- wget https://repo1.maven.org/maven2/io/trino/trino-cli/374/trino-cli-374-executable.jar
- mv trino-cli-374-executable.jar trino-cli
- chmod +x trino-cli
-
- # Execute the task
- ./trino-cli --server localhost:8080 --execute 'show tables from system.jdbc'
-
- # Get the following output to indicate that the service is available
- "attributes"
- "catalogs"
- "columns"
- "procedure_columns"
- "procedures"
- "pseudo_columns"
- "schemas"
- "super_tables"
- "super_types"
- "table_types"
- "tables"
- "types"
- "udts"
- ```
+ Execute the impala-shell command; output like the following indicates that the impala service is available.
```
[root@8f43473645b1 /]# impala-shell
Starting Impala Shell without Kerberos authentication
Connected to 8f43473645b1:21000
Server version: impalad version 2.12.0-cdh5.15.0 RELEASE (build 23f574543323301846b41fa5433690df32efe085)
***********************************************************************************
Welcome to the Impala shell.
(Impala Shell v2.12.0-cdh5.15.0 (23f5745) built on Thu May 24 04:07:31 PDT 2018)
When pretty-printing is disabled, you can use the '--output_delimiter' flag to set
the delimiter for fields in the same row. The default is ','.
***********************************************************************************
[8f43473645b1:21000] >
```
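The same check can be scripted non-interactively; a minimal sketch, assuming impalad's default shell port 21000 on the local host:

```shell
# -i sets the impalad host:port to connect to, -q runs one statement and exits
impala-shell -i 127.0.0.1:21000 -q 'show databases;'
```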

## 2. Engine plugin deployment
@@ -101,7 +91,7 @@ select * from linkis_cg_engine_conn_plugin_bml_resources;

```shell
sh ./bin/linkis-cli -submitUser impala \
- -engineType impala-3.4.0 -code 'select * from default.test limit 10' \
+ -engineType impala-3.4.0 -code 'show databases;' \
-runtimeMap linkis.es.http.method=GET \
-runtimeMap linkis.impala.servers=127.0.0.1:21050
```
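A variant of the same submission that pins the code type explicitly may also be useful; a sketch (`-codeType sql` follows the pattern of the other engine docs and is an assumption here):

```shell
sh ./bin/linkis-cli -submitUser impala \
 -engineType impala-3.4.0 -codeType sql -code 'show databases;' \
 -runtimeMap linkis.impala.servers=127.0.0.1:21050
```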
@@ -143,37 +133,23 @@ More `Linkis-Cli` command parameter reference: [Linkis-Cli usage](../user-guide/

If the default parameters do not meet your needs, some basic parameters can be configured in the following ways

- #### 4.2.1 Management console configuration
-
- ![](./images/trino-config.png)
-
- Note: After modifying the configuration under the `IDE` tag, you need to specify `-creator IDE` to take effect (other tags are similar), such as:
-
- ```shell
- sh ./bin/linkis-cli -creator IDE -submitUser hadoop \
-  -engineType impala-3.4.0 -codeType sql \
-  -code 'select * from system.jdbc.schemas limit 10'
- ```
-
- #### 4.2.2 Task interface configuration
+ #### 4.2.1 Task interface configuration
When submitting a task through the interface, it can be configured via the parameter `params.configuration.runtime`

```shell
Example of http request parameters
{
"executionContent": {"code": "select * from system.jdbc.schemas limit 10;", "runType": "sql"},
"executionContent": {"code": "show databases;", "runType": "sql"},
"params": {
"variable": {},
"configuration": {
"runtime": {
"linkis.trino.url":"http://127.0.0.1:8080",
"linkis.trino.catalog ":"hive",
"linkis.trino.schema ":"default"
}
"linkis.impala.servers"="127.0.0.1:21050"
}
},
}
},
"labels": {
"engineType": "trino-371",
"engineType": "impala-3.4.0",
"userCreator": "hadoop-IDE"
}
}
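For illustration, a request body like the above could be posted to the entrance submit interface roughly as follows; this is a sketch, and the gateway address, port, and token headers are deployment-specific assumptions:

```shell
# Save the request body above as impala_task.json, then POST it via the gateway
curl -X POST "http://127.0.0.1:9001/api/rest_j/v1/entrance/submit" \
 -H "Content-Type: application/json" \
 -H "Token-Code: xxx" -H "Token-User: hadoop" \
 -d @impala_task.json
```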
@@ -185,7 +161,7 @@ Example of http request parameters

```
linkis_ps_configuration_config_key: Insert the key and default values of the configuration parameters of the engine
- linkis_cg_manager_label: insert engine label such as: trino-375
+ linkis_cg_manager_label: insert engine label such as: impala-3.4.0
linkis_ps_configuration_category: Insert the directory association of the engine
linkis_ps_configuration_config_value: Insert the configuration that the engine needs to display
linkis_ps_configuration_key_engine_relation: the relationship between configuration items and engines
```
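After running the inserts, a check along these lines can confirm that the label and configuration keys exist (a sketch; database name, credentials, and column names may vary across Linkis versions):

```shell
mysql -h 127.0.0.1 -u linkis -p -e "
 select * from linkis_cg_manager_label where label_value like '%impala%';
 select * from linkis_ps_configuration_config_key where engine_conn_type = 'impala';" linkis
```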
72 changes: 55 additions & 17 deletions docs/feature/base-engine-compatibilty.md
@@ -1,36 +1,74 @@
---
- title: reduce base engine compatibility issues
+ title: Base Engine Dependency, Compatibility, Default Version Optimization
sidebar_position: 0.2
---

- ## 1. Requirement Background
- before we may need to modify linkis source code to fit different hive and spark version and compilation may fail for some certain versions, we need to reduce compilation and installation problems caused by base engine versions
+ ## 1. Requirement background
+ 1. Lower versions of Linkis require source changes to adapt to different versions of Hive, Spark, etc., and compilation can fail for certain versions; these base engine compatibility issues should be reduced.
+ 2. Hadoop, Hive, and Spark 3.x are all very mature, and lower engine versions carry potential risks. Many community users run the 3.x versions by default, so consider changing the default compiled version of Linkis to 3.x.

## 2. Instructions for use
- for different hive compilation, we only to compile with target hive versions, such as
- ```
- mvn clean install package -Dhive.version=3.1.3
- ```
-
- for different spark compilation, we only to compile with target spark versions, here are normal scenes for usage.
- ```
- spark3+hadoop3
- mvn install package
- spark3+hadoop2
- mvn install package -Phadoop-2.7
- spark2+hadoop2
- mvn install package -Pspark-2.4 -Phadoop-2.7
- spark2+hadoop3
- mvn install package -Pspark-2.4
- ```
+ ## 2.1 Default version adjustment instructions
+
+ Linkis 1.4.0 changes the default versions of Hadoop, Hive, and Spark to 3.x; the specific versions are Hadoop 3.3.4, Hive 3.1.3, and Spark 3.2.1.
+
+ ## 2.2 Different version adaptation
+
+ To compile against a different hive version, we only need to specify `-Dhive.version=xxx`, for example:
+ ```
+ mvn clean install package -Dhive.version=2.3.3
+ ```
+ To compile against different spark and hadoop combinations, we only need to specify the corresponding profiles. Common usage scenarios are as follows:
+ ```
+ # spark3 + hadoop3
+ mvn install package
+ # spark3 + hadoop2
+ mvn install package -Phadoop-2.7
+ # spark2 + hadoop2
+ mvn install package -Pspark-2.4 -Phadoop-2.7
+ # spark2 + hadoop3
+ mvn install package -Pspark-2.4
+ ```
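To confirm which base versions a given profile combination actually resolves, the standard maven-help-plugin can be queried; a sketch (the property names are inferred from the `-D` flags used above):

```
mvn help:evaluate -Dexpression=hive.version -q -DforceStdout
mvn help:evaluate -Dexpression=spark.version -q -DforceStdout
mvn help:evaluate -Dexpression=hadoop.version -q -DforceStdout
```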
## 3. Precautions
- spark subversion can be specified by -Dspark.version=xxx
- hadoop subversion can be specified by -Dhadoop.version=xxx
+ 1. When the default version is compiled, the base versions are: hadoop 3.3.4 + hive 3.1.3 + spark 3.2.1
+ ```
+ mvn install package
+ ```
+ Due to the default version upgrade of the base engines, the `spark-3.2`, `hadoop-3.3` and `spark-2.4-hadoop-3.3` profiles were removed, and the `hadoop-2.7` and `spark-2.4` profiles were added.
+
+ 2. The spark sub-version can be specified by `-Dspark.version=xxx`. The default scala version used by the system is 2.12.17, which suits spark 3.x. To compile spark 2.x, scala 2.11 is required: compile with the `-Pspark-2.4` profile, or with `-Dspark.version=2.xx -Dscala.version=2.11.12 -Dscala.binary.version=2.11`.
+
+ 3. The hadoop sub-version can be specified by `-Dhadoop.version=xxx`

for example:
- mvn install package -Pspark-3.2 -Phadoop-3.3 -Dspark.version=3.1.3
+ ```
+ mvn install package -Pspark-3.2 -Phadoop-3.3 -Dspark.version=3.1.3
+ ```
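As a concrete sketch of the two spark 2.x routes from item 2 above (the explicit 2.4.3 version is only an illustrative assumption):

```
# via the profile
mvn install package -Pspark-2.4
# or by pinning the versions explicitly
mvn install package -Dspark.version=2.4.3 -Dscala.version=2.11.12 -Dscala.binary.version=2.11
```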

4. Hive version 2.x needs to depend on jersey, and Hive EC does not add the jersey dependency by default when compiling. You can compile it following the guidelines below.

**Compile hive version 2.3.3**

When compiling hive EC, a profile that adds the jersey dependencies is activated by default when version 2.3.3 is specified. Users can compile by specifying the `-Dhive.version=2.3.3` parameter.

**Compile other hive 2.x versions**

Modify the linkis-engineconn-plugins/hive/pom.xml file, changing 2.3.3 to the version to be compiled, such as 2.1.0:
```xml
<profile>
<id>hive-jersey-dependencies</id>
<activation>
<property>
<name>hive.version</name>
<!-- <value>2.3.3</value> -->
<value>2.1.0</value>
</property>
</activation>
...
</profile>
```
Add the `-Dhive.version=2.1.0` parameter when compiling.
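Putting this together for the hive 2.1.0 case, the full build command would be the following sketch (assuming the pom.xml change above has been applied):

```
mvn clean install package -Dhive.version=2.1.0
```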