Merge branch 'dev' into dev-upgrade
casionone authored Jul 27, 2023
2 parents 04f8626 + 10dbad9 commit 0142e3d
Showing 43 changed files with 1,179 additions and 990 deletions.
6 changes: 3 additions & 3 deletions docs/about/configuration.md
@@ -40,9 +40,9 @@ Linkis binary packages are compiled based on the following software versions:

| Component | Version | Description |
| --- | --- | --- |
| Hadoop | 2.7.2 | |
| Hive | 2.3.3 | |
| Spark | 2.4.3 | |
| Hadoop | 3.3.4 | |
| Hive | 3.1.3 | |
| Spark | 3.2.1 | |
| Flink | 1.12.2 | |
| openLooKeng | 1.5.0 | |
| Sqoop | 1.4.6 | |
48 changes: 36 additions & 12 deletions docs/deployment/deploy-cluster.md
@@ -89,24 +89,48 @@ Total memory: 300 people online at the same time * 1G memory for a single engine

## 2. Process of distributed deployment

> The following is just a reference example, taking two servers as an example for distributed deployment. At present, the one-click installation script does not support distributed deployment well, so manual adjustment and deployment are required.
> All Linkis services support distributed and multi-cluster deployment. It is recommended to complete a stand-alone deployment on one machine first, and make sure Linkis works normally, before attempting a distributed deployment.
If you have already deployed Linkis successfully in stand-alone mode on server A and now want to add server B for a distributed deployment, you can refer to the following steps.
At present, the one-click installation script does not support distributed deployment well, so manual adjustment and deployment are required. For a distributed deployment you can refer to the following steps, assuming the user has completed the single-machine deployment on machine A.

Mode: multi-active Eureka deployment, with some services deployed on server A and some on server B.

### 2.1 Environment preparation for distributed deployment
Like server A, server B needs the basic environment prepared; please refer to [Linkis environment preparation](deploy-quick#3-linkis%E7%8E%AF%E5%A2%83%E5%87%86%E5%A4%87)

### 2.2 Eureka multi-active configuration adjustment
The registration center Eureka service needs to be deployed on both server A and server B.
**Network Check**

Check whether the machines used for the distributed deployment can reach each other; the `ping` command can be used to check:
```
ping IP
```

**Permission check**

Check whether the hadoop user exists on each machine and whether the hadoop user has sudo permission.
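These two checks can be scripted; the sketch below is only illustrative, and the `sudo`/`wheel` group names are distro-dependent assumptions (sudo on Debian-like systems, wheel on CentOS-like systems).

```shell
# Check that a login user exists; prints "present" or "missing".
check_user() {
  if id "$1" >/dev/null 2>&1; then
    echo "user $1: present"
  else
    echo "user $1: missing"
  fi
}

# Sudo capability is usually granted via the sudo or wheel group; this only
# covers the common group-based setup, not sudoers entries for single users.
check_sudo_group() {
  id -nG "$1" 2>/dev/null | grep -Eq '(^| )(sudo|wheel)( |$)' \
    && echo "user $1: sudo group ok" \
    || echo "user $1: no sudo group"
}

check_user hadoop
check_sudo_group hadoop
```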

**Required Environmental Checks**

Each Linkis service depends on some basic environments, either at startup or when tasks are executed. Please check the basic environment of each machine according to the table below. For specific inspection methods, refer to [Linkis environment preparation](deploy-quick#3-linkis%E7%8E%AF%E5%A2%83%E5%87%86%E5%A4%87)

|Service Name|Dependency Environment|
|-|-|
|mg-eureka|Java|
|mg-gateway|Java|
|ps-publicservice|Java, Hadoop|
|cg-linkismanager|Java|
|cg-entrance|Java|
|cg-engineconnmanager|Java, Hive, Spark, Python, Shell|


Note: If you need to use other non-default engines, you also need to check whether the environment of the corresponding engine is OK on the machine where the cg-engineconnmanager service is located. To check the prerequisite work for each engine, refer to the [engine usage overview](https://linkis.apache.org/zh-CN/docs/latest/engine-usage/overview).
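As a rough sketch, the dependencies in the table can be probed by checking for the corresponding commands on each machine. The command names below (`java`, `hadoop`, `hive`, `spark-submit`, `python`) are assumptions to adapt to your environment.

```shell
# Print ok/MISSING for each command; returns non-zero if anything is missing.
check_cmds() {
  missing=0
  for c in "$@"; do
    if command -v "$c" >/dev/null 2>&1; then
      echo "ok: $c"
    else
      echo "MISSING: $c"
      missing=1
    fi
  done
  return "$missing"
}

# Example for the machine that will run cg-engineconnmanager:
check_cmds java hadoop hive spark-submit python || echo "some dependencies are missing"
```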

### 2.2 Eureka multi-active configuration adjustment

Modify the Eureka configuration file, add the configuration addresses of both Eurekas, and let the Eureka services register with each other.
On server A, make the following configuration changes
Modify the Eureka configuration file on machine A, adding the Eureka addresses of all machines so that the Eureka services register with each other.
On server A, make the following configuration changes, taking a two-node Eureka cluster as an example.

```
Modify the $LINKIS_HOME/conf/application-eureka.yml and $LINKIS_HOME/conf/application-linkis.yml configuration
eureka:
client:
@@ -120,11 +120,11 @@
wds.linkis.eureka.defaultZone=http://eurekaIp1:port1/eureka/,http://eurekaIp2:port
```
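Since every machine's `defaultZone` must list every Eureka instance, the value can be generated from the host list. A minimal sketch, where the `host:port` values are placeholders:

```shell
# Build the wds.linkis.eureka.defaultZone value from "host:port" arguments.
build_default_zone() {
  zone=""
  for hp in "$@"; do
    zone="${zone}http://${hp}/eureka/,"
  done
  printf '%s\n' "${zone%,}"   # drop the trailing comma
}

build_default_zone eurekaIp1:port1 eurekaIp2:port2
# -> http://eurekaIp1:port1/eureka/,http://eurekaIp2:port2/eureka/
```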

### 2.3 Synchronization of installation materials
On server A, package the successfully installed Linkis directory `$LINKIS_HOME`, then copy and decompress it to the same directory on server B.
At this point, if `sbin/linkis-start-all.sh` is run on both server A and server B to start all services, every service has two instances. You can visit the Eureka service page at `http://eurekaIp1:port1` or `http://eurekaIp2:port2` to check.
Create the same `$LINKIS_HOME` directory on all other machines as on machine A. On server A, package the successfully installed Linkis directory `$LINKIS_HOME`, then copy and decompress it to the same directory on the other machines.
At this point, if the `sbin/linkis-start-all.sh` script is executed on server A and the other machines to start all services, every service has n instances, where n is the number of machines. You can visit the Eureka service page at `http://eurekaIp1:port1` or `http://eurekaIp2:port2` to check.
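The packaging step can be sketched as follows. `LINKIS_HOME`, the `hadoop` user, and the `server-b` host are placeholder assumptions, and the copy/unpack commands are left as comments because they depend on your environment:

```shell
# Package the installed directory so it can be recreated at the same path
# on another machine.
pack_linkis() {
  src="$1"; out="$2"
  tar -czf "$out" -C "$(dirname "$src")" "$(basename "$src")"
}

# On server A (path is an assumption):
# pack_linkis "$LINKIS_HOME" /tmp/linkis-package.tar.gz
# Copy and unpack on server B at the SAME directory:
# scp /tmp/linkis-package.tar.gz hadoop@server-b:/tmp/
# ssh hadoop@server-b 'tar -xzf /tmp/linkis-package.tar.gz -C /opt'
```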

### 2.4 Adjust startup script
According to the actual situation, determine the services that need to be deployed on server A and server B.
According to the actual situation, determine which Linkis services need to be deployed on each machine.
For example, if the microservice `linkis-cg-engineconnmanager` will not be deployed on server A,
then modify server A's one-click start-stop scripts, `sbin/linkis-start-all.sh` and `sbin/linkis-stop-all.sh`, and comment out the start-stop commands related to the `cg-engineconnmanager` service.
```html
```
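One way to comment out those lines is with `sed`. This is a sketch only — the actual line patterns inside `linkis-start-all.sh`/`linkis-stop-all.sh` may differ, so inspect the scripts before applying:

```shell
# Prefix every non-comment line mentioning the service with '#'.
# A .bak backup of the script is kept next to it.
comment_out_service() {
  svc="$1"; file="$2"
  sed -i.bak "/^[^#].*${svc}/s/^/#/" "$file"
}

# Assumed usage (pattern and paths are assumptions):
# comment_out_service ENGINECONNMANAGER "$LINKIS_HOME/sbin/linkis-start-all.sh"
# comment_out_service ENGINECONNMANAGER "$LINKIS_HOME/sbin/linkis-stop-all.sh"
```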
16 changes: 8 additions & 8 deletions docs/deployment/deploy-quick.md
@@ -509,10 +509,10 @@ wds.linkis.admin.password= #Password
sh bin/linkis-cli -submitUser hadoop -engineType shell-1 -codeType shell -code "whoami"
#hive engine tasks
sh bin/linkis-cli -submitUser hadoop -engineType hive-2.3.3 -codeType hql -code "show tables"
sh bin/linkis-cli -submitUser hadoop -engineType hive-3.1.3 -codeType hql -code "show tables"
#spark engine tasks
sh bin/linkis-cli -submitUser hadoop -engineType spark-2.4.3 -codeType sql -code "show tables"
sh bin/linkis-cli -submitUser hadoop -engineType spark-3.2.1 -codeType sql -code "show tables"
#python engine task
sh bin/linkis-cli -submitUser hadoop -engineType python-python2 -codeType python -code 'print("hello, world!")'
@@ -553,9 +553,9 @@
```
$ tree linkis-package/lib/linkis-engineconn-plugins/ -L 3
linkis-package/lib/linkis-engineconn-plugins/
├── hive
│ ├── dist
│ │ └── 2.3.3 #version is 2.3.3 engineType is hive-2.3.3
│ │ └── 3.1.3 #version is 3.1.3 engineType is hive-3.1.3
│ └── plugin
│ └── 2.3.3
│ └── 3.1.3
├── python
│ ├── dist
│ │ └── python2
@@ -568,9 +568,9 @@ linkis-package/lib/linkis-engineconn-plugins/
│ └── 1
└── spark
├── dist
│ └── 2.4.3
│ └── 3.2.1
└── plugin
└── 2.4.3
└── 3.2.1
```
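Based on the directory layout above, the installed engine types and versions can be derived from the `dist` subdirectories. A sketch — the plugin path in the commented call is an assumption:

```shell
# Print "engineType: <engine>-<version>" for every installed plugin version,
# assuming the <engine>/dist/<version> layout shown above.
list_engines() {
  base="$1"
  for d in "$base"/*/dist/*/; do
    [ -d "$d" ] || continue
    v=$(basename "$d")
    e=$(basename "$(dirname "$(dirname "$d")")")
    echo "engineType: ${e}-${v}"
  done
}

# list_engines "$LINKIS_HOME/lib/linkis-engineconn-plugins"
```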
#### Method 2: View the database table of linkis
@@ -597,13 +597,13 @@ Insert yarn data information
INSERT INTO `linkis_cg_rm_external_resource_provider`
(`resource_type`, `name`, `labels`, `config`) VALUES
('Yarn', 'sit', NULL,
'{\r\n"rmWebAddress": "http://xx.xx.xx.xx:8088",\r\n"hadoopVersion": "2.7.2",\r\n"authorEnable":false, \r\n"user":"hadoop",\r\n"pwd":"123456"\r\n}'
'{\r\n"rmWebAddress": "http://xx.xx.xx.xx:8088",\r\n"hadoopVersion": "3.3.4",\r\n"authorEnable":false, \r\n"user":"hadoop",\r\n"pwd":"123456"\r\n}'
);
config field properties
"rmWebAddress": "http://xx.xx.xx.xx:8088", # must include http and the port
"hadoopVersion": "2.7.2",
"hadoopVersion": "3.3.4",
"authorEnable": true, // Whether authentication is required; you can verify the username and password by visiting http://xx.xx.xx.xx:8088 in the browser
"user": "user", // username
"pwd": "pwd" // password
14 changes: 7 additions & 7 deletions docs/engine-usage/hive.md
@@ -42,7 +42,7 @@ The version of `Hive` supports `hive1.x` and `hive2.x`. The default is to suppor

<https://github.com/apache/linkis/pull/541>

The `hive` version supported by default is 2.3.3. If you want to modify the `hive` version, you can find the `linkis-engineplugin-hive` module, modify the \<hive.version\> tag, and then compile this module separately.
The `hive` version supported by default is 3.1.3. If you want to modify the `hive` version, you can find the `linkis-engineplugin-hive` module, modify the \<hive.version\> tag, and then compile this module separately.

[EngineConnPlugin engine plugin installation](../deployment/install-engineconn.md)

@@ -51,7 +51,7 @@ The `hive` version supported by default is 2.3.3, if you want to modify the `hiv
### 3.1 Submitting tasks via `Linkis-cli`

```shell
sh ./bin/linkis-cli -engineType hive-2.3.3 \
sh ./bin/linkis-cli -engineType hive-3.1.3 \
-codeType hql -code "show databases" \
-submitUser hadoop -proxyUser hadoop
```
@@ -65,7 +65,7 @@ For the `Hive` task, you only need to modify `EngineConnType` and `CodeType` par

```java
Map<String, Object> labels = new HashMap<String, Object>();
labels.put(LabelKeyConstant.ENGINE_TYPE_KEY, "hive-2.3.3"); // required engineType Label
labels.put(LabelKeyConstant.ENGINE_TYPE_KEY, "hive-3.1.3"); // required engineType Label
labels.put(LabelKeyConstant.USER_CREATOR_TYPE_KEY, "hadoop-IDE");// required execute user and creator
labels.put(LabelKeyConstant.CODE_TYPE_KEY, "hql"); // required codeType
```
@@ -95,7 +95,7 @@ Note: After modifying the configuration under the `IDE` tag, you need to specify

```shell
sh ./bin/linkis-cli -creator IDE \
-engineType hive-2.3.3 -codeType hql \
-engineType hive-3.1.3 -codeType hql \
-code "show databases" \
-submitUser hadoop -proxyUser hadoop
```
@@ -116,7 +116,7 @@ Example of http request parameters
}
},
"labels": {
"engineType": "hive-2.3.3",
"engineType": "hive-3.1.3",
"userCreator": "hadoop-IDE"
}
}
@@ -128,7 +128,7 @@

```
linkis_ps_configuration_config_key: Insert the keys and default values of the configuration parameters of the engine
linkis_cg_manager_label: insert engine label such as: hive-2.3.3
linkis_cg_manager_label: insert engine label such as: hive-3.1.3
linkis_ps_configuration_category: Insert the directory association of the engine
linkis_ps_configuration_config_value: The configuration that the insertion engine needs to display
linkis_ps_configuration_key_engine_relation: The relationship between the configuration item and the engine
```
@@ -138,7 +138,7 @@ The initial data related to the engine in the table is as follows

```sql
-- set variable
SET @HIVE_LABEL="hive-2.3.3";
SET @HIVE_LABEL="hive-3.1.3";
SET @HIVE_ALL=CONCAT('*-*,',@HIVE_LABEL);
SET @HIVE_IDE=CONCAT('*-IDE,',@HIVE_LABEL);

```
76 changes: 26 additions & 50 deletions docs/engine-usage/impala.md
@@ -1,42 +1,32 @@
---
title: Impala
sidebar_position: 15
sidebar_position: 12
---

This article mainly introduces the installation, usage and configuration of the `Impala` engine plugin in `Linkis`.


## 1. Pre-work

### 1.1 Engine installation
### 1.1 Environment installation

If you want to use the `Impala` engine on your `Linkis` service, you need to prepare the Impala service and provide connection information, such as the connection address of the Impala cluster, SASL username and password, etc.
If you want to use the Impala engine on your server, you need to prepare the Impala service and provide connection information, such as the connection address of the Impala cluster, SASL user name and password, etc.

### 1.2 Service Verification
### 1.2 Environment verification

```shell
# prepare trino-cli
wget https://repo1.maven.org/maven2/io/trino/trino-cli/374/trino-cli-374-executable.jar
mv trino-cli-374-executable.jar trino-cli
chmod +x trino-cli

# Execute the task
./trino-cli --server localhost:8080 --execute 'show tables from system.jdbc'

# Get the following output to indicate that the service is available
"attributes"
"catalogs"
"columns"
"procedure_columns"
"procedures"
"pseudo_columns"
"schemas"
"super_tables"
"super_types"
"table_types"
"tables"
"types"
"udts"
```

Execute the impala-shell command to get the following output, indicating that the impala service is available.
```
[root@8f43473645b1 /]# impala-shell
Starting Impala Shell without Kerberos authentication
Connected to 8f43473645b1:21000
Server version: impalad version 2.12.0-cdh5.15.0 RELEASE (build 23f574543323301846b41fa5433690df32efe085)
***********************************************************************************
Welcome to the Impala shell.
(Impala Shell v2.12.0-cdh5.15.0 (23f5745) built on Thu May 24 04:07:31 PDT 2018)
When pretty-printing is disabled, you can use the '--output_delimiter' flag to set
the delimiter for fields in the same row. The default is ','.
***********************************************************************************
[8f43473645b1:21000] >
```

## 2. Engine plugin deployment
@@ -101,7 +91,7 @@ select * from linkis_cg_engine_conn_plugin_bml_resources;

```shell
sh ./bin/linkis-cli -submitUser impala \
-engineType impala-3.4.0 -code 'select * from default.test limit 10' \
-engineType impala-3.4.0 -code 'show databases;' \
-runtimeMap linkis.es.http.method=GET \
-runtimeMap linkis.impala.servers=127.0.0.1:21050
```
@@ -143,37 +133,23 @@ More `Linkis-Cli` command parameter reference: [Linkis-Cli usage](../user-guide/

If the default parameters do not meet your needs, you can configure some basic parameters in the following ways

#### 4.2.1 Management console configuration

![](./images/trino-config.png)

Note: After modifying the configuration under the `IDE` tag, you need to specify `-creator IDE` to take effect (other tags are similar), such as:

```shell
sh ./bin/linkis-cli -creator IDE -submitUser hadoop \
-engineType impala-3.4.0 -codeType sql \
-code 'select * from system.jdbc.schemas limit 10'
```

#### 4.2.2 Task interface configuration
#### 4.2.1 Task interface configuration
Submit the task interface and configure it through the parameter `params.configuration.runtime`

```shell
Example of http request parameters
{
    "executionContent": {"code": "select * from system.jdbc.schemas limit 10;", "runType": "sql"},
    "executionContent": {"code": "show databases;", "runType": "sql"},
    "params": {
        "variable": {},
        "configuration": {
            "runtime": {
                "linkis.trino.url": "http://127.0.0.1:8080",
                "linkis.trino.catalog": "hive",
                "linkis.trino.schema": "default",
                "linkis.impala.servers": "127.0.0.1:21050"
            }
        }
    },
    "labels": {
        "engineType": "trino-371",
        "engineType": "impala-3.4.0",
        "userCreator": "hadoop-IDE"
    }
}
```
@@ -185,7 +161,7 @@

```
linkis_ps_configuration_config_key: Insert the keys and default values of the configuration parameters of the engine
linkis_cg_manager_label: insert engine label such as: trino-375
linkis_cg_manager_label: insert engine label such as: impala-3.4.0
linkis_ps_configuration_category: Insert the directory association of the engine
linkis_ps_configuration_config_value: Insert the configuration that the engine needs to display
linkis_ps_configuration_key_engine_relation: the relationship between configuration items and engines
```
6 changes: 1 addition & 5 deletions docs/engine-usage/jdbc.md
@@ -202,11 +202,7 @@ Note: After modifying the configuration under the `IDE` tag, you need to specify
sh ./bin/linkis-cli -creator IDE \
-engineType jdbc-4 -codeType jdbc \
-code "show tables" \
-submitUser hadoop -proxyUser hadoop \
-runtimeMap wds.linkis.jdbc.connect.url=jdbc:mysql://127.0.0.1:3306 \
-runtimeMap wds.linkis.jdbc.driver=com.mysql.jdbc.Driver \
-runtimeMap wds.linkis.jdbc.username=root \
-runtimeMap wds.linkis.jdbc.password=123456 \
-submitUser hadoop -proxyUser hadoop
```

#### 4.2.2 Task interface configuration
4 changes: 2 additions & 2 deletions docs/engine-usage/overview.md
@@ -14,8 +14,8 @@ Supported engines and version information are as follows:

| Engine | Default Engine | Default Version |
|--------------| -- | ---- |
| [Spark](./spark.md) | Yes | 2.4.3 |
| [Hive](./hive.md) | Yes | 2.3.3 |
| [Spark](./spark.md) | Yes | 3.2.1 |
| [Hive](./hive.md) | Yes | 3.1.3 |
| [Python](./python.md) | Yes | python2 |
| [Shell](./shell.md) | Yes | 1 |
| [JDBC](./jdbc.md) | No | 4 |
