Merge branch 'dev' into dev-upgrade
casionone authored Jul 27, 2023
2 parents 04f8626 + 10dbad9 commit 0142e3d
Showing 43 changed files with 1,179 additions and 990 deletions.
6 changes: 3 additions & 3 deletions docs/about/configuration.md
@@ -40,9 +40,9 @@ Linkis binary packages are compiled based on the following software versions:

| Component | Version | Description |
| --- | --- | --- |
| Hadoop | 2.7.2 | |
| Hive | 2.3.3 | |
| Spark | 2.4.3 | |
| Hadoop | 3.3.4 | |
| Hive | 3.1.3 | |
| Spark | 3.2.1 | |
| Flink | 1.12.2 | |
| openLooKeng | 1.5.0 | |
| Sqoop | 1.4.6 | |
48 changes: 36 additions & 12 deletions docs/deployment/deploy-cluster.md
@@ -89,24 +89,48 @@ Total memory: 300 people online at the same time * 1G memory for a single engine

## 2. Process of distributed deployment

> The following is just a reference example, taking two servers as an example for distributed deployment. At present, the one-click installation script does not support distributed deployment well, so manual adjustment and deployment are required.
> All Linkis services support distributed and multi-cluster deployment. It is recommended to complete a stand-alone deployment on one machine first, and make sure Linkis works normally, before attempting a distributed deployment.
If you have already deployed Linkis successfully in stand-alone mode on server A and now want to add server B for a distributed deployment, you can refer to the following steps.
At present, the one-click installation script does not support distributed deployment well, so manual adjustment and deployment are required. For a distributed deployment you can refer to the following steps, assuming the user has completed the single-machine deployment on machine A.

Mode: multi-active Eureka deployment, with some services deployed on server A and some on server B.

### 2.1 Environment preparation for distributed deployment
Like server A, server B needs the basic environment prepared; please refer to [Linkis environment preparation](deploy-quick#3-linkis%E7%8E%AF%E5%A2%83%E5%87%86%E5%A4%87)

### 2.2 Eureka multi-active configuration adjustment
The registration center Eureka service needs to be deployed on both server A and server B.
**Network Check**

Check whether the machines used for the distributed deployment can reach each other; the `ping` command can be used to check:
```
ping IP
```

**Permission check**

Check whether the hadoop user exists on each machine and whether the hadoop user has sudo permission.
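These two checks can be scripted; the sketch below is only illustrative, and the `sudo`/`wheel` group names are distro-dependent assumptions (sudo on Debian-like systems, wheel on CentOS-like systems).

```shell
# Check that a login user exists; prints "present" or "missing".
check_user() {
  if id "$1" >/dev/null 2>&1; then
    echo "user $1: present"
  else
    echo "user $1: missing"
  fi
}

# Sudo capability is usually granted via the sudo or wheel group; this only
# covers the common group-based setup, not sudoers entries for single users.
check_sudo_group() {
  id -nG "$1" 2>/dev/null | grep -Eq '(^| )(sudo|wheel)( |$)' \
    && echo "user $1: sudo group ok" \
    || echo "user $1: no sudo group"
}

check_user hadoop
check_sudo_group hadoop
```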

**Required Environmental Checks**

Each Linkis service depends on some basic environments, either at startup or when tasks are executed. Please check the basic environment of each machine according to the table below. For specific inspection methods, refer to [Linkis environment preparation](deploy-quick#3-linkis%E7%8E%AF%E5%A2%83%E5%87%86%E5%A4%87)

|Service Name|Dependency Environment|
|-|-|
|mg-eureka|Java|
|mg-gateway|Java|
|ps-publicservice|Java, Hadoop|
|cg-linkismanager|Java|
|cg-entrance|Java|
|cg-engineconnmanager|Java, Hive, Spark, Python, Shell|


Note: If you need to use other non-default engines, you also need to check whether the environment of the corresponding engine is OK on the machine where the cg-engineconnmanager service is located. To check the prerequisite work for each engine, refer to the [engine usage overview](https://linkis.apache.org/zh-CN/docs/latest/engine-usage/overview).
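As a rough sketch, the dependencies in the table can be probed by checking for the corresponding commands on each machine. The command names below (`java`, `hadoop`, `hive`, `spark-submit`, `python`) are assumptions to adapt to your environment.

```shell
# Print ok/MISSING for each command; returns non-zero if anything is missing.
check_cmds() {
  missing=0
  for c in "$@"; do
    if command -v "$c" >/dev/null 2>&1; then
      echo "ok: $c"
    else
      echo "MISSING: $c"
      missing=1
    fi
  done
  return "$missing"
}

# Example for the machine that will run cg-engineconnmanager:
check_cmds java hadoop hive spark-submit python || echo "some dependencies are missing"
```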

### 2.2 Eureka multi-active configuration adjustment

Modify the Eureka configuration file, add the configuration addresses of both Eurekas, and let the Eureka services register with each other.
On server A, make the following configuration changes
Modify the Eureka configuration file on machine A, adding the Eureka addresses of all machines so that the Eureka services register with each other.
On server A, make the following configuration changes, taking a two-node Eureka cluster as an example.

```
Modify the $LINKIS_HOME/conf/application-eureka.yml and $LINKIS_HOME/conf/application-linkis.yml configuration
eureka:
client:
@@ -120,11 +120,11 @@
wds.linkis.eureka.defaultZone=http://eurekaIp1:port1/eureka/,http://eurekaIp2:port
```
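Since every machine's `defaultZone` must list every Eureka instance, the value can be generated from the host list. A minimal sketch, where the `host:port` values are placeholders:

```shell
# Build the wds.linkis.eureka.defaultZone value from "host:port" arguments.
build_default_zone() {
  zone=""
  for hp in "$@"; do
    zone="${zone}http://${hp}/eureka/,"
  done
  printf '%s\n' "${zone%,}"   # drop the trailing comma
}

build_default_zone eurekaIp1:port1 eurekaIp2:port2
# -> http://eurekaIp1:port1/eureka/,http://eurekaIp2:port2/eureka/
```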

### 2.3 Synchronization of installation materials
On server A, package the successfully installed Linkis directory `$LINKIS_HOME`, then copy and decompress it to the same directory on server B.
At this point, if `sbin/linkis-start-all.sh` is run on both server A and server B to start all services, every service has two instances. You can visit the Eureka service page at `http://eurekaIp1:port1` or `http://eurekaIp2:port2` to check.
Create the same `$LINKIS_HOME` directory on all other machines as on machine A. On server A, package the successfully installed Linkis directory `$LINKIS_HOME`, then copy and decompress it to the same directory on the other machines.
At this point, if the `sbin/linkis-start-all.sh` script is executed on server A and the other machines to start all services, every service has n instances, where n is the number of machines. You can visit the Eureka service page at `http://eurekaIp1:port1` or `http://eurekaIp2:port2` to check.
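The packaging step can be sketched as follows. `LINKIS_HOME`, the `hadoop` user, and the `server-b` host are placeholder assumptions, and the copy/unpack commands are left as comments because they depend on your environment:

```shell
# Package the installed directory so it can be recreated at the same path
# on another machine.
pack_linkis() {
  src="$1"; out="$2"
  tar -czf "$out" -C "$(dirname "$src")" "$(basename "$src")"
}

# On server A (path is an assumption):
# pack_linkis "$LINKIS_HOME" /tmp/linkis-package.tar.gz
# Copy and unpack on server B at the SAME directory:
# scp /tmp/linkis-package.tar.gz hadoop@server-b:/tmp/
# ssh hadoop@server-b 'tar -xzf /tmp/linkis-package.tar.gz -C /opt'
```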

### 2.4 Adjust startup script
According to the actual situation, determine the services that need to be deployed on server A and server B.
According to the actual situation, determine which Linkis services need to be deployed on each machine.
For example, if the microservice `linkis-cg-engineconnmanager` will not be deployed on server A,
then modify server A's one-click start-stop scripts, `sbin/linkis-start-all.sh` and `sbin/linkis-stop-all.sh`, and comment out the start-stop commands related to the `cg-engineconnmanager` service.
```html
```
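One way to comment out those lines is with `sed`. This is a sketch only — the actual line patterns inside `linkis-start-all.sh`/`linkis-stop-all.sh` may differ, so inspect the scripts before applying:

```shell
# Prefix every non-comment line mentioning the service with '#'.
# A .bak backup of the script is kept next to it.
comment_out_service() {
  svc="$1"; file="$2"
  sed -i.bak "/^[^#].*${svc}/s/^/#/" "$file"
}

# Assumed usage (pattern and paths are assumptions):
# comment_out_service ENGINECONNMANAGER "$LINKIS_HOME/sbin/linkis-start-all.sh"
# comment_out_service ENGINECONNMANAGER "$LINKIS_HOME/sbin/linkis-stop-all.sh"
```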
16 changes: 8 additions & 8 deletions docs/deployment/deploy-quick.md
@@ -509,10 +509,10 @@ wds.linkis.admin.password= #Password
sh bin/linkis-cli -submitUser hadoop -engineType shell-1 -codeType shell -code "whoami"
#hive engine tasks
sh bin/linkis-cli -submitUser hadoop -engineType hive-2.3.3 -codeType hql -code "show tables"
sh bin/linkis-cli -submitUser hadoop -engineType hive-3.1.3 -codeType hql -code "show tables"
#spark engine tasks
sh bin/linkis-cli -submitUser hadoop -engineType spark-2.4.3 -codeType sql -code "show tables"
sh bin/linkis-cli -submitUser hadoop -engineType spark-3.2.1 -codeType sql -code "show tables"
#python engine task
sh bin/linkis-cli -submitUser hadoop -engineType python-python2 -codeType python -code 'print("hello, world!")'
@@ -553,9 +553,9 @@
```
$ tree linkis-package/lib/linkis-engineconn-plugins/ -L 3
linkis-package/lib/linkis-engineconn-plugins/
├── hive
│ ├── dist
│ │ └── 2.3.3 #version is 2.3.3 engineType is hive-2.3.3
│ │ └── 3.1.3 #version is 3.1.3 engineType is hive-3.1.3
│ └── plugin
│ └── 2.3.3
│ └── 3.1.3
├── python
│ ├── dist
│ │ └── python2
@@ -568,9 +568,9 @@ linkis-package/lib/linkis-engineconn-plugins/
│ └── 1
└── spark
├── dist
│ └── 2.4.3
│ └── 3.2.1
└── plugin
└── 2.4.3
└── 3.2.1
```
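Based on the directory layout above, the installed engine types and versions can be derived from the `dist` subdirectories. A sketch — the plugin path in the commented call is an assumption:

```shell
# Print "engineType: <engine>-<version>" for every installed plugin version,
# assuming the <engine>/dist/<version> layout shown above.
list_engines() {
  base="$1"
  for d in "$base"/*/dist/*/; do
    [ -d "$d" ] || continue
    v=$(basename "$d")
    e=$(basename "$(dirname "$(dirname "$d")")")
    echo "engineType: ${e}-${v}"
  done
}

# list_engines "$LINKIS_HOME/lib/linkis-engineconn-plugins"
```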
#### Method 2: View the database table of linkis
@@ -597,13 +597,13 @@ Insert yarn data information
INSERT INTO `linkis_cg_rm_external_resource_provider`
(`resource_type`, `name`, `labels`, `config`) VALUES
('Yarn', 'sit', NULL,
'{\r\n"rmWebAddress": "http://xx.xx.xx.xx:8088",\r\n"hadoopVersion": "2.7.2",\r\n"authorEnable":false, \r\n"user":"hadoop",\r\n"pwd":"123456"\r\n}'
'{\r\n"rmWebAddress": "http://xx.xx.xx.xx:8088",\r\n"hadoopVersion": "3.3.4",\r\n"authorEnable":false, \r\n"user":"hadoop",\r\n"pwd":"123456"\r\n}'
);
config field properties
"rmWebAddress": "http://xx.xx.xx.xx:8088", # must include http and the port
"hadoopVersion": "2.7.2",
"hadoopVersion": "3.3.4",
"authorEnable": true, // Whether authentication is required; you can verify the username and password by visiting http://xx.xx.xx.xx:8088 in the browser
"user": "user", // username
"pwd": "pwd" // password
14 changes: 7 additions & 7 deletions docs/engine-usage/hive.md
@@ -42,7 +42,7 @@ The version of `Hive` supports `hive1.x` and `hive2.x`. The default is to suppor

<https://github.com/apache/linkis/pull/541>

The `hive` version supported by default is 2.3.3. If you want to modify the `hive` version, you can find the `linkis-engineplugin-hive` module, modify the \<hive.version\> tag, and then compile this module separately.
The `hive` version supported by default is 3.1.3. If you want to modify the `hive` version, you can find the `linkis-engineplugin-hive` module, modify the \<hive.version\> tag, and then compile this module separately.

[EngineConnPlugin engine plugin installation](../deployment/install-engineconn.md)

@@ -51,7 +51,7 @@ The `hive` version supported by default is 2.3.3, if you want to modify the `hiv
### 3.1 Submitting tasks via `Linkis-cli`

```shell
sh ./bin/linkis-cli -engineType hive-2.3.3 \
sh ./bin/linkis-cli -engineType hive-3.1.3 \
-codeType hql -code "show databases" \
-submitUser hadoop -proxyUser hadoop
```
@@ -65,7 +65,7 @@ For the `Hive` task, you only need to modify `EngineConnType` and `CodeType` par

```java
Map<String, Object> labels = new HashMap<String, Object>();
labels.put(LabelKeyConstant.ENGINE_TYPE_KEY, "hive-2.3.3"); // required engineType Label
labels.put(LabelKeyConstant.ENGINE_TYPE_KEY, "hive-3.1.3"); // required engineType Label
labels.put(LabelKeyConstant.USER_CREATOR_TYPE_KEY, "hadoop-IDE");// required execute user and creator
labels.put(LabelKeyConstant.CODE_TYPE_KEY, "hql"); // required codeType
```
@@ -95,7 +95,7 @@ Note: After modifying the configuration under the `IDE` tag, you need to specify

```shell
sh ./bin/linkis-cli -creator IDE \
-engineType hive-2.3.3 -codeType hql \
-engineType hive-3.1.3 -codeType hql \
-code "show databases" \
-submitUser hadoop -proxyUser hadoop
```
@@ -116,7 +116,7 @@ Example of http request parameters
}
},
"labels": {
"engineType": "hive-2.3.3",
"engineType": "hive-3.1.3",
"userCreator": "hadoop-IDE"
}
}
@@ -128,7 +128,7 @@

```
linkis_ps_configuration_config_key: Insert the keys and default values of the configuration parameters of the engine
linkis_cg_manager_label: insert engine label such as: hive-2.3.3
linkis_cg_manager_label: insert engine label such as: hive-3.1.3
linkis_ps_configuration_category: Insert the directory association of the engine
linkis_ps_configuration_config_value: The configuration that the insertion engine needs to display
linkis_ps_configuration_key_engine_relation: The relationship between the configuration item and the engine
```
@@ -138,7 +138,7 @@ The initial data related to the engine in the table is as follows

```sql
-- set variable
SET @HIVE_LABEL="hive-2.3.3";
SET @HIVE_LABEL="hive-3.1.3";
SET @HIVE_ALL=CONCAT('*-*,',@HIVE_LABEL);
SET @HIVE_IDE=CONCAT('*-IDE,',@HIVE_LABEL);

```
76 changes: 26 additions & 50 deletions docs/engine-usage/impala.md
@@ -1,42 +1,32 @@
---
title: Impala
sidebar_position: 15
sidebar_position: 12
---

This article mainly introduces the installation, usage and configuration of the `Impala` engine plugin in `Linkis`.


## 1. Pre-work

### 1.1 Engine installation
### 1.1 Environment installation

If you want to use the `Impala` engine on your `Linkis` service, you need to prepare the Impala service and provide connection information, such as the connection address of the Impala cluster, SASL username and password, etc.
If you want to use the Impala engine on your server, you need to prepare the Impala service and provide connection information, such as the connection address of the Impala cluster, SASL user name and password, etc.

### 1.2 Service Verification
### 1.2 Environment verification

```shell
# prepare trino-cli
wget https://repo1.maven.org/maven2/io/trino/trino-cli/374/trino-cli-374-executable.jar
mv trino-cli-374-executable.jar trino-cli
chmod +x trino-cli

# Execute the task
./trino-cli --server localhost:8080 --execute 'show tables from system.jdbc'

# Get the following output to indicate that the service is available
"attributes"
"catalogs"
"columns"
"procedure_columns"
"procedures"
"pseudo_columns"
"schemas"
"super_tables"
"super_types"
"table_types"
"tables"
"types"
"udts"
```

Execute the impala-shell command to get the following output, indicating that the impala service is available.
```
[root@8f43473645b1 /]# impala-shell
Starting Impala Shell without Kerberos authentication
Connected to 8f43473645b1:21000
Server version: impalad version 2.12.0-cdh5.15.0 RELEASE (build 23f574543323301846b41fa5433690df32efe085)
***********************************************************************************
Welcome to the Impala shell.
(Impala Shell v2.12.0-cdh5.15.0 (23f5745) built on Thu May 24 04:07:31 PDT 2018)
When pretty-printing is disabled, you can use the '--output_delimiter' flag to set
the delimiter for fields in the same row. The default is ','.
***********************************************************************************
[8f43473645b1:21000] >
```

## 2. Engine plugin deployment
@@ -101,7 +91,7 @@ select * from linkis_cg_engine_conn_plugin_bml_resources;

```shell
sh ./bin/linkis-cli -submitUser impala \
-engineType impala-3.4.0 -code 'select * from default.test limit 10' \
-engineType impala-3.4.0 -code 'show databases;' \
-runtimeMap linkis.es.http.method=GET \
-runtimeMap linkis.impala.servers=127.0.0.1:21050
```
@@ -143,37 +133,23 @@ More `Linkis-Cli` command parameter reference: [Linkis-Cli usage](../user-guide/

If the default parameters do not meet your needs, you can configure some basic parameters in the following ways

#### 4.2.1 Management console configuration

![](./images/trino-config.png)

Note: After modifying the configuration under the `IDE` tag, you need to specify `-creator IDE` to take effect (other tags are similar), such as:

```shell
sh ./bin/linkis-cli -creator IDE -submitUser hadoop \
-engineType impala-3.4.0 -codeType sql \
-code 'select * from system.jdbc.schemas limit 10'
```

#### 4.2.2 Task interface configuration
#### 4.2.1 Task interface configuration
Submit the task interface and configure it through the parameter `params.configuration.runtime`

```shell
Example of http request parameters
{
    "executionContent": {"code": "select * from system.jdbc.schemas limit 10;", "runType": "sql"},
    "executionContent": {"code": "show databases;", "runType": "sql"},
    "params": {
        "variable": {},
        "configuration": {
            "runtime": {
                "linkis.trino.url": "http://127.0.0.1:8080",
                "linkis.trino.catalog": "hive",
                "linkis.trino.schema": "default",
                "linkis.impala.servers": "127.0.0.1:21050"
            }
        }
    },
    "labels": {
        "engineType": "trino-371",
        "engineType": "impala-3.4.0",
        "userCreator": "hadoop-IDE"
    }
}
```
@@ -185,7 +161,7 @@

```
linkis_ps_configuration_config_key: Insert the keys and default values of the configuration parameters of the engine
linkis_cg_manager_label: insert engine label such as: trino-375
linkis_cg_manager_label: insert engine label such as: impala-3.4.0
linkis_ps_configuration_category: Insert the directory association of the engine
linkis_ps_configuration_config_value: Insert the configuration that the engine needs to display
linkis_ps_configuration_key_engine_relation: the relationship between configuration items and engines
```
6 changes: 1 addition & 5 deletions docs/engine-usage/jdbc.md
@@ -202,11 +202,7 @@ Note: After modifying the configuration under the `IDE` tag, you need to specify
sh ./bin/linkis-cli -creator IDE \
-engineType jdbc-4 -codeType jdbc \
-code "show tables" \
-submitUser hadoop -proxyUser hadoop \
-runtimeMap wds.linkis.jdbc.connect.url=jdbc:mysql://127.0.0.1:3306 \
-runtimeMap wds.linkis.jdbc.driver=com.mysql.jdbc.Driver \
-runtimeMap wds.linkis.jdbc.username=root \
-runtimeMap wds.linkis.jdbc.password=123456 \
-submitUser hadoop -proxyUser hadoop
```

#### 4.2.2 Task interface configuration
4 changes: 2 additions & 2 deletions docs/engine-usage/overview.md
@@ -14,8 +14,8 @@ Supported engines and version information are as follows:

| Engine | Default Engine | Default Version |
|--------------| -- | ---- |
| [Spark](./spark.md) | Yes | 2.4.3 |
| [Hive](./hive.md) | Yes | 2.3.3 |
| [Spark](./spark.md) | Yes | 3.2.1 |
| [Hive](./hive.md) | Yes | 3.1.3 |
| [Python](./python.md) | Yes | python2 |
| [Shell](./shell.md) | Yes | 1 |
| [JDBC](./jdbc.md) | No | 4 |
