Commit bf721f3: 1.4.0 feature update
aiceflower committed Jul 19, 2023 (1 parent: 5595fbd)
Showing 28 changed files with 1,036 additions and 872 deletions.
4 changes: 2 additions & 2 deletions docs/deployment/deploy-quick.md
@@ -235,7 +235,7 @@ HADOOP_KEYTAB_PATH=/appcom/keytab/
>
> Note: Linkis has not yet adapted permissions for S3, so authorization cannot be granted on S3 paths.
- `vim linkis.properties`
+ `vim $LINKIS_HOME/conf/linkis.properties`
```shell script
# s3 file system
linkis.storage.s3.access.key=xxx
@@ -245,7 +245,7 @@
linkis.storage.s3.region=xxx
linkis.storage.s3.bucket=xxx
```
- `vim linkis-cg-entrance.properties`
+ `vim $LINKIS_HOME/conf/linkis-cg-entrance.properties`
```shell script
wds.linkis.entrance.config.log.path=s3:///linkis/logs
wds.linkis.resultSet.store.path=s3:///linkis/results
```
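After updating both files and restarting the affected services, a quick smoke test can confirm that results land in S3. The sketch below is illustrative only: it assumes `linkis-cli` with the `shell-1` engine and an AWS CLI configured for the same bucket, none of which this section strictly requires.

```shell script
# Submit a trivial job (engine type/version and user are placeholder assumptions)
sh $LINKIS_HOME/bin/linkis-cli -submitUser hadoop \
 -engineType shell-1 -codeType shell -code 'echo linkis-s3-smoke-test'
# If the AWS CLI can reach the same bucket, list the configured result path
aws s3 ls s3://linkis/results/ --recursive | tail
```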
76 changes: 26 additions & 50 deletions docs/engine-usage/impala.md
@@ -1,42 +1,32 @@
---
title: Impala
- sidebar_position: 15
+ sidebar_position: 12
---

This article mainly introduces the installation, usage and configuration of the `Impala` engine plugin in `Linkis`.


## 1. Pre-work

- ### 1.1 Engine installation
+ ### 1.1 Environment installation

- If you want to use the `Impala` engine on your `Linkis` service, you need to prepare the Impala service and provide connection information, such as the connection address of the Impala cluster, SASL username and password, etc.
+ If you want to use the Impala engine on your server, you need to prepare the Impala service and provide connection information, such as the connection address of the Impala cluster, SASL username and password, etc.

- ### 1.2 Service Verification
+ ### 1.2 Environment verification

- ```shell
- # prepare trino-cli
- wget https://repo1.maven.org/maven2/io/trino/trino-cli/374/trino-cli-374-executable.jar
- mv trino-cli-374-executable.jar trino-cli
- chmod +x trino-cli
-
- # Execute the task
- ./trino-cli --server localhost:8080 --execute 'show tables from system.jdbc'
-
- # Get the following output to indicate that the service is available
- "attributes"
- "catalogs"
- "columns"
- "procedure_columns"
- "procedures"
- "pseudo_columns"
- "schemas"
- "super_tables"
- "super_types"
- "table_types"
- "tables"
- "types"
- "udts"
- ```
+ Execute the impala-shell command; output like the following indicates that the impala service is available.
```
[root@8f43473645b1 /]# impala-shell
Starting Impala Shell without Kerberos authentication
Connected to 8f43473645b1:21000
Server version: impalad version 2.12.0-cdh5.15.0 RELEASE (build 23f574543323301846b41fa5433690df32efe085)
***********************************************************************************
Welcome to the Impala shell.
(Impala Shell v2.12.0-cdh5.15.0 (23f5745) built on Thu May 24 04:07:31 PDT 2018)
When pretty-printing is disabled, you can use the '--output_delimiter' flag to set
the delimiter for fields in the same row. The default is ','.
***********************************************************************************
[8f43473645b1:21000] >
```
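The same check can be scripted non-interactively; a minimal sketch, assuming impalad's default shell port 21000 on the local host:

```shell
# -i sets the impalad host:port to connect to, -q runs one statement and exits
impala-shell -i 127.0.0.1:21000 -q 'show databases;'
```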

## 2. Engine plugin deployment
@@ -101,7 +91,7 @@ select * from linkis_cg_engine_conn_plugin_bml_resources;

```shell
sh ./bin/linkis-cli -submitUser impala \
- -engineType impala-3.4.0 -code 'select * from default.test limit 10' \
+ -engineType impala-3.4.0 -code 'show databases;' \
-runtimeMap linkis.es.http.method=GET \
-runtimeMap linkis.impala.servers=127.0.0.1:21050
```
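A variant of the same submission that pins the code type explicitly may also be useful; a sketch (`-codeType sql` follows the pattern of the other engine docs and is an assumption here):

```shell
sh ./bin/linkis-cli -submitUser impala \
 -engineType impala-3.4.0 -codeType sql -code 'show databases;' \
 -runtimeMap linkis.impala.servers=127.0.0.1:21050
```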
@@ -143,37 +133,23 @@ More `Linkis-Cli` command parameter reference: [Linkis-Cli usage](../user-guide/

If the default parameters do not meet your needs, some basic parameters can be configured in the following ways

- #### 4.2.1 Management console configuration
-
- ![](./images/trino-config.png)
-
- Note: After modifying the configuration under the `IDE` tag, you need to specify `-creator IDE` to take effect (other tags are similar), such as:
-
- ```shell
- sh ./bin/linkis-cli -creator IDE -submitUser hadoop \
-  -engineType impala-3.4.0 -codeType sql \
-  -code 'select * from system.jdbc.schemas limit 10'
- ```
-
- #### 4.2.2 Task interface configuration
+ #### 4.2.1 Task interface configuration
When submitting a task through the interface, it can be configured via the parameter `params.configuration.runtime`

```shell
Example of http request parameters
{
"executionContent": {"code": "select * from system.jdbc.schemas limit 10;", "runType": "sql"},
"executionContent": {"code": "show databases;", "runType": "sql"},
"params": {
"variable": {},
"configuration": {
"runtime": {
"linkis.trino.url":"http://127.0.0.1:8080",
"linkis.trino.catalog ":"hive",
"linkis.trino.schema ":"default"
}
"linkis.impala.servers"="127.0.0.1:21050"
}
},
}
},
"labels": {
"engineType": "trino-371",
"engineType": "impala-3.4.0",
"userCreator": "hadoop-IDE"
}
}
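For illustration, a request body like the above could be posted to the entrance submit interface roughly as follows; this is a sketch, and the gateway address, port, and token headers are deployment-specific assumptions:

```shell
# Save the request body above as impala_task.json, then POST it via the gateway
curl -X POST "http://127.0.0.1:9001/api/rest_j/v1/entrance/submit" \
 -H "Content-Type: application/json" \
 -H "Token-Code: xxx" -H "Token-User: hadoop" \
 -d @impala_task.json
```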
@@ -185,7 +161,7 @@ Example of http request parameters

```
linkis_ps_configuration_config_key: Insert the key and default values of the configuration parameters of the engine
- linkis_cg_manager_label: insert engine label such as: trino-375
+ linkis_cg_manager_label: insert engine label such as: impala-3.4.0
linkis_ps_configuration_category: Insert the directory association of the engine
linkis_ps_configuration_config_value: Insert the configuration that the engine needs to display
linkis_ps_configuration_key_engine_relation: the relationship between configuration items and engines
```
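After running the inserts, a check along these lines can confirm that the label and configuration keys exist (a sketch; database name, credentials, and column names may vary across Linkis versions):

```shell
mysql -h 127.0.0.1 -u linkis -p -e "
 select * from linkis_cg_manager_label where label_value like '%impala%';
 select * from linkis_ps_configuration_config_key where engine_conn_type = 'impala';" linkis
```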
72 changes: 55 additions & 17 deletions docs/feature/base-engine-compatibilty.md
@@ -1,36 +1,74 @@
---
- title: reduce base engine compatibility issues
+ title: Base Engine Dependency, Compatibility, Default Version Optimization
sidebar_position: 0.2
---

- ## 1. Requirement Background
- before we may need to modify linkis source code to fit different hive and spark version and compilation may fail for some certain versions, we need to reduce compilation and installation problems caused by base engine versions
+ ## 1. Requirement background
+ 1. Lower versions of Linkis require source changes to adapt to different versions of Hive, Spark, etc., and compilation can fail for certain versions; these base engine compatibility issues should be reduced.
+ 2. Hadoop, Hive, and Spark 3.x are all very mature, and lower engine versions carry potential risks. Many community users run the 3.x versions by default, so consider changing the default compiled version of Linkis to 3.x.

## 2. Instructions for use
- for different hive compilation, we only to compile with target hive versions, such as
- ```
- mvn clean install package -Dhive.version=3.1.3
- ```
-
- for different spark compilation, we only to compile with target spark versions, here are normal scenes for usage.
- ```
- spark3+hadoop3
- mvn install package
- spark3+hadoop2
- mvn install package -Phadoop-2.7
- spark2+hadoop2
- mvn install package -Pspark-2.4 -Phadoop-2.7
- spark2+hadoop3
- mvn install package -Pspark-2.4
- ```
+ ## 2.1 Default version adjustment instructions
+
+ Linkis 1.4.0 changes the default versions of Hadoop, Hive, and Spark to 3.x; the specific versions are Hadoop 3.3.4, Hive 3.1.3, and Spark 3.2.1.
+
+ ## 2.2 Different version adaptation
+
+ To compile against a different hive version, we only need to specify `-Dhive.version=xxx`, for example:
+ ```
+ mvn clean install package -Dhive.version=2.3.3
+ ```
+ To compile against different spark and hadoop combinations, we only need to specify the corresponding profiles. Common usage scenarios are as follows:
+ ```
+ # spark3 + hadoop3
+ mvn install package
+ # spark3 + hadoop2
+ mvn install package -Phadoop-2.7
+ # spark2 + hadoop2
+ mvn install package -Pspark-2.4 -Phadoop-2.7
+ # spark2 + hadoop3
+ mvn install package -Pspark-2.4
+ ```
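To confirm which base versions a given profile combination actually resolves, the standard maven-help-plugin can be queried; a sketch (the property names are inferred from the `-D` flags used above):

```
mvn help:evaluate -Dexpression=hive.version -q -DforceStdout
mvn help:evaluate -Dexpression=spark.version -q -DforceStdout
mvn help:evaluate -Dexpression=hadoop.version -q -DforceStdout
```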
## 3. Precautions
- spark subversion can be specified by -Dspark.version=xxx
- hadoop subversion can be specified by -Dhadoop.version=xxx
+ 1. When the default version is compiled, the base versions are: hadoop 3.3.4 + hive 3.1.3 + spark 3.2.1
+ ```
+ mvn install package
+ ```
+ Due to the default version upgrade of the base engines, the `spark-3.2`, `hadoop-3.3` and `spark-2.4-hadoop-3.3` profiles were removed, and the `hadoop-2.7` and `spark-2.4` profiles were added.
+
+ 2. The spark sub-version can be specified by `-Dspark.version=xxx`. The default scala version used by the system is 2.12.17, which suits spark 3.x. To compile spark 2.x, scala 2.11 is required: compile with the `-Pspark-2.4` profile, or with `-Dspark.version=2.xx -Dscala.version=2.11.12 -Dscala.binary.version=2.11`.
+
+ 3. The hadoop sub-version can be specified by `-Dhadoop.version=xxx`

for example:
- mvn install package -Pspark-3.2 -Phadoop-3.3 -Dspark.version=3.1.3
+ ```
+ mvn install package -Pspark-3.2 -Phadoop-3.3 -Dspark.version=3.1.3
+ ```
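As a concrete sketch of the two spark 2.x routes from item 2 above (the explicit 2.4.3 version is only an illustrative assumption):

```
# via the profile
mvn install package -Pspark-2.4
# or by pinning the versions explicitly
mvn install package -Dspark.version=2.4.3 -Dscala.version=2.11.12 -Dscala.binary.version=2.11
```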

4. Hive version 2.x needs to depend on jersey, and Hive EC does not add the jersey dependency by default when compiling. You can compile it following the guidelines below.

**Compile hive version 2.3.3**

When compiling hive EC, a profile that adds the jersey dependencies is activated by default when version 2.3.3 is specified. Users can compile by specifying the `-Dhive.version=2.3.3` parameter.

**Compile other hive 2.x versions**

Modify the linkis-engineconn-plugins/hive/pom.xml file, changing 2.3.3 to the version to be compiled, such as 2.1.0:
```xml
<profile>
<id>hive-jersey-dependencies</id>
<activation>
<property>
<name>hive.version</name>
<!-- <value>2.3.3</value> -->
<value>2.1.0</value>
</property>
</activation>
...
</profile>
```
Add the `-Dhive.version=2.1.0` parameter when compiling.
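Putting this together for the hive 2.1.0 case, the full build command would be the following sketch (assuming the pom.xml change above has been applied):

```
mvn clean install package -Dhive.version=2.1.0
```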