Add blog posts to the official website
ahaoyao committed Aug 4, 2023
1 parent b6e44f9 commit 9f12a99
Showing 48 changed files with 1,892 additions and 8 deletions.
2 changes: 1 addition & 1 deletion blog/2023-07-28-gateway-process-analysis.md
@@ -1,7 +1,7 @@
---
title: Linkis1.3.2 Gateway Process Analysis
authors: [ahaoyao]
-tags: [blog,linki1.3.0,service merge]
+tags: [blog,linkis1.3.2,gateway]
---
### Linkis 1.3.2 Process diagram

243 changes: 243 additions & 0 deletions blog/2023-08-03-analysis-of-engine-material-mg-function.md

Large diffs are not rendered by default.

139 changes: 139 additions & 0 deletions blog/2023-08-03-cdh-linkis-dss.md
@@ -0,0 +1,139 @@
---
title: [Practical Experience] Deploying Linkis 1.1.1 and DSS 1.1.0 based on CDH 6.3.2
authors: [kongslove]
tags: [blog,linkis1.1.1,cdh]
---
### Preface

With the growth of our business and the iteration of community products, we found that Linkis 1.x offers significant improvements in resource management and engine management, which better meet the needs of building a data platform. Compared with version 0.9.3 of the platform we used before, the user experience has also improved greatly, and issues such as being unable to view details on the task failure page have been resolved. We therefore decided to upgrade Linkis and the WDS suite. The following are the concrete steps we took, which we hope can serve as a reference.

### 1. Environment
CDH 6.3.2 component versions:
- hadoop:3.0.0-cdh6.3.2
- hive:2.1.1-cdh6.3.2
- spark:2.4.8

#### Hardware
2 × 128 GB physical cloud machines

### 2. Linkis Installation and Deployment

#### 2.1 Compile from source or use the release installation package?

This deployment uses the release installation package. To adapt to the company's CDH 6.3.2 environment, the Hadoop- and Hive-related dependency packages must be replaced with the CDH 6.3.2 versions; here they are replaced directly in the installation package. The modules involved and the dependency packages that need to be replaced are listed below.
```
--Modules involved
linkis-engineconn-plugins/spark
linkis-engineconn-plugins/hive
/linkis-commons/public-module
/linkis-computation-governance/
```
```
-----List of CDH packages that need to be replaced
./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hive-shims-0.23-2.1.1-cdh6.3.2.jar
./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hive-shims-scheduler-2.1.1-cdh6.3.2.jar
./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-annotations-3.0.0-cdh6.3.2.jar
./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-auth-3.0.0-cdh6.3.2.jar
./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-common-3.0.0-cdh6.3.2.jar
./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-hdfs-3.0.0-cdh6.3.2.jar
./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-hdfs-client-3.0.0-cdh6.3.2.jar
./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-client-3.0.0-cdh6.3.2.jar
./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-mapreduce-client-common-3.0.0-cdh6.3.2.jar
./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-mapreduce-client-jobclient-3.0.0-cdh6.3.2.jar
./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-yarn-api-3.0.0-cdh6.3.2.jar
./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-yarn-client-3.0.0-cdh6.3.2.jar
./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-yarn-server-common-3.0.0-cdh6.3.2.jar
./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-hdfs-client-3.0.0-cdh6.3.2.jar
./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-mapreduce-client-core-3.0.0-cdh6.3.2.jar
./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-mapreduce-client-shuffle-3.0.0-cdh6.3.2.jar
./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-yarn-common-3.0.0-cdh6.3.2.jar
./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-annotations-3.0.0-cdh6.3.2.jar
./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-auth-3.0.0-cdh6.3.2.jar
./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-mapreduce-client-core-3.0.0-cdh6.3.2.jar
./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-yarn-api-3.0.0-cdh6.3.2.jar
./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-yarn-client-3.0.0-cdh6.3.2.jar
./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-yarn-common-3.0.0-cdh6.3.2.jar
./lib/linkis-commons/public-module/hadoop-annotations-3.0.0-cdh6.3.2.jar
./lib/linkis-commons/public-module/hadoop-auth-3.0.0-cdh6.3.2.jar
./lib/linkis-commons/public-module/hadoop-common-3.0.0-cdh6.3.2.jar
./lib/linkis-commons/public-module/hadoop-hdfs-client-3.0.0-cdh6.3.2.jar
./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-annotations-3.0.0-cdh6.3.2.jar
./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-auth-3.0.0-cdh6.3.2.jar
./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-yarn-api-3.0.0-cdh6.3.2.jar
./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-yarn-client-3.0.0-cdh6.3.2.jar
./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-yarn-common-3.0.0-cdh6.3.2.jar
```
#### 2.2 Problems encountered during deployment
1. Kerberos configuration
The following needs to be added to the common linkis.properties configuration, and to the conf of each engine as well:
```
wds.linkis.keytab.enable=true
wds.linkis.keytab.file=/hadoop/bigdata/kerberos/keytab
wds.linkis.keytab.host.enabled=false
wds.linkis.keytab.host=your_host
```
2. After replacing the Hadoop dependency packages, startup fails with: java.lang.NoClassDefFoundError: org/apache/commons/configuration2/Configuration
![](/static/Images/blog/hadoop-start-error.png)
Reason: a Configuration class conflict. Add commons-configuration2-2.1.1.jar under the linkis-commons module to resolve the conflict.
3. Running Spark, Python, etc. in a script reports the error "no plugin for XXX"
Phenomenon: after modifying the Spark/Python version in the configuration file, starting the engine reports "no plugin for XXX"
![](/static/Images/blog/no-plugin-error.png)
Reason: the engine versions are hard-coded in the LabelCommonConfig.java and GovernanceCommonConf.scala classes. Modify the corresponding versions, recompile, and then replace every jar containing these two classes (linkis-computation-governance-common-1.1.1.jar and linkis-label-common-1.1.1.jar) in Linkis and the other components (including Schedulis).
4. Python engine execution error: initialization failed
- Modify python.py to remove the import of the pandas module
- Configure the Python load path by modifying the Python engine's linkis-engineconn.properties:
```
pythonVersion=/usr/local/bin/python3.6
```
5. Running a pyspark task fails with an error
![](/static/Images/blog/pyspark-task-error.png)
Reason: the PYSPARK_PYTHON version is not set.
Solution:
Set the following two parameters in /etc/profile:
```
export PYSPARK_PYTHON=/usr/local/bin/python3.6

export PYSPARK_DRIVER_PYTHON=/usr/local/bin/python3.6
```
6. Error when executing a pyspark task:
java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT
![](/static/Images/blog/pyspark-no-such-field-error.png)
Reason: Spark 2.4.8 ships with the Hive 1.2.1 packages, but our Hive has been upgraded to 2.1.1. The HIVE_STATS_JDBC_TIMEOUT parameter was removed in Hive 2, while the Spark-SQL code still calls it, which causes the error.
Solution: remove the HIVE_STATS_JDBC_TIMEOUT parameter from the spark-sql/hive code, recompile and repackage, and replace spark-hive_2.11-2.4.8.jar in Spark 2.4.8.
7. Proxy user exception when the JDBC engine executes a task
Phenomenon: user A executed JDBC task 1 and the engine was selected for reuse; then user B executed JDBC task 2, and the submitter of task 2 turned out to be user A.
Analysis:
ConnectionManager::getConnection
![](/static/Images/blog/jdbc-connection-manager.png)
When creating a datasource here, whether to create a new one is decided by the key, which is the JDBC URL. That granularity is too coarse, because different users may access the same datasource (for example Hive): their URLs are identical but their usernames and passwords differ. So when the first user creates the datasource, the username is already fixed; when the second user comes in and finds the datasource already exists, it is reused directly instead of a new one being created, and as a result the code submitted by user B is executed as user A.
Solution: reduce the key granularity of the datasource cache map by changing the key to jdbc.url + jdbc.user (a minimal sketch of the idea follows).
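
A minimal Java sketch of the fix described above, assuming a simple cache map (names and types are illustrative, not the actual Linkis JDBC engine code): the cache key combines the JDBC URL with the submitting user, so different users on the same URL get separate datasources.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DataSourceCache {

    // Placeholder for whatever pooled datasource type is actually used (e.g. DBCP/Druid).
    public interface PooledDataSource {}

    private final Map<String, PooledDataSource> cache = new ConcurrentHashMap<>();

    public PooledDataSource getDataSource(String jdbcUrl, String jdbcUser, String jdbcPassword) {
        // Before the fix the key was the jdbc URL alone, so user B silently reused
        // the datasource created with user A's credentials.
        String key = jdbcUrl + "#" + jdbcUser;
        return cache.computeIfAbsent(key, k -> createDataSource(jdbcUrl, jdbcUser, jdbcPassword));
    }

    private PooledDataSource createDataSource(String url, String user, String password) {
        // A real implementation would build a connection pool with these credentials.
        return new PooledDataSource() {};
    }
}
```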
### 3. DSS Deployment
Refer to the official website documentation for the installation and configuration steps. Below are some issues encountered during installation and debugging.
#### 3.1 The database list displayed on the left side of DSS is incomplete
Analysis: the database information displayed by the DSS datasource module comes from the Hive metastore, but because CDH 6 controls permissions through Sentry, most of the Hive table metadata is not in the Hive metastore, so the displayed data is incomplete.
Solution:
Change the original logic to connect to Hive via JDBC and obtain the table information through JDBC (see the sketch after this list).
Brief description of the logic:
The JDBC properties are taken from the IDE JDBC configuration on the Linkis console.
DBS: obtain the schemas via connection.getMetaData()
TBS: connection.getMetaData().getTables() gets the tables under the corresponding db
COLUMNS: obtain the column information of a table by executing describe table
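
As an illustration, a minimal, self-contained Java sketch of this JDBC-based metadata lookup might look as follows (the URL, credentials and table name are placeholders and the Hive JDBC driver must be on the classpath; this is not the actual DSS code):

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class HiveJdbcMetadataDemo {
    public static void main(String[] args) throws SQLException {
        // Placeholder connection info; in the setup above it comes from the
        // IDE JDBC configuration on the Linkis console.
        String url = "jdbc:hive2://hive-server:10000/default";
        try (Connection connection = DriverManager.getConnection(url, "user", "password")) {
            DatabaseMetaData meta = connection.getMetaData();

            // DBS: list the schemas (databases)
            try (ResultSet schemas = meta.getSchemas()) {
                while (schemas.next()) {
                    System.out.println("db: " + schemas.getString("TABLE_SCHEM"));
                }
            }

            // TBS: list the tables under one db
            try (ResultSet tables = meta.getTables(null, "default", "%", new String[]{"TABLE"})) {
                while (tables.next()) {
                    System.out.println("table: " + tables.getString("TABLE_NAME"));
                }
            }

            // COLUMNS: column information via `describe table`
            try (Statement stmt = connection.createStatement();
                 ResultSet cols = stmt.executeQuery("describe default.some_table")) {
                while (cols.next()) {
                    System.out.println(cols.getString(1) + "\t" + cols.getString(2));
                }
            }
        }
    }
}
```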
#### 3.2 Executing a JDBC script in a DSS workflow reports the error: jdbc.name is empty
Analysis: the default creator for tasks executed in a DSS workflow is Schedulis. Because no Schedulis-related engine parameters were configured in the management console, the parameters read were all empty.
When trying to add a Schedulis Category in the console, the error "The Schedulis directory already exists" occurred: because the creator in the scheduling system is Schedulis, a Schedulis Category cannot be added. To distinguish the systems more clearly, the default creator executed in the DSS workflow was changed to nodeexecution; this can be configured by adding the line wds.linkis.flow.job.creator.v1=nodeexecution to dss-flow-execution-server.properties.
75 changes: 75 additions & 0 deletions blog/2023-08-03-entrance-execution-analysis.md
@@ -0,0 +1,75 @@
---
title: [Source Code Interpretation] Linkis 1.1.1 Entrance Execution Analysis
authors: [guoshupei]
tags: [blog,linkis1.1.1,entrance]
---
### Preface

The following diagram is based on an analysis of the Linkis v1.1.1 source code and shows the Entrance service execution flow.
All of the explanation that follows revolves around this diagram, so please refer back to the full picture while reading. The approach is to break the whole into parts, connect the parts into lines, and assemble the lines back into the whole.
![](/static/Images/blog/entry-service-execution-process.jpg)

The figure above can be roughly divided into:
- Environment initialization area: the EntranceContext that is initialized when the Entrance service starts
- Submit task area: users call the EntranceRestfulApi interface to submit tasks; this also covers job construction, interceptor operations, etc.
- Execution area: handles the job submitted from the submit task area and covers all operations across the entire job lifecycle
![](/static/Images/blog/entrance-context.png)

### Environment initialization area
![](/static/Images/blog/env-init.png)
```
The Entrance functionality is finely divided and each part has its own responsibility, which makes it easy to extend. The injection of the whole environment can be seen in the EntranceSpringConfiguration configuration class; the components are introduced below from left to right.
PersistenceManager (QueryPersistenceManager) persistence management:
Its main object of action is the job, and operations such as state, progress and result persistence are defined on it. QueryPersistenceEngine and EntranceResultSetEngine are one implementation; if the storage type changes, an additional implementation is added, and switching is achieved by injecting the new class into the Entrance.
EntranceParser (CommonEntranceParser) parameter parser: it mainly has three methods, parseToTask (JSON -> request), parseToJob (request -> job) and parseToJobRequest (job -> request); the whole process can roughly be expressed as JSON -> request <=> job.
LogManager (CacheLogManager) log management: prints logs, updates error codes, etc.
Scheduler (ParallelScheduler) scheduler: responsible for job distribution, initializing the job execution environment, etc. Linkis groups jobs by tenant and task type, and many settings (such as parallelism and resources) are based on this grouping principle. It therefore has three important functional components and abstracts a SchedulerContext to manage the context environment:
1) GroupFactory (EntranceGroupFactory) group factory: creates groups and caches them with groupName as the key. A group mainly records parameters such as concurrency and thread count.
2) ConsumerManager (ParallelConsumerManager) consumer manager: creates a consumer per group, caches consumers with groupName as the key, and initializes a thread pool shared by all consumers. A consumer mainly stores jobs and submits and executes them.
3) ExecutorManager (EntranceExecutorManagerImpl) executor manager: creates an executor for each job and is responsible for all operations across the job's lifecycle.
EntranceInterceptor interceptor: all interceptors of the Entrance service.
EntranceEventListenerBus event listener service: a general event listener service, essentially a polling thread with a built-in thread pool of 5 threads; when an event is added, it is dispatched to the registered listeners according to the event type.
```

### Submit Task Area
![](/static/Images/blog/submit-task.png)
```
This is mainly explained through the user calling the execute() method of EntranceRestfulApi. There are four important steps:
parseToTask: after the request JSON is received, it is first converted into a request and persisted to the database through the PersistenceManager, obtaining the taskId
Call all interceptors (Interceptors)
parseToJob: convert the request into an EntranceExecutionJob, set the CodeParser, parse the job through job.init(), and build the SubJobInfo and SubJobDetail objects (as of v1.2.0 there is no longer a SubJob)
Submit the job to the Scheduler and obtain the execId
```

### Execution area
![](/static/Images/blog/excute-area.png)
```
ParallelGroup: stores parameters that FIFOUserConsumer will use; parameter changes do not take effect in real time.
FIFOUserConsumer:
1. It contains a consume queue (LoopArrayQueue), a ring queue of size maxCapacity. Jobs are added via the offer method; if the queue is full, None is returned and the caller reports an error (a simplified sketch of this queue behaviour follows this block).
2. It is essentially a thread: it polls the loop() method, takes one job at a time, creates an executor through the ExecutorManager, and submits the job to the thread pool.
3. The concurrency is determined by maxRunningJobs of the ParallelGroup, and jobs that need to be retried are taken first.
DefaultEntranceExecutor: the executor is responsible for monitoring the entire job submission, submitting one SubJobInfo at a time. General steps:
1. Asynchronously submit the Orchestrator and obtain an OrchestratorFuture
2. Register the dealResponse function on the OrchestratorFuture:
   dealResponse: if the SubJob succeeded, continue submitting if there is another SubJob, otherwise notify the job of success; if the SubJob failed, notify the job of failure, decide whether to retry, and recreate the executor
3. Create an EngineExecuteAsyncReturn and inject the OrchestratorFuture
Submission flow:
FIFOUserConsumer obtains a job through loop()
Obtain a DefaultEntranceExecutor and inject it into the job
Call the job's run method through the thread pool; inside the job this triggers the DefaultEntranceExecutor's execute
Submit the Orchestrator and wait for dealResponse to be called, which triggers notify
Update the job status, decide whether to retry, and continue submitting
```
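
As a side note, here is a simplified, single-threaded Java analogue of the bounded ring-queue behaviour described above, where offer fails when the queue is full (the real LoopArrayQueue in Linkis is Scala and differs in API and thread-safety):

```java
import java.util.Optional;

public class BoundedRingQueue<E> {
    private final Object[] items;
    private int head = 0;   // index of the next element to take
    private int size = 0;   // current number of elements

    public BoundedRingQueue(int maxCapacity) {
        this.items = new Object[maxCapacity];
    }

    /** Returns the slot index on success, or empty if the queue is full. */
    public Optional<Integer> offer(E e) {
        if (size == items.length) {
            return Optional.empty();             // queue full: the caller reports an error
        }
        int tail = (head + size) % items.length; // wrap around the ring
        items[tail] = e;
        size++;
        return Optional.of(tail);
    }

    @SuppressWarnings("unchecked")
    public Optional<E> take() {
        if (size == 0) {
            return Optional.empty();
        }
        E e = (E) items[head];
        items[head] = null;                      // drop the reference
        head = (head + 1) % items.length;
        size--;
        return Optional.of(e);
    }
}
```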