diff --git a/blog/2023-07-28-gateway-process-analysis.md b/blog/2023-07-28-gateway-process-analysis.md index ebf4264317f..fc34947d95e 100644 --- a/blog/2023-07-28-gateway-process-analysis.md +++ b/blog/2023-07-28-gateway-process-analysis.md @@ -1,7 +1,7 @@ --- title: Linkis1.3.2 Gateway Process Analysis authors: [ahaoyao] -tags: [blog,linki1.3.0,service merge] +tags: [blog,linkis1.3.2,gateway] --- ### Linkis 1.3.2 Process diagram diff --git a/blog/2023-08-03-analysis-of-engine-material-mg-function.md b/blog/2023-08-03-analysis-of-engine-material-mg-function.md new file mode 100644 index 00000000000..3cc1a2cd2e5 --- /dev/null +++ b/blog/2023-08-03-analysis-of-engine-material-mg-function.md @@ -0,0 +1,243 @@ +--- +title: [Source Code Interpretation] Analysis of the Material Management Function of the Linkis Engine +authors: [CCweixiao] +tags: [blog,linkis] +--- +### Catalog Guide + +``` +Introduction: This article takes the engine related material management process as the starting point, combined with the underlying data model and source code, to provide a detailed analysis of the implementation details of the engine's material management function, hoping to help everyone better understand the architecture of BML (Material Warehouse) services. +``` + +#### 1. BML Material Warehouse Service + +The BML material library is a functional module under the Public Enhancement Service (PS) - public enhanced service architecture in Linkis. +![](/static/Images/blog/public-enhancement-service.png) + +In the architecture system of Linkis, the concept of materials refers to various file data that is uniformly stored and managed, including script code, resource files, third-party jars, relevant class libraries and configuration files required for engine startup, and keytab files used for security authentication. + +In short, any data that exists as a file can be centrally hosted in the material library and then downloaded and used in their respective scenarios. + +Material services are stateless and can be deployed with multiple instances to achieve high service availability. Each instance provides independent services to the outside world without interference. All material metadata and version information are shared in the database, and the underlying material data can be stored in HDFS or local (shared) file systems. It also supports the implementation of file storage related interfaces and the expansion of other file storage systems. + +The material service provides precise permission control, and materials of engine resource type can be shared and accessed by all users; For some material data containing sensitive information, it is also possible to achieve limited user readability. + +The material file adopts an appending method, which can merge multiple versions of resource files into one large file to avoid generating too many HDFS small files. Excessive HDFS small files can lead to a decrease in the overall performance of HDFS. + +Material service provides lifecycle management for operational tasks such as file upload, update, and download. At the same time, there are two forms of using material services: rest interface and SDK, and users can choose according to their own needs. + +The BML architecture diagram is as follows: +![](/static/Images/blog/bml-service.png) + +The above overview of BML architecture can be found in the official website documentation:https://linkis.apache.org/zh-CN/docs/latest/architecture/public_enhancement_services/bml + +#### 2. 
BML Material Warehouse Service: Underlying Table Model
Before diving into the process details of BML material management, it is necessary to first sort out the database table model that the BML material management service relies on underneath.
![](/static/Images/blog/linkis-ps-bml.png)

The field meanings of the bml resources related tables, and the relationships between them, can be understood by reading Linkis's linkis_ddl.sql file together with the engine material upload and update process described below.

#### 3. Usage scenarios of the BML material warehouse service
Currently, the usage scenarios of the BML material warehouse service in Linkis include:
- Engine material files, i.e. the files under conf and lib required for engine startup
- Script storage, e.g. the scripts linked to workflow task nodes are stored in the BML material library
- Workflow content version management in DSS
- Management of resource files required at task runtime

#### 4. Analysis of the Engine Material Management Process
Engine materials are a subset of the Linkis material concept: they provide the latest version of jar package resources and configuration files for engine startup. This section analyzes how engine material data flows through BML, taking the engine material management function as the entry point.

##### 4.1 Engine Material Description
After the Linkis installation package has been deployed normally, all engine material directories can be found under LINKIS_INSTALL_HOME/lib/linkis-engineconn-plugins. Taking the jdbc engine as an example, the structure of its material directory is as follows:
```
jdbc
├── dist
│   └── v4
│       ├── conf
│       ├── conf.zip
│       ├── lib
│       └── lib.zip
└── plugin
    └── 4
        └── linkis-engineplugin-jdbc-1.1.2.jar
```
Material directory composition:
```
jdbc/dist/<version>/conf.zip
jdbc/dist/<version>/lib.zip

jdbc/plugin/<version, drop the v and keep the number>/linkis-engineplugin-<engine name>-1.1.x.jar
```
conf.zip and lib.zip are hosted as engine materials in the material management service. Each time conf or lib is modified locally, a new version number is generated for the corresponding material and the material file data is re-uploaded. When the engine starts, it fetches the latest version of the material data, loads lib and conf, and launches the engine's Java process.

##### 4.2 Engine Material Upload and Update Process
When Linkis is deployed and started for the first time, the engine materials (lib.zip and conf.zip) are uploaded to the material library for the first time. When jar packages under the engine's lib or configuration files under conf are modified, the engine material refresh mechanism needs to be triggered so that the latest material files can be loaded when the engine starts.

Taking the current Linkis 1.1.
x as an example, there are two ways to trigger an engine material refresh:

Restart the engineplugin service with the command sh sbin/linkis-daemon.sh restart cg-engineplugin

Call the engine material refresh interface
```
# Refresh all engine materials
curl --cookie "linkis_user_session_ticket_id_v1=kN4HCk555Aw04udC1Npi4ttKa3duaCOv2HLiVea4FcQ=" http://127.0.0.1:9001/api/rest_j/v1/engineplugin/refreshAll
# Specify engine type and version to refresh materials
curl --cookie "linkis_user_session_ticket_id_v1=kN4HCk555Aw04udC1Npi4ttKa3duaCOv2HLiVea4FcQ=" http://127.0.0.1:9001/api/rest_j/v1/engineplugin/refresh?ecType=jdbc&version=4
```
The underlying implementation of these two refresh methods is the same: both call the refreshAll() or refresh() methods of the EngineConnResourceService class.

Inside the init() method of DefaultEngineConnResourceService, the default implementation of the abstract class EngineConnResourceService, the parameter wds.linkis.engineconn.dist.load.enable (default true) controls whether refreshAll(false) is executed every time the engineplugin service starts, to check all engine materials for updates (false means the execution result is obtained asynchronously).
```
The init() method is annotated with @PostConstruct, so it is executed exactly once after DefaultEngineConnResourceService is loaded and before the object is used.
```

Manually calling the engineplugin/refresh interface means manually executing the refreshAll or refresh methods of the EngineConnResourceService class.

So the logic for detecting and updating engine materials lives in the refreshAll and refresh methods of DefaultEngineConnResourceService.
The core logic of refreshAll() is:

1) Obtain the engine installation directory through the parameter wds.linkis.engineconn.home. The default is:
```
getEngineConnsHome = Configuration.getLinkisHome() + "/lib/linkis-engineconn-plugins";
```
2) Traverse the engine directory
```
getEngineConnTypeListFromDisk: Array[String] = new File(getEngineConnsHome).listFiles().map(_.getName)
```
3) The EngineConnBmlResourceGenerator interface provides legality checks on the underlying files or directories of each engine (version). The corresponding implementation lives in the abstract class AbstractEngineConnBmlResourceGenerator.

4) The DefaultEngineConnBmlResourceGenerator class is mainly responsible for generating EngineConnLocalizeResource objects. EngineConnLocalizeResource is the encapsulation of a material resource file's metadata and its InputStream; in the subsequent logic it is passed along as the material parameter in the upload process.

The code details of the three files EngineConnBmlResourceGenerator, AbstractEngineConnBmlResourceGenerator and DefaultEngineConnBmlResourceGenerator will not be covered line by line; their inheritance relationship is roughly sketched below, and the concrete method implementations can be read alongside it to understand this part of the functionality.
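A condensed sketch of how the three types relate. The member names are the ones quoted in this article; the return types and the exact split of members between the types are simplified and illustrative, not the verbatim Linkis source:
```
import java.io.InputStream

// Stand-in for the real class: material file metadata plus the InputStream that will be uploaded.
trait EngineConnLocalizeResource {
  def fileName: String
  def getFileInputStream: InputStream
}

// Entry point used by refreshAll(): list the engine types on disk and
// generate the localized resources of every version of one engine type.
trait EngineConnBmlResourceGenerator {
  def getEngineConnTypeListFromDisk: Array[String]
  def generate(engineConnType: String): Map[String, Array[EngineConnLocalizeResource]]
}

// Common part: resolves wds.linkis.engineconn.home and checks that each
// engine/version directory on disk is legal before anything is generated.
abstract class AbstractEngineConnBmlResourceGenerator extends EngineConnBmlResourceGenerator {
  protected def getEngineConnsHome: String
  protected def checkEngineConnDistHome(engineConnType: String, version: String): Unit
}

// Walks dist/<version>/ and wraps conf.zip / lib.zip into EngineConnLocalizeResource instances.
class DefaultEngineConnBmlResourceGenerator extends AbstractEngineConnBmlResourceGenerator {
  override protected def getEngineConnsHome: String = ???
  override protected def checkEngineConnDistHome(engineConnType: String, version: String): Unit = ???
  override def getEngineConnTypeListFromDisk: Array[String] = ???
  override def generate(engineConnType: String): Map[String, Array[EngineConnLocalizeResource]] = ???
}
```
In other words, the abstract class carries the shared directory-validation plumbing, while DefaultEngineConnBmlResourceGenerator is the concrete piece that actually produces the EngineConnLocalizeResource objects consumed by the refresh flow.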
Return to the refreshAll method in the DefaultEngineConnResourceService class and continue with the core process of the refreshTask thread:
```
engineConnBmlResourceGenerator.getEngineConnTypeListFromDisk foreach { engineConnType =>
  Utils.tryCatch {
    engineConnBmlResourceGenerator.generate(engineConnType).foreach {
      case (version, localize) =>
        logger.info(s" Try to initialize ${engineConnType}EngineConn-$version.")
        refresh(localize, engineConnType, version)
    }
  }
  ......
}
```
The engine installation directory is scanned to obtain the list of engine material directories. After the structure of each engine material directory passes the legality check, the corresponding EngineConnLocalizeResource objects are obtained, and the subsequent material upload work is completed by calling refresh(localize: Array[EngineConnLocalizeResource], engineConnType: String, version: String).

Within the refresh() method, the main steps are:

The material list data corresponding to engineConnType and version is obtained from the table linkis_cg_engine_conn_plugin_bml_resources and assigned to the variable engineConnBmlResources.
```
val engineConnBmlResources = asScalaBuffer(engineConnBmlResourceDao.getAllEngineConnBmlResource(engineConnType, version))
```

###### 4.2.1 Engine Material Upload Process
Sequence diagram of the engine material upload process

If there is no matching data in the table linkis_cg_engine_conn_plugin_bml_resources, an EngineConnBmlResource object needs to be constructed from the data in the EngineConnLocalizeResource and saved into linkis_cg_engine_conn_plugin_bml_resources. Before this record is saved, the upload of the material file itself has to be completed, i.e. the uploadToBml(localizeResource) method is executed.

Inside the uploadToBml(localizeResource) method, a bmlClient is constructed and used to call the material upload interface:
```
private val bmlClient = BmlClientFactory.createBmlClient()
bmlClient.uploadResource(Utils.getJvmUser, localizeResource.fileName, localizeResource.getFileInputStream)
```
On the BML server side, the material upload entry point is the uploadResource method of the BmlRestfulApi class. The main step it goes through is:
```
ResourceTask resourceTask = taskService.createUploadTask(files, user, properties);
```
Each material upload constructs a ResourceTask to carry out the file upload and to record the execution of the upload task. The main operations completed inside the createUploadTask method are:

1) Generate a globally unique resource_id for the uploaded resource file: String resourceId = UUID.randomUUID().toString();

2) Build the ResourceTask record, store it in the table linkis_ps_bml_resources_task, and perform the series of subsequent task status updates.
```
ResourceTask resourceTask = ResourceTask.createUploadTask(resourceId, user, properties);
taskDao.insert(resourceTask);

taskDao.updateState(resourceTask.getId(), TaskState.RUNNING.getValue(), new Date());
```
3) The actual writing of the material file into the material library is completed by the upload method in the ResourceServiceImpl class.
Within the upload method, the byte streams corresponding to the List<MultipartFile> files are persisted into the material library's file storage system, and the properties of the material file are stored in the resource record table (linkis_ps_bml_resources) and the resource version record table (linkis_ps_bml_resources_version).
```
MultipartFile p = files[0];
String resourceId = (String) properties.get("resourceId");
String fileName = new String(p.getOriginalFilename().getBytes(Constant.ISO_ENCODE),
Constant.UTF8_ENCODE);
fileName = resourceId;
String path = resourceHelper.generatePath(user, fileName, properties);
// generatePath currently supports Local and HDFS paths; the path construction rules are implemented
// by the generatePath method of LocalResourceHelper or HdfsResourceHelper
StringBuilder sb = new StringBuilder();
long size = resourceHelper.upload(path, user, inputStream, sb, true);
// the file size calculation and the writing of the file byte stream are implemented by the upload
// method of LocalResourceHelper or HdfsResourceHelper
Resource resource = Resource.createNewResource(resourceId, user, fileName, properties);
// insert a record into the resource table linkis_ps_bml_resources
long id = resourceDao.uploadResource(resource);
// add a record to the resource version table linkis_ps_bml_resources_version; the version number at this point is Constant.FIRST_VERSION
// besides the metadata of this version, the most important thing recorded is the storage location of this version of the file,
// including the file path, start position and end position
String clientIp = (String) properties.get("clientIp");
ResourceVersion resourceVersion = ResourceVersion.createNewResourceVersion(
resourceId, path, md5String, clientIp, size, Constant.FIRST_VERSION, 1);
versionDao.insertNewVersion(resourceVersion);
```
Once the above process has executed successfully, the material data has actually been written. The UploadResult is then returned to the client and the status of the ResourceTask is marked as completed; if an error occurs while uploading the file, the status of the ResourceTask is marked as failed and the exception information is recorded.

###### 4.2.2 Engine Material Update Process
Sequence diagram of the engine material update process

If local material data is matched in the table linkis_cg_engine_conn_plugin_bml_resources, an EngineConnBmlResource object needs to be constructed from the data in the EngineConnLocalizeResource, and the metadata of the material file in linkis_cg_engine_conn_plugin_bml_resources, such as the version number, file size and modification time, needs to be updated. Before this record is updated, the update-and-upload of the material file itself has to be completed, i.e. the uploadToBml(localizeResource, engineConnBmlResource.getBmlResourceId) method is executed.

Inside the uploadToBml(localizeResource, resourceId) method, a bmlClient is constructed and used to call the material resource update interface:
```
private val bmlClient = BmlClientFactory.createBmlClient()
bmlClient.updateResource(Utils.getJvmUser, resourceId, localizeResource.fileName, localizeResource.getFileInputStream)
```
On the BML server side, the material update entry point is the updateVersion method of the BmlRestfulApi class. The main process is:

First, the validity of resourceId is checked, i.e. whether the incoming resourceId exists in the linkis_ps_bml_resources table. If this resourceId does not exist, an exception is thrown to the client, causing the material update operation to fail at the interface level.
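The resourceId used here is the bmlResourceId that the engine-plugin side read from linkis_cg_engine_conn_plugin_bml_resources and passed into bmlClient.updateResource(...). For orientation, the upload branch of 4.2.1 and the update branch of this section sit side by side inside refresh(); a condensed model of that branch is sketched below. All types and helper names are simplified stand-ins based on the flow described above, not the verbatim DefaultEngineConnResourceService source:
```
// Simplified model of the branch inside refresh(localize, engineConnType, version).
case class LocalizeResource(fileName: String)
case class BmlRecord(bmlResourceId: String, fileName: String, var version: String)

def refreshSketch(localize: Array[LocalizeResource],
                  existing: Seq[BmlRecord],                     // rows of linkis_cg_engine_conn_plugin_bml_resources
                  upload: LocalizeResource => (String, String), // uploadToBml(localizeResource): (resourceId, version)
                  update: (LocalizeResource, String) => String, // uploadToBml(localizeResource, bmlResourceId): new version
                  save: BmlRecord => Unit,                      // insert a new metadata row
                  updateMeta: BmlRecord => Unit): Unit =        // update version / file size / modify time
  localize.foreach { res =>
    existing.find(_.fileName == res.fileName) match {
      case None =>                                 // 4.2.1: no metadata row yet -> first upload
        val (resourceId, version) = upload(res)
        save(BmlRecord(resourceId, res.fileName, version))
      case Some(record) =>                         // 4.2.2: row exists -> push a new BML version, refresh metadata
        record.version = update(res, record.bmlResourceId)
        updateMeta(record)
    }
  }
```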
So the correspondence between the data in linkis_cg_engine_conn_plugin_bml_resources and linkis_ps_bml_resources needs to be kept complete; otherwise an error that the material file cannot be updated may appear.
```
resourceService.checkResourceId(resourceId)
```
If the resourceId exists in the linkis_ps_bml_resources table, execution continues with:
```
StringUtils.isEmpty(versionService.getNewestVersion(resourceId))
```
The getNewestVersion method obtains the largest version number of this resourceId from the table linkis_ps_bml_resources_version. If the largest version corresponding to the resourceId is empty, the material update will also fail, so the integrity of this correspondence also needs to be strictly guaranteed.

After both of the above checks pass, a ResourceUpdateTask is created to complete the final file writing and the updating and saving of the records.
```
ResourceTask resourceTask = null;
synchronized (resourceId.intern()) {
resourceTask = taskService.createUpdateTask(resourceId, user, file, properties);
}
```
The main work done inside the createUpdateTask method is:
```
// generate a new version for the material Resource
String lastVersion = getResourceLastVersion(resourceId);
String newVersion = generateNewVersion(lastVersion);
// then build the ResourceTask and maintain its status
ResourceTask resourceTask = ResourceTask.createUpdateTask(resourceId, newVersion, user, system, properties);
// the upload logic of the material update is completed by the versionService.updateVersion method
versionService.updateVersion(resourceTask.getResourceId(), user, file, properties);
```
The main work done inside the versionService.updateVersion method is:
```
ResourceHelper resourceHelper = ResourceHelperFactory.getResourceHelper();
InputStream inputStream = file.getInputStream();
// get the path of the resource
String newVersion = params.get("newVersion").toString();
String path = versionDao.getResourcePath(resourceId) + "_" + newVersion;
// getResourcePath takes one row (limit 1) of the original path and appends newVersion with "_"
// select resource from linkis_ps_bml_resources_version WHERE resource_id = #{resourceId} limit 1
// upload the resource to hdfs or local
StringBuilder stringBuilder = new StringBuilder();
long size = resourceHelper.upload(path, user, inputStream, stringBuilder, OVER_WRITE);
// finally insert a new resource version record into the linkis_ps_bml_resources_version table
ResourceVersion resourceVersion = ResourceVersion.createNewResourceVersion(resourceId, path, md5String, clientIp, size, newVersion, 1);
versionDao.insertNewVersion(resourceVersion);
```
#### 5. Summary
This article took the Linkis engine material management function as its starting point, outlined the architecture of the BML material service, and, combined with the underlying source code, analyzed in detail the concept of engine materials as well as the processes of uploading, updating and version management of engine materials.
\ No newline at end of file
diff --git a/blog/2023-08-03-cdh-linkis-dss.md b/blog/2023-08-03-cdh-linkis-dss.md
new file mode 100644
index 00000000000..306af5f5dc0
--- /dev/null
+++ b/blog/2023-08-03-cdh-linkis-dss.md
@@ -0,0 +1,139 @@
---
title: [Practical Experience] Deploying Linkis1.1.1 and DSS1.1.0 based on CDH6.3.2
authors: [kongslove]
tags: [blog,linkis1.1.1,cdh]
---
### Preface

With the development of our business and the updating and iteration of community products, we have found that Linkis1.
X has significant performance improvements in resource management and engine management, which can better meet the needs of data center construction. Compared to version 0.9.3 and the platform we used before, the user experience has also been greatly improved, and issues such as the inability to view details on task failure pages have also been improved. Therefore, we have decided to upgrade Linkis and the WDS suite. The following are specific practical operations, hoping to provide reference for everyone. + +### 一、Environment +CDH6.3.2 Version of each component +- hadoop:3.0.0-cdh6.3.2 +- hive:2.1.1-cdh6.3.2 +- spark:2.4.8 + +#### Hardware +2台 128G Cloud physics machine + +### 二、Linkis Installation Deployment + +#### 2.1Compile code or release installation package? + +This installation and deployment is using the release installation package method. In order to adapt to the CDH6.3.2 version within the company, the dependency packages related to Hadoop and hive need to be replaced with the CDH6.3.2 version. Here, the installation package is directly replaced. The dependent packages and modules that need to be replaced are shown in the following list.``` +--Modules involved +linkis-engineconn-plugins/spark +linkis-engineconn-plugins/hive +/linkis-commons/public-module +/linkis-computation-governance/ +``` +``` +-----List of CDH packages that need to be replaced +./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hive-shims-0.23-2.1.1-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hive-shims-scheduler-2.1.1-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-annotations-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-auth-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-common-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-hdfs-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-hdfs-client-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-client-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-mapreduce-client-common-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-mapreduce-client-jobclient-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-yarn-api-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-yarn-client-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-yarn-server-common-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-hdfs-client-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-mapreduce-client-core-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-mapreduce-client-shuffle-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-yarn-common-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-annotations-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-auth-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-mapreduce-client-core-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-yarn-api-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-yarn-client-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-yarn-common-3.0.0-cdh6.3.2.jar 
./lib/linkis-commons/public-module/hadoop-annotations-3.0.0-cdh6.3.2.jar
./lib/linkis-commons/public-module/hadoop-auth-3.0.0-cdh6.3.2.jar
./lib/linkis-commons/public-module/hadoop-common-3.0.0-cdh6.3.2.jar
./lib/linkis-commons/public-module/hadoop-hdfs-client-3.0.0-cdh6.3.2.jar
./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-annotations-3.0.0-cdh6.3.2.jar
./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-auth-3.0.0-cdh6.3.2.jar
./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-yarn-api-3.0.0-cdh6.3.2.jar
./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-yarn-client-3.0.0-cdh6.3.2.jar
./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-yarn-common-3.0.0-cdh6.3.2.jar
```
#### 2.2 Problems encountered during deployment
1. Kerberos configuration
The following needs to be added to the public configuration file linkis.properties, and also to each engine's conf:
```
wds.linkis.keytab.enable=true
wds.linkis.keytab.file=/hadoop/bigdata/kerberos/keytab
wds.linkis.keytab.host.enabled=false
wds.linkis.keytab.host=your_host
```
2. Error at startup after replacing the Hadoop dependency packages: java.lang.NoClassDefFoundError: org/apache/commons/configuration2/Configuration
![](/static/Images/blog/hadoop-start-error.png)

Reason: a Configuration class conflict; adding commons-configuration2-2.1.1.jar under the linkis-commons module resolves the conflict.

3. Running Spark, Python, etc. in scripts reports "no plugin for XXX"
Phenomenon: after modifying the Spark/Python version in the configuration file, starting the engine reports "no plugin for XXX".

![](/static/Images/blog/no-plugin-error.png)

Reason: the engine versions are hard-coded in the LabelCommonConfig.java and GovernanceCommonConf.scala classes. After modifying the corresponding versions and recompiling, replace all jars containing these two classes (linkis-computation-governance-common-1.1.1.jar and linkis-label-common-1.1.1.jar) in Linkis and the other components (including Schedulis).

4. Python engine execution error, initialization failed

- Modify python.py and remove the import of the pandas module
- Configure the Python load path by modifying the Python engine's linkis-engineconn.properties:
```
pythonVersion=/usr/local/bin/python3.6
```
5. Running a pyspark task fails with an error
![](/static/Images/blog/pyspark-task-error.png)

Reason: the PYSPARK version environment variables were not set.
Solution:
Set the following two variables in /etc/profile

```
export PYSPARK_PYTHON=/usr/local/bin/python3.6

export PYSPARK_DRIVER_PYTHON=/usr/local/bin/python3.6
```
6. Error when executing a pyspark task:
java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT

![](/static/Images/blog/pyspark-no-such-field-error.png)

Reason: Spark 2.4.8 ships with the hive 1.2.1 packages, while our Hive has been upgraded to 2.1.1. The HIVE_STATS_JDBC_TIMEOUT parameter was removed in Hive 2, but the Spark SQL code still references it, which causes the error. So the HIVE_STATS_JDBC_TIMEOUT parameter was removed from the spark-sql/hive code, which was then recompiled and packaged to replace spark-hive_2.11-2.4.8.jar in Spark 2.4.8.

7. Proxy user exception during jdbc engine execution

Phenomenon: User A was used to execute a jdbc task 1, and the engine was selected for reuse.
Then, I also used User B to execute a jdbc task 2, and found that the submitter of task 2 was A +Analyze the reason: +ConnectionManager::getConnection + +![](/static/Images/blog/jdbc-connection-manager.png) +When creating a datasource here, it is determined whether to create it based on the key, which is a jdbc URL. However, this granularity may be a bit large because different users may access the same datasource, such as hive. Their URLs are the same, but their account passwords are different. Therefore, when the first user creates a datasource, the username is already specified, and when the second user enters, When this data source was found to exist, it was directly used instead of creating a new datasource, resulting in the code submitted by user B being executed through A. +Solution: Reduce the key granularity of the data source cache map and change it to jdbc. URL+jdbc. user. + +### 三、DSS deployment +Refer to the official website documentation for installation and configuration during the installation process. Below are some issues encountered during installation and debugging. + +#### 3.1 DSS The database list displayed on the left is incomplete +Analysis: The database information displayed by the DSS data source module comes from the hive metabase, but due to permission control through Sentry in CDH6, most of the hive table metadata information does not exist in the hive metastore, so the displayed data is missing. +Solution: +Transform the original logic into using jdbc to link hive and obtain table data display from jdbc. +Simple logical description: +The properties information of jdbc is obtained through the IDE jdbc configuration information configured on the linkis console. +DBS: Obtain schema through connection. getMetaData() +TBS: connection. getMetaData(). getTables() Get the tables under the corresponding db +COLUMNS: Obtain the columns information of the table by executing describe table + +#### 3.2 DSS Error in executing jdbc script in workflow jdbc.name is empty +Analysis: The default creator executed in dss workflow is Schedulis. Due to the lack of configuration of Schedulis related engine parameters in the management console, all read parameters are empty. +An error occurred when adding Schedulis' Category 'to the console, ”The Schedule directory already exists. Due to the fact that the creator in the scheduling system is schedule, it is not possible to add a Schedule Category. In order to better identify each system, the default creator executed in the dss workflow is changed to nodeexception. This parameter can be configured by adding the line wds. linkis. flow. ob. creator. v1=nodeexecution to the dss flow execution server. properties. \ No newline at end of file diff --git a/blog/2023-08-03-entrance-execution-analysis.md b/blog/2023-08-03-entrance-execution-analysis.md new file mode 100644 index 00000000000..27e0137dfec --- /dev/null +++ b/blog/2023-08-03-entrance-execution-analysis.md @@ -0,0 +1,75 @@ +--- +title: [Source Code Interpretation] Linkis1.1.1 Entry Execution Analysis +authors: [guoshupei] +tags: [blog,linkis1.1.1,entrance] +--- +### Preface + +The following is a diagram based on the source code analysis of Linkisv1.1.1: Entry service execution process. +All subsequent explanations revolve around this picture, so when reading the explanation, please refer to the entire picture to understand. The explanation idea is to break the whole into parts, accumulate points into lines, and gather lines into surfaces. 
+![](/static/Images/blog/entry-service-execution-process.jpg) + +Roughly divide the above figure into: +Environment initialization area: The EntranceContext that needs to be initialized when the entire Entrance service starts +Submit task area: Users call the EntranceRestfulApi interface to submit tasks, as well as job construction, interceptor operations, etc +Execution area: The job submitted from the submission area contains all operations throughout the entire job lifecycle +![](/static/Images/blog/entrance-context.png) + +### Environment initialization area +![](/static/Images/blog/env-init.png) +``` +The Entry function is finely divided and each performs its own duties, making it easy to expand. The injection of the entire environment can be viewed in the EnteranceSpringConfiguration configuration class, which is introduced from left to right in sequence below + +PersistenceManager(QueryPersistenceManager)Persistence management +The main object of action is job, and operations such as state, progress, and result have been defined. QueryPersistenceEngine and EntranceResultSetEngine are one of the implementations. If there is a change in the storage type, an additional implementation needs to be added. By injecting the change injection class into the entry, the switch can be achieved. + +EnteranceParser (CommonEnteranceParser) parameter parser: There are mainly three methods, parseToTask (JSON -> +Request), parseToJob (request ->job), parseToJobRequest (job -> +Request, this process can be roughly expressed as: JSON ->request<=>job + +LogManager (CacheLogManager) log management: printing logs and updating error codes, etc +Scheduler (ParallelScheduler) scheduler: responsible for job distribution, job execution environment initialization, etc. Linkis is grouped according to the same tenant and task type. Many settings are based on this grouping principle, such as parallelism, resources, etc. So here are three important functional components and abstract a SchedulerContext context environment management: +1) GroupFactory (EntranceGroupFactory) group factory: Create groups by group and cache groups with groupName as the key. The group mainly records some parameters, such as concurrency, number of threads, etc +2) ConsumerManager (ParallelConsumerManager) consumption manager: create consumers by group, cache consumers with groupName as the key, and initialize a Thread pool for all consumers. Consumer is mainly used to store jobs, submit and execute jobs, etc +3) ExecutorManager (EntranceExecutorManagerImpl) executor management: Create an executor for each job, responsible for all operations throughout the job lifecycle + +EntranceInterceptor Interceptor: All interceptors for the entrance service + +EnteranceEventListenerBus event listener service: a general event listener service, which is essentially a polling thread, with a built-in Thread pool and 5 threads. Adding an event will distribute events to the registered listener according to the event type +``` + +### Submit Task Area +![](/static/Images/blog/submit-task.png) +``` +Mainly explained by the user calling the execute() method of EnteranceRestfulApi. There are mainly four important steps + +ParseToTask: After receiving the request JSON, it is first converted into a request and saved to the database using PersistenceManager to obtain the taskId +Call all interceptors Interceptors +ParseToJob: Convert request to EnteranceExecutionJob, set CodeParser, parse job through job. 
init(), and build SubJobInfo and SubJobDetail objects (v1.2.0 no longer has a SubJob) +Submit the job to the scheduler to obtain the execId +``` + +### Execution region +![](/static/Images/blog/excute-area.png) +``` +ParallelGroup: Stores some parameters that FIFOUserConsumer will use, but parameter changes should not take effect in real time + +FIFOUserConsumer: +1. It contains a ConsumeQueue (LoopArrayQueue), a ring queue with a size of maxCapacity, and an offer method is used to add jobs. If the queue is full, it returns None, and the business reports an error. +2. Essentially, it is a thread. It calls the loop() method by polling, takes only one job at a time, creates an executor through the ExecutorManager, and submits the job using the Thread pool +3. The concurrency count is determined by the maxRunningJobs of ParallelGroup, and tasks will prioritize obtaining tasks that need to be retried. + +Default EntranceExecutor: The executor is responsible for monitoring the entire job submission, submitting one SubJobInfo at a time. Summary of general steps: +1. Asynchronously submit Orchestrator and return OrchestratorFuture +2. OrchestratorFuture registers the dealResponse function, +DealResponse: SubJob succeeded. If there is another sub job to continue submitting, call notify to inform the job of success. If the sub job fails, notify to inform the job of failure. Judge to retry and recreate the executor +3. Create an EngineExecuteAsyncReturn and inject OrchestratorFuture + +Submission process: + +FIFOUserConsumer obtains a job through loop() +Obtain a DefaultEntranceExecutor and inject it into the job +Call the run method of the job through the Thread pool, and the DefaultEntranceExecutor's execute will be triggered in the job +Submit Orchestrator and wait for dealResponse to be called, triggering notify +Change the job status, determine retry, and continue submitting +``` diff --git a/blog/2023-08-03-linkis-dss-ansible.md b/blog/2023-08-03-linkis-dss-ansible.md new file mode 100644 index 00000000000..afefaa26306 --- /dev/null +++ b/blog/2023-08-03-linkis-dss-ansible.md @@ -0,0 +1,133 @@ +--- +title: [Installation and Deployment] Linkis1.3.0+DSS1.1.1 Ansible Single Machine One Click Installation Script +authors: [wubolive] +tags: [blog,linkis1.3.0,ansible] +--- +### 一、Brief Introduction + +To solve the tedious deployment process and simplify the installation steps, this script provides a one click installation of the latest version of DSS+Linkis environment; The software in the deployment package adopts my own compiled installation package and is the latest version: DSS1.1.1+Linkis1.3.0. + +#### Version Introduction +The following version and configuration information can be found in the [all: vars] field of the installation program hosts file. 
+ +| Software Name | Software version | Application Path | Test/Connect Command | +|------------------|-------------------|-----------------------|-------------------------------------------| +| MySQL | mysql-5.6 | /usr/local/mysql | mysql -h 127.0.0.1 -uroot -p123456 | +| JDK | jdk1.8.0_171 | /usr/local/java | java -version | +| Python | python 2.7.5 | /usr/lib64/python2.7 | python -V | +| Nginx | nginx/1.20.1 | /etc/nginx | nginx -t | +| Hadoop | hadoop-2.7.2 | /opt/hadoop | hdfs dfs -ls / | +| Hive | hive-2.3.3 | /opt/hive | hive -e "show databases" | +| Spark | spark-2.4.3 | /opt/spark | spark-sql -e "show databases" | +| dss | dss-1.1.1 | /home/hadoop/dss | http://:8085 | +| links | linkis-1.3.0 | /home/hadoop/linkis | http://:8188 | +| zookeeper | 3.4.6 | /usr/local/zookeeper | 无 | +| DolphinScheduler | 1.3.9 | /opt/dolphinscheduler | http://:12345/dolphinscheduler | +| Visualis | 1.0.0 | /opt/visualis-server | http://:9088 | +| Qualitis | 0.9.2 | /opt/qualitis | http://:8090 | +| Streamis | 0.2.0 | /opt/streamis | http://:9188 | +| Sqoop | 1.4.6 | /opt/sqoop | sqoop | +| Exchangis | 1.0.0 | /opt/exchangis | http://:8028 | + + +### 二、Pre deployment considerations + +Ask: +- This script has only been tested on CentOS 7 systems. Please ensure that the installed server is CentOS 7. +- Install only DSS+Linkis server memory of at least 16GB, and install all service memory of at least 32GB. +- Before installation, please close the server firewall and SELinux, and use root user for operation. +- The installation server must have smooth access to the internet, and the script requires downloading some basic software using yum. +- Ensure that the server does not have any software installed, including but not limited to Java, MySQL, nginx, etc., preferably a brand new system. +- It is necessary to ensure that the server has only one IP address, except for the lo: 127.0.1 loopback address, which can be tested using the echo $(hostname - I) command. + + +### 三、Deployment method + +The deployment host IP for this case is 192.168.1.52. Please change the following steps according to your actual situation. + +#### 3.1 Pre installation settings +``` +### Install ansible +$ yum -y install epel-release +$ yum -y install ansible + +### Configure password free +$ ssh-keygen -t rsa +$ ssh-copy-id root@192.168.1.52 + +### Configure password free shutdown firewall and SELinux +$ systemctl stop firewalld.service && systemctl disable firewalld.service +$ sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config && setenforce 0 +``` + +#### 3.2 Deploy linkis+dss +``` +### Obtain installation package +$ git clone https://github.com/wubolive/dss-linkis-ansible.git +$ cd dss-linkis-ansible + +### Catalog Description +dss-linkis-ansible +├── ansible.cfg # ansible profile +├── hosts # Host and variable configuration for hosts +├── playbooks # Playbooks script +├── README.md # documentation +└── roles # Role Configuration + +### Configure the deployment host (note: the value of ansible_ssh_host cannot be set to 127.0.0.1) +$ vim hosts +[deploy] +dss-service ansible_ssh_host=192.168.1.52 ansible_ssh_port=22 + +### Download the installation package to the download directory (if the download fails, you can manually download and place it in that directory) +$ ansible-playbook playbooks/download.yml + +### One click installation of Linkis+DSS +$ ansible-playbook playbooks/all.yml +...... 
+TASK [dss : Print access information] ***************************************************************************************** +ok: [dss-service] => { + "msg": [ + "*****************************************************************", + " 访问 http://192.168.1.52 View access information ", + "*****************************************************************" + ] +} +``` +After execution, you can access: http://192.168.1.52 View the information page, which records the access addresses and account passwords of all services. +![](/static/Images/blog/view-information-page.png) + +#### 3.3 Deploy other services +``` +# Install dolphinscheduler +$ ansible-playbook playbooks/dolphinscheduler.yml +### Note: To install the following services, priority must be given to installing the Dolphinscheduler scheduling system +# Install visualis +$ ansible-playbook playbooks/visualis.yml +# Install qualitis +$ ansible-playbook playbooks/qualitis.yml +# Install streamis +$ ansible-playbook playbooks/streamis.yml +# Install exchangis +$ ansible-playbook playbooks/exchangis.yml +``` +#### 3.4 Maintenance Guidelines +``` +### View real-time logs +$ su - hadoop +$ tail -f ~/linkis/logs/*.log ~/dss/logs/*.log + +### Start the DSS+Linkis service (if the server restarts, you can use this command to start it) +$ ansible-playbook playbooks/all.yml -t restart +# Launch zookeeper +$ sh /usr/local/zookeeper/bin/zkServer.sh start +# Start other services +$ su - hadoop +$ cd /opt/dolphinscheduler/bin && sh start-all.sh +$ cd /opt/visualis-server/bin && sh start-visualis-server.sh +$ cd /opt/qualitis/bin/ && sh start.sh +$ cd /opt/streamis/streamis-server/bin/ && sh start-streamis-server.sh +$ cd /opt/exchangis/sbin/ && ./daemon.sh start server +``` + +Please visit the official QA document for usage issues:https://docs.qq.com/doc/DSGZhdnpMV3lTUUxq \ No newline at end of file diff --git a/blog/2023-08-03-linkis-dss-compile-deployment.md b/blog/2023-08-03-linkis-dss-compile-deployment.md new file mode 100644 index 00000000000..5fd73d9bc72 --- /dev/null +++ b/blog/2023-08-03-linkis-dss-compile-deployment.md @@ -0,0 +1,349 @@ +--- +title: [Development Experience] Compilation to Deployment of Apache Linkis+DSS +authors: [huasir] +tags: [blog,linkis,dss] +--- +### Background + +With the development of our business and the updating and iteration of community products, we have found that linkis1.2.0 and dss1.1.1 can better meet our needs for real-time data warehousing and machine learning. Compared to the currently used linkis 0.9.3 and dss 0.7.0, there have been significant structural adjustments and design optimizations in task scheduling and plugin access. Based on the above reasons, we now need to upgrade the existing version. Due to the large version span, our upgrade idea is to redeploy a new version and migrate the original business data. The following are specific practical operations, hoping to provide reference for everyone. + +### Obtain source code +![](/static/Images/blog/resource-code.png) + +``` + git clone git@github.com:yourgithub/incubator-linkis.git + git clone git@github.com:yourgithub/DataSphereStudio.git +``` +If there are no developers who plan to submit PR, you can also download the zip source package directly from the official website +### Compile and package + +#### 1. Determine version matching +linkis: 1.2.0 +dss: 1.1.0 +hadoop: 3.1.1 +spark: 2.3.2 +hive: 3.1.0 + +#### 2. 
linkis1.2.0 Compile and package
```
git checkout -b release-1.2.0 origin/release-1.2.0
mvn -N install
mvn clean install -DskipTests
```

Installation package path: incubator-linkis/linkis-dist/target/apache-linkis-1.2.0-incubating-bin.tar.gz
In order to adapt to our own component versions, we need to adjust the pom files and recompile:

1. Comment out the scope of the mysql driver mysql-connector-java (incubator-linkis/pom.xml)
```
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>${mysql.connector.version}</version>
    <!-- <scope>...</scope> commented out -->
</dependency>
```
2. Modify the Hadoop version number (incubator-linkis/pom.xml)
```
<properties>
    <hadoop.version>3.1.1</hadoop.version>
</properties>
```

3. For hadoop3 the artifactId of hadoop-hdfs needs to be adjusted (incubator-linkis/linkis-commons/linkis-hadoop-common/pom.xml)
```
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <!-- originally hadoop-hdfs -->
    <artifactId>hadoop-hdfs-client</artifactId>
</dependency>
```
4. Adjust the hive version of the hive engine (incubator-linkis/linkis-engineconn-plugins/hive/pom.xml)
```
<properties>
    <hive.version>3.1.0</hive.version>
</properties>
```
5. The hive and hadoop versions of the linkis-metadata-query hive service also need to be adjusted (incubator-linkis/linkis-public-enhancements/linkis-datasource/linkis-metadata-query/service/hive/pom.xml)
```
<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <hadoop.version>3.1.1</hadoop.version>
    <hive.version>3.1.0</hive.version>
    <!-- a further property in this pom is set to 4.2.4 -->
</properties>
```
6. Adjust the Spark version of the Spark engine (incubator-linkis/linkis-engineconn-plugins/spark/pom.xml)
```
<properties>
    <spark.version>2.3.2</spark.version>
</properties>
```
#### 3. linkis1.2.0 Web console packaging
```
cd incubator-linkis/linkis-web
npm install
npm run build
#If it's slow, you can use cnpm
npm install -g cnpm --registry=https://registry.npm.taobao.org
cnpm install
```
Installation package path: incubator-linkis/linkis-web/apache-linkis-1.2.0-incubating-web-bin.tar.gz
#### 4. dss1.1.0 Compile and package
```
git checkout -b branch-1.1.0 origin/branch-1.1.0
mvn -N install
mvn clean install -DskipTests
```
Installation package path: DataSphereStudio/assembly/target/wedatasphere-dss-1.1.0-dist.tar.gz

#### 5. dss1.1.0 Front-end compilation and packaging
```
cd web/
npm install lerna -g
lerna bootstrap # install dependencies
```
Installation package path: DataSphereStudio/web/dist/

### Deployment and Installation
#### Environment Description

| master | slave1 | slave2 | slave3 | app |
|--------------------|--------|--------|------------|--------------------------------|
| linkis0.9.3, nginx | mysql | | dss-0.7.0 | |
| | | | | linkis1.2.0, dss1.1.0, nginx, mysql |
| hadoop | hadoop | hadoop | hadoop | hadoop |

Note: the big data base environment (hadoop, hive and spark) has been installed on all 5 machines.
Install the new linkis1.2.0 and dss1.1.0 on the app machine first.
Keep the original version of linkis available and wait for the new deployment before migrating the data from the old version + +#### Collect installation package + +![](/static/Images/blog/collect-installation-package.png) + +#### Install MySQL +``` +docker pull mysql:5.7.40 +docker run -it -d -p 23306:3306 -e MYSQL_ROOT_PASSWORD=app123 -d mysql:5.7.40 +``` +#### Installing linkis +``` +tar zxvf apache-linkis-1.2.0-incubating-bin.tar.gz -C linkis +cd linkis +vi deploy-config/db.sh # Configuration database +``` +![](/static/Images/blog/install-linkis.png) + +Key parameter configuration +``` +deployUser=root +YARN_RESTFUL_URL=http://master:18088 +#HADOOP +HADOOP_HOME=/usr/hdp/3.1.5.0-152/hadoop +HADOOP_CONF_DIR=/etc/hadoop/conf +#HADOOP_KERBEROS_ENABLE=true +#HADOOP_KEYTAB_PATH=/appcom/keytab/ + +#Hive +HIVE_HOME=/usr/hdp/3.1.5.0-152/hive +HIVE_CONF_DIR=/etc/hive/conf + +#Spark +SPARK_HOME=/usr/hdp/3.1.5.0-152/spark2 +SPARK_CONF_DIR=/etc/spark2/conf + + +## Engine version conf +#SPARK_VERSION +SPARK_VERSION=2.3.2 + +##HIVE_VERSION +HIVE_VERSION=3.1.0 + +## java application default jvm memory +export SERVER_HEAP_SIZE="256M" + +##The decompression directory and the installation directory need to be inconsistent +#LINKIS_HOME=/root/linkis-dss/linkis +``` +Implement safety insurance chekcEnv.sh +``` +bin]# ./checkEnv.sh +``` +![](/static/Images/blog/check-env.png) +Because I am using the Docker locally to install MySQL, I need to install an additional MySQL client +``` +wget https://repo.mysql.com//mysql80-community-release-el7-7.noarch.rpm +rpm -Uvh mysql80-community-release-el7-7.noarch.rpm +yum-config-manager --enable mysql57-community +vi /etc/yum.repos.d/mysql-community.repo +#Set enable for mysql8 to 0 +[mysql80-community] +name=MySQL 8.0 Community Server +baseurl=http://repo.mysql.com/yum/mysql-8.0-community/el/6/$basearch/ +enabled=1 +gpgcheck=1 +#install +yum install mysql-community-server +``` +Attempt to install linkis +``` + sh bin/install.sh +``` +![](/static/Images/blog/sh-bin-install-sh.png) +Open the management interface of Spark2 in Ambari, add environment variables, and restart the related services of Spark2 +![](/static/Images/blog/advanced-spark2-env.png) +Finally passed verification +![](/static/Images/blog/check-env1.png) +``` +sh bin/install.sh +``` +For the first installation, the database needs to be initialized. Simply select 2 +![](/static/Images/blog/data-source-init-choose-2.png) + +According to the official website prompt, you need to download the MySQL driver package yourself and place it in the corresponding directory. I used to check and found that there is already a MySQL package. It should be because the MySQL scope was removed during the previous compilation, but the version is incorrect. We are using 5.7 in production, but the driver is the MySQL 8 driver package. 
So it's best to adjust the MySQL driver version at compile time.
![](/static/Images/blog/choose-true-mysql-version.png)

Manually adjust the MySQL driver: put the lower version in place and back up the higher one.

```
wget https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.49/mysql-connector-java-5.1.49.jar
cp mysql-connector-java-5.1.49.jar lib/linkis-spring-cloud-services/linkis-mg-gateway/
cp mysql-connector-java-5.1.49.jar lib/linkis-commons/public-module/
mv mysql-connector-java-8.0.28.jar mysql-connector-java-8.0.28.jar.bak  # run this after cd-ing into the corresponding lib directory
sh sbin/linkis-start-all.sh
```
Open http://app:20303/ in a browser: there are 10 services in total, so everything looks fine.
![](/static/Images/blog/open-eureka-service.png)

#### Installing linkis-web
```
tar -xvf apache-linkis-1.2.0-incubating-web-bin.tar.gz -C linkis-web/
cd linkis-web
sh install.sh
```
The first visit to http://app:8088/#/login returned a 403 error. After checking, the nginx configuration needs to be modified:
```
cd /etc/nginx
vi nginx.conf
user root; # Change the default user to your own, here root
nginx -s reload
```
Visiting again works.
![](/static/Images/blog/login.png)

View the default username and password:
```
cat LinkisInstall/conf/linkis-mg-gateway.properties
```
![](/static/Images/blog/linkis-mg-gateway.png)

Log in to the linkis management console
![](/static/Images/blog/linkis-console.png)

Quick verification using linkis-cli
```
sh bin/linkis-cli -submitUser root -engineType hive-3.1.0 -codeType hql -code "show tables"

============Result:================
TaskId:5
ExecId: exec_id018008linkis-cg-entranceapp:9104LINKISCLI_root_hive_0
User:root
Current job status:FAILED
extraMsg:
errDesc: 21304, Task is Failed,errorMsg: errCode: 12003 ,desc: app:9101_4 Failed to async get EngineNode AMErrorException: errCode: 30002 ,desc: ServiceInstance(linkis-cg-engineconn, app:34197) ticketID:24ab8eed-2a9b-4012-9052-ec1f64b85b5f 初始化引擎失败,原因: ServiceInsta

[INFO] JobStatus is not 'success'. Will not retrieve result-set.
```

View the log information in the management console
![](/static/Images/blog/console-log-info.png)

My hive uses the Tez engine, so the Tez-related packages need to be copied manually into the hive plugin's lib:
```
cp -r /usr/hdp/current/tez-client/tez-* ./lib/linkis-engineconn-plugins/hive/dist/v3.1.0/
sh sbin/linkis-daemon.sh restart cg-engineplugin
sh bin/linkis-cli -submitUser root -engineType hive-3.1.0 -codeType hql -code "show tables"
```

Ran it again; it still fails, and a Jackson library seems to be missing.
![](/static/Images/blog/miss-jackson-jar.png)

```
<!-- linkis-commons/linkis-hadoop-common: this dependency needs to be added manually and the module repackaged -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-yarn-common</artifactId>
    <version>${hadoop.version}</version>
</dependency>
```

Ran it again; it still fails, and the log is as follows:
```
2022-11-09 18:09:44.009 ERROR Job with execId-LINKISCLI_root_hive_0 + subJobId : 51 execute failed,21304, Task is Failed,errorMsg: errCode: 12003 ,desc: app:9101_0 Failed to async get EngineNode AMErrorException: errCode: 30002 ,desc: ServiceInstance(linkis-cg-engineconn, app:42164) ticketID:91f72f2a-598c-4384-9132-09696012d5b5 初始化引擎失败,原因: ServiceInstance(linkis-cg-engineconn, app:42164): log dir: /appcom/tmp/root/20221109/hive/91f72f2a-598c-4384-9132-09696012d5b5/logs,SessionNotRunning: TezSession has already shutdown.
Application application_1666169891027_0067 failed 2 times due to AM Container for appattempt_1666169891027_0067_000002 exited with exitCode: 1
```

The log suggests that the Yarn application is failing, so check Yarn's container log:
```
Log Type: syslog

Log Upload Time: Wed Nov 09 18:09:41 +0800 2022

Log Length: 1081

2022-11-09 18:09:39,073 [INFO] [main] |app.DAGAppMaster|: Creating DAGAppMaster for applicationId=application_1666169891027_0067, attemptNum=1, AMContainerId=container_e19_1666169891027_0067_01_000001, jvmPid=25804, userFromEnv=root, cliSessionOption=true, pwd=/hadoop/yarn/local/usercache/root/appcache/application_1666169891027_0067/container_e19_1666169891027_0067_01_000001, localDirs=/hadoop/yarn/local/usercache/root/appcache/application_1666169891027_0067, logDirs=/hadoop/yarn/log/application_1666169891027_0067/container_e19_1666169891027_0067_01_000001
2022-11-09 18:09:39,123 [ERROR] [main] |app.DAGAppMaster|: Error starting DAGAppMaster
java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
 at org.apache.hadoop.conf.Configuration.set(Configuration.java:1358)
 at org.apache.hadoop.conf.Configuration.set(Configuration.java:1339)
 at org.apache.tez.common.TezUtilsInternal.addUserSpecifiedTezConfiguration(TezUtilsInternal.java:94)
 at org.apache.tez.dag.app.DAGAppMaster.main(DAGAppMaster.java:2432)
```
According to the logs, combined with some searching on Baidu, this points to a Guava version problem. First confirm whether the Guava version in the hive engine plugin is consistent with the Guava version in Hadoop. If they are consistent, another possibility is a problem with the hive-exec version: since Hive here was deployed with Ambari, it is best to replace the hive packages in the engine plugin with the Hive jars shipped by Ambari. The problem encountered was the latter, and it took a long time to troubleshoot.
+``` +(base) [root@app lib]# pwd +/root/linkis-dss/linkis/LinkisInstall/lib/linkis-engineconn-plugins/hive/dist/v3.1.0/lib +(base) [root@app lib]# ls -l | grep hive +-rw-r--r-- 1 root root 140117 Nov 10 13:44 hive-accumulo-handler-3.1.0.3.1.5.0-152.jar +lrwxrwxrwx 1 root root 43 Nov 10 13:44 hive-accumulo-handler.jar -> hive-accumulo-handler-3.1.0.3.1.5.0-152.jar +-rw-r--r-- 1 root root 161078 Nov 10 13:44 hive-beeline-3.1.0.3.1.5.0-152.jar +lrwxrwxrwx 1 root root 34 Nov 10 13:44 hive-beeline.jar -> hive-beeline-3.1.0.3.1.5.0-152.jar +-rw-r--r-- 1 root root 11508 Nov 10 13:44 hive-classification-3.1.0.3.1.5.0-152.jar +lrwxrwxrwx 1 root root 41 Nov 10 13:44 hive-classification.jar -> hive-classification-3.1.0.3.1.5.0-152.jar +-rw-r--r-- 1 root root 45753 Nov 10 13:44 hive-cli-3.1.0.3.1.5.0-152.jar +lrwxrwxrwx 1 root root 30 Nov 10 13:44 hive-cli.jar -> hive-cli-3.1.0.3.1.5.0-152.jar +-rw-r--r-- 1 root root 509029 Nov 10 13:44 hive-common-3.1.0.3.1.5.0-152.jar +lrwxrwxrwx 1 root root 33 Nov 10 13:44 hive-common.jar -> hive-common-3.1.0.3.1.5.0-152.jar +-rw-r--r-- 1 root root 127200 Nov 10 13:44 hive-contrib-3.1.0.3.1.5.0-152.jar +lrwxrwxrwx 1 root root 34 Nov 10 13:44 hive-contrib.jar -> hive-contrib-3.1.0.3.1.5.0-152.jar +-rw-r--r-- 1 root root 51747254 Nov 10 13:44 hive-druid-handler-3.1.0.3.1.5.0-152.jar +lrwxrwxrwx 1 root root 40 Nov 10 13:44 hive-druid-handler.jar -> hive-druid-handler-3.1.0.3.1.5.0-152.jar +-rw-r--r-- 1 root root 42780917 Nov 10 13:44 hive-exec-3.1.0.3.1.5.0-152.jar +lrwxrwxrwx 1 root root 31 Nov 10 13:44 hive-exec.jar -> hive-exec-3.1.0.3.1.5.0-152.jar +.................. +``` +Run again, run successfully, and there seems to be no problem with the deployment of Linkis + +### Installing DSS +Explanation: Since DSS was installed by another colleague, I will not show it here. For details, please refer to the installation on the official website. Here, I will mainly explain the problems encountered during the integration of DSS and Linkis. +1. When logging in to dss, the log of linkis mg gateway displays TooManyServiceException + As shown in the figure: + ![](/static/Images/blog/install-dss.png) + The specific logs in the gateway are as follows +``` +2022-11-11 11:27:06.194 [WARN ] [reactor-http-epoll-6 ] o.a.l.g.r.DefaultGatewayRouter (129) [apply] - org.apache.linkis.gateway.exception.TooManyServiceException: errCode: 11010 ,desc: Cannot find a correct serviceId for parsedServiceId dss, service list is: List(dss-framework-project-server, dss-apiservice-server, dss-scriptis-server, dss-framework-orchestrator-server-dev, dss-flow-entrance, dss-guide-server, dss-workflow-server-dev) ,ip: app ,port: 9001 ,serviceKind: linkis-mg-gateway + at org.apache.linkis.gateway.route.DefaultGatewayRouter$$anonfun$org$apache$linkis$gateway$route$DefaultGatewayRouter$$findCommonService$1.apply(GatewayRouter.scala:101) ~[linkis-gateway-core-1.2.0.jar:1.2.0] + at org.apache.linkis.gateway.route.DefaultGatewayRouter$$anonfun$org$apache$linkis$gateway$route$DefaultGatewayRouter$$findCommonService$1.apply(GatewayRouter.scala:100) ~[linkis-gateway-core-1.2.0.jar:1.2.0] + at org.apache.linkis.gateway.route.AbstractGatewayRouter.findService(GatewayRouter.scala:70) ~[linkis-gatew +``` +The general idea is that the dss cannot be found. Coincidentally, I found a piece of gawayParser code under the plugin in the dss and tried to copy it to the case COMMON of the parse method in the GatewayParser_ Before REGEX, introduce dependent methods, variables, and packages based on compilation prompts. 
As shown in the figure: +![](/static/Images/blog/gateway-parse-code.png) + +Successfully logged in (remember to restart the mg geteway service on linkis). + +If you find an error after logging in, you will be prompted to manage and create a Working directory. You can configure the following properties in linkis-ps-publicservice.properties, and then restart the ps publicservice service +``` +#LinkisInstall/conf/linkis-ps-publicservice.properties +#Workspace +linkis.workspace.filesystem.auto.create=true +``` diff --git a/i18n/zh-CN/docusaurus-plugin-content-blog/2023-07-28-gateway-process-analysis.md b/i18n/zh-CN/docusaurus-plugin-content-blog/2023-07-28-gateway-process-analysis.md index 6ea498dcbf9..6120266c385 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-blog/2023-07-28-gateway-process-analysis.md +++ b/i18n/zh-CN/docusaurus-plugin-content-blog/2023-07-28-gateway-process-analysis.md @@ -1,18 +1,18 @@ --- title: Linkis1.3.2 Gateway流程分析 authors: [ahaoyao] -tags: [blog,linki1.3.0,service merge] +tags: [blog,linki1.3.2,gateway] --- ### Linkis 1.3.2 流程图解 GateWay采用的是webFlux的响应式编程,其整个流程与spring mvc 类似 -| 框架 | Gateway | spring mvc | -|-----|---------|------------| -| 请求分发 | DispatcherHandler | DispatcherServlet | -| 请求映射 | HandlerMapping | HandlerMapping | -| 请求适配 | HanderAdaper | HanderAdaper | -| 请求处理 | WebHander | Hander | +| 框架 | Gateway | spring mvc | +|-------|--------------------|--------------------| +| 请求分发 | DispatcherHandler | DispatcherServlet | +| 请求映射 | HandlerMapping | HandlerMapping | +| 请求适配 | HanderAdaper | HanderAdaper | +| 请求处理 | WebHander | Hander | ### 流程图 diff --git a/i18n/zh-CN/docusaurus-plugin-content-blog/2023-08-03-analysis-of-engine-material-mg-function.md b/i18n/zh-CN/docusaurus-plugin-content-blog/2023-08-03-analysis-of-engine-material-mg-function.md new file mode 100644 index 00000000000..8495015897d --- /dev/null +++ b/i18n/zh-CN/docusaurus-plugin-content-blog/2023-08-03-analysis-of-engine-material-mg-function.md @@ -0,0 +1,245 @@ +--- +title: 【源码解读】Linkis引擎物料管理功能剖析 +authors: [CCweixiao] +tags: [blog,linkis] +--- +### 目录导读 + +``` +导语:本文以引擎相关的物料管理流程为切入点,同时结合底层数据模型和源码,为大家详细剖析引擎物料管理功能的实现细节,期望能够帮助大家更好地理解BML(物料库)服务的架构。 +``` + +#### 1. BML物料库服务 + +BML物料库是Linkis中PublicEnhancementService(PS)——公共增强服务架构下的功能模块。 +![](/static/Images/blog/public-enhancement-service.png) + +在Linkis的架构体系里,物料的概念是指被统一存储托管起来的各种文件数据,包括脚本代码、资源文件、第三方jar、引擎启动时所需的相关类库和配置文件以及用于安全认证的keytab文件等。 + +总之,任何以文件态存在的数据,都可以被集中托管在物料库之中,然后在各自所需的场景中被下载使用。 + +物料服务是无状态的,可进行多实例部署,做到服务高可用,每个实例对外提供独立的服务,互不干扰,所有物料元数据及版本信息等在数据库中共享,底层物料数据可被存储到HDFS或本地(共享)文件系统之中,以及支持实现文件存储相关的接口,扩展其他文件存储系统等。 + +物料服务提供精确的权限控制,对于引擎资源类型的物料,可被所有用户共享访问;对于一些含有敏感信息的物料数据,也可做到仅有限用户可读。 + +物料文件采用追加的方式,可将多个版本的资源文件合并成一个大文件,避免产生过多的HDFS小文件,HDFS小文件过多会导致HDFS整体性能的下降。 + +物料服务提供了文件上传、更新、下载等操作任务的生命周期管理。同时,使用物料服务的方式有rest接口和SDK两种形式,用户可以根据自己的需要进行选择。 + +BML架构图如下: +![](/static/Images/blog/bml-service.png) + +上述关于BML架构的概述,有参考官网文档:https://linkis.apache.org/zh-CN/docs/latest/architecture/public_enhancement_services/bml + +#### 2. BML物料库服务底层表模型 +在深入理解BML物料管理的流程细节之前,有必要先梳理下BML物料管理服务底层依赖的数据库表模型。 +![](/static/Images/blog/linkis-ps-bml.png) + +可结合Linkis的linkis_ddl.sql文件以及下文内容阐述的引擎物料上传和更新流程来理解bml resources相关表的字段含义以及表与表之间的字段关系。 + +#### 3. BML物料库服务的使用场景 +目前在Linkis中,BML物料库服务的使用场景包括: +- 引擎物料文件,包括引擎启动时所需的conf和lib中的文件 +- 存储脚本,比如工作流任务节点链接的Scripts中的脚本是存储在BML物料库中的 +- DSS中工作流内容版本管理 +- 任务运行时所需资源文件管理 + +#### 4. 
引擎物料管理流程剖析 +引擎物料是Linkis物料概念中的一个子集,其作用是为引擎启动时提供最新版本的jar包资源和配置文件等。本小节主要从引擎物料管理功能为切入点,剖析引擎物料数据在BML中的流转细节。 + +##### 4.1 引擎物料说明 +对Linkis的安装包正常部署之后,在LINKIS_INSTALL_HOME/lib/linkis-engineconn-plugins目录之下可以看到所有的引擎物料目录,以jdbc引擎为例,引擎物料目录的结构如下: +``` +jdbc +├── dist +│ └── v4 +│ ├── conf +│ ├── conf.zip +│ ├── lib +│ └── lib.zip +└── plugin + └── 4 + └── linkis-engineplugin-jdbc-1.1.2.jar +``` +物料目录构成: +``` +jdbc/dist/版本号/conf.zip +jdbc/dist/版本号/lib.zip + +jdbc/plugin/版本号(去v留数字)/linkis-engineplugin-引擎名称-1.1.x.jar +``` +conf.zip和lib.zip会作为引擎物料被托管在物料管理服务中,本地每次对物料conf或lib进行修改之后,对应物料会产生一个新的版本号,物料文件数据会被重新上传。引擎启动时,会获取最新版本号的物料数据,加载lib和conf并启动引擎的java进程。 + +##### 4.2 引擎物料上传和更新流程 +在Linkis完成部署并首次启动时,会触发引擎物料(lib.zip和conf.zip)首次上传至物料库;当引擎lib下jar包或conf中引擎配置文件有修改时,则需要触发引擎物料的刷新机制来保证引擎启动时能够加载到最新的物料文件。 + +以现在Linkis1.1.x版本为例,触发引擎物料刷新的两种方式有两种: + +通过命令sh sbin/linkis-daemon.sh restart cg-engineplugin重启engineplugin服务 + +通过请求引擎物料刷新的接口 +``` +# 刷新所有引擎物料 +curl --cookie "linkis_user_session_ticket_id_v1=kN4HCk555Aw04udC1Npi4ttKa3duaCOv2HLiVea4FcQ=" http://127.0.0.1:9001/api/rest_j/v1/engineplugin/refreshAll +# 指定引擎类型和版本刷新物料 +curl --cookie "linkis_user_session_ticket_id_v1=kN4HCk555Aw04udC1Npi4ttKa3duaCOv2HLiVea4FcQ=" http://127.0.0.1:9001/api/rest_j/v1/engineplugin/refresh?ecType=jdbc&version=4 +``` +这两种引擎物料的刷新方式,其底层的实现机制是一样的,都是调用了EngineConnResourceService类中的refreshAll()或refresh()方法。 + +在抽象类EngineConnResourceService的默认实现类DefaultEngineConnResourceService中的init()方法内部,通过参数wds.linkis.engineconn.dist.load.enable(默认为true)来控制是否在每次启动engineplugin服务时都执行refreshAll(false)来检查所有引擎物料是否有更新(其中faslse代表异步获取执行结果)。 +``` +init()方法被注解@PostConstruct修饰,在DefaultEngineConnResourceService加载后,对象使用前执行,且只执行一次。 +``` +手动调用engineplugin/refresh的接口,即手动执行了EngineConnResourceService类中的refreshAll或refresh方法。 + +所以引擎物料检测更新的逻辑在DefaultEngineConnResourceService中的refreshAll和refresh方法内。 + +其中refreshAll()的核心逻辑是: + +1)通过参数wds.linkis.engineconn.home获取引擎的安装目录,默认是: +``` +getEngineConnsHome = Configuration.getLinkisHome() + "/lib/linkis-engineconn-plugins"; +``` +2)遍历引擎目录 +``` +getEngineConnTypeListFromDisk: Array[String] = new File(getEngineConnsHome).listFiles().map(_.getName) +``` +3)EngineConnBmlResourceGenerator接口提供对各个引擎(版本)底层文件或目录的合法性检测。对应实现存在于抽象类AbstractEngineConnBmlResourceGenerator中。 + +4)DefaultEngineConnBmlResourceGenerator类主要是为了生成EngineConnLocalizeResource。EngineConnLocalizeResource是对物料资源文件元数据和InputStream的封装,在后续的逻辑中EngineConnLocalizeResource会被作为物料参数来参与物料的上传过程。 + +EngineConnBmlResourceGenerator、AbstractEngineConnBmlResourceGenerator、DefaultEngineConnBmlResourceGenerator这三个文件的代码细节暂不细说,可通过以下UML类图,大致了解其继承机制,并结合方法内的具体实现来理解这一部分的功能。 + + +再重新回到DefaultEngineConnResourceService类中的refreshAll方法内,继续看refreshTask线程的核心流程: +``` +engineConnBmlResourceGenerator.getEngineConnTypeListFromDisk foreach { engineConnType => + Utils.tryCatch { + engineConnBmlResourceGenerator.generate(engineConnType).foreach { + case (version, localize) => + logger.info(s" Try to initialize ${engineConnType}EngineConn-$version.") + refresh(localize, engineConnType, version) + } + } + ...... 
+} +``` +扫描引擎的安装目录,可获得每个引擎物料目录的列表,对于每个引擎物料目录结构的合法性校验通过之后,可得到对应的EngineConnLocalizeResource,然后通过调用refresh(localize: Array[EngineConnLocalizeResource], engineConnType: String, version: String)来完成后续物料的上传工作。 + +而在refresh()方法的内部,主要经过的流程有: + +从表linkis_cg_engine_conn_plugin_bml_resources中获取对应engineConnType和version的物料列表数据,赋值给变量engineConnBmlResources。 +``` +val engineConnBmlResources = asScalaBuffer(engineConnBmlResourceDao.getAllEngineConnBmlResource(engineConnType, version)) +``` + +###### 4.2.1 引擎物料上传流程 +引擎物料上传流程时序图 + +如果表linkis_cg_engine_conn_plugin_bml_resources中没有匹配到数据,则需要拿EngineConnLocalizeResource中的数据来构造EngineConnBmlResource对象,并保存至linkis_cg_engine_conn_plugin_bml_resources表中,此数据保存之前,需要先完成物料文件的上传操作,即执行uploadToBml(localizeResource)方法。 + +在uploadToBml(localizeResource)方法内部,通过构造bmlClient来请求物料上传的接口。即: +``` +private val bmlClient = BmlClientFactory.createBmlClient() +bmlClient.uploadResource(Utils.getJvmUser, localizeResource.fileName, localizeResource.getFileInputStream) +``` +在BML Server中,物料上传的接口位置在BmlRestfulApi类中的uploadResource接口方法内。主要经历的过程是: +``` +ResourceTask resourceTask = taskService.createUploadTask(files, user, properties); +``` +每一次物料上传,都会构造一个ResourceTask来完成文件上传的流程,并记录此次文件上传Task的执行记录。在createUploadTask方法内部,主要完成的操作如下: + +1)为此次上传的资源文件产生一个全局唯一标识的resource_id,String resourceId = UUID.randomUUID().toString(); + +2)构建ResourceTask记录,并存储在表linkis_ps_bml_resources_task中,以及后续一系列的Task状态修改。 +``` +ResourceTask resourceTask = ResourceTask.createUploadTask(resourceId, user, properties); +taskDao.insert(resourceTask); + +taskDao.updateState(resourceTask.getId(), TaskState.RUNNING.getValue(), new Date()); +``` +3)物料文件真正写入物料库的操作是由ResourceServiceImpl类中的upload方法完成的,在upload方法内部,会把一组List files对应的字节流持久化至物料库文件存储系统中;把物料文件的properties数据,存储到资源记录表(linkis_ps_bml_resources)和资源版本记录表(linkis_ps_bml_resources_version)中。 +``` +MultipartFile p = files[0] +String resourceId = (String) properties.get("resourceId"); +String fileName =new String(p.getOriginalFilename().getBytes(Constant.ISO_ENCODE), +Constant.UTF8_ENCODE); +fileName = resourceId; +String path = resourceHelper.generatePath(user, fileName, properties); +// generatePath目前支持Local和HDFS路径,路径的构成规则由LocalResourceHelper或HdfsResourceHelper +// 中的generatePath方法实现 +StringBuilder sb = new StringBuilder(); +long size = resourceHelper.upload(path, user, inputStream, sb, true); +// 文件size计算以及文件字节流写入文件由LocalResourceHelper或HdfsResourceHelper中的upload方法实现 +Resource resource = Resource.createNewResource(resourceId, user, fileName, properties); +// 插入一条记录到resource表linkis_ps_bml_resources中 +long id = resourceDao.uploadResource(resource); +// 新增一条记录到resource version表linkis_ps_bml_resources_version中,此时的版本号是onstant.FIRST_VERSION +// 除了记录这个版本的元数据信息外,最重要的是记录了该版本的文件的存储位置,包括文件路径,起始位置,结束位置。 +String clientIp = (String) properties.get("clientIp"); +ResourceVersion resourceVersion = ResourceVersion.createNewResourceVersion( +resourceId, path, md5String, clientIp, size, Constant.FIRST_VERSION, 1); +versionDao.insertNewVersion(resourceVersion); +``` +上述流程执行成功之后,物料数据才算是真正完成,然后把UploadResult返回给客户端,并标记此次ResourceTask的状态为完成,如果有遇到上传文件报错,则标记此次ResourceTask的状态为失败,记录异常信息。 + + +###### 4.2.2 引擎物料更新流程 +引擎物料更新流程时序图 + +如果表linkis_cg_engine_conn_plugin_bml_resources中匹配到本地物料数据,则需要拿EngineConnLocalizeResource中的数据来构造EngineConnBmlResource对象,并更新linkis_cg_engine_conn_plugin_bml_resources表中原有物料文件的版本号、文件大小、修改时间等元数据信息,此数据更新前,需要先完成物料文件的更新上传操作,即执行uploadToBml(localizeResource, engineConnBmlResource.getBmlResourceId)方法。 + +在uploadToBml(localizeResource, resourceId)方法内部,通过构造bmlClient来请求物料资源更新的接口。即: +``` +private val bmlClient = 
BmlClientFactory.createBmlClient() +bmlClient.updateResource(Utils.getJvmUser, resourceId, localizeResource.fileName, localizeResource.getFileInputStream) +``` +在BML Server中,实现物料更新的接口位置在BmlRestfulApi类中的updateVersion接口方法内,主要经历的过程是: + +完成resourceId的有效性检测,即检测传入的resourceId是否在linkis_ps_bml_resources表中存在,如果此resourceId不存在,给客户端抛出异常,在接口层面此次物料更新操作失败。 + +所以在表linkis_cg_engine_conn_plugin_bml_resources和linkis_ps_bml_resources中的资源数据的对应关系需要保证完整,否则会出现物料文件无法更新的报错。 +``` +resourceService.checkResourceId(resourceId) +``` +resourceId如果存在于linkis_ps_bml_resources表中,会继续执行: +``` +StringUtils.isEmpty(versionService.getNewestVersion(resourceId)) +``` +getNewestVersion方法是为了在表linkis_ps_bml_resources_version中获取该resourceId的最大版本号,如果resourceId对应的最大version为空,那么物料同样会更新失败,所以此处数据的对应关系完整性也需要严格保证。 + +上述两处检查都通过之后,会创建ResourceUpdateTask来完成最终的文件写入和记录更新保存等工作。 +``` +ResourceTask resourceTask = null; +synchronized (resourceId.intern()) { +resourceTask = taskService.createUpdateTask(resourceId, user, file, properties); +} +``` +而在createUpdateTask方法内部,主要实现的功能是: +``` +// 为物料Resource生成新的version +String lastVersion = getResourceLastVersion(resourceId); +String newVersion = generateNewVersion(lastVersion); +// 然后是对ResourceTask的构建,和状态维护 +ResourceTask resourceTask = ResourceTask.createUpdateTask(resourceId, newVersion, user, system, properties); +// 物料更新上传的逻辑由versionService.updateVersion方法完成 +versionService.updateVersion(resourceTask.getResourceId(), user, file, properties); +``` +在versionService.updateVersion方法内部,主要实现的功能是: +``` +ResourceHelper resourceHelper = ResourceHelperFactory.getResourceHelper(); +InputStream inputStream = file.getInputStream(); +// 获取资源的path +String newVersion = params.get("newVersion").toString(); +String path = versionDao.getResourcePath(resourceId) + "_" + newVersion; +// getResourcePath的获取逻辑是从原有路径中limit一条,然后以_拼接newVersion +// select resource from linkis_ps_bml_resources_version WHERE resource_id = #{resourceId} limit 1 +// 资源上传到hdfs或local +StringBuilder stringBuilder = new StringBuilder(); +long size = resourceHelper.upload(path, user, inputStream, stringBuilder, OVER_WRITE); +// 最后在linkis_ps_bml_resources_version表中插入一条新的资源版本记录 +ResourceVersion resourceVersion = ResourceVersion.createNewResourceVersion(resourceId, path, md5String, clientIp, size, newVersion, 1); +versionDao.insertNewVersion(resourceVersion); +``` +5. 文章小结 + 本文从Linkis引擎物料管理功能作为切入点,概述了BML物料服务的架构,并结合底层源码,详细地剖析了在引擎物料管理功能中,引擎物料的概念,以及引擎物料的上传、更新、版本管理等操作流程。 \ No newline at end of file diff --git a/i18n/zh-CN/docusaurus-plugin-content-blog/2023-08-03-cdh-linkis-dss.md b/i18n/zh-CN/docusaurus-plugin-content-blog/2023-08-03-cdh-linkis-dss.md new file mode 100644 index 00000000000..86fa8147514 --- /dev/null +++ b/i18n/zh-CN/docusaurus-plugin-content-blog/2023-08-03-cdh-linkis-dss.md @@ -0,0 +1,140 @@ +--- +title: 【实践经验】基于CDH6.3.2部署Linkis1.1.1和DSS1.1.0 +authors: [kongslove] +tags: [blog,linkis1.1.1,cdh] +--- +### 前言 + +随着业务的发展和社区产品的更新迭代,我们发现Linkis1.X在资源管理,引擎管理方面有极大的性能提升,可以更好的满足数据中台的建设。相较于0.9.3版本和我们之前使用的平台, 在用户体验方面也得到很大的提升,任务失败页面无法方便查看详情等问题也都得到改善,因此决定升级Linkis以及WDS套件,那么如下是具体的实践操作,希望给大家带来参考。 + +### 一、环境 +CDH6.3.2 各组件版本 +- hadoop:3.0.0-cdh6.3.2 +- hive:2.1.1-cdh6.3.2 +- spark:2.4.8 + +#### 硬件环境 +2台 128G 云物理机 + +### 二、Linkis安装部署 + +#### 2.1编译代码or release安装包? 
+ +本次安装部署采用的是release安装包方式部署。为了适配司内CDH6.3.2版本,hadoop和hive的相关依赖包需要替换成CDH6.3.2版本,这里采用的是直接替换安装包的方式。需要替换的依赖包与模块如下l列表所示。 +``` +--涉及到的模块 +linkis-engineconn-plugins/spark +linkis-engineconn-plugins/hive +/linkis-commons/public-module +/linkis-computation-governance/ +``` +``` +-----需要更换cdh包的列表 +./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hive-shims-0.23-2.1.1-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hive-shims-scheduler-2.1.1-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-annotations-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-auth-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-common-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-hdfs-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-hdfs-client-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-client-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-mapreduce-client-common-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-mapreduce-client-jobclient-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-yarn-api-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-yarn-client-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-yarn-server-common-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-hdfs-client-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-mapreduce-client-core-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-mapreduce-client-shuffle-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-yarn-common-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-annotations-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-auth-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-mapreduce-client-core-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-yarn-api-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-yarn-client-3.0.0-cdh6.3.2.jar +./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-yarn-common-3.0.0-cdh6.3.2.jar +./lib/linkis-commons/public-module/hadoop-annotations-3.0.0-cdh6.3.2.jar +./lib/linkis-commons/public-module/hadoop-auth-3.0.0-cdh6.3.2.jar +./lib/linkis-commons/public-module/hadoop-common-3.0.0-cdh6.3.2.jar +./lib/linkis-commons/public-module/hadoop-hdfs-client-3.0.0-cdh6.3.2.jar +./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-annotations-3.0.0-cdh6.3.2.jar +./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-auth-3.0.0-cdh6.3.2.jar +./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-yarn-api-3.0.0-cdh6.3.2.jar +./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-yarn-client-3.0.0-cdh6.3.2.jar +./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-yarn-common-3.0.0-cdh6.3.2.jar +``` +#### 2.2部署过程中遇到的问题 +1、kerberos配置 +需要在linkis.properties公共配置中添加 +各个引擎conf也需要添加 +``` +wds.linkis.keytab.enable=true +wds.linkis.keytab.file=/hadoop/bigdata/kerberos/keytab +wds.linkis.keytab.host.enabled=false +wds.linkis.keytab.host=your_host +``` 
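+下面给出一个简单的示意脚本(仅为思路示意:其中 LINKIS_HOME 以及各引擎配置文件路径 dist/*/conf/linkis-engineconn.properties 均为假设值,请按实际部署调整),用于把上述 keytab 配置批量追加到各引擎的 conf 中:
+```
+# 示意脚本:把 keytab 相关配置批量追加到各引擎的 linkis-engineconn.properties
+# 注意:LINKIS_HOME 与配置文件路径均为假设值,请按实际部署情况调整
+ENGINE_PLUGIN_HOME=${LINKIS_HOME}/lib/linkis-engineconn-plugins
+
+for conf in "${ENGINE_PLUGIN_HOME}"/*/dist/*/conf/linkis-engineconn.properties; do
+  [ -f "$conf" ] || continue
+  cat >> "$conf" <<'EOF'
+wds.linkis.keytab.enable=true
+wds.linkis.keytab.file=/hadoop/bigdata/kerberos/keytab
+wds.linkis.keytab.host.enabled=false
+wds.linkis.keytab.host=your_host
+EOF
+done
+```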
+2、更换Hadoop依赖包后启动报错java.lang.NoClassDefFoundError:org/apache/commons/configuration2/Configuration +![](/static/Images/blog/hadoop-start-error.png) + +原因:Configuration类冲突,在linkis-commons模块下在添加一个commons-configuration2-2.1.1.jar解决冲突 + +3、script中运行spark、python等报错no plugin for XXX +现象:在配置文件中修改完spark/python的版本后,启动引擎报错no plugin for XXX + +![](/static/Images/blog/no-plugin-error.png) + +原因:LabelCommonConfig.java和GovernaceCommonConf.scala这两个类中写死了引擎的版本,修改相应版本,编译后替换掉linkis以及其他组件(包括schedulis等)里面所有包含这两个类的jar(linkis-computation-governance-common-1.1.1.jar和linkis-label-common-1.1.1.jar) + +4、python引擎执行报错,初始化失败 + +- 修改python.py,移除引入pandas模块 +- 配置python加载目录,修改python引擎的linkis-engineconn.properties +``` +pythonVersion=/usr/local/bin/python3.6 +``` +5、运行pyspark任务失败报错 +![](/static/Images/blog/pyspark-task-error.png) + +原因:未设置PYSPARK_VERSION +解决方法: +在/etc/profile下设置两个参数 + +``` +export PYSPARK_PYTHON=/usr/local/bin/python3.6 + +export PYSPARK_DRIVER_PYTHON=/usr/local/bin/python3.6 +``` +6、执行pyspark任务报错 +java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT + +![](/static/Images/blog/pyspark-no-such-field-error.png) + +原因:spark2.4.8里面使用的是hive1.2.1的包,但是我们的hive升级到了2.1.1版本,hive2里面已经去掉了这个参数,然后spark-sql里面的代码依然是要调用hive的这个参数的,然后就报错了, +所以在spark-sql/hive代码中删除掉了HIVE_STATS_JDBC_TIMEOUT这个参数,重新编译后打包,替换spark2.4.8中的spark-hive_2.11-2.4.8.jar + +7、jdbc引擎执行出现代理用户异常 + +现象:用A用户去执行一个jdbc任务1,引擎选了可以复用,然后我也用B用户去执行一个jdbc任务2,发现 任务2的提交人是A +分析原因: +ConnectionManager::getConnection + +![](/static/Images/blog/jdbc-connection-manager.png) +这里创建datasource的时候是根据key来判断是否创建,而这个key是jdbc url ,但这种粒度可能有点大,因为有可能是不同的用户去访问同一个数据源,比如说hive,他们的url是一样的,但是账号密码是不一样的,所以当第一个用户去创建datasource时,username已经指定了,第二个用户进来的时候,发现这个数据源存在,就直接拿这个数据源去用,而不是创建一个新的datasource,所以造成了用户B提交的代码通过A去执行了。 +解决方法:数据源缓存map的key粒度降低,改成jdbc.url+jdbc.user。 + +### 三、DSS部署 +安装过程参考官网文档进行安装配置,下面说明一下在安装调试过程中遇到的一些事项。 + +#### 3.1 DSS 左侧数据库展示的数据库列表显示不全 +分析:DSS数据源模块显示的数据库信息是来源于hive的元数据库,但由于CDH6中通过sentry进行权限控制,大部分的hive表元数据信息没有存在于hive metastore中,所以展示的数据存在缺失。 +解决方法: +将原有逻辑改造成使用jdbc链接hive的方式,从jdbc中获取表数据展示。 +简单逻辑描述: +jdbc的properties信息通过linkis控制台配置的IDE-jdbc的配置信息获取。 +DBS:通过connection.getMetaData()获取schema +TBS:connection.getMetaData().getTables()获取对应db下的tables +COLUMNS:通过执行describe table 获取表的columns信息 + +#### 3.2 DSS 工作流中执行jdbc脚本报错 jdbc.name is empty +分析:dss workflow中默认执行的creator是Schedulis,由于在管理台中未配置Schedulis的相关引擎参数,导致读取的参数全为空。 +在控制台中添加Schedulis的Category时报错,”Schedulis目录已存在“。由于调度系统中的creator是schedulis,导致无法添加Schedulis Category,为了更好的标识各个系统,所以将dss workflow中默认执行的creator改成nodeexcetion,该参数可以在dss-flow-execution-server.properties中添加wds.linkis.flow.job.creator.v1=nodeexecution一行配置即可。 \ No newline at end of file diff --git a/i18n/zh-CN/docusaurus-plugin-content-blog/2023-08-03-entrance-execution-analysis.md b/i18n/zh-CN/docusaurus-plugin-content-blog/2023-08-03-entrance-execution-analysis.md new file mode 100644 index 00000000000..440c6572e32 --- /dev/null +++ b/i18n/zh-CN/docusaurus-plugin-content-blog/2023-08-03-entrance-execution-analysis.md @@ -0,0 +1,76 @@ +--- +title: 【源码解读】Linkis1.1.1 Entrance执行分析 +authors: [guoshupei] +tags: [blog,linkis1.1.1,entrance] +--- +### 前言 + +以下是基于Linkisv1.1.1源码分析得出的图解:Entrance服务执行流程。 +后面所有的讲解都是围绕这一张图,所以在看讲解时,请参考整个图去理解。讲解思路是化整为零,积点成线,集线成面。 +![](/static/Images/blog/entry-service-execution-process.jpg) + +将上图进行大致划分,分为: +环境初始化区:整个Entrance服务启动时,需要初始化的EntranceContext +提交任务区:用户调用EntranceRestfulApi接口提交任务,以及Job构建,拦截器操作等 +执行区:从提交区提交过来的Job, 包含了整个Job生命周期的所有操作 +![](/static/Images/blog/entrance-context.png) + +### 环境初始化区 +![](/static/Images/blog/env-init.png) +``` 
+Entrance功能拆分很细,各司其职,易于扩展。整个环境的注入可查看EntranceSpringConfiguration配置类,下面从左到右依次介绍 + +PersistenceManager(QueryPersistenceManager)持久化管理 +:主要作用对象是job,已经定义好了state,progress,result等操作。QueryPersistenceEngine和EntranceResultSetEngine就是其中一种实现,如果存储类型有变更,只需要额外新增实现,在entrance注入更改注入类就可实现切换。 + +EntranceParser(CommonEntranceParser)参数解析器:主要有三个方法,parseToTask (json -> +request),parseToJob (request -> job),parseToJobRequest (job -> +request),此过程大致可以表示为:json -> request <=> job + +LogManager(CacheLogManager)日志管理: 打印日志以及更新错误码等 + +Scheduler(ParallelScheduler)调度器:负责job分发,job执行环境初始化等,linkis按同租户同任务类型进行分组,很多设置都是基于这个分组原则设置的,比如并行度,资源等。所以这里有三个重要功能组件,并抽象出一个SchedulerContext上下文环境管理: +1)GroupFactory(EntranceGroupFactory)group工厂:按分组创建group,并以groupName为key缓存group。group主要记录一些参数,如并发数,线程数等 +2)ConsumerManager(ParallelConsumerManager)消费管理器:按分组创建consumer,并以groupName为key缓存consumer,并且初始化一个线程池,供所有consumer使用。consumer主要用于存放job,提交执行job等 +3)ExecutorManager(EntranceExecutorManagerImpl)executor管理:为每一个job创建一个executor,负责job整个生命周期所有操作 + +EntranceInterceptor拦截器:entrance服务所有拦截器 + +EntranceEventListenerBus事件监听服务:一个通用事件监听服务,本质是个轮询线程,内置线程池,线程数5,添加event会向注册的listener按事件类型分发事件 +``` + +### 提交任务区 +![](/static/Images/blog/submit-task.png) +``` +主要以用户调用EntranceRestfulApi的execute()方法讲解。主要有四个重要步骤 + +ParseToTask:接收到请求json后,先转化为request,依赖PersistenceManager保存到数据库,得到taskId +调用所有拦截器Interceptors +ParseToJob:将request转化为EntranceExecutionJob,并设置CodeParser,通过job.init()将job解析并构建SubJobInfo和SubJobDetail对象(v1.2.0已经没有SubJob了) +提交job到Scheduler,得到execId +``` + +### 执行区 +![](/static/Images/blog/excute-area.png) +``` +ParallelGroup:存放一些参数,FIFOUserConsumer会用到,参数变更应该不会实时生效 + +FIFOUserConsumer: +1. 内含ConsumeQueu(LoopArrayQueue),环队列,大小为maxCapacity,添加job采用offer方法,如果队列满了则返回None,业务上报错。 +2. 本质是个线程,轮询调用loop()方法,每次只取一个job,并通过ExecutorManager创建一个executor,使用线程池提交job +3. 并发数由ParallelGroup的maxRunningJobs决定,任务会优先获取需要重试的任务。 + +DefaultEntranceExecutor:executor负责监控整个Job提交,每次提交一个SubJobInfo。大致步骤总结: +1. 异步提交Orchestrator并返回orchestratorFuture +2. orchestratorFuture注册dealResponse函数, +dealResponse:subJob成功,有下一个继续提交SubJob,没有则调用notify告知Job成功,如果SubJob失败,则notify告知Job失败,判断重试,重新创建executor +3. 
创建EngineExecuteAsyncReturn,注入orchestratorFuture + +提交过程: + +FIFOUserConsumer通过loop()获取一个job +获取一个DefaultEntranceExecutor,注入到job中 +通过线程池调用job的run方法,job内触发DefaultEntranceExecutor的execute +提交Orchestrator并等待调用dealResponse,触发notify +更改job状态,判断重试,继续提交 +``` diff --git a/i18n/zh-CN/docusaurus-plugin-content-blog/2023-08-03-linkis-dss-ansible.md b/i18n/zh-CN/docusaurus-plugin-content-blog/2023-08-03-linkis-dss-ansible.md new file mode 100644 index 00000000000..3c55d582947 --- /dev/null +++ b/i18n/zh-CN/docusaurus-plugin-content-blog/2023-08-03-linkis-dss-ansible.md @@ -0,0 +1,134 @@ +--- +title: 【安装部署】Linkis1.3.0+DSS1.1.1 Ansible 单机一键安装脚本 +authors: [wubolive] +tags: [blog,linkis1.3.0,ansible] +--- +### 一、简介 + +为解决繁琐的部署流程,简化安装步骤,本脚本提供一键安装最新版本的DSS+Linkis环境;部署包中的软件采用我自己编译的安装包,并且为最新版本:DSS1.1.1 + Linkis1.3.0。 + +#### 版本介绍 +以下版本及配置信息可参考安装程序hosts文件中的[all:vars]字段。 + +| 软件名称 | 软件版本 | 应用路径 | 测试/连接命令 | +|------------------|--------------|-----------------------|---------------------------------------| +| MySQL | mysql-5.6 | /usr/local/mysql | mysql -h 127.0.0.1 -uroot -p123456 | +| JDK | jdk1.8.0_171 | /usr/local/java | java -version | +| Python | python 2.7.5 | /usr/lib64/python2.7 | python -V | +| Nginx | nginx/1.20.1 | /etc/nginx | nginx -t | +| Hadoop | hadoop-2.7.2 | /opt/hadoop | hdfs dfs -ls / | +| Hive | hive-2.3.3 | /opt/hive | hive -e "show databases" | +| Spark | spark-2.4.3 | /opt/spark | spark-sql -e "show databases" | +| dss | dss-1.1.1 | /home/hadoop/dss | http://<服务器IP>:8085 | +| links | linkis-1.3.0 | /home/hadoop/linkis | http://<服务器IP>:8188 | +| zookeeper | 3.4.6 | /usr/local/zookeeper | 无 | +| DolphinScheduler | 1.3.9 | /opt/dolphinscheduler | http://<服务器IP>:12345/dolphinscheduler | +| Visualis | 1.0.0 | /opt/visualis-server | http://<服务器IP>:9088 | +| Qualitis | 0.9.2 | /opt/qualitis | http://<服务器IP>:8090 | +| Streamis | 0.2.0 | /opt/streamis | http://<服务器IP>:9188 | +| Sqoop | 1.4.6 | /opt/sqoop | sqoop | +| Exchangis | 1.0.0 | /opt/exchangis | http://<服务器IP>:8028 | + + +### 二、部署前注意事项 + +要求: + +- 本脚本仅在CentOS 7系统上测试过,请确保安装的服务器为CentOS 7。 +- 仅安装DSS+Linkis服务器内存至少16G,安装全部服务内存至少32G。 +- 安装前请关闭服务器防火墙及SElinux,并使用root用户进行操作。 +- 安装服务器必须通畅的访问互联网,脚本需要yum下载一些基础软件。 +- 保证服务器未安装任何软件,包括不限于java、mysql、nginx等,最好是全新系统。 +- 必须保证服务器除lo:127.0.0.1回环地址外,仅只有一个IP地址,可使用echo $(hostname -I)命令测试。 + + +### 三、部署方法 + +本案例部署主机IP为192.168.1.52,以下步骤请按照自己实际情况更改。 + +#### 3.1 安装前设置 +``` +### 安装ansible +$ yum -y install epel-release +$ yum -y install ansible + +### 配置免密 +$ ssh-keygen -t rsa +$ ssh-copy-id root@192.168.1.52 + +### 关闭防火墙及SELinux +$ systemctl stop firewalld.service && systemctl disable firewalld.service +$ sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config && setenforce 0 +``` + +#### 3.2 部署linkis+dss +``` +### 获取安装包 +$ git clone https://github.com/wubolive/dss-linkis-ansible.git +$ cd dss-linkis-ansible + +### 目录说明 +dss-linkis-ansible +├── ansible.cfg # ansible 配置文件 +├── hosts # hosts主机及变量配置 +├── playbooks # playbooks剧本 +├── README.md # 说明文档 +└── roles # 角色配置 + +### 配置部署主机(注:ansible_ssh_host的值不能设置127.0.0.1) +$ vim hosts +[deploy] +dss-service ansible_ssh_host=192.168.1.52 ansible_ssh_port=22 + +### 下载安装包到download目录(如果下载失败,可以手动下载放到该目录) +$ ansible-playbook playbooks/download.yml + +### 一键安装Linkis+DSS +$ ansible-playbook playbooks/all.yml +...... 
+TASK [dss : 打印访问信息] ***************************************************************************************** +ok: [dss-service] => { + "msg": [ + "*****************************************************************", + " 访问 http://192.168.1.52 查看访问信息 ", + "*****************************************************************" + ] +} +``` +执行结束后,即可访问:http://192.168.1.52 查看信息页面,上面记录了所有服务的访问地址及账号密码。 +![](/static/Images/blog/view-information-page.png) + +#### 3.3 部署其它服务 +``` +# 安装dolphinscheduler +$ ansible-playbook playbooks/dolphinscheduler.yml +### 注: 安装以下服务必须优先安装dolphinscheduler调度系统 +# 安装visualis +$ ansible-playbook playbooks/visualis.yml +# 安装qualitis +$ ansible-playbook playbooks/qualitis.yml +# 安装streamis +$ ansible-playbook playbooks/streamis.yml +# 安装exchangis +$ ansible-playbook playbooks/exchangis.yml +``` +#### 3.4 维护指南 +``` +### 查看实时日志 +$ su - hadoop +$ tail -f ~/linkis/logs/*.log ~/dss/logs/*.log + +### 启动DSS+Linkis服务(如服务器重启可使用此命令一建启动) +$ ansible-playbook playbooks/all.yml -t restart +# 启动zookeeper +$ sh /usr/local/zookeeper/bin/zkServer.sh start +# 启动其它服务 +$ su - hadoop +$ cd /opt/dolphinscheduler/bin && sh start-all.sh +$ cd /opt/visualis-server/bin && sh start-visualis-server.sh +$ cd /opt/qualitis/bin/ && sh start.sh +$ cd /opt/streamis/streamis-server/bin/ && sh start-streamis-server.sh +$ cd /opt/exchangis/sbin/ && ./daemon.sh start server +``` + +使用问题请访问官方QA文档:https://docs.qq.com/doc/DSGZhdnpMV3lTUUxq \ No newline at end of file diff --git a/i18n/zh-CN/docusaurus-plugin-content-blog/2023-08-03-linkis-dss-compile-deployment.md b/i18n/zh-CN/docusaurus-plugin-content-blog/2023-08-03-linkis-dss-compile-deployment.md new file mode 100644 index 00000000000..6fc44ff95f2 --- /dev/null +++ b/i18n/zh-CN/docusaurus-plugin-content-blog/2023-08-03-linkis-dss-compile-deployment.md @@ -0,0 +1,350 @@ +--- +title: 【开发经验】Apache linkis +DSS 的编译到部署 +authors: [huasir] +tags: [blog,linkis,dss] +--- +### 背景 + +随着业务的发展,和社区产品的更新迭代,我们发现 linkis1.2.0和dss1.1.1能够更好的满足我们对实时数仓和机器学习需求。同时相较于我们目前使用的linkis0.9.3和dss0.7.0, 在任务调度方面和插件接入等方面也有很大的结构调整和设计优化。基于以上原因,我们现在需要将现有的版本进行升级,由于版本跨度较大,我们的升级思路是重部署新的版本,并将原有的业务数据进行迁移,如下是具体的实践操作,希望给大家带来参考。 + +### 获取源码 +![](/static/Images/blog/resource-code.png) + +``` + git clone git@github.com:yourgithub/incubator-linkis.git + git clone git@github.com:yourgithub/DataSphereStudio.git +``` +如果没有打算提交pr的开放人员,也可以在官方直接下载zip源码包 + +### 编译打包 + +#### 1. 确定版本配套 +linkis: 1.2.0 +dss: 1.1.0 +hadoop: 3.1.1 +spark: 2.3.2 +hive: 3.1.0 + +#### 2. linkis1.2.0编译打包 +``` +git checkout -b release-1.2.0 origin/release-1.2.0 +mvn -N install +mvn clean install -DskipTests +``` + +安装包路径:incubator-linkis/linkis-dist/target/apache-linkis-1.2.0-incubating-bin.tar.gz +为了适配我们自己的版本配套,需要调整pom并重新编译 + +1. 将mysql驱动mysql-connector-java的scope注释掉(incubator-linkis/pom.xml) +``` + + mysql + mysql-connector-java + ${mysql.connector.version} + + +``` +2. 修改hadoop版本号(incubator-linkis/pom.xml) +``` + + 3.1.1 + +``` + +3. hadoop3 需要调整hadoop-hdfs的artifactId值(incubator-linkis/linkis-commons/linkis-hadoop-common/pom.xml) +``` + + org.apache.hadoop + + hadoop-hdfs-client + +``` +4. 调整hive引擎的hive版本(incubator-linkis/linkis-engineconn-plugins/hive/pom.xml) +``` + + 3.1.0 + +``` +5. linkis-metadata-query-service-hive的hive版本和hadoop版本也需要调整(incubator-linkis/linkis-public-enhancements/linkis-datasource/linkis-metadata-query/service/hive/pom.xml) +``` + + UTF-8 + 3.1.1 + 3.1.0 + 4.2.4 + +``` +6. 调整spark引擎的spark版本(incubator-linkis/linkis-engineconn-plugins/spark/pom.xml) +``` + + 2.3.2 + +``` +#### 3. 
links1.2.0管理端打包 +``` +cd incubator-linkis/linkis-web +npm install +npm run build +#如果比较慢可以用cnpm +npm install -g cnpm --registry=https://registry.npm.taobao.org +cnpm install +``` +安装包路径:incubator-linkis/linkis-web/apache-linkis-1.2.0-incubating-web-bin.tar.gz +#### 4. dss1.1.0编译打包 +``` +git checkout -b branch-1.1.0 origin/branch-1.1.0 +mvn -N install +mvn clean install -DskipTests +``` +安装包路径:DataSphereStudio/assembly/target/wedatasphere-dss-1.1.0-dist.tar.gz + +#### 5. dss1.1.0前端编译打包 +``` +cd web/ +npm install lerna -g +lerna bootstrap #安装依赖 +``` +安装包路径:DataSphereStudio/web/dist/ + +### 部署安装 +#### 环境说明 + +| master | slave1 | slave2 | slave3 | app | +|--------------------|--------|--------|------------|--------------------------------| +| linksi0.9.3,nginx | mysql | | dss-0.7.0 | | +| | | | | links1.20,dss1.1.0,nginx,mysql | +| hadoop | hadoop | hadoop | hadoop | hadoop | + +说明:总共5台机器,大数据基础环境都已安装,hadoop,hive,spark等 +先在app机器安装新版本links1.2.0, dss1.1.0.保留原有的linkis版本可用,待新的部署好以后,再对老版本的的数据进行迁移 + +#### 归集安装包 + +![](/static/Images/blog/collect-installation-package.png) + +#### 安装mysql +``` +docker pull mysql:5.7.40 +docker run -it -d -p 23306:3306 -e MYSQL_ROOT_PASSWORD=app123 -d mysql:5.7.40 +``` +#### 安装linkis +``` +tar zxvf apache-linkis-1.2.0-incubating-bin.tar.gz -C linkis +cd linkis +vi deploy-config/db.sh # 配置数据库 +``` +![](/static/Images/blog/install-linkis.png) + +关键参数配置 +``` +deployUser=root +YARN_RESTFUL_URL=http://master:18088 +#HADOOP +HADOOP_HOME=/usr/hdp/3.1.5.0-152/hadoop +HADOOP_CONF_DIR=/etc/hadoop/conf +#HADOOP_KERBEROS_ENABLE=true +#HADOOP_KEYTAB_PATH=/appcom/keytab/ + +#Hive +HIVE_HOME=/usr/hdp/3.1.5.0-152/hive +HIVE_CONF_DIR=/etc/hive/conf + +#Spark +SPARK_HOME=/usr/hdp/3.1.5.0-152/spark2 +SPARK_CONF_DIR=/etc/spark2/conf + + +## Engine version conf +#SPARK_VERSION +SPARK_VERSION=2.3.2 + +##HIVE_VERSION +HIVE_VERSION=3.1.0 + +## java application default jvm memory +export SERVER_HEAP_SIZE="256M" + +##The decompression directory and the installation directory need to be inconsistent +#LINKIS_HOME=/root/linkis-dss/linkis +``` +安全保险执行一下chekcEnv.sh +``` +bin]# ./checkEnv.sh +``` +![](/static/Images/blog/check-env.png) +因为我本地使用的docker安装的mysql,所以需要额外安装一个mysql客户端 +``` +wget https://repo.mysql.com//mysql80-community-release-el7-7.noarch.rpm +rpm -Uvh mysql80-community-release-el7-7.noarch.rpm +yum-config-manager --enable mysql57-community +vi /etc/yum.repos.d/mysql-community.repo +#将mysql8的enable设置为0 +[mysql80-community] +name=MySQL 8.0 Community Server +baseurl=http://repo.mysql.com/yum/mysql-8.0-community/el/6/$basearch/ +enabled=1 +gpgcheck=1 +#安装 +yum install mysql-community-server +``` +尝试安装linkis +``` + sh bin/install.sh +``` +![](/static/Images/blog/sh-bin-install-sh.png) +打开ambari的spark2的管理界面添加环境变量并重启spark2的相关服务 +![](/static/Images/blog/advanced-spark2-env.png) +终于通过验证 +![](/static/Images/blog/check-env1.png) +``` +sh bin/install.sh +``` +第一次安装,数据库需要初始化,直接选择2 +![](/static/Images/blog/data-source-init-choose-2.png) + +根据官网提示,需要自己下载mysql驱动包并放到对应目录下,我习惯查了一下,发现已经有mysql包了额,应当是之前编译的时候去掉了mysql scope的原因,但是版本不对,我们生产使用的是5.7,但是驱动是mysql8的驱动包。所以大家在编译的时候最好先调整mysql驱动版本。 +![](/static/Images/blog/choose-true-mysql-version.png) + +手动调整一下mysql驱动版本,降原来的高版本注释掉 +``` +wget https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.49/mysql-connector-java-5.1.49.jar +cp mysql-connector-java-5.1.49.jar lib/linkis-spring-cloud-services/linkis-mg-gateway/ +cp mysql-connector-java-5.1.49.jar lib/linkis-commons/public-module/ +mv mysql-connector-java-8.0.28.jar 
mysql-connector-java-8.0.28.jar.bak#需要cd到对应的lib下执行 +sh sbin/linkis-start-all.sh +``` +浏览器打开,http://app:20303/,共10个服务,好像没问题 +![](/static/Images/blog/open-eureka-service.png) + +#### 安装linkis-web +``` +tar -xvf apache-linkis-1.2.0-incubating-web-bin.tar.gz -C linkis-web/ +cd linkis-web +sh install.sh +``` +第一次访问http://app:8088/#/login 报错403,经过查证需要修改nginx中conf个的部署用 +``` +cd /etc/nginx +vi nginx.conf +user root; # 降默认的用户改成自己的root +nginx -s reload +``` +再次访问,好像正确了额 +![](/static/Images/blog/login.png) + +查看默认的用户名和密码 +``` +cat LinkisInstall/conf/linkis-mg-gateway.properties +``` +![](/static/Images/blog/linkis-mg-gateway.png) + +登录linkis管理台 +![](/static/Images/blog/linkis-console.png) + +使用linkis-cli进行快速验证 +``` +sh bin/linkis-cli -submitUser root -engineType hive-3.1.0 -codeType hql -code "show tables" + +============Result:================ +TaskId:5 +ExecId: exec_id018008linkis-cg-entranceapp:9104LINKISCLI_root_hive_0 +User:root +Current job status:FAILED +extraMsg: +errDesc: 21304, Task is Failed,errorMsg: errCode: 12003 ,desc: app:9101_4 Failed to async get EngineNode AMErrorException: errCode: 30002 ,desc: ServiceInstance(linkis-cg-engineconn, app:34197) ticketID:24ab8eed-2a9b-4012-9052-ec1f64b85b5f 初始化引擎失败,原因: ServiceInsta + +[INFO] JobStatus is not 'success'. Will not retrieve result-set. +``` + +管理台查看日志信息 +![](/static/Images/blog/console-log-info.png) + +我的的hive使用的是tez引擎,需要手动将tez引擎相关的包拷贝到hive插件的lib下 +``` +cp -r /usr/hdp/current/tez-client/tez-* ./lib/linkis-engineconn-plugins/hive/dist/v3.1.0/ +sh sbin/linkis-daemon.sh restart cg-engineplugin +sh bin/linkis-cli -submitUser root -engineType hive-3.1.0 -codeType hql -code "show tables" +``` + +再次跑动,还是没运行起来,好像是缺失jackson的库 +![](/static/Images/blog/miss-jackson-jar.png) + +``` +// 需要添linkis-commons/linkis-hadoop-common需要手动添加依赖,重新打包 + + org.apache.hadoop + hadoop-yarn-common + ${hadoop.version} + +``` + +再次跑动,还是没运行起来,日志如下: +``` +2022-11-09 18:09:44.009 ERROR Job with execId-LINKISCLI_root_hive_0 + subJobId : 51 execute failed,21304, Task is Failed,errorMsg: errCode: 12003 ,desc: app:9101_0 Failed to async get EngineNode AMErrorException: errCode: 30002 ,desc: ServiceInstance(linkis-cg-engineconn, app:42164) ticketID:91f72f2a-598c-4384-9132-09696012d5b5 初始化引擎失败,原因: ServiceInstance(linkis-cg-engineconn, app:42164): log dir: /appcom/tmp/root/20221109/hive/91f72f2a-598c-4384-9132-09696012d5b5/logs,SessionNotRunning: TezSession has already shutdown. 
Application application_1666169891027_0067 failed 2 times due to AM Container for appattempt_1666169891027_0067_000002 exited with exitCode: 1 +``` + +从日志上看,是yarn的app运行异常,查看yarn的container日志: +``` +Log Type: syslog + +Log Upload Time: Wed Nov 09 18:09:41 +0800 2022 + +Log Length: 1081 + +2022-11-09 18:09:39,073 [INFO] [main] |app.DAGAppMaster|: Creating DAGAppMaster for applicationId=application_1666169891027_0067, attemptNum=1, AMContainerId=container_e19_1666169891027_0067_01_000001, jvmPid=25804, userFromEnv=root, cliSessionOption=true, pwd=/hadoop/yarn/local/usercache/root/appcache/application_1666169891027_0067/container_e19_1666169891027_0067_01_000001, localDirs=/hadoop/yarn/local/usercache/root/appcache/application_1666169891027_0067, logDirs=/hadoop/yarn/log/application_1666169891027_0067/container_e19_1666169891027_0067_01_000001 +2022-11-09 18:09:39,123 [ERROR] [main] |app.DAGAppMaster|: Error starting DAGAppMaster +java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V + at org.apache.hadoop.conf.Configuration.set(Configuration.java:1358) + at org.apache.hadoop.conf.Configuration.set(Configuration.java:1339) + at org.apache.tez.common.TezUtilsInternal.addUserSpecifiedTezConfiguration(TezUtilsInternal.java:94) + at org.apache.tez.dag.app.DAGAppMaster.main(DAGAppMaster.java:2432) +``` +日志上看结合百度相关资料,提示是guava的版本问题,首先先确认是否hive引擎中的guava版本和hadoop里的guava版本一致,如果一致 +还有一种可能就是hive-exec的版本问题,因为我是用的ambari部署的hive,所以最好用ambari中的hive相关的jar替换掉插件引擎中相关的hive包。遇到的问题是后一种,花了很长时间才排查出。 +``` +(base) [root@app lib]# pwd +/root/linkis-dss/linkis/LinkisInstall/lib/linkis-engineconn-plugins/hive/dist/v3.1.0/lib +(base) [root@app lib]# ls -l | grep hive +-rw-r--r-- 1 root root 140117 Nov 10 13:44 hive-accumulo-handler-3.1.0.3.1.5.0-152.jar +lrwxrwxrwx 1 root root 43 Nov 10 13:44 hive-accumulo-handler.jar -> hive-accumulo-handler-3.1.0.3.1.5.0-152.jar +-rw-r--r-- 1 root root 161078 Nov 10 13:44 hive-beeline-3.1.0.3.1.5.0-152.jar +lrwxrwxrwx 1 root root 34 Nov 10 13:44 hive-beeline.jar -> hive-beeline-3.1.0.3.1.5.0-152.jar +-rw-r--r-- 1 root root 11508 Nov 10 13:44 hive-classification-3.1.0.3.1.5.0-152.jar +lrwxrwxrwx 1 root root 41 Nov 10 13:44 hive-classification.jar -> hive-classification-3.1.0.3.1.5.0-152.jar +-rw-r--r-- 1 root root 45753 Nov 10 13:44 hive-cli-3.1.0.3.1.5.0-152.jar +lrwxrwxrwx 1 root root 30 Nov 10 13:44 hive-cli.jar -> hive-cli-3.1.0.3.1.5.0-152.jar +-rw-r--r-- 1 root root 509029 Nov 10 13:44 hive-common-3.1.0.3.1.5.0-152.jar +lrwxrwxrwx 1 root root 33 Nov 10 13:44 hive-common.jar -> hive-common-3.1.0.3.1.5.0-152.jar +-rw-r--r-- 1 root root 127200 Nov 10 13:44 hive-contrib-3.1.0.3.1.5.0-152.jar +lrwxrwxrwx 1 root root 34 Nov 10 13:44 hive-contrib.jar -> hive-contrib-3.1.0.3.1.5.0-152.jar +-rw-r--r-- 1 root root 51747254 Nov 10 13:44 hive-druid-handler-3.1.0.3.1.5.0-152.jar +lrwxrwxrwx 1 root root 40 Nov 10 13:44 hive-druid-handler.jar -> hive-druid-handler-3.1.0.3.1.5.0-152.jar +-rw-r--r-- 1 root root 42780917 Nov 10 13:44 hive-exec-3.1.0.3.1.5.0-152.jar +lrwxrwxrwx 1 root root 31 Nov 10 13:44 hive-exec.jar -> hive-exec-3.1.0.3.1.5.0-152.jar +.................. +``` +再次跑动,运行成功,到此,linkis部署好像没有问题了额 + +### 安装DSS +说明:DSS因为是另外以为同事安装的,这里我就不再展示,具体可以参考官网的安装,这里我主要说明一下dss和linkis集成时遇到的问题。 +1. 
dss 登录时,linkis-mg-gateway的日志显示 TooManyServiceException + 如图: + ![](/static/Images/blog/install-dss.png) + 具体gateway中日志如下 +``` +2022-11-11 11:27:06.194 [WARN ] [reactor-http-epoll-6 ] o.a.l.g.r.DefaultGatewayRouter (129) [apply] - org.apache.linkis.gateway.exception.TooManyServiceException: errCode: 11010 ,desc: Cannot find a correct serviceId for parsedServiceId dss, service list is: List(dss-framework-project-server, dss-apiservice-server, dss-scriptis-server, dss-framework-orchestrator-server-dev, dss-flow-entrance, dss-guide-server, dss-workflow-server-dev) ,ip: app ,port: 9001 ,serviceKind: linkis-mg-gateway + at org.apache.linkis.gateway.route.DefaultGatewayRouter$$anonfun$org$apache$linkis$gateway$route$DefaultGatewayRouter$$findCommonService$1.apply(GatewayRouter.scala:101) ~[linkis-gateway-core-1.2.0.jar:1.2.0] + at org.apache.linkis.gateway.route.DefaultGatewayRouter$$anonfun$org$apache$linkis$gateway$route$DefaultGatewayRouter$$findCommonService$1.apply(GatewayRouter.scala:100) ~[linkis-gateway-core-1.2.0.jar:1.2.0] + at org.apache.linkis.gateway.route.AbstractGatewayRouter.findService(GatewayRouter.scala:70) ~[linkis-gatew +``` +大概的意思就是找不到dss,无独有偶,我在dss中的plugin下面发现有一段gaeway parser代码,尝试拷贝到GatewayParser的parse方法的case COMMON_REGEX的前面,再根据编译提示引入需要依赖的方法,变量和包。如图: +![](/static/Images/blog/gateway-parse-code.png) + +顺利登录(记得要重启linkis的mg-geteway服务)。 + +登录进去后如果发现报错,提示需要管理创建工作目录,可以在linkis-ps-publicservice.properties中配置如下属性,然后重启ps-publicservice服务 +``` +#LinkisInstall/conf/linkis-ps-publicservice.properties +#Workspace +linkis.workspace.filesystem.auto.create=true +``` diff --git a/static/Images/blog/advanced-spark2-env.png b/static/Images/blog/advanced-spark2-env.png new file mode 100644 index 00000000000..6e8a7e4c104 Binary files /dev/null and b/static/Images/blog/advanced-spark2-env.png differ diff --git a/static/Images/blog/bml-service.png b/static/Images/blog/bml-service.png new file mode 100644 index 00000000000..fed79f7555b Binary files /dev/null and b/static/Images/blog/bml-service.png differ diff --git a/static/Images/blog/check-env.png b/static/Images/blog/check-env.png new file mode 100644 index 00000000000..3b90db420c1 Binary files /dev/null and b/static/Images/blog/check-env.png differ diff --git a/static/Images/blog/check-env1.png b/static/Images/blog/check-env1.png new file mode 100644 index 00000000000..ef89690a29d Binary files /dev/null and b/static/Images/blog/check-env1.png differ diff --git a/static/Images/blog/choose-true-mysql-version.png b/static/Images/blog/choose-true-mysql-version.png new file mode 100644 index 00000000000..0cd58a723e3 Binary files /dev/null and b/static/Images/blog/choose-true-mysql-version.png differ diff --git a/static/Images/blog/collect-installation-package.png b/static/Images/blog/collect-installation-package.png new file mode 100644 index 00000000000..2cefdb5c88c Binary files /dev/null and b/static/Images/blog/collect-installation-package.png differ diff --git a/static/Images/blog/console-log-info.png b/static/Images/blog/console-log-info.png new file mode 100644 index 00000000000..cb2e21cfa81 Binary files /dev/null and b/static/Images/blog/console-log-info.png differ diff --git a/static/Images/blog/data-source-init-choose-2.png b/static/Images/blog/data-source-init-choose-2.png new file mode 100644 index 00000000000..bb6ea229816 Binary files /dev/null and b/static/Images/blog/data-source-init-choose-2.png differ diff --git a/static/Images/blog/default-engine-conn-resource-service.png b/static/Images/blog/default-engine-conn-resource-service.png 
new file mode 100644 index 00000000000..0f30be213cc Binary files /dev/null and b/static/Images/blog/default-engine-conn-resource-service.png differ diff --git a/static/Images/blog/default-engine-conn-uml.png b/static/Images/blog/default-engine-conn-uml.png new file mode 100644 index 00000000000..e83fb1be257 Binary files /dev/null and b/static/Images/blog/default-engine-conn-uml.png differ diff --git a/static/Images/blog/engine-connbml-resoure.png b/static/Images/blog/engine-connbml-resoure.png new file mode 100644 index 00000000000..ea93a751ed1 Binary files /dev/null and b/static/Images/blog/engine-connbml-resoure.png differ diff --git a/static/Images/blog/entrance-context.png b/static/Images/blog/entrance-context.png new file mode 100644 index 00000000000..76fba2e28eb Binary files /dev/null and b/static/Images/blog/entrance-context.png differ diff --git a/static/Images/blog/entry-service-execution-process.jpg b/static/Images/blog/entry-service-execution-process.jpg new file mode 100644 index 00000000000..de0121a3558 Binary files /dev/null and b/static/Images/blog/entry-service-execution-process.jpg differ diff --git a/static/Images/blog/env-init.png b/static/Images/blog/env-init.png new file mode 100644 index 00000000000..9b093e26ec7 Binary files /dev/null and b/static/Images/blog/env-init.png differ diff --git a/static/Images/blog/excute-area.png b/static/Images/blog/excute-area.png new file mode 100644 index 00000000000..e590fcaf4bd Binary files /dev/null and b/static/Images/blog/excute-area.png differ diff --git a/static/Images/blog/gateway-parse-code.png b/static/Images/blog/gateway-parse-code.png new file mode 100644 index 00000000000..af7551a85ae Binary files /dev/null and b/static/Images/blog/gateway-parse-code.png differ diff --git a/static/Images/blog/hadoop-start-error.png b/static/Images/blog/hadoop-start-error.png new file mode 100644 index 00000000000..58fd54f11b7 Binary files /dev/null and b/static/Images/blog/hadoop-start-error.png differ diff --git a/static/Images/blog/install-dss.png b/static/Images/blog/install-dss.png new file mode 100644 index 00000000000..5c2c6186641 Binary files /dev/null and b/static/Images/blog/install-dss.png differ diff --git a/static/Images/blog/install-linkis.png b/static/Images/blog/install-linkis.png new file mode 100644 index 00000000000..296630a522d Binary files /dev/null and b/static/Images/blog/install-linkis.png differ diff --git a/static/Images/blog/jdbc-connection-manager.png b/static/Images/blog/jdbc-connection-manager.png new file mode 100644 index 00000000000..dcf5bf302f6 Binary files /dev/null and b/static/Images/blog/jdbc-connection-manager.png differ diff --git a/static/Images/blog/linkis-cg-engine-conn-plugin-bml-resource.png b/static/Images/blog/linkis-cg-engine-conn-plugin-bml-resource.png new file mode 100644 index 00000000000..4283d8e4a09 Binary files /dev/null and b/static/Images/blog/linkis-cg-engine-conn-plugin-bml-resource.png differ diff --git a/static/Images/blog/linkis-console.png b/static/Images/blog/linkis-console.png new file mode 100644 index 00000000000..9cf468be476 Binary files /dev/null and b/static/Images/blog/linkis-console.png differ diff --git a/static/Images/blog/linkis-mg-gateway.png b/static/Images/blog/linkis-mg-gateway.png new file mode 100644 index 00000000000..9b0bd59dbb3 Binary files /dev/null and b/static/Images/blog/linkis-mg-gateway.png differ diff --git a/static/Images/blog/linkis-ps-bml.png b/static/Images/blog/linkis-ps-bml.png new file mode 100644 index 00000000000..176fcc632ac Binary files 
/dev/null and b/static/Images/blog/linkis-ps-bml.png differ diff --git a/static/Images/blog/login.png b/static/Images/blog/login.png new file mode 100644 index 00000000000..d0bde8705e3 Binary files /dev/null and b/static/Images/blog/login.png differ diff --git a/static/Images/blog/miss-jackson-jar.png b/static/Images/blog/miss-jackson-jar.png new file mode 100644 index 00000000000..cc40024b2dc Binary files /dev/null and b/static/Images/blog/miss-jackson-jar.png differ diff --git a/static/Images/blog/no-plugin-error.png b/static/Images/blog/no-plugin-error.png new file mode 100644 index 00000000000..58fd54f11b7 Binary files /dev/null and b/static/Images/blog/no-plugin-error.png differ diff --git a/static/Images/blog/open-eureka-service.png b/static/Images/blog/open-eureka-service.png new file mode 100644 index 00000000000..21118c3856a Binary files /dev/null and b/static/Images/blog/open-eureka-service.png differ diff --git a/static/Images/blog/public-enhancement-service.png b/static/Images/blog/public-enhancement-service.png new file mode 100644 index 00000000000..bf353314b6c Binary files /dev/null and b/static/Images/blog/public-enhancement-service.png differ diff --git a/static/Images/blog/pyspark-no-such-field-error.png b/static/Images/blog/pyspark-no-such-field-error.png new file mode 100644 index 00000000000..f34b8d2da0f Binary files /dev/null and b/static/Images/blog/pyspark-no-such-field-error.png differ diff --git a/static/Images/blog/pyspark-task-error.png b/static/Images/blog/pyspark-task-error.png new file mode 100644 index 00000000000..7f845a49835 Binary files /dev/null and b/static/Images/blog/pyspark-task-error.png differ diff --git a/static/Images/blog/resource-code.png b/static/Images/blog/resource-code.png new file mode 100644 index 00000000000..066a667c07b Binary files /dev/null and b/static/Images/blog/resource-code.png differ diff --git a/static/Images/blog/sh-bin-install-sh.png b/static/Images/blog/sh-bin-install-sh.png new file mode 100644 index 00000000000..828713e4f6d Binary files /dev/null and b/static/Images/blog/sh-bin-install-sh.png differ diff --git a/static/Images/blog/submit-task.png b/static/Images/blog/submit-task.png new file mode 100644 index 00000000000..8a0a788e7ec Binary files /dev/null and b/static/Images/blog/submit-task.png differ diff --git a/static/Images/blog/upload-reslut.png b/static/Images/blog/upload-reslut.png new file mode 100644 index 00000000000..4f7282d750c Binary files /dev/null and b/static/Images/blog/upload-reslut.png differ diff --git a/static/Images/blog/view-information-page.png b/static/Images/blog/view-information-page.png new file mode 100644 index 00000000000..524f4a135cf Binary files /dev/null and b/static/Images/blog/view-information-page.png differ