[Question] linkis-cg-linkismanager fails to start during 1.5.0 deployment. #5068

Closed
1 task done
tigerHM opened this issue Jan 12, 2024 · 2 comments
Labels
Question Further information is requested

tigerHM commented Jan 12, 2024

Before asking

Your environment

  • Linkis version used: 1.5.0
  • Environment name and version:
    • hdp-3.1.5
    • hadoop-3.1.1
    • hive-3.1.0
    • spark-3.1.2
    • jdk 1.8.0_321
    • ....

Describe your questions

LINKIS-CG-ENTRANCE, LINKIS-MG-EUREKA, LINKIS-MG-GATEWAY, and LINKIS-PS-PUBLICSERVICE start normally, but linkis-cg-linkismanager fails to start.

The linkis-cg-linkismanager.log output is as follows:
2024-01-12 11:06:02.109 [INFO ] [main ] o.a.l.DataWorkCloudApplication (61) [logStarted] [JobId-] - Started DataWorkCloudApplication in 17.221 seconds (JVM running for 21.334)
2024-01-12 11:06:04.654 [INFO ] [Linkis-Default-Scheduler-Thread-1 ] o.a.l.e.s.s.DefaultEngineConnResourceService (126) [run] [JobId-] - Try to initialize sparkEngineConn-3.2.1.
2024-01-12 11:06:04.675 [INFO ] [Linkis-Default-Scheduler-Thread-1 ] o.a.l.e.s.s.DefaultEngineConnResourceService (225) [refresh] [JobId-] - Ready to upload a new bmlResource for sparkEngineConn-3.2.1. path: conf.zip
2024-01-12 11:06:08.811 [INFO ] [Linkis-Default-Scheduler-Thread-1 ] o.a.l.h.d.DWSHttpClient (150) [addAttempt$1] [JobId-] - invoke http://xxx:9001/api/rest_j/v1/bml/upload get status 400 taken: 4.0 s.
2024-01-12 11:06:09.083 [ERROR] [Linkis-Default-Scheduler-Thread-1 ] o.a.l.e.s.s.DefaultEngineConnResourceService (135) [run] [JobId-] - Failed to upload engine conn to bml, now exit! org.apache.linkis.httpclient.exception.HttpClientResultException: errCode: 10905 ,desc: URL /api/rest_j/v1/bml/upload request failed! ResponseBody is {"method":null,"status":1,"message":"error code(错误码): 60050, error message(错误信息): The first upload of the resource failed(首次上传资源失败).","data":{"errorMsg":{"serviceKind":"linkis-ps-publicservice","port":9105,"level":2,"errCode":50073,"ip":"xxxx","desc":"The commit upload resource task failed(提交上传资源任务失败):errCode: 60050 ,desc: The first upload of the resource failed(首次上传资源失败) ,ip: xxx ,port: 9105 ,serviceKind: linkis-ps-publicservice"}}}. errCode: 10905 ,desc: URL /api/rest_j/v1/bml/upload request failed! ResponseBody is {"method":null,"status":1,"message":"error code(错误码): 60050, error message(错误信息): The first upload of the resource failed(首次上传资源失败).","data":{"errorMsg":{"serviceKind":"linkis-ps-publicservice","port":9105,"level":2,"errCode":50073,"ip":"xxx","desc":"The commit upload resource task failed(提交上传资源任务失败):errCode: 60050 ,desc: The first upload of the resource failed(首次上传资源失败) ,ip: xxx ,port: 9105 ,serviceKind: linkis-ps-publicservice"}}}. ,ip: xxx ,port: 9101 ,serviceKind: linkis-cg-linkismanager ,ip: xxx ,port: 9101 ,serviceKind: linkis-cg-linkismanager
at org.apache.linkis.httpclient.dws.response.DWSResult.$anonfun$set$2(DWSResult.scala:86) ~[linkis-gateway-httpclient-support-1.5.0.jar:1.5.0]
at org.apache.linkis.httpclient.dws.response.DWSResult.$anonfun$set$2$adapted(DWSResult.scala:84) ~[linkis-gateway-httpclient-support-1.5.0.jar:1.5.0]
at org.apache.linkis.common.utils.Utils$.tryCatch(Utils.scala:69) ~[linkis-common-1.5.0.jar:1.5.0]
at org.apache.linkis.httpclient.dws.response.DWSResult.set(DWSResult.scala:84) ~[linkis-gateway-httpclient-support-1.5.0.jar:1.5.0]
at org.apache.linkis.httpclient.dws.response.DWSResult.set$(DWSResult.scala:57) ~[linkis-gateway-httpclient-support-1.5.0.jar:1.5.0]
at org.apache.linkis.bml.response.BmlResult.set(BmlResult.scala:26) ~[linkis-pes-client-1.5.0.jar:1.5.0]
at org.apache.linkis.httpclient.dws.DWSHttpClient.$anonfun$httpResponseToResult$1(DWSHttpClient.scala:83) ~[linkis-gateway-httpclient-support-1.5.0.jar:1.5.0]
at scala.Option.map(Option.scala:230) ~[scala-library-2.12.17.jar:?]
at org.apache.linkis.httpclient.dws.DWSHttpClient.httpResponseToResult(DWSHttpClient.scala:79) ~[linkis-gateway-httpclient-support-1.5.0.jar:1.5.0]
at org.apache.linkis.httpclient.AbstractHttpClient.$anonfun$responseToResult$1(AbstractHttpClient.scala:546) ~[linkis-httpclient-1.5.0.jar:1.5.0]
at org.apache.linkis.common.utils.Utils$.tryFinally(Utils.scala:77) ~[linkis-common-1.5.0.jar:1.5.0]
at org.apache.linkis.httpclient.AbstractHttpClient.responseToResult(AbstractHttpClient.scala:559) ~[linkis-httpclient-1.5.0.jar:1.5.0]
at org.apache.linkis.httpclient.AbstractHttpClient.execute(AbstractHttpClient.scala:183) ~[linkis-httpclient-1.5.0.jar:1.5.0]
at org.apache.linkis.httpclient.AbstractHttpClient.execute(AbstractHttpClient.scala:128) ~[linkis-httpclient-1.5.0.jar:1.5.0]
at org.apache.linkis.bml.client.impl.HttpBmlClient.uploadResource(HttpBmlClient.scala:412) ~[linkis-pes-client-1.5.0.jar:1.5.0]
at org.apache.linkis.engineplugin.server.service.DefaultEngineConnResourceService.uploadToBml(DefaultEngineConnResourceService.java:82) ~[linkis-application-manager-1.5.0.jar:1.5.0]
at org.apache.linkis.engineplugin.server.service.DefaultEngineConnResourceService.refresh(DefaultEngineConnResourceService.java:230) ~[linkis-application-manager-1.5.0.jar:1.5.0]
at org.apache.linkis.engineplugin.server.service.DefaultEngineConnResourceService.access$300(DefaultEngineConnResourceService.java:60) ~[linkis-application-manager-1.5.0.jar:1.5.0]
at org.apache.linkis.engineplugin.server.service.DefaultEngineConnResourceService$1.run(DefaultEngineConnResourceService.java:128) ~[linkis-application-manager-1.5.0.jar:1.5.0]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_321]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_321]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_321]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[?:1.8.0_321]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_321]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_321]
at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_321]

2024-01-12 11:06:09.106 [INFO ] [SpringContextShutdownHook ] o.s.c.n.e.s.EurekaServiceRegistry (65) [deregister] [JobId-] - Unregistering application LINKIS-CG-LINKISMANAGER with eureka with status DOWN
2024-01-12 11:06:09.107 [INFO ] [SpringContextShutdownHook ] c.n.d.DiscoveryClient (1352) [notify] [JobId-] - Saw local status change event StatusChangeEvent [timestamp=1705028769107, current=DOWN, previous=UP]
2024-01-12 11:06:09.109 [INFO ] [DiscoveryClient-InstanceInfoReplicator-0] c.n.d.DiscoveryClient (873) [register] [JobId-] - DiscoveryClient_LINKIS-CG-LINKISMANAGER/xxx:linkis-cg-linkismanager:9101: registering service...
2024-01-12 11:06:09.126 [INFO ] [DiscoveryClient-InstanceInfoReplicator-0] c.n.d.DiscoveryClient (882) [register] [JobId-] - DiscoveryClient_LINKIS-CG-LINKISMANAGER/xxx:linkis-cg-linkismanager:9101 - registration status: 204
2024-01-12 11:06:09.138 [INFO ] [SpringContextShutdownHook ] o.e.j.s.AbstractConnector (383) [doStop] [JobId-] - Stopped ServerConnector@6516181f{HTTP/1.1, (http/1.1)}{0.0.0.0:9101}
2024-01-12 11:06:09.138 [INFO ] [SpringContextShutdownHook ] o.e.j.s.session (149) [stopScavenging] [JobId-] - node0 Stopped scavenging
2024-01-12 11:06:09.143 [INFO ] [SpringContextShutdownHook ] o.e.j.s.h.C.application (2368) [log] [JobId-] - Destroying Spring FrameworkServlet 'dispatcherServlet'
2024-01-12 11:06:09.144 [INFO ] [SpringContextShutdownHook ] o.e.j.s.h.C.application (2368) [log] [JobId-] - Destroying Spring FrameworkServlet 'springrestful'
2024-01-12 11:06:09.149 [INFO ] [SpringContextShutdownHook ] o.e.j.s.h.ContextHandler (1159) [doStop] [JobId-] - Stopped o.s.b.w.e.j.JettyEmbeddedWebAppContext@3f6a9ba0{application,/,[file:///tmp/jetty-docbase.9101.9187708320195614229/, jar:file:/opt/bigdata/apache-linkis-1.5.0/lib/linkis-commons/public-module/knife4j-spring-ui-2.0.9.jar!/META-INF/resources],STOPPED}
2024-01-12 11:06:09.165 [WARN ] [SpringContextShutdownHook ] o.e.j.u.t.QueuedThreadPool (299) [doStop] [JobId-] - Stopped without executing or closing null
2024-01-12 11:06:09.181 [INFO ] [SpringContextShutdownHook ] o.s.s.c.ThreadPoolTaskExecutor (218) [shutdown] [JobId-] - Shutting down ExecutorService 'applicationTaskExecutor'
2024-01-12 11:06:09.233 [INFO ] [SpringContextShutdownHook ] c.a.d.p.DruidDataSource (2138) [close] [JobId-] - {dataSource-1} closing ...
2024-01-12 11:06:09.243 [INFO ] [SpringContextShutdownHook ] c.a.d.p.DruidDataSource (2211) [close] [JobId-] - {dataSource-1} closed
2024-01-12 11:06:09.255 [INFO ] [SpringContextShutdownHook ] c.n.d.DiscoveryClient (935) [shutdown] [JobId-] - Shutting down DiscoveryClient ...
2024-01-12 11:06:12.262 [INFO ] [SpringContextShutdownHook ] c.n.d.DiscoveryClient (971) [unregister] [JobId-] - Unregistering ...
2024-01-12 11:06:12.285 [INFO ] [SpringContextShutdownHook ] c.n.d.DiscoveryClient (973) [unregister] [JobId-] - DiscoveryClient_LINKIS-CG-LINKISMANAGER/xxx:linkis-cg-linkismanager:9101 - deregister status: 200
2024-01-12 11:06:12.305 [INFO ] [SpringContextShutdownHook ] c.n.d.DiscoveryClient (960) [shutdown] [JobId-] - Completed shut down of DiscoveryClient

@tigerHM tigerHM added the Question Further information is requested label Jan 12, 2024
@tigerHM tigerHM closed this as completed Jan 12, 2024
@tigerHM tigerHM reopened this Jan 12, 2024

tigerHM commented Jan 16, 2024

I found the following in the LINKIS-PS-PUBLICSERVICE log:
Caused by: org.apache.hadoop.security.KerberosAuthException: failure to login: for principal: hadoop from keytab /etc/security/keytabs/xxx.keytab/hadoop.keytab javax.security.auth.login.LoginException: Unable to obtain password from user

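According to the log, HADOOP_KEYTAB_PATH was configured as /etc/security/keytabs/xxx.keytab, but the path actually used at runtime was /etc/security/keytabs/xxx.keytab/{username}.keytab.
So HADOOP_KEYTAB_PATH only needs to be set to the directory that contains the keytab files.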
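
For anyone hitting the same error, here is a minimal sketch of the path resolution that the log message implies. It is not the actual Linkis code: `resolve_keytab` is a hypothetical helper, and the only assumption taken from the log is that the configured HADOOP_KEYTAB_PATH value is joined with `{username}.keytab`.

```python
import posixpath


def resolve_keytab(keytab_path: str, user: str) -> str:
    """Hypothetical helper: join the configured HADOOP_KEYTAB_PATH value
    with '<user>.keytab', as the path in the error message suggests."""
    return posixpath.join(keytab_path, f"{user}.keytab")


# Misconfigured: HADOOP_KEYTAB_PATH points at a keytab file, so the
# resolved path treats that file as a directory and the login fails.
print(resolve_keytab("/etc/security/keytabs/xxx.keytab", "hadoop"))
# /etc/security/keytabs/xxx.keytab/hadoop.keytab

# Intended: HADOOP_KEYTAB_PATH points at the directory holding the keytabs.
print(resolve_keytab("/etc/security/keytabs", "hadoop"))
# /etc/security/keytabs/hadoop.keytab
```

The first result matches the path in the KerberosAuthException above. With the directory form, a file named after the login user (here `hadoop.keytab`) would have to exist in that directory.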

@tigerHM tigerHM closed this as completed Jan 16, 2024