This repository has been archived by the owner on Jun 11, 2024. It is now read-only.

Issue with multiple mule applications #9

Open
krishnaprasadhg opened this issue Sep 16, 2020 · 44 comments

@krishnaprasadhg

krishnaprasadhg commented Sep 16, 2020

Hello,

We have multiple Mule applications. The Mule applications and APM work fine when security policies are disabled. The issue is that the Mule applications stop working when we enable the client ID enforcement security policy.

We tested with APM agent v1.18.0 and v1.17.0.

Thanks

@michaelhyatt
Owner

michaelhyatt commented Sep 18, 2020 via email

@krishnaprasadhg
Author

Hi Michael,

Yes, we are using a Mule domain.

Thanks

@manishgcjain

Michael,
There are multiple projects under one common domain, so we can't apply the domain-tracer approach there, as it would group all APM data under that domain.
So we are following the tracer.xml approach.
Manish

@michaelhyatt
Owner

michaelhyatt commented Sep 24, 2020 via email

@michaelhyatt
Owner

Hi @manishgcjain and @krishnaprasadhg,

I created a test project that also uses API autodiscovery (attached) and Mule 4.3.0. I then pushed a client.id enforcement policy and created an application that allowed the request to go through:

➜  stack-docker curl -v --basic --user 2afa4df8242c4341806f0951b6f4d7a3:e964FeADfB894c35a953b53A242f9600 http://localhost:8081
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8081 (#0)
* Server auth using Basic with user '2afa4df8242c4341806f0951b6f4d7a3'
> GET / HTTP/1.1
> Host: localhost:8081
> Authorization: Basic MmFmYTRkZjgyNDJjNDM0MTgwNmYwOTUxYjZmNGQ3YTM6ZTk2NEZlQURmQjg5NGMzNWE5NTNiNTNBMjQyZjk2MDA=
> User-Agent: curl/7.64.1
> Accept: */*
> 
< HTTP/1.1 401 Unauthorized
* Authentication problem. Ignoring this.
< www-authenticate: Basic realm="mule-realm"
< Content-Type: application/json; charset=UTF-8
< Content-Length: 31
< Date: Thu, 24 Sep 2020 01:50:48 GMT
< 
{
  "error": "Invalid Client"
* Connection #0 to host localhost left intact
}* Closing connection 0
➜  stack-docker curl -v --basic --user c6677a797a64409dbd6507f0393759b:6625d30dc204EF3aE9ca2cE7F81dA38 http://localhost:8081
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8081 (#0)
* Server auth using Basic with user 'c6677a797a64409dbd6507f03937599b'
> GET / HTTP/1.1
> Host: localhost:8081
> Authorization: Basic YzY2NzdhNzk3YTY0NDA5ZGJkNjUwN2YwMzkzNzUOWI6NjYyNWQzODBkYzIwNEVGM2FFOWNhMmNFN0Y4MWRBMzg=
> User-Agent: curl/7.64.1
> Accept: */*
> 
< HTTP/1.1 200 OK
< Content-Length: 18
< Date: Thu, 24 Sep 2020 02:00:14 GMT
< 
* Connection #0 to host localhost left intact
{ "result": "ok" }* Closing connection 0

I was able to see an APM transaction created for the successful call to the flow, but for the call where the policy blocked execution due to the invalid client ID, the flow wasn't invoked. In the flow itself, I am tracking the start and finish of the flow execution, but when a policy violation occurs the flow is not invoked at all. Is this the problem you are experiencing?

I wanted to ask for your help in getting Mulesoft to update their documentation and publish the specs for their event APIs the same way they did for Mule 3. Mule 4 has been around for a while, but its documentation is inadequate. For example, the Javadoc page still points at 4.1.1, and I had no luck trying to contact them to do something about it. The woeful lack of documentation and of an open community (Stack Overflow or even Mulesoft's own forums) made me stop developing the agent for Mule 4, so if you can help me get Mulesoft to lift their game with the documentation, I am happy to continue making enhancements.
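
For readers reproducing the curl calls above from code: `curl --basic --user id:secret` sends the pair as an HTTP Basic `Authorization` header, which is what the `httpBasicAuthenticationHeader` credentials origin expects. The following is a minimal sketch of how that header value is formed; the class and method names are hypothetical and the credentials are placeholders.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Illustrative helper only; the class and method names are hypothetical.
final class BasicAuthHeader {

    // Builds the Authorization header value for HTTP Basic auth:
    // "Basic " + base64(clientId + ":" + clientSecret).
    static String of(String clientId, String clientSecret) {
        String pair = clientId + ":" + clientSecret;
        return "Basic " + Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }

    // Reverses of(): returns {clientId, clientSecret}, splitting on the first colon only.
    static String[] decode(String headerValue) {
        if (!headerValue.startsWith("Basic ")) {
            throw new IllegalArgumentException("not a Basic credential");
        }
        byte[] raw = Base64.getDecoder().decode(headerValue.substring("Basic ".length()));
        String pair = new String(raw, StandardCharsets.UTF_8);
        int colon = pair.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("missing ':' separator");
        }
        return new String[] { pair.substring(0, colon), pair.substring(colon + 1) };
    }

    public static void main(String[] args) {
        // Placeholder credentials, in the style of the obfuscated examples in this thread.
        System.out.println("Authorization: " + of("XXX", "YYY"));
    }
}
```

Decoding a captured header with `decode` is a quick way to check that the client ID a request actually carried is the one registered for the API.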

Javadoc:
https://docs.mulesoft.com/mule-runtime/4.3/mule-4-api-javadoc
test-policies-1.zip

@michaelhyatt
Owner

michaelhyatt commented Sep 24, 2020

I can also invoke the application multiple times with the policy succeeding and it seems to work ok. See below 7 successful calls. Do you know which policy you are using that causes the problem?

[Screenshot: Screen Shot 2020-09-24 at 12 48 12 pm]

My API definition is:

{"api":{"$self":{"name":"v1:16383844","groupId":"a1cb42d2-1a95-4d63-bdc9-4ea2b5ad0e35","assetId":"test-api-1","assetVersion":"1.0.0","productVersion":"v1","description":null,"tags":[],"order":1,"providerId":null,"deprecated":false,"endpointUri":null,"instanceLabel":null}},"endpoint":{"$self":{"type":"http","uri":null,"proxyUri":null,"isCloudHub":null,"deploymentType":"CH","policiesVersion":null,"referencesUserDomain":null,"responseTimeout":null,"muleVersion4OrAbove":true}},"policyConfigurations":[{"$self":{"policyTemplateId":"299240","groupId":"68ef9520-24e9-4cf2-b2f5-620025690913","assetId":"client-id-enforcement","assetVersion":"1.2.2","configurationData":{"credentialsOriginHasHttpBasicAuthenticationHeader":"httpBasicAuthenticationHeader","clientIdExpression":"#[attributes.headers['client_id']]"},"pointcutData":null,"disabled":false,"order":1}}]}

@AlanElliottMAG

Hi Michael,

I have also been having issues when API Auto-discovery is enabled. I've raised a ticket with Mulesoft asking them to update their documentation for https://docs.mulesoft.com/mule-runtime/4.3/mule-4-api-javadoc. If there is anything else that would help to progress this, please let me know.

Thanks,

Alan

@AlanElliottMAG

Hi,

I have had a response from Mulesoft stating:

The documentation for 4.3.0 can be found at:
https://www.mulesoft.org/docs/site/4.3.0/apidocs/

However, we are aware that the title of the documentation is wrong, as it states:
pom-javadoc 4.2.0-SNAPSHOT API

Hope this helps.

Thanks,

Alan

@akhilpathak26

Hi Michael,

I am facing an intermittent issue with APM. It works fine if we add the mule-agent from the start, that is, while building a Mule application from scratch. But if we add the mule-agent to an existing Mule application, it throws a 500 connection close error.

To make the APM changes work in an existing application, we remove the application instance from API Manager, deploy the application in Runtime Manager, and after the successful deployment manage the application in API Manager again. Then it works fine.

@michaelhyatt
Owner

Hi Akhil. Can you send me the error that Mule throws when the 500 error occurs? Hopefully, it logs an exception when it happens.

@michaelhyatt
Owner

Also, what policies are you using? I am trying to figure out if one of them can be causing an issue.

@akhilpathak26

Hi Michael,

I am getting the error below. Nothing is logged in either the mule_ee log or the Mule application logs.
HTTP/1.1 500 Server Error
Transfer-Encoding: chunked
Date: Fri, 09 Oct 2020 05:01:39 GMT
Connection: close

The following policies are applied on the APIs:

  1. Client Id enforcement.
  2. JSON Threat.
  3. Rate Limiting.
  4. JWT token. (in few APIs)
  5. HTTP Caching. (in few APIs)

@michaelhyatt
Owner

Can you please set the elastic.apm.log_level=INFO and increase the Mule log level verbosity for org.mule and com.mulesoft loggers to see if there is anything useful being logged that can give a bit more details about the error?

Another thing, I tried to run some load testing on my end to try and reproduce the failure and the only thing I could achieve was saturating the internal reporter queue for the APM agent due to too many messages needing to be sent to the APM server. Just to test this further, can you please increase the size and the number of instances of the APM server?

This is the error I saw in my logs:

2020-10-09 16:59:38,602 [elastic-apm-server-reporter] ERROR co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Error sending data to APM server: Server returned HTTP response code: 503 for URL: http://localhost:8200/intake/v2/events, response code is 503
2020-10-09 16:59:38,603 [elastic-apm-server-reporter] WARN  co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - {
  "accepted": 110,
  "errors": [
    {
      "message": "queue is full"
    }
  ]
}
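
As an illustration of the saturation described above, here is a small sketch of a bounded event queue that rejects new events once it is full. It is a hypothetical stand-in written for this thread, not the Elastic agent's or the APM server's actual code, and all names in it are made up.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical stand-in for a reporter queue; not the Elastic APM agent's real implementation.
final class BoundedReporterQueue {
    private final BlockingQueue<String> queue;
    private int dropped;

    BoundedReporterQueue(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Non-blocking enqueue: returns false and counts a drop when the queue is full.
    boolean report(String event) {
        boolean accepted = queue.offer(event);
        if (!accepted) {
            dropped++;
        }
        return accepted;
    }

    // Simulates the sender shipping up to 'max' queued events; returns how many were sent.
    int drain(int max) {
        int sent = 0;
        while (sent < max && queue.poll() != null) {
            sent++;
        }
        return sent;
    }

    int dropped() { return dropped; }

    int pending() { return queue.size(); }

    public static void main(String[] args) {
        BoundedReporterQueue reporter = new BoundedReporterQueue(3);
        for (int i = 0; i < 5; i++) {
            reporter.report("event-" + i);
        }
        System.out.println("pending=" + reporter.pending() + " dropped=" + reporter.dropped());
    }
}
```

When producers outpace the consumer, either the queue has to grow or the consumer has to get faster, which is why scaling the APM server is a reasonable first experiment.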

@michaelhyatt
Owner

Also, to debug it further, can you please ask Mulesoft support to get you the exact exception that Mule policy throws when the API consumers see the 500 error?

@AlanElliottMAG

AlanElliottMAG commented Oct 12, 2020

Hi Michael,

I wasn't getting any messages back other than the 500 Server Error when the APM log level was set to INFO. I ran the application in Debug mode in Anypoint Studio and set the log level to Debug. Anypoint Studio gave the following message:

java.lang.RuntimeException: java.util.NoSuchElementException: Context does not contain key: policy.nextOperation

The logs from APM in Debug are below:

2020-10-12 09:34:36.722 [http.listener.09 SelectorRunner] DEBUG co.elastic.apm.agent.bci.ElasticApmAgent - Type match for instrumentation StartTransactionInstrumentation: (name(equals(co.elastic.apm.api.ElasticApm)) and not(isInterface())) matches class co.elastic.apm.api.ElasticApm
2020-10-12 09:34:36.722 [http.listener.09 SelectorRunner] DEBUG co.elastic.apm.agent.bci.ElasticApmAgent - Type match for instrumentation StartTransactionWithRemoteParentInstrumentation: (name(equals(co.elastic.apm.api.ElasticApm)) and not(isInterface())) matches class co.elastic.apm.api.ElasticApm
2020-10-12 09:34:36.722 [http.listener.09 SelectorRunner] DEBUG co.elastic.apm.agent.bci.ElasticApmAgent - Type match for instrumentation CurrentTransactionInstrumentation: (name(equals(co.elastic.apm.api.ElasticApm)) and not(isInterface())) matches class co.elastic.apm.api.ElasticApm
2020-10-12 09:34:36.722 [http.listener.09 SelectorRunner] DEBUG co.elastic.apm.agent.bci.ElasticApmAgent - Type match for instrumentation CurrentSpanInstrumentation: (name(equals(co.elastic.apm.api.ElasticApm)) and not(isInterface())) matches class co.elastic.apm.api.ElasticApm
2020-10-12 09:34:36.723 [http.listener.09 SelectorRunner] DEBUG co.elastic.apm.agent.bci.ElasticApmAgent - Type match for instrumentation CaptureExceptionInstrumentation: (name(equals(co.elastic.apm.api.ElasticApm)) and not(isInterface())) matches class co.elastic.apm.api.ElasticApm
2020-10-12 09:34:36.740 [http.listener.09 SelectorRunner] DEBUG co.elastic.apm.agent.bci.ElasticApmAgent - Method match for instrumentation StartTransactionInstrumentation: name(equals(doStartTransaction)) matches private static java.lang.Object co.elastic.apm.api.ElasticApm.doStartTransaction()
2020-10-12 09:34:36.742 [http.listener.09 SelectorRunner] DEBUG co.elastic.apm.agent.bci.ElasticApmAgent - Method match for instrumentation StartTransactionWithRemoteParentInstrumentation: name(equals(doStartTransactionWithRemoteParentFunction)) matches private static java.lang.Object co.elastic.apm.api.ElasticApm.doStartTransactionWithRemoteParentFunction(java.lang.invoke.MethodHandle,co.elastic.apm.api.HeaderExtractor,java.lang.invoke.MethodHandle,co.elastic.apm.api.HeadersExtractor)
2020-10-12 09:34:36.744 [http.listener.09 SelectorRunner] DEBUG co.elastic.apm.agent.bci.ElasticApmAgent - Method match for instrumentation CurrentTransactionInstrumentation: name(equals(doGetCurrentTransaction)) matches private static java.lang.Object co.elastic.apm.api.ElasticApm.doGetCurrentTransaction()
2020-10-12 09:34:36.744 [http.listener.09 SelectorRunner] DEBUG co.elastic.apm.agent.bci.ElasticApmAgent - Method match for instrumentation CurrentSpanInstrumentation: name(equals(doGetCurrentSpan)) matches private static java.lang.Object co.elastic.apm.api.ElasticApm.doGetCurrentSpan()
2020-10-12 09:34:36.745 [http.listener.09 SelectorRunner] DEBUG co.elastic.apm.agent.bci.ElasticApmAgent - Method match for instrumentation CaptureExceptionInstrumentation: name(equals(captureException)) matches public static void co.elastic.apm.api.ElasticApm.captureException(java.lang.Throwable)
2020-10-12 09:34:36.749 [http.listener.09 SelectorRunner] DEBUG co.elastic.apm.agent.bci.ElasticApmAgent - Type match for instrumentation CaptureExceptionInstrumentation: ((name(equals(co.elastic.apm.api.NoopTransaction)) or name(equals(co.elastic.apm.api.NoopSpan))) and not(isInterface())) matches class co.elastic.apm.api.NoopTransaction
2020-10-12 09:34:36.756 [http.listener.09 SelectorRunner] DEBUG co.elastic.apm.agent.bci.ElasticApmAgent - Method match for instrumentation CaptureExceptionInstrumentation: (name(equals(captureException)) and hasParameter(hasTypes(erasures(containing(is(class java.lang.Throwable)))))) matches public java.lang.String co.elastic.apm.api.NoopTransaction.captureException(java.lang.Throwable)
2020-10-12 09:34:36.759 [http.listener.09 SelectorRunner] DEBUG co.elastic.apm.agent.bci.ElasticApmAgent - Type match for instrumentation CaptureExceptionInstrumentation: ((name(equals(co.elastic.apm.api.NoopTransaction)) or name(equals(co.elastic.apm.api.NoopSpan))) and not(isInterface())) matches class co.elastic.apm.api.NoopSpan
2020-10-12 09:34:36.764 [http.listener.09 SelectorRunner] DEBUG co.elastic.apm.agent.bci.ElasticApmAgent - Method match for instrumentation CaptureExceptionInstrumentation: (name(equals(captureException)) and hasParameter(hasTypes(erasures(containing(is(class java.lang.Throwable)))))) matches public java.lang.String co.elastic.apm.api.NoopSpan.captureException(java.lang.Throwable)
2020-10-12 09:34:38.411 [elastic-apm-configuration-reloader] DEBUG co.elastic.apm.agent.impl.ElasticApmTracerBuilder - Beginning scheduled configuration reload (interval is 30 sec)...
2020-10-12 09:34:38.411 [elastic-apm-configuration-reloader] DEBUG co.elastic.apm.agent.impl.ElasticApmTracerBuilder - Finished scheduled configuration reload
2020-10-12 09:34:38.452 [elastic-apm-server-reporter] DEBUG co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Receiving METRICS event (sequence 0)
2020-10-12 09:34:38.452 [elastic-apm-server-reporter] DEBUG co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Starting new request to https://.................................................apm.eu-west-1.aws.cloud.es.io:443/intake/v2/events
2020-10-12 09:34:38.860 [elastic-apm-remote-config-poller] DEBUG co.elastic.apm.agent.configuration.ApmServerConfigurationSource - Reloading configuration from APM Server https://..................................................apm.eu-west-1.aws.cloud.es.io:443/config/v1/agents
2020-10-12 09:34:39.077 [elastic-apm-remote-config-poller] DEBUG co.elastic.apm.agent.configuration.ApmServerConfigurationSource - Configuration did not change
2020-10-12 09:34:39.077 [elastic-apm-remote-config-poller] DEBUG co.elastic.apm.agent.configuration.ApmServerConfigurationSource - Scheduling next remote configuration reload in 30s

@michaelhyatt
Owner

Hi Alan,

Thanks for that. Is there a stack trace preceding this?

java.lang.RuntimeException: java.util.NoSuchElementException: Context does not contain key: policy.nextOperation

The other debug messages seem to be normal APM agent class instrumentation.

Also, is it a particular policy that causes this error, a combination of policies, or any policy that can cause it? Still trying to reproduce it on my end, really appreciate your help.
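
A note on reading the message quoted above: its shape is what Java prints when a `NoSuchElementException` is wrapped in a `RuntimeException` without a message of its own, so the inner exception is the actual failure. The sketch below reproduces that shape with a plain map standing in for the context. Where the real lookup of `policy.nextOperation` happens inside Mule is not shown in this thread, so the `require` helper is purely hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical reproduction of the message shape; not Mule's policy code.
final class MissingContextKey {

    // Strict lookup that fails loudly when the key is absent.
    static Object require(Map<String, Object> context, String key) {
        if (!context.containsKey(key)) {
            throw new NoSuchElementException("Context does not contain key: " + key);
        }
        return context.get(key);
    }

    // Walks the cause chain to the innermost exception.
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        try {
            try {
                require(new HashMap<String, Object>(), "policy.nextOperation");
            } catch (NoSuchElementException e) {
                // Wrapping with only a cause makes the outer message the cause's toString().
                throw new RuntimeException(e);
            }
        } catch (RuntimeException wrapped) {
            System.out.println(wrapped);
            System.out.println("root cause: " + rootCause(wrapped).getClass().getName());
        }
    }
}
```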

@AlanElliottMAG

Hi Michael,

Unfortunately there is no stack trace available.

I've tried with a couple of APIs with varying policies. The first one I tried had only Client ID Enforcement, whereas the current one has Client ID Enforcement, HTTP Caching and Spike Control.

@michaelhyatt
Owner

michaelhyatt commented Oct 14, 2020 via email

@michaelhyatt
Owner

Tried to run it with these 3 policies added, still can't get any 5xx errors back to the calling client.

Also, can you send me the configuration of your API policies? There can be something that causes the error that comes from how the policies are configured.

@AlanElliottMAG

Hi Michael,

Have you tried creating a new API without any of the APM code and adding it to API Manager with policies applied and then adding APM afterwards? From what I have seen from the comments here, if APM is baked in at the time of first development then it seems to work ok, but retrofitting it is where the problems lie.

In terms of the policy settings, they are as follows:

Policy | Client ID enforcement
Credentials origin | customExpression
Client ID Expression | #[attributes.headers['client-id']]
Client Secret Expression | #[attributes.headers['client-secret']]

Policy | HTTP Caching
HTTP Caching Key | #[attributes.requestUri]
Maximum Cache Entries | 10000
Entry Time To Live (in Seconds) | 21600
Distributed | true
Persistent Cache | --
Follow HTTP Caching directives | true
Conditional Request Caching Expression | #[attributes.method == 'GET' or attributes.method == 'HEAD']
Conditional Response Caching Expression | #[([200, 203, 204, 206, 300, 301, 404, 405, 410, 414, 501] contains attributes.statusCode) and (sizeOf(payload) >= 3)]

Policy | Spike Control
Amount of Reqs | 30
Time Period | 1000
Delay Time in Milliseconds | 3000
Delay Attempts | 3
Queuing Limit | 5
Expose Headers | --
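
To make the Spike Control numbers above concrete (30 requests per 1000 ms), here is a rough sliding-window sketch. It is only an illustration of the rate limit itself: it is not MuleSoft's policy implementation, it ignores the delay, retry and queuing settings, and the class and method names are made up.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sliding-window limiter; not the Spike Control policy's real algorithm.
final class SlidingWindowLimiter {
    private final int maxRequests;
    private final long windowMillis;
    // Timestamps of accepted requests that are still inside the window, oldest first.
    private final Deque<Long> accepted = new ArrayDeque<>();

    SlidingWindowLimiter(int maxRequests, long windowMillis) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
    }

    // Returns true if a request arriving at 'nowMillis' fits in the current window.
    boolean tryAcquire(long nowMillis) {
        // Forget requests that have aged out of the window.
        while (!accepted.isEmpty() && nowMillis - accepted.peekFirst() >= windowMillis) {
            accepted.pollFirst();
        }
        if (accepted.size() >= maxRequests) {
            return false;
        }
        accepted.addLast(nowMillis);
        return true;
    }

    public static void main(String[] args) {
        SlidingWindowLimiter limiter = new SlidingWindowLimiter(30, 1000);
        int allowed = 0;
        for (int i = 0; i < 40; i++) {
            if (limiter.tryAcquire(0)) {
                allowed++;
            }
        }
        System.out.println("allowed in first burst: " + allowed);
    }
}
```

Passing the timestamp in explicitly, instead of reading the system clock, keeps the sketch deterministic and easy to test.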

Thanks,

Alan

@michaelhyatt
Owner

Thanks. I managed to reproduce the issue with transactions not being sent from CloudHub when I added the api-gateway:autodiscovery element to my project. The same project works fine when deployed locally. CloudHub doesn't show any errors either.

I suspect that the substitutions the CH deployment process performs when you deploy projects stop org.mule.runtime.api.notification.PipelineMessageNotification events from being published.

@michaelhyatt
Owner

I have now managed to fix the issue where a Mule application with API autodiscovery deployed into CloudHub was not showing APM transactions. But I couldn't reproduce the 500 errors on my end. @krishnaprasadhg @adrianj @manishgcjain can you please test v0.1.0 and let me know?

https://github.com/michaelhyatt/elastic-apm-mule4-agent/releases/tag/v0.1.0

@akhilpathak26

Hi Michael,

We are using on-prem Mule Runtimes. After adding the new version, I am no longer facing the 500 connection close server error, but I am still facing an issue while sending transaction data to APM. We are checking internally within the team; if it gets resolved, I will let you know.

Thanks for the help!!

@michaelhyatt
Owner

@akhilpathak26 I added an example to the repo for you to test with domain, APIKit and 2 projects in it. Can you please check if it runs in your environment?

https://github.com/michaelhyatt/elastic-apm-mule4-agent/tree/master/examples/domain

It uses Mule 4.3.0 by default, also make sure to install the latest and greatest version of the Mule 4 APM agent v0.2.0.

@AlanElliottMAG

Hi Michael,

Just wanted to let you know that I tried the latest and greatest version v0.2.0 and it works like a charm. I even went a step further and got Mule and APM working together in a Docker container.

Thanks for all your help.

Alan.

@akhilpathak26

Hi Michael,

I've tried the new 0.2.0 version. It works fine when I trigger requests from my local machine (Postman, with the application deployed in Studio). But when I deploy the same application on the Mule Runtime, it does not publish transaction data to the APM server.

@AlanElliottMAG

Will Distributed Tracing be added anytime soon? It would be really beneficial to include the use of service map to showcase APM's monitoring capabilities to the business.

Thanks,

Alan

@michaelhyatt
Owner

> Hi Michael,
>
> I've tried the new 0.2.0 version. It works fine when I trigger requests from my local machine (Postman, with the application deployed in Studio). But when I deploy the same application on the Mule Runtime, it does not publish transaction data to the APM server.

@akhilpathak26 is there any further info, or maybe error messages from your server (not Studio), that can shed some light?

@michaelhyatt
Owner

> Will Distributed Tracing be added anytime soon? It would be really beneficial to include the use of service map to showcase APM's monitoring capabilities to the business.
>
> Thanks,
>
> Alan

Hi @AlanElliottMAG ,

Yes, I wanted to look into it again, now that there is a newer Javadoc. In the past, the problem was the new immutable message model that Mule 4 introduced, which prohibits changing the Mule message to propagate the tracing header from within the notification code. Mule 4 also deviated from Mule 3 in how Mule message properties are converted into protocol headers. In Mule 3, message properties were automatically converted into HTTP headers, JMS properties etc. Mule 4 doesn't support this convention anymore, and there no longer seems to be a common way of setting the properties across protocols.

The latter can be overcome by building separate support for the most commonly used protocols. The former is a bigger issue since, ideally, I wanted to avoid using things like reflection to force header propagation, as it may break the code with every change of Mule APIs and implementation. If you know a way, or can ask MuleSoft support how MuleMessage headers and flow vars can be changed within the Java code invoked from MessageProcessorNotification, it will help me add support for remote tracing. I will also have another look, now that the Javadoc is out.
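
To illustrate why an immutable message model gets in the way here, the sketch below uses a made-up `Message` type (not Mule's API): adding a header produces a new object, so code that is merely notified about the original has no way to make the rest of the flow use the modified copy.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical types for illustration; these are not Mule 4 classes.
final class ImmutableMessageSketch {

    static final class Message {
        private final String payload;
        private final Map<String, String> headers;

        Message(String payload, Map<String, String> headers) {
            this.payload = payload;
            // Defensive copy wrapped as read-only: callers cannot mutate it.
            this.headers = Collections.unmodifiableMap(new LinkedHashMap<>(headers));
        }

        String payload() { return payload; }

        Map<String, String> headers() { return headers; }

        // "Changing" a header means building a new message.
        Message withHeader(String name, String value) {
            Map<String, String> copy = new LinkedHashMap<>(headers);
            copy.put(name, value);
            return new Message(payload, copy);
        }
    }

    // A listener that is only handed the message and cannot return a replacement.
    static void onNotification(Message message) {
        Message changed = message.withHeader("traceparent", "placeholder");
        // 'changed' goes out of scope here; the flow keeps using 'message' unchanged.
    }

    public static void main(String[] args) {
        Message original = new Message("{}", new LinkedHashMap<String, String>());
        onNotification(original);
        System.out.println("headers after notification: " + original.headers());
    }
}
```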

@michaelhyatt
Owner

michaelhyatt commented Oct 22, 2020

> Hi Michael,
> I've tried the new 0.2.0 version. It works fine when I trigger requests from my local machine (Postman, with the application deployed in Studio). But when I deploy the same application on the Mule Runtime, it does not publish transaction data to the APM server.

@akhilpathak26 is there any further info, or maybe error messages from your server (not Studio), that can shed some light?

@akhilpathak26 @krishnaprasadhg - I tried to reproduce your issue with multiple domain applications, but it works fine both in Anypoint and on a standalone server. I used the sample domain test app:

https://github.com/michaelhyatt/elastic-apm-mule4-agent/tree/master/examples/domain

This is what I did:

  1. I downloaded a fresh copy of mule-4.3.0-ee server and unzipped it. I registered the mule-ee server to the Anypoint platform and saw it appear there under servers. I don't think server registration is necessary for the APIs to work, but I did it anyway just in case (token is obfuscated):
./amc_setup -H XXX-YYYY-ZZZZZZZZZ macbook1
  2. I built 3 jar files in Anypoint Studio: test-domain.jar with only the domain config, and proj1.jar and proj2.jar, which contain only the applications that use the test-domain. I built the files by right-clicking the project name in Project Explorer and selecting Export->Mule->Deployable archive, ensuring the box to include all the project dependencies is checked.
  3. I copied the test-domain.jar into the domains directory of the unzipped mule-ee server installation.
  4. I copied proj1.jar and proj2.jar into the apps directory of the unzipped mule-ee server installation.
  5. I started the mule server with the following command line (secrets and URLs are obfuscated, use your values):
bin/mule -M-XX:-UseBiasedLocking -M-Dfile.encoding=UTF-8 -M-XX:+UseG1GC -M-XX:+UseStringDeduplication -M-Delastic.apm.server_urls=https://XXXXX.elastic-cloud.com -M-Delastic.apm.secret_token=YYYYYYY -M-Delastic.apm.service_name=dep-component1 -M-Delastic.apm.service_version=v1.0.0 -M-Delastic.apm.log_level=DEBUG -M-Dmule.verbose.exceptions=true -M-Danypoint.platform.analytics_base_uri=https://analytics-ingest.anypoint.mulesoft.com/ -M-Danypoint.platform.client_id=XYXYXYXYXXYXY -M-Danypoint.platform.client_secret=XYZXYZXYZ
  6. I saw the applications and the domain deploy successfully:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+ Mule is up and kicking (every 5000ms)                                        +
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

**********************************************************************
*              - - + DOMAIN + - -               * - - + STATUS + - - *
**********************************************************************
* default                                       * DEPLOYED           *
* test-domain-1.0.0-SNAPSHOT-mule-domain        * DEPLOYED           *
**********************************************************************

*******************************************************************************************************
*            - - + APPLICATION + - -            *       - - + DOMAIN + - -       * - - + STATUS + - - *
*******************************************************************************************************
* proj2                                         * test-domain-1.0.0-SNAPSHOT-mul * DEPLOYED           *
* proj1                                         * test-domain-1.0.0-SNAPSHOT-mul * DEPLOYED           *
*******************************************************************************************************
  7. I then ran 2 commands in the command line to trigger both applications:
➜  elastic-apm-mule4-agent git:(master) ✗ curl -v --basic --user XXX:YYY http://localhost:8081/proj1
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8081 (#0)
* Server auth using Basic with user 'XXX'
> GET /proj1 HTTP/1.1
> Host: localhost:8081
> Authorization: Basic XYZXYZ=
> User-Agent: curl/7.64.1
> Accept: */*
> 
< HTTP/1.1 200 OK
< Content-Length: 29
< Date: Thu, 22 Oct 2020 00:12:50 GMT
< 
* Connection #0 to host localhost left intact
{ "result": "proj1 success" }* Closing connection 0
➜  elastic-apm-mule4-agent git:(master) ✗ curl -v --basic --user XXX:YYY http://localhost:8081/proj2
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8081 (#0)
* Server auth using Basic with user 'XXX'
> GET /proj2 HTTP/1.1
> Host: localhost:8081
> Authorization: Basic XYZXYZ=
> User-Agent: curl/7.64.1
> Accept: */*
> 
< HTTP/1.1 200 
< Content-Length: 29
< Date: Thu, 22 Oct 2020 00:12:54 GMT
< 
* Connection #0 to host localhost left intact
{ "result": "proj2 success" }* Closing connection 0
  8. I then saw heaps of debug logs popping up in the server log, since I ran it in DEBUG mode, and the traces for proj1Flow and proj2Flow for the component dep-component1 appeared in the APM UI in my Kibana in the Elastic Cloud cluster.

Do you see anything there that shows that your setup may be different?
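
If you start several servers with different settings, the `-M-D...` arguments from the start command above can be assembled from a map instead of being typed by hand. This is just a convenience sketch: the helper class is hypothetical, and the only thing taken from the command above is the `-M-D<key>=<value>` shape of each argument.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; it only formats arguments in the "-M-D<key>=<value>" shape used above.
final class MuleStartCommand {

    // Returns the launcher path followed by one -M-D argument per property, in map order.
    static List<String> build(String muleLauncher, Map<String, String> systemProperties) {
        List<String> command = new ArrayList<>();
        command.add(muleLauncher);
        for (Map.Entry<String, String> entry : systemProperties.entrySet()) {
            command.add("-M-D" + entry.getKey() + "=" + entry.getValue());
        }
        return command;
    }

    public static void main(String[] args) {
        // Placeholder values in the style of the obfuscated command in this thread.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("elastic.apm.server_urls", "https://XXXXX.elastic-cloud.com");
        props.put("elastic.apm.service_name", "dep-component1");
        props.put("elastic.apm.log_level", "DEBUG");
        System.out.println(String.join(" ", build("bin/mule", props)));
    }
}
```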

@akhilpathak26

Hi Michael,

Thanks for sharing the domain examples - https://github.com/michaelhyatt/elastic-apm-mule4-agent/tree/master/examples/domain. However, in this example I observed that you made the APM config changes in proj1 and proj2. I tried adding the same APM config changes to the domain project instead, and it does not work that way.

We have 50+ Mule applications on a single Mule Runtime, so making the changes at the Mule application level would be a huge effort. Is it possible to add all the APM config to the domain and have the other applications refer to the domain? That way, no change would be required in each Mule application.

@manishgcjain

Just to add to Akhil's point: adding the dependency to each project would increase the jar size by ~15MB each, and we would have the APM agent in each jar, which is not required as it would be loaded in a single JVM. So is there a way to have the dependency at the server level, so that each project can declare it as "provided" or "test"?

@michaelhyatt
Owner

@akhilpathak26 - I couldn't find a way to configure and enable notifications at the domain level in Mule 4 the same way it worked in Mule 3. My agent relies on notifications to capture pipeline, processor and exception events and translate them into APM transactions and spans. If you have access to Mulesoft support, can you please ask them for an example of how to enable and configure server notifications in the domain configuration so they work across all the projects deployed with the domain? To be clear, these are the notifications the APM agent relies on to work:
https://docs.mulesoft.com/mule-runtime/4.3/mule-server-notifications
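
As a rough picture of what "translating notifications into transactions and spans" means, here is a self-contained sketch. None of the types are Mule or Elastic APM classes; the event kinds, class name and output format are all invented for illustration, and the real agent is certainly more involved.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model: none of these types are real Mule or Elastic APM classes.
final class NotificationTracerSketch {

    enum Kind { FLOW_START, PROCESSOR_START, PROCESSOR_END, FLOW_END }

    private static final class OpenSpan {
        final String name;
        final long startMillis;

        OpenSpan(String name, long startMillis) {
            this.name = name;
            this.startMillis = startMillis;
        }
    }

    private final Deque<OpenSpan> openSpans = new ArrayDeque<>();
    private final List<String> recorded = new ArrayList<>();
    private long flowStartMillis;

    // Flow start/end bound the transaction; processor start/end bound a span inside it.
    void onNotification(Kind kind, String name, long timestampMillis) {
        switch (kind) {
            case FLOW_START:
                flowStartMillis = timestampMillis;
                break;
            case PROCESSOR_START:
                openSpans.push(new OpenSpan(name, timestampMillis));
                break;
            case PROCESSOR_END: {
                OpenSpan span = openSpans.pop();
                recorded.add("span " + span.name + " " + (timestampMillis - span.startMillis) + "ms");
                break;
            }
            case FLOW_END:
                recorded.add("transaction " + name + " " + (timestampMillis - flowStartMillis) + "ms");
                break;
            default:
                throw new IllegalArgumentException("unknown kind: " + kind);
        }
    }

    List<String> recorded() { return recorded; }

    public static void main(String[] args) {
        NotificationTracerSketch tracer = new NotificationTracerSketch();
        tracer.onNotification(Kind.FLOW_START, "proj1Flow", 0);
        tracer.onNotification(Kind.PROCESSOR_START, "logger", 5);
        tracer.onNotification(Kind.PROCESSOR_END, "logger", 12);
        tracer.onNotification(Kind.FLOW_END, "proj1Flow", 20);
        System.out.println(tracer.recorded());
    }
}
```

The point of the sketch is the dependency: if the runtime never delivers the start and end events to the application, there is nothing to turn into a transaction.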

@michaelhyatt
Owner

@manishgcjain - because there is no way to configure events and dependencies at the domain level, I have to add dependencies into every project deployed with the domain. So, the official Elastic Java APM agent jars that Mule agent is using have to be added to each project, the attach and agent jars are about 7MB each because they are shadowing all the dependencies, so I can't do anything about the size of the resulting jar, unfortunately. I don't want to bypass the agent completely, since it provides tons of really useful functionality, such as metrics collections, JMX beans integration, etc, so the size problem will be there for quite some time, unfortunately.

@michaelhyatt
Owner

@AlanElliottMAG - I have a version of the agent that supports adding the Mule transaction to the overall trace, i.e. it creates a transaction with the same parent trace.id, supporting only HTTP at this stage. I am still trying to figure out how to propagate the trace.id down HTTP requests from Mule onwards, which seems to be non-trivial, as there are no public APIs that allow changing the Mule message from within a notification.

If it is useful and you would like to have a look, happy to make a jar available, otherwise, will try to figure out a more complete solution that supports distributed tracing end to end.
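
For context on what "propagating the trace.id" involves at the HTTP level, here is a sketch that assumes the W3C Trace Context `traceparent` header format (`00-<32 hex trace id>-<16 hex parent id>-<2 hex flags>`). Whether a given agent version reads or writes exactly this header is not established in this thread, and the class below is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, assuming the W3C Trace Context "traceparent" format (version 00).
final class TraceParentSketch {
    private static final Pattern FORMAT =
            Pattern.compile("00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})");

    final String traceId;
    final String parentId;
    final String flags;

    private TraceParentSketch(String traceId, String parentId, String flags) {
        this.traceId = traceId;
        this.parentId = parentId;
        this.flags = flags;
    }

    // Parses an incoming header; rejects malformed values and all-zero ids.
    static TraceParentSketch parse(String header) {
        Matcher m = FORMAT.matcher(header.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("malformed traceparent: " + header);
        }
        String traceId = m.group(1);
        String parentId = m.group(2);
        if (traceId.matches("0{32}") || parentId.matches("0{16}")) {
            throw new IllegalArgumentException("all-zero id in traceparent: " + header);
        }
        return new TraceParentSketch(traceId, parentId, m.group(3));
    }

    // Header for an outgoing call: same trace id, with the current span as the new parent.
    String headerForChild(String currentSpanId) {
        if (!currentSpanId.matches("[0-9a-f]{16}")) {
            throw new IllegalArgumentException("span id must be 16 lowercase hex chars");
        }
        return "00-" + traceId + "-" + currentSpanId + "-" + flags;
    }

    public static void main(String[] args) {
        TraceParentSketch incoming =
                parse("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01");
        System.out.println(incoming.headerForChild("b7ad6b7169203331"));
    }
}
```

Joining an existing trace only needs `parse` on the inbound request; continuing it downstream needs `headerForChild` to be set on every outgoing request, which is the part that requires a way to modify the outbound message.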

@manishgcjain

> @manishgcjain - because there is no way to configure events and dependencies at the domain level, I have to add dependencies into every project deployed with the domain. So, the official Elastic Java APM agent jars that Mule agent is using have to be added to each project, the attach and agent jars are about 7MB each because they are shadowing all the dependencies, so I can't do anything about the size of the resulting jar, unfortunately. I don't want to bypass the agent completely, since it provides tons of really useful functionality, such as metrics collections, JMX beans integration, etc, so the size problem will be there for quite some time, unfortunately.

@michaelhyatt - Is there any update on this issue? As it stands, it requires a change in each API project on the Mule runtime, which is practically not feasible. Can you please help by providing a fix?

@AlanElliottMAG

@michaelhyatt - An end to end distributed tracing solution would be ideal. I'm happy to test out anything you have though. If it isn't too much trouble to make a jar available, I'll stick it into my dev instances and see what we can get out of it.

Thanks

@michaelhyatt
Owner

> > @manishgcjain - because there is no way to configure events and dependencies at the domain level, I have to add dependencies into every project deployed with the domain. So, the official Elastic Java APM agent jars that Mule agent is using have to be added to each project, the attach and agent jars are about 7MB each because they are shadowing all the dependencies, so I can't do anything about the size of the resulting jar, unfortunately. I don't want to bypass the agent completely, since it provides tons of really useful functionality, such as metrics collections, JMX beans integration, etc, so the size problem will be there for quite some time, unfortunately.
>
> @michaelhyatt - Is there any update on this issue? As it stands, it requires a change in each API project on the Mule runtime, which is practically not feasible. Can you please help by providing a fix?

@manishgcjain adding tracing once at the domain level with Mule 4 is not possible for 2 reasons:

  1. Classloader isolation: Mule 4 applications are deployed with all their dependencies packaged with the application to isolate classloaders of different applications and allow them to use different versions of the same libraries. Your domain project will have to expose all the packages that are required for the APM agent to work, which is not a great idea and you are better off doing it at the project level. More on classloader isolation:

https://docs.mulesoft.com/mule-runtime/4.3/about-classloading-isolation

  2. The APM agent uses server notifications and Mule 4 doesn't support enabling server notifications at the domain level, only at the application level.

Things worked differently in Mule 3, but in Mule 4 the classloader isolation and notifications make it impossible to enable tracing at the domain level, so you will need to redeploy the applications to enable tracing. If you want to help me with escalating the two questions below to Mulesoft support, I would really appreciate your help:

  1. How can I add a jar dependency at the domain level that will be visible on the classloader path with all the applications deployed in the domain?
  2. How can I enable server notifications at the domain level for all the applications deployed using this domain instead of enabling them one by one in my applications?

BTW, were you able to run the demo applications I sent you earlier with API policies applied?

@michaelhyatt
Owner

> @michaelhyatt - An end to end distributed tracing solution would be ideal. I'm happy to test out anything you have though. If it isn't too much trouble to make a jar available, I'll stick it into my dev instances and see what we can get out of it.
>
> Thanks

@AlanElliottMAG - created release v0.3.0. Will keep you posted with how I go about propagating the trace.id

@AlanElliottMAG

@michaelhyatt - Unfortunately, that change has rendered the Trace Sample section unusable. It now looks like this:

[image: Trace Sample section after the change]

Whereas in 0.2.0 it looks like the below which is far more useful:

[image: Trace Sample section in 0.2.0]

The Trace Sample section is one of the most useful aspects of APM as it allows us to pinpoint bottlenecks to the code level.

Thanks,

Alan

@michaelhyatt
Owner

michaelhyatt commented Nov 3, 2020 via email

@AlanElliottMAG

Unfortunately, I won't be able to contribute the project, as it is a live system that we are retrofitting APM into and as such needs to communicate with Anypoint Platform and several other APIs to work, plus it contains sensitive information. I'll try again with 0.3.0 and see if there is anything in the logs to suggest what is happening. However, I'm going to be off for 4 weeks after today, so it will have to wait until I'm back, sorry.

@michaelhyatt
Owner

@manishgcjain @akhilpathak26 @krishnaprasadhg - I updated the example domain project to use the shared library from the domain, so now you only need to add it once, in the pom.xml of the domain project. Then, all the projects deployed in the domain just need to import the tracer.xml in the flow definition without having to declare the tracer as a dependency.

Sample app:
https://github.com/michaelhyatt/elastic-apm-mule4-agent/tree/master/examples/domain

README:
https://github.com/michaelhyatt/elastic-apm-mule4-agent#support-for-mule-domains

Let me know what you think.

@michaelhyatt
Owner

> Unfortunately, I won't be able to contribute the project, as it is a live system that we are retrofitting APM into and as such needs to communicate with Anypoint Platform and several other APIs to work, plus it contains sensitive information. I'll try again with 0.3.0 and see if there is anything in the logs to suggest what is happening. However, I'm going to be off for 4 weeks after today, so it will have to wait until I'm back, sorry.

@AlanElliottMAG I added a distributed tracing example project, please check it out https://github.com/michaelhyatt/elastic-apm-mule4-agent/tree/master/examples/tracing-test
