Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NetDiscovery fails: Failed to add certificate to CA store error:0588000B:x509 certificate routines::X509 lib #718

Open
silhusk opened this issue Jul 22, 2024 · 26 comments
Labels
windows Issue occurs on windows

Comments

@silhusk
Copy link

silhusk commented Jul 22, 2024

Bug reporting acknowledgment

Yes, I read it

Professional support

Yes, I know

Describe the bug

The NetDiscovery task causes the following error:

[error] [http client] internal response: 500 Can't connect to glpi.domain:443, Failed to add certificate to CA store error:0588000B:x509 certificate routines::X509 lib

It seems to crash, and the error message is not always printed. Sometimes it just stops with

[Fri Jul 19 09:28:57 2024][debug2] [http client OCS] sending message:
<?xml version="1.0" encoding="UTF-8"?>
<REQUEST>
  <CONTENT>
    <AGENT>
      <AGENTVERSION>1.9</AGENTVERSION>
      <START>1</START>
    </AGENT>
    <MODULEVERSION>6.2</MODULEVERSION>
    <PROCESSNUMBER>429</PROCESSNUMBER>
  </CONTENT>
  <DEVICEID>REDACTED</DEVICEID>
  <QUERY>NETDISCOVERY</QUERY>
</REQUEST>

The full log with the error: glpi-agent_error.log
With --no-ssl-check version 1.9 works fine: glpi-agent_no-ssl-check.log.
Version 1.10 doesn't work, even with --no-ssl-check, and doesn't print any error.

Other tasks (collect, deploy) are working correctly.

To reproduce

  1. prepare a NetDiscovery task
  2. glpi-agent.exe --force --debug --debug --task=NetDiscovery

Expected behavior

  • The whole IP range is scanned
  • Results are sent to the server
  • The task is either failed or completed, but is not left in the executing state

Operating system

Windows

GLPI Agent version

1.9

GLPI version

10.0.16

GLPIInventory plugin or other plugin version

GLPI Inventory v1.3.5

@g-bougard
Copy link
Member

Hi @silhusk

please try glpi-agent v1.10 as it should fix your issue. See:

* Fixed tasks blocking on windows with core concurrent calls control
and
* Avoid to expire SSL CA certs cache in threads
.

@silhusk
Copy link
Author

silhusk commented Jul 23, 2024

Hi @g-bougard

Version 1.10 is working if the inventory is started from the web UI of the agent, but not from the CLI.

here are the logs for version 1.10:

Could it be that NetInventory is using a different client implementation compared to Collect, where #530 has not been fixed? I'm just guessing.

Last working version is 1.5. Then 1.6 has a different issue (probably #530). And I couldn't get newer versions to work from CLI.

@g-bougard
Copy link
Member

Hi @silhusk

I don't see any error in the first log, do you mean it just blocks ?

Anyway can better you share the output when run from the commandline ? You have to run the command from an administrative console and from the installation folder. You need also to force the task preparation in GlpiInventory plugin just before starting the command.

glpi-agent --logger=stderr --debug --debug --force --task=netdiscovery

@silhusk
Copy link
Author

silhusk commented Jul 23, 2024

Hi @g-bougard

log copied from administrative prompt: glpi-agent_110_cli.log

I don't see any error in the first log, do you mean it just blocks ?

Yes, exactly. It looks like the first worker that tries to send a result crashes the whole process. Sometimes it prints the error, sometimes not. On the server the task is left in the "execution" state.

@g-bougard
Copy link
Member

Thank, this output is more useful.

I think this should also work if you use ssl-fingerprint option. You can try it on the commandline too and the value to use is reported when you use no-ssl-check option.
So can you confirm it works when using ssl-fingerprint option ?

@silhusk
Copy link
Author

silhusk commented Jul 23, 2024

It does work with ssl-fingerprint: glpi-agent_110_ssl-fingerprint.log

@g-bougard
Copy link
Member

Hi @silhusk

it looks like another problem occurs in your context. I'll try to reproduce, but as it's probably related to multi-threading support on windows and as multi-threading context can really change from one user to another, I've no guaranty to find what's going on.

By now, as a work-around, you still can use ssl-fingerprint. But as it seems your server certificate is signed by the Let's Encrypt public CA, this is not a good work-around as you'll have to update it every time the certificate is updated and if I remember well the default expiration for this certificate kind is 3 months.

As you use Let's Encrypt public CA on GLPI server, glpi-agent shouldn't really require to extract locally known certificate from Windows keystore. The fix I pointed in my first comment make indeed the keystore export don't expire in multi-threading context. This way glpi-agent http client will use cached exported certificates. But in your log, it seems the error occurs in OpenSSL, when we load cached certificates. So we should manage on your case to disable keystore export.

I think you can disable the export by better set ca-cert-dir configuration. If you set this parameter to the path of an empty folder, OpenSSL should finally only load default certificates.
So can you test to set ca-cert-dir and tell us if the task runs as expected ?

Another work-around could also be to implement a new option to define if glpi-agent should load server certificate from keystore. Anyway such an option could also be developed to help users to set which stores section have to be exported.

@silhusk
Copy link
Author

silhusk commented Jul 29, 2024

Hi @g-bougard

thank you for your very kind support. Since the agent 1.10 now works correctly as a service, this issue is just a minor annoyance for me personally.

c:\Program Files\GLPI-Agent>mkdir c:\empty
c:\Program Files\GLPI-Agent>glpi-agent.bat --logger=stderr --debug --debug --force --task=netdiscovery --ca-cert-dir="c:\empty"
[...]
[error] [http client] internal response: 500 Can't connect to glpi.domain:443 (No such file or directory), SSL connect attempt failed error:0A000086:SSL routines::certificate verify failed
[error] 
No supported answer from server at https://glpi.domain/marketplace/glpiinventory/front/communication.php

full glpi-agent_110_ca-cert-dir.txt

I confirm that without the --ca-cert-dir option it does indeed contact the server. So it's not the server which cannot be found, but something related to the folder. If the folder indeed does not exist, the error is different:

[error] non-existing certificate directory C:\empty1 at C:/Program Files/GLPI-Agent/perl/agent/GLPI/Agent/HTTP/Client.pm line 39.

@g-bougard
Copy link
Member

Hi @silhusk

your url here is probably wrong, but this is not the reason of the error. It should probably be just:https://glpi.domain/marketplace/glpiinventory/

I failed to reproduce on friday and I really don't know what happens in your context. Maybe one of the keystore extracted & cached certificates makes the OpenSSL stack to crash/fail. And I won't be able to check that point without the extracted certificates.

Anyway, the ca-cert-dir try seems not correct too.

To know if a possible option to disable the local keystore can help in your context, can you try to remove l.519 & l.520 in Client.pm ?

@silhusk
Copy link
Author

silhusk commented Aug 3, 2024

Hi @g-bougard

removing the lines you indicate doesn't seem to make any difference
glpi-agent_110_comment.txt

But if, instead, I change the conditions on l.621 and l.681 to always be false, then it works.
Moreover:

  • if only the Windows keystore is included, then it doesn't find the CA
  • if only Mozilla is included, then it crashes

@silhusk
Copy link
Author

silhusk commented Aug 3, 2024

I replaced l.621 with if (-d $certdir && 1 == 0) {. No other change to Client.pm.

I replaced C:\Program Files\GLPI-Agent\perl\vendor\lib\Mozilla\CA\cacert.pem to only contain my root CA.

This time it did not crash: glpi-agent_110_cacert.txt
Two workers (1 and 5) successfully sent the result to the server. But then at line 383 in the log the error appears again.

Crash trace

I got it to run with Visual Studio attached, the crash message is Eccezione generata in corrispondenza di 0x00007FFCF51587CD (libcrypto-3__.dll) in glpi-agent.exe: 0xC0000005: violazione di accesso durante la lettura del percorso 0x0000000000000000.
Below is the stack trace, but unfortunately I don't have a debug build at present.

libcrypto-3__.dll!00007ffcf51587cd()
libcrypto-3__.dll!00007ffcf4e78f9e()
libcrypto-3__.dll!00007ffcf5145cd2()
libcrypto-3__.dll!00007ffcf511587e()
libcrypto-3__.dll!00007ffcf5148c75()
libcrypto-3__.dll!00007ffcf5148f96()
SSLeay.xs.dll!00007ffd517f0efd()
perl538.dll!00007ffd3bab302c()
perl538.dll!00007ffd3bb07982()
perl538.dll!00007ffd3bb1677c()
kernel32.dll!00007ffd6837257d()
ntdll.dll!00007ffd69f2af28()

No-fork

When setting my $max_threads = 0 in NetDiscovery.pm it works without crashes and without errors.

@g-bougard
Copy link
Member

Hi @g-bougard

(being back from vacation)

Version 1.10 is working if the inventory is started from the web UI of the agent, but not from the CLI.

Do you mean everything is working when glpi-agent is running from the service context ? If yes, when running from VS debugger, are you able to check from where DLLs are loaded ? I want you to check if libcrypto-3__.dll can be loaded from a wrong place or if it loads openssl from a wrong place.

@g-bougard g-bougard added the windows Issue occurs on windows label Aug 26, 2024
@silhusk
Copy link
Author

silhusk commented Aug 28, 2024

Hi @g-bougard

Do you mean everything is working when glpi-agent is running from the service context ?

Sorry, I thought so because the server seems to get some result from time to time. But while the task is scheduled daily, the successful runs are a couple in a month. The rest of the time it reports an error. I cannot tell for sure if the NetDiscovery task is always run. Now I've set log level to debug=2 on the service to try to get a better picture.

If yes, when running from VS debugger, are you able to check from where DLLs are loaded ? I want you to check if libcrypto-3__.dll

It is the one in the agent's install path.

I can compile perl for the agent using glpi-agent-extlibs-build.pl, but I cannot figure out how to generate debug symbols. Any suggestion? If you think it can help.

@g-bougard
Copy link
Member

Hi @silhusk

can you try latest nightly build as it includes #725 which fixes a problem with KeyStore export ?

I actually don't know how to enable debug symbols but I don't think this can help.

@silhusk
Copy link
Author

silhusk commented Aug 29, 2024

1.11-git7b8a5f0b is still affected:
glpi-agent_111_git.txt

@g-bougard
Copy link
Member

Hello @silhusk

I tried to reproduce, but without success. I suspect something is going wrong with one of your certificate in the keystore or maybe with the SSL negotiate with the server itself.

I tried to develop a test script to emulate the same comportment: test-certs-for-issue-718.zip

Can you extract the script from this archive, copy it to glpi-agent installation folder and run it from an administrative console using the following command:

perl\bin\glpi-agent.exe test-certs-for-issue-718.pl --server=https://glpi.domain/marketplace/glpiinventory/

Of course, you must use your server URL for --server option. Here I removed front/communication.php as you don't have to use it.
Indeed, you need to be admin to copy the script under the glpi-agent installation folder, but you can run this script as a simple user as administrator rights are not required here to send httpd requests and this script will use users temp folder to export authorized certificates from keystore.

The script support only --server, --no-ssl-check, --ssl-fingerprint & --threads options. It emulates how glpi-agent calls a GLPI 10 server and fork a process to send http multiple requests related to a netdiscovery job start. As the content should not be a problem I'm using here just a PROLOG which simply has no effect on the server. Then the script leaves.

I'm curious to know if this is sufficient to reproduce the issue in your case. Here is my own output against my GLPI 10.0.16 test instance for which I require ssl-fingerprint option:

C:\Program Files\GLPI-Agent>perl\bin\glpi-agent.exe test-certs.pl --server=https://192.168.1.37 --ssl-fingerprint=sha256$7261777918f63709ad52dce70167a78e95905d050244681759d24fce1e146713
[debug] Logger backend Stderr initialized
[debug] GLPI Agent (1.10)
[debug2] [http client] Using Compress::Zlib for compression
[debug2] [http client] 001243BB: sending message:
{
   "action": "contact",
   "deviceid": "foo",
   "name": "GLPI-Agent",
   "version": "1.10"
}
[debug] [http client] Updating keystore known certificates
[debug2] Changing to 'C:/Users/G-BOUG~1/AppData/Local/Temp/keystore-export-Z1nL2D' temporary folder
[debug2] executing certutil -Silent -Split -Store CA
[debug2] executing certutil -Silent -Split -Store Root
[debug2] executing certutil -Silent -Split -Enterprise -Store CA
[debug2] executing certutil -Silent -Split -Enterprise -Store Root
[debug2] executing certutil -Silent -Split -GroupPolicy -Store CA
[debug2] executing certutil -Silent -Split -GroupPolicy -Store Root
[debug2] executing certutil -Silent -Split -User -Store CA
[debug2] executing certutil -Silent -Split -User -Store Root
[debug2] executing certutil -encode 0119e81be9a14cd8e22f40ac118c687ecba3f4d8.crt temp.cer
[debug2] executing certutil -encode 0563b8630d62d75abbc8ab1e4bdfb5a899b24d43.crt temp.cer
[debug2] executing certutil -encode 06f1aa330b927b753a40e68cdf22e34bcbef3352.crt temp.cer
[debug2] executing certutil -encode 109f1caed645bb78b3ea2b94c0697c740733031c.crt temp.cer
[debug2] executing certutil -encode 18f7c1fcc3090203fd5baa2f861a754976c8dd25.crt temp.cer
[debug2] executing certutil -encode 245c97df7514e7cf2df8be72ae957b9e04741e85.crt temp.cer
[debug2] executing certutil -encode 30656fca8c48b1d98623a94b40a6bc98bd87bfad.crt temp.cer
[debug2] executing certutil -encode 31f9fc8ba3805986b721ea7295c65b3a44534274.crt temp.cer
[debug2] executing certutil -encode 3b1efd3a66ea28b16697394703a72ca340a05bd5.crt temp.cer
[debug2] executing certutil -encode 5fb7ee0633e259dbad0c4c9ae6d38f1a61c7dc25.crt temp.cer
[debug2] executing certutil -encode 7e92b66be51b79d8ce3ff25c15c2df6ab8c7f2f2.crt temp.cer
[debug2] executing certutil -encode 7f88cd7223f3c813818c994614a89c99fa3b5247.crt temp.cer
[debug2] executing certutil -encode 8bfe3107712b3c886b1c96aaec89984914dc9b6b.crt temp.cer
[debug2] executing certutil -encode 8f43288ad272f3103b6fb1428485ea3014c0bcfe.crt temp.cer
[debug2] executing certutil -encode 92b46c76e13054e104f230517e6e504d43ab10b5.crt temp.cer
[debug2] executing certutil -encode a2f9c670faf5b654641e0989ad30165d480b0d4f.crt temp.cer
[debug2] executing certutil -encode a43489159a520f0d93d032ccaf37e7fe20a8b419.crt temp.cer
[debug2] executing certutil -encode be36a4562fb2ee05dbb3d32323adf445084ed656.crt temp.cer
[debug2] executing certutil -encode cdd4eeae6000ac7f40c3802c171e30148030c072.crt temp.cer
[debug2] executing certutil -encode d559a586669b08f46a30a133f8a9ed3d038e2ea8.crt temp.cer
[debug2] executing certutil -encode fee449ee0e3965a5246f000e87fde2a065fd89d4.crt temp.cer
[debug2] Changing back to 'C:/Program Files/GLPI-Agent' folder
[debug] [http client] TRACE: Setting SSL_ca ssl option: 168
DEBUG: .../IO/Socket/SSL.pm:3024: new ctx 3023316353776
DEBUG: .../IO/Socket/SSL.pm:705: socket not yet connected
DEBUG: .../IO/Socket/SSL.pm:707: socket connected
DEBUG: .../IO/Socket/SSL.pm:730: ssl handshake not started
DEBUG: .../IO/Socket/SSL.pm:777: not using SNI because hostname is unknown
DEBUG: .../IO/Socket/SSL.pm:815: request OCSP stapling
DEBUG: .../IO/Socket/SSL.pm:831: set socket to non-blocking to enforce timeout=180
DEBUG: .../IO/Socket/SSL.pm:845: call Net::SSLeay::connect
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (OUT), TLS handshake, Client hello (1)
DEBUG: .../IO/Socket/SSL.pm:848: done Net::SSLeay::connect -> -1
DEBUG: .../IO/Socket/SSL.pm:858: ssl handshake in progress
DEBUG: .../IO/Socket/SSL.pm:868: waiting for fd to become ready: SSL wants a read first
DEBUG: .../IO/Socket/SSL.pm:888: socket ready, retrying connect
DEBUG: .../IO/Socket/SSL.pm:845: call Net::SSLeay::connect
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (IN), TLS handshake, Server hello (2)
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8)
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (IN), TLS handshake, Certificate (11)
DEBUG: .../IO/Socket/SSL.pm:2868: ok=0 [0] /CN=192.168.1.152/CN=192.168.1.152
DEBUG: .../IO/Socket/SSL.pm:2868: ok=1 [0] /CN=192.168.1.152/CN=192.168.1.152
DEBUG: .../IO/Socket/SSL.pm:1840: scheme=www cert=3023325925744
DEBUG: .../IO/Socket/SSL.pm:1850: identity=192.168.1.37 cn=192.168.1.152 alt=2 192.168.1.152
DEBUG: .../IO/Socket/SSL.pm:2869: local error: hostname verification failed
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (IN), TLS handshake, CERT verify (15)
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (IN), TLS handshake, Finished (20)
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1)
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (OUT), TLS handshake, Finished (20)
DEBUG: .../IO/Socket/SSL.pm:848: done Net::SSLeay::connect -> 1
DEBUG: .../IO/Socket/SSL.pm:903: ssl handshake done
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (IN), TLS handshake, Newsession Ticket (4)
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (IN), TLS handshake, Newsession Ticket (4)
[debug2] format: Zlib
[debug2] [http client] 001243BB: received message:
{"expiration":"24","status":"ok"}
[debug2] Got 168 certificates in SSL ca cache
[debug2] [http client] Using Compress::Zlib for compression
DEBUG: .../IO/Socket/SSL.pm:3073: free ctx 3023316353776 open=3023316353776
DEBUG: .../IO/Socket/SSL.pm:3077: free ctx 3023316353776 callback
DEBUG: .../IO/Socket/SSL.pm:3084: OK free ctx 3023316353776
[debug2] [http client] sending message:
<?xml version="1.0" encoding="UTF-8"?>
<REQUEST>
  <DEVICEID>foo</DEVICEID>
  <QUERY>PROLOG</QUERY>
  <TOKEN>12345678</TOKEN>
</REQUEST>
[debug] [http client] TRACE: Setting SSL_ca ssl option: 168
DEBUG: .../IO/Socket/SSL.pm:3024: new ctx 3023316353776
DEBUG: .../IO/Socket/SSL.pm:705: socket not yet connected
DEBUG: .../IO/Socket/SSL.pm:707: socket connected
DEBUG: .../IO/Socket/SSL.pm:730: ssl handshake not started
DEBUG: .../IO/Socket/SSL.pm:777: not using SNI because hostname is unknown
DEBUG: .../IO/Socket/SSL.pm:815: request OCSP stapling
DEBUG: .../IO/Socket/SSL.pm:831: set socket to non-blocking to enforce timeout=180
DEBUG: .../IO/Socket/SSL.pm:845: call Net::SSLeay::connect
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (OUT), TLS handshake, Client hello (1)
DEBUG: .../IO/Socket/SSL.pm:848: done Net::SSLeay::connect -> -1
DEBUG: .../IO/Socket/SSL.pm:858: ssl handshake in progress
DEBUG: .../IO/Socket/SSL.pm:868: waiting for fd to become ready: SSL wants a read first
DEBUG: .../IO/Socket/SSL.pm:888: socket ready, retrying connect
DEBUG: .../IO/Socket/SSL.pm:845: call Net::SSLeay::connect
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (IN), TLS handshake, Server hello (2)
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8)
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (IN), TLS handshake, Certificate (11)
DEBUG: .../IO/Socket/SSL.pm:2868: ok=0 [0] /CN=192.168.1.152/CN=192.168.1.152
DEBUG: .../IO/Socket/SSL.pm:2868: ok=1 [0] /CN=192.168.1.152/CN=192.168.1.152
DEBUG: .../IO/Socket/SSL.pm:1840: scheme=www cert=3023325931888
DEBUG: .../IO/Socket/SSL.pm:1850: identity=192.168.1.37 cn=192.168.1.152 alt=2 192.168.1.152
DEBUG: .../IO/Socket/SSL.pm:2869: local error: hostname verification failed
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (IN), TLS handshake, CERT verify (15)
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (IN), TLS handshake, Finished (20)
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1)
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (OUT), TLS handshake, Finished (20)
DEBUG: .../IO/Socket/SSL.pm:848: done Net::SSLeay::connect -> 1
DEBUG: .../IO/Socket/SSL.pm:903: ssl handshake done
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (IN), TLS handshake, Newsession Ticket (4)
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (IN), TLS handshake, Newsession Ticket (4)
[debug2] format: Zlib
[debug2] [http client] receiving message:
<?xml version="1.0"?>
<REPLY><PROLOG_FREQ>24</PROLOG_FREQ><RESPONSE>SEND</RESPONSE></REPLY>
[debug2] Starting child: -10020
[debug2] [http client] Using Compress::Zlib for compression
[debug2] [http client] sending message:
<?xml version="1.0" encoding="UTF-8"?>
<REQUEST>
  <DEVICEID>foo</DEVICEID>
  <QUERY>PROLOG</QUERY>
  <TOKEN>12345678</TOKEN>
</REQUEST>
[debug] [http client] TRACE: Setting SSL_ca ssl option: 168
DEBUG: .../IO/Socket/SSL.pm:3024: new ctx 3023345726640
DEBUG: .../IO/Socket/SSL.pm:705: socket not yet connected
DEBUG: .../IO/Socket/SSL.pm:707: socket connected
DEBUG: .../IO/Socket/SSL.pm:730: ssl handshake not started
DEBUG: .../IO/Socket/SSL.pm:777: not using SNI because hostname is unknown
DEBUG: .../IO/Socket/SSL.pm:815: request OCSP stapling
DEBUG: .../IO/Socket/SSL.pm:831: set socket to non-blocking to enforce timeout=180
DEBUG: .../IO/Socket/SSL.pm:845: call Net::SSLeay::connect
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (OUT), TLS handshake, Client hello (1)
DEBUG: .../IO/Socket/SSL.pm:848: done Net::SSLeay::connect -> -1
DEBUG: .../IO/Socket/SSL.pm:858: ssl handshake in progress
DEBUG: .../IO/Socket/SSL.pm:868: waiting for fd to become ready: SSL wants a read first
DEBUG: .../IO/Socket/SSL.pm:888: socket ready, retrying connect
DEBUG: .../IO/Socket/SSL.pm:845: call Net::SSLeay::connect
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (IN), TLS handshake, Server hello (2)
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8)
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (IN), TLS handshake, Certificate (11)
DEBUG: .../IO/Socket/SSL.pm:2868: ok=0 [0] /CN=192.168.1.152/CN=192.168.1.152
DEBUG: .../IO/Socket/SSL.pm:2868: ok=1 [0] /CN=192.168.1.152/CN=192.168.1.152
DEBUG: .../IO/Socket/SSL.pm:1840: scheme=www cert=3023350336480
DEBUG: .../IO/Socket/SSL.pm:1850: identity=192.168.1.37 cn=192.168.1.152 alt=2 192.168.1.152
DEBUG: .../IO/Socket/SSL.pm:2869: local error: hostname verification failed
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (IN), TLS handshake, CERT verify (15)
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (IN), TLS handshake, Finished (20)
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1)
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (OUT), TLS handshake, Finished (20)
DEBUG: .../IO/Socket/SSL.pm:848: done Net::SSLeay::connect -> 1
DEBUG: .../IO/Socket/SSL.pm:903: ssl handshake done
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (IN), TLS handshake, Newsession Ticket (4)
DEBUG: .../IO/Socket/SSL.pm:3751: * TLSv1.3 (IN), TLS handshake, Newsession Ticket (4)
[debug2] format: Zlib
[debug2] [http client] receiving message:
<?xml version="1.0"?>
<REPLY><PROLOG_FREQ>24</PROLOG_FREQ><RESPONSE>SEND</RESPONSE></REPLY>
[debug2] [http client] sending message:
<?xml version="1.0" encoding="UTF-8"?>
<REQUEST>
  <DEVICEID>foo</DEVICEID>
  <QUERY>PROLOG</QUERY>
  <TOKEN>12345678</TOKEN>
</REQUEST>
[debug2] format: Zlib
[debug2] [http client] receiving message:
<?xml version="1.0"?>
<REPLY><PROLOG_FREQ>24</PROLOG_FREQ><RESPONSE>SEND</RESPONSE></REPLY>
[debug2] [http client] sending message:
<?xml version="1.0" encoding="UTF-8"?>
<REQUEST>
  <DEVICEID>foo</DEVICEID>
  <QUERY>PROLOG</QUERY>
  <TOKEN>12345678</TOKEN>
</REQUEST>
[debug2] format: Zlib
[debug2] [http client] receiving message:
<?xml version="1.0"?>
<REPLY><PROLOG_FREQ>24</PROLOG_FREQ><RESPONSE>SEND</RESPONSE></REPLY>
[debug2] [http client] sending message:
<?xml version="1.0" encoding="UTF-8"?>
<REQUEST>
  <DEVICEID>foo</DEVICEID>
  <QUERY>PROLOG</QUERY>
  <TOKEN>12345678</TOKEN>
</REQUEST>
[debug2] format: Zlib
[debug2] [http client] receiving message:
<?xml version="1.0"?>
<REPLY><PROLOG_FREQ>24</PROLOG_FREQ><RESPONSE>SEND</RESPONSE></REPLY>
[debug2] [http client] sending message:
<?xml version="1.0" encoding="UTF-8"?>
<REQUEST>
  <DEVICEID>foo</DEVICEID>
  <QUERY>PROLOG</QUERY>
  <TOKEN>12345678</TOKEN>
</REQUEST>
[debug2] format: Zlib
[debug2] [http client] receiving message:
<?xml version="1.0"?>
<REPLY><PROLOG_FREQ>24</PROLOG_FREQ><RESPONSE>SEND</RESPONSE></REPLY>
[debug2] [http client] sending message:
<?xml version="1.0" encoding="UTF-8"?>
<REQUEST>
  <DEVICEID>foo</DEVICEID>
  <QUERY>PROLOG</QUERY>
  <TOKEN>12345678</TOKEN>
</REQUEST>
[debug2] format: Zlib
[debug2] [http client] receiving message:
<?xml version="1.0"?>
<REPLY><PROLOG_FREQ>24</PROLOG_FREQ><RESPONSE>SEND</RESPONSE></REPLY>
[debug2] [http client] sending message:
<?xml version="1.0" encoding="UTF-8"?>
<REQUEST>
  <DEVICEID>foo</DEVICEID>
  <QUERY>PROLOG</QUERY>
  <TOKEN>12345678</TOKEN>
</REQUEST>
[debug2] format: Zlib
[debug2] [http client] receiving message:
<?xml version="1.0"?>
<REPLY><PROLOG_FREQ>24</PROLOG_FREQ><RESPONSE>SEND</RESPONSE></REPLY>
[debug2] [http client] sending message:
<?xml version="1.0" encoding="UTF-8"?>
<REQUEST>
  <DEVICEID>foo</DEVICEID>
  <QUERY>PROLOG</QUERY>
  <TOKEN>12345678</TOKEN>
</REQUEST>
[debug2] format: Zlib
[debug2] [http client] receiving message:
<?xml version="1.0"?>
<REPLY><PROLOG_FREQ>24</PROLOG_FREQ><RESPONSE>SEND</RESPONSE></REPLY>
[debug2] [http client] sending message:
<?xml version="1.0" encoding="UTF-8"?>
<REQUEST>
  <DEVICEID>foo</DEVICEID>
  <QUERY>PROLOG</QUERY>
  <TOKEN>12345678</TOKEN>
</REQUEST>
[debug2] format: Zlib
[debug2] [http client] receiving message:
<?xml version="1.0"?>
<REPLY><PROLOG_FREQ>24</PROLOG_FREQ><RESPONSE>SEND</RESPONSE></REPLY>
[debug2] [http client] sending message:
<?xml version="1.0" encoding="UTF-8"?>
<REQUEST>
  <DEVICEID>foo</DEVICEID>
  <QUERY>PROLOG</QUERY>
  <TOKEN>12345678</TOKEN>
</REQUEST>
[debug2] format: Zlib
[debug2] [http client] receiving message:
<?xml version="1.0"?>
<REPLY><PROLOG_FREQ>24</PROLOG_FREQ><RESPONSE>SEND</RESPONSE></REPLY>
DEBUG: .../IO/Socket/SSL.pm:3073: free ctx 3023345726640 open=3023345726640
DEBUG: .../IO/Socket/SSL.pm:3077: free ctx 3023345726640 callback
DEBUG: .../IO/Socket/SSL.pm:3084: OK free ctx 3023345726640
[debug2] Leaving child: -10020
DEBUG: .../IO/Socket/SSL.pm:3073: free ctx 3023316353776 open=
DEBUG: .../IO/Socket/SSL.pm:3073: free ctx 3023316353776 open=3023316353776
DEBUG: .../IO/Socket/SSL.pm:3077: free ctx 3023316353776 callback
DEBUG: .../IO/Socket/SSL.pm:3084: OK free ctx 3023316353776

So can you tell me if this simple script reproduces your issue ? And than share the output ?

@g-bougard
Copy link
Member

Hi @silhusk

have you any feedback after my last comment ?

Anyway, you may be interested by last d23ad9f commit. It introduces ssl-keystore option support and in your case, you'll be able to test --ssl-keystore=none to just disable Windows KeyStore automatic support.

So can you test this new feature using the next nightly build ?

@silhusk
Copy link
Author

silhusk commented Sep 18, 2024

Hi @g-bougard, sorry for the delay in the response, I was trying to catch up with other things

the script does not reproduce the issue: glpi-agent_test-718.txt

I'm using the full url, including front/communication.php, because I have basic auth in place for the agents (Apache AuthType basic). Doing it on glpi.domain/marketplace/glpiinventory/ gets in the way of the admin web UI, which I don't want to use basic auth.

I modified the test script to add user => 'username', password => 'pw' to each of the three places where a new client is instantiated.

Will try the nightly. But I have a feeling it's more due to threading than certificates.

@g-bougard
Copy link
Member

Thank you for this feedback.

Indeed, the test I asked to run tends more to tell this is not a multi-threading issue and indeed the original error you have is really related to keystore certificates loading.

Now you explained you also use basic authentication, maybe this can also be related but I really hope not.

Let's wait to see if the new option permits you to use it as work-around.

@silhusk
Copy link
Author

silhusk commented Sep 18, 2024

This variation of the script does trigger the problem
test-certs-for-issue-718-mod.zip

here the respective log: glpi-agent_test-718-mod.txt

It is based on NetDiscovery.pm. In particular it uses Parallel::ForkManager.

When run with --threads=0 it does not crash. (change condition on line 109 to actually send something)

@silhusk
Copy link
Author

silhusk commented Sep 18, 2024

Nightly d23ad9f with --ssl-keystore=none does work perfectly!! Thanks!

@g-bougard
Copy link
Member

Nice to hear the new option helps you.

About your change on the test script, I don't see error triggered in your log. Did I miss something ?
About using Parallel::ForkManager where I use threads, indeed, on windows, forking means threads is used. So my purpose was to try to reproduce what Parallel::ForkManager does on windows.

@silhusk
Copy link
Author

silhusk commented Sep 19, 2024

About your change on the test script, I don't see error triggered in your log. Did I miss something ?

It just crashes midway through, no error message. A complete execution would end with each thread having printed [debug2] Leaving child: and output like this:

DEBUG: .../IO/Socket/SSL.pm:920: got OCSP success with stapling
DEBUG: .../IO/Socket/SSL.pm:3800: * TLSv1.3 (IN), TLS handshake, Newsession Ticket (4)
[debug] [http client] authentication required, submitting credentials
[debug] [http client] authentication required, trying basic with AgentSmart user (Agent)
[debug2] format: Zlib
[debug2] [http client] receiving message:
<?xml version="1.0"?>
<REPLY><PROLOG_FREQ>24</PROLOG_FREQ><RESPONSE>SEND</RESPONSE></REPLY>
DEBUG: .../IO/Socket/SSL.pm:3122: free ctx 2438569961200 open=2438569961200 2438569959456
DEBUG: .../IO/Socket/SSL.pm:3126: free ctx 2438569961200 callback
DEBUG: .../IO/Socket/SSL.pm:3133: OK free ctx 2438569961200
[debug2] Leaving child: 15796
DEBUG: .../IO/Socket/SSL.pm:3122: free ctx 2438569959456 open=2438569959456
DEBUG: .../IO/Socket/SSL.pm:3126: free ctx 2438569959456 callback
DEBUG: .../IO/Socket/SSL.pm:3133: OK free ctx 2438569959456

c:\Program Files\GLPI-Agent>

In the log attached to the previous post, please notice the last Starting child: -10092, then a message is sent but no response is ever logged.

@silhusk
Copy link
Author

silhusk commented Sep 19, 2024

This glpi-agent_test-718-t1nos.txt is a run with errors instead of crash. The first requests go through, but the second child fails.

@g-bougard
Copy link
Member

Hi @g-bougard

okay, thank you for this feedback.

By now, I can only suppose Parallel::ForkManager is doing something wrong in this context. And I need to find out what before investigating for a fix.

@g-bougard
Copy link
Member

Hi @silhusk

fyi, just released GLPI-Agent 1.11 which includes the ssl-keystore option support.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
windows Issue occurs on windows
Projects
None yet
Development

No branches or pull requests

2 participants