Unable to connect to Formplayer

Hi,

The following error is generated when we try to make a new version of an app after a new deployment.
SSL Certificate Checker shows that nothing is wrong with the certificate validation.

2024-07-09 07:31:18,566 ERROR [notify] Notify Exception: Error calling Formplayer form validation endpoint
Traceback (most recent call last):
  File "/home/cchq/www/echis/releases/2024-07-08_21.56/python_env-3.9/lib/python3.9/site-packages/urllib3/connectionpool.py", line 715, in urlopen
    httplib_response = self._make_request(
  File "/home/cchq/www/echis/releases/2024-07-08_21.56/python_env-3.9/lib/python3.9/site-packages/urllib3/connectionpool.py", line 404, in _make_request
    self._validate_conn(conn)
  File "/home/cchq/www/echis/releases/2024-07-08_21.56/python_env-3.9/lib/python3.9/site-packages/urllib3/connectionpool.py", line 1060, in _validate_conn
    conn.connect()
  File "/home/cchq/www/echis/releases/2024-07-08_21.56/python_env-3.9/lib/python3.9/site-packages/urllib3/connection.py", line 419, in connect
    self.sock = ssl_wrap_socket(
  File "/home/cchq/www/echis/releases/2024-07-08_21.56/python_env-3.9/lib/python3.9/site-packages/urllib3/util/ssl_.py", line 449, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(
  File "/home/cchq/www/echis/releases/2024-07-08_21.56/python_env-3.9/lib/python3.9/site-packages/urllib3/util/ssl_.py", line 493, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
  File "/home/cchq/www/echis/releases/2024-07-08_21.56/python_env-3.9/lib/python3.9/site-packages/gevent/ssl.py", line 121, in wrap_socket
    return self.sslsocket_class(
  File "/home/cchq/www/echis/releases/2024-07-08_21.56/python_env-3.9/lib/python3.9/site-packages/gevent/ssl.py", line 319, in __init__
    raise x
  File "/home/cchq/www/echis/releases/2024-07-08_21.56/python_env-3.9/lib/python3.9/site-packages/gevent/ssl.py", line 315, in __init__
    self.do_handshake()
  File "/home/cchq/www/echis/releases/2024-07-08_21.56/python_env-3.9/lib/python3.9/site-packages/gevent/ssl.py", line 673, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate (_ssl.c:1133)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/cchq/www/echis/releases/2024-07-08_21.56/python_env-3.9/lib/python3.9/site-packages/requests/adapters.py", line 489, in send
    resp = conn.urlopen(
  File "/home/cchq/www/echis/releases/2024-07-08_21.56/python_env-3.9/lib/python3.9/site-packages/urllib3/connectionpool.py", line 801, in urlopen
    retries = retries.increment(
  File "/home/cchq/www/echis/releases/2024-07-08_21.56/python_env-3.9/lib/python3.9/site-packages/urllib3/util/retry.py", line 594, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.echisethiopia.org', port=443): Max retries exceeded with url: /formplayer/validate_form (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate (_ssl.c:1133)')))

Services status:

SUCCESS (Took   0.11s) kafka          : Kafka seems to be in order
SUCCESS (Took   0.00s) redis          : Redis is up and using 2.68G memory
SUCCESS (Took   0.06s) postgres       : default:commcarehq:OK auditcare:commcarehq_auditcare:OK p1:commcarehq_p1:OK p2:commcarehq_p2:OK p3:commcarehq_p3:OK p4:commcarehq_p4:OK p5:commcarehq_p5:OK p6:commcarehq_p6:OK p7:commcarehq_p7:OK p8:commcarehq_p8:OK proxy:commcarehq_proxy:OK synclogs:commcarehq_synclogs:OK ucr:commcarehq_ucr:OK Successfully got a user from postgres
SUCCESS (Took   0.01s) couch          : Successfully queried an arbitrary couch view
SUCCESS (Took   0.01s) celery         : OK
SUCCESS (Took   0.08s) elasticsearch  : Successfully sent a doc to ES and read it back
SUCCESS (Took   0.27s) blobdb         : Successfully saved a file to the blobdb
SUCCESS (Took   0.02s) formplayer     : Formplayer returned a 200 status code: https://www.echisethiopia.org/formplayer/serverup

Thank you,

Hi Siraj,

Looking at the error it seems that the server isn't recognising the CA from the SSL certificate, leading it to believe that the certificate is self-signed. A possible cause for this might be that the CA DB on the machine needs to be updated. This can be done by running the following commands on the machine:

sudo apt-get update
sudo update-ca-certificates

After updating the CA store, you can verify that the certificate chains to a trusted CA with openssl verify /path/to/mycert.crt
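To reproduce the failure outside of CommCare HQ, a minimal Python sketch along these lines may help. It attempts the same kind of verified TLS handshake that appears in the traceback, using Python's default trust store. The names build_context and probe are made up for illustration, and the optional cafile argument is an assumption for trying a specific CA bundle. Note that requests may use a different CA bundle (for example the one shipped with certifi), so the result is a hint rather than proof.

```python
import socket
import ssl


def build_context(cafile=None):
    """Return a client context that verifies the certificate chain and host name."""
    # With cafile=None, Python loads the platform's default CA certificates.
    return ssl.create_default_context(cafile=cafile)


def probe(host, port=443, timeout=5.0, cafile=None):
    """Attempt a verified TLS handshake and return (ok, message)."""
    context = build_context(cafile)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return True, "handshake succeeded using " + str(tls.version())
    except ssl.SSLCertVerificationError as exc:
        # verify_message carries OpenSSL's reason, e.g. "self-signed certificate".
        return False, "certificate verification failed: " + exc.verify_message
    except OSError as exc:
        # Covers refused connections, timeouts, DNS failures and other TLS errors.
        return False, "connection failed: " + str(exc)
```

Running probe("www.echisethiopia.org") on the web worker and on the Formplayer machine would show whether each machine sees a certificate it can verify.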

Kind regards,
Zandre

Hi Zandre,

I followed these steps:

echo -n | openssl s_client -connect www.echisethiopia.org:443 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > echisethiopia.crt
sudo cp echisethiopia.crt /usr/local/share/ca-certificates/

sudo update-ca-certificates

Verification:

openssl verify /usr/local/share/ca-certificates/ca-certificates.crt
/usr/local/share/ca-certificates/ca-certificates.crt: OK

openssl verify /etc/ssl/certs/echisethiopia.crt
CN = www.echisethiopia.org
error 20 at 0 depth lookup: unable to get local issuer certificate
error /etc/ssl/certs/echisethiopia.crt: verification failed
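Error 20 means that OpenSSL could not find the issuer of the certificate in its trust store. Without the -showcerts option, openssl s_client prints only the server's own certificate, so the saved file may not contain the issuing CA at all. A small sketch, with hypothetical helper names, that counts the certificates in a PEM file can confirm what was actually saved:

```python
import re

# Matches one PEM-encoded certificate, including its BEGIN/END markers.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----",
    re.DOTALL,
)


def split_pem_bundle(text):
    """Return each PEM-encoded certificate found in text, in order."""
    return PEM_BLOCK.findall(text)


def describe_bundle(text):
    """Summarise how many certificates the text contains."""
    count = len(split_pem_bundle(text))
    if count == 0:
        return "no certificates found"
    if count == 1:
        return "1 certificate (no chain included)"
    return f"{count} certificates (a chain or bundle)"
```

For example, describe_bundle(open("echisethiopia.crt").read()) reports whether the file holds a single certificate or a chain.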

The problem is not solved.

Thank you,

Hi Siraj,

Could you kindly please follow the steps laid out in this documentation to create a new Let's Encrypt certificate for the machine?

Essentially, the steps are as follows:

  1. Run commcare-cloud <instance_name> ansible-playbook letsencrypt_cert.yml --skip-check to generate a new cert.
  2. Ensure that fake_ssl_cert is set to False in proxy.yml.
  3. Deploy proxy with commcare-cloud <instance_name> ansible-playbook deploy_proxy.yml --skip-check

Kind regards,
Zandre

Hi Zandre,

Sure, followed the steps.

The Qualys SSL Labs SSL Server Test for www.echisethiopia.org shows that the certificate has an issue.
We are investigating the problem. Will let you know.

Thank you,

Hi Zandre,

The certificate is issued properly, except for the following:

DNS CAA : No
Chrome 49 / XP SP3: Server sent fatal alert: handshake_failure
Session resumption (caching): No (IDs assigned but not accepted)
Strict Transport Security (HSTS): Invalid

Are these attributes required to be trusted?

Thank you,

Hi Siraj,

Points 1, 3, and 4 are good to implement from a security and performance standpoint; however, they are not required for the SSL certificate to be trusted.

Point 2 simply means that clients as old as Chrome 49 on Windows XP SP3 might fail the TLS handshake when connecting to your domain.

After creating a new SSL certificate, were you able to successfully create a new app version?

Kind regards,
Zandre

Hi Zandre,

Yes.
On July 7th I created one app after the new certificate was issued (July 5th); the error then occurred all of a sudden.
The error is still appearing even though a new certificate has been created.

Thank you,

Hi, @sirajhassan.

The error is still appearing even though a new certificate has been created.

Could you confirm whether this is the exact same error as you reported in the first comment?

Hi @CharlSmit,

Sure, it's the same error

Are you by any chance running the CommCare HQ environment on a virtual private server?

Yes. One formplayer and one web worker service are deployed on the telecom's VPN network.

Also, could you confirm the values of the following fields in the proxy.yml file, please?

  1. fake_ssl_cert
  2. letsencrypt_cchq_ssl

Also, are the following keys present in the file (you don't have to share the value):

  1. nginx_combined_cert_value
  2. nginx_key_value

fake_ssl_cert: no
letsencrypt_cchq_ssl: True

nginx_combined_cert_value and nginx_key_value are not added.
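These settings can also be checked mechanically. The sketch below is only an illustration: it reads simple top-level key: value lines, as found in a file like proxy.yml, and interprets YAML 1.1 style booleans, so that no and False both count as false. It is not a full YAML parser, matching is case-insensitive as a simplification, and all helper names are hypothetical.

```python
TRUE_WORDS = {"true", "yes", "on"}
FALSE_WORDS = {"false", "no", "off"}


def read_flags(text):
    """Parse top-level 'key: value' lines into a dict of raw string values."""
    flags = {}
    for line in text.splitlines():
        # Drop trailing comments but keep leading whitespace to detect nesting.
        content = line.split("#", 1)[0].rstrip()
        if not content or content[0].isspace() or ":" not in content:
            continue
        key, value = content.split(":", 1)
        flags[key.strip()] = value.strip()
    return flags


def as_bool(value):
    """Interpret an unquoted YAML 1.1 style boolean word."""
    word = value.strip().lower()
    if word in TRUE_WORDS:
        return True
    if word in FALSE_WORDS:
        return False
    raise ValueError(f"not a boolean: {value!r}")


def uses_real_letsencrypt_cert(text):
    """True when fake_ssl_cert is off and letsencrypt_cchq_ssl is on.

    Raises KeyError if either key is missing, rather than guessing a default.
    """
    flags = read_flags(text)
    fake = as_bool(flags["fake_ssl_cert"])
    letsencrypt = as_bool(flags["letsencrypt_cchq_ssl"])
    return (not fake) and letsencrypt
```

With the two values reported above, uses_real_letsencrypt_cert returns True, which suggests the proxy configuration itself is not asking for a fake certificate.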

Hi @CharlSmit @zandre_eng,

We are now able to make new versions of the apps after redeploying CouchDB, and the certificate error is gone.
However, there is still a Formplayer connection error.

Error message from the server:

ERROR [notify] [Cloudcare] Unable to connect to form playing service. Please report an issue if you continue to see this message.

Regards,

Hi, @sirajhassan .

I'm glad to hear that the certificate error is gone.

To better diagnose the formplayer error, would you mind looking at the formplayer logs to see whether anything interesting shows up there?

The logs should be at /home/cchq/www/<env>/log/formplayer-spring.log

This is the formplayer error message in /home/cchq/www/env/log/formplayer-spring.log file.

Hi, @sirajhassan .

Having had a look at the logs, the following line catches my eye:

sun.security.provider.certpath.SunCertPathBuilderException

From a quick Google search, it unfortunately appears to be related to the self-signed certificate issue you were having earlier.

Let me run this issue past some other folks as well to get more input.

Hi, @sirajhassan .

Two things to consider:

  1. Is the formplayer machine using some custom hosts file or similar to route traffic from formplayer to the HQ machine?

  2. The issue in the logs seems like a general issue that occurs when the Java environment does not have a proper CA certificate path to the HTTPS server (HQ in this case) to verify that it is a valid website. As such we need to make sure the certificate is present in the java trusted keystore.
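For the first point, a quick way to spot an override is to look for the HQ host name in the hosts file on the formplayer machine. The following is a minimal sketch, assuming the usual hosts-file layout of an address followed by one or more names; hosts_entries is a hypothetical helper name.

```python
def hosts_entries(text, hostname):
    """Return the IP addresses that a hosts-file text maps to hostname."""
    wanted = hostname.lower()
    matches = []
    for line in text.splitlines():
        # Ignore comments, then split into the address and its names.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        address, names = fields[0], fields[1:]
        if wanted in (name.lower() for name in names):
            matches.append(address)
    return matches
```

Calling hosts_entries(open("/etc/hosts").read(), "www.echisethiopia.org") returns an empty list when there is no override. Any address it does return is where formplayer's requests to HQ are being routed, and that endpoint may be presenting a different certificate.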

For number 2 above, you can read through this article and this one for context.

Basically, we need to make sure the certificate of the server (https://www.echisethiopia.org/) is listed in the keystore. To check this, log into the formplayer machine and run the following command:

keytool -list -keystore <path_to_cacerts>

The cacerts file would most likely be located at /etc/ssl/certs/java/cacerts
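Since keytool -list prints a fingerprint for each entry (SHA-256 on recent JDKs; older ones may print SHA-1), one hedged way to compare is to compute the fingerprint of the certificate the server actually presents and search for it in the keytool output. This is a sketch with hypothetical function names. In many setups it is the issuing CA, rather than the server's own certificate, that appears in cacerts, so a missing match is not conclusive by itself.

```python
import hashlib
import ssl


def fingerprint_from_pem(pem_text):
    """Return the SHA-256 fingerprint of a PEM certificate as AA:BB:... hex."""
    der = ssl.PEM_cert_to_DER_cert(pem_text.strip())
    digest = hashlib.sha256(der).hexdigest().upper()
    # Insert a colon between each pair of hex digits.
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def server_fingerprint(host, port=443):
    """Fetch the certificate the server presents, without verifying it, and fingerprint it."""
    pem = ssl.get_server_certificate((host, port))
    return fingerprint_from_pem(pem)
```

Running server_fingerprint("www.echisethiopia.org") on the formplayer machine gives a value to look for in the keytool listing.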

Hi @CharlSmit,

Now, the formplayer log shows this error message.