Internal Server error: on Local server setup

Hi,

Yesterday I do a partial server upgrade OS to ubuntu 18.04(installed locally). I've taken all the necessary steps, like taking a backup and restore back after upgrade.

  1. Upgrade to ubuntu 18.04
  2. run deploy-stack
  3. restore couchdb2 data on the upgraded VM
  4. check_service:
    everything is running fine, except some shards missing on elasticsearch
    5.check django.log
    No error
    6.checking echis0-commcarehq.django.log: No error
    tail -f /home/cchq/www/echis/log/echis0-commcarehq.django.log
  5. implement lets-encrypt-CA
    Successful using certbot

But the page is still displaying Internal Server Error.

Note!. the following Databases are out of the upgrade process and are untouched during the process: Postgresql, Redis, Minio,

Hi Demisew

Is there any details of the error in either the nginx log or the Django log?

tail -f /home/cchq/www/echis/log/django.log

/home/cchq/www/echis/releases/2019-10-26_17.12/python_env-3.6/lib/python3.6/site-packages/psycopg2/init.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: http://initd.org/psycopg/docs/install.html#binary-install-from-pypi.
""")
2019-10-28 07:41:22,337 INFO AXES: BEGIN LOG
/home/cchq/www/echis/releases/2019-10-26_17.12/python_env-3.6/lib/python3.6/site-packages/psycopg2/init.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: http://initd.org/psycopg/docs/install.html#binary-install-from-pypi.

tail -f /home/cchq/www/echis/log/echis0-commcarehq.django.log

INFO 2019-10-28 07:43:57,321 login 659 139667994446280 AXES: BEGIN LOG
ERROR 2019-10-28 07:44:09,247 conn 27679 139742349107528 <BrokerConnection node_id=0 host=172.19.3.40:9092 [IPv4 ('172.19.3.40', 9092)]>: Error receiving network data closing socket
Traceback (most recent call last):
File "/home/cchq/www/echis/releases/2019-10-26_17.12/python_env-3.6/lib/python3.6/site-packages/kafka/conn.py", line 817, in _recv
data = self._sock.recv(self.config['sock_chunk_bytes'])
File "/home/cchq/www/echis/releases/2019-10-26_17.12/python_env-3.6/lib/python3.6/site-packages/gevent/_socket3.py", line 378, in recv
return _socket.socket.recv(self._sock, *args)
ConnectionResetError: [Errno 104] Connection reset by peer
ERROR 2019-10-28 07:44:09,351 conn 27679 139742349107528 <BrokerConnection node_id=0 host=172.19.3.40:9092 [IPv4 ('172.19.3.40', 9092)]>: socket disconnected
INFO 2019-10-28 07:44:11,288 login 723 140302094375368 AXES: BEGIN LOG
INFO 2019-10-28 07:44:25,290 login 850 139629977754056 AXES: BEGIN LOG
INFO 2019-10-28 07:44:39,209 login 986 139928720890312 AXES: BEGIN LOG

tail -f /var/log/nginx/error.log
Displays nothing

Hi SImon,
The above django.log file i took, doesn't show the problem. Because django service was down(or continiously restarting).

The following log displays the error when django service started successfully.

tail -f /home/cchq/www/echis/log/django.log

raise original_exception
File "/home/cchq/www/echis/releases/2019-10-26_17.12/python_env-3.6/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 889, in execute_sql
cursor.execute(sql, params)
File "/home/cchq/www/echis/releases/2019-10-26_17.12/python_env-3.6/lib/python3.6/site-packages/sentry_sdk/integrations/django/init.py", line 395, in execute
return real_execute(self, sql, params)
File "/home/cchq/www/echis/releases/2019-10-26_17.12/python_env-3.6/lib/python3.6/site-packages/django/db/backends/utils.py", line 64, in execute
return self.cursor.execute(sql, params)
File "/home/cchq/www/echis/releases/2019-10-26_17.12/python_env-3.6/lib/python3.6/site-packages/django/db/utils.py", line 94, in exit
six.reraise(dj_exc_type, dj_exc_value, traceback)
File "/home/cchq/www/echis/releases/2019-10-26_17.12/python_env-3.6/lib/python3.6/site-packages/django/utils/six.py", line 685, in reraise
raise value.with_traceback(tb)
File "/home/cchq/www/echis/releases/2019-10-26_17.12/python_env-3.6/lib/python3.6/site-packages/django/db/backends/utils.py", line 64, in execute
return self.cursor.execute(sql, params)
django.db.utils.ProgrammingError: column phone_synclogsql.app_id does not exist
LINE 1: ...sql"."log_format", "phone_synclogsql"."build_id", "phone_syn...
^
tail -f /home/cchq/www/echis/log/echis0-commcarehq.django.log

File "/home/cchq/www/echis/releases/2019-10-26_17.12/python_env-3.6/lib/python3.6/site-packages/django/db/utils.py", line 94, in exit
six.reraise(dj_exc_type, dj_exc_value, traceback)
File "/home/cchq/www/echis/releases/2019-10-26_17.12/python_env-3.6/lib/python3.6/site-packages/django/utils/six.py", line 685, in reraise
raise value.with_traceback(tb)
File "/home/cchq/www/echis/releases/2019-10-26_17.12/python_env-3.6/lib/python3.6/site-packages/django/db/backends/utils.py", line 64, in execute
return self.cursor.execute(sql, params)
django.db.utils.ProgrammingError: column phone_synclogsql.app_id does not exist
LINE 1: ...sql"."log_format", "phone_synclogsql"."build_id", "phone_syn...

tail -f /var/log/nginx/error.log

2019/10/28 07:59:15 [alert] 514#514: worker process 4722 exited on signal 9

The error is referring to a column in the DB that was recently added. You should do a deploy to fix the issue (since a deploy will run the DB migration and ensure that all your services have the same version of code running).

The reason this happened is that running deploy_stack installed the latest version of CommCare HQ code but it did not run the migrations. Migrations are only run if the CCHQ_IS_FRESH_INSTALL var is defined.

In general you should do a deploy after setting up a new machine to ensure that you have consistent code across all the CommCare services. This workflow is described here: https://dimagi.github.io/commcare-cloud/howto/add-celery-machines.html

Am getting the following error on redeployment

i do set the following inside /home/ansible/commcare-cloud/src/commcare_cloud/fab/config.py
GITHUB_APIKEY = 'None'

cchq echis deploy

Deploy commcare? [y/N] y
.....
....
Github username (leave blank for no auth):
I pressed Enter.
..................................
...............................

[172.19.3.41] Executing task 'deploy_commcare'
Are you sure you want to preindex and deploy to echis? [y/N] y

.....................................
.......................................
.....................................

File "/home/ansible/.virtualenvs/ansible/local/lib/python2.7/site-packages/requests/sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "/home/ansible/.virtualenvs/ansible/local/lib/python2.7/site-packages/requests/adapters.py", line 529, in send
raise ReadTimeout(e, request=request)
ReadTimeout: HTTPSConnectionPool(host='"'"'api.github.com'"'"', port=443): Read timed out. (read timeout=15)' --environment echis"
Disconnecting from 172.19.3.41... done

Is there anything i missed on the config?

You should set it to None (no quotes) otherwise it will try use use the value 'None' as your API key:

GITHUB_APIKEY = None

Thank you, Simon

It worked! Redeployment succeeded. And the error is gone.