The error and warning messages below were found in the celery_ucr_queue logs. Are they helpful for investigating the issue?
Unfortunately, I couldn't get the celery_background_log file.
2023-05-14 11:33:22,268 ERROR [celery.utils.dispatch.signal] Signal handler <function celery_add_time_sent at 0x7f05793bea60> raised: ConnectionError('Error 111 connecting to 172.19.4.33:6379. Connection refused.')
Traceback (most recent call last):
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django_redis/cache.py", line 27, in _decorator
return method(self, *args, **kwargs)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django_redis/cache.py", line 76, in set
return self.client.set(*args, **kwargs)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django_redis/client/default.py", line 166, in set
raise ConnectionInterrupted(connection=client) from e
django_redis.exceptions.ConnectionInterrupted: Redis ConnectionError: Error 111 connecting to 172.19.4.33:6379. Connection refused.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/celery/utils/dispatch/signal.py", line 276, in send
response = receiver(signal=self, sender=sender, **named)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/corehq/celery_monitoring/signals.py", line 22, in celery_add_time_sent
TimeToStartTimer(task_id).start_timing(eta)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/corehq/celery_monitoring/signals.py", line 110, in start_timing
cache.set(self._cache_key, eta or datetime.datetime.utcnow(), timeout=3 * 24 * 60 * 60)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django_redis/cache.py", line 34, in _decorator
raise e.__cause__
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django_redis/client/default.py", line 156, in set
return bool(client.set(nkey, nvalue, nx=nx, px=timeout, xx=xx))
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/redis/commands/core.py", line 2302, in set
return self.execute_command("SET", *pieces, **options)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/sentry_sdk/integrations/redis.py", line 170, in sentry_patched_execute_command
return old_execute_command(self, name, *args, **kwargs)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/redis/client.py", line 1255, in execute_command
conn = self.connection or pool.get_connection(command_name, **options)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/redis/connection.py", line 1442, in get_connection
connection.connect()
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/redis/connection.py", line 704, in connect
raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 111 connecting to 172.19.4.33:6379. Connection refused.
2023-05-14 11:36:23,159 ERROR [celery.utils.dispatch.signal] Signal handler <function update_celery_state at 0x7f056fa189d0> raised: OperationalError('connection to server at "172.19.3.36", port 6432 failed: FATAL: client_login_timeout (server down)\n')
Traceback (most recent call last):
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django_redis/cache.py", line 27, in _decorator
return method(self, *args, **kwargs)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django_redis/cache.py", line 94, in _get
return self.client.get(key, default=default, version=version, client=client)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django_redis/client/default.py", line 222, in get
raise ConnectionInterrupted(connection=client) from e
django_redis.exceptions.ConnectionInterrupted: Redis ConnectionError: Error 111 connecting to 172.19.4.33:6379. Connection refused.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/celery/app/trace.py", line 451, in trace_task
R = retval = fun(*args, **kwargs)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/sentry_sdk/integrations/celery.py", line 229, in _inner
reraise(*exc_info)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/sentry_sdk/_compat.py", line 60, in reraise
raise value
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/sentry_sdk/integrations/celery.py", line 224, in _inner
return f(*args, **kwargs)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/celery/app/trace.py", line 734, in __protected_call__
return self.run(*args, **kwargs)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/corehq/celery_monitoring/heartbeat.py", line 118, in heartbeat
self.get_and_report_blockage_duration()
File "/home/cchq/www/echis/releases/2023-05-13_11.49/corehq/celery_monitoring/heartbeat.py", line 74, in get_and_report_blockage_duration
blockage_duration = self.get_blockage_duration()
File "/home/cchq/www/echis/releases/2023-05-13_11.49/corehq/celery_monitoring/heartbeat.py", line 70, in get_blockage_duration
return max(datetime.datetime.utcnow() - self.get_last_seen() - HEARTBEAT_FREQUENCY,
File "/home/cchq/www/echis/releases/2023-05-13_11.49/corehq/celery_monitoring/heartbeat.py", line 49, in get_last_seen
value = self._heartbeat_cache.get()
File "/home/cchq/www/echis/releases/2023-05-13_11.49/corehq/celery_monitoring/heartbeat.py", line 27, in get
return cache.get(self._cache_key())
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django_redis/cache.py", line 87, in get
value = self._get(key, default, version, client)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django_redis/cache.py", line 34, in _decorator
raise e.__cause__
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django_redis/client/default.py", line 220, in get
value = client.get(key)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/redis/commands/core.py", line 1790, in get
return self.execute_command("GET", name)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/sentry_sdk/integrations/redis.py", line 170, in sentry_patched_execute_command
return old_execute_command(self, name, *args, **kwargs)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/redis/client.py", line 1255, in execute_command
conn = self.connection or pool.get_connection(command_name, **options)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/redis/connection.py", line 1442, in get_connection
connection.connect()
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/redis/connection.py", line 704, in connect
raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 111 connecting to 172.19.4.33:6379. Connection refused.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django/db/backends/base/base.py", line 219, in ensure_connection
self.connect()
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/sentry_sdk/integrations/django/__init__.py", line 605, in connect
return real_connect(self)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django/utils/asyncio.py", line 33, in inner
return func(*args, **kwargs)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django/db/backends/base/base.py", line 200, in connect
self.connection = self.get_new_connection(conn_params)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django/utils/asyncio.py", line 33, in inner
return func(*args, **kwargs)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django/db/backends/postgresql/base.py", line 187, in get_new_connection
connection = Database.connect(**conn_params)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/psycopg2/__init__.py", line 127, in connect
conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: connection to server at "172.19.3.36", port 6432 failed: FATAL: client_login_timeout (server down)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/celery/utils/dispatch/signal.py", line 276, in send
response = receiver(signal=self, sender=sender, **named)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/corehq/ex-submodules/casexml/apps/phone/tasks.py", line 68, in update_celery_state
backend.store_result(headers['id'], None, ASYNC_RESTORE_SENT)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/celery/backends/base.py", line 528, in store_result
self._store_result(task_id, result, state, traceback,
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django_celery_results/backends/database.py", line 132, in _store_result
self.TaskModel._default_manager.store_result(**task_props)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django_celery_results/managers.py", line 43, in _inner
return fun(*args, **kwargs)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django_celery_results/managers.py", line 168, in store_result
obj, created = self.using(using).get_or_create(task_id=task_id,
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django/db/models/query.py", line 581, in get_or_create
return self.get(**kwargs), False
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django/db/models/query.py", line 431, in get
num = len(clone)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django/db/models/query.py", line 262, in __len__
self._fetch_all()
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django/db/models/query.py", line 1324, in _fetch_all
self._result_cache = list(self._iterable_class(self))
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django/db/models/query.py", line 51, in __iter__
results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 1173, in execute_sql
cursor = self.connection.cursor()
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django/utils/asyncio.py", line 33, in inner
return func(*args, **kwargs)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django/db/backends/base/base.py", line 259, in cursor
return self._cursor()
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django/db/backends/base/base.py", line 235, in _cursor
self.ensure_connection()
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django/utils/asyncio.py", line 33, in inner
return func(*args, **kwargs)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django/db/backends/base/base.py", line 219, in ensure_connection
self.connect()
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django/db/utils.py", line 90, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django/db/backends/base/base.py", line 219, in ensure_connection
self.connect()
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/sentry_sdk/integrations/django/__init__.py", line 605, in connect
return real_connect(self)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django/utils/asyncio.py", line 33, in inner
return func(*args, **kwargs)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django/db/backends/base/base.py", line 200, in connect
self.connection = self.get_new_connection(conn_params)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django/utils/asyncio.py", line 33, in inner
return func(*args, **kwargs)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/django/db/backends/postgresql/base.py", line 187, in get_new_connection
connection = Database.connect(**conn_params)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/psycopg2/__init__.py", line 127, in connect
conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
django.db.utils.OperationalError: connection to server at "172.19.3.36", port 6432 failed: FATAL: client_login_timeout (server down)
2023-05-14 11:36:23,727 WARNING [celery.worker.consumer.consumer] consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/celery/worker/consumer/consumer.py", line 332, in start
blueprint.start(self)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/celery/bootsteps.py", line 116, in start
step.start(parent)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/celery/worker/consumer/consumer.py", line 628, in start
c.loop(*c.loop_args())
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/celery/worker/loops.py", line 130, in synloop
connection.drain_events(timeout=2.0)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/kombu/connection.py", line 316, in drain_events
return self.transport.drain_events(self.connection, **kwargs)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/kombu/transport/pyamqp.py", line 169, in drain_events
return connection.drain_events(**kwargs)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/amqp/connection.py", line 525, in drain_events
while not self.blocking_read(timeout):
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/amqp/connection.py", line 530, in blocking_read
frame = self.transport.read_frame()
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/amqp/transport.py", line 294, in read_frame
frame_header = read(7, True)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/amqp/transport.py", line 635, in _read
raise OSError('Server unexpectedly closed connection')
OSError: Server unexpectedly closed connection
2023-05-14 11:36:23,729 WARNING [py.warnings] /home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/celery/worker/consumer/consumer.py:367: CPendingDeprecationWarning:
In Celery 5.1 we introduced an optional breaking change which
on connection loss cancels all currently executed tasks with late acknowledgement enabled.
These tasks cannot be acknowledged as the connection is gone, and the tasks are automatically redelivered back to the queue.
You can enable this behavior using the worker_cancel_long_running_tasks_on_connection_loss setting.
In Celery 5.1 it is set to False by default. The setting will be set to True by default in Celery 6.0.
warnings.warn(CANCEL_TASKS_BY_DEFAULT, CPendingDeprecationWarning)
2023-05-14 11:36:24,973 CRITICAL [celery.worker.request] Couldn't ack 47, reason:RecoverableConnectionError(None, 'connection already closed', None, '')
Traceback (most recent call last):
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/kombu/message.py", line 128, in ack_log_error
self.ack(multiple=multiple)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/kombu/message.py", line 123, in ack
self.channel.basic_ack(self.delivery_tag, multiple=multiple)
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/amqp/channel.py", line 1407, in basic_ack
return self.send_method(
File "/home/cchq/www/echis/releases/2023-05-13_11.49/python_env-3.9/lib/python3.9/site-packages/amqp/abstract_channel.py", line 67, in send_method
raise RecoverableConnectionError('connection already closed')
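If it helps narrow things down, a quick reachability check along these lines could show whether the two hosts named in the errors accept TCP connections from the celery machine. This is only a sketch: the addresses and ports are copied from the log lines above, the service labels are my assumption (port 6432 and the client_login_timeout message look like PgBouncer), and check_tcp is just an illustrative helper name.

```python
import socket

# Addresses and ports copied from the log messages above.
# The labels are assumptions, not something the logs state directly.
ENDPOINTS = {
    "redis": ("172.19.4.33", 6379),
    "postgres via pgbouncer (assumed)": ("172.19.3.36", 6432),
}


def check_tcp(host, port, timeout=2.0):
    """Return (True, None) if a TCP connection opens, else (False, error text)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False, str(exc)


if __name__ == "__main__":
    for name, (host, port) in ENDPOINTS.items():
        ok, err = check_tcp(host, port)
        status = "reachable" if ok else f"FAILED: {err}"
        print(f"{name} {host}:{port} -> {status}")
```

A successful connection only shows that the port is open, not that the service behind it is healthy, so this would complement the service logs rather than replace them.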
Thank you,