Celery check_services exception Service check errored with exception 'AttributeError("'NoneType' object has no attribute 'total_seconds'")'

I'm getting this on a fresh installation when running check_services:

EXCEPTION (Took 0.00s) celery : Service check errored with exception 'AttributeError("'NoneType' object has no attribute 'total_seconds'")'

I've redeployed and am still getting the same exception. Any ideas are welcome!

If you can get the full stack trace, that would be helpful. You'll have to run the check from a Django shell:

from corehq.apps.hqadmin.service_checks import check_celery
check_celery()
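
If the shell output is hard to copy, one option is to capture the traceback as text and print it. This is a minimal sketch using only the standard library; `run_and_capture` is a hypothetical helper name, not something that ships with commcare-hq.

```python
import traceback


def run_and_capture(check):
    """Call `check` and return (result, traceback_text).

    traceback_text is None when the call succeeds; otherwise it holds the
    full formatted traceback, including any chained exceptions.
    """
    try:
        return check(), None
    except Exception:
        return None, traceback.format_exc()
```

In the Django shell this could be used as `result, tb = run_and_capture(check_celery)` followed by `print(tb)`.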

Hey Simon, thanks for the response. I get this:

In [2]: check_celery()
---------------------------------------------------------------------------
HeartbeatNeverRecorded                    Traceback (most recent call last)
File ~/www/monolith/releases/2022-03-10_14.25/corehq/apps/hqadmin/service_checks.py:167, in check_celery()
    166 try:
--> 167     blockage_duration = heartbeat.get_and_report_blockage_duration()
    168     heartbeat_time_to_start = heartbeat.get_and_report_time_to_start()

File ~/www/monolith/releases/2022-03-10_14.25/corehq/celery_monitoring/heartbeat.py:74, in Heartbeat.get_and_report_blockage_duration(self)
     73 def get_and_report_blockage_duration(self):
---> 74     blockage_duration = self.get_blockage_duration()
     75     metrics_gauge(
     76         'commcare.celery.heartbeat.blockage_duration',
     77         blockage_duration.total_seconds(),
     78         tags={'celery_queue': self.queue},
     79         multiprocess_mode=MPM_MAX
     80     )

File ~/www/monolith/releases/2022-03-10_14.25/corehq/celery_monitoring/heartbeat.py:70, in Heartbeat.get_blockage_duration(self)
     68 # Subtract off the time between heartbeats
     69 # since we don't know how long it's been since the last heartbeat
---> 70 return max(datetime.datetime.utcnow() - self.get_last_seen() - HEARTBEAT_FREQUENCY,
     71            datetime.timedelta(seconds=0))

File ~/www/monolith/releases/2022-03-10_14.25/corehq/celery_monitoring/heartbeat.py:51, in Heartbeat.get_last_seen(self)
     50 if value is None:
---> 51     raise HeartbeatNeverRecorded()
     52 else:

HeartbeatNeverRecorded:

During handling of the above exception, another exception occurred:

NameError                                 Traceback (most recent call last)
Input In [2], in <module>
----> 1 check_celery()

File ~/www/monolith/releases/2022-03-10_14.25/corehq/apps/hqadmin/service_checks.py:170, in check_celery()
    168     heartbeat_time_to_start = heartbeat.get_and_report_time_to_start()
    169 except HeartbeatNeverRecorded:
--> 170     blocked_queues.append((queue, 'as long as we can see', threshold))
    171 else:
    172     # We get a lot of self-resolving celery "downtime" under 5 minutes
    173     # so to make actionable, we never alert on blockage under 5 minutes
    174     # It is still counted as out of SLA for the celery uptime metric in datadog
    175     if blockage_duration > max(threshold, datetime.timedelta(minutes=5)):

NameError: name 'blocked_queues' is not defined
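
For anyone reading along, the second traceback shows the handler itself failing: the `except HeartbeatNeverRecorded` branch appends to `blocked_queues`, but that name was never defined, so a `NameError` is raised while the first exception is being handled. The sketch below reproduces the shape of that logic with the list initialised up front. It is a simplified stand-in under stated assumptions, not the actual commcare-hq code or the fix in the PR; `find_blocked_queues`, `get_last_seen`, and the dictionary arguments are hypothetical.

```python
import datetime


class HeartbeatNeverRecorded(Exception):
    """Stand-in for the exception raised when a queue has no heartbeat yet."""


def get_last_seen(heartbeats, queue):
    # Hypothetical lookup: raise if no heartbeat was ever recorded for the queue.
    value = heartbeats.get(queue)
    if value is None:
        raise HeartbeatNeverRecorded()
    return value


def find_blocked_queues(heartbeats, thresholds, now):
    # Create the accumulator before the loop so that the except branch
    # always has a list to append to.
    blocked_queues = []
    for queue, threshold in thresholds.items():
        try:
            blockage_duration = max(now - get_last_seen(heartbeats, queue),
                                    datetime.timedelta(seconds=0))
        except HeartbeatNeverRecorded:
            blocked_queues.append((queue, 'as long as we can see', threshold))
        else:
            # Mirror the traceback: never flag blockage shorter than 5 minutes.
            if blockage_duration > max(threshold, datetime.timedelta(minutes=5)):
                blocked_queues.append((queue, blockage_duration, threshold))
    return blocked_queues
```

On a fresh installation where no heartbeat has been recorded yet, every queue would take the `except` branch, which is why the missing list surfaces immediately.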

Thanks Ed, I've got a PR out to fix it: "fix celery service check" by snopoke (Pull Request #31256, dimagi/commcare-hq on GitHub).

Awesome, thanks Simon... will keep my eyes peeled.
Ed

I edited the file directly in my latest deployed release directory:
/home/cchq/www/monolith/releases/2022-03-15_09.49/corehq/apps/hqadmin/service_checks.py
...but I'm still getting an error.

I've pushed another commit to that same PR.

Thanks Simon, I'll test and report back.

EDIT: OK, that last update seemed to do the trick. Looks like it's up! Thanks again @Simon_Kelly!