Blobdb Migration

Hi All,

The following error appears after migrating the system to a new server environment. "172.19.3.36" is the IP address of the old PostgreSQL database.

We used the MinIO copy command to migrate the object files to the new cluster (MinIO is our object storage).
Are there any additional tasks required to migrate the files?

EXCEPTION (Took 130.06s) blobdb : Service check errored with exception 'InternalError('PL/Proxy function public.delete_blob_meta(1): [commcarehq_p1] PQconnectPoll: connection to server at "172.19.3.36", port 6432 failed: Connection timed out\n\tIs the server running on that host and accepting TCP/IP connections?\n\n')'

Thank you,

Hi Siraj,

Are you able to access CommCareHQ on the new setup?
Is it a monolith? Or is minio on a separate machine?

The error you shared occurs when CommCareHQ checks whether the blobdb service is working and is unable to connect to PostgreSQL. It appears that somewhere in the configuration, the IP address used to connect to PostgreSQL is still the old one.
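
One way to look for a stale address is to search the configuration files for it. The following is a minimal sketch, assuming the configuration lives in plain-text files under a directory you choose; the function name find_stale_address and the directory you pass in are illustrative, not part of CommCareHQ. It only covers files on disk, so an address stored inside a database would not be found this way.

```python
from pathlib import Path

OLD_IP = "172.19.3.36"  # old PostgreSQL address taken from the error message


def find_stale_address(root, needle=OLD_IP):
    """Return (path, line number, line) for every text line containing needle."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Calling find_stale_address on the directory that holds localsettings.py and the environment's config files would list each file and line that still mentions the old address.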

Hi Manish,

Yes, I am able to access it using the IP address (https://196.189.126.214).

It is deployed on multiple servers, and I have checked that they are replicating to each other (the copy process from the source server is not finished yet).

Pools:
   1st, Erasure sets: 1, Drives per erasure set: 16

2.9 GiB Used, 1 Bucket, 4,500,819 Objects
16 drives online, 0 drives offline

The proxy has been set in the public config file:

s3_blob_db_url: "http://192.168.1.41"

Thank you,

Hi Siraj,

"the copy process from the source server is not finished yet"

Is this the minio copy process?

Are you able to confirm that S3_BLOB_DB_SETTINGS in the localsettings file is using the new URL set in s3_blob_db_url? This should be the case on all the new machines.

Hi Manish,

It is the minio copy process from the old cluster to the new one, and it is still in progress.

S3_BLOB_DB_SETTINGS has been set in the localsettings.py file.


S3_BLOB_DB_SETTINGS = {
    'url': 'http://192.168.1.41',
    'access_key': ' access key here',
    'secret_key': ' secret key here',
    's3_bucket': 'echis',
    'bulk_delete_chunksize': 200,
    'config': {'signature_version': ' '},
}

Hi Siraj,

Good that localsettings has the correct URL for the blob db settings.

Can you confirm that the old environment is not accepting new data while the migration is in progress?

Can you run the following check in a Django shell so that we get a stack trace showing where the old IP address is being accessed from?

The command produces no output, although the modules are imported.
Is there something wrong with the command?

In [2]: def check_blobdb():
   ...:     """Save something to the blobdb and try reading it back."""
   ...:     db = get_blob_db()
   ...:     contents = b"/home/cchq/www/echis/current/test.txt"
   ...:     meta = db.put(
   ...:         BytesIO(contents),
   ...:         domain="fmoh-echis",
   ...:         parent_id="check_blobdb",
   ...:         type_code=CODES.tempfile,
   ...:     )
   ...:     with db.get(meta=meta) as fh:
   ...:         res = fh.read()
   ...:     db.delete(key=meta.key)
   ...:     if res == contents:
   ...:         return ServiceStatus(True, "Successfully saved a file to the blobdb")
   ...:     return ServiceStatus(False, "Failed to save a file to the blobdb")
   ...:

In [3]:

Hi Siraj,

You would need to call the function, "check_blobdb()", so that it is executed.
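
To illustrate the difference, here is a minimal sketch using a stand-in function (check_example is hypothetical and only mimics the shape of check_blobdb). Entering a def block in the shell only defines the function; nothing runs, and nothing is printed, until it is called.

```python
def check_example():
    """Stand-in for check_blobdb(); the body runs only when called."""
    return "check ran"


# Defining the function above produces no output in the shell.
# Calling it is what executes the body:
result = check_example()
print(result)  # prints "check ran"
```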

Hi Manish,

This is the output of the function.

---------------------------------------------------------------------------
InternalError_  Traceback (most recent call last)
File ~/www/echis/releases/2024-03-18_20.04/python_env/lib/python3.9/site-packages/django/db/backends/utils.py:84, in CursorWrapper._execute(self, sql, params, *ignored_wrapper_args)
     83 else:
---> 84     return self.cursor.execute(sql, params)

InternalError_: PL/Proxy function public.delete_blob_meta(1): [commcarehq_p1] PQconnectPoll: connection to server at "172.19.3.36", port 6432 failed: Connection timed out
        Is the server running on that host and accepting TCP/IP connections?

The above exception was the direct cause of the following exception:

InternalError  Traceback (most recent call last)
Cell In[4], line 1
----> 1 check_blobdb()

Cell In[3], line 13, in check_blobdb()
     11 with db.get(meta=meta) as fh:
     12     res = fh.read()
---> 13 db.delete(key=meta.key)
     14 if res == contents:
     15     return ServiceStatus(True, "Successfully saved a file to the blobdb")

File ~/www/echis/releases/2024-03-18_20.04/corehq/blobs/s3db.py:134, in S3BlobDB.delete(self, key)
    132     obj.delete()
    133     success = True
--> 134 self.metadb.delete(key, deleted_bytes)
    135 return success

File ~/www/echis/releases/2024-03-18_20.04/corehq/blobs/metadata.py:82, in MetaDB.delete(self, key, content_length)
     73 """Delete blob metadata
     74 
     75 Metadata for temporary blobs is deleted. Non-temporary metadata
   (...)
     79 :returns: The number of metadata rows deleted.
     80 """
     81 with BlobMeta.get_plproxy_cursor() as cursor:
---> 82     cursor.execute('SELECT 1 FROM delete_blob_meta(%s)', [key])
     83 metrics_counter('commcare.blobs.deleted.count')
     84 metrics_counter('commcare.blobs.deleted.bytes', value=content_length)

File ~/www/echis/releases/2024-03-18_20.04/python_env/lib/python3.9/site-packages/sentry_sdk/integrations/django/__init__.py:582, in install_sql_hook.<locals>.execute(self, sql, params)
    577     return real_execute(self, sql, params)
    579 with record_sql_queries(
    580     hub, self.cursor, sql, params, paramstyle="format", executemany=False
    581 ):
--> 582     return real_execute(self, sql, params)

File ~/www/echis/releases/2024-03-18_20.04/python_env/lib/python3.9/site-packages/django/db/backends/utils.py:66, in CursorWrapper.execute(self, sql, params)
     65 def execute(self, sql, params=None):
---> 66     return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)

File ~/www/echis/releases/2024-03-18_20.04/python_env/lib/python3.9/site-packages/django/db/backends/utils.py:75, in CursorWrapper._execute_with_wrappers(self, sql, params, many, executor)
     73 for wrapper in reversed(self.db.execute_wrappers):
     74     executor = functools.partial(wrapper, executor)
---> 75 return executor(sql, params, many, context)

File ~/www/echis/releases/2024-03-18_20.04/python_env/lib/python3.9/site-packages/django/db/backends/utils.py:84, in CursorWrapper._execute(self, sql, params, *ignored_wrapper_args)
     82     return self.cursor.execute(sql)
     83 else:
---> 84     return self.cursor.execute(sql, params)

File ~/www/echis/releases/2024-03-18_20.04/python_env/lib/python3.9/site-packages/django/db/utils.py:90, in DatabaseErrorWrapper.__exit__(self, exc_type, exc_value, traceback)
     88 if dj_exc_type not in (DataError, IntegrityError):
     89     self.wrapper.errors_occurred = True
---> 90 raise dj_exc_value.with_traceback(traceback) from exc_value

File ~/www/echis/releases/2024-03-18_20.04/python_env/lib/python3.9/site-packages/django/db/backends/utils.py:84, in CursorWrapper._execute(self, sql, params, *ignored_wrapper_args)
     82     return self.cursor.execute(sql)
     83 else:
---> 84     return self.cursor.execute(sql, params)

InternalError: PL/Proxy function public.delete_blob_meta(1): [commcarehq_p1] PQconnectPoll: connection to server at "172.19.3.36", port 6432 failed: Connection timed out
        Is the server running on that host and accepting TCP/IP connections?
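
Reading the traceback: the put and get calls in check_blobdb completed, and the failure came from metadb.delete, which runs the PL/Proxy function delete_blob_meta. That suggests the stale address may be stored wherever the PL/Proxy connection strings are defined rather than in S3_BLOB_DB_SETTINGS, though nothing in this thread confirms that. As a small aid, the sketch below pulls the function name, the bracketed label, the host, and the port out of such an error message. The helper name parse_plproxy_error and the regular expression are illustrative only and are not part of CommCareHQ.

```python
import re

# Error text copied from the traceback above (first line only).
ERROR = (
    'PL/Proxy function public.delete_blob_meta(1): [commcarehq_p1] '
    'PQconnectPoll: connection to server at "172.19.3.36", port 6432 failed: '
    'Connection timed out'
)

# "label" is the bracketed text in the message; it is assumed here to
# identify the database that PL/Proxy was trying to reach.
PATTERN = re.compile(
    r'PL/Proxy function (?P<function>[\w.]+)\(\d+\): '
    r'\[(?P<label>[^\]]+)\] .*?'
    r'connection to server at "(?P<host>[^"]+)", port (?P<port>\d+)'
)


def parse_plproxy_error(message):
    """Return function, label, host and port from the message, or None."""
    match = PATTERN.search(message)
    if match is None:
        return None
    info = match.groupdict()
    info["port"] = int(info["port"])
    return info
```

For the message above, parse_plproxy_error(ERROR) reports host "172.19.3.36" and port 6432, which are the values to look for when searching the configuration.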

Hi Manish,

The old IP address persists somewhere in the database or configuration, and I was unable to find where.
As a workaround, I changed the IP address of the new server to 172.19.3.36, which resolved the issue.

This is the output of the function:

In [5]: check_blobdb()
Out[5]: ServiceStatus(success=True, msg='Successfully saved a file to the blobdb', exception=None, duration=None)

Thank you,