Issues with Changelog 0087 - upgrade to ES 6

erobinson · June 9, 2025, 7:12pm

Hi Amit, on step 12, I had to first check the cluster name with
ls /opt/data/elasticsearch-6.8.23/data/
which identified it as:
monolith-es
I could then run the command with the cluster name:
cchq ${ENV} run-shell-command elasticsearch "mv /opt/data/elasticsearch-6.8.23/data/monolith-es/* /opt/data/elasticsearch-6.8.23/data/" -b
That wasn't clear from the instructions.

Step 16 is where I'm now running into issues: it looks like the ES service is not running.

The ES log looks like this:

[2025-06-09T19:00:43,492][INFO ][o.e.e.NodeEnvironment    ] [es0] using [1] data paths, mounts [[/opt/data (/dev/sda1)]], net usable_space [197.8gb], net total_space [502.8gb], types [ext4]
[2025-06-09T19:00:43,494][INFO ][o.e.e.NodeEnvironment    ] [es0] heap size [2gb], compressed ordinary object pointers [true]
[2025-06-09T19:00:43,551][INFO ][o.e.n.Node               ] [es0] node name [es0], node ID [1k_6ORgNRK6wPf8kc0_4xA]
[2025-06-09T19:00:43,551][INFO ][o.e.n.Node               ] [es0] version[6.8.23], pid[114910], build[default/tar/4f67856/2022-01-06T21:30:50.087716Z], OS[Linux/6.8.0-1028-azure/amd64], JVM[Ubuntu/OpenJDK 64-Bit Server VM/17.0.15/17.0.15+6-Ubuntu-0ubuntu122.04]
[2025-06-09T19:00:43,552][INFO ][o.e.n.Node               ] [es0] JVM arguments [-Xms2048m, -Xmx2048m, -XX:+UseG1GC, -XX:G1ReservePercent=25, -XX:InitiatingHeapOccupancyPercent=30, -Djava.io.tmpdir=/tmp, -XX:-HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=/opt/data/elasticsearch-6.8.23/logs/heapdump.hprof, -XX:ErrorFile=/opt/data/elasticsearch-6.8.23/logs/hs_err_pid%p.log, -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m, -Des.path.home=/opt/elasticsearch-6.8.23, -Des.path.conf=/etc/elasticsearch-6.8.23, -Des.distribution.flavor=default, -Des.distribution.type=tar]
[2025-06-09T19:00:44,567][INFO ][o.e.p.PluginsService     ] [es0] loaded module [aggs-matrix-stats]
[2025-06-09T19:00:44,567][INFO ][o.e.p.PluginsService     ] [es0] loaded module [analysis-common]
[2025-06-09T19:00:44,567][INFO ][o.e.p.PluginsService     ] [es0] loaded module [ingest-common]
[2025-06-09T19:00:44,568][INFO ][o.e.p.PluginsService     ] [es0] loaded module [ingest-geoip]
[2025-06-09T19:00:44,568][INFO ][o.e.p.PluginsService     ] [es0] loaded module [ingest-user-agent]
[2025-06-09T19:00:44,568][INFO ][o.e.p.PluginsService     ] [es0] loaded module [lang-expression]
[2025-06-09T19:00:44,568][INFO ][o.e.p.PluginsService     ] [es0] loaded module [lang-mustache]
[2025-06-09T19:00:44,568][INFO ][o.e.p.PluginsService     ] [es0] loaded module [lang-painless]
[2025-06-09T19:00:44,568][INFO ][o.e.p.PluginsService     ] [es0] loaded module [mapper-extras]
[2025-06-09T19:00:44,569][INFO ][o.e.p.PluginsService     ] [es0] loaded module [parent-join]
[2025-06-09T19:00:44,569][INFO ][o.e.p.PluginsService     ] [es0] loaded module [percolator]
[2025-06-09T19:00:44,569][INFO ][o.e.p.PluginsService     ] [es0] loaded module [rank-eval]
[2025-06-09T19:00:44,569][INFO ][o.e.p.PluginsService     ] [es0] loaded module [reindex]
[2025-06-09T19:00:44,569][INFO ][o.e.p.PluginsService     ] [es0] loaded module [repository-url]
[2025-06-09T19:00:44,569][INFO ][o.e.p.PluginsService     ] [es0] loaded module [transport-netty4]
[2025-06-09T19:00:44,570][INFO ][o.e.p.PluginsService     ] [es0] loaded module [tribe]
[2025-06-09T19:00:44,570][INFO ][o.e.p.PluginsService     ] [es0] loaded module [x-pack-ccr]
[2025-06-09T19:00:44,570][INFO ][o.e.p.PluginsService     ] [es0] loaded module [x-pack-core]
[2025-06-09T19:00:44,570][INFO ][o.e.p.PluginsService     ] [es0] loaded module [x-pack-deprecation]
[2025-06-09T19:00:44,570][INFO ][o.e.p.PluginsService     ] [es0] loaded module [x-pack-graph]
[2025-06-09T19:00:44,570][INFO ][o.e.p.PluginsService     ] [es0] loaded module [x-pack-ilm]
[2025-06-09T19:00:44,571][INFO ][o.e.p.PluginsService     ] [es0] loaded module [x-pack-logstash]
[2025-06-09T19:00:44,571][INFO ][o.e.p.PluginsService     ] [es0] loaded module [x-pack-ml]
[2025-06-09T19:00:44,571][INFO ][o.e.p.PluginsService     ] [es0] loaded module [x-pack-monitoring]
[2025-06-09T19:00:44,571][INFO ][o.e.p.PluginsService     ] [es0] loaded module [x-pack-rollup]
[2025-06-09T19:00:44,571][INFO ][o.e.p.PluginsService     ] [es0] loaded module [x-pack-security]
[2025-06-09T19:00:44,571][INFO ][o.e.p.PluginsService     ] [es0] loaded module [x-pack-sql]
[2025-06-09T19:00:44,571][INFO ][o.e.p.PluginsService     ] [es0] loaded module [x-pack-upgrade]
[2025-06-09T19:00:44,571][INFO ][o.e.p.PluginsService     ] [es0] loaded module [x-pack-watcher]
[2025-06-09T19:00:44,572][INFO ][o.e.p.PluginsService     ] [es0] loaded plugin [analysis-phonetic]
[2025-06-09T19:00:44,703][INFO ][i.n.u.i.PlatformDependent] [es0] Your platform does not provide complete low-level API for accessing direct buffers reliably. Unless explicitly requested, heap buffer will always be preferred to avoid potential system instability.
[2025-06-09T19:00:46,666][INFO ][i.n.u.i.PlatformDependent] [es0] Your platform does not provide complete low-level API for accessing direct buffers reliably. Unless explicitly requested, heap buffer will always be preferred to avoid potential system instability.
[2025-06-09T19:00:46,859][INFO ][o.e.x.m.p.l.CppLogMessageHandler] [es0] [controller/114997] [Main.cc@114] controller (64 bit): Version 6.8.23 (Build 31256deab94add) Copyright (c) 2022 Elasticsearch BV
[2025-06-09T19:00:47,103][DEBUG][o.e.a.ActionModule       ] [es0] Using REST wrapper from plugin org.elasticsearch.xpack.security.Security
[2025-06-09T19:00:47,288][ERROR][o.e.g.GatewayMetaState   ] [es0] failed to read local state, exiting...
java.lang.IllegalStateException: The index [[users-20230524/PeeM1L30RZ-zV2tZzDPyQQ]] was created with version [2.4.6] but the minimum compatible version is [5.0.0]. It should be re-indexed in Elasticsearch 5.x before upgrading to 6.8.23.
        at org.elasticsearch.cluster.metadata.MetaDataIndexUpgradeService.checkSupportedVersion(MetaDataIndexUpgradeService.java:126) ~[elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.cluster.metadata.MetaDataIndexUpgradeService.upgradeIndexMetaData(MetaDataIndexUpgradeService.java:97) ~[elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.gateway.GatewayMetaState.upgradeMetaData(GatewayMetaState.java:246) ~[elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.gateway.GatewayMetaState.<init>(GatewayMetaState.java:89) [elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.node.Node.<init>(Node.java:499) [elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.node.Node.<init>(Node.java:266) [elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:212) [elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:212) [elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:333) [elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159) [elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:150) [elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86) [elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124) [elasticsearch-cli-6.8.23.jar:6.8.23]
        at org.elasticsearch.cli.Command.main(Command.java:90) [elasticsearch-cli-6.8.23.jar:6.8.23]
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:116) [elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:93) [elasticsearch-6.8.23.jar:6.8.23]
[2025-06-09T19:00:47,294][ERROR][o.e.b.Bootstrap          ] [es0] Exception
java.lang.IllegalStateException: The index [[users-20230524/PeeM1L30RZ-zV2tZzDPyQQ]] was created with version [2.4.6] but the minimum compatible version is [5.0.0]. It should be re-indexed in Elasticsearch 5.x before upgrading to 6.8.23.
        at org.elasticsearch.cluster.metadata.MetaDataIndexUpgradeService.checkSupportedVersion(MetaDataIndexUpgradeService.java:126) ~[elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.cluster.metadata.MetaDataIndexUpgradeService.upgradeIndexMetaData(MetaDataIndexUpgradeService.java:97) ~[elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.gateway.GatewayMetaState.upgradeMetaData(GatewayMetaState.java:246) ~[elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.gateway.GatewayMetaState.<init>(GatewayMetaState.java:89) ~[elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.node.Node.<init>(Node.java:499) ~[elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.node.Node.<init>(Node.java:266) ~[elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:212) ~[elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:212) ~[elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:333) [elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159) [elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:150) [elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86) [elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124) [elasticsearch-cli-6.8.23.jar:6.8.23]
        at org.elasticsearch.cli.Command.main(Command.java:90) [elasticsearch-cli-6.8.23.jar:6.8.23]
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:116) [elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:93) [elasticsearch-6.8.23.jar:6.8.23]
[2025-06-09T19:00:47,296][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [es0] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: The index [[users-20230524/PeeM1L30RZ-zV2tZzDPyQQ]] was created with version [2.4.6] but the minimum compatible version is [5.0.0]. It should be re-indexed in Elasticsearch 5.x before upgrading to 6.8.23.
        at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:163) ~[elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:150) ~[elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86) ~[elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124) ~[elasticsearch-cli-6.8.23.jar:6.8.23]
        at org.elasticsearch.cli.Command.main(Command.java:90) ~[elasticsearch-cli-6.8.23.jar:6.8.23]
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:116) ~[elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:93) ~[elasticsearch-6.8.23.jar:6.8.23]
Caused by: java.lang.IllegalStateException: The index [[users-20230524/PeeM1L30RZ-zV2tZzDPyQQ]] was created with version [2.4.6] but the minimum compatible version is [5.0.0]. It should be re-indexed in Elasticsearch 5.x before upgrading to 6.8.23.
        at org.elasticsearch.cluster.metadata.MetaDataIndexUpgradeService.checkSupportedVersion(MetaDataIndexUpgradeService.java:126) ~[elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.cluster.metadata.MetaDataIndexUpgradeService.upgradeIndexMetaData(MetaDataIndexUpgradeService.java:97) ~[elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.gateway.GatewayMetaState.upgradeMetaData(GatewayMetaState.java:246) ~[elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.gateway.GatewayMetaState.<init>(GatewayMetaState.java:89) ~[elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.node.Node.<init>(Node.java:499) ~[elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.node.Node.<init>(Node.java:266) ~[elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:212) ~[elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:212) ~[elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:333) ~[elasticsearch-6.8.23.jar:6.8.23]
        at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159) ~[elasticsearch-6.8.23.jar:6.8.23]
        ... 6 more
[2025-06-09T19:00:47,298][INFO ][o.e.x.m.p.NativeController] [es0] Native controller process has stopped - no new native processes can be started

Not sure why we're getting "The index .... was created with version [2.4.6] but the minimum compatible version is [5.0.0]."

I was under the impression we've been on ES v5 for at least a year.
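
An index keeps the version it was created with even after the nodes themselves are upgraded, so a cluster that has been running 5.x can still hold an index created on 2.x. A minimal sketch of how one might check this ahead of the upgrade, assuming the input has the shape returned by `GET /_all/_settings/index.version.created`. The function names and the sample values are hypothetical and are not part of commcare-cloud:

```python
# Sketch: decode Elasticsearch "index.version.created" ids and flag indices
# created before a minimum major version.

def decode_version_id(version_id):
    """Split an id such as 2040699 into (major, minor, revision)."""
    version_id = int(version_id)
    major = version_id // 1_000_000
    minor = (version_id // 10_000) % 100
    revision = (version_id // 100) % 100
    return major, minor, revision


def indices_created_before(settings, min_major=5):
    """Return {index_name: (major, minor, revision)} for indices older than min_major.

    `settings` is assumed to look like the JSON body of
    GET /_all/_settings/index.version.created
    """
    too_old = {}
    for name, body in settings.items():
        created = body["settings"]["index"]["version"]["created"]
        version = decode_version_id(created)
        if version[0] < min_major:
            too_old[name] = version
    return too_old


# Hypothetical sample response; the "created" values are illustrative only.
sample_settings = {
    "users-20230524": {"settings": {"index": {"version": {"created": "2040699"}}}},
    "users-2024-05-09": {"settings": {"index": {"version": {"created": "5061699"}}}},
}

print(indices_created_before(sample_settings))
```

Any index this reports would need to be reindexed on 5.x (or deleted, if it has already been replaced) before ES 6 will start.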

The /opt/data directory looks like this:

drwxr-xr-x 3 redis         root                4096 Jun  9 19:16 redis
drwxrwxr-x 3 nobody        nfs            148918272 Jun  9 18:52 blobdb
drwxr-xr-x 5 elasticsearch elasticsearch       4096 Jun  9 18:28 elasticsearch-6.8.23
drwxr-xr-x 5 elasticsearch elasticsearch       4096 Jun  9 18:06 elasticsearch-6.8.23-new-installation
drwxr-xr-x 4 root          root                4096 May 17 12:47 home
drwxr-xr-x 5 cchq          cchq                4096 Jun  7  2024 formplayer
drwxr-xr-x 4 elasticsearch elasticsearch       4096 Feb  2  2024 elasticsearch-5.6.16-new-installation
drwxrwxrwx 6 root          root                4096 Feb  2  2024 backups
drwxr-x--- 5 couchdb       couchdb             4096 Aug  9  2023 couchdb2
drwxr-xr-x 3 kafka         kafka               4096 Aug  9  2023 kafka
drwxr-xr-x 4 elasticsearch elasticsearch       4096 Aug  9  2023 elasticsearch-5.6.16-backup
-rw-r--r-- 1 root          root          1073741824 Aug  9  2023 emerg_delete.dummy
drwx------ 3 postgres      postgres            4096 Aug  9  2023 postgresql
-rwxrwx--- 1 root          root                  55 Aug  9  2023 README

Thanks!
Ed

aphulera · June 19, 2025, 12:57pm

Hey @erobinson!

Apologies for the delayed response. I was offline for a while.

on step 12, I had to first check the cluster name with
ls /opt/data/elasticsearch-6.8.23/data/
which identified it as:
monolith-es
I could then run the command with the cluster name:
cchq ${ENV} run-shell-command elasticsearch "mv /opt/data/elasticsearch-6.8.23/data/monolith-es/* /opt/data/elasticsearch-6.8.23/data/" -b
That wasn't clear from the instructions.

Thanks for flagging, I will update the docs to make it clearer.

The index [[users-20230524/PeeM1L30RZ-zV2tZzDPyQQ]] was created with version [2.4.6] but the minimum compatible version is [5.0.0]. It should be re-indexed in Elasticsearch 5.x before upgrading to 6.8.23.

This index should have been deleted by the command mentioned in the "Swap the Indices" part of the changelog:

cchq <env> django-manage elastic_sync_multiplexed delete users

Can you share the output of

curl -X GET "10.2.0.4:9200/_cat/indices?format=json&expand_wildcards=all&pretty"

before you start the actual upgrade? This will give us a list of all the indices present in the cluster at the time you attempt the upgrade.

Thanks.

erobinson · June 20, 2025, 1:09pm

Thanks Amit, I'll do that today and report back.

erobinson · June 20, 2025, 7:25pm

Hi Amit, I think it's working now. If you could just read through this post and let me know your thoughts? Thanks!

One thing that's been consistent from the start is step 2 under number 7:
cchq <env> run-shell-command elasticsearch "grep '<Task Number>.*ReindexResponse' /opt/data/elasticsearch*/logs/*.log"
Always fails:

(cchq) ccc@monolith:~/commcare-cloud$ cchq monolith run-shell-command elasticsearch "grep '35229.*ReindexResponse' /opt/data/elasticsearch*/logs/*.log"
ansible elasticsearch -m shell -i /home/ccc/environments/monolith/inventory.ini -a 'grep '"'"'35229.*ReindexResponse'"'"' /opt/data/elasticsearch*/logs/*.log' -u ansible '--ssh-common-args=-o UserKnownHostsFile=/home/ccc/environments/monolith/known_hosts' --diff
10.2.0.4 | FAILED | rc=1 >>
non-zero return code
✗ Apply failed with status code 2

I've been ignoring that up to now as the reindex appears to complete successfully with the doc counts matching.
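
For context, grep exits with status 1 when no line matches, and the shell module reports any non-zero status as a failure, so `rc=1` here most likely means "nothing found" rather than a broken command. A minimal sketch of the same search that returns an empty list instead of failing. The function name is hypothetical, and the log line format is not assumed beyond containing the task number followed by `ReindexResponse`:

```python
import glob
import re


def find_reindex_responses(log_glob, task_number):
    """Return (path, line) pairs for log lines matching '<task_number>.*ReindexResponse'."""
    pattern = re.compile(rf"{re.escape(str(task_number))}.*ReindexResponse")
    matches = []
    for path in sorted(glob.glob(log_glob)):
        # errors="replace" keeps the scan going if a log contains odd bytes.
        with open(path, encoding="utf-8", errors="replace") as handle:
            for line in handle:
                if pattern.search(line):
                    matches.append((path, line.rstrip("\n")))
    return matches


if __name__ == "__main__":
    # Same glob and task number as the command above; an empty list means no match.
    for path, line in find_reindex_responses("/opt/data/elasticsearch*/logs/*.log", 35229):
        print(f"{path}: {line}")
```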

Sample output from one of the delete commands:

(cchq) ccc@monolith:~/commcare-cloud$ cchq monolith django-manage elastic_sync_multiplexed delete users
ssh ccc@10.2.0.4 -t -o UserKnownHostsFile=/home/ccc/environments/monolith/known_hosts 'sudo -iu cchq bash -c '"'"'cd /home/cchq/www/monolith/current; python_env/bin/python manage.py elastic_sync_multiplexed delete users'"'"''
Ubuntu 22.04.5 LTS
Docs in older index - 916
Docs in newer index - 916

Docs in new index should be greater than or equals to in older index
Are you sure you want to delete the older index - users-20230524?
WARNING: - This step can't be un-done.
Enter 'users' to continue, any other key to cancel
users
Deleting Index - users-20230524
Connection to 10.2.0.4 closed

After deleting the old and residual indices, I ran:
curl -X GET "10.2.0.4:9200/_cat/indices?format=json&pretty"
and the output is:

(cchq) ccc@monolith:~/commcare-cloud$ curl -X GET "10.2.0.4:9200/_cat/indices?format=json&pretty"
[
  {
    "health" : "green",
    "status" : "open",
    "index" : "forms-2024-05-09",
    "uuid" : "CQwgJj87TWydwCRXpd_xkw",
    "pri" : "5",
    "rep" : "0",
    "docs.count" : "554817",
    "docs.deleted" : "11070",
    "store.size" : "2.3gb",
    "pri.store.size" : "2.3gb"
  },
  {
    "health" : "green",
    "status" : "open",
    "index" : "case-search-2024-05-09",
    "uuid" : "lDZ-9kFHTSaN0qqyTE2rOA",
    "pri" : "5",
    "rep" : "0",
    "docs.count" : "12111473",
    "docs.deleted" : "0",
    "store.size" : "1.3gb",
    "pri.store.size" : "1.3gb"
  },
  {
    "health" : "green",
    "status" : "open",
    "index" : "sms-2024-05-09",
    "uuid" : "fcWWIr81SsC8XGBBPg7Iuw",
    "pri" : "5",
    "rep" : "0",
    "docs.count" : "0",
    "docs.deleted" : "0",
    "store.size" : "960b",
    "pri.store.size" : "960b"
  },
  {
    "health" : "green",
    "status" : "open",
    "index" : "users-2024-05-09",
    "uuid" : "aOjMZ82nRjq1jiKMhbdcWw",
    "pri" : "2",
    "rep" : "0",
    "docs.count" : "15531",
    "docs.deleted" : "0",
    "store.size" : "6.1mb",
    "pri.store.size" : "6.1mb"
  },
  {
    "health" : "green",
    "status" : "open",
    "index" : "domains-2024-05-09",
    "uuid" : "SrLY4cp7TuCCga5_Ijfx7w",
    "pri" : "5",
    "rep" : "0",
    "docs.count" : "2",
    "docs.deleted" : "0",
    "store.size" : "50.5kb",
    "pri.store.size" : "50.5kb"
  },
  {
    "health" : "green",
    "status" : "open",
    "index" : ".tasks",
    "uuid" : "8rmPZyl_Ql-FqiGpdVSw_w",
    "pri" : "1",
    "rep" : "0",
    "docs.count" : "9",
    "docs.deleted" : "0",
    "store.size" : "61.1kb",
    "pri.store.size" : "61.1kb"
  },
  {
    "health" : "green",
    "status" : "open",
    "index" : "cases-2024-05-09",
    "uuid" : "0KYqworjR0KSRb4MttdqiA",
    "pri" : "5",
    "rep" : "0",
    "docs.count" : "1105765",
    "docs.deleted" : "0",
    "store.size" : "633.1mb",
    "pri.store.size" : "633.1mb"
  },
  {
    "health" : "green",
    "status" : "open",
    "index" : "apps-2024-05-09",
    "uuid" : "Lid2bXvCS3W0LJBGNhqIiA",
    "pri" : "5",
    "rep" : "0",
    "docs.count" : "82563",
    "docs.deleted" : "0",
    "store.size" : "94mb",
    "pri.store.size" : "94mb"
  },
  {
    "health" : "green",
    "status" : "open",
    "index" : "groups-2024-05-09",
    "uuid" : "-ioryyEmTcivY8KQ17ILCA",
    "pri" : "5",
    "rep" : "0",
    "docs.count" : "0",
    "docs.deleted" : "0",
    "store.size" : "960b",
    "pri.store.size" : "960b"
  }
]

Note that I omitted the expand_wildcards parameter, as it seems it isn't supported for _cat/indices.
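
A quick way to confirm from this kind of listing that no older index is left behind is to group the index names by their name without the date suffix. This is only a sketch, assuming the names end in a date such as `-2024-05-09` or `-20230524`. The helper names are hypothetical and the sample rows are a shortened, partly made-up listing:

```python
import re
from collections import defaultdict

# Matches a trailing date such as "-2024-05-09" or "-20230524".
DATE_SUFFIX = re.compile(r"-\d{4}-?\d{2}-?\d{2}$")


def group_by_base_name(cat_indices):
    """Group index names from _cat/indices JSON rows by name without the date suffix."""
    groups = defaultdict(list)
    for row in cat_indices:
        name = row["index"]
        groups[DATE_SUFFIX.sub("", name)].append(name)
    return dict(groups)


def bases_with_multiple_indices(cat_indices):
    """Return only the base names that still have more than one index."""
    return {
        base: sorted(names)
        for base, names in group_by_base_name(cat_indices).items()
        if len(names) > 1
    }


# Shortened sample; "users-20230524" is included to show what a leftover looks like.
rows = [
    {"index": "forms-2024-05-09"},
    {"index": "users-2024-05-09"},
    {"index": "users-20230524"},
    {"index": ".tasks"},
]
print(bases_with_multiple_indices(rows))
```

An empty result means every index type has exactly one index, which is what the output above shows.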

I notice the various _USERS_INDEX_SWAPPED: True statements remain in public.yml - should they be removed at some point?

On step 15 ES fails to start with the following in the logs:

.
.
.
[2025-06-20T17:33:15,351][INFO ][o.e.p.PluginsService     ] [es0] loaded plugin [analysis-phonetic]
[2025-06-20T17:33:15,486][INFO ][i.n.u.i.PlatformDependent] [es0] Your platform does not provide complete low-level API for accessing direct buffers reliably. Unless explicitly requested, heap buffer will always be preferred to avoid potential system instability.
[2025-06-20T17:33:17,438][INFO ][i.n.u.i.PlatformDependent] [es0] Your platform does not provide complete low-level API for accessing direct buffers reliably. Unless explicitly requested, heap buffer will always be preferred to avoid potential system instability.
[2025-06-20T17:33:17,630][INFO ][o.e.x.m.p.l.CppLogMessageHandler] [es0] [controller/186786] [Main.cc@114] controller (64 bit): Version 6.8.23 (Build 31256deab94add) Copyright (c) 2022 Elasticsearch BV
[2025-06-20T17:33:17,844][DEBUG][o.e.a.ActionModule       ] [es0] Using REST wrapper from plugin org.elasticsearch.xpack.security.Security
[2025-06-20T17:33:17,980][INFO ][o.e.d.DiscoveryModule    ] [es0] using discovery type [zen] and host providers [settings]
[2025-06-20T17:33:18,408][INFO ][o.e.n.Node               ] [es0] initialized
[2025-06-20T17:33:18,409][INFO ][o.e.n.Node               ] [es0] starting ...
[2025-06-20T17:33:18,484][INFO ][o.e.t.TransportService   ] [es0] publish_address {10.2.0.4:9300}, bound_addresses {10.2.0.4:9300}
[2025-06-20T17:33:18,499][ERROR][o.e.b.Bootstrap          ] [es0] node validation exception
Cluster name [monolith-es] subdirectory exists in data paths [/opt/data/elasticsearch-6.8.23/data/monolith-es]. All data under these paths must be moved up one directory to paths [/opt/data/elasticsearch-6.8.23/data]
[2025-06-20T17:33:18,502][INFO ][o.e.n.Node               ] [es0] stopping ...

Step 12 moves the data from the <cluster_name> subdirectory into the parent dir but doesn't delete the <cluster_name> subdirectory. I checked and it's empty:

(cchq) ccc@monolith:~/commcare-cloud$ ls /opt/data/elasticsearch-6.8.23/data/
monolith-es  nodes
(cchq) ccc@monolith:~/commcare-cloud$ ls /opt/data/elasticsearch-6.8.23/data/monolith-es/
(cchq) ccc@monolith:~/commcare-cloud$

So I removed it and restarted ES and it appears to be running now.
Step 16 (check shard allocation) outputs this:

(cchq) ccc@monolith:~/commcare-cloud$ cchq ${ENV} django-manage elastic_sync_multiplexed display_shard_info
ssh ccc@10.2.0.4 -t -o UserKnownHostsFile=/home/ccc/environments/monolith/known_hosts 'sudo -iu cchq bash -c '"'"'cd /home/cchq/www/monolith/current; python_env/bin/python manage.py elastic_sync_multiplexed display_shard_info'"'"''
Ubuntu 22.04.5 LTS
Cluster Status: green
Active Shards Count: 38
Initializing Shards: 0
Unassigned Shards Count: 0
Relocating Shards: 0
Active Shard Percentage: 100%
Connection to 10.2.0.4 closed

Does that look OK?
Step 17 seems to indicate all is fine:

(cchq) ccc@monolith:~/commcare-cloud$ curl -XGET "${ES_HOST}:9200/_cluster/health?pretty"
{
  "cluster_name" : "monolith-es",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 38,
  "active_shards" : 38,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
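
For repeat upgrades, the same check can be scripted. A minimal sketch, assuming the JSON body of `_cluster/health` has already been loaded into a dict. The function name is hypothetical and the choice of fields to check is only one reasonable reading of "healthy":

```python
def health_problems(health):
    """Return a list of human-readable problems found in a _cluster/health response."""
    problems = []
    if health.get("status") != "green":
        problems.append(f"status is {health.get('status')!r}, expected 'green'")
    # Shards that are not yet settled are reported as problems too.
    for key in ("unassigned_shards", "initializing_shards", "relocating_shards"):
        if health.get(key, 0) != 0:
            problems.append(f"{key} is {health[key]}, expected 0")
    return problems


# Values copied from the output above.
health = {
    "status": "green",
    "active_shards": 38,
    "relocating_shards": 0,
    "initializing_shards": 0,
    "unassigned_shards": 0,
}
print(health_problems(health))
```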

Last observation is that the verification step says:
cchq ${ENV} check-services
I imagine that should be:
cchq ${ENV} django-manage check_services

I'm not sure what's different from the last time but it seemed to run through fine this time, though again I had to add elasticsearch_jvm_tmp_dir : '/tmp' to public.yml

Thanks!

aphulera · July 10, 2025, 4:48am

Hey @erobinson !

One thing that's been consistent from the start is step 2 under number 7:
cchq <env> run-shell-command elasticsearch "grep '<Task Number>.*ReindexResponse' /opt/data/elasticsearch*/logs/*.log"

This is interesting. I wonder if something changed in the log formatting in ES 6. Can you go to /opt/data/elasticsearch*/logs/*.log and just look for ReindexResponse? I think you should see something there.

I notice the various _USERS_INDEX_SWAPPED: True statements remain in public.yml - should they be removed at some point?

These variables will be meaningless after we roll out the HQ code change. But you will need to update them when we move to a newer version of ES, since we will follow the same process again.

I think the rest of the output you have shared looks correct, and I saw in another forum post that you have upgraded ES on your machines.

I was on extended leave for the last 1.5 months and just came back this week. Really sorry for the delayed responses.

erobinson · July 10, 2025, 6:00am

Thanks for the response Amit. I made some notes with our latest server update and the only things that I had to take into account that weren't detailed in the instructions were:

I added:
elasticsearch_jvm_tmp_dir: '/tmp'
To the public.yml file to avoid the temp directory error.

also,

Step 15 fails because of the presence of the empty directory
/opt/data/elasticsearch-6.8.23/data/<cluster_name>
I had to manually remove the directory for ES 6 to start up.
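
The move in step 12 and the cleanup described here can be combined. The following is a sketch only, assuming Elasticsearch is stopped and the script runs as a user allowed to write to the data directory. The function name is hypothetical and this is not what the changelog itself runs:

```python
import os
import shutil


def flatten_cluster_dir(data_dir, cluster_name):
    """Move everything in data_dir/cluster_name up into data_dir, then remove the subdirectory."""
    cluster_dir = os.path.join(data_dir, cluster_name)
    if not os.path.isdir(cluster_dir):
        return []  # nothing to do
    moved = []
    # os.listdir also returns dotfiles, which a shell "*" glob would skip.
    for entry in sorted(os.listdir(cluster_dir)):
        target = os.path.join(data_dir, entry)
        if os.path.exists(target):
            raise FileExistsError(f"refusing to overwrite {target}")
        shutil.move(os.path.join(cluster_dir, entry), target)
        moved.append(entry)
    os.rmdir(cluster_dir)  # only succeeds because the directory is now empty
    return moved


if __name__ == "__main__":
    # Path and cluster name as they appear in this thread.
    print(flatten_cluster_dir("/opt/data/elasticsearch-6.8.23/data", "monolith-es"))
```

Calling it again after the data has already been moved just removes the empty subdirectory, which covers the situation in step 15.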

Otherwise we've upgraded one system and have one more to do in the coming weeks.
Thanks!

aphulera · July 14, 2025, 3:02pm

Thanks for the feedback @erobinson. I will consolidate all your feedback into the changelog in the coming week or so.

We will officially remove support for ES 5 on Wednesday this week, and from then on Commcare will only support ES 6+.

Servers that have not yet been upgraded will not be able to deploy newer versions of Commcare after the PR - Stop ES 5 support by AmitPhulera · Pull Request #36468 · dimagi/commcare-hq · GitHub is merged.

Thanks again for your patience and overcoming all the hurdles that came along the way. You rock, Ed

erobinson · July 14, 2025, 7:27pm

Thanks Amit, we have one server still to upgrade and have been requested to hold off until early August for that one so I assume I can deploy to this verified release: Merge pull request #36134 from dimagi/pkv/fcm-analytics-label · dimagi/commcare-hq@ded7c3f · GitHub
...before performing the ES 6 upgrade and then continuing on to a later release.

Should I hold Commcare Cloud back as well until the ES6 upgrade is complete?

aphulera · July 15, 2025, 2:51am

Thanks Amit, we have one server still to upgrade and have been requested to hold off until early August for that one so I assume I can deploy to this verified release: Merge pull request #36134 from dimagi/pkv/fcm-analytics-label · dimagi/commcare-hq@ded7c3f · GitHub
...before performing the ES 6 upgrade and then continuing on to a later release.

That sounds good.

Should I hold Commcare Cloud back as well until the ES6 upgrade is complete?

No. That should not be required. You can update commcare cloud.

erobinson · July 15, 2025, 4:34pm

Thanks for your assistance Amit!

erobinson · August 3, 2025, 9:31pm

An update on this - we have managed to upgrade the server to verified release ded7c3f but while performing the ES upgrade to version 6 as per changelog 0087, all appears fine until the point of swapping the indices. After that, I am unable to produce reports - I receive 500 errors and the following output in the django log:

https://pastebin.com/vLCNn16v

I'm concerned this server is falling behind. Let me know if you need more details to troubleshoot this issue.

check_services shows all services up and green status for ES. The reindex appears to go fine - all indices match. The service restart after swapping the indices also appears fine.

Thanks!

EDIT: something that came to mind during the upgrade. While reindexing and swapping indices, I noticed that the index names contain old dates. The old indices have 2023 dates and the "new" indices are dated 2024. I thought I'd mention that in case it's significant, e.g. perhaps it's working with the wrong indices, not sure.