Adding a CouchDB node to the cluster

Hi,

I was trying to add a CouchDB node to the cluster, following this procedure: http://dimagi.github.io/commcare-cloud/howto/add-couchdb2-node.html

I prepared a myplan.yml file with the following content:

target_allocation:
  - echis_server0,echis_server3,echis_server1:3

But running the following command produces an error:
cchq echis migrate-couchdb myplan.yml plan

Traceback (most recent call last):
File "/home/administrator/.virtualenvs/ansible/bin/cchq", line 11, in <module>
load_entry_point('commcare-cloud', 'console_scripts', 'cchq')()
File "/home/administrator/commcare-cloud/src/commcare_cloud/commcare_cloud.py", line 176, in main
exit_code = call_commcare_cloud()
File "/home/administrator/commcare-cloud/src/commcare_cloud/commcare_cloud.py", line 167, in call_commcare_cloud
exit_code = commands[args.command].run(args, unknown_args)
File "/home/administrator/commcare-cloud/src/commcare_cloud/commands/migrations/couchdb.py", line 90, in run
return plan(migration)
File "/home/administrator/commcare-cloud/src/commcare_cloud/commands/migrations/couchdb.py", line 283, in plan
shard_allocations = generate_shard_plan(migration)
File "/home/administrator/commcare-cloud/src/commcare_cloud/commands/migrations/couchdb.py", line 292, in generate_shard_plan
migration.source_couch_config, migration.plan.target_allocation
File "/home/administrator/.virtualenvs/ansible/lib/python2.7/site-packages/couchdb_cluster_admin/suggest_shard_allocation.py", line 419, in generate_shard_allocation
parse_allocation_line(config, allocation_line) for allocation_line in allocation
File "/home/administrator/.virtualenvs/ansible/lib/python2.7/site-packages/couchdb_cluster_admin/suggest_shard_allocation.py", line 406, in parse_allocation_line
nodes = [config.get_formal_node_name(node) for node in nodes.split(',')]
File "/home/administrator/.virtualenvs/ansible/lib/python2.7/site-packages/couchdb_cluster_admin/utils.py", line 148, in get_formal_node_name
return self._formal_name_lookup[node_nickname]
KeyError: u'echis_server0'

Can you share a link to your inventory file? It looks like echis_server0 isn't present in your inventory.
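For reference, the traceback shows each allocation line being split on commas and every name being looked up in a nickname table, which is where the KeyError comes from. Below is a minimal sketch of that kind of check, not the actual couchdb_cluster_admin code. The function names and the `couchdb@<ip>` values in the lookup table are assumptions made up for illustration.

```python
def parse_allocation_line(line):
    """Split 'node_a,node_b:3' into (['node_a', 'node_b'], 3)."""
    nodes_part, _, copies_part = line.rpartition(":")
    if not nodes_part:
        raise ValueError("expected '<node>[,<node>...]:<copies>', got %r" % line)
    nodes = [name.strip() for name in nodes_part.split(",") if name.strip()]
    return nodes, int(copies_part)


def find_unknown_nodes(line, known_nicknames):
    """Return the names in the allocation line that the lookup table lacks."""
    nodes, _copies = parse_allocation_line(line)
    return [name for name in nodes if name not in known_nicknames]


# Hypothetical lookup table: nickname -> formal node name (values are invented).
known = {
    "echis_server1": "couchdb@172.19.3.40",
    "echis_server3": "couchdb@172.19.4.36",
}
unknown = find_unknown_nodes("echis_server0,echis_server3,echis_server1:3", known)
print(unknown)
```

Any name reported here would raise a KeyError in a plain dictionary lookup like the one in the traceback.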

[webworkers:children]
echis_server0

[postgresql:children]
echis_server1

[pg_standby:children]
echis_server3

[couchdb2:children]
echis_server0
echis_server1
echis_server3
#echis_server6

[couchdb2_proxy:children]
echis_server0

[redis:children]
echis_server1

[zookeeper:children]
echis_server1

[kafka:children]
echis_server1
echis_server0

[rabbitmq:children]
echis_server7

[celery:children]
echis_server7

Where is echis_server0 defined? Is that a host name?

This is the full content of the file inventory.yml:

[echis_server0]
172.19.3.41 hostname="echis0"

[echis_server0:vars]
public_ip=213.55.85.203
kafka_broker_id=1
elasticsearch_node_name=es0

[echis_server1]
172.19.3.40 hostname="echis1"

[echis_server1:vars]
elasticsearch_node_name=es1
kafka_broker_id=0
postgresql_replication_slots=['standby','spare']
hot_standby_server='172.19.4.36'

#[echis_server2:vars]
#elasticsearch_node_name=es2
#kafka_broker_id=2

[echis_server2]
172.19.4.33 hostname="echis2"

[echis_server3:vars]
kafka_broker_id=3
hot_standby_master='172.19.3.40'
replication_slot = 'standby'
#elasticsearch_node_name=es2

[echis_server3]
172.19.4.36 hostname="echis3"

[echis_server4:vars]

[echis_server4]
172.19.4.37 hostname="echis4"

[echis_server5:vars]
elasticsearch_node_name=es4

[echis_server5]
172.19.4.35 hostname="echis5"

[echis_server6]
172.19.4.41 hostname="echis6"

[echis_server7:vars]
elasticsearch_node_name=es3
kafka_broker_id=4

[echis_server7]
172.19.4.42 hostname="echis7"

[echis_server8]
172.19.4.43 hostname="echis8"

[minio:children]
#echis_server3
#echis_server6
#echis_server7
echis_server8

[proxy:children]
echis_server0

[webworkers:children]
echis_server0

[postgresql:children]
echis_server1

[pg_standby:children]
echis_server3

[couchdb2:children]
echis_server0
echis_server1
echis_server3
#echis_server6

# nginx

[couchdb2_proxy:children]
echis_server0

[redis:children]
echis_server1

[zookeeper:children]
echis_server1

[kafka:children]
echis_server1
echis_server0

[rabbitmq:children]
echis_server7

# background tasks

[celery:children]
echis_server7

# change / stream processors

[pillowtop:children]
echis_server7

[formplayer:children]
echis_server0

[elasticsearch:children]
echis_server0
echis_server1
#echis_server3
#echis_server5

# NFS drive

[shared_dir_host:children]
echis_server0

[control:children]
echis_server0

[mailrelay:children]
echis_server0

[django_manage:children]
echis_server0

I did some digging, and this is an edge case where multiple groups have a single host. I've put out a fix for the issue in dimagi/commcare-cloud Pull Request #3546 ("Sk/get alias" by snopoke).
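To illustrate why single-host groups can be ambiguous, here is a rough sketch. It is not the code from the pull request, and the function name is hypothetical. The group data is a trimmed copy of the inventory above with the `:children` groups already resolved to IP addresses, which is an assumption about how the lookup sees them.

```python
def single_host_groups(groups):
    """Map each host to the sorted names of groups whose only member is that host."""
    by_host = {}
    for group_name, hosts in groups.items():
        if len(hosts) == 1:
            by_host.setdefault(hosts[0], []).append(group_name)
    return {host: sorted(names) for host, names in by_host.items()}


# Trimmed sample of the inventory, with child groups resolved to host IPs.
groups = {
    "echis_server0": ["172.19.3.41"],
    "echis_server1": ["172.19.3.40"],
    "webworkers": ["172.19.3.41"],
    "couchdb2_proxy": ["172.19.3.41"],
    "couchdb2": ["172.19.3.41", "172.19.3.40", "172.19.4.36"],
}
candidates = single_host_groups(groups)
print(candidates["172.19.3.41"])
```

With several single-host groups pointing at 172.19.3.41, any logic that picks one of them as the host's alias has to choose among more than one candidate, so the name it settles on may not be echis_server0.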

This fix has been merged. Please update commcare-cloud to test it out.

Yes, it's working now. Thanks for the quick response.

However, after adding a new node to the couchdb2 cluster, the following error appeared in the log:

tail -f /home/cchq/www/echis/log/django.log

return self._fetch(self._arg, self.params)

File "/home/cchq/www/echis/releases/2020-01-11_07.11/python_env-3.6/lib/python3.6/site-packages/couchdbkit/client.py", line 758, in raw_view
return view(**params)
File "/home/cchq/www/echis/releases/2020-01-11_07.11/python_env-3.6/lib/python3.6/site-packages/cloudant/view.py", line 239, in __call__
**kwargs)
File "/home/cchq/www/echis/releases/2020-01-11_07.11/python_env-3.6/lib/python3.6/site-packages/cloudant/_common_util.py", line 258, in get_docs
resp.raise_for_status()
File "/home/cchq/www/echis/releases/2020-01-11_07.11/python_env-3.6/lib/python3.6/site-packages/requests/models.py", line 940, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error unknown_error function_clause for url: http://172.19.3.41:25984/commcarehq__meta/_design/by_domain_doc_type_date/_view/view?startkey=["fmoh-echis"%2C+"ReportConfiguration"]&endkey=["fmoh-echis"%2C+"ReportConfiguration"%2C+{}]&reduce=false&include_docs=true
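One thing that may be worth checking after adding a node is the cluster membership that CouchDB 2 reports at its `/_membership` endpoint, which returns `all_nodes` and `cluster_nodes` lists. The sketch below only compares those two lists from an already-fetched response; it is not a diagnosis of the function_clause error. The function name and the node names in the sample are invented for illustration.

```python
def nodes_missing_from_all_nodes(membership):
    """Return nodes listed in cluster_nodes that are absent from all_nodes."""
    cluster = set(membership.get("cluster_nodes", []))
    connected = set(membership.get("all_nodes", []))
    return sorted(cluster - connected)


# Made-up sample shaped like a /_membership response.
sample = {
    "all_nodes": ["couchdb@172.19.3.40", "couchdb@172.19.3.41"],
    "cluster_nodes": [
        "couchdb@172.19.3.40",
        "couchdb@172.19.3.41",
        "couchdb@172.19.4.41",
    ],
}
missing = nodes_missing_from_all_nodes(sample)
print(missing)
```

A non-empty result would suggest that a node is configured as part of the cluster but is not currently connected to the node that answered the request.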