Cluster setup error

Hello! Today I've prepared six VMs for a test cluster setup. My ~/environments/cluster/inventory.ini is:

[proxy1]
192.169.233.135 hostname=proxy1 ufw_private_interface=ens32

[control1]
192.169.233.134 hostname=control1 ufw_private_interface=ens32

[webworker1]
192.169.233.136 hostname=webworker1 ufw_private_interface=ens32

[webworker2]
192.169.233.137 hostname=webworker1 ufw_private_interface=ens32

[db1]
192.169.233.138 hostname=db1 ufw_private_interface=ens32 elasticsearch_node_name=es0 kafka_broker_id=0

[db2]
192.169.233.139 hostname=db2 ufw_private_interface=ens32 elasticsearch_node_name=es1 kafka_broker_id=1

[control:children]
control1

[proxy:children]
proxy1

[webworkers:children]
webworker1
webworker2

[celery:children]
webworker1
webworker2

[pillowtop:children]
webworker1
webworker2

[django_manage:children]
webworker1

[formplayer:children]
webworker2

[rabbitmq:children]
webworker1

[postgresql:children]
db1
db2

[pg_backup:children]
db1
db2

[pg_standby]

[couchdb2:children]
db1
db2

[couchdb2_proxy:children]
db1

[shared_dir_host:children]
db2

[redis:children]
db1
db2

[zookeeper:children]
db1
db2

[kafka:children]
db1
db2

[elasticsearch:children]
db1
db2

When I run commcare-cloud cluster deploy-stack --first-time -e 'CCHQ_IS_FRESH_INSTALL=1'
I get the following error:

(cchq) lamp@control1:~$ commcare-cloud cluster deploy-stack --first-time -e 'CCHQ_IS_FRESH_INSTALL=1'
Traceback (most recent call last):
File "/home/lamp/.virtualenvs/cchq/bin/commcare-cloud", line 33, in <module>
sys.exit(load_entry_point('commcare-cloud', 'console_scripts', 'commcare-cloud')())
File "/home/lamp/commcare-cloud/src/commcare_cloud/commcare_cloud.py", line 262, in main
exit_code = call_commcare_cloud()
File "/home/lamp/commcare-cloud/src/commcare_cloud/commcare_cloud.py", line 231, in call_commcare_cloud
exit_code = command.run(args, unknown_args)
File "/home/lamp/commcare-cloud/src/commcare_cloud/commands/ansible/ansible_playbook.py", line 204, in run
rc = BootstrapUsers(self.parser).run(deepcopy(args), deepcopy(unknown_args))
File "/home/lamp/commcare-cloud/src/commcare_cloud/commands/ansible/ansible_playbook.py", line 322, in run
return AnsiblePlaybook(self.parser).run(args, unknown_args, always_skip_check=True)
File "/home/lamp/commcare-cloud/src/commcare_cloud/commands/ansible/ansible_playbook.py", line 78, in run
environment.create_generated_yml()
File "/home/lamp/commcare-cloud/src/commcare_cloud/environment/main.py", line 382, in create_generated_yml
generated_variables.update(self.app_processes_config.to_generated_variables())
File "/home/lamp/.virtualenvs/cchq/lib/python3.6/site-packages/memoized.py", line 20, in _memoized
cache[key] = value = fn(*args, **kwargs)
File "/home/lamp/commcare-cloud/src/commcare_cloud/environment/main.py", line 247, in app_processes_config
app_processes_config.check_and_translate_hosts(self)
File "/home/lamp/commcare-cloud/src/commcare_cloud/environment/schemas/app_processes.py", line 67, in check_and_translate_hosts
self.management_commands = check_and_translate_hosts(environment, self.management_commands)
File "/home/lamp/commcare-cloud/src/commcare_cloud/environment/schemas/app_processes.py", line 201, in check_and_translate_hosts
translated[environment.translate_host(host, environment.paths.app_processes_yml)] = config
File "/home/lamp/commcare-cloud/src/commcare_cloud/environment/main.py", line 408, in translate_host
assert group, 'Unknown host referenced in {}: {}'.format(filename_for_error, host)
AssertionError: Unknown host referenced in /home/lamp/environments/cluster/app-processes.yml: monolith

What should I change in my inventory.ini, and possibly in environments/cluster/app-processes.yml, so that the command above runs without this error?
Any help would be appreciated!

Hi there @robynton

Before we get to the main error, I spotted something in the inventory.ini file:

[webworker1]
192.169.233.136 hostname=webworker1 ...

[webworker2]
192.169.233.137 hostname=webworker1 ...

webworker2's hostname is set to "webworker1", which looks like a copy-paste slip. I'm worried that could cause problems, so it should probably be hostname=webworker2.
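
If it helps, here is a minimal Python sketch for spotting this kind of slip before running a deploy. It is not part of commcare-cloud; the function name find_duplicate_hostnames is made up for illustration, and it assumes the simple "address key=value ..." line layout used in the inventory.ini above.

```python
from collections import defaultdict


def find_duplicate_hostnames(inventory_text):
    """Map each hostname= value used by more than one address to those addresses.

    Hypothetical helper, not a commcare-cloud API. Assumes host lines look
    like "<address> key=value key=value ...", as in the inventory above.
    """
    seen = defaultdict(list)
    for line in inventory_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and [group] headers.
        if not line or line.startswith(("#", ";", "[")):
            continue
        address, *pairs = line.split()
        for pair in pairs:
            key, _, value = pair.partition("=")
            if key == "hostname" and address not in seen[value]:
                seen[value].append(address)
    # Keep only hostnames that are claimed by two or more addresses.
    return {name: addrs for name, addrs in seen.items() if len(addrs) > 1}


sample = """
[webworker1]
192.169.233.136 hostname=webworker1 ufw_private_interface=ens32

[webworker2]
192.169.233.137 hostname=webworker1 ufw_private_interface=ens32
"""

print(find_duplicate_hostnames(sample))
# {'webworker1': ['192.169.233.136', '192.169.233.137']}
```

An empty result means every hostname= value in the file is used by only one address.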

To the main error:

AssertionError: Unknown host referenced in /home/lamp/environments/cluster/app-processes.yml: monolith

The environments/cluster/app-processes.yml file is referring to a machine named "monolith". But the environment is named "cluster", not "monolith", and none of the hosts in inventory.ini are named "monolith", so I think there is something configured incorrectly in app-processes.yml.
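
To illustrate the kind of check that is failing, here is a rough Python sketch. It is an assumption-laden approximation, not the actual commcare-cloud logic: known_inventory_names and unknown_hosts are hypothetical names, and the list of referenced hosts is typed in by hand rather than read from app-processes.yml.

```python
def known_inventory_names(inventory_text):
    """Collect group names and host addresses from an INI-style inventory.

    Hypothetical helper, not a commcare-cloud API.
    """
    names = set()
    for line in inventory_text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        if line.startswith("[") and line.endswith("]"):
            # "[webworkers:children]" -> "webworkers"
            names.add(line[1:-1].split(":")[0])
        else:
            # The first token is a host address or a child group name.
            names.add(line.split()[0])
    return names


def unknown_hosts(referenced, inventory_text):
    """Return the referenced names that the inventory does not define, sorted."""
    return sorted(set(referenced) - known_inventory_names(inventory_text))


inventory = """
[webworker1]
192.169.233.136 hostname=webworker1 ufw_private_interface=ens32

[webworker2]
192.169.233.137 hostname=webworker2 ufw_private_interface=ens32

[celery:children]
webworker1
webworker2
"""

# Names assumed to appear as host keys in app-processes.yml.
print(unknown_hosts(["monolith", "webworker1"], inventory))
# ['monolith']
```

Any name this reports would need to be replaced in app-processes.yml with a group or host that inventory.ini actually defines.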

I also suspect that maybe the commcare-cloud documentation has been misleading here. If there is information that is incorrect, or hard to follow, please let me know.

Your app-processes.yml file will configure the resources for Formplayer, and which machines your Celery workers and Pillows will run on. Based on your inventory.ini file, I would guess that whatever is currently set to "monolith" in app-processes.yml should probably be "webworker1" and/or "webworker2". Your choice would depend on what resources your machines have.

If you want multiple machines to handle background tasks, here is an example of an environment configured to do that: commcare-cloud/environments/india/app-processes.yml at cf81944595f425be171c27f069554ac1b94506c3 · dimagi/commcare-cloud · GitHub (That is a fairly large environment, so you most likely won't need to give your queues as many resources as it does.)