Proposed updated services configuration for default monolith environment (app-processes.yml)

Hi,

We've been working around a number of issues on one of the CCHQ monolith instances we have deployed. Besides some things we did wrong, I'd like to share the following.

The monolith environment setup configures only a limited number of services, which provide the basic features you need to build CommCare apps.

I think it would be good to add a few other services that, in my opinion, are also very important for a more robust deployment.

  1. Submission Reprocessing Queues: We've experienced a situation where cases were uploaded while Kafka was down. As a result, ES was out of sync with the other databases of the instance (for example, we could not import cases and link them to a parent). Having the reprocessing services up and running makes the system more resilient to this kind of issue.

  2. Conditional Alerts / SMS: The services that make these features work are not configured in the monolith env. These are super handy features that allow for the implementation of very interesting workflows.

I am sharing below our resulting app-processes.yml file (from a rather old CCHQ version, but it seems to be valid with newer versions too). We had to pay out of pocket for external support to get to this point (we're still teaching ourselves and should probably read more of the architecture docs). We're happy to give back to the community, and we hope this lowers the barrier to better monolith hosting.

formplayer_memory: "1024m"

management_commands:
  monolith:
    run_submission_reprocessing_queue:
    run_pillow_retry_queue:
    queue_schedule_instances:
celery_processes:
  monolith:
    repeat_record_queue,celery,case_import_queue,background_queue,export_download_queue,saved_exports_queue,analytics_queue,ucr_queue,async_restore_queue,email_queue,case_rule_queue,celery_periodic,reminder_case_update_queue,reminder_rule_queue:
      concurrency: 1
    reminder_queue:
      pooling: gevent
      concurrency: 2
    beat: {}
    flower: {}
    submission_reprocessing_queue:
      concurrency: 1
    sms_queue:
      concurrency: 1
pillows:
  monolith:
    case-pillow:
      num_processes: 1
      processor_chunk_size: 1
    xform-pillow:
      num_processes: 1
      processor_chunk_size: 1
    user-pillow:
      num_processes: 1
      processor_chunk_size: 1
    group-pillow:
      num_processes: 1

    AppDbChangeFeedPillow:
      num_processes: 1
    ApplicationToElasticsearchPillow:
      num_processes: 1
    CacheInvalidatePillow:
      num_processes: 1
    DefaultChangeFeedPillow:
      num_processes: 1
    DomainDbKafkaPillow:
      num_processes: 1
    KafkaDomainPillow:
      num_processes: 1
    LedgerToElasticsearchPillow:
      num_processes: 1
    SqlSMSPillow:
      num_processes: 1
    UpdateUserSyncHistoryPillow:
      num_processes: 1
    UserCacheInvalidatePillow:
      num_processes: 1
    UserGroupsDbKafkaPillow:
      num_processes: 1
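
For anyone who only wants the additions, here is a rough sketch of the entries that, as far as I can tell from their names, correspond to the two points above. The grouping is my own reading and has not been checked against the commcare-cloud docs, so please treat it as a starting point and compare it with the full file above rather than copying it as-is.

```yaml
management_commands:
  monolith:
    # Point 1 (assumed): reprocess submissions and retry failed pillow changes
    run_submission_reprocessing_queue:
    run_pillow_retry_queue:
    # Point 2 (assumed): schedules conditional alert instances
    queue_schedule_instances:
celery_processes:
  monolith:
    # Point 1 (assumed)
    submission_reprocessing_queue:
      concurrency: 1
    # Point 2 (assumed): conditional alerts / SMS
    reminder_queue:
      pooling: gevent
      concurrency: 2
    sms_queue:
      concurrency: 1
    # reminder_case_update_queue and reminder_rule_queue also look related to
    # point 2; in the full file they share the comma-separated worker entry.
```

Everything else in the full file (the remaining celery queues and the pillow list) is kept with one process each, since this is a single-machine setup.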