CouchDB failed after adding a new CouchDB node to the cluster

We are getting the following errors on all CouchDB cluster nodes. The errors started after adding a new node using the following guideline: Add a new CouchDB node to an existing cluster

[error] 2020-01-23T18:52:52.031672Z couchdb@172.19.3.40 emulator -------- Error in process <0.3403.6> on node 'couchdb@172.19.3.40' with exit value: {{badmatch,{error,{{invalid_header_size,{db_header,7,925,0,{456584840,{41,5,{size_info,439751321,68569273}},37257},{456589314,46,36425},{441500335,[],4086},nil,nil,439726313,1000,<<32 bytes>>,[{'couchdb@172.19.3.41',838},{'couchdb@172.19.3.40',0}],75,1000}},[{couch_db_header,upgrade_tuple,1,[{file,"src/couch_db_header....
........................
[error] 2020-01-23T18:52:52.031774Z couchdb@172.19.3.40 emulator -------- Error in process <0.3406.6> on node 'couchdb@172.19.3.40' with exit value: {{badmatch,{error,{{invalid_header_size,{db_header,7,666,0,{97708162,{50,5,{size_info,93066351,66965500}},26079},{97711135,55,25235},{84361837,[],4695},nil,nil,79630569,1000,<<32 bytes>>,[{'couchdb@172.19.3.41',498},{'couchdb@172.19.3.40',0}],104,1000}},[{couch_db_header,upgrade_tuple,1,[{file,"src/couch_db_header....

[error] 2020-01-23T18:52:52.031594Z couchdb@172.19.3.40 <0.3382.6> -------- Failed to get security object on {'couchdb@172.19.3.40',<<"shards/c0000000-dfffffff/commcarehq__apps.1527170339">>} :: {{badmatch,{error,{{invalid_header_size,{db_header,7,430,0,{82987384,{41,3,{size_info,80615929,52918516}},16743},{82989919,44,16486},{67113647,[],4003},nil,nil,64442601,1000,<<"e22058be7e91dac8288d772f5d283ff0">>,[{'couchdb@172.19.3.41',329},{'couchdb@172.19.3.40',0}],109,1000}},[{couch_db_header,upgrade_tuple,1,[{file,"src/couch_db_header.erl"},{line,193}]},{lists,foldl,3,[{file,"lists.erl"},{line,1248}]},{couch_db_updater,init_db,5,[{file,"src/couch_db_updater.erl"},{line,573}]},{couch_db_updater,init,1,[{file,"src/couch_db_updater.erl"},{line,67}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,304}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}}},[{couch_db,start_link,3,[{file,"src/couch_db.erl"},{line,136}]},{couch_server,'-open_async/5-fun-0-',5,[{file,"src/couch_server.erl"},{line,310}]}]}

Are all the nodes using the same version of CouchDB?

No, they are running different versions.

New CouchDB: {"couchdb":"Welcome","version":"2.3.1","git_sha":"c298091a4","uuid":"f84bf404831352f2a8d1f511552cd3d9","features":["pluggable-storage-engines","scheduler"],"vendor":{"name":"The Apache Software Foundation"}}

Old CouchDB:
{"couchdb":"Welcome","version":"2.1.1","features":["scheduler"],"vendor":{"name":"The Apache Software Foundation"}}

I would recommend you make all versions consistent.
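
One way to confirm whether the versions are consistent is to compare the `version` field of each node's welcome response. Here is a minimal sketch in Python, assuming you have already collected the response body from each node (for example with `curl` against the node's port). The names `group_nodes_by_version` and `versions_consistent` and the node labels are illustrative, not part of CouchDB or commcare-cloud, and the sample bodies are abbreviated from the ones in this thread.

```python
import json
from collections import defaultdict


def group_nodes_by_version(welcome_bodies):
    """Map each reported CouchDB version to the sorted list of nodes reporting it.

    welcome_bodies: dict of node label -> raw JSON text returned by GET / on that node.
    """
    by_version = defaultdict(list)
    for node, body in welcome_bodies.items():
        doc = json.loads(body)
        # Nodes whose response has no "version" field are grouped under "unknown".
        by_version[doc.get("version", "unknown")].append(node)
    return {version: sorted(nodes) for version, nodes in by_version.items()}


def versions_consistent(welcome_bodies):
    """True when every node reports the same version."""
    return len(group_nodes_by_version(welcome_bodies)) <= 1


if __name__ == "__main__":
    # Abbreviated response bodies; the labels "new" and "old" are placeholders.
    bodies = {
        "new": '{"couchdb":"Welcome","version":"2.3.1","git_sha":"c298091a4"}',
        "old": '{"couchdb":"Welcome","version":"2.1.1","features":["scheduler"]}',
    }
    print(group_nodes_by_version(bodies))
    print("consistent:", versions_consistent(bodies))
```

If more than one version shows up in the result, the nodes listed under the odd version are the ones to bring in line.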

Here are the current CouchDB nodes:
echis0: 2.3.1
echis1: 2.1.1
echis2: 2.3.1

  • Should I uninstall the old version and install it again?
  • What about the databases stored there?
  • Is there any guideline for doing that?

This is the plan file I previously applied:

myplan.yml

target_allocation:
  - echis_server0,echis_server1,echis_server3:3

Provided you don't remove the data from the data folder, I think it should be safe to install the new version on echis1. I would, however, recommend you take a backup of your data first to be safe.

Because the server is running Ubuntu 14.04, re-installing couchdb2 still results in version 2.1.1, which is the old version.
Steps I followed:

  1. Stop the couchdb2 service.
  2. Delete the CouchDB installation folder: /usr/local/couchdb2
  3. Remove couchdb2 from the services: /etc/init.d/CouchDB
  4. Install couchdb2: `cchq echis aps --tags=couchdb2`
  5. Check the version of the newly installed CouchDB:
     curl -XGET 172.19.3.40:15984
     {"couchdb":"Welcome","version":"2.1.1","features":["scheduler"],"vendor":{"name":"The Apache Software Foundation"}}

Now I am able to install it manually (not using the Ansible playbook). I used the following steps:

  • Stop all CouchDB nodes.

  • Uninstall the old version of CouchDB.

  • Uninstall the old version of Erlang.

  • Install the latest Erlang.

  • Install CouchDB 2.3 on echis_server1:
    Source: https://docs.couchdb.org/en/2.2.0/install/unix.html

  • Installation Paths:
    home:/opt/couchdb, config dir:/opt/couchdb/etc, data dir:/opt/couchdb/data

  • Copy the data from the old data dir to the new data dir:
    cp -r /opt/data/couchdb/* /opt/couchdb/data

  • Change some parameters:
    Copy the config from the other nodes: /opt/couchdb/etc/local.d
    Change the port from 5984 to 15984.

  • Start CouchDB on all nodes.

Now all CouchDB nodes are running the same version, and the previous error is gone.
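
A note on the data copy step: a shell glob such as `/opt/data/couchdb/*` does not match dot-prefixed entries at the top level of the source directory, so anything stored under a hidden name there would be left behind. Below is a minimal Python sketch of a copy that includes hidden entries and keeps timestamps and permission bits. It assumes CouchDB is stopped on the node and that Python 3.8 or newer is available. `copy_data_dir` is a hypothetical helper, and file ownership still has to be set separately for whichever user CouchDB runs as.

```python
import shutil
from pathlib import Path


def copy_data_dir(src, dst):
    """Copy everything under src into dst, including dot-prefixed entries.

    Returns the number of regular files present under dst afterwards.
    """
    src_path, dst_path = Path(src), Path(dst)
    if not src_path.is_dir():
        raise FileNotFoundError(f"source data dir not found: {src_path}")
    # copy2 (the default copy function) keeps mtime and permission bits;
    # dirs_exist_ok lets the copy land in a data dir that already exists.
    shutil.copytree(src_path, dst_path, dirs_exist_ok=True)
    return sum(1 for p in dst_path.rglob("*") if p.is_file())
```

Calling `copy_data_dir("/opt/data/couchdb", "/opt/couchdb/data")` would mirror the copy step above. Comparing the returned count with a file count of the source directory is a cheap sanity check before starting the node.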

Checking the logs:
[notice] 2020-01-26T18:54:53.364869Z couchdb@172.19.3.40 <0.6918.80> 834e102a9e 172.19.3.41:25984 172.19.3.41 admin GET /commcarehq__domains/_design/domain/_info 200 ok 7
[notice] 2020-01-26T18:54:53.379573Z couchdb@172.19.3.40 <0.25392.80> 3b0f3dee7c 172.19.3.41:25984 172.19.3.41 admin GET /commcarehq__fixtures 200 ok 2
[notice] 2020-01-26T18:54:53.396332Z couchdb@172.19.3.40 <0.5576.81> 6c88a4ec59 172.19.3.41:25984 172.19.3.41 admin GET /commcarehq__fixtures/_design/all_docs/_info 200 ok 8
[notice] 2020-01-26T18:54:53.414960Z couchdb@172.19.3.40 <0.1291.81> b91763f267 172.19.3.41:25984 172.19.3.41 admin GET /commcarehq__fixtures/_design/domain/_info 200 ok 7
[notice] 2020-01-26T18:54:53.435946Z couchdb@172.19.3.40 <0.11614.80> 0f470cd7a7 172.19.3.41:25984 172.19.3.41 admin GET /commcarehq__fixtures/_design/fixtures/_info 200 ok 9
...................................
....................................
[notice] 2020-01-26T18:56:17.370029Z couchdb@172.19.3.40 <0.5327.81> 9f87a7c227 172.19.3.41:25984 172.19.3.41 admin GET /commcarehq__users/_design/users/_view/by_username?key=%22haragu_lemano.1%40fmoh-echis.commcarehq.org%22&include_docs=true&reduce=false 200 ok 9
[info] 2020-01-26T18:56:17.391068Z couchdb@172.19.3.40 <0.1235.73> -------- Index update finished for db: shards/80000000-9fffffff/commcarehq__users.1527170311 idx: _design/users
...................................................
...................................................
[notice] 2020-01-26T18:57:06.479268Z couchdb@172.19.3.40 <0.22404.81> a6184bf61c 172.19.3.41:25984 172.19.4.42 admin GET /commcarehq/_changes?since=8428-g1AAAAKbeJyd0rEOgjAQBuBGSHRzcHBw0CcgQIHSSRIfRHtXCBLUyVk3d19A30TfRCdfA0tBXdShy9-kl_tyubYkhPRyS5IBbraYS0g85jsed6gTeKUqdgSBcVVVRW4BIdZjpe66NHZTDMT3rr8aTFTC9A3aMw2yyGeQxSZgUoPzz4QHDSINmKDUBFzU4O4z4UiDXHg-itAAXNsqyV4dyjy16FKjgCllERqj5wa9tOhQo1Jwl3PXGL026K1F-xqNMkQQgTF6b9DXTo_NNwozkPzHqxdPXWSpvw&include_docs=true&feed=continuous 200 ok 60005
...............................................
..............................................
[info] 2020-01-26T18:57:17.634948Z couchdb@172.19.3.40 <0.10803.6> -------- OS Process #Port<0.13859> Log :: function raised exception (new TypeError("value is null", "undefined", 25))
[info] 2020-01-26T18:57:17.635198Z couchdb@172.19.3.40 <0.9870.6> -------- OS Process #Port<0.13850> Log :: function raised exception (new TypeError("value is null", "undefined", 25))
[info] 2020-01-26T18:57:17.635232Z couchdb@172.19.3.40 <0.9866.6> -------- OS Process #Port<0.13848> Log :: function raised exception (new TypeError("value is null", "undefined", 25))
[info] 2020-01-26T18:57:17.635491Z couchdb@172.19.3.40 <0.9862.6> -------- OS Process #Port<0.13846> Log :: function raised exception (new TypeError("value is null", "undefined", 25))
[info] 2020-01-26T18:57:17.635601Z couchdb@172.19.3.40 <0.9864.6> -------- OS Process #Port<0.13847> Log :: function raised exception (new TypeError("value is null", "undefined", 25))
[info] 2020-01-26T18:57:17.635682Z couchdb@172.19.3.40 <0.9868.6> -------- OS Process #Port<0.13849> Log :: function raised exception (new TypeError("value is null", "undefined", 25))
[info] 2020-01-26T18:57:17.635684Z couchdb@172.19.3.40 <0.9538.6> -------- OS Process #Port<0.13841> Log :: function raised exception (new TypeError("value is null", "undefined", 25))
[info] 2020-01-26T18:57:17.635906Z couchdb@172.19.3.40 <0.10805.6> -------- OS Process #Port<0.13860> Log :: function raised exception (new TypeError("value is null", "undefined", 25))
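
To get an overview of a log like the one above, it can help to tally lines by level and see which messages repeat. Here is a minimal sketch in Python, assuming the line layout seen in this thread (`[level] timestamp node pid ...`). The regular expression and the `summarize_log` name are illustrative and not part of CouchDB.

```python
import re
from collections import Counter

# Matches the prefix seen in this thread: "[level] timestamp node pid rest".
LINE_RE = re.compile(
    r"^\[(?P<level>\w+)\]\s+(?P<ts>\S+)\s+(?P<node>\S+)\s+(?P<pid>\S+)\s+(?P<rest>.*)$"
)


def summarize_log(lines):
    """Return (count per log level, count per 'OS Process ... Log ::' message)."""
    levels = Counter()
    os_process_messages = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip blank lines and elision markers such as "....."
        levels[match["level"]] += 1
        rest = match["rest"]
        if "Log :: " in rest:
            # Keep only the message text so different ports tally together.
            os_process_messages[rest.split("Log :: ", 1)[1]] += 1
    return levels, os_process_messages


if __name__ == "__main__":
    sample = [
        '[info] 2020-01-26T18:57:17.634948Z couchdb@172.19.3.40 <0.10803.6> -------- '
        'OS Process #Port<0.13859> Log :: function raised exception '
        '(new TypeError("value is null", "undefined", 25))',
        "...............",
    ]
    levels, messages = summarize_log(sample)
    print(dict(levels))
    print(dict(messages))
```

Feeding the whole log file through `summarize_log` shows at a glance whether any `[error]` lines remain and how often each exception message is logged.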