Export problem with repeat groups - missing fields


(Peter MacPherson) #1

Hi,

I am using the data export tool.

Workflow is:

  1. Run Commcare-export tool in terminal on my Mac (Mac OS 10.14.2), calling a local excel file to update a local SQLite database using form data.
  2. Import tables (about 10 in total) from SQLite database into R for analysis.

However, for two forms, the imported variables appear incomplete and corrupted.

Both of the forms with problems have CommCare repeat patterns.

Some variable in the excel sheet don’t appear in the download, and some additional variable names (particularly in the repeat group questions) are incorrect.

Very grateful for suggestions as to how to resolve, as currently can’t access important data.


(Clayton Sims) #2

Hi Peter,

Can you be a bit more concrete about how the variables are incorrect or corrupted, and what the variables look like from the SQLite database?

It would help to know if the issue is in the structure of the data or in the transfer to the local db.

-Clayton


(Peter MacPherson) #3

Hi Clayton,

Thanks for your response.

So, a bit more detail (simplified for ease of understanding below):

  • I have 13 forms in total, call them Form A: Form M for simplicity. Form A is a case registration form, and Forms B to M are case list forms.
  • Only forms G and H have repeat patterns in them.
  • In Form H, I have questions such as: h0001_age, h0002_sex etc.
  • Form H additionally has has a question h0010_clinic_visit (“How many times have you visited the clinic”?)
  • There is then a repeat loop, with questions h0011_clinic_name to h0019_dateofvisit, that ask details of each clinic visit.
  • There is a similar pattern in Form G, but none of the other forms have repeat patterns.

In the CommCare export Excel spreadsheet, I have sheets for each form, Form A to Form M:

  • In the sheet for Form H, I have columns for Source Field and Field for questions h0001_age, h0002_sex, etc
  • I additionally have entries for the repeat loop questions h0011_clinic_name to h0019_dateofvisit.
  • There is a similar sheet for Form G.

I run the CommCare export tool to update the SQLite database using the following code:

commcare-export --output-format sql --output sqlite:///prospect.db --project prospect-1 --query prospect_extract.xlsx --username USERNAME --password PASSWORD

This updating has worked very smoothly until I added sheets for Forms G and H with repeat loops to the export spreadsheet.

Now, when the database is updated, I see the following (when using DB browser, and same when imported into R):

  • Some fields on Forms G and H are missing, despite being included in the Excel spreadsheet (e.g. h0008_date_of_diagnosis, and h0009_severity are both missing.
  • The variable names for the repeat questions have the format: h0010_clinic_visit.h0011_clinic_name, but don’t seem to have any data for instances when there was more than one repeat loop completed.

I have tried making separate Excel sheets for the only the repeat loop questions in Forms G and H, but:

  • Fields still remain missing in the main Form G and H tables
  • I can’t see a way to link repeat loop fields by case_id

Very grateful for any suggestions

Many thanks,

Peter