
Info
title: This page is a work in progress.

This page is a work in progress. This info box will be removed once complete.

To migrate data from CouchDB

  1. Download the couch2pg tool. You can use the command below:
    git clone https://github.com/medic/medic-couch2pg.git
  2. Create a PostgreSQL database that will act as the staging database for the migration to the OpenSRP Postgres model
  3. Change directory to medic-couch2pg
  4. Execute the following in the terminal to start the couch2pg tool:

        export POSTGRESQL_URL=postgresql://<postgres_username>:<postgres_password>@localhost:5432/<postgres_database>

        export COUCHDB_URL=http://<couch_username>:<couch_password>@localhost:5984/<couch_database>

        export COUCH2PG_SLEEP_MINS=120

        export COUCH2PG_DOC_LIMIT=1000

        export COUCH2PG_RETRY_COUNT=5

        node index.js

  5. Wait until all the data has been uploaded to the Postgres staging database. You should see "Run complete" in the terminal after the data import into Postgres is completed. An example of the output when the tool finishes migrating data is illustrated below
  6. Copy the couchdb table from the staging database above to the OpenSRP database. Use the command:
       pg_dump -t couchdb <postgres_staging_database> | psql <opensrp_database>
  7. Execute the scripts in the data_migration_scripts folder (located at the path below) while connected to the OpenSRP Postgres database opensrp_database; see the example after this list.

       opensrp/assets/migrations/data_migration_scripts
  8. Delete the couchdb table in the OpenSRP database. You can use the command below:
       DROP TABLE couchdb;
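
A minimal sketch of step 7, assuming the migration scripts are plain SQL files and that the OpenSRP database is named opensrp_database; adjust the connection details and check whether the scripts must be run in a particular order before using something like this:

Code Block
languagebash
titleRunning the data migration scripts (sketch)
# Assumes the scripts are .sql files and the OpenSRP database is opensrp_database;
# the database name, connection details and ordering are assumptions - adjust for your environment.
cd opensrp/assets/migrations/data_migration_scripts
for script in *.sql; do
  psql -d opensrp_database -f "$script"
done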

Migrating Very Large Databases

The data_migration_scripts use INSERT ... SELECT statements to migrate data from the couchdb staging table to the relevant table(s) in the OpenSRP database. Each script selects all documents of a particular domain type, e.g. Clients, and then inserts them into the corresponding OpenSRP database table. A sketch of the pattern is shown below.
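
For illustration, a minimal sketch of the INSERT ... SELECT pattern, using the Action document type and the core.action table from the cursor example further down; the actual scripts may populate additional columns and tables:

Code Block
languagesql
titleINSERT ... SELECT migration pattern (sketch)
-- Copy every CouchDB document of type 'Action' from the staging table
-- into the OpenSRP actions table in a single statement.
INSERT INTO core.action (json)
SELECT doc
FROM couchdb
WHERE doc->>'type' = 'Action';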

This may not work in very large databases with tens of millions of records for each document type because of the memory overhead. For this scenario you may have to use the PostgreSQL database cursor approach.
The actions script has an example of using cursors and can be extended to support all domain objects if required.
However, INSERT ... SELECT was tested with 5.6 million action records and completed in 12 minutes when importing data from the couchdb staging table into the OpenSRP database tables action and action_metadata.

Below is an example of migrating actions using cursors; it can be extended to support all domain objects if required by adapting the data_migration_scripts to use cursors.

Code Block
languagesql
titleMigrating Actions Metadata using cursor
DO $$
DECLARE
  -- Cursor over all Action documents in the CouchDB staging table.
  couchdb_cursor CURSOR FOR SELECT * FROM couchdb WHERE doc->>'type' = 'Action';
  -- Cursor over the actions inserted below, used to populate the metadata table.
  actions_cursor CURSOR FOR SELECT * FROM core.action;
  t_action RECORD;
BEGIN

  -- First pass: copy each Action document into core.action, one row at a time.
  OPEN couchdb_cursor;
  LOOP
    FETCH couchdb_cursor INTO t_action;
    EXIT WHEN NOT FOUND;
    INSERT INTO core.action(json) VALUES (t_action.doc);
  END LOOP;
  CLOSE couchdb_cursor;

  -- Second pass: extract the metadata fields from each inserted action
  -- and populate core.action_metadata.
  OPEN actions_cursor;
  LOOP
    FETCH actions_cursor INTO t_action;
    EXIT WHEN NOT FOUND;

    INSERT INTO core.action_metadata(action_id, document_id, base_entity_id, server_version, provider_id, location_id, team, team_id)
      VALUES (t_action.id, t_action.json->>'_id', t_action.json->>'baseEntityId', (t_action.json->>'timeStamp')::BIGINT,
              t_action.json->>'providerId', t_action.json->>'locationId', t_action.json->>'team', t_action.json->>'teamId');
  END LOOP;
  CLOSE actions_cursor;
END$$;
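
Extending the same pattern to another domain object might look like the hypothetical sketch below; the core.client table name and the 'Client' document type are assumptions, so verify them against your OpenSRP schema and the corresponding data_migration_scripts before running anything like this.

Code Block
languagesql
titleExtending the cursor approach to Clients (hypothetical sketch)
-- Hypothetical: assumes Client documents have doc->>'type' = 'Client' and that the
-- OpenSRP schema has a core.client table with a json column.
DO $$
DECLARE
  client_cursor CURSOR FOR SELECT * FROM couchdb WHERE doc->>'type' = 'Client';
  t_client RECORD;
BEGIN
  OPEN client_cursor;
  LOOP
    FETCH client_cursor INTO t_client;
    EXIT WHEN NOT FOUND;
    INSERT INTO core.client(json) VALUES (t_client.doc);
  END LOOP;
  CLOSE client_cursor;
END$$;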

...