Data Storage by OpenSRP

The data storage is designed to support the smart and dynamic nature of registers provided by OpenSRP as well as to mitigate the risk of losing form (register) data due to unreliable nature of network connectivity.

Dynamic registers are supported using the NoSQL document database CouchDB. Whereas the analysis, tracking and reporting is supported through the traditional RDBMS MySQL / PostGres.

The unreliable network connectivity is mitigated by storing data on handheld device using SQLite.

Relational and No-SQL Database Management Systems (DBMS) are used in different types of domains, and the features provided by each cannot be ignored; therefore, you find different DBMS used in OpenSRP for different types of functionality and project needs.

The data storage systems used by OpenSRP are as follows (this documentation explains the first two categories in detail):

  • SQLite (Used by mobile client only)

  • CouchDB

  • MySQL (App portable to Postgres)

SQLite

The sqlite database will store the form submissions as well as the data in relative databases tables according to the bind types defined .The sqlite database is used in the mobile client for showing the registers. The data shown in the register activities are spooled from the sqlite tables of the respective bind types. By default the tables present in the opensrp client sqlite database are:

  • android_metadata (locale is stored here)

  • settings (user information is stored here)

  • sqlite_sequence

  • alerts (used for scheduling of forms)

  • timeline event (used to record milestones sent from server sent by the same web service as alerts)

  • report (used to store report sent from server)

  • form_submissions (table where the raw form submissions are saved)

  • service_provided (table to store data of services provided , usually vaccinations to clients)

  • the rest additional tables will vary on implementations of opensrp and the bind types used. These tables are populated by ziggy which reads form submissions table and then processes the form submission according to their bind types and puts the data into the corresponding tables

The database is usually encrypted with the user login password via sql cipher.

CouchDB

This is the main repository where all individual and identifiable data with each and every detail is stored. There are different databases in CouchDB repository, named as:

opensrp:

  • This is the main database where all entities are registered as unique, identifiable records, showing the role of the entity in the system. All entities can link to any other entity by entityId/baseEntityId.

    • Let's consider a sample Family Register where each member in the family is registered as entity FamilyMember. The Family Register also has a member  (pregnant woman) eligible to be part of ANC register i.e. pre-registered woman in the family would be enrolled as Mother in ANC care. Once ANC care period is over Mother would be removed from ANC register after delivery. The delivery can result in one or more new FamilyMember(s) and newly added entities would also be registered into Child Register as entity Child for tracking the vaccination course. The Mother could also die during delivery with a live birth.

    • In real life different Health Workers are dedicated for different types of register(s) depending on the domain they are working in. For example CHW might be managing Family Register, Vaccinator would be keeping track of Child Register, and maternity care centers might be enrolling pregnant women in ANC register. To link all of the entities in different registers we enroll the beneficiary in core system as Client with a unique baseEntityId with basic information (name, DoB, DoD, Gender, Unique IDs) and then each of the register manages its own implementation and domain specific information.

    • For the case above, before enrolling a woman as FamilyMember first a Client would be registered into the core system with a unique baseEntityId and then a record would be saved into Family Register as FamilyMember. The record would have link to baseEntityId in core Client.

    • When woman becomes eligible for ANC care she would be enrolled into ANC register as entity Mother with link to baseEntityId of her Client record.

    • On delivery if she gives birth to a live baby an entry would be made in Family Register as FamilyMember and another entry would be made in Child Register as entity Child but preceded by an entry in core system as Client so that FamilyMember and Child could be linked back to any other register added ever.

    • Now assume that the woman dies during delivery so another Health Worker enrolls her in Death register as entity Deceased. As per protocol the Deceased would be linked with Client via baseEntityId.

    • If we ever need combined reports from different registers we can easily pull data because we have a unique single Client baseEntityId for that woman and every other custom register entity (Mother, FamilyMember, Deceased) has a link to it.

  • Entity relationships can be one-to-one or one-to-many for example:

    • one-to-many: One woman (Mother) may have multiple pregnancies over her lifetime which would lead to registration of one or more children (Child) on filling of a delivery outcome form if outcome is livebirth(s). This way a Mother is linked to her Child(ren)

    • one-to-one: One woman (Client → Mother) dies during delivery and is registered via Death Register as Deceased. The Mother is a client of system and has one-to-one relationship with Client. Now Deceased is a new entity into system of Death Register and is linked to Client via one-to-one relationship via baseEntityId. Whenever one wants data for women died during delivery from both registers he would link Clients having one-to-one relationship in registers for Mother and Deceased via baseEntityId.

  • The repository also contains all Actions (maintained and triggered via motech schedule tracking) to be taken on a particular schedule. These actions are used by the mobile client to show alerts to the health worker or end user for any specific schedule.

  • The repository unnecessarily contains a few configuration variables like InfantBalanceOnHandReportToken and FormExportToken and will be removed in the next version to the motech scheduler handler.

opensrp-form

  • Every form filled out by the end user on the client application is directly submitted as it is into the form repository without any normalization or concern with the role of the entity into system.

  • The submitted forms are then periodically processed by ziggy to be broken down into respective entities forming the registers. At this point the form data is processed and normalized according to the view or data that the register needs to show. The processed entities are then stored into the drishti database (mentioned above).

motech-scheduletracking-api

  • This is the repository internally used by motech for managing health schedules for generating alerts (OpenSRP maintains alerts as entity Action)

  • It also contains JSON schedules and Entities enrolled into any schedule.

drishti-mcts (Mother and Child Tracking System)

The database contains all reports generated by the MCTS report generator component. The MCTS report generator generates MCTS reports and sends SMS reports to a configured number for audit and tracking purposes for maternal and child health.  This module was created for a specific reporting requirement in India, and it is not currently being used in OpenSRP and will likely be removed in future.

Postgres/MySQL

Postgres and MySQL are relational DBMS and are mainly used for storing OpenMRS data and aggregate data for generating summary and analytics reports. The DBMS is also used by motech quartz scheduler. Details of each database is explained below:

openmrs

motechquartz

  • This database is used internally by the motech quartz scheduler for managing  Triggers and Jobs.

  • All details about the status of a scheduled quartz Job and its Trigger is maintained in this database.

  • Note that this database does not maintain any individual records for entity schedules and neither does it keep track of old quartz Job or Triggers after completion.

opensrp (drishti in older versions):
  • If the DBMS you choose is Postgres, you would find a container/umbrella database for the schemas below. MySQL does not treat database and schema differently, hence schemas below would be independent databases in MySQL.

  • This is where data for aggregate and summary data for health worker reports is maintained.

report

  • This is where the original drishti app maintained its end users’ (ANMs) information and their location mappings. This is no longer being used for this purpose.

  • The schema also kept track of ANMs as service providers and services provided by each ANM for aggregated reports.

  • Annual targets for ANM and indicators were also managed in this database.

  • The final output from this database was generated in the form of a DB View that lists the ANMs and the services provided by each with location and date.

  • This data is used by aggregator provider bamboo to get summary reports.

  • This data is also used by drishti-site for allowing users to download aggregated reports.

  • Although this is still a part of the latest OpenSRP code, this module was abandoned by OpenSRP core dev team and replaced by the OpenMRS reporting, team management, and location modules.

anm_report

  • The purpose of this schema is to generate and keep data for ANM summary for mobile client. The table anm_report_data holds all data for this summary by indicator, date and ANM.

  • It also has a column external_id that holds the entity_id for beneficiary so that record could be tracked down to CouchDB anytime. This also helps in tracking actual person in summary reports on client.