The GitHub repository with the data, processing and mapping scripts is available at deedy/india-nfhs4.

The National Family Health Survey (NFHS) is the biggest national-scale health data collection effort, funded by the Bill and Melinda Gates Foundation, US Agency for International Development, the Department for International Development and more.

Over the last 24 years, India has held 4 National Family Health Surveys. The most recent one, conducted in 2015-16, had been awaited for almost 10 years. NFHS-4 aims to collect data from a representative sample at the district level across 29 states and 6 union territories, a total of nearly 650 districts. For each of these districts, we get measures of 93 key indicators covering a variety of issues, including pregnancy, literacy, nutrition, and more. It also includes the number of households, men and women surveyed, as well as breakdowns of those 93 indicators by Rural, Urban and Total. The details of these indicators are documented at the bottom of this article. As of May 2017, this data has only recently been fully released on the NFHS website.

While NFHS promises to release its data through the Demographic and Health Surveys (DHS) Program, as of May 2017 the only data available through DHS appears to be the same state-level fact sheets that can be obtained from the NFHS site. All of this rich data is locked up in PDFs.

My goal with this project was to:

  • Structure: Parse these PDFs and structure the data in a spreadsheet that you can play with and perform bulk computations on.
  • Open: Release said structured data into the public domain.
  • Visualize: Geo-spatially visualize this data on a district-level map.

Structuring and Opening the Data

The compiled data can be viewed and downloaded from this Google Sheet. The data covers 637 districts and has 283 datapoints for each district, for a total of roughly 180k values. You can download a CSV here.

Visualization

We only really begin to extract meaningful value from this data when we plot it on a map. I obtained district-level geographic data for India from GADM, the database of Global Administrative Areas. There were two primary issues with the data, however, and resolving them is another of my main contributions:

  • Naming standards: There isn’t a common naming standard in English for all 637 districts. In fact, about 50 of the district names I had couldn’t be directly mapped onto the GADM data. I did the manual matching for these districts.
  • Quickly changing number of districts: There were 637 districts in the NFHS-4 data, but 667 in the GADM data, although both were supposedly snapshots from around 2016. In fact, the district boundaries used by NFHS-4 were fixed around 3-4 years earlier, and many more districts have been created since. There were about 8 districts in Delhi from the NFHS-4 data that mapped into one GADM district, and ~30 districts that had split into 2 or 3 parts each in the last 3 years.
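The name reconciliation described above can be sketched in Python. This is a minimal illustration and not the script in the repository: the function names, the similarity cutoff and the sample district names are all assumptions made for the example. It tries a fuzzy match first and falls back to a hand-written override table, which is where the manually matched districts would go.

```python
import difflib


def normalize(name):
    """Lowercase and keep only letters so spelling variants compare more closely."""
    return "".join(ch for ch in name.lower() if ch.isalpha())


def match_districts(nfhs_names, gadm_names, manual_overrides=None, cutoff=0.85):
    """Map each NFHS name to a GADM name.

    manual_overrides (hypothetical) holds hand-checked pairs and always wins.
    Names with no sufficiently close match are returned separately for review.
    """
    manual_overrides = manual_overrides or {}
    gadm_by_key = {normalize(g): g for g in gadm_names}
    matched, unmatched = {}, []
    for name in nfhs_names:
        if name in manual_overrides:
            matched[name] = manual_overrides[name]
            continue
        close = difflib.get_close_matches(
            normalize(name), list(gadm_by_key), n=1, cutoff=cutoff
        )
        if close:
            matched[name] = gadm_by_key[close[0]]
        else:
            unmatched.append(name)
    return matched, unmatched


# Illustrative names only, not taken from the actual files.
matched, unmatched = match_districts(
    ["Ahmadabad", "Y.S.R.", "Pune"],
    ["Ahmedabad", "Cuddapah", "Pune"],
    manual_overrides={"Y.S.R.": "Cuddapah"},
)
print(matched)
print(unmatched)
```

Districts that were merged or split need more than a one-to-one lookup. One option is to let a single NFHS-4 record point at several GADM polygons, so each part of a split district is coloured with its parent's value.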

After the painstaking work of correctly pointing the right data to the right part of the map, I had a complete map of India.

A map of all the districts of India.

Population and Household Profile

Left: Sex Ratio distribution.
Right: Sex Ratio of births in the last 5 years.
The sex ratio trends do not seem comforting. While overall the country seems to be geographically divided on sex ratio, births in the last 5 years seem to be uniformly male-leaning.
Left: Women with more than 10 years of schooling
Right: Women between the ages 20-24 who were married before 18
There's an obvious inverse correlation between early marriages and duration of a woman's education.
Left: Women aged 6 and above who ever attended school
Right: Population under the age of 15
There seems to be an inverse correlation between women's school attendance and high birth rates.
Left: Households with Electricity
Right: Households with Health Insurance
While most of the country besides Tamil Nadu, Telangana, Andhra and Chhattisgarh lacks health insurance, UP, Bihar and Orissa are doing the worst in terms of electricity.
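The inverse correlations described in this section are read off the maps by eye, but they could also be checked numerically from the district-level values. Below is a minimal sketch of a Pearson correlation coefficient in plain Python. The function name `pearson` and the sample percentages are made up for illustration and are not values from the NFHS-4 data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / (spread_x * spread_y)


# Hypothetical per-district percentages, for illustration only.
schooling_10_plus_years = [10.0, 25.0, 40.0, 60.0]
married_before_18 = [45.0, 30.0, 20.0, 8.0]
print(pearson(schooling_10_plus_years, married_before_18))
```

A value close to -1 would support the inverse relationship, while a value near 0 would suggest the pattern on the maps is weaker than it looks.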

Literacy

Left: Male Literacy Rate
Right: Female Literacy Rate
The gender gap in literacy is still quite high. Kerala continues to be the stronghold of literacy in India, but Maharashtra, parts of the East and the North are catching up.

Current Use of Family Planning Methods

Left: Use of IUD as contraception
Middle: Use of Pill as contraception
Right: Use of Condom as contraception
There are fascinating patterns of contraception usage in the country, with the east predominantly preferring pills and the north preferring condoms.

Nutritional Status of Adults

Left: Percentage of overweight or obese men (BMI >= 25). Right: Percentage of overweight or obese women (BMI >= 25).
Obesity in both genders seems fairly correlated. The Andhra coast looks particularly bad, although pockets of obesity are prevalent in Sikkim, Punjab and the major metropolitan centers.
Left: Percentage of Underweight men (BMI < 18.5). Right: Percentage of Underweight women (BMI < 18.5).
The malnutrition of both genders is also fairly correlated. It's largely the geographic inverse of the obesity graph.
Left: Percentage of Normal weight men (18.5 <= BMI < 25). Right: Percentage of Normal weight women (18.5 <= BMI < 25).
Aside from pockets in the Northeast, in most of the country, one in two people are not in the normal weight range.
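The indicator list only includes the below-normal and overweight-or-obese shares, so a normal-weight share has to be derived from them. One plausible way, shown here as an assumption rather than as the method actually used for the maps, is to take the remainder after subtracting both from 100%. The function name is hypothetical.

```python
def normal_weight_share(below_normal_pct, overweight_or_obese_pct):
    """Share of adults with 18.5 <= BMI < 25, derived as the remainder.

    Assumes the two inputs are percentages of the same surveyed group
    and that the three BMI bands together cover everyone.
    """
    share = 100.0 - below_normal_pct - overweight_or_obese_pct
    if not 0.0 <= share <= 100.0:
        raise ValueError("input percentages must each be >= 0 and sum to at most 100")
    return share


# Made-up example: 20% underweight and 30% overweight or obese.
print(normal_weight_share(20.0, 30.0))
```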

Child Immunizations

Left: Percentage of 1-2 year old children with full immunization. Right: Percentage of 1-2 year old children with polio immunization.
West Bengal, Kerala and Punjab emerge as leaders in the child immunization space, with polio immunization being particularly widespread.
Left: Mothers with full pre-birth (antenatal) care. Right: Caesarean births.
The entire country struggles to provide mothers with antenatal care, but, worryingly, Caesareans, or C-sections, are particularly concentrated in the south of India.

Documentation

The metadata associated with each district are:

  • State Name
  • District Name
  • Households Surveyed
    The number of households surveyed for that district
  • Women Surveyed
    The number of women surveyed for that district
  • Men Surveyed
    The number of men surveyed for that district
  • Rural-Urban Classification
    If the percent rural population is above 70%, the district is classified as “Rural”.
    If the percent urban population is above 70%, the district is classified as “Urban”.
    If the percent urban/rural population is between 30-70%, the district is classified as “Both”.
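The Rural-Urban classification rule above can be written as a small function. This is a sketch under two assumptions that the rule does not spell out: the rural and urban percentages sum to 100, and a value of exactly 30% or 70% falls into “Both”. The function name is hypothetical.

```python
def classify_district(percent_rural):
    """Classify a district as "Rural", "Urban" or "Both" from its rural share.

    Assumes percent_urban = 100 - percent_rural, and that the boundary
    values (exactly 30% or 70%) count as "Both".
    """
    if not 0 <= percent_rural <= 100:
        raise ValueError("percent_rural must be between 0 and 100")
    percent_urban = 100.0 - percent_rural
    if percent_rural > 70:
        return "Rural"
    if percent_urban > 70:
        return "Urban"
    return "Both"


for share in (85, 50, 20):
    print(share, classify_district(share))
```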

The 93 key indicators are:

  • Population and Household Profile
  • Population (female) age 6 years and above who ever attended school (%)
  • Population below age 15 years (%)
  • Sex ratio of the total population (females per 1,000 males)
  • Sex ratio at birth for children born in the last five years (females per 1,000 males)
  • Children under age 5 years whose birth was registered (%)
  • Households with electricity (%)
  • Households with an improved drinking-water source (%)
    Piped water into dwelling/yard/plot, public tap/standpipe, tube well or borehole, protected dug well, protected spring, rainwater, community RO plant.
  • Households using improved sanitation facility (%)
    Flush to piped sewer system, flush to septic tank, flush to pit latrine, ventilated improved pit (VIP)/biogas latrine, pit latrine with slab, twin pit/composting toilet, which is not shared with any other household.
  • Households using clean fuel for cooking (%)
    Electricity, LPG/natural gas, biogas.
  • Households using iodized salt (%)
  • Households with any usual member covered by a health scheme or health insurance (%)
  • Characteristics of Adults (age 15 - 49)
  • Women who are literate (%)
  • Men who are literate (%)
  • Women with 10 or more years of schooling (%)
  • Marriage and Fertility
  • Women age 20-24 years married before age 18 years (%)
  • Men age 25-29 years married before age 21 years (%)
  • Women age 15-19 years who were already mothers or pregnant at the time of the survey (%)
  • Current Use of Family Planning Methods (currently married women age 15-49 years)
  • Any method (%)
    Includes other methods that are not shown separately
  • Any modern method (%)
    Includes other methods that are not shown separately
  • Female sterilization (%)
  • Male sterilization (%)
  • IUD/PPIUD (%)
  • Pill (%)
  • Condom (%)
  • Unmet Need for Family Planning (currently married women age 15-49 years)
    Unmet need for family planning refers to fecund women who are not using contraception but who wish to postpone the next birth (spacing) or stop childbearing altogether (limiting). Specifically, women are considered to have unmet need for spacing if they are:
    a. At risk of becoming pregnant, not using contraception, and either do not want to become pregnant within the next two years, or are unsure if or when they want to become pregnant.
    b. Pregnant with a mistimed pregnancy.
    c. Postpartum amenorrheic for up to two years following a mistimed birth and not using contraception.
    Women are considered to have unmet need for limiting if they are:
    a. At risk of becoming pregnant, not using contraception, and want no (more) children.
    b. Pregnant with an unwanted pregnancy.
    c. Postpartum amenorrheic for up to two years following an unwanted birth and not using contraception.
    Women who are classified as infecund have no unmet need because they are not at risk of becoming pregnant. Unmet need for family planning is the sum of unmet need for spacing plus unmet need for limiting.
  • Total unmet need (%)
  • Unmet need for spacing (%)
  • Quality of Family Planning Services
  • Health worker ever talked to female non-users about family planning (%)
  • Current users ever told about side effects of current method (%)
    Based on current users of female sterilization, IUD/PPIUD, injectables and pill who started using that method in the past 5 years.
  • Maternity Care (for last birth in the 5 years before the survey)
  • Mothers who had antenatal check-up in the first trimester (%)
  • Mothers who had at least 4 antenatal care visits (%)
  • Mothers whose last birth was protected against neonatal tetanus (%)
    Includes mothers with two injections during the pregnancy of her last birth, or two or more injections (the last within 3 years of the last live birth), or three or more injections (the last within 5 years of the last birth), or four or more injections (the last within 10 years of the last live birth), or five or more injections at any time prior to the last birth.
  • Mothers who consumed iron folic acid for 100 days or more when they were pregnant (%)
  • Mothers who had full antenatal care (%)
    Full antenatal care is at least four antenatal visits, at least one tetanus toxoid (TT) injection and iron folic acid tablets or syrup taken for 100 or more days.
  • Registered pregnancies for which the mother received Mother and Child Protection (MCP) card (%)
  • Mothers who received postnatal care from a doctor/nurse/LHV/ANM/midwife/other health personnel within 2 days of delivery (%)
  • Mothers who received financial assistance under Janani Suraksha Yojana (JSY) for births delivered in an institution (%)
  • Average out of pocket expenditure per delivery in public health facility (Rs.)
  • Children born at home who were taken to a health facility for check-up within 24 hours of birth (%)
  • Children who received a health check after birth from a doctor/nurse/LHV/ANM/midwife/other health personnel within 2 days of birth (%)
  • Delivery Care (for births in the 5 years before the survey)
  • Institutional births (%)
  • Institutional births in public facility (%)
  • Home delivery conducted by skilled health personnel (out of total deliveries) (%)
  • Births assisted by a doctor/nurse/LHV/ANM/other health personnel (%)
  • Births delivered by caesarean section (%)
  • Births in a private health facility delivered by caesarean section (%)
  • Births in a public health facility delivered by caesarean section (%)
  • Child Immunizations and Vitamin A Supplementation
  • Children age 12-23 months fully immunized (BCG, measles, and 3 doses each of polio and DPT) (%)
  • Children age 12-23 months who have received BCG (%)
  • Children age 12-23 months who have received 3 doses of polio vaccine (%)
  • Children age 12-23 months who have received 3 doses of DPT vaccine (%)
  • Children age 12-23 months who have received measles vaccine (%)
  • Children age 12-23 months who have received 3 doses of Hepatitis B vaccine (%)
  • Children age 9-59 months who received a vitamin A dose in last 6 months (%)
  • Children age 12-23 months who received most of the vaccinations in public health facility (%)
  • Children age 12-23 months who received most of the vaccinations in private health facility (%)
  • Treatment of Childhood Diseases (children under age 5 years)
  • Prevalence of diarrhoea (reported) in the last 2 weeks preceding the survey (%)
  • Children with diarrhoea in the last 2 weeks who received oral rehydration salts (ORS) (%)
  • Children with diarrhoea in the last 2 weeks who received zinc (%)
  • Children with diarrhoea in the last 2 weeks taken to a health facility (%)
  • Prevalence of symptoms of acute respiratory infection (ARI) in the last 2 weeks preceding the survey (%)
  • Children with fever or symptoms of ARI in the last 2 weeks preceding the survey taken to a health facility (%)
  • Child Feeding Practices and Nutritional Status of Children
  • Children under age 3 years breastfed within one hour of birth (%)
    Based on the last child born in the 5 years before the survey
  • Children under age 6 months exclusively breastfed (%)
    Based on the youngest child living with the mother
  • Children age 6-8 months receiving solid or semi-solid food and breastmilk (%)
    Based on the youngest child living with the mother
  • Breastfeeding children age 6-23 months receiving an adequate diet (%)
    Based on the youngest child living with the mother
  • Non-breastfeeding children age 6-23 months receiving an adequate diet (%)
    Based on the youngest child living with the mother. Breastfed children receiving 4 or more food groups and a minimum meal frequency, non-breastfed children fed with a minimum of 3 Infant and Young Child Feeding Practices (fed with other milk or milk products at least twice a day, a minimum meal frequency that is receiving solid or semi-solid food at least twice a day for breastfed infants 6-8 months and at least three times a day for breastfed children 9-23 months, and solid or semi-solid foods from at least four food groups not including the milk or milk products food group)
  • Total children age 6-23 months receiving an adequate diet (%)
    Based on the youngest child living with the mother
  • Children under 5 years who are stunted (height-for-age) (%)
    Below -2 standard deviations, based on the WHO standard.
  • Children under 5 years who are wasted (weight-for-height) (%)
    Below -2 standard deviations, based on the WHO standard.
  • Children under 5 years who are severely wasted (weight-for-height) (%)
    Below -3 standard deviations, based on the WHO standard.
  • Children under 5 years who are underweight (weight-for-age) (%)
    Below -2 standard deviations, based on the WHO standard.
  • Nutritional Status of Adults (age 15 - 49 years)
  • Women whose Body Mass Index (BMI) is below normal (BMI < 18.5 kg/m²) (%)
    Excludes pregnant women and women with a birth in the preceding 2 months.
  • Men whose Body Mass Index (BMI) is below normal (BMI < 18.5 kg/m²) (%)
  • Women who are overweight or obese (BMI ≥ 25.0 kg/m²) (%)
  • Men who are overweight or obese (BMI ≥ 25.0 kg/m²) (%)
  • Anaemia among Children and Adults
    Haemoglobin in grams per decilitre (g/dl). Among children, prevalence is adjusted for altitude. Among adults, prevalence is adjusted for altitude and for smoking status.
  • Children age 6-59 months who are anaemic (<11.0 g/dl) (%)
  • Non-pregnant women age 15-49 years who are anaemic (<12.0 g/dl) (%)
  • Pregnant women age 15-49 years who are anaemic (<11.0 g/dl) (%)
  • All women age 15-49 years who are anaemic (%)
  • Men age 15-49 years who are anaemic (<13.0 g/dl) (%)
  • Blood Sugar Level among Adults (age 15-49 years)
    Random blood sugar measurement (including those under medication).
  • Women - Blood sugar level - high (>140 mg/dl) (%)
  • Women - Blood sugar level - very high (>160 mg/dl) (%)
  • Men - Blood sugar level - high (>140 mg/dl) (%)
  • Men - Blood sugar level - very high (>160 mg/dl) (%)
  • Hypertension among Adults (age 15 - 49 years)
  • Women - Slightly above normal (Systolic 140-159 mm of Hg and/or Diastolic 90-99 mm of Hg) (%)
  • Women - Moderately high (Systolic 160-179 mm of Hg and/or Diastolic 100-109 mm of Hg) (%)
  • Women - Very high (Systolic ≥180 mm of Hg and/or Diastolic ≥110 mm of Hg) (%)
  • Men - Slightly above normal (Systolic 140-159 mm of Hg and/or Diastolic 90-99 mm of Hg) (%)
  • Men - Moderately high (Systolic 160-179 mm of Hg and/or Diastolic 100-109 mm of Hg) (%)
  • Men - Very high (Systolic ≥180 mm of Hg and/or Diastolic ≥110 mm of Hg) (%)
  • Women Age 15-49 years Who Have Ever Undergone Examinations of
  • Cervix (%)
  • Breast (%)
  • Oral cavity (%)

I love hearing feedback! If you didn't like something, let me know in the comments and feel free to reach out to me. If you did like it, you can share it with your followers or follow me on Twitter!