The GitHub repository with the data for the CBSE is available at deedy/cbse_schools_data, and the one for the CISCE at deedy/cisce_schools_data.

India has two national boards of education: the Central Board of Secondary Education (CBSE) and the Council for the Indian School Certificate Examinations (CISCE). As of 2018, the CBSE has 20,367 affiliated schools and the CISCE has 2,341. Every year, over 1.1 million students in India take either the CBSE Class XII or the ISC examination to graduate from high school. In this post, I write about scraping, parsing, cleaning and opening up about 3 million data points for the 22,000+ schools affiliated with these boards.

A map of the 10 CBSE regions of India.

CBSE

The CBSE is a government-run national board of education in India, and is far larger than the CISCE. For each of the 20,367 schools affiliated with the CBSE, there are 144 data points.
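The 144 figure can be reproduced from the field groups listed in the Documentation section of this post. The snippet below is only a quick arithmetic check; the variable names are mine and are not part of the dataset.

```python
# Field counts per group, taken from the Documentation section of this post.
top_level = 25                 # name through soc_name
location = 6                   # nearest railway station, police station, bank
nature = 3                     # category, medium, school type
enrollment = (5 + 12) * 2      # 5 class bands + 12 classes, sections and students each
infrastructure = 11 * 3        # 11 room types: number, length, breadth
teachers = 7 * 3               # 7 staff categories: number, trained, untrained
physical = 10
facilities = 12

fields_per_school = (
    top_level + location + nature + enrollment
    + infrastructure + teachers + physical + facilities
)
print(fields_per_school)  # 144
```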

Regions

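There are 10 CBSE regions, each with its own region code. The region code is a one- or two-digit number that forms the first digits of the roll numbers of students from that region's schools in the CBSE Class XII examination. Each entry below gives the regional office, its code in parentheses, and the states and territories it covers: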

  • Ajmer, Rajasthan (1) Gujarat, Madhya Pradesh, Rajasthan, Dadra & Nagar Haveli.
  • Panchkula, Haryana (2) Haryana, Himachal Pradesh, Jammu & Kashmir, Punjab, Chandigarh.
  • Guwahati, Assam (3) Arunachal Pradesh, Assam, Manipur, Meghalaya, Mizoram, Nagaland, Sikkim, Tripura.
  • Chennai, Tamil Nadu (46) Andhra Pradesh, Goa, Karnataka, Maharashtra, Tamil Nadu, Telangana, Andaman & Nicobar, Daman & Diu, Puducherry.
  • Trivandrum, Kerala (48) Kerala, Lakshadweep.
  • Allahabad, Uttar Pradesh (56) Uttar Pradesh (western districts).
  • Dehradun, Uttarakhand (58) Uttarakhand, Uttar Pradesh (eastern districts).
  • Bhubaneswar, Odisha (6) Chhattisgarh, Odisha, West Bengal.
  • Patna, Bihar (7) Bihar, Jharkhand.
  • Delhi, NCT (9) Delhi and foreign schools.
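Since no one-digit code is a prefix of a two-digit code, a roll number can be mapped back to its region by checking its leading digits. Here is a minimal sketch of that lookup. The function name and the sample roll numbers are made up for illustration, and I am assuming the roll number is available as a plain string of digits.

```python
# Region codes copied from the list above, mapped to the regional office.
REGION_BY_CODE = {
    "1": "Ajmer",
    "2": "Panchkula",
    "3": "Guwahati",
    "46": "Chennai",
    "48": "Trivandrum",
    "56": "Allahabad",
    "58": "Dehradun",
    "6": "Bhubaneswar",
    "7": "Patna",
    "9": "Delhi",
}


def region_for_roll_number(roll_number):
    """Return the regional office whose code prefixes roll_number, or None."""
    digits = str(roll_number).strip()
    # Try the two-digit codes first, then fall back to the one-digit codes.
    for length in (2, 1):
        office = REGION_BY_CODE.get(digits[:length])
        if office is not None:
            return office
    return None


# Hypothetical roll numbers, only to show the lookup.
print(region_for_roll_number("4612345"))  # Chennai
print(region_for_roll_number("6123456"))  # Bhubaneswar
```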

Documentation

The detailed documentation and the data can be found in the GitHub repository. Each field below is listed as requirement, type, name and description:

  • required string name School name in upper case
  • required int32 aff_no Affiliation number, unique
  • required State state Indian State/Union Territory or “Foreign Schools”
  • optional District district Indian District (or Country if state == FOREIGN SCHOOLS)
  • required CbseRegion region One of the 10 CBSE regions this school is in the jurisdiction of.
  • required string address Postal Address
  • optional int32 pincode Indian pincode
  • optional string ph_no Phone number (with STD Code). ‘;’ Separated phone-numbers.
  • optional string off_ph_no Office phone number. ‘;’ Separated phone-numbers.
  • optional string res_ph_no Residential phone number. ‘;’ Separated phone-numbers.
  • optional string fax_no Fax number. ‘;’ Separated numbers.
  • optional string email Email address
  • optional string website Website
  • optional int32 year_found Year that the school was founded (between 1800 and 2018)
  • optional Date date_opened Date that the school was opened (in form “Sep 9, 2010”)
  • optional string princi_name Name of the principal, upper case
  • optional Sex sex Gender/sex of the school/principal (unclear?).
  • optional int32 princi_qual Qualifications of the principal
  • optional int32 princi_exp_adm Number of years of administrative experience of the principal
  • optional int32 princi_exp_teach Number of years of teaching experience of the principal
  • required Status status Status of the school - e.g. Middle Class, Secondary or Senior Secondary
  • optional AffiliationType aff_type Affiliation Type e.g. Provisional, Permanent
  • optional Date aff_start Affiliation start date (in form “Sep 9, 2010”)
  • optional Date aff_end Affiliation end date (in form “Sep 19, 2011”)
  • optional string soc_name Name of Trust, Society or Managing Committee, upper case
  • School Location Details (6 fields)
    • optional string l_nearest_railway Name of nearest railway station
    • optional float l_nearest_railway_dist Distance (in km) to nearest railway station
    • optional string l_nearest_police Name of nearest police station
    • optional float l_nearest_police_dist Distance (in km) to nearest police station
    • optional string l_nearest_bank Name of nearest nationalized bank
    • optional float l_nearest_bank_dist Distance (in km) to nearest nationalized bank
  • School Nature Details (3 fields)
    • optional Category n_category The gender category of the school (e.g. Co-educational, Boys, Girls)
    • optional Medium n_medium The medium of instruction (e.g. English, Hindi)
    • optional Type n_school_type The type of the school (e.g. Independent, Govt, KVS, JNV, Govt Aided, etc)
  • School Enrollment Details (34 fields)
    • -sections and -students of each of the different classes:
    • e_nursery_- (Nursery/KG/LKG), e_i_v_- (Class I-V Elementary), e_vi_viii_- (Class VI-VIII Middle), e_ix_x_- (Class IX-X Secondary), e_xi_xii_- (Class XI-XII Senior Secondary) and
    • e_i_-, e_ii_-, e_iii_-, e_iv_-, e_v_-, e_vi_-, e_vii_-, e_viii_-, e_ix_-, e_x_-, e_xi_-, e_xii_- (Class I through XII)
  • School Infrastructure Details (33 fields)
    • Each of -_no (number), -_length (length in m), -_breadth (breadth in m) of
    • i_classrooms-, i_composite_lab-, i_phy_lab-, i_chem_lab-, i_bio_lab-, i_biotech_lab-, i_math_lab-, i_cs_lab-, i_home_lab-, i_library-, i_other_lab-
  • School Teacher Details (21 fields)
    • Each of -_no (number), -_trained (trained), -_untrained (untrained) of
    • t_ntts- NTTs: Nursery Teacher Training
    • t_prts- PRTs: Primary Teacher (1-5th, diploma in education)
    • t_tgts- TGTs: Trained Graduate Teacher (6-10th, grad in subject, B.Ed)
    • t_librarians- Librarians
    • t_ptis- PTIs: Physical Training Instructor
    • t_pgts- PGTs: Post Graduate Teacher (11-12th, post grad in subject, B.Ed)
    • t_execs- Executive (Vice Principal/Supervisor/Head Master/Head Mistress) Teacher
  • School Physical Infrastructure Details (10 fields)
    • optional float p_area_meter Area of school (in m^2)
    • optional float p_area_acre Area of school (in acre)
    • optional float p_area_builtup_meter Built-up area of school (in m^2)
    • optional Sites p_num_sites Number of sites this school is at (e.g. ONE, TWO)
    • optional float p_area_playground Area of playground (in m^2)
    • optional UrinalType p_urinal_type Type of urinals (e.g. flush, dry)
    • optional int p_boys_urinal Number of boys urinals
    • optional int p_girls_urinal Number of girls urinals
    • optional bool p_potable_water Whether the water is drinkable
    • optional bool p_health_cert Whether the school has an Official Health and Sanitary certificate
  • School Facilities Details (12 fields)
    • optional int f_total_books Number of total books
    • optional int f_periodicals Number of periodicals
    • optional int f_dailies Number of dailies
    • optional int f_reference_books Number of reference books
    • optional int f_magazine Number of magazines
    • optional bool f_swimming_pool Swimming Pool? (yes/no)
    • optional bool f_indoor_games Indoor games? (yes/no)
    • optional bool f_dance_rooms Dance rooms? (yes/no)
    • optional bool f_gym Gym? (yes/no)
    • optional bool f_music_rooms Music rooms? (yes/no)
    • optional bool f_hostel Hostel? (yes/no)
    • optional bool f_health_checkup Health checkup? (yes/no)
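A couple of the fields above need a little unpacking before they are useful: the phone and fax fields hold ‘;’ separated values, and the Date fields are written like “Sep 9, 2010”. The helpers below are a rough sketch of how I would read them in Python. The function names and the sample phone numbers are hypothetical, and I am assuming the raw values arrive as strings.

```python
from datetime import datetime


def split_numbers(value):
    """Split a ';'-separated field such as ph_no into a list of numbers."""
    if not value:
        return []
    return [part.strip() for part in value.split(";") if part.strip()]


def parse_date(value):
    """Parse a date written like "Sep 9, 2010"; return None if it is empty."""
    if not value:
        return None
    # %b is the abbreviated month name, which assumes an English locale.
    return datetime.strptime(value.strip(), "%b %d, %Y").date()


# Made-up phone numbers, only to show the splitting.
print(split_numbers("011-2345678; 011-8765432"))  # ['011-2345678', '011-8765432']
print(parse_date("Sep 9, 2010"))                  # 2010-09-09
```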

CISCE

The CISCE is the privately run national board of education in India. It’s largely based in West Bengal, and is the leading competitor to the CBSE. We’ve fetched 16 fields per CISCE school. The detailed documentation and the data can be found in the GitHub repository.

  • required string name Name of the school
  • required string code 5-character unique code for school (e.g. AP001)
  • required string address Postal Address
  • required State state Indian State or Union Territory (or Country/City if international)
  • optional Phone off_ph_no Office Phone number. Integral phone number that may be prefixed with ‘+’.
  • optional Phone off_ph_no_2 Office Phone number 2.
  • optional Phone fax_no Fax number.
  • optional string email Email address
  • optional string website Website of the school
  • required Gender gender Gender of the school students (e.g. Co-ed, Boys, Girls)
  • optional ResidentialType res_type Residential Type (e.g. D/R, Day)
  • required AffiliationType aff_type Affiliation type of the school (e.g. Permanent, Provisional)
  • required bool is_icse Does this school administer ICSE (Class X examinations)
  • required bool is_isc Does this school administer ISC (Class XII examinations)
  • required bool is_cve Does this school administer CVE (Certificate of Vocational Education)
  • required string princi_name Name of the principal.