1. Apples, oranges and enrolment rates
One of the biggest problems with education data in Pakistan is comparability. Government datasets like the Pakistan Education Atlas and Pakistan Education Statistics (both based on NEMIS data) do not offer definitions for variables, so it is up to users to interpret the data as best they can. The Annual Status of Education Report (ASER) does not specify in detail which enrolment rate it uses: net enrolment or gross enrolment. This makes comparison with the Net Enrolment Rates (NER) and Gross Enrolment Rates (GER) given elsewhere difficult.
Survival Rates (a measure of a child’s progress through her school career) are reported in NEMIS as well as in some of the provincial Education Management Information Systems (EMIS). However, the numbers reported by NEMIS differ from those provided by provincial departments. For example, in the Pakistan Education Atlas (2013) the boys’ survival rate to 5th grade for Khyber Pakhtunkhwa is 71%, but for the same grade the Khyber Pakhtunkhwa EMIS 2013 reports 58%.
When compiling enrolment rates, different data collection agencies use different age brackets. The Pakistan Social and Living Standards Measurement Survey (PSLM) uses three separate age brackets for children of primary-school-going age: 4-9, 5-9 and 6-10.
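To see why the choice of rate and age bracket matters, here is a minimal sketch in Python. It assumes the commonly used definitions (NER counts only enrolled children inside the official age bracket; GER counts all enrolled children regardless of age; both divide by the population inside the bracket). The datasets discussed above may define these differently, which is exactly the problem. All counts are invented for illustration and the function names are hypothetical.

```python
# Illustrative, made-up counts by single year of age (not real data).
population = {age: 1000 for age in range(4, 12)}
# Children enrolled in primary grades, by age (also made up).
enrolled_primary = {4: 100, 5: 500, 6: 700, 7: 800, 8: 800, 9: 700, 10: 600, 11: 200}


def population_in_bracket(low, high):
    """Number of children aged low..high (inclusive)."""
    return sum(n for age, n in population.items() if low <= age <= high)


def net_enrolment_rate(low, high):
    """Enrolled children aged low..high as a percentage of all children aged low..high."""
    enrolled = sum(n for age, n in enrolled_primary.items() if low <= age <= high)
    return 100.0 * enrolled / population_in_bracket(low, high)


def gross_enrolment_rate(low, high):
    """All enrolled children, of any age, as a percentage of children aged low..high."""
    return 100.0 * sum(enrolled_primary.values()) / population_in_bracket(low, high)


# The three primary-age brackets mentioned above.
for low, high in [(4, 9), (5, 9), (6, 10)]:
    ner = net_enrolment_rate(low, high)
    ger = gross_enrolment_rate(low, high)
    print(f"ages {low}-{high}: NER {ner:.1f}%, GER {ger:.1f}%")
```

With these invented numbers the NER comes out at 60.0%, 70.0% and 72.0% for the three brackets, and the GER at 73.3%, 88.0% and 88.0%, even though the underlying children are exactly the same. A figure labelled only "enrolment rate" could be any of these.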
2. The census, sampling & validity
The federal government is supposed to conduct a census every ten years. Censuses were conducted regularly in 1951, 1961, 1972 and 1981. The 1991 census was delayed by seven years and finally completed in 1998, while no census was conducted in 2001. The census planned for 2008 has not yet been conducted.
Unfortunately, in the absence of a recent census, all sampling done in the country to collect data becomes intrinsically less valid. Think about the national-level migrations due to conflict (militancy), natural disasters (the 2005 earthquake and the 2010 and 2011 floods) and economic changes which have occurred since 1998. Most sampling in the country is based either on a census conducted 16 years ago or on educated estimates or conjecture about the underlying nature of the population. This means that each year survey data in Pakistan gets less and less valid. (Here is a brief primer on external validity).
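A small sketch of how an outdated sampling frame can distort a headline number. It assumes a very simple design in which a survey measures a rate separately in two strata and then weights them by population share; the strata, rates and shares are all invented for illustration, and real survey designs are considerably more involved.

```python
# Hypothetical enrolment rates measured by a survey in two strata (percent).
survey_rate = {"urban": 80.0, "rural": 50.0}

# Hypothetical population shares: as recorded in an old census,
# and as they might be today after years of migration.
census_share = {"urban": 0.30, "rural": 0.70}
current_share = {"urban": 0.40, "rural": 0.60}


def weighted_rate(rates, shares):
    """Combine per-stratum rates into one figure using population shares as weights."""
    total = sum(shares.values())
    return sum(rates[stratum] * shares[stratum] for stratum in rates) / total


stale_estimate = weighted_rate(survey_rate, census_share)
actual_rate = weighted_rate(survey_rate, current_share)

print(f"estimate using old census weights: {stale_estimate:.1f}%")
print(f"rate using current shares:         {actual_rate:.1f}%")
```

In this made-up case the per-stratum measurements are perfectly accurate, yet the national figure is off by about three percentage points purely because the weights are out of date. The further the real population drifts from the last census, the larger this kind of error can become.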
3. Mind the gaps
While a lot of data is available on education in Pakistan, there are still major gaps. Perhaps one of the biggest challenges is clearly identifying education budget and spending data in a standardized and comparable way across provinces and at the sub-provincial level. While some information exists, it is usually buried in government budgetary documents which are opaque and not easy to decipher. Most of these documents are not available online.
A second major issue is the paucity of data on non-governmental educational institutes. This includes both private schools and madrassas. We rely on occasional surveys of private institutions (PEIP, 2000 and the National Education Census, 2005) and household estimates for participation in the madrassa system.
A third area which requires greater attention is making data available at levels of aggregation below the district, for sub-district areas like tehsils, union councils, villages and cities.
Finally, the late release of data means that when discussing educational trends we are often talking about things which happened more than a year ago. The latest publicly available PSLM and NEMIS data were collected in 2012-13, almost two years ago.
4. Jhelum by any other name …
Perhaps one of the most frustrating things (and one emblematic of the larger challenge) is that we cannot even agree on what to call places. Is it Sindh or Sind? Balochistan or Baluchistan? For the record, we use the spellings provided by the Constitution of Pakistan (Sindh and Balochistan).
District names are even more fraught with confusion. Sometimes the departments of the same government (federal or provincial) use different spellings:
- Jhelum or Jehlum (Punjab)
- Layyah or Leiah (Punjab)
- Killa Abdullah or Qilla Abdullah (Balochistan)
- Astore or Astor (Gilgit-Baltistan)
While this problem may appear minor on the surface, it is important. Ideally, datasets from different agencies would be available in electronic form, and data users would be able to use statistical (and other kinds of) software to have the datasets talk to each other. Standardized place and variable names would allow for ease of comparison.
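As a rough illustration of what "talking to each other" requires in practice, the Python sketch below maps variant spellings to a single canonical name before joining two datasets. The lookup table, the function names and the figures are all hypothetical. The choice of canonical spelling for the districts is arbitrary here; only the province spellings follow the Constitution as noted above.

```python
# Variant spellings (lower-cased) mapped to one canonical spelling.
# The district choices are arbitrary assumptions for illustration.
CANONICAL = {
    "jehlum": "Jhelum",
    "leiah": "Layyah",
    "qilla abdullah": "Killa Abdullah",
    "astor": "Astore",
    "sind": "Sindh",
    "baluchistan": "Balochistan",
}


def canonical_name(name):
    """Collapse whitespace, ignore case, and map known variants to one spelling."""
    key = " ".join(name.split()).lower()
    return CANONICAL.get(key, key.title())


def join_on_district(a, b):
    """Pair up values from two datasets for districts that appear in both."""
    a_norm = {canonical_name(k): v for k, v in a.items()}
    b_norm = {canonical_name(k): v for k, v in b.items()}
    return {d: (a_norm[d], b_norm[d]) for d in sorted(a_norm.keys() & b_norm.keys())}


# Two made-up datasets from hypothetical agencies using different spellings.
agency_a = {"Jhelum": 71, "Layyah": 65, "Astore": 60}
agency_b = {"Jehlum": 80, "Leiah ": 75, "ASTOR": 55}

print(join_on_district(agency_a, agency_b))
```

Without the normalisation step, a naive join on the raw names would match none of the three districts. A shared, published list of place names would make lookup tables like this unnecessary.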
*Originally published by Pakistan Data Portal.