CHAPTER 2 Methodological Issues

Susan Schadler and Hugh Bredenkamp
Published Date:
June 1999
Hugh Bredenkamp

The extent of progress with respect to the objectives of the ESAF can be assessed from a variety of perspectives. The most commonly used in evaluations of adjustment programs are:

  • a comparison of outcomes before and after the start of a program or adjustment phase;
  • a comparison over time with outcomes in other countries or groups of countries;
  • a comparison between outcomes and targets set under a program; and
  • a comparison with what might have been achieved in the absence of a program.

Approaches to Evaluation

Before-and-after comparisons show progress (or lack thereof) in an absolute sense. They address straightforwardly whether countries’ policies changed course, and whether their economic situations improved or deteriorated over the periods of interest. Such comparisons are used extensively in this study, and so it is important to bear in mind what information they can and cannot convey.1 If the periods are chosen to precede and follow the start of a program or adjustment phase, before-and-after comparisons can demonstrate an association between the outcomes, policies pursued, and the (IMF-supported) program within which the policies were framed. As regards causality, however, no firm inferences can be drawn from such evidence. Although policies may have influenced outcomes, the policies were not necessarily “caused” by the program—they might have been implemented anyway. Discerning the impact of policies also strictly requires that the effects of differing initial conditions and exogenous factors (such as changes in commodity prices, world interest rates, or the weather) be taken into account.
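As a minimal sketch of the arithmetic behind a before-and-after comparison, the Python fragment below averages an indicator over two windows and reports the change. The growth figures are invented for illustration, the 1981–85 and 1986–90 windows simply mirror the general proxy periods used in the study, and `period_mean` is a hypothetical helper rather than anything prescribed by the review.

```python
from statistics import mean

# Hypothetical annual real GDP growth (percent) for one illustrative country.
growth = {
    1981: 1.0, 1982: 0.5, 1983: -0.5, 1984: 1.5, 1985: 2.5,
    1986: 3.0, 1987: 2.0, 1988: 4.0, 1989: 3.5, 1990: 2.5,
}


def period_mean(series, start, end):
    """Average of the series over the years start..end, inclusive."""
    return mean(value for year, value in series.items() if start <= year <= end)


before = period_mean(growth, 1981, 1985)  # proxy for the preadjustment period
after = period_mean(growth, 1986, 1990)   # proxy for the adjustment period
change = after - before

print(f"before={before:.1f}  after={after:.1f}  change={change:+.1f}")
```

The change computed here is an association only: nothing in the calculation controls for initial conditions or exogenous factors, which is exactly the limitation noted above.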

Cross-country comparisons can be of interest from several perspectives. First, it may be considered important not only that a particular group of countries progress in absolute terms but also that they keep up, or catch up, with other groups. Second, if the comparator group is chosen appropriately, cross-country comparisons may abstract—albeit only very crudely—from some of the influences of certain exogenous factors, such as oil-price shocks or fluctuations in world trade, that might be expected to affect all countries simultaneously.2 Third, by virtue of the diversity and sheer volume of information they provide, large cross-country data sets are useful as a basis for more rigorous analyses of the linkages between outcomes, on the one hand, and policies, exogenous factors, and initial conditions on the other.3 This study includes examples of both simple cross-country (or cross-group) comparisons and, in the discussion of growth performance (see Chapter 5), “controlled” comparisons using cross-country regression analysis.
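The idea of abstracting crudely from common shocks can be sketched as a difference of changes between a program group and a comparator group. Everything in the fragment below is assumed for illustration: the country labels, the growth figures, and the helper `mean_change` are hypothetical, and the calculation is only valid to the extent that the comparator group faced the same exogenous influences.

```python
from statistics import mean

# Hypothetical average annual growth (percent): (before, after) per country.
program_group = {"A": (1.0, 3.0), "B": (0.0, 2.0), "C": (2.0, 4.0)}
comparator_group = {"X": (2.0, 2.5), "Y": (1.0, 1.5), "Z": (3.0, 3.5)}


def mean_change(group):
    """Average of (after - before) across the countries in a group."""
    return mean(after - before for before, after in group.values())


program_change = mean_change(program_group)
comparator_change = mean_change(comparator_group)

# Change in the program group over and above the change in the comparator group.
relative_change = program_change - comparator_change

print(f"program={program_change:+.1f}  comparator={comparator_change:+.1f}  "
      f"relative={relative_change:+.1f}")
```

Subtracting the comparator group's change removes influences common to both groups, but it implicitly assumes similar initial conditions and economic structures, so the result remains a rough indication rather than a measure of program impact.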

An assessment of outcomes relative to program targets is a natural component of a review of this sort, and it is used extensively. As with before-and-after comparisons, however, interpretations beyond the factual—targets were met or not met, and by how much—need to be made cautiously. Deviations from target can occur because policies were not implemented as planned, because planned policies had unanticipated effects (were too “weak” or too “strong”), or because other factors differed from program assumptions. Distinguishing between such causes is possible in principle, but far from straightforward in practice.
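To illustrate why separating these causes is difficult, the sketch below splits a deviation from target into three parts under a strong simplifying assumption: that the outcome responds linearly to one policy instrument and one exogenous factor, with response coefficients known in advance. The function `decompose_deviation`, the coefficients, and all figures are hypothetical and are not taken from the review.

```python
def decompose_deviation(target, actual,
                        policy_planned, policy_actual,
                        shock_assumed, shock_actual,
                        policy_effect, shock_effect):
    """Split (actual - target) into implementation, external, and residual parts.

    policy_effect and shock_effect are the assumed (program) response
    coefficients; the residual captures effects the program did not anticipate.
    """
    total = actual - target
    implementation = policy_effect * (policy_actual - policy_planned)
    external = shock_effect * (shock_actual - shock_assumed)
    residual = total - implementation - external
    return {
        "total": total,
        "implementation": implementation,
        "external": external,
        "residual": residual,
    }


# Hypothetical case: inflation targeted at 10 percent, outturn 16 percent.
parts = decompose_deviation(
    target=10.0, actual=16.0,
    policy_planned=12.0, policy_actual=16.0,  # money growth, percent
    shock_assumed=5.0, shock_actual=9.0,      # import price inflation, percent
    policy_effect=0.5, shock_effect=0.25,     # assumed response coefficients
)
print(parts)
```

In practice the response coefficients are themselves uncertain, so the residual mixes genuinely unanticipated policy effects with errors in the assumed coefficients.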

Some studies have attempted to compare outcomes under IMF-supported programs with what would have occurred in the absence of the program.4 This requires not only that the main determinants of macroeconomic outcomes be established, but also that a counterfactual set of policies be defined. One attraction of this approach is that, by focusing directly on the counterfactual, it attempts to make explicit what other methodologies leave implicit.5 Another feature is that it allows one to measure the “net value added” of IMF-supported or other clearly identifiable programs.6 It also permits the problem of sample selection bias to be addressed.

An evaluation of this sort has not been attempted in this review, however, for two reasons. First, the practical difficulties are formidable. The counterfactual set of policies must be simulated, the typical method being to estimate policy reaction functions for program countries using data from nonprogram episodes and nonprogram countries.7 There is a wide range of possible functions, and the problems of defining and measuring policies (especially in the structural area) typically constrain the choice to relatively simple formulations, which are then assumed to apply identically to all countries in the sample. These limitations add a further layer of potential error to that already involved in linking policies to outcomes, and they can lead to weak and unstable results. When the approach was tried on the sample of countries and programs that were the subject of the 1993 ESAF review, the model was found to be so poorly determined (the explanatory power of the policy reaction functions in particular was close to zero) as to render the implied conclusions unreliable (Dicks-Mireaux and others, 1995). A second consideration weighing against this methodology was that estimates of the independent effect of IMF-supported programs would not help to answer the questions of primary interest for this review: what policies were actually pursued, what were the outcomes in terms of final objectives, and how can the design of programs be enhanced to improve their effectiveness.
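The simulation step can be sketched in a deliberately stripped-down form: fit a one-variable policy reaction function on nonprogram observations, then use it to predict the policy a program country would have chosen. This is an illustration under stated assumptions, not the specification used in the studies cited; the data, the single explanatory variable, and the helper `fit_line` are all hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x; also returns R^2."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


# Hypothetical nonprogram observations: state of the economy vs. policy setting.
state_nonprogram = [1.0, 2.0, 3.0, 4.0]
policy_nonprogram = [0.0, -2.0, -1.0, -3.0]

intercept, slope, r_squared = fit_line(state_nonprogram, policy_nonprogram)

# Counterfactual policy for a program country, given its observed state.
state_program = 3.0
policy_actual = -3.0
policy_counterfactual = intercept + slope * state_program
policy_gap = policy_actual - policy_counterfactual

print(f"R^2={r_squared:.2f}  counterfactual={policy_counterfactual:.2f}  "
      f"gap={policy_gap:.2f}")
```

The `r_squared` value is the diagnostic to watch: when it is close to zero, as reported for the 1993 ESAF sample, the predicted counterfactual policy carries little information and any gap computed from it is unreliable.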

Country Groupings

For the most part, the focus in this study is on developments in the sample of ESAF users as a whole, or in subgroups thereof, rather than on detailed case studies. This is because, for most of the issues addressed, the aim was to identify patterns in the data from which general conclusions might be drawn. The exceptions are the analyses of public enterprise and banking reforms in Chapter 8 and of revenue and expenditure reforms in the background paper on fiscal issues (see Abed and others, 1998). In both cases, it was considered that only a case-study approach would allow sufficiently close scrutiny of the institutional factors that are so important in these areas.

For purposes of exposition and analysis, various disaggregations of the sample are used, depending on the context: a breakdown by range of initial inflation is used in Chapter 6, for example, and according to the extent of progress toward external viability in Chapter 7. A common presentation, however—used throughout the study—differentiates between five regional groupings: eight countries in the CFA franc zone, 14 other African countries, four nontransition Asian countries, four Western Hemisphere countries, and six transition economies (Box 2.1). The transition economies have special characteristics, including a relatively short adjustment experience and limited data availability; thus, they have been separated out and grouped together for much of the analysis even though they are quite diverse, reflecting both regional differences and differences in the closeness of past ties with the former Soviet Union.

Box 2.1. The Countries Under Review

Throughout this study, the 36 countries under review have been classified into five regional groupings as follows:

CFA Africa         Non-CFA Africa  Asia       Western Hemisphere  Transition Economies
Burkina Faso       Gambia, The     Nepal      Nepal               Cambodia
Côte d’Ivoire      Ghana           Pakistan   Honduras            Kyrgyz Republic
Equatorial Guinea  Guinea          Sri Lanka  Nicaragua           Lao, P.D.R.
                   Sierra Leone

It was found that, in many instances, the central tendencies for these groupings represented quite well the trends in their constituent countries, while there were often interesting or important differences between groups. However, where there was a large degree of dispersion (or some notable outliers) within groupings that might color the interpretation of group averages, these are highlighted.8 The study also pays heed to the risks of reading too much into differences between group averages when observations are highly dispersed. Where relevant, and where it affects the validity of important inferences, formal tests of the significance of differences between means are reported.
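The effect of an outlier on group averages, and one way of testing a difference between means, can be sketched as follows. The study does not say which test it uses; Welch's unequal-variance t statistic is assumed here purely for illustration, and the two groups of figures and the helper `welch_t` are hypothetical.

```python
from math import sqrt
from statistics import mean, median, variance

# Hypothetical indicator values for two country groups; group_a has an outlier.
group_a = [2.0, 3.0, 4.0, 3.0, 15.0]
group_b = [1.0, 2.0, 3.0, 2.0, 2.0]


def welch_t(a, b):
    """Welch's t statistic for the difference between two sample means."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


mean_a, median_a = mean(group_a), median(group_a)
mean_b, median_b = mean(group_b), median(group_b)
t_stat = welch_t(group_a, group_b)

print(f"group_a: mean={mean_a:.1f}  median={median_a:.1f}")
print(f"group_b: mean={mean_b:.1f}  median={median_b:.1f}")
print(f"Welch t statistic={t_stat:.2f}")
```

In this made-up sample the single outlier pulls the mean of the first group well above its median, and the large within-group dispersion keeps the t statistic small even though the gap between the means looks sizable.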

A final note on data: the quality and coverage of the statistics for many of the countries under review are very poor. There are many gaps and weaknesses in data on program targets as well as outturns. This has necessitated some redefinition of samples in specific instances throughout the study, all of which are documented. More generally, it has hampered—and, in some respects, must weaken—the analysis in places. These problems were noted at the time of the last ESAF review, and it is not clear that the situation has improved markedly since then.


    Abed, George T., and others, 1998, Fiscal Reforms in Low-Income Countries: Experience Under IMF-Supported Programs, Occasional Paper 160 (Washington: IMF).

    Corbo, Vittorio, and Stanley Fischer, 1995, “Structural Adjustment, Stabilization, and Policy Reform: Domestic and International Finance,” in Handbook of Development Economics, Volume 3B, ed. by Jere Behrman and T.N. Srinivasan (Amsterdam: North-Holland; New York: Elsevier).

    Dicks-Mireaux, Louis, Mauro Mecagni, and Susan Schadler, 1995, “The Macroeconomic Effects of ESAF-Supported Programs: Revisiting Some Methodological Issues,” Working Paper 95/92 (Washington: IMF).

    Goldstein, Morris, and Peter Montiel, 1986, “Evaluating Fund Stabilization Programs with Multicountry Data: Some Methodological Pitfalls,” Staff Papers, International Monetary Fund, Vol. 33 (June), pp. 304–44.

    Khan, Mohsin S., 1990, “The Macroeconomic Effects of Fund-Supported Adjustment Programs,” Staff Papers, International Monetary Fund, Vol. 37 (June), pp. 195–231.

    Killick, Tony, 1995a, IMF Programmes in Developing Countries: Design and Impact (London: Routledge).

    Killick, Tony, 1995b, “Can the IMF Help Low-Income Countries? Experiences With Its Structural Adjustment Facilities,” World Economy, Vol. 18, No. 4 (July), pp. 603–16.

    Schadler, Susan, 1995, “Can the IMF Help Low-Income Countries: A Reply,” World Economy, Vol. 18, No. 4 (July), pp. 617–25.

    Schadler, Susan, Adam Bennett, Maria Carkovic, Louis Dicks-Mireaux, Mauro Mecagni, James Morsink, and Miguel Savastano, 1995, IMF Conditionality: Experience Under Stand-By and Extended Arrangements, Part 1: Key Issues and Findings, Occasional Paper 128 (Washington: IMF).

1. The study employs three forms of before-and-after comparison, depending on the context and data availability: (1) a “preprogram” period versus “during” program; (2) the period preceding a first SAF/ESAF-supported program (a country-specific proxy for the preadjustment period) versus the period since; and (3) the early 1980s (1981–85) as a general proxy for the preadjustment period, versus the late 1980s (1986–90) and early 1990s (1991–95). The last approach is clearly the crudest in terms of identifying the pre- and post-adjustment phases, but it greatly simplifies the analysis of general trends across countries.


2. Among the limitations of this form of analysis is the implicit assumption of similar initial conditions and economic structures across countries.


3. Interpreting simple comparisons of outturns in countries with and without ESAF-supported programs is hampered by the problem of sample selection bias: the likely tendency for ESAF users to have weaker initial conditions and perhaps less favorable external circumstances than the average developing country.


4. See Dicks-Mireaux and others (1995), Goldstein and Montiel (1986), and Khan (1990). The paper by Khan provides a thorough discussion of all the methodological issues raised here. See also Corbo and Fischer (1995), Killick (1995a and 1995b), and Schadler (1995).


5. The before-and-after comparison, for instance, implicitly assumes a counterfactual in which policies, the external environment, and outcomes would all have remained constant at their “before-program” values. In the control-group approach, the implicit underlying assumption is that, without the programs under review, the countries concerned would have had the same performance as that observed in the control group.


6. As Schadler and others (1995) have pointed out, however, the measure of value added does not discriminate between a case in which the same unit of value is being added to a very weak set of nonprogram policies and one in which the nonprogram baseline incorporates very strong policies.


7. A policy reaction function relates the setting of a policy instrument directly to a set of specified variables that represent the state of the economy and relevant economic objectives.


8. Some judgment was inevitably required with regard to the choice of the mean or the median to represent “average” values. Both measures are used, depending on which appears to be most representative of a particular sample, and in some cases (where the two measures convey a different message) both are reported.
