Component 4: Determining the Structure of the Evaluation System

When determining the structure of the system, stakeholders must consider the designated levels of performance; the frequency of evaluations, as applicable; and a number of other factors related to implementation. In designating the number and description of levels, states must ensure that the level designations (e.g., developing, proficient, exemplary) work for teachers at different experience levels. Likewise, the instruments must be sensitive enough to identify the appropriate level reliably.

In addition, it is important that the frequency of evaluation is considered separately for each measure used. Classroom observations, for example, are often conducted several times throughout the year, whereas analyses of teacher artifacts may be performed at a different frequency. The teacher's level of performance or experience also may be a factor in determining the appropriate frequency of evaluation. Beginning teachers, or teachers with identified areas of weakness, may be evaluated more frequently than teachers who have reached exemplary or master status.

States may elect to mandate specific format requirements or allow for local flexibility. When making these determinations, states should consider implementation fidelity and reliability, local bargaining constraints, and resource limitations.

For a more detailed discussion of these topics, see the full downloadable Acrobat version of A Practical Guide to Designing Comprehensive Teacher Evaluation Systems.


Multiple measures

Guiding Questions

  • Will the state promote or use multiple measures?
  • Will a single measure be sufficient in making defensible decisions regarding teacher effectiveness?
  • Will a single measure accurately capture teacher capacity in terms of ability to elicit improved student achievement and implement evidence-based instructional strategies?

Weight of measures

Guiding Questions

  • Has the state determined the percentage (weight) of each measure in the overall teacher rating?
  • Will each measure be weighted differently depending on:
    • Its relation to student achievement?
    • Its reliability and validity?
    • Its face validity?
  • Will the weight of each measure fluctuate depending on the level of reliability and validity that is proven over time?
  • Will the weight of each measure vary depending on teaching discipline and context?

Levels of proficiency

Guiding Questions

  • Have the levels of teaching proficiency been determined?
  • How many levels of proficiency can be explicitly defined?
  • Can rubrics be developed to ensure fidelity?
  • How often can data be generated?
  • What implementation limitations should be considered (e.g., how frequently assessments can be conducted)?
  • Will baseline data be analyzed prior to making decisions regarding teacher proficiency levels?

Performance failure

Guiding Questions

  • Have consequences been determined for failure to meet acceptable performance levels?
  • Are opportunities for teachers to improve going to be embedded in the evaluation cycle?
  • Are the measures technically defensible to make personnel and compensation decisions?
  • Will teacher supports be provided to assist teachers with unacceptable performance?
  • How much time and assistance will be provided for a teacher to demonstrate improvement before termination is considered?
  • Will teacher performance affect tenure?
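
The questions above about weighting measures and defining proficiency levels can be made concrete with a small worked example. The Python sketch below is illustrative only: the measure names, the weights, the 1.0 to 4.0 scoring scale, and the cut scores are all hypothetical assumptions chosen for the example, not values recommended by this guide or used by any state. It shows one simple way a weighted composite could be computed from several measures and then mapped to a level designation.

```python
from bisect import bisect_right

# Hypothetical weights for illustration only; they must sum to 1.0.
WEIGHTS = {
    "classroom_observation": 0.50,
    "student_growth": 0.35,
    "teacher_artifacts": 0.15,
}

# Hypothetical cut scores on an assumed 1.0-4.0 scale, and the level for each band.
CUT_SCORES = [2.0, 3.0]
LEVELS = ["developing", "proficient", "exemplary"]


def composite_score(scores, weights=WEIGHTS):
    """Return the weighted average of the per-measure scores."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(weights[m] * scores[m] for m in weights)


def performance_level(score, cuts=CUT_SCORES, levels=LEVELS):
    """Map a composite score to a level; a score equal to a cut falls in the higher band."""
    return levels[bisect_right(cuts, score)]


# Hypothetical scores for one teacher on each measure.
teacher = {
    "classroom_observation": 3.2,
    "student_growth": 2.6,
    "teacher_artifacts": 3.0,
}
score = composite_score(teacher)
level = performance_level(score)
print(f"{score:.2f} -> {level}")
```

With these assumed inputs, the composite is 0.50 × 3.2 + 0.35 × 2.6 + 0.15 × 3.0 = 2.96, which falls in the "proficient" band. Changing the weights by discipline or context, as the guiding questions suggest a state might, would amount to passing a different `weights` dictionary for each group of teachers.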


When determining the structure of the system, stakeholders must work through the details of how the evaluation system will operate. The resources below are intended to support states and districts as they make decisions about their evaluation systems.

Details of an Evaluation System

Determining Processes That Build Sustainable Teacher Accountability Systems (Research & Policy Brief)

Ongoing concerns about teacher accountability have prompted changes to current teacher evaluation practices. This brief reports preliminary findings and recommendations from a study of such change processes that Public Impact conducted for the TQ Center in three school districts and three state departments of education.

A Practical Guide to Evaluating Teacher Effectiveness

This guide offers a definition of teacher effectiveness that states and districts may adapt to meet local requirements. In addition, the guide provides an overview of the many purposes for evaluating teacher effectiveness and indicates which measures are most suitable to use under different circumstances.

Workshop: Supporting State Efforts to Design and Implement Teacher Evaluation Systems

This workshop was designed to provide regional comprehensive center staff with increased knowledge of select topics related to the creation and implementation of comprehensive evaluations. The workshop included four working sessions on the following topics: using evaluation data to support and improve effectiveness, measuring student growth in tested subjects, measuring student growth in nontested subjects and for teachers of at-risk students, and understanding the basics of quality assessments for measuring growth.


The GTL Center is building an online repository of expert panel reviews of real-life teacher evaluation models operated by districts throughout the country.

For each district included, you can view, per component, a description of how that district approached the many issues involved.

To view these real-life models, visit the Teacher Evaluation Models in Practice portion of the GTL Center website.

  • First, click on View the Models in the table of contents.
  • Click one or more districts.
  • Then, select Component 4.