UNIVERSIDAD MARIANO GALVEZ DE GUATEMALA FACULTAD DE IDIOMAS EVALUATION AND ASSESSMENT TECHNIQUES LICDA. EVELYN R. QUIROA
Erver Moisés Azurdia Sandoval 50761114201 11-17-2012
UNIVERSIDAD MARIANO GALVEZ DE GUATEMALA FACULTAD DE HUMANIDADES ESCUELA DE IDIOMAS LICDA. EVELYN R. QUIROA
Assessment and Evaluation Defined
Assessment is the act of gathering information on a daily basis in order to understand individual students' learning and needs. Evaluation is the culminating act of interpreting the information gathered for the purpose of making decisions or judgments about students' learning and needs, often at reporting time. Assessment and evaluation are integral components of the teaching-learning cycle. The main purposes are to guide and improve learning and instruction. Effectively planned assessment and evaluation can promote learning, build confidence, and develop students' understanding of themselves as learners. Assessment data assists the teacher in planning and adapting for further instruction. As well, teachers can enhance students' understanding of their own progress by involving them in gathering their own data, and by sharing teacher-gathered data with them. Such participation makes it possible for students to identify personal learning goals. This curriculum advocates assessment and evaluation procedures which correspond with curriculum objectives and instructional practices, and which are sensitive to the developmental characteristics of early adolescents. Observation, conferencing, oral and written product assessment, and process (or performance) assessment may be used to gather information about student progress.
Guiding Principles
The following principles are intended to assist teachers in planning for student assessment and evaluation:
• Assessment and evaluation are essential components of the teaching-learning process. They should be planned, continuous activities which are derived from curriculum objectives and consistent with the instructional and learning strategies.
• A variety of assessment and evaluation techniques should be used. Techniques should be selected for their appropriateness to students' learning styles and to the intended purposes. Students should be given opportunities to demonstrate the extent of their knowledge, abilities, and attitudes in a variety of ways.
• Teachers should communicate assessment and evaluation strategies and plans in advance, informing the students of the objectives and the assessment procedures relative to the objectives. Students should have opportunities for input into the evaluation process.
• Assessment and evaluation should be fair and equitable. They should be sensitive to family, classroom, school, and community situations and to cultural or gender requirements; they should be free of bias.
• Assessment and evaluation should help students. They should provide positive feedback and encourage students to participate actively in their own assessment in order to foster lifelong learning and enable them to transfer knowledge and abilities to their life experiences.
• Assessment and evaluation data and results should be communicated to students and parents/guardians regularly, in meaningful ways.
Using a variety of techniques and tools, the teacher collects assessment information about students' language development and their growth in speaking, listening, writing, and reading knowledge and abilities. The data gathered during assessment becomes the basis for an evaluation. Comparing assessment information to curriculum objectives allows the teacher to make a decision or judgment regarding the progress of a student's learning.
Types of Assessment and Evaluation
There are three types of assessment and evaluation that occur regularly throughout the school year: diagnostic, formative, and summative.
Diagnostic assessment and evaluation usually occur at the beginning of the school year and before each unit of study. The purposes are to determine students' knowledge and skills, their learning needs, and their motivational and interest levels. By examining the results of diagnostic assessment, teachers can determine where to begin instruction and what concepts or skills to emphasize. Diagnostic assessment provides information essential to teachers in selecting relevant learning objectives and in designing appropriate learning experiences for all students, individually and as group members. Keeping diagnostic instruments for comparison and further reference enables teachers and students to determine progress and future direction. Diagnostic assessment tools such as the Writing Strategies Questionnaire and the Reading Interest/Attitude Inventory in this guide can provide support for instructional decisions.
Formative assessment and evaluation focus on the processes and products of learning. Formative assessment is continuous and is meant to inform the student, the parent/guardian, and the teacher of the student's progress toward the curriculum objectives. This type of assessment and evaluation provides information upon which instructional decisions and adaptations can be made and provides students with directions for future learning. Involvement in constructing their own assessment instruments or in adapting ones the teacher has made allows students to focus on what they are trying to achieve, develops their thinking skills, and helps them to become reflective learners. As well, peer assessment is a useful formative evaluation technique. For peer assessment to be successful, students must be provided with assistance and the opportunity to observe a model peer assessment session. Through peer assessment students have the opportunity to become critical and creative thinkers who can clearly communicate ideas and thoughts to others. Instruments such as checklists or learning logs, and interviews or conferences provide useful data.
Summative assessment and evaluation occur most often at the end of a unit of instruction and at term or year end when students are ready to demonstrate achievement of curriculum objectives. The main purposes are to determine knowledge, skills, abilities, and attitudes that have developed over a given period of time; to summarize student progress; and to report this progress to students, parents/guardians, and teachers.
Summative judgments are based upon criteria derived from curriculum objectives. By sharing these objectives with the students and involving them in designing the evaluation instruments, teachers enable students to understand and internalize the criteria by which their progress will be determined. Often assessment and evaluation results provide both formative and summative information. For example, summative evaluation can be used formatively to make decisions about changes to instructional strategies, curriculum topics, or learning environment. Similarly, formative evaluation assists teachers in making summative judgments about student progress and determining where further instruction is necessary for individuals or groups. The suggested assessment techniques included in various sections of this guide may be used for each type of evaluation.
The Evaluation Process
Teachers as decision makers strive to make a close match between curriculum objectives, instructional methods, and assessment techniques. The evaluation process carried out parallel to instruction is a cyclical one that involves four phases: preparation, assessment, evaluation, and reflection.
In the preparation phase, teachers decide what is to be evaluated, the type of evaluation to be used (diagnostic, formative, or summative), the criteria upon which student learning outcomes will be judged, and the most appropriate assessment techniques for gathering information on student progress. Teachers may make these decisions in collaboration with students.
During the assessment phase, teachers select appropriate tools and techniques, then collect and collate information on student progress. Teachers must determine where, when, and how assessments will be conducted, and students must be consulted and informed.
During the evaluation phase, teachers interpret the assessment information and make judgments about student progress. These judgments (or evaluations) provide information upon which teachers base decisions about student learning and report progress to students and parents/guardians. Students are encouraged to monitor their own learning by evaluating their achievements on a regular basis. Encouraging students to participate in evaluation nurtures gradual acceptance of responsibility for their own progress and helps them to understand and appreciate their growth as readers and writers.
The reflection phase allows teachers to consider the extent to which the previous phases in the evaluation process have been successful. Specifically, teachers evaluate the utility, equity, and appropriateness of the assessment techniques used. Such reflection assists teachers in making decisions concerning improvements or adaptations to subsequent instruction and evaluation.
Student Assessment and Evaluation
When implementing assessment and evaluation procedures, it is valuable to consider the characteristics of early adolescents. Developmentally, Middle Level students are at various cognitive, emotional, social, and physical levels. Assessment and evaluation must be sensitive to this range of transitions and address individual progress. It is unrealistic and damaging to expect students who are at various stages of development to perform at the same level. It is necessary to clarify, for Middle Level students, the individual nature of the curriculum and the assessment strategies used; students should recognize that they are not
being compared to their peers, but that they are setting their own learning goals in relation to curriculum objectives. Insensitive evaluation of the early adolescent can result in the student developing feelings of low self-worth and wanting to give up. Regular, positive feedback is a valuable part of the learning process and helps students identify how well they have achieved individual goals and curriculum objectives. As students begin to achieve success, their sense of self-esteem increases and the need for extrinsic rewards gives way to the development of intrinsic motivation. Early adolescents are vulnerable to peer approval or rejection, and they harbor a strong sense of fairness and justice. Because Middle Level students find it more satisfying to strive for immediately achievable goals rather than long-term goals, they will respond positively to a system of continuous assessment and evaluation.
Effective evaluators of Middle Level students are astute observers who use a variety of monitoring techniques to collect information about students' knowledge, skills, attitudes, values, and language competencies. Well organized, concise, and accessible records accommodate the large quantities of data likely to be collected, and assist teachers' decision making and reporting. Some effective techniques for monitoring student progress in the areas of oracy and literacy include the following:
• Make video and audio recordings of a variety of formal and informal oral language experiences, and then assess these according to pre-determined criteria which are based upon student needs and curriculum objectives.
• Use checklists as concise methods of collecting information, and rating scales or rubrics to assess student achievement.
• Record anecdotal comments to provide useful data based upon observation of students' oral activities.
• Interview students to determine what they believe they do well or areas in which they need to improve.
• Have students keep portfolios of their dated writing samples, and language abilities checklists and records.
• Keep anecdotal records of students' reading and writing activities and experiences.
• Have students write in reader response journals.
• Confer with students during the writing and reading processes, and observe them during peer conferences.
Self-assessment promotes students' abilities to assume more responsibility for their own learning by encouraging self-reflection and by prompting them to identify where they believe they have been successful and where they believe they require assistance. Discussing students' self-assessments with them allows the teacher to see how they value their own work and to ask questions that encourage students to reflect upon their experiences and set goals for new learning. Peer assessment allows students to collaborate and learn from others. Through discussions with peers, Middle Level students can verbalize their concerns and ideas in a way that helps them clarify their thoughts and decide in which direction to proceed.
The instruments for peer and self‐assessment should be collaboratively constructed by teachers and students. It is important for teachers to discuss learning objectives with the students. Together, they can develop assessment and evaluation criteria relevant to the objectives, as well as to students' individual and group needs.
Assessment and Evaluation Strategies
Assessment data can be collected and recorded by both the teacher and the students in a variety of ways. Through observation of students, and in interviews or conferences with students, teachers can discover much about their students' knowledge, abilities, interests, and needs. As well, teachers can collect samples of students' work in portfolios and conduct performance assessments within the context of classroom activities. When a number of assessment tools are used in conjunction with one another, richer and more in-depth data collection results. Whatever method of data collection is used, teachers should:
• meet with students regularly to discuss their progress
• adjust rating criteria as learners change and progress.
Observation
Observation occurs during students' daily reading, writing, listening, and speaking experiences. It is an unobtrusive means by which teachers (and students) can determine their progress during learning. Observations can be recorded as anecdotal notes, and on checklists or rating scales. When teachers attach the data collection sheets to a hand-held clipboard, data can be recorded immediately and with little interruption to the student. Alternatively, adhesive note papers can be used to record data quickly and unobtrusively.
Anecdotal Records
Anecdotal records are notes written by the teacher regarding student language, behavior, or learning. They document and describe significant daily events, and relevant aspects of student activity and progress. These notes can be taken during student activities or at the end of the day. Formats for collection should be flexible and easy to use. Guidelines for use include the following:
• Record the observation and the circumstance in which the learning experience occurs. There will be time to analyze notes at another time, perhaps at the end of the day, or after several observations about one student have been accumulated.
• Make the task of daily note taking manageable by focusing on clearly defined objectives or purposes, and by identifying only a few students to observe during a designated period of time. However, learning and progress cannot be scheduled, and it is valuable to note other observations of importance as they occur.
• Record data on loose-leaf sheets and keep these in a three-ring binder with a page designated for each student and organized alphabetically by students' last names or by class. This format allows the teacher to add pages as necessary.
• Write the notes on recipe cards and then file these alphabetically.
• Use adhesive note papers that can be attached to the student's pages or recipe card files.
• Design structured forms for collection of specific data.
• Use a combination of the above suggestions.
Teachers may choose to keep running written observations for each student or they may use a more structured approach, constructing charts that focus each observation on the collection of specific data. A combination of open-ended notes and structured forms may also be used. It is important to date all observations recorded.
Checklists
Observation checklists, usually completed while students are engaged in specific activities or processes, are lists of specific criteria that teachers focus on at a particular time or during a particular process. Checklists are used to record whether students have acquired specific knowledge, skills, processes, abilities, and attitudes. Checklists inform teachers about where their instruction has been successful and where students need assistance or further instruction. Formats for checklists should be varied and easy to use. Guidelines for using checklists include the following:
• Determine the observation criteria from curriculum, unit, and lesson objectives.
• Review specific criteria with students before beginning the observation.
• Involve students in developing some or all of the criteria whenever it will be beneficial to do so.
• Choose criteria that are easily observed to prevent vagueness and increase objectivity.
• Use jargon-free language to describe criteria so that data can be used in interviews with students and parents.
• Make the observation manageable by keeping the number of criteria to fewer than eight and by limiting the number of students observed to a few at one time.
• Have students construct and use checklists for peer and self-assessments.
• Summarize checklist data regularly.
• Use or adapt existing checklists from other sources.
• Use yes-no checklists to identify whether a specific action has been completed or if a particular quality is present.
• Use tally checklists to note the frequency of the action observed or recorded (see the sketch following this list).
• Construct all checklists with space for recording anecdotal notes and comments.
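For teachers who keep observation records digitally, the yes-no and tally checklists described above translate naturally into simple record-keeping structures. The following Python sketch is purely illustrative: the criteria, student names, and the observe helper are invented for the example, not taken from this guide.

```python
from collections import defaultdict

# Hypothetical observation criteria; real ones would come from
# curriculum, unit, and lesson objectives, as the guidelines say.
CRITERIA = [
    "identifies the main idea of a passage",
    "supports opinions with examples",
    "listens without interrupting",
]

STUDENTS = ["Ana", "Luis", "Marta"]  # invented names

# Yes-no checklist: records whether a specific action or quality
# has been observed at all for each student.
yes_no = {s: {c: False for c in CRITERIA} for s in STUDENTS}

# Tally checklist: counts how often the action is observed.
tally = defaultdict(int)

# Space for anecdotal notes and comments, per the last guideline.
notes = defaultdict(list)

def observe(student, criterion, note=""):
    """Mark the criterion, bump its tally, and keep any comment."""
    yes_no[student][criterion] = True
    tally[(student, criterion)] += 1
    if note:
        notes[student].append(note)

observe("Ana", "supports opinions with examples", "cited two passages")
observe("Ana", "supports opinions with examples")
print(tally[("Ana", "supports opinions with examples")])  # -> 2
```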
Rating Scales and Rubrics
Rating scales record the extent to which specific criteria have been achieved by the student or are present in the student's work. Rating scales also record the quality of the student's performance at a given time or within a given process. Rating scales are similar to checklists, and teachers can often convert checklists into rating scales by assigning number values to the various criteria listed (a sketch of this conversion follows the guidelines below). They can be designed as number lines or as holistic scales or rubrics. Rubrics include criteria that describe each level of the rating scale and are used to determine student progress in comparison to these expectations. All formats for rating student progress should be concise and clear. Guidelines for use include the following:
• Determine specific assessment criteria from curriculum objectives, components of a particular activity, or student needs.
• Discuss or develop the specific criteria with students before beginning the assessment.
• Choose criteria that are easily observed in order to prevent vagueness and increase objectivity.
• Select criteria that students have had the opportunity to practice. These criteria may differ from student to student, depending upon their strengths and needs.
• Use jargon-free language to describe criteria so that data can be used effectively in interviews with students and parents.
• Make the assessment manageable by keeping the number of criteria to fewer than eight and by limiting the number of students observed to a few at one time.
• Use or adapt rating scales and rubrics from other sources.
• Use numbered continuums to measure the degree to which students are successful at accomplishing a skill or activity.
• Use rubrics when the observation calls for a holistic rating scale. Rubrics describe the attributes of student knowledge or achievements on a numbered continuum of possibilities.
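As noted in the introduction to this subsection, a checklist can often be converted into a rating scale simply by replacing each checkmark with a number value. The sketch below illustrates one way to do that; the criteria, the 1-4 scale, and the sample ratings are all assumptions made for the example.

```python
# Hypothetical writing criteria, originally a yes-no checklist.
criteria = [
    "organizes ideas logically",
    "uses varied vocabulary",
    "edits for spelling and punctuation",
]

# Replacing each checkmark with a numbered level turns the
# checklist into a rating scale.
SCALE = {1: "rarely", 2: "sometimes", 3: "usually", 4: "consistently"}

# One student's ratings from a single observation (sample data).
ratings = {
    "organizes ideas logically": 3,
    "uses varied vocabulary": 2,
    "edits for spelling and punctuation": 4,
}

for criterion in criteria:
    level = ratings[criterion]
    print(f"{criterion}: {level} ({SCALE[level]})")

# The numbers also support quick summaries across observations.
print("average level:", round(sum(ratings.values()) / len(ratings), 2))
```

Because the levels are numeric, the same records can later feed checklist summaries or report-card conversions without re-scoring the original observations.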
Portfolios
Portfolios are collections of relevant work that reflect students' individual efforts, development, and progress over a designated period of time. Portfolios provide students, teachers, parents, and administrators with a broad picture of each student's growth over time, including the student's abilities, knowledge, skills, and attitudes. Students should be involved in the selection of work to be included, goal setting for personal learning, and self-assessment. The teacher can encourage critical thinking by having students decide which of their works to include in their portfolios and explain why they chose those particular items. Instruction and assessment are integrated as students and teachers collaborate to compile relevant and individual portfolios for each student. Guidelines for use include the following:
• Brainstorm with students to discover what they already know about portfolios.
• Share samples of portfolios with students. (Teachers may need to create samples if student ones are not available; however, samples should be as authentic as possible.)
• Provide students with an overview of portfolio assessment prior to beginning their collections.
• Collaborate with students to set up guidelines for the content of portfolios and establish evaluation criteria for their portfolio collections. Consider the following:
o What is the purpose of the portfolio? (Is it the primary focus of assessment or is it supplemental? Will it be used to determine a mark or will it simply be used to inform students, teachers, and parents about student progress?)
o Who will be the audience(s) for the portfolio?
o What will be included in the portfolio (e.g., writing samples only, samples of all language processes)?
o What are the criteria for selecting a piece of work for inclusion? When should those selections be made?
o Who will determine what items are included in the portfolio (e.g., the student, the teacher, the student and teacher in consultation)?
o When should items be added or removed?
o How should the contents be organized and documented? Where will the portfolios be stored?
o What will be the criteria for evaluation of the portfolio?
o What form will feedback to the students take (e.g., written summaries, oral interviews/conferences)?
o How will the portfolio be assessed/evaluated (e.g., list of criteria)?
• Assemble examples of work that represent a wide range of students' developing abilities, knowledge, and attitudes, including samples of work from their speaking, listening, reading, writing, representing, and viewing experiences.
• Date all items for effective organization and reference.
• Inform parents/guardians about the use and purposes of portfolios (e.g., send letters describing portfolios home, display sample portfolios on meet-the-teacher evening to introduce parents to the concept).
• Consider the following for inclusion:
o criteria for content selection
o table of contents or captioned labels that briefly outline or identify the contents
o samples of student writing (e.g., pre-writing, multiple drafts, final drafts, published pieces)
o sample reading logs
o samples of a variety of responses from reader response journals (originals or photocopies of originals)
o evidence of student self-reflection (e.g., summaries, structured reflection sheets)
o audiotapes and videotapes of student work
o photographs
o collaborative projects
o computer disks.
Formats for portfolio assembly should be easily organized, stored, and accessed. Some possibilities include the following:
• Keep file folders or accordion folders in classroom filing cabinet drawers, cupboards, or boxes.
• Use three-ring binders for ease of adding and removing items as students progress.
• Store scrapbooks in boxes or crates.
Evaluating Student Portfolios
At the end of the term/semester/year when the portfolio is submitted for summative evaluation, it is useful to review the contents as a whole and record data using the previously set criteria. One method of recording data is to prepare a grid with the criteria listed down one side and the checklist or rating scale across the top. If a numerical grade is needed, assign a number value to each set of criteria on the checklist/rating scale and convert the evaluation into a number grade (a minimal sketch of this conversion appears below). Some examples of portfolio assessment and recording forms follow. The teacher can adapt these sample forms or create new ones.
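The grid-to-grade conversion described above is, in effect, weighted arithmetic: each criterion receives a rating, the ratings are scaled, and the weighted results are summed. Here is a minimal Python sketch; the criteria, weights, and four-point scale are assumptions for illustration, not forms from this guide.

```python
# Hypothetical portfolio criteria: (rating on a 1-4 scale, weight).
# Weights reflect how much each criterion counts and sum to 1.0.
portfolio = {
    "range of writing samples": (4, 0.30),
    "evidence of self-reflection": (3, 0.25),
    "growth across drafts": (3, 0.25),
    "organization and documentation": (2, 0.20),
}

MAX_LEVEL = 4  # top of the rating scale

# Scale each rating to a fraction of the maximum, weight it, and sum.
grade = sum(
    (level / MAX_LEVEL) * weight * 100
    for level, weight in portfolio.values()
)

print(f"portfolio grade: {grade:.1f}%")  # 77.5% with these sample ratings
```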
UNIVERSIDAD MARIANO GALVEZ DE GUATEMALA ESCUELA DE IDIOMAS PROFESORADO EN INGLES TECNICAS DE EVALUACION Licda. Evelyn R. Quiroa de Arévalo, M.A.
WHAT IS ASSESSMENT? SCAVENGER HUNT
WHAT IS A SCAVENGER HUNT?
A scavenger hunt is an assessment idea in which students try to find all the items or complete all the activities on a list. It can be completed in teams or by individuals, and is often timed.
MAIN ON-LINE RESOURCE: http://www.cmu.edu/teaching/assessment/index.html
ACTIVITY OBJECTIVES:
Analyze the difference between the types of assessment.
Evaluate the importance of assessment in the teaching and learning process.
Synthesize ways in which a teacher can assess his or her own teaching.
INSTRUCTIONS:
Use the link above to complete the following list of questions and activities. The completed assignment must be sent via e-mail before Wednesday, August 15th. DO NOT COPY AND PASTE.
PART I: ASSESSMENT BASICS
RESOURCE LINK: http://www.cmu.edu/teaching/assessment/basics/index.html
INSTRUCTIONS: Complete and/or answer the following questions or tasks after each instruction.
Why should assessments, learning objectives, and instructional strategies be aligned?
They should be aligned so that they reinforce one another.
What if the components of a course are misaligned?
Student motivation and learning may suffer.
What is the difference between formative and summative assessment? Provide examples.
Formative assessment monitors student learning to provide ongoing feedback, while summative assessment evaluates student learning, typically at the end of an instructional unit.
Formative assessment examples:
• draw a concept map in class to represent their understanding of a topic
• submit one or two sentences identifying the main point of a lecture
• turn in a research proposal for early feedback
Summative assessment examples:
• a midterm exam
• a final project
• a paper
• a senior recital
What is the difference between assessment and grading?
Grading evaluates individual students' learning and performance, while the purpose of assessment is to improve student learning.
PART II: HOW TO ASSESS STUDENTS' PRIOR KNOWLEDGE
RESOURCE LINK: http://www.cmu.edu/teaching/assessment/priorknowledge/index.html
What are Performance-Based Prior Knowledge Assessments? Provide examples.
They are assessments that help us gauge our students' knowledge, for example the test you gave us at the beginning of the class.
Give examples of your own of appropriately written questions for self-assessment.
Have you given assessment tests in class? Have you designed your own assessment tests? What information have you acquired from assessment tests?
What are CATs?
CATs (Classroom Assessment Techniques) are techniques that provide an overall view of students' level of knowledge.
List and provide a brief description of the CATs suggested on the site.
• Minute Paper: helps you see whether students share your view of what was learned.
• Muddiest Point: helps you see what caused students the most difficulty.
• Problem Recognition Tasks: help you see the problem and find ways of fixing it.
• Documented Problem Solutions: help you find different points of view for solving a problem.
• Directed Paraphrasing: what I am doing right now in this homework.
• Application Cards: students apply what they have learned.
• Student-Generated Test Questions: help students understand instructions, and help us see whether they understood the content.
• Classroom Opinion Polls: help you find out how much students know about a subject.
Give 3 examples of CATs you've used in class.
I have used only two of these, but now that I know the others I would like to try the rest. One that I use very often is student-generated test questions; I have been surprised by the results, which have shown me that there is a complete understanding of the instructions and, of course, of the content of the class. The other, which I use less often, is the minute paper; I use this technique with more advanced students.
PART III: HOW TO ASSESS STUDENTS' LEARNING AND PERFORMANCE
RESOURCE LINK: http://www.cmu.edu/teaching/assessment/assesslearning/index.html
List a few tips on how to create assignments.
Make sure everything matches: your objectives with the assessment, and the title of the assessment with what you want students to do. Make it interesting and challenging for them, and make sure they have enough time to do the assignment.
List a few tips on how to create exams.
Make sure your tests are clear, aligned with your objectives, and fair, with points assigned to each item. Make sure the students will have enough time to complete the exam.
Compare and contrast the two previous methods.
They are very similar because both are tied to the objectives of the class, and both have to be thought through before being given to the students; yet each will show you different weaknesses and strengths.
What is the difference between concept maps and concept tests? Provide an example of each.
Concept tests help instructors gauge whether students understand key concepts, while concept maps are graphic representations of students' knowledge.
Concept map example:
Concept test example: What would happen if there were no trees?
Provide tips on how to assess student group work. Explain what has worked for you.
Students work together to achieve a common academic goal set by the teacher, who provides the means to reach it. These strategies also use team rewards as a means of motivating students to work well with group members and to be responsible for their part of the work. This also helps me bind the group together.
What benefits are there in using rubrics? Provide an example of a rubric you have used.
Rubrics help the teacher guide students toward what is expected and how it should be done. Below is a participation rubric I have used (a small scoring sketch follows it).
Grading for Class Participation
Weekly Frequency and Quality
A (5 points) – Attends class regularly and always contributes to the discussion by raising thoughtful questions, analyzing relevant issues, building on others' ideas, synthesizing across readings and discussions, expanding the class' perspective, and appropriately challenging assumptions and perspectives.
B (4-3 points) – Attends class regularly and sometimes contributes to the discussion in the aforementioned ways.
C (2-1 points) – Attends class regularly but rarely contributes to the discussion in the aforementioned ways.
D/R – Attends class regularly but never contributes to the discussion in the aforementioned ways.
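Since the rubric ties each letter grade to a point band, the mapping can be written as a simple lookup. This Python sketch mirrors the bands above; note that the rubric assigns no point value to D/R, so zero is assumed here, and the function itself is illustrative rather than part of the rubric.

```python
def participation_grade(points):
    """Map weekly participation points to the rubric's letter grades.

    Bands follow the rubric above: A = 5 points, B = 4-3, C = 2-1,
    and D/R otherwise (the rubric lists no points for D/R, so zero
    is assumed).
    """
    if points >= 5:
        return "A"
    if points >= 3:
        return "B"
    if points >= 1:
        return "C"
    return "D/R"

for p in (5, 4, 2, 0):
    print(p, "->", participation_grade(p))  # A, B, C, D/R
```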
PART IV: HOW TO ASSESS YOUR TEACHING
RESOURCE LINK: http://www.cmu.edu/teaching/assessment/assessteaching/index.html
What are early course evaluations? What do they consist of?
This kind of evaluation will help you find your strengths and weaknesses. Sometimes we think we are on the right path, but after one of these you get to know how your students see you and the course you teach. Remember to think positively; some of the things students write won't be nice. Keep an open mind.
Explain the importance of classroom observations. What are a few suggestions on how to go about a classroom observation?
As teachers we don't know everything, although we should, and having a colleague observe your class will really tell you where you are. Make sure you choose someone who knows and understands what he or she is doing.
What is a student focus group?
A student focus group helps you identify areas of agreement and disagreement across groups of students and elicit students' suggestions for improvement.
Chapter 9. Assessment Vocabulary
The definitions in this list were derived from several sources, including:
• Glossary of Useful Terms Related to Authentic and Performance Assessments. Grant Wiggins.
• SCASS Arts Assessment Project Glossary of Assessment Terms.
• The ERIC Review: Performance-Based Assessment. Vol. 3, Issue 1, Winter 1994.
• Assessment: How Do We Know What They Know? ASCD, 1992.
• Dissolving the Boundaries: Assessment that Enhances Learning. Dee Dickinson.
• http://www.newhorizons.org/strategies/assess/terminology.htm
Accountability – The demand by a community (public officials, employers, and taxpayers) for school officials to prove that money invested in education has led to measurable learning. "Accountability testing" is an attempt to sample what students have learned, how well teachers have taught, and/or the effectiveness of a school principal's performance as an instructional leader. School budgets and personnel promotions, compensation, and awards may be affected. Most school districts make this kind of assessment public; it can affect policy and public perception of the effectiveness of taxpayer-supported schools and be the basis for comparison among schools. It has been suggested that test scores analyzed in a disaggregated format can help identify instructional problems and point to potential solutions.

Action Plans – A statement that indicates the specific changes that a given area plans to implement in the next cycle based on assessment results. For example: "The biology faculty will introduce one special project in the introductory class that will expose the students to the scientific method." "Career Services is implementing a software program called '1st Place'. This software will allow better tracking of job openings."

Action Research – Classroom-based research involving the systematic collection of data in order to address certain questions and issues so as to improve classroom instruction and educational effectiveness.

Affective Outcomes – Outcomes of education that reflect feelings more than understanding: likes, pleasures, ideals, dislikes, annoyances, values.

Annual Report – A report from each academic program, based on its assessment plan and submitted annually, which outlines how evidence was used to improve student learning outcomes through curricular and/or other changes, or to document that no changes were needed.

Assessment – The systematic collection, review, and use of information about educational programs undertaken for the purpose of improving student learning and development. In general terms, assessment is the determination of a value, or measurement, based on a "standard." We often refer to this standard as a "target." Standard-based measurement, or assessment, is useful in education both for the placement of students in initial course work and for ascertaining the extent of students' acquisition of skills/knowledge.

Assessment Cycle – The assessment cycle in higher education is generally annual and fits within the academic year. Outcomes, targets and assessment tools are established early in the fall semester; data is collected by the end of the spring semester; results are analyzed during the summer and early fall.

Assessment Tool – An instrument designed to collect objective data about students' knowledge and skill acquisition. An appropriate outcomes assessment test measures students' ability to integrate a set of individual skills into a meaningful, collective demonstration. Some examples of assessment tools include standardized tests, end-of-program skills tests, student inquiries, common final exams, and comprehensive embedded test items.

Assessment Literacy – The possession of knowledge about the basic principles of sound assessment practice, including terminology, the development and use of assessment methodologies and techniques, and familiarity with standards of quality in assessment; increasingly, it also includes familiarity with alternatives to traditional measurements of learning.

Authentic Assessment – A circumstance in which the behavior that the learning is intended to produce is evaluated and discussed in order to improve learning. The concept of model, practice, feedback, in which students know what excellent performance is and are guided to practice an entire concept rather than bits and pieces in preparation for eventual understanding. A variety of techniques can be employed in authentic assessment.

Benchmark – Student performance standards (the level(s) of student competence in a content area).

Cohort – A group whose progress is followed by means of measurements at different points in time.

Course-embedded Assessment – A method in which evidence of student learning outcomes for the program is obtained from assignments in particular courses in the curriculum.

Course-level Assessment – Assessment to determine the extent to which a specific course is achieving its learning goals.

Course Mapping – A matrix showing the coverage of each program learning outcome in each course. It may also indicate the level of emphasis of each outcome in each course.

Criterion-Referenced Tests – Tests in which the results can be used to determine a student's progress toward mastery of a content area. Performance is compared to an expected level of mastery in a content area rather than to other students' scores. Such tests usually include questions based on what the student was taught and are designed to measure the student's mastery of designated objectives of an instructional program. The "criterion" is the standard of performance established as the passing score for the test. Scores have meaning in terms of what the student knows or can do, rather than how the test-taker compares to a reference or norm group.

Curriculum Map – A matrix showing where each goal and/or learning outcome is covered in each program course.

Direct Assessment – Assessment that gauges student achievement of learning outcomes directly from their work.

Educational Goals – The knowledge, skills, abilities, capacities, attitudes or dispositions students are expected to acquire as a result of completing an academic program. Goals are sometimes treated as synonymous with outcomes, though outcomes are the behavioral results of the goals and are stated in precise operational terms.

Formative Assessment – The assessment of student achievement at different stages of a course or at different stages of a student's academic career. The focus of formative assessment is on the documentation of student development over time. It can also be used to engage students in a process of reflection on their education.

General Education Assessment – Assessment that measures the campus-wide general education competencies agreed upon by the faculty. General education assessment is more holistic in nature than program outcomes assessment because competencies are measured across disciplines, rather than just within a single discipline.

Holistic Scoring – In assessment, assigning a single score based on an overall assessment of performance rather than by scoring or analyzing dimensions or traits individually. The product is considered to be more than the sum of its parts, and so the quality of a final product or performance is evaluated rather than the process or dimensions of performance. A holistic scoring rubric might combine a number of elements on a single scale. Focused holistic scoring may be used to evaluate a limited portion of a learner's performance.

Indirect Assessment – Assessment that deduces student achievement of learning outcomes through the reported perception of learning by students and other agents.

Institutional Assessment – Assessment to determine the extent to which a college or university is achieving its mission.

Learning Outcomes – Operational statements describing specific student behaviors that evidence the acquisition of desired goals in knowledge, skills, abilities, capacities, attitudes or dispositions. Learning outcomes can be usefully thought of as behavioral criteria for determining whether students are achieving the educational goals of a program and, ultimately, whether overall program goals are being successfully met. Outcomes are sometimes treated as synonymous with objectives, though objectives are usually more general statements of what students are expected to achieve in an academic program.

Measurable Criteria – An intended student outcome, or administrative objective, restated in a quantifiable, or measurable, statement. For example: "60% of biology students will complete an experiment/project using scientific methods in fall 2003;" "75% of responding MU students will indicate on a survey in fall 2003 that they have read materials about career opportunities on campus."

Metacognition – The knowledge of one's own thinking processes and strategies, and the ability to consciously reflect and act on that knowledge of cognition to modify those processes and strategies.

Norm – A distribution of scores obtained from a norm group. The norm is the midpoint (or median) of scores or performance of the students in that group. Fifty percent will score above and fifty percent below the norm.

Performance-Based Assessment – Direct, systematic observation and rating of student performance of an educational objective, often an ongoing observation over a period of time, and typically involving the creation of products. The assessment may be a continuing interaction between teacher and student and should ideally be part of the learning process. The assessment should be a real-world performance with relevance to the student and learning community. Assessment of the performance is done using a rubric, or analytic scoring guide, to aid in objectivity. Performance-based assessment is a test of the ability to apply knowledge in a real-life setting or performance of exemplary tasks in the demonstration of intellectual ability.

Portfolio – A systematic and organized collection of a student's work that exhibits to others the direct evidence of a student's efforts, achievements, and progress over a period of time. The collection should involve the student in the selection of its contents, and should include information about the performance criteria, the rubric or criteria for judging merit, and evidence of student self-reflection or evaluation.

Portfolio Assessment – Portfolios may be assessed in a variety of ways. Each piece may be individually scored, or the portfolio might be assessed merely for the presence of required pieces, or a holistic scoring process might be used and an evaluation made on the basis of an overall impression of the student's collected work. It is common for assessors to work together to establish consensus on standards or to ensure greater reliability in the evaluation of student work. Established criteria are often used by reviewers and students involved in the process of evaluating progress and achievement of objectives.

Primary Trait Method – A type of rubric scoring constructed to assess a specific trait, skill, behavior, or format, or the evaluation of the primary impact of a learning process on a designated audience.

Process – A generalizable method of doing something, generally involving steps or operations which are usually ordered and/or interdependent. Process can be evaluated as part of an assessment, as in the example of evaluating a student's performance during prewriting exercises leading up to the final production of an essay or paper.

Program Assessment – Assessment to determine the extent to which students in a departmental program can demonstrate the learning outcomes for the program.

Reliability – An assessment tool's consistency of results over time and with different samples of students.

Rubric – A set of criteria specifying the characteristics of a learning outcome and the levels of achievement in each characteristic.

Self-efficacy – Students' judgment of their own capabilities for a specific learning outcome.

Senior Project – An extensive project planned and carried out during the senior year as the culmination of the undergraduate experience. Senior projects require higher-level thinking skills, problem-solving, and creative thinking. They are often interdisciplinary and may require extensive research. Projects culminate in a presentation of the project to a panel of people, usually faculty and community mentors, and sometimes students, who evaluate the student's work at the end of the year.

Summative Assessment – The assessment of student achievement at the end point of their education or at the end of a course. The focus of summative assessment is on the documentation of student achievement by the end of a course or program. It does not reveal the pathway of development to achieve that endpoint.

Triangulation – The collection of data via multiple methods in order to determine if the results show a consistent outcome.

Validity – The degree to which an assessment measures (a) what is intended, as opposed to (b) what is not intended, or (c) what is unsystematic or unstable.
UNIVERSIDAD MARIANO GALVEZ DE GUATEMALA ESCUELA DE IDIOMAS PROFESORADO EN INGLES TECNICAS DE EVALUACION Licda. Evelyn R. Quiroa de Arévalo, M.A.
WHAT IS ASSESSMENT? SCAVENGER HUNT

WHAT IS A SCAVENGER HUNT?
A scavenger hunt is an assessment idea in which students try to find all the items or complete all the activities on a list. It can be completed in teams or by individuals, and is often timed.
MAIN ON-LINE RESOURCE: http://www.cmu.edu/teaching/assessment/index.html

ACTIVITY OBJECTIVES:
• Analyze the difference between the types of assessment.
• Evaluate the importance of assessment in the teaching and learning process.
• Synthesize ways in which a teacher can assess his or her own teaching.
INSTRUCTIONS:
• Use the link above to complete the following list of questions and activities.
• The completed assignment must be sent via e-mail before Wednesday, August 15th.
• DO NOT COPY AND PASTE.
PART I: ASSESSMENT BASICS
RESOURCE LINK: http://www.cmu.edu/teaching/assessment/basics/index.html
INSTRUCTIONS: Complete and/or answer the following questions or tasks after each instruction.
Why should assessments, learning objectives, and instructional strategies be aligned? What if the components of a course are misaligned? What is the difference between formative and summative assessment? Provide examples. What is the difference between assessment and grading?
PART II: HOW TO ASSESS STUDENTS' PRIOR KNOWLEDGE
RESOURCE LINK: http://www.cmu.edu/teaching/assessment/priorknowledge/index.html
What are Performance-Based Prior Knowledge Assessments? Provide examples. Give examples of your own of appropriately written questions for self-assessment. What are CATs? List and provide a brief description of the CATs suggested on the site. Give 3 examples of CATs you've used in class.
PART III: HOW TO ASSESS STUDENTS' LEARNING AND PERFORMANCE
RESOURCE LINK: http://www.cmu.edu/teaching/assessment/assesslearning/index.html
List a few tips on how to create assignments. List a few tips on how to create exams. Compare and contrast the two previous methods. What is the difference between concept maps and concept tests? Provide an example of each. Provide tips on how to assess student group work, and explain what has worked for you. What benefits are there in using rubrics? Provide an example of a rubric you have used.
PART IV: HOW TO ASSESS YOUR TEACHING
RESOURCE LINK: http://www.cmu.edu/teaching/assessment/assessteaching/index.html
What are early course evaluations? What do they consist of? Explain the importance of classroom observations. What are a few suggestions on how to go about a classroom observation? What is a student focus group?
Rubric (created by Mrs. Frauhiger and Mrs. Johnson, June 2003)
Student Name: ________________

Preparedness
• Beginning: Lacks materials or completed assignments 9 or more times.
• Developing: Lacks materials or completed assignments 4-8 times.
• Accomplished: Usually prepared; lacks materials or completed assignments 1-3 times.
• Exemplary: Always prepared with materials and completed assignments.
Score: ____

Effort
• Beginning: Rarely demonstrates personal best on tasks.
• Developing: Sometimes demonstrates personal best on tasks.
• Accomplished: Usually demonstrates personal best on tasks.
• Exemplary: Always demonstrates personal best on tasks.
Score: ____

Teamwork
• Beginning: Does not participate effectively and/or uses discouraging language toward others.
• Developing: Participates minimally and/or lacks cooperation.
• Accomplished: Demonstrates teamwork by encouraging others and working cooperatively.
• Exemplary: Demonstrates teamwork by encouraging others, working cooperatively, and showing leadership.
Score: ____

Self Discipline
• Beginning: Rarely controls own behavior according to school rules; often requires adult redirection.
• Developing: Sometimes controls own behavior according to school rules; needs some adult redirection.
• Accomplished: Usually controls own behavior according to school rules.
• Exemplary: Always controls own behavior according to school rules.
Score: ____
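Because each column of the rubric corresponds to a level, an analytic total can be computed mechanically. A minimal sketch in Python, assuming the usual 1-4 point mapping (Beginning = 1 through Exemplary = 4); the function and the sample ratings are hypothetical, not part of the original rubric:

```python
# Score the four-criterion rubric above by summing one level point per criterion.
LEVELS = {"Beginning": 1, "Developing": 2, "Accomplished": 3, "Exemplary": 4}
CRITERIA = ["Preparedness", "Effort", "Teamwork", "Self Discipline"]

def score_rubric(ratings):
    """ratings maps each criterion name to a level name; returns the point total."""
    missing = [c for c in CRITERIA if c not in ratings]
    if missing:
        raise ValueError(f"unrated criteria: {missing}")
    return sum(LEVELS[ratings[c]] for c in CRITERIA)

ratings = {"Preparedness": "Accomplished", "Effort": "Exemplary",
           "Teamwork": "Developing", "Self Discipline": "Accomplished"}
print(score_rubric(ratings), "out of", 4 * len(CRITERIA))  # 12 out of 16
```

Summing per-criterion points is what distinguishes this analytic use of a rubric from the holistic scoring described in the glossary above, where a single overall impression is recorded instead.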
Rubric

| Criteria | Level 1 (D, 50-59%) | Level 2 (C, 60-69%) | Level 3 (B, 70-79%) | Level 4 (A, 80-100%) |
| Knowledge & Understanding | | | | |
| Thinking | | | | |
| Communication | | | | |
| Application | | | | |

Notes:
• Not all four categories need be evaluated for each assessment task.
• More than one subject area may be assessed at a time; note subject areas under "criteria".
• Wherever possible, students should be made aware of the criteria prior to beginning the task.
• For more information, read the Achievement Chart section of any revised Ontario curriculum document (for Language, it is found on pp. 17-21).
Assessment, Evaluation & Reporting

R (below 50%)
• Report card description: insufficient achievement of curriculum expectations; additional learning is required before the student will begin to achieve success with this grade's expectations.
• What it might look like in your classroom: major errors or omissions; structured situations for simple purposes.

D – Level 1 (50-59%)
• Report card description: limited level of achievement; achievement falls below the provincial standard.
• Achievement chart: limited effectiveness.
• What it might look like in your classroom: minimal, weak; independence in very structured situations; many errors or omissions; a few simple purposes; very few contexts; beginning.

C – Level 2 (60-69%)
• Report card description: moderate level of achievement; achievement is below, but is approaching, the provincial standard.
• Achievement chart: some effectiveness.
• What it might look like in your classroom: adequate; independently in a few situations; some errors or major omissions; simple purposes and limited contexts; progressing, emerging.

B – Level 3 (70-79%)
• Report card description: high level of achievement; achievement is at the provincial standard.
• Achievement chart: considerable effectiveness.
• What it might look like in your classroom: very good; independently in a number of situations; a few errors; a variety of purposes and contexts.

A – Level 4 (80-100%)
• Report card description: very high level of achievement; achievement [of the grade-level expectations] exceeds the provincial standard.
• Achievement chart: high degree of effectiveness.
• What it might look like in your classroom: thorough, excellent; independently and confidently in a wide variety of situations; almost no errors; a wide variety of purposes and contexts.
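The percentage bands above map mechanically to a grade letter, level, and qualifier. A minimal sketch of that lookup in Python; the function name and return format are illustrative, not part of the Ontario documents:

```python
# Map a percentage mark to the level/grade bands described above.
def level_for_percent(percent):
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if percent < 50:
        return "R", None, "insufficient achievement of curriculum expectations"
    bands = [  # lower bound of band, grade letter, level, qualifier
        (50, "D", 1, "limited"),
        (60, "C", 2, "moderate"),
        (70, "B", 3, "high"),
        (80, "A", 4, "very high"),
    ]
    for lower, grade, level, qualifier in reversed(bands):
        if percent >= lower:
            return grade, level, f"{qualifier} level of achievement"

for mark in (45, 55, 68, 75, 92):
    print(mark, level_for_percent(mark))
```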
Help for Report Card Comments: Achievement in The Ontario Curriculum, Grades 1-8

Sample Verbs & Verb Phrases: analyses, applies, automatically reads, begins to, can explain, communicates, demonstrates, draws conclusions, explains, expresses, expresses & organizes, extends, identifies, identifies and describes, identifies and uses, infers, interprets, makes, makes connections, predicts, reads, recognizes, shows, transfers, uses
Expectations
"All curriculum expectations must be accounted for in instruction, but evaluation focuses on students' achievement of the overall expectations." "A student's achievement of the overall expectations is evaluated on the basis of his or her achievement of related specific expectations." (p. 16 of the Language Curriculum)
Qualifiers
Level 1 – limited
Level 2 – some
Level 3 – considerable
Level 4 – thorough or high degree
Other qualifiers may be used so long as they describe depth rather than frequency.
Descriptors (sample words to identify 'effectiveness'): accuracy, appropriateness, breadth, clarity, depth, effectiveness, flexibility, fluency, logic, precision, relevance, significance
Descriptors: The What and When of Their Use
Many of the descriptors below more specifically define effectiveness. Teachers are encouraged to use the more specific clarifying words in task-specific rubrics.
Descriptor: Effectiveness
Definition: Having a definite or desired effect; having the intended outcome.
Clarifying words or terms: useful, explicit.
Questions to consider: Have you produced the desired or intended result?

Descriptor: Appropriateness
Definition: That which is suitable to the outcome; is to the point.
Clarifying words or terms: apt, applicable, relevant, proper, suitable.
Questions to consider: Have you produced a result that is applicable to the situation? Is there a result that could be more suitable?

Descriptor: Clarity
Definition: That which is without ambiguity (unambiguous).
Clarifying words or terms: clear, elaborate, detail, illustrate, lucidity, define, concise, explicit.
Questions to consider: Could you elaborate further? Could you express that in another way? Could you illustrate what you mean? Could you give me an example?

Descriptor: Accuracy
Definition: Conforming exactly with the truth or with a given standard; lacking errors.
Clarifying words or terms: accurate, verify, correct, true, valid, exact.
Questions to consider: How could we check that? How could we find out if that is true? How could we verify or test that?

Descriptor: Precision
Definition: That which leaves no room for indecision; that which is clearly defined and corresponds to an identifiable notion; that which is performed or which operates in the safest possible manner, with the minimum likelihood of error.
Clarifying words or terms: detail, degree, explicit, specific, exactness.
Questions to consider: Could you be more specific? Could you give more details? Could you be more exact?

Descriptor: Relevance
Definition: Fits a purpose, conforms to reason and common sense, having a bearing on the matter in hand.
Clarifying words or terms: impact, fit, relevant, pertinent, relatedness, connected.
Questions to consider: How does this relate to the problem? How is that connected to the question? How does that bear on the issue?

Descriptor: Depth
Definition: That which explores the very foundations of a thing or idea; goes beyond appearances.
Clarifying words or terms: complexity, sophisticated, layers, levels (of understanding), thorough, intensity, profound.
Questions to consider: What factors make this a difficult problem? What are some of the complexities of this question? What are some of the difficulties we need to deal with? Is that dealing with the most significant factors?

Descriptor: Breadth
Definition: Freedom from limitations (opinion, interests); extent, range.
Clarifying words or terms: exhaustive, comprehensive, elaborate (ideas, perspectives), insight, range, qualities, liberality of views.
Questions to consider: Do we need to consider another point of view? Do we need to look at this from another perspective? Is there another way to look at this question? What would this look like from the point of view of …?

Descriptor: Logic
Definition: Describes events or data that are heavily interdependent; the conclusion depends on the premises. A coherent progression of ideas, an appropriate reasoning process, a sequence in a group of ideas.
Clarifying words or terms: make sense, mutually supporting, internal consistency, reasonable, tied together, order… sequence… flow, organization.
Questions to consider: Does all this really make sense together? Does that follow from what you said? How does that follow? But before you implied this and now you are saying that; how can both be true?

Descriptor: Significance
Definition: Of great importance or consequence.
Clarifying words or terms: so what?, impact, implications, consequences, of importance.
Questions to consider: Is this the most important problem to consider? Is this the central idea to focus on? Which of these facts are most important?

Descriptor: Fluency
Definition: Generate a quantity of ideas; offer many alternatives.
Clarifying words or terms: ease of use, ease of generating ideas, effortless, ready… grace.
Questions to consider: Have many ideas been considered?

Descriptor: Flexibility
Definition: Change direction of thought; vary ideas.
Clarifying words or terms: adaptable, versatile, unconstrained, not rigid.
Questions to consider: Are there other alternatives? Do other factors need to be considered?
Teaching with the Revised Bloom's Taxonomy
Janet Giesen, Faculty Development and Instructional Design Center

Taxonomy = classification: a classification of thinking organized into six cognitive levels of complexity.

Why use Bloom's taxonomy?
• Write and revise learning objectives
• Plan curriculum
• Identify simple to most difficult skills
• Effectively align objectives to assessment techniques and standards
• Incorporate the knowledge to be learned (knowledge dimension) and the cognitive processes used to learn it
• Facilitate questioning (oral language plays an important role within the framework)
Original (nouns), from highest to lowest level: Evaluation, Synthesis, Analysis, Application, Comprehension, Knowledge.
Revised (verbs), from highest to lowest level: Creating, Evaluating, Analyzing, Applying, Understanding, Remembering.
Cognitive Domain: Remembering, Understanding, Applying, Analyzing, Evaluating, Creating
Affective Domain: Receiving, Responding, Valuing, Organizing & conceptualizing, Characterizing by value or value concept
Psychomotor Domain: Imitating, Manipulating, Precisioning, Articulating, Performing
Change in Terms
• Category names changed from nouns to verbs: the taxonomy reflects different forms of thinking, and because thinking is an active process, verbs describe actions while nouns do not.
• Categories were reorganized: Knowledge names a product or outcome of thinking (inappropriate for describing a category of thinking) and is now Remembering; Comprehension is now Understanding; Synthesis is now Creating, to better reflect the nature of thinking described by each category.

Changes in Structure
• Products of thinking are part of the taxonomy.
• Forms of knowledge: factual, conceptual, procedural, metacognitive (thinking about thinking).
• Synthesis (Creating) and Evaluation (Evaluating) were interchanged, since creative thinking is a more complex form of thinking than critical thinking (evaluating).

Changes in Emphasis
• Use: a more authentic tool for curriculum planning, instructional delivery, and assessment.
• Aimed at a broader audience and easily applied to all levels of education.
• The revision emphasizes explanation and description of subcategories.
Remembering
The learner is able to recall, restate and remember learned information.
– Describing, Finding, Identifying, Listing, Retrieving, Naming, Locating, Recognizing
Can students recall information?
Understanding
The student grasps the meaning of information by interpreting and translating what has been learned.
– Classifying, Comparing, Exemplifying, Explaining, Inferring, Interpreting, Paraphrasing, Summarizing
Can students explain ideas or concepts?
Applying
The student makes use of information in a context different from the one in which it was learned.
– Implementing, Carrying out, Using, Executing
Can students use the information in another familiar situation?
Analyzing
The student breaks learned information into its parts to best understand that information.
– Attributing, Comparing, Deconstructing, Finding, Integrating, Organizing, Outlining, Structuring
Can students break information into parts to explore understandings and relationships?
Evaluating
The student makes decisions based on in-depth reflection, criticism and assessment.
– Checking, Critiquing, Detecting, Experimenting, Hypothesising, Judging, Monitoring, Testing
Can students justify a decision or a course of action?
Creating
The student creates new ideas and information using what previously has been learned.
– Constructing, Designing, Devising, Inventing, Making, Planning, Producing
Can students generate new products, ideas, or ways of viewing things?
Questioning . . .
• Lower-level questions operate at the remembering, understanding and lower applying levels. They evaluate students' preparation and comprehension, diagnose students' strengths and weaknesses, and review and/or summarize content.
• Higher-level questions require complex application, analysis, evaluation or creation skills. They encourage students to think more deeply and critically, facilitate problem solving, encourage discussions, and stimulate students to seek information on their own.
(University of Illinois, 2006)
“Remembering” stems
What happened after...? How many...? What is...? Who was it that...? Name... Find the definition of… Describe what happened after… Who spoke to...? Which is true or false...?
(Pohl, 2000)

“Understanding” stems
Explain why… Write in your own words… How would you explain…? Write a brief outline... What do you think could have happened next...? Who do you think...? What was the main idea...? Clarify… Illustrate…
(Pohl, 2000)

“Applying” stems
Explain another instance where… Group by characteristics such as… Which factors would you change if…? What questions would you ask of…? From the information given, develop a set of instructions about…
(Pohl, 2000)
“Analyzing” stems
Which events could not have happened? If ... happened, what might the ending have been? How is ... similar to ...? What do you see as other possible outcomes? Why did ... changes occur? Explain what must have happened when... What are some of the problems of...? Distinguish between... What were some of the motives behind...? What was the turning point? What was the problem with...?
(Pohl, 2000)
“Evaluating” stems
Judge the value of... What do you think about...? Defend your position about... Do you think ... is a good or bad thing? How would you have handled...? What changes to … would you recommend? Do you believe...? How would you feel if...? How effective are...? What are the consequences...? What influence will ... have on our lives? What are the pros and cons of...? Why is ... of value? What are the alternatives? Who will gain and who will lose?
(Pohl, 2000)
“Creating” stems
Design a ... to... Devise a possible solution to... If you had access to all resources, how would you deal with...? Devise your own way to... What would happen if...? How many ways can you...? Create new and unusual uses for... Develop a proposal which would...
(Pohl, 2000)
Summary
Bloom's revised taxonomy:
• Provides a systematic process of thinking and learning
• Assists assessment efforts with an easy-to-use format
• Gives a visual representation of the alignment of goals and objectives with standards, activities, and outcomes
• Helps form challenging questions that help students gain knowledge and critical thinking skills
• Assists in the development of goals, objectives, and lesson plans
References and Resources
Cruz, E. (2003). Bloom's revised taxonomy. In B. Hoffman (Ed.), Encyclopedia of Educational Technology. http://coe.sdsu.edu/eet/Articles/bloomrev/start.htm
Dalton, J. & Smith, D. (1986). Extending children's special abilities: Strategies for primary classrooms. http://www.teachers.ash.org.au/researchskills/dalton.htm
Ferguson, C. (2002). Using the revised Bloom's Taxonomy to plan and deliver team-taught, integrated, thematic units. Theory into Practice, 41(4), 239-244.
Forehand, M. (2008). Bloom's Taxonomy: From emerging perspectives on learning, teaching and technology. http://projects.coe.uga.edu/epltt/index.php?title=Bloom%27s_Taxonomy
Mager, R. E. (1997). Making instruction work or skillbloomers: A step-by-step guide to designing and developing instruction that works (2nd ed.). Atlanta, GA: The Center for Effective Performance, Inc.
Mager, R. E. (1997). Preparing instructional objectives: A critical tool in the development of effective instruction (3rd ed.). Atlanta, GA: The Center for Effective Performance, Inc.
Pohl, M. (2000). Learning to think, thinking to learn: Models and strategies to develop a classroom culture of thinking. Cheltenham, Vic.: Hawker Brownlow.
Tarlinton, D. (2003). Bloom's revised taxonomy. http://www.kurwongbss.qld.edu.au/thinking/Bloom/bloomspres.ppt
University of Illinois, Center for Teaching Excellence (2006). Bloom's taxonomy. www.oir.uiuc.edu/Did/docs/QUESTION/quest1.htm
BLUEPRINT – CLASSWORK

| CONTENT | # OF WEEKS | # OF PERIODS | REMEMBERING | UNDERSTANDING | APPLYING | ANALYSING | EVALUATING | CREATING |
| NOUNS | 1 | 5 | 2 | 1 | 2 | | | |
| VERBS | 2 | 10 | 2 | 2 | | | | |
| ADJECTIVES | 1 | 5 | 2 | | | | | |
| ADVERBS | 3 | 15 | | | | | | |
| PRONOUNS | 1 | 1 | | | | | | |
| TOTAL | 8 WEEKS | | | | | | | |

# OF ITEMS ON TEST: 75
| CONTENT | # OF WEEKS | # OF PERIODS | REMEMBERING | UNDERSTANDING | APPLYING | ANALYSING | EVALUATING | CREATING |
| CELL THEORY | 1 | 5 | 3 | 2 | | | | |
| TYPES OF CELLS | 1 | 5 | 2 | 2 | | | | |
| CELL PARTS | 3 | 15 | 5 | 3 | | | | |
| MITOSIS | 2 | 10 | 2 | 2 | | | | |
| MEIOSIS | 1 | 5 | 1 | 1 | | | | |
| TOTAL | 8 WEEKS | 40 PERIODS | | | | | | |

# OF ITEMS ON TEST: 60

| CONTENT | # OF WEEKS | # OF PERIODS | REMEMBERING | UNDERSTANDING | APPLYING | ANALYSING | EVALUATING | CREATING |
| GREGOR MENDEL | 1 | 5 | 5 | 2 | | | | |
| PEA EXPERIMENT | 1 | 5 | 3 | 2 | | | | |
| TRAITS | 2 | 10 | 2 | 2 | | | | |
| PUNNET SQUARE | 3 | 15 | 2 | 1 | | | | |
| PEDIGREE | 1 | 5 | 1 | | | | | |
| TOTAL | 8 WEEKS | 40 PERIODS | | | | | | |

# OF ITEMS ON TEST: 90
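The blueprints above weight each content area by the time spent teaching it. A minimal sketch of that arithmetic in Python, assuming items are allocated in proportion to instructional periods with largest-remainder rounding; the source tables do not state how their counts were derived, so both rules are assumptions (the period figures echo the cell-biology table above):

```python
# Distribute a test's total items across contents in proportion to periods taught.
def allocate_items(periods_by_content, total_items):
    total_periods = sum(periods_by_content.values())
    raw = {c: total_items * p / total_periods for c, p in periods_by_content.items()}
    items = {c: int(v) for c, v in raw.items()}  # truncate first...
    leftover = total_items - sum(items.values())
    # ...then hand the truncated items to the largest fractional parts.
    for c in sorted(raw, key=lambda c: raw[c] - items[c], reverse=True)[:leftover]:
        items[c] += 1
    return items

periods = {"CELL THEORY": 5, "TYPES OF CELLS": 5, "CELL PARTS": 15,
           "MITOSIS": 10, "MEIOSIS": 5}
print(allocate_items(periods, 60))
# -> {'CELL THEORY': 8, 'TYPES OF CELLS': 8, 'CELL PARTS': 22, 'MITOSIS': 15, 'MEIOSIS': 7}
```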
| Content | # of Weeks | # of Periods | Remembering | Understanding | Applying | Analyzing | Evaluating | Creating |

Total # of items in test: ____
SCHOOL NAME
Name: ____ GRADE: ____ SECTION: ____ LEVEL: ____ SUBJECT: ____ TEACHER: ____

| CONTENTS | REMEMBERING (Item / Value / Series) | UNDERSTANDING (Item / Value / Series) | APPLYING (Item / Value / Series) | ANALYSING (Item / Value / Series) | EVALUATING (Item / Value / Series) | CREATING (Item / Value / Series) | TOTALS |

LEVEL ITEMS: 0 — SERIES POINTS: 0 — TOTAL: 0
TOTALS PER CONTENT: 100 points (0.00)
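A specification grid like the one above balances two totals at once: the item count per level and the points those items contribute. A small sketch of that bookkeeping in Python; the item counts and per-item values here are hypothetical, since the grid in the source is an empty template:

```python
# Check that a test specification's series points add up to the required 100.
spec = {  # level: (number of items, value in points per item)
    "REMEMBERING": (20, 1.0),
    "UNDERSTANDING": (15, 1.0),
    "APPLYING": (10, 2.0),
    "ANALYSING": (10, 2.0),
    "EVALUATING": (5, 2.0),
    "CREATING": (5, 3.0),
}

total_items = sum(n for n, _ in spec.values())
total_points = sum(n * value for n, value in spec.values())
print(f"{total_items} items, {total_points:g} points")
assert total_points == 100, "adjust item values until the test totals 100 points"
```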
Minute Paper
Pose one to two questions in which students identify the most significant things they have learned from a given lecture, discussion, or assignment. Give students one to two minutes to write a response on an index card or paper. Collect their responses and look them over quickly. Their answers can help you to determine if they are successfully identifying what you view as most important.

Muddiest Point
This is similar to the Minute Paper but focuses on areas of confusion. Ask your students, “What was the muddiest point in… (today’s lecture, the reading, the homework)?” Give them one to two minutes to write and collect their responses.

Problem Recognition Tasks
Identify a set of problems that can be solved most effectively by only one of a few methods that you are teaching in the class. Ask students to identify by name which methods best fit which problems without actually solving the problems. This task works best when only one method can be used for each problem.

Documented Problem Solutions
Choose one to three problems and ask students to write down all of the steps they would take in solving them, with an explanation of each step. Consider using this method as an assessment of problem-solving skills at the beginning of the course or as a regular part of the assigned homework.

Directed Paraphrasing
Select an important theory, concept, or argument that students have studied in some depth and identify a real audience to whom your students should be able to explain this material in their own words (e.g., a grants review board, a city council member, a vice president making a related decision). Provide guidelines about the length and purpose of the paraphrased explanation.

Applications Cards
Identify a concept or principle your students are studying and ask students to come up with one to three applications of the principle from everyday experience, current news events, or their knowledge of particular organizations or systems discussed in the course.

Student-Generated Test Questions
A week or two prior to an exam, begin to write general guidelines about the kinds of questions you plan to ask on the exam. Share those guidelines with your students and ask them to write and answer one to two questions like those they expect to see on the exam.

Classroom Opinion Polls
When you believe that your students may have pre-existing opinions about course-related issues, construct a very short two- to four-item questionnaire to help uncover students’ opinions.
Competencies vs. Learning Objectives
Laurita Santacaterina, March 2007

One might be confused about the difference between competencies and learning objectives. From an educational standpoint, competencies can be regarded as the logical building blocks upon which assessments of professional development are based. When competencies are identified, a program can effectively determine the learning objectives that should guide the learners’ progress toward their professional goals. Tying the two together also helps identify what needs to be assessed to verify the quality of the program and its effectiveness in forming competent learners.

Competencies define the applied skills and knowledge that enable people to successfully perform their work, while learning objectives are specific to a course of instruction. Competencies are relevant to an individual’s job responsibilities, roles and capabilities. They are a way to verify that a learner has in fact learned what was intended in the learning objectives. Learning objectives describe what the learner should be able to achieve at the end of a learning period. Learning objectives should be specific, measurable statements written in behavioral terms. In short, objectives say what we want the learners to know, and competencies say how we can be certain they know it.

So how does one write good learning objectives? First, it is important to focus on the learner’s performance rather than the instructor’s performance. For example, an objective written with a learner in mind might read, ‘Describe the differences between traditional and crisis leadership,’ while an objective written from an instructor’s performance focus might read, ‘Discuss traditional and crisis leadership.’ Second, each learning objective should begin with a behavioral verb and should not include more than one general learning outcome. A behavioral verb is a word that denotes an observable action or the creation of an observable product. An example of a behavioral learning objective is, ‘List examples of possible air contaminants resulting from a disaster.’ An example of a non-behavioral learning objective would be an objective beginning with the verb ‘understand,’ because one cannot see ‘understanding.’ Third, it is essential that the learning objectives not only tie back directly to the course content but also represent the most significant information for the learners to grasp.

As mentioned above, one of the key elements of writing a good learning objective is to use behavioral verbs. Here is a short list of behavioral verbs that should serve as a useful guide to preparing good learning objectives.
Classify, Compose, Construct, Decode, Define, Demonstrate, Describe, Diagram, Distinguish, Estimate, Evaluate, Identify, Interpret, Label, List, Locate, Measure, Name, Order, Reproduce, Solve, State (a rule), Translate
Additional Information on Creating Learning Objectives

In 1956, Benjamin Bloom headed a group of educational psychologists who developed a classification of levels of intellectual behavior important in learning. This classification of levels is called Bloom’s Taxonomy, and it is the most widely used system of its kind in education. Bloom’s Taxonomy divides educational objectives into three "domains": affective, psychomotor, and cognitive. Within each domain there are different levels of learning. Traditional education tends to emphasize the skills in the cognitive domain. Bloom identified six levels within the cognitive domain, from the simple recall or recognition of facts (knowledge) at the lowest level, through increasingly more complex and abstract mental levels, to the highest level, classified as evaluation.

Here is a table illustrating the different levels of the cognitive domain as identified by Bloom’s Taxonomy. The lower levels require less in the way of thinking skills; moving down the table, the activities require progressively higher-level thinking skills. Verb examples that represent intellectual activity on each level are also listed. This list can serve as a valuable tool when creating learning objectives, assessments and even competencies.

| Cognitive Domain Levels | Verbs Used for Objectives |
| Knowledge (lowest level) | define, memorize, repeat, record, list, recall, name, relate, collect, label, specify, cite, enumerate, tell, recount |
| Comprehension | restate, summarize, discuss, describe, recognize, explain, express, identify, locate, report, retell, review, translate |
| Application | exhibit, solve, interview, simulate, apply, employ, use, demonstrate, dramatize, practice, illustrate, operate, calculate, show, experiment |
| Analysis (higher levels) | interpret, classify, analyze, arrange, differentiate, group, compare, organize, contrast, examine, scrutinize, survey, categorize, dissect, probe, inventory, investigate, question, discover, test, inquire, distinguish, detect, diagram, inspect |
| Synthesis | compose, set up, plan, prepare, propose, imagine, produce, hypothesize, invent, incorporate, develop, generalize, design, originate, formulate, predict, arrange, contrive, assemble, concoct, construct, systematize, create |
| Evaluation | judge, assess, decide, measure, appraise, estimate, evaluate, infer, rate, deduce, compare, score, value, predict, revise, choose, conclude, recommend, select, determine, criticize |
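Since each level has a characteristic verb list, a rough first-pass check of a drafted objective can be automated by looking up its leading behavioral verb. A minimal sketch in Python using a trimmed subset of the verbs above; the function name is illustrative, a few verbs appear at more than one level in the full table (e.g., 'compare', 'predict'), and here the first match simply wins:

```python
# Suggest the Bloom cognitive level implied by an objective's leading verb.
BLOOM_VERBS = {
    "Knowledge": {"define", "memorize", "repeat", "list", "recall", "name", "cite"},
    "Comprehension": {"restate", "summarize", "discuss", "describe", "explain", "identify"},
    "Application": {"solve", "apply", "employ", "use", "demonstrate", "illustrate"},
    "Analysis": {"interpret", "classify", "analyze", "compare", "contrast", "distinguish"},
    "Synthesis": {"compose", "plan", "propose", "design", "formulate", "create"},
    "Evaluation": {"judge", "assess", "appraise", "evaluate", "rate", "recommend"},
}

def bloom_level(objective):
    verb = objective.split()[0].lower()
    for level, verbs in BLOOM_VERBS.items():  # lowest level checked first
        if verb in verbs:
            return level
    return "unknown - check that the objective starts with a behavioral verb"

print(bloom_level("List examples of possible air contaminants resulting from a disaster"))
# -> Knowledge
```

A lookup like this only flags the verb's usual level; judging whether the whole objective really demands that level of thinking still takes human review.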
More on Competencies vs. Learning Objectives

All objectives and competencies should be written in specific, measurable and behavioral terms. However, as mentioned earlier, competencies and objectives are different. Their difference lies in the level at which they are written. Competencies are more complex (higher level) than learning objectives. One competency generally requires a multitude of applied skills and knowledge, while learning objectives are more specific and generally relate to one learning outcome. In short, one learning objective does not equal one competency. Here are some examples of competencies and learning objectives relating to the competency:

Example 1
Competency: Utilizes appropriate methods for interacting sensitively, effectively, and professionally with persons from diverse cultural, socioeconomic, educational, racial, ethnic and professional backgrounds, and persons of all ages and lifestyle preferences (competency from: Council on Linkages Between Academia and Public Health Practice)
Learning objectives from a course that relate to the above competency:
• Describe the demographic trends and epidemiological trends related to diverse populations in the United States and abroad
• Compare and contrast diversity and cultural competency in the public health context
• Identify a framework to design culturally competent public health care services for diverse populations

In example 1, one can see that the person would need to achieve several objectives in order to achieve the competency. It is also important to keep in mind that this list of objectives is not a complete list of objectives needed to reach competency. The purpose of this example is to simply demonstrate how a learning objective can be written to tie back to a competency. For example, in order to “utilize appropriate methods for interacting sensitively, effectively…” it would be important for the person to be able to “describe the demographic trends and epidemiological trends related to…” It would also be important for the person to be able to “identify a framework to design culturally competent public health care…” etc.
Example 2
Competency: Develops and adapts approaches to problems that take into account cultural differences (competency from: Council on Linkages Between Academia and Public Health Practice)
Learning objectives from a course that relate to the above competency:
• Develop an action plan at the level of the individual, group, and organization by strengthening diagnostic skills
• Use data collected in the diagnostic phase to develop an action plan at the individual, group, and organization level
• Present individual, group, and organization action plans for feedback
Example 2 also illustrates how one competency would require several objectives to successfully achieve this competency. Again, it is also important to keep in mind that this list of objectives is not a complete list of objectives needed to reach competency. The purpose of this example is to once more demonstrate how a learning objective can be written to tie back to a competency. For example, in order to “develop and adapt approaches to problems…” it would be important for the person to be able to “develop an action plan at the level of the individual, group…” It would also be important for the person to be able to “use data collected in the diagnostic phase…” etc.
In summary, competencies define the applied skills and knowledge that enable people to successfully perform their work while learning objectives are specific to a course of instruction. Both should be written in specific, measurable and behavioral terms. Competencies are generally written at a higher level than a learning objective because they require more complex levels of performance. Since traditional education tends to emphasize skills in the cognitive domain, one can use the six different levels of Bloom’s Taxonomy as a valuable tool when creating both competencies and learning objectives.
Reference: Bloom, Benjamin S. and David R. Krathwohl. (1956). Taxonomy of educational objectives: The classification of educational goals, by a committee of college and university examiners. Handbook 1: Cognitive domain. New York, Longmans.
References
Santacaterina, Laurita (2007). Competencies vs. Learning Objectives. Retrieved November 14, 2009 from http://www.sph.tulane.edu/Files/Competencies%20vs%20Learning%20Objectives_with%20examples.doc
Doc. 1: Assessment is the act of gathering information on a daily basis in order to understand individual students' learning and needs.

Doc. 2: Classroom assessment is a method of inquiry into the effects of teaching on learning. It involves the use of techniques and instruments designed to give instructors ongoing feedback about the effect their teaching is having on the level and quality of student learning; this feedback then informs their subsequent instructional decisions.

Doc. 3: Assessment is defined as data-gathering strategies, analyses, and reporting processes that provide information that can be used to determine whether or not intended outcomes are being achieved.

Final: Assessment is the gathering of information through strategies, techniques, instruments, and feedback applied to the teaching and learning process, aimed at stimulating growth through constant improvement, so that instructors can reflect and determine whether the goals established are being achieved.
Doc. 1: Evaluation is the culminating act of interpreting the information gathered for the purpose of making decisions or judgments about students' learning and needs, often at reporting time.

Doc. 2: Evaluation, in contrast, is used for summative purposes to give an overview of a particular instructor’s teaching in a particular course and setting.

Doc. 3: Evaluation is the culminating act of interpreting the information gathered for the purpose of making decisions or judgments about students' learning and needs, often at reporting time.

Final: Evaluation is the culminating, summative act of interpreting the results gathered in order to make decisions or judgments about students’ learning, needs and outcomes, as well as about the instructor’s teaching methods.
[Venn diagram: differences and similarities between assessment and evaluation]
YORK UNIVERSITY
SENATE COMMITTEE ON TEACHING AND LEARNING’S GUIDE TO TEACHING ASSESSMENT & EVALUATION

INTRODUCTION
NEED FOR THE GUIDE
The Teaching Assessment and Evaluation Guide provides instructors with starting-points for reflecting on their teaching, and with advice on how to gather feedback on their teaching practices and effectiveness as part of a systematic program of teaching development. As well, the Guide provides guidance on how teaching might be fairly and effectively evaluated, which characteristics of teaching might be considered, and which evaluation techniques are best suited for different purposes. The Teaching Assessment and Evaluation Guide is a companion to the Teaching Documentation Guide (1993), also prepared by the Senate Committee on Teaching and Learning (SCOTL). The Documentation Guide (available at the Centre for the Support of Teaching and on the SCOTL website) aims to provide instructors with advice and concrete suggestions on how to document the variety and complexity of their teaching contributions.
Teaching is a complex and personal activity that is best assessed and evaluated using multiple techniques and broadly-based criteria. Assessment for formative purposes is designed to stimulate growth, change and improvement in teaching through reflective practice. Evaluation, in contrast, is used for summative purposes to give an overview of a particular instructor’s teaching in a particular course and setting. Informed judgements on teaching effectiveness can best be made when both assessment and evaluation are conducted, using several techniques to elicit information from various perspectives on different characteristics of teaching. There is no one complete source for information on one’s teaching, and no single technique for gathering it. Moreover, the techniques need to be sensitive to the particular teaching assignment of the instructor being assessed or evaluated, as well as the context in which the teaching takes place. If multiple perspectives are represented and different techniques used, the process will be more valued, the conclusions reached will be more credible, and consequently more valuable to the individual being assessed or evaluated.
CONTENTS
• Introduction
• Need for the Guide
• What is Quality Teaching?
• Formative Assessment
• Summative Evaluation
• Overview of Assessment and Evaluation Strategies:
  1. Teaching dossiers
  2. Student ratings
  3. Peer observations
  4. Letters & individual interviews
  5. Course portfolios
  6. Classroom assessment
• Classroom Assessment Techniques
Current practices at York University are varied. In most departments and units, teaching is systematically evaluated, primarily for summative purposes. Individual instructors are free, if they wish, to use the data so gathered for formative purposes, or they may contact the Centre for the Support of Teaching which provides feedback and teaching analysis aimed at growth, development and improvement. Without denying the value of summative teaching evaluation, the main purpose of this Guide is to encourage committees and individuals to engage in reflective practice through the ongoing assessment of teaching for formative purposes and for professional development. Research indicates that such practice leads to heightened enthusiasm for teaching, and improvement in teaching and learning, both of which are linked to faculty vitality.
The Teaching Assessment and Evaluation Guide © is published by the Senate Committee on Teaching and Learning (SCOTL), York University, www.yorku.ca/secretariat/senate/committees/scotl/ (revised January 2002)
WHAT IS QUALITY TEACHING?

All assessment and evaluation techniques contain implicit assumptions about the characteristics that constitute quality teaching. These assumptions should be made explicit and indeed should become part of the evaluation process itself in a manner which recognizes instructors’ rights to be evaluated within the context of their own teaching philosophies and goals. First and foremost then, “teaching is not right or wrong, good or bad, effective or ineffective in any absolute, fixed or determined sense.”¹ Instructors emphasize different domains of learning (affective, cognitive, psychomotor, etc.) and employ different theories of education and teaching methodologies (anti-racist, constructivist, critical, feminist, humanistic, etc.)². They encourage learning in different sites (classrooms, field locations, laboratories, seminar rooms, studios, virtual classrooms, etc.). They use different instructional strategies and formats (using case studies, coaching, demonstrating, facilitating discussions, lecturing, problem-based learning, online delivery, etc.), and they do this while recognizing that students have diverse backgrounds and levels of preparedness. In one situation, instructors may see their role as transmitting factual information, and in another as facilitating discussion and promoting critical thinking. As variable and diverse as quality teaching might be, generalizations may nevertheless be made about its basic characteristics as described in the accompanying text box.

QUALITY TEACHING
Put succinctly, quality teaching is that activity which brings about the most productive and beneficial learning experience for students and promotes their development as learners. This experience may include such aspects as:
• improved comprehension of and ability to use the ideas introduced in the course;
• change in outlook, attitude and enthusiasm towards the discipline and its place in the academic endeavour;
• intellectual growth; and
• improvement in specific skills such as critical reading and writing, oral communication, analysis, synthesis, abstraction, and generalization.

The criteria for evaluating teaching vary between disciplines and within disciplines, and should take into consideration the level of the course, the instructor’s objectives and style, and the teaching methodology employed. Nonetheless, the primary criterion must be improved student learning. Research indicates that students, faculty and administrators alike agree that quality teaching:
• establishes a positive learning environment;
• motivates student engagement;
• provides appropriate challenges;
• is responsive to students’ learning needs; and
• is fair in evaluating their learning.

Concretely, indicators of quality teaching can include:
• effective choice of materials;
• organization of subject matter and course;
• effective communication skills;
• knowledge of and enthusiasm for the subject matter and teaching;
• availability to students; and
• responsiveness to student concerns and opinions.

Some characteristics are more easily measured than others. Furthermore, since instructors are individuals and teaching styles are personal, it is all the more important to recognize that not everyone will display the same patterns and strengths.

ASSESSMENT OF TEACHING FOR FORMATIVE PURPOSES

Formative assessment of teaching can be carried out at many points during an instructional period, in the classroom or virtual environment, to compare the perceptions of the instructor with those of the students, and to identify gaps between what has been taught and what students have learned. The purpose of assessment is for instructors to find out what changes they might make in teaching methods or style, course organization or content, evaluation and grading procedures, etc., in order to improve student learning. Assessment is initiated by the instructor and information and feedback can be solicited from many sources (for example, self, students, colleagues, consultants) using a variety of instruments (surveys, on-line forms, etc. – see classroom assessment below). The data gathered are seen only by the instructor and, if desired, a consultant, and form the basis for ongoing improvement and development.
SUMMATIVE EVALUATION

Summative evaluation, by contrast, is usually conducted at the end of a particular course or at specific points in an instructor’s career. The purpose is to form a judgment about the effectiveness of a course and/or an instructor. The judgment may be used for tenure and promotion decisions, to reward success in the form of teaching awards or merit pay, or to enable departments to make informed decisions about changes to individual courses, the curriculum or teaching assignments. At most universities, summative evaluation includes the results of teaching evaluations regularly scheduled at the end of academic terms. However, to ensure that summative evaluation is both comprehensive and representative, it should include a variety of evaluation strategies, among them:
• letters from individual students commenting on the effectiveness of the instructor’s teaching, the quality of the learning experience, and the impact of both on their academic progress;
• assessments by peers based on classroom visits;
• samples and critical reviews of contributions to course and curriculum development, as well as of contributions to scholarship on teaching; and
• evidence of exceptional achievements and contributions to teaching in the form of awards, and committee work.

One’s teaching dossier (see below) is an ideal format for presenting these types of evaluation as a cumulative and longitudinal record of one’s teaching.

Important note: It is crucial that the two processes – summative evaluation and formative assessment – be kept strictly apart if the formative assessment of teaching is to be effective and achieve its purpose. This means that the information gathered in a program of formative assessment should not be used in summative evaluation unless volunteered by instructors themselves. It also means that persons who are or have been involved in assisting instructors to improve their teaching should not be asked to provide information for summative evaluation purposes.

______
1. Mary Ellen Weimer (1990). Improving College Teaching (CA: Jossey-Bass Publishers), 202.
2. Adapted from George L. Geis (1977), “Evaluation: definitions, problems and strategies,” in Chris Knapper et al., Eds., Teaching is Important (Toronto: Clarke Irwin in association with CAUT).
OVERVIEW OF STRATEGIES FOR ASSESSING AND EVALUATING QUALITY TEACHING AND STUDENT LEARNING

This section describes six strategies that teachers may use to assess and evaluate the quality of their teaching and its impact on student learning: 1) teaching dossiers; 2) student ratings; 3) peer observations; 4) letters and individual interviews; 5) course portfolios; and 6) classroom assessment. These descriptions draw on current research in the field (available at the Centre for the Support of Teaching, 111 Central Square, www.yorku.ca/cst) and practices and procedures at other universities in Canada and abroad. All evaluation and assessment efforts should use a combination of strategies to take advantage of their inherent strengths as well as their individual limitations.
1. TEACHING DOSSIERS

A teaching dossier or portfolio is a factual description of an instructor’s teaching achievements and contains documentation that collectively suggests the scope and quality of his or her teaching. Dossiers can be used to present evidence about teaching quality for evaluative purposes such as T&P submissions, teaching award nominations, etc., as they can provide a useful context for analyzing other forms of teaching evaluation. Alternatively, dossiers can provide the framework for a systematic program of reflective analysis and peer collaboration leading to improvement of teaching and student learning. For further information on how to prepare a teaching dossier, please consult SCOTL’s Teaching Documentation Guide (available at the Centre for the Support of Teaching and from the SCOTL website).

To focus on:
§ Appraisal of instructor’s teaching and learning context
§ Soundness of instructor’s approach to teaching and learning
§ Coherence of teaching objectives and strategies
§ Vigour of professional development, contributions and accomplishments in the area of teaching

Benefits: Dossiers provide an opportunity for instructors to articulate their teaching philosophy, review their teaching goals and objectives, assess the effectiveness of their classroom practice and the strategies they use to animate their pedagogical values, and identify areas of strength and opportunities for improvement. They also highlight an instructor’s range of responsibilities, accomplishments, and contributions to teaching and learning more generally within the department, university and/or scholarly community.

Limitations: It is important to note that dossiers are not meant to be an exhaustive compilation of all the documents and materials that bear on an instructor’s teaching performance; rather they should present a selection of information organized in a way that gives a comprehensive and accurate summary of teaching activities and effectiveness.

_______
For further information on teaching dossiers see: Teaching Documentation Guide (1993, Senate Committee on Teaching and Learning). Peter Seldin “Self-Evaluation: What Works? What Doesn’t?” and John Zubizarreta “Evaluating Teaching through Portfolios” in Seldin and Associates (1999). Changing Practices in Evaluating Teaching: A Practical Guide to Improved Faculty Performance and Promotion/Tenure Decisions (MA: Anker Press).
2. STUDENT RATINGS OF TEACHING

Student ratings of teaching or student evaluations are the most commonly used source of data for both summative and formative information. In many academic units they are mandatory, and in several units they are also standardized. For purposes such as tenure and promotion, data should be obtained over time and across courses using a limited number of global or summary type questions. Such data will provide a cumulative record and enable the detection of patterns of teaching development.¹ Information obtained by means of student ratings can also be used by individual instructors to improve the course in future years, and to identify areas of strength and weakness in their teaching by comparison with those teaching similar courses. Longer and more focussed questionnaires are also useful in a program of formative evaluation when designed and administered by an instructor during a course.

To focus on:
§ Effectiveness of instructor
§ Impact of instruction on student learning
§ Perceived value of the course to the student
§ Preparation and organization
§ Knowledge of subject matter and ability to stimulate interest in the course
§ Clarity and understandability
§ Ability to establish rapport and encourage discussion within the classroom
§ Sensitivity to and concern with students’ level of understanding and progress

Benefits: The use of a mandatory, standardized questionnaire puts all teaching evaluations on a common footing, and facilitates comparisons between teachers, courses and academic units. The data gathered also serve the purpose of assessing whether the educational goals of the unit are being met. Structured questionnaires are particularly appropriate where there are relatively large numbers of students involved, and where there are either several sections of a single course, or several courses with similar teaching objectives using similar teaching approaches.

Questionnaires are relatively economical to administer, summarize and interpret. Provided that students are asked to comment only on items with which they have direct experience, student responses to questionnaires have been found to be valid. While questionnaire forms with open-ended questions are more expensive to administer, they often provide more reliable and useful sources of information in small classes and for the tenure and promotion process. Also, open-ended questions provide insight into the numerical ratings, and provide pertinent information for course revision.

Limitations: While students’ perceptions provide valuable feedback to instructors, recent research has identified specific areas of teaching quality on which students are not able to make informed judgments. These include the appropriateness of course goals, content, design, materials, and evaluation of student work.³ Thus, the use of a variety of techniques as described elsewhere in this document can help to address the gaps and shortcomings in the student rating data. Further, recent research indicates that care should be taken to control for possible biases based on gender, race, discipline, and teaching approach, particularly for those using non-traditional teaching methods and curriculum. Likewise, ratings can be affected by factors for which it is difficult to control, such as student motivation, complexity of material, level of course, and class size. Care should be taken, therefore, to create an appropriate context for interpreting the data in light of other sources of data and in comparison with other courses. One way to ensure fairness and equity is to ask students to identify the strengths of the instructor’s approach as well as weaknesses, and to ask for specific suggestions for improvement.

Teachers have such different perspectives, approaches, and objectives that a standardized questionnaire may not adequately or fairly compare their performance. For example, the implicit assumption behind the design of many evaluation forms is that the primary mode of instruction is the lecture method. Such a form will be inadequate in evaluating the performance of instructors who use different teaching methods, for example collaborative learning. One way to overcome this limitation and to tailor the questionnaire to the objectives and approaches of a specific course or instructor is to design an evaluation form with a mandatory core set of questions and additional space for inserting questions chosen by the instructor.

Note: The Centre for the Support of Teaching has sample teaching evaluation forms from numerous Faculties and departments, as well as books and articles which are helpful resources for individuals and committees interested in developing questionnaires. In addition, web resources are posted on the SCOTL website.

_____
For further information on student ratings of teaching see:
1. Cashin, William (1995), “Student ratings of teaching: The research revisited.” Idea Paper, Number 32 (Kansas State University, Centre for Faculty Development).
2. See, for example, The Teaching Professor, Vol. 8, No. 4, 3-4.
3. See also Theall, Michael and Franklin, Jennifer, Eds. (1990). Student Ratings of Instruction: Issues for Improving Practice, New Directions in Teaching and Learning, No. 43 (CA: Jossey-Bass Inc.).
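In practice, tallying standardized questionnaire results is straightforward. The sketch below is a minimal, purely illustrative Python example (it is not part of the Guide); the file name, column layout, and 1–5 response scale are assumptions made for the illustration:

```python
# Illustrative sketch only: summarize Likert-scale student ratings per
# questionnaire item, using only the standard library. Assumes a CSV
# where each column is a question and each row one student's answers.
import csv
from collections import defaultdict
from statistics import mean

def summarize_ratings(path: str) -> None:
    """Print the mean and response count for each questionnaire item."""
    scores = defaultdict(list)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            for question, answer in row.items():
                if answer.strip().isdigit():  # skip blank or free-text answers
                    scores[question].append(int(answer))
    for question, values in scores.items():
        print(f"{question}: mean {mean(values):.2f} (n={len(values)})")

# summarize_ratings("ratings_fall_term.csv")  # hypothetical file name
```

As the Limitations above caution, raw means like these should only be compared across courses alongside context such as class size, response rate, and course level.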
3. PEER OBSERVATIONS

Peer observations offer critical insights into an instructor’s performance, complementing student ratings and other forms of evaluation to contribute to a fuller and more accurate representation of overall teaching quality. Research indicates that colleagues are in the best position to judge specific dimensions of teaching quality, including the goals, content, design and organization of the course, the methods and materials used in delivery, and evaluation of student work.

To focus on:
§ Quality of the learning environment (labs, lecture halls, online discussion groups, seminars, studios, etc.)
§ Level of student engagement
§ Clarity of presentation, and ability to convey course content in a variety of ways
§ Range of instructional methods and how they support student understanding
§ Student-instructor rapport
§ Overall effectiveness

Peer observation may be carried out for both summative and formative purposes. For summative evaluation, it is recommended that prior consensus be reached about what constitutes quality teaching within the discipline, what the observers will be looking for, and the process for carrying out and recording the observations. To ensure that a full picture of an instructor’s strengths and weaknesses is obtained, some observers find checklists useful and some departments may choose to designate the responsibility of making classroom observations to a committee. Given the range of activities in a class, some observers find it helpful to focus on specific aspects of the teaching and learning that takes place. It is also advisable that more than one colleague be involved, and that more than one observation take place by each colleague. This will counteract observer bias towards a particular teaching approach and the possibility that an observation takes place on an unusually bad day. These precautions also provide for greater objectivity and reliability of the results.

Peer observation is especially useful for formative evaluation. In this case, it is important that the results of the observations remain confidential and not be used for summative evaluation. The process of observation in this case should take place over time, allowing the instructor to implement changes, practice improvements and obtain feedback on whether progress has been made. It may also include video-taping the instructor’s class. This process is particularly helpful to faculty who are experimenting with new teaching methods.

Before an observation, it is important that the observer and instructor meet to discuss the instructor’s teaching philosophy, the specific objectives and the strategies that will be employed during the session to be observed, and the materials relevant to the course: syllabus, assignments, online course components, etc. Likewise, discussions of the criteria for evaluation and how the observations will take place can help to clarify expectations and procedures. A post-observation meeting allows an opportunity for constructive feedback and assistance in the development of a plan for improvement.

A particularly valuable form of observation for formative purposes is peer-pairing. With this technique, two instructors provide each other with feedback on their teaching on a rotating basis, each evaluating the other for a period of time (anywhere between 2 weeks and a full year). Each learns from the other and may learn as much in the observing role as when being observed. Full guidelines for using this technique, as well as advice and assistance in establishing a peer-pairing relationship, are available from the Centre for the Support of Teaching.

Benefits: Peer observations can complete the picture of an instructor’s teaching obtained through other methods of evaluation. As well, observations are an important supplement to contextualize variations in student ratings in situations, for example, where an instructor’s teaching is controversial because experimental or non-traditional teaching methods are being used, or where other unique situations exist within the learning environment. Colleagues are better able than students to comment upon the level of difficulty of the material, knowledge of subject matter and integration of topics, and they can place the teaching within a wider context and suggest alternative teaching formats and ways of communicating the material.

Limitations: There are several limitations to using peer observations for summative purposes. First, unless safeguards are put in place to control for sources of bias, conflicting definitions of teaching quality, and idiosyncrasies in practice, inequities can result in how classroom observations are done.¹ For example, instructors tend to find observations threatening and they and their students may behave differently when there is an observer present. Also, there is evidence to suggest that peers may be relatively generous evaluators in some instances. A second limitation is that it is costly in terms of faculty time since a number of observations are necessary to ensure the reliability and validity of findings. Since observers vary in their definitions of quality teaching and some tact is required in providing feedback on observations, it is desirable that observers receive training before becoming involved in providing formative evaluation. The approaches described above can help to minimize these inequities and improve the effectiveness of peer observation. Finally, to protect the integrity of this technique for both formative and summative purposes, it is critical that observations for personnel decisions be kept strictly separate from evaluations for teaching improvement.

______
For further information on colleague evaluation of teaching see:
1. DeZure, Deborah. “Evaluating teaching through peer classroom observation,” in Peter Seldin and Associates (1999). Changing Practices in Evaluating Teaching: A Practical Guide to Improved Faculty Performance and Promotion/Tenure Decisions (MA: Anker Press).

4. LETTERS AND INDIVIDUAL INTERVIEWS

Letters and/or individual interviews may be used in teaching award nominations, tenure and promotion files, etc. to obtain greater depth of information for the purpose of improving teaching, or for providing details and examples of an instructor’s impact on students.

To focus on:
§ Effectiveness of instructor through detailed reflection
§ Impact of instruction on student learning and motivation over the longer term
§ Preparation and organization
§ Clarity and understandability
§ Ability to establish rapport and encourage discussion
§ Sensitivity to and concern with students’ level of understanding and progress

Benefits: Interviews and letters elicit information not readily available through student ratings or other forms of evaluation. Insights, success stories, and thoughtful analyses are often the outcomes of an interview or a request for written impressions of an instructor’s teaching. Students who are reluctant to give information on a rating scale or in written form often respond well to a skilled, probing interviewer.

Limitations: The disadvantage of letters is that the response rate can be low. The major disadvantage of interviews is time. Interviews can take approximately one hour to conduct, about 30 minutes to arrange, and another block of time for coding and interpretation. A structured interview schedule should be used to eliminate the bias that may result when an untrained interviewer asks questions randomly of different students.

5. COURSE PORTFOLIOS

A course portfolio is a variant on the teaching dossier and is the product of focussed inquiry into the learning by students in a particular course. It represents the specific aims and work of the instructor and is structured to explain what, how and why students learn in a class. It generally comprises four main components: 1) a statement of the aims and pedagogical strategies of the course and the relationship between the method and learning outcomes; 2) an analysis of student learning based on key assignments and learning activities to advance course goals; 3) an analysis of student feedback based on classroom assessment techniques; and 4) a summary of the strengths of the course in terms of students’ learning, and critical reflection on how the course goals were realised, changed or unmet. The final analysis leads to ideas about what to change in order to enhance student learning, thinking and development the next time the course is taught.¹

To focus on:
§ Appropriateness of course goals and objectives
§ Quality of instructional materials and assignments
§ Coherence of course organization, teaching strategies and modes of delivery
§ Comprehensiveness of methods for appraising student achievement
§ Level of student learning and contribution of teaching to students’ progress
§ Innovations in teaching and learning

Course portfolios have been described as being closely analogous to a scholarly project, in that: “a course, like a project, begins with significant goals and intentions, which are enacted in appropriate ways and lead to relevant results in the form of student learning. Teaching, like a research project, is expected to shed light on the question at hand and the issues that shape it; the methods used to complete the project should be congruent with the outcomes sought. The course portfolio has the distinct advantage of representing – by encompassing and connecting planning, implementation and results – the intellectual integrity of teaching as reflected in a single course.”²
Benefits: The focus on a specific course allows the portfolio to demonstrate student understanding as an index of successful teaching. For instructors, course portfolios provide a framework for critical reflection and continuous improvement of teaching, and deep insight into how their teaching contributes to students’ knowledge and skills.
For departments, they can highlight cohesion and gaps within the curriculum and enable continuity within the course over time and as different instructional technologies are incorporated. As well, course portfolios can collectively promote course articulation and provide means of assessing the quality of a curriculum and pedagogical approaches in relation to the overall goals and outcomes of a program of study.
Limitations: Because course portfolios focus on one course, they do not reflect the full range of an instructor’s accomplishments, responsibilities, and contributions (such as curriculum development and work with graduate students) that would be documented in a teaching dossier. Also, course portfolios take time to prepare and evaluate, and instructors should not be expected to build a portfolio for every course taught; rather they should concentrate on those courses for which they have the strongest interest or in which they invest the majority of their energy, imagination and time.³

______
For further information on course portfolios see:
1. Cerbin, William (1994), “The course portfolio as a tool for continuous improvement of teaching and learning.” Journal on Excellence in College Teaching, 5(1), 95-105.
2. Cambridge, Barbara. “The Teaching Initiative: The course portfolio and the teaching portfolio.” American Association for Higher Education.
3. Cutler, William (1997). The history course portfolio. Perspectives 35 (8): 17-20.

6. CLASSROOM ASSESSMENT*

Classroom assessment is a method of inquiry into the effects of teaching on learning. It involves the use of techniques and instruments designed to give instructors ongoing feedback about the effect their teaching is having on the level and quality of student learning; this feedback then informs their subsequent instructional decisions. Unlike tests and quizzes, classroom assessment can be used in a timely way to help instructors identify gaps between what they teach and what students learn and enable them to adjust their teaching to make learning more efficient and effective. The information should always be shared with students to help them improve their own learning strategies and become more successful self-directed learners.

To focus on:
§ Effectiveness of teaching on learning
§ Constructive feedback on teaching strategies and classroom/online practices
§ Information on what students are learning and level of understanding of material
§ Quality of student learning and engagement
§ Feedback on course design

There are a variety of instruments for classroom assessment, either in class or electronically, such as one-minute papers, one-sentence summaries, critical incident questionnaires, focus groups, and mid-year mini surveys (see the sampling of techniques below). Generally, the instruments are created, administered, and results analysed by the instructor to focus on specific aspects of teaching and student learning. Although the instructor is not obligated to share the results of classroom assessment beyond the course, the results may usefully inform other strategies for evaluating teaching quality.

Classroom assessment can be integrated into an instructor’s teaching in a graduated way, starting out with a simple assessment technique in one class involving five to ten minutes of class time, less than an hour for analysis of the results, and a few minutes during a subsequent class to let students know what was learned from the assessment and how the instructor and students can use that information to improve learning. After conducting one or two quick assessments, the instructor can decide whether this approach is worth further investment of time and energy.

Benefits: Classroom assessment encourages instructors to become monitors of their own performance and promotes reflective practice. In addition, its use can prompt discussion among colleagues about their effectiveness, and lead to new and better techniques for eliciting constructive feedback from students on teaching and learning.

Limitations: As with student ratings, the act of soliciting frank, in-the-moment feedback may elicit critical comments on the instructor and his/her approach to teaching. However, it is important to balance the positive and negative comments and try to link negative commentary to issues of student learning. New users of classroom assessment techniques might find it helpful to discuss the critical comments with an experienced colleague.

______
Adapted from Core: York’s newsletter on university teaching (2000) Vol. 9, No. 3.
* “Classroom Assessment” is a term used widely by scholars in higher education; it is meant to include all learning environments. For examples, see the references in the sampling of techniques below.
A SAMPLING OF CLASSROOM ASSESSMENT TECHNIQUES

ONE-MINUTE PAPER

The One-Minute Paper, or a brief reflection, is a technique that is used to provide instructors with feedback on what students are learning in a particular class. It may be introduced in small seminars or in large lectures, in first year courses or upper year courses, or electronically using software that ensures student anonymity. The One-Minute Paper asks students to respond anonymously to the following questions:

One-Minute Paper
1. What is the most important thing you learned today?
2. What question remains uppermost in your mind?

Depending upon the structure and format of the learning environment, the One-Minute Paper may be used in a variety of ways:
• During a lecture, to break up the period into smaller segments enabling students to reflect on the material just covered.
• At the end of a class, to inform your planning for the next session.
• In a course comprising lectures and tutorials, the information gleaned can be passed along to tutorial leaders giving them advance notice of issues that they may wish to explore with students.

THE MUDDIEST POINT

An adaptation of the One-Minute Paper, the Muddiest Point is particularly useful in gauging how well students understand the course material. The Muddiest Point asks students:

What was the ‘muddiest point’ for you today?

Like the One-Minute Paper, use of the Muddiest Point can helpfully inform your planning for the next session, and signal issues that it may be useful to explore.

ONE SENTENCE SUMMARIES

One Sentence Summaries can be used to find out how concisely, completely and creatively students can summarize a given topic within the grammatical constraints of a single sentence. It is also effective for helping students break down material into smaller units that are more easily recalled. This strategy is most effective for any material that can be represented in declarative form – historical events, story lines, chemical reactions and mechanical processes.

The One Sentence Summary technique involves asking students to consider the topic you are discussing in terms of Who Does/Did What to Whom, How, When, Where and Why, and then to synthesize those answers into a single informative, grammatical sentence. These sentences can then be analyzed to determine strengths and weaknesses in the students’ understanding of the topic, or to pinpoint specific elements of the topic that require further elaboration. Before using this strategy it is important to make sure the topic can be summarized coherently. It is best to impose the technique on oneself first to determine its appropriateness or feasibility for given material.

CRITICAL INCIDENT QUESTIONNAIRES

The Critical Incident Questionnaire is a simple assessment technique that can be used to find out what and how students are learning, and to identify areas where adjustments are necessary (e.g., the pace of the course, confusion with respect to assignments or expectations). On a single sheet of paper, students are asked five questions which focus on critical moments for learning in a course. The questionnaire is handed out about ten minutes before the final session of the week.

Critical Incident Questionnaire
1. At what moment this week were you most engaged as a learner?
2. At what moment this week were you most distanced as a learner?
3. What action or contribution taken this week by anyone in the course did you find most affirming or helpful?
4. What action or contribution taken this week by anyone in the course did you find most puzzling or confusing?
5. What surprised you most about the course this week?

Critical Incident Questionnaires provide substantive feedback on student engagement and may also reveal power dynamics in the classroom that may not initially be evident to the instructor. For further information on Critical Incident Questionnaires see Brookfield, S. J. and Preskill, S. (1999) Discussion as a Way of Teaching: Tools and Techniques for a Democratic Classroom (CA: Jossey-Bass), page 49.

For further information on these and other classroom assessment strategies see:
Cross, K. P. and Angelo, T. A., Eds. (1988) Classroom Assessment Techniques: A Handbook for Faculty (MI: National Center for Research to Improve Post-Secondary Teaching and Learning).
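Since One-Minute Paper and Muddiest Point responses arrive as short free-text notes, even a crude word tally can surface recurring themes before the next class. The following Python sketch is purely illustrative and not part of the Guide; the stop-word list and sample responses are invented for the example:

```python
# Illustrative sketch only: tally the most frequent content words across
# a set of anonymous One-Minute Paper responses to spot recurring
# questions or muddy points. Stop words and responses are toy examples.
import re
from collections import Counter

STOP_WORDS = {"the", "a", "an", "of", "to", "is", "was", "what", "how", "i", "am"}

def recurring_themes(responses: list[str], top_n: int = 5) -> list[tuple[str, int]]:
    """Return the top_n most common content words across all responses."""
    words = []
    for response in responses:
        words += [w for w in re.findall(r"[a-z']+", response.lower())
                  if w not in STOP_WORDS]
    return Counter(words).most_common(top_n)

responses = [
    "What was the difference between formative and summative evaluation?",
    "I am still unsure how summative evaluation affects grading.",
    "The muddiest point was summative versus formative purposes.",
]
print(recurring_themes(responses))
```

A tally like this is only a first pass; reading a handful of responses in full, as the techniques above assume, remains the instructor’s main source of insight.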
8
YORK UNIVERSITY S E N AT E C O M M I T T E E O N T E A C H I N G A N D L E A R N I N G ’ S G U I D E T O
TEACHING ASSESSMENT & EVALUATION INTRODUCTION
NEED FOR THE GUIDE
The Teaching Assessment and Evaluation Guide provides instructors with starting-points for reflecting on their teaching, and with advice on how to gather feedback on their teaching practices and effectiveness as part of a systematic program of teaching development. As well, the Guide provides guidance on how teaching might be fairly and effectively evaluated, which characteristics of teaching might be considered, and which evaluation techniques are best suited for different purposes. The Teaching Assessment and Evaluation Guide is a companion to the Teaching Documentation Guide (1993), also prepared by the Senate Committee on Teaching and Learning (SCOTL). The Documentation Guide (available at the Centre for the Support of Teaching and on the SCOTL website) aims to provide instructors with advice and concrete suggestions on how to document the variety and complexity of their teaching contributions.
Teaching is a complex and personal activity that is best assessed and evaluated using multiple techniques and broadly-based criteria. Assessment for formative purposes is designed to stimulate growth, change and improvement in teaching through reflective practice. Evaluation, in contrast, is used for summative purposes to give an overview of a particular instructor’s teaching in a particular course and setting. Informed judgements on teaching effectiveness can best be made when both assessment and evaluation are conducted, using several techniques to elicit information from various perspectives on different characteristics of teaching. There is no one complete source for information on one’s teaching, and no single technique for gathering it. Moreover, the techniques need to be sensitive to the particular teaching assignment of the instructor being assessed or evaluated, as well as the context in which the teaching takes place. If multiple perspectives are represented and different techniques used, the process will be more valued, the conclusions reached will be more credible, and consequently more valuable to the individual being assessed or evaluated.
CONTENTS • Introduction ....................................... 1 • Need for the Guide ............................ 1 • What is Quality Teaching? ................. 2 • Formative Assessment ...................... 2 • Summative Evaluation ....................... 2 • Overview of Assessment and Evaluation Strategies: 1. 2. 3. 4. 5. 6.
Teaching dossiers ........................ 3 Student ratings ............................ 4 Peer observations ........................ 5 Letters & individual interviews ...... 6 Course portfolios ......................... 6 Classroom assessment ............... 7
• Classroom Assessment Techniques .. 8
Current practices at York University are varied. In most departments and units, teaching is systematically evaluated, primarily for summative purposes. Individual instructors are free, if they wish, to use the data so gathered for formative purposes, or they may contact the Centre for the Support of Teaching which provides feedback and teaching analysis aimed at growth, development and improvement. Without denying the value of summative teaching evaluation, the main purpose of this Guide is to encourage committees and individuals to engage in reflective practice through the ongoing assessment of teaching for formative purposes and for professional development. Research indicates that such practice leads to heightened enthusiasm for teaching, and improvement in teaching and learning, both of which are linked to faculty vitality.
The Teaching Assessment and Evaluation Guide© is published by the Senate Committee on Teaching and Learning (SCOTL),York University www.yorku.ca/secretariat/senate/committees/scotl/ (revised January 2002)
Teaching Assessment and Evaluation Guide
WHAT IS QUALITY TEACHING?
consideration the level of the course, the instructor’s objectives and style, and the teaching methodology employed. Nonetheless, the primary criterion must be improved student learning. Research indicates that students, faculty and administrators alike agree that quality teaching:
All assessment and evaluation techniques contain implicit assumptions about the characteristics that constitute quality teaching. These assumptions should be made explicit and indeed should become part of the evaluation process itself in a manner which recognizes instructors’ rights to be evaluated within the context of their own teaching philosophies and goals. First and foremost then, “teaching is not right or wrong, good or bad, effective or ineffective in any absolute, fixed or determined sense.”¹ Instructors emphasize different domains of learning (affective, cognitive, psychomotor, etc.) and employ different theories of education and teaching methodologies (anti-racist, constructivist, critical, feminist, humanistic, etc.)². They encourage learning in different sites (classrooms, field locations, laboratories, seminar rooms, studios, virtual classrooms, etc.). They use different instructional strategies and formats (using case studies, coaching, demonstrating, facilitating discussions, lecturing, problemQUALITY TEACHING based learning, Put succinctly, quality teaching is online delivery, etc.), that activity which brings about the and they do this most productive and beneficial while recognizing learning experience for students and that students have promotes their development as diverse backgrounds learners. This experience may and levels of include such aspects as: preparedness. In one situation, instructors • improved comprehension of may see their role as and ability to use the ideas transmitting factual introduced in the course; information, and in • change in outlook, attitude and another as facilitating enthusiasm towards the discussion and discipline and its place in the promoting critical academic endeavour; thinking. • intellectual growth; and • improvement in specific skills As variable and such as critical reading and diverse as quality writing, oral communication, teaching might be, analysis, synthesis, abstraction, generalizations may and generalization. nevertheless be made about its basic characteristics as described in the accompanying text box.
• • • • •
establishes a positive learning environment; motivates student engagement; provides appropriate challenges; is responsive to students’ learning needs; and is fair in evaluating their learning.
Concretely, indicators of quality teaching can include: • • • •
effective choice of materials; organization of subject matter and course; effective communication skills; knowledge of and enthusiasm for the subject matter and teaching; • availability to students; and • responsiveness to student concerns and opinions. Some characteristics are more easily measured than others. Furthermore, since instructors are individuals and teaching styles are personal, it is all the more important to recognize that not everyone will display the same patterns and strengths.
ASSESSMENT OF TEACHING FOR FORMATIVE PURPOSES Formative assessment of teaching can be carried out at many points during an instructional period, in the classroom or virtual environment, to compare the perceptions of the instructor with those of the students, and to identify gaps between what has been taught and what students have learned. The purpose of assessment is for instructors to find out what changes they might make in teaching methods or style, course organization or content, evaluation and grading procedures, etc., in order to improve student learning. Assessment is initiated by the instructor and information and feedback can be solicited from many sources (for example, self, students, colleagues, consultants) using a variety of instruments (surveys, on-line forms, etc. - see classroom assessment below). The data gathered are seen only by the instructor and, if desired, a consultant, and form the basis for ongoing improvement and development.
The criteria for evaluating teaching vary between disciplines and within disciplines, and should take into
SUMMATIVE EVALUATION Summative evaluation, by contrast, is usually conducted at the end of a particular course or at specific points in an instructor’s career. The purpose is to form a judgment about the effectiveness of a course and/or an instructor. The judgment may be used for tenure and promotion decisions, to reward success in the form of teaching awards or merit pay, or to enable departments to make
______ 1. Mary Ellen Weimer (1990). Improving College Teaching (CA: Jossey Bass Publishers), 202. 2. Adapted from George L. Geis (1977), “Evaluation: definitions, problems and strategies,” in Chris Knapper et al Eds., Teaching is Important (Toronto: Clarke Irwin in association with CAUT).
2
Teaching Assessment and Evaluation Guide
• evidence of exceptional achievements and contributions to teaching in the form of awards, and committee work.
informed decisions about changes to individual courses, the curriculum or teaching assignments. At most universities, summative evaluation includes the results of teaching evaluations regularly scheduled at the end of academic terms. However, to ensure that summative evaluation is both comprehensive and representative, it should include a variety of evaluation strategies, among them:
One’s teaching dossier (see below) is an ideal format for presenting these types of evaluation as a cumulative and longitudinal record of one’s teaching. Important note: It is crucial that the two processes – summative evaluation and formative assessment – be kept strictly apart if the formative assessment of teaching is to be effective and achieve its purpose. This means that the information gathered in a program of formative assessment should not be used in summative evaluation unless volunteered by instructors themselves. It also means that persons who are or have been involved in assisting instructors to improve their teaching should not be asked to provide information for summative evaluation purposes.
• letters from individual students commenting on the effectiveness of the instructor’s teaching, the quality of the learning experience, and the impact of both on their academic progress; • assessments by peers based on classroom visits; • samples and critical reviews of contributions to course and curriculum development, as well as of contributions to scholarship on teaching; and
OVERVIEW OF STRATEGIES FOR ASSESSING AND EVALUATING QUALITY TEACHING AND STUDENT LEARNING This section describes six strategies that teachers may use to assess and evaluate the quality of their teaching and its impact on student learning: 1) teaching dossiers; 2) student ratings; 3) peer observations; 4) letters and individual interviews; 5) course portfolios; and 6) classroom assessment. These descriptions draw on current research in the field (available at the Centre for the Suppport of Teaching, 111 Central Square, www.yorku.ca/cst) and practices and procedures at other universities in Canada and abroad. All evaluation and assessment efforts should use a combination of strategies to take advantage of their inherent strengths as well as their individual limitations.
1. TEACHING DOSSIERS
Benefits: Dossiers provide an opportunity for instructors to articulate their teaching philosophy, review their teaching goals and objectives, assess the effectiveness of their classroom practice and the strategies they use to animate their pedagogical values, and identify areas of strength and opportunities for improvement. They also highlight an instructor’s range of responsibilities, accomplishments, and contributions to teaching and learning more generally within the department, university and/or scholarly community.
A teaching dossier or To focus on: portfolio is a factual description of an § Appraisal of instructor’s instructor’s teaching teaching and learning context achievements and contains documentation § Soundness of instructor’s approach to teaching and that collectively learning suggests the scope and quality of his or her § Coherence of teaching teaching. Dossiers can objectives and strategies be used to present § Vigour of professional evidence about teaching development, contributions quality for evaluative and accomplishments in the purposes such as T&P area of teaching. submissions, teaching award nominations, etc., as they can provide a useful context for analyzing other forms of teaching evaluation. Alternatively, dossiers can provide the framework for a systematic program of reflective analysis and peer collaboration leading to improvement of teaching and student learning. For further information on how to prepare a teaching dossier, please consult SCOTL’s Teaching Documentation Guide (available at the Centre for the Support of Teaching and from the SCOLT website).
Limitations: It is important to note that dossiers are not meant to be an exhaustive compilation of all the documents and materials that bear on an instructor’s teaching performance; rather they should present a selection of information organized in a way that gives a comprehensive and accurate summary of teaching activities and effectiveness. _______ For further information on teaching dossiers see: Teaching Documentation Guide (1993, Senate Committee on Teaching and Learning). Peter Seldin “Self-Evaluation: What Works? What Doesn’t?” and John Zubizarreta “Evaluating Teaching through Portfolios” in Seldin and Associates (1999). Changing Practices in Evaluating Teaching: A Practical Guide to Improved Faculty Performance and Promotion/ Tenure Decisions (MA: Anker Press).
3
Teaching Assessment and Evaluation Guide
2. STUDENT RATINGS OF TEACHING

To focus on:
§ Effectiveness of instructor
§ Impact of instruction on student learning
§ Perceived value of the course to the student
§ Preparation and organization
§ Knowledge of subject matter and ability to stimulate interest in the course
§ Clarity and understandability
§ Ability to establish rapport and encourage discussion within the classroom
§ Sensitivity to and concern with students’ level of understanding and progress

Student ratings of teaching, or student evaluations, are the most commonly used source of data for both summative and formative information. In many academic units they are mandatory, and in several units they are also standardized. For purposes such as tenure and promotion, data should be obtained over time and across courses using a limited number of global or summary-type questions. Such data will provide a cumulative record and enable the detection of patterns of teaching development.1 Information obtained by means of student ratings can also be used by individual instructors to improve the course in future years, and to identify areas of strength and weakness in their teaching by comparison with those teaching similar courses. Longer and more focussed questionnaires are also useful in a program of formative evaluation when designed and administered by an instructor during a course.

Benefits: The use of a mandatory, standardized questionnaire puts all teaching evaluations on a common footing, and facilitates comparisons between teachers, courses and academic units. The data gathered also serve the purpose of assessing whether the educational goals of the unit are being met. Structured questionnaires are particularly appropriate where there are relatively large numbers of students involved, and where there are either several sections of a single course, or several courses with similar teaching objectives using similar teaching approaches. Questionnaires are relatively economical to administer, summarize and interpret. Provided that students are asked to comment only on items with which they have direct experience, student responses to questionnaires have been found to be valid. While questionnaire forms with open-ended questions are more expensive to administer, they often provide more reliable and useful sources of information in small classes and for the tenure and promotion process. Also, open-ended questions provide insight into the numerical ratings, and provide pertinent information for course revision.

Limitations: While students’ perceptions provide valuable feedback to instructors, recent research has identified specific areas of teaching quality on which students are not able to make informed judgments. These include the appropriateness of course goals, content, design, materials, and evaluation of student work.3 Thus, the use of a variety of techniques as described elsewhere in this document can help to address the gaps and shortcomings in the student rating data. Further, recent research indicates that care should be taken to control for possible biases based on gender, race, discipline, and teaching approach, particularly for those using non-traditional teaching methods and curriculum. Likewise, ratings can be affected by factors for which it is difficult to control, such as student motivation, complexity of material, level of course, and class size. Care should be taken, therefore, to create an appropriate context for interpreting the data in light of other sources of data and in comparison with other courses. One way to ensure fairness and equity is to ask students to identify the strengths of the instructor’s approach as well as weaknesses, and to ask for specific suggestions for improvement. Teachers have such different perspectives, approaches, and objectives that a standardized questionnaire may not adequately or fairly compare their performance. For example, the implicit assumption behind the design of many evaluation forms is that the primary mode of instruction is the lecture method. Such a form will be inadequate in evaluating the performance of instructors who use different teaching methods, for example collaborative learning. One way to overcome this limitation and to tailor the questionnaire to the objectives and approaches of a specific course or instructor is to design an evaluation form with a mandatory core set of questions and additional space for inserting questions chosen by the instructor.

Note: The Centre for the Support of Teaching has sample teaching evaluation forms from numerous Faculties and departments, as well as books and articles which are helpful resources for individuals and committees interested in developing questionnaires. In addition, web resources are posted on the SCOTL website.
_____
For further information on student ratings of teaching see:
1. Cashin, William (1995), “Student ratings of teaching: The research revisited.” Idea Paper, Number 32 (Kansas State University, Centre for Faculty Development).
2. See, for example, The Teaching Professor, Vol. 8, No. 4, 3-4.
3. See also Theall, Michael and Franklin, Jennifer, Eds. (1990). Student Ratings of Instruction: Issues for Improving Practice, New Directions in Teaching and Learning, No. 43 (CA: Jossey-Bass Inc.).
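Where ratings are collected on a numeric scale, the kind of summary that supports these cumulative records is easy to compute. The following is a minimal sketch only, not part of the Guide: the question labels, the 5-point scale, and the responses are all invented for illustration.

    # Minimal sketch: summarizing Likert-scale student ratings (assumed 1-5 scale).
    # Question labels and data are illustrative, not from any real course.
    from statistics import mean, median

    # Each list holds one rating per responding student.
    responses = {
        "Overall effectiveness of instructor": [4, 5, 3, 4, 5, 4],
        "Clarity and understandability":       [3, 4, 4, 2, 5, 4],
        "Ability to encourage discussion":     [5, 5, 4, 4, 3, 5],
    }

    for question, ratings in responses.items():
        print(f"{question}: mean={mean(ratings):.2f}, "
              f"median={median(ratings)}, n={len(ratings)}")

Tracking the same global questions across terms, as the Guide recommends, amounts to keeping these per-question summaries for each offering and comparing them over time.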
3. PEER OBSERVATIONS

To focus on:
§ Quality of the learning environment (labs, lecture halls, online discussion groups, seminars, studios, etc.)
§ Level of student engagement
§ Clarity of presentation, and ability to convey course content in a variety of ways
§ Range of instructional methods and how they support student understanding
§ Student-instructor rapport
§ Overall effectiveness

Peer observations offer critical insights into an instructor’s performance, complementing student ratings and other forms of evaluation to contribute to a fuller and more accurate representation of overall teaching quality. Research indicates that colleagues are in the best position to judge specific dimensions of teaching quality, including the goals, content, design and organization of the course, the methods and materials used in delivery, and evaluation of student work.

Peer observation may be carried out for both summative and formative purposes. For summative evaluation, it is recommended that prior consensus be reached about what constitutes quality teaching within the discipline, what the observers will be looking for, and the process for carrying out and recording the observations. To ensure that a full picture of an instructor’s strengths and weaknesses is obtained, some observers find checklists useful and some departments may choose to designate the responsibility of making classroom observations to a committee. Given the range of activities in a class, some observers find it helpful to focus on specific aspects of the teaching and learning that takes place. It is also advisable that more than one colleague be involved, and that more than one observation take place by each colleague. This will counteract observer bias towards a particular teaching approach and the possibility that an observation takes place on an unusually bad day. These precautions also provide for greater objectivity and reliability of the results.

Peer observation is especially useful for formative evaluation. In this case, it is important that the results of the observations remain confidential and not be used for summative evaluation. The process of observation in this case should take place over time, allowing the instructor to implement changes, practice improvements and obtain feedback on whether progress has been made. It may also include video-taping the instructor’s class. This process is particularly helpful to faculty who are experimenting with new teaching methods.

A particularly valuable form of observation for formative purposes is peer-pairing. With this technique, two instructors provide each other with feedback on their teaching on a rotating basis, each evaluating the other for a period of time (anywhere between 2 weeks and a full year). Each learns from the other and may learn as much in the observing role as when being observed. Full guidelines for using this technique, as well as advice and assistance in establishing a peer-pairing relationship, are available from the Centre for the Support of Teaching.

Before an observation, it is important that the observer and instructor meet to discuss the instructor’s teaching philosophy, the specific objectives and the strategies that will be employed during the session to be observed, and the materials relevant to the course: syllabus, assignments, online course components, etc. Likewise, discussions of the criteria for evaluation and how the observations will take place can help to clarify expectations and procedures. A post-observation meeting allows an opportunity for constructive feedback and assistance in the development of a plan for improvement.

Benefits: Peer observations can complete the picture of an instructor’s teaching obtained through other methods of evaluation. As well, observations are an important supplement to contextualize variations in student ratings in situations, for example, where an instructor’s teaching is controversial because experimental or non-traditional teaching methods are being used, or where other unique situations exist within the learning environment. Colleagues are better able than students to comment upon the level of difficulty of the material, knowledge of subject matter and integration of topics, and they can place the teaching within a wider context and suggest alternative teaching formats and ways of communicating the material.

Limitations: There are several limitations to using peer observations for summative purposes. First, unless safeguards are put in place to control for sources of bias, conflicting definitions of teaching quality, and idiosyncrasies in practice, inequities can result in how classroom observations are done.1 For example, instructors tend to find observations threatening and they and their students may behave differently when there is an observer present. Also, there is evidence to suggest that peers may be relatively generous evaluators in some instances. A second limitation is that it is costly in terms of faculty time since a number of observations are necessary to ensure the reliability and validity of findings. Since observers vary in their definitions of quality teaching and some tact is required in providing feedback on observations, it is desirable that observers receive training before becoming involved in providing formative evaluation. The approaches described above can help to minimize these inequities and improve the effectiveness of peer observation. Finally, to protect the integrity of this technique for both formative and summative purposes, it is critical that observations for personnel decisions be kept strictly separate from evaluations for teaching improvement.
______
For further information on colleague evaluation of teaching see:
1. DeZure, Deborah. “Evaluating teaching through peer classroom observation,” in Peter Seldin and Associates (1999). Changing Practices in Evaluating Teaching: A Practical Guide to Improved Faculty Performance and Promotion/Tenure Decisions (MA: Anker Press).
4. LETTERS AND INDIVIDUAL INTERVIEWS

To focus on:
§ Effectiveness of instructor through detailed reflection
§ Impact of instruction on student learning and motivation over the longer term
§ Preparation and organization
§ Clarity and understandability
§ Ability to establish rapport and encourage discussion
§ Sensitivity to and concern with students’ level of understanding and progress

Letters and/or individual interviews may be used in teaching award nominations, tenure and promotion files, etc. to obtain greater depth of information for the purpose of improving teaching, or for providing details and examples of an instructor’s impact on students.

Benefits: Interviews and letters elicit information not readily available through student ratings or other forms of evaluation. Insights, success stories, and thoughtful analyses are often the outcomes of an interview or a request for written impressions of an instructor’s teaching. Students who are reluctant to give information on a rating scale or in written form often respond well to a skilled, probing interviewer.

Limitations: The disadvantage of letters is that the response rate can be low. The major disadvantage of interviews is time. Interviews can take approximately one hour to conduct, about 30 minutes to arrange, and another block of time for coding and interpretation. A structured interview schedule should be used to eliminate the bias that may result when an untrained interviewer asks questions randomly of different students.

5. COURSE PORTFOLIOS

To focus on:
§ Appropriateness of course goals and objectives
§ Quality of instructional materials and assignments
§ Coherence of course organization, teaching strategies and modes of delivery
§ Comprehensiveness of methods for appraising student achievement
§ Level of student learning and contribution of teaching to students’ progress
§ Innovations in teaching and learning

A course portfolio is a variant on the teaching dossier and is the product of focussed inquiry into the learning by students in a particular course. It represents the specific aims and work of the instructor and is structured to explain what, how and why students learn in a class. It generally comprises four main components: 1) a statement of the aims and pedagogical strategies of the course and the relationship between the method and learning outcomes; 2) an analysis of student learning based on key assignments and learning activities to advance course goals; 3) an analysis of student feedback based on classroom assessment techniques; and 4) a summary of the strengths of the course in terms of students’ learning, and critical reflection on how the course goals were realised, changed or unmet. The final analysis leads to ideas about what to change in order to enhance student learning, thinking and development the next time the course is taught.1

Course portfolios have been described as being closely analogous to a scholarly project, in that: “a course, like a project, begins with significant goals and intentions, which are enacted in appropriate ways and lead to relevant results in the form of student learning. Teaching, like a research project, is expected to shed light on the question at hand and the issues that shape it; the methods used to complete the project should be congruent with the outcomes sought. The course portfolio has the distinct advantage of representing – by encompassing and connecting planning, implementation and results – the intellectual integrity of teaching as reflected in a single course.”2

Benefits: The focus on a specific course allows the portfolio to demonstrate student understanding as an index of successful teaching. For instructors, course portfolios provide a framework for critical reflection and continuous improvement of teaching, and deep insight into how their teaching contributes to students’ knowledge and skills. For departments, they can highlight cohesion and gaps within the curriculum and enable continuity within the course over time and as different instructional technologies are incorporated. As well, course portfolios can collectively promote course articulation and provide a means of assessing the quality of a curriculum and pedagogical approaches in relation to the overall goals and outcomes of a program of study.

Limitations: Because course portfolios focus on one course, they do not reflect the full range of an instructor’s accomplishments, responsibilities, and contributions (such as curriculum development and work with graduate students) that would be documented in a teaching dossier. Also, course portfolios take time to prepare and evaluate, and instructors should not be expected to build a portfolio for every course taught; rather they should concentrate on those courses for which they have the strongest interest or in which they invest the majority of their energy, imagination and time.3
______
For further information on course portfolios see:
1. Cerbin, William (1994), “The course portfolio as a tool for continuous improvement of teaching and learning.” Journal on Excellence in College Teaching, 5(1), 95-105.
2. Cambridge, Barbara. “The Teaching Initiative: The course portfolio and the teaching portfolio.” American Association for Higher Education.
3. Cutler, William (1997). The history course portfolio. Perspectives 35 (8): 17-20.

6. CLASSROOM ASSESSMENT*

To focus on:
§ Effectiveness of teaching on learning
§ Constructive feedback on teaching strategies and classroom/online practices
§ Information on what students are learning and level of understanding of material
§ Quality of student learning and engagement
§ Feedback on course design

Classroom assessment is a method of inquiry into the effects of teaching on learning. It involves the use of techniques and instruments designed to give instructors ongoing feedback about the effect their teaching is having on the level and quality of student learning; this feedback then informs their subsequent instructional decisions. Unlike tests and quizzes, classroom assessment can be used in a timely way to help instructors identify gaps between what they teach and what students learn and enable them to adjust their teaching to make learning more efficient and effective. The information should always be shared with students to help them improve their own learning strategies and become more successful self-directed learners. There are a variety of instruments for classroom assessment, either in class or electronically, such as one-minute papers, one-sentence summaries, critical incident questionnaires, focus groups, and mid-year mini surveys (see the sampling of techniques below). Generally, the instruments are created and administered, and the results analysed, by the instructor to focus on specific aspects of teaching and student learning. Although the instructor is not obligated to share the results of classroom assessment beyond the course, the results may usefully inform other strategies for evaluating teaching quality.

Classroom assessment can be integrated into an instructor’s teaching in a graduated way, starting out with a simple assessment technique in one class involving five to ten minutes of class time, less than an hour for analysis of the results, and a few minutes during a subsequent class to let students know what was learned from the assessment and how the instructor and students can use that information to improve learning. After conducting one or two quick assessments, the instructor can decide whether this approach is worth further investment of time and energy.

Benefits: Classroom assessment encourages instructors to become monitors of their own performance and promotes reflective practice. In addition, its use can prompt discussion among colleagues about their effectiveness, and lead to new and better techniques for eliciting constructive feedback from students on teaching and learning.

Limitations: As with student ratings, the act of soliciting frank, in-the-moment feedback may elicit critical comments on the instructor and his/her approach to teaching. However, it is important to balance the positive and negative comments and try to link negative commentary to issues of student learning. New users of classroom assessment techniques might find it helpful to discuss the critical comments with an experienced colleague.
______
Adapted from Core: York’s newsletter on university teaching (2000) Vol 9, No. 3.

* “Classroom Assessment” is a term used widely by scholars in higher education; it is meant to include all learning environments. For examples, see the references at the end of the sampling of techniques below.
A SAMPLING OF CLASSROOM ASSESSMENT TECHNIQUES

ONE-MINUTE PAPER

The One-Minute Paper, or a brief reflection, is a technique that is used to provide instructors with feedback on what students are learning in a particular class. It may be introduced in small seminars or in large lectures, in first year courses or upper year courses, or electronically using software that ensures student anonymity. The One-Minute Paper asks students to respond anonymously to the following questions:

One-Minute Paper
1. What is the most important thing you learned today?
2. What question remains uppermost in your mind?

Depending upon the structure and format of the learning environment, the One-Minute Paper may be used in a variety of ways:
• During a lecture, to break up the period into smaller segments enabling students to reflect on the material just covered.
• At the end of a class, to inform your planning for the next session.
• In a course comprising lectures and tutorials, the information gleaned can be passed along to tutorial leaders giving them advance notice of issues that they may wish to explore with students.

THE MUDDIEST POINT

An adaptation of the One-Minute Paper, the Muddiest Point is particularly useful in gauging how well students understand the course material. The Muddiest Point asks students:

What was the ‘muddiest point’ for you today?

Like the One-Minute Paper, use of the Muddiest Point can helpfully inform your planning for the next session, and signal issues that it may be useful to explore.

ONE SENTENCE SUMMARIES

One Sentence Summaries can be used to find out how concisely, completely and creatively students can summarize a given topic within the grammatical constraints of a single sentence. It is also effective for helping students break down material into smaller units that are more easily recalled. This strategy is most effective for any material that can be represented in declarative form – historical events, story lines, chemical reactions and mechanical processes.

The One Sentence Summary technique involves asking students to consider the topic you are discussing in terms of Who Does/Did What to Whom, How, When, Where and Why, and then to synthesize those answers into a single informative, grammatical sentence. These sentences can then be analyzed to determine strengths and weaknesses in the students’ understanding of the topic, or to pinpoint specific elements of the topic that require further elaboration. Before using this strategy it is important to make sure the topic can be summarized coherently. It is best to impose the technique on oneself first to determine its appropriateness or feasibility for given material.

CRITICAL INCIDENT QUESTIONNAIRES

The Critical Incident Questionnaire is a simple assessment technique that can be used to find out what and how students are learning, and to identify areas where adjustments are necessary (e.g., the pace of the course, confusion with respect to assignments or expectations). On a single sheet of paper, students are asked five questions which focus on critical moments for learning in a course. The questionnaire is handed out about ten minutes before the final session of the week.

Critical Incident Questionnaire
1. At what moment this week were you most engaged as a learner?
2. At what moment this week were you most distanced as a learner?
3. What action or contribution taken this week by anyone in the course did you find most affirming or helpful?
4. What action or contribution taken this week by anyone in the course did you find most puzzling or confusing?
5. What surprised you most about the course this week?

Critical Incident Questionnaires provide substantive feedback on student engagement and may also reveal power dynamics in the classroom that may not initially be evident to the instructor. For further information on Critical Incident Questionnaires see Brookfield, S. J. and Preskill, S. (1999) Discussion as a Way of Teaching: Tools and Techniques for a Democratic Classroom (CA: Jossey-Bass), page 49.

For further information on these and other classroom assessment strategies see: Cross, K. P. and Angelo, T. A., Eds. (1988) Classroom Assessment Techniques: A Handbook for Faculty (MI: National Center for Research to Improve Post-Secondary Teaching and Learning).
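Several of these techniques produce short free-text responses that the instructor must scan for recurring themes. The sketch below is one illustrative way to make a first pass by counting content words; the responses and the stopword list are invented for the example, and in practice the instructor would still read the responses directly.

    # Minimal sketch: spotting recurring themes in anonymous Muddiest Point
    # responses by counting content words. Responses are invented examples.
    from collections import Counter
    import re

    responses = [
        "I'm still confused about formative versus summative evaluation",
        "the difference between assessment and evaluation",
        "rubric criteria -- how many levels should a rubric have?",
        "summative evaluation and grading",
    ]

    stopwords = {"the", "and", "about", "between", "versus", "how", "many",
                 "should", "have", "still", "i'm", "a"}
    words = Counter(
        w for r in responses
        for w in re.findall(r"[a-z']+", r.lower())
        if w not in stopwords
    )
    print(words.most_common(5))  # the most frequent terms flag common confusions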
UNIVERSIDAD MARIANO GALVEZ DE GUATEMALA FACULTAD DE HUMANIDADES ESCUELA DE IDIOMAS LICDA. EVELYN R. QUIROA
PROFESORADO EN EL IDIOMA INGLES
CURSO: EVALUATION AND ASSESSMENT TECHNIQUES

COURSE DESCRIPTION
This course is dedicated to the study of the principal theories that underpin evaluation and assessment in the classroom. A critical analysis will be held in order to critique and put into practice the different perspectives, techniques and styles related to performance-based assessment, and summative and formative feedback methods, to assess and evaluate student learning in the classroom.

COURSE GOAL
By the end of the course, students will be able to plan and create assessments and evaluations that provide their students with activities closely related to learning objectives and/or competences.

LEARNING OUTCOMES
Upon completion of the course, students will be able to:
1. Demonstrate development and use of academic standards across the curriculum and application of standards and objectives in classroom assessment and evaluation.
2. Match assessment to learning outcomes, develop rubric criteria and select appropriate assessment and evaluation choices using the tools provided by the course.
3. Apply current research tools to create authentic assessment, discourse analysis, self and peer evaluation, rubrics, surveys, tests and mini-quizzes for self-paced tutorials.
4. Evaluate and utilize appropriate tools such as grade books, calendars, spreadsheets and portfolios.

GENERAL AND SPECIFIC EXPECTATIONS OF THE COURSE
Student Assessment and Evaluation

General Expectation 1: to communicate an overview of evaluation frameworks and processes.
Specific Expectations:
1. Identify the following: a) the purposes of evaluation, b) key terms relative to evaluation, c) types of evaluation, d) links between planning and evaluation
2. Develop student assessment and practice within a philosophical framework
3. Understand equity issues in evaluation and assessment

General Expectation 2: to understand the purposes of various types of evaluation strategies.
Specific Expectations:
1. Differentiate between diagnostic, formative, and summative evaluation
2. Compare the purpose and function of different information sources for evaluation
3. Identify a variety of evaluation and assessment procedures, their purposes, strengths, and weaknesses
4. Discriminate between traditional and authentic assessment and appropriate application in teaching/learning
5. Incorporate appropriate assessment and evaluation strategies into your teaching practice
General Expectation 3: to place evaluation strategies in the context of a unit of study.
Specific Expectations:
1. Design student assessment instruments (including rubrics) for a unit of study
2. Accommodate the needs of exceptional students within the unit and its evaluation component
3. Enhance research in teaching to improve their own practice
4. Be capable of doing self-assessment
5. Share the knowledge acquired to benefit the school community to which they belong

EXPECTATIONS:
• Students are expected to attend all classes. Class attendance will be a part of the final evaluation.
• Students are expected to arrive for class on time. Any student who arrives late will not be given additional time to complete quizzes, exams, or in-class assignments.
• Students are expected to submit all assignments on time. Late submissions will be penalized or not accepted, depending on the particular case.
• Students are expected to come to class having read and completed all assignments.
• Students are expected to participate in class discussions.
• Students are expected to complete all quizzes and examinations in class on the date specified by the teacher.
• Students are expected to word process assignments as required; handwritten work will not be accepted unless it is a test blueprint.

CONTENTS:

Exam of 08-27-11:
• The difference between evaluation and assessment
• Types of evaluation (diagnostic, formative & summative)
• Establishing high quality (validity, reliability, etc.)
• Becoming aware of content, context and learners
• Curriculum and evaluation
• Visualizing your actions: planning and testing
• Objectives vs. competences
• Bloom’s Taxonomy
• Designing a blueprint

Exam of 10-08-11:
• Test type items
• Test item type instructions
• Organizing test type items according to competencies and domain levels
• Analyzing tests
• Creating different core content tests
• Assessment strategies

Exam of 11-19-11:
• Self-improvement through self-assessment
• Self-assessment tools: rubrics, checklists, portfolios, etc.
• Differentiated learning
• Declarative and procedural knowledge-based assessment
• Reflective teaching and learning
• Administering and interpreting standardized tests

NOTE: Additional content may be added to the list.
MEANS TO ACHIEVE OUR GOALS:
1. A summary of the subject matter must be turned in weekly (except in weeks with a test).
2. Teacher and student exchange of knowledge and experiences.
3. Group discussions. Students must read the material in advance.
4. Individual research and enrichment.
5. Multimedia presentations.
6. Teaching project.
7. Portfolio.
8. Exams.

EVALUATION:
Attendance of 80% is required to qualify for the final term.
TOTAL ZONE (quizzes, class activities, presentations) ............ 10 PTS
TWO MIDTERMS ............ 40 PTS
PORTFOLIO ............ 20 PTS
FINAL EXAM ............ 30 PTS
TOTAL ............ 100 PTS

REFERENCES:
1. Lynch, Brian K. Language Program Evaluation. Cambridge University Press (Cambridge Applied Linguistics).
2. Eby, Judy W., Adrienne L. Herrell & Jim Hicks. Reflective Planning, Teaching and Evaluation, 3rd ed. Merrill/Prentice Hall, London, 2002.
3. Woodward, Tessa. Planning Lessons and Courses. Cambridge University Press, Cambridge, 2001.
4. McMillan, James H. Classroom Assessment: Principles and Practices for Effective Instruction. McMillan Press, Virginia, 2001.
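As a quick check of the arithmetic in the evaluation scheme above, the sketch below totals the four components and applies the 80% attendance requirement as stated. It is an illustration only; the component scores are invented.

    # Minimal sketch of the grading scheme above (scores are invented).
    def final_grade(zone, midterms, portfolio, final_exam, attendance_pct):
        """zone /10, midterms /40, portfolio /20, final_exam /30."""
        if attendance_pct < 80:
            return None  # below 80% attendance: not eligible for the final term
        return zone + midterms + portfolio + final_exam

    print(final_grade(zone=8, midterms=33, portfolio=17, final_exam=24,
                      attendance_pct=92))  # -> 82 out of 100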
CLASS REQUIREMENTS AND GUIDELINES

Submitting Assignments: All assignments either have or will have an identified “due date”. Extensions beyond the designated due date are not granted except in the most extenuating of circumstances. With the exception of an immediate and pressing “emergency”, all requests for an extension must be written, signed, dated, and delivered in person to me, as your Professor, before the specified due date and in time for me to respond to your request in writing. All assignments are to include a title page that clearly identifies the assignment topic/title, course name and number, the date submitted, the teacher’s name, and the student’s name and I.D. number. All assignments are to be given, in person, directly to the teacher. I will take no responsibility for assignments that are given to other students or to the personnel in the “Escuela de Idiomas” office. While I have not yet lost any student assignment, there is always a first time! Therefore, you would be well advised to back up your assignment electronically and, if feasible, in hard copy. An assignment will be considered late if it is not directly handed to me, as your Professor, by the end of class on the specified “due date”. Late assignments will be penalized 5% for each day or part thereof following the specified “due date” [including Saturday(s) and Sunday(s)].

Attendance and Participation: Attendance will be taken at the beginning of each class period. Attendance in each class is mandatory; however, there is a proviso in the University regulations that students are permitted to miss the equivalent of 3 classroom contact hours. Beyond this limit, the student will be issued a warning that any more absences may result in being excluded from writing the final examination. Regular attendance, being prepared, and constructively participating in classroom activities are all seen as integral components in the growth and development of becoming a professional teacher and in the establishment of a meaningful community of learners in our class.

Tardiness: Tardiness can be extremely disruptive and disrespectful to members who strive to be on time. Naturally, we all encounter circumstances that occasionally cause us to be late – but habitual tardiness is not acceptable. If you are late for class, no material will be repeated. Therefore, you need to contact your classmates to be filled in on the material covered. If you arrive after attendance has been taken and you have no excuse, you will be marked as absent.

Class Policy on Cell Phones: Cell phones must be turned off at all times. If you are expecting an emergency call, make sure to talk to me before class.

Class Policy on Laptop Computers: You may bring your laptop to class, but all work done on laptop computers must be related to the class work of that day.

Academic Dishonesty: Academic honesty is fundamental to the activities and principles of the University, and more broadly to society at large. All members of the academic community must be confident that each person’s work has been responsibly and honorably acquired, developed, and presented.

References: Use the APA format, 5th edition.
Map Rubric
Name ________________________ Date _______ Class ____________________________________

Content
  Expert: All labels are included and are carefully and accurately placed; detail along coastlines is careful and accurate
  Practitioner: All labels are included and most are accurately placed
  Apprentice: All but one or two labels are included; several are not accurately placed
  Novice: Several labels are not included and many are not accurately or carefully placed

Visual Appeal
  Expert: Very colorful and clean looking; labels are very easy to read
  Practitioner: Some color; a few labels are not easy to read
  Apprentice: Limited use of color; labels are somewhat difficult to read
  Novice: Limited or no use of color; labels are very difficult to read

Map Elements
  Expert: Includes clearly labeled title, date (if appropriate), directional arrow (compass rose), scale, key, source line, and latitude and longitude lines
  Practitioner: Includes most standard map elements; most are accurate and easy to read
  Apprentice: Missing several standard map elements
  Novice: Missing most standard map elements
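A rubric like this is essentially a lookup table, which makes it straightforward to encode for record-keeping. The sketch below is illustrative only; the 4-to-1 point values per level are an assumption for the example, since the rubric itself assigns no numbers.

    # Minimal sketch: the map rubric above as a lookup table. The 4-1 point
    # values per level are assumed for illustration.
    LEVELS = {"Expert": 4, "Practitioner": 3, "Apprentice": 2, "Novice": 1}

    rubric = {
        ("Content", "Practitioner"): "All labels are included and most are accurately placed",
        ("Visual Appeal", "Apprentice"): "Limited use of color; labels are somewhat difficult to read",
        ("Map Elements", "Novice"): "Missing most standard map elements",
        # ... remaining cells omitted for brevity
    }

    def score(ratings):
        """ratings: dict of criterion -> level name; returns total points."""
        return sum(LEVELS[level] for level in ratings.values())

    print(score({"Content": "Expert", "Visual Appeal": "Practitioner",
                 "Map Elements": "Practitioner"}))  # -> 10 of a possible 12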
Map Evaluation Form
Name ________________________ Date _______ Class ____________________________________

Rate each criterion as Expert, Practitioner, Apprentice, or Novice:
  Content: _______________
  Visual Appeal: _______________
  Map Elements: _______________

COMMENTS:
Writing Good Multiple-Choice Exams
Dawn M. Zimmaro, Ph.D.
Measurement and Evaluation Center
Telephone: (512) 232-2662
Web: www.utexas.edu/academic/mec
Location: Bridgeway Building, 2616 Wichita Street
Address: P.O. Box 7246, Austin, TX 78713-7246
Last revised August 19, 2004
Table of Contents
• Goals of the Workshop
• The KEY to Effective Testing
• Summary of How Evaluation, Assessment, Measurement and Testing Terms Are Related
• Course Learning Objectives
• Abilities and Behaviors Related to Bloom’s Taxonomy of Educational Objectives
• Illustrative Action Verbs for Defining Objectives using Bloom’s Taxonomy
• Examples of Instructional Objectives for the Cognitive Domain
• Resources on Bloom’s Taxonomy of the Cognitive Domain and Writing Educational Objectives
• Test Blueprint
• Description of Multiple-Choice Items
• Multiple-Choice Item Writing Guidelines
• Guidelines to Writing Test Items
• Preparing Your Students for Taking Multiple-Choice Tests
• Sample Multiple-Choice Items Related to Bloom’s Taxonomy
• More Sample Multiple-Choice Items
• Good versus Poor Multiple-Choice Items
• Activity: Identifying Flawed Multiple-Choice Items
• Scenario-Based Problem Solving Item Set
• An Alternative Multiple-Choice Method
• Guidelines for Administering Examinations
• Analyzing Multiple-Choice Item Responses
• Activity: Item Analysis
Goals of the Workshop

Multiple-choice exams are commonly used to assess student learning. However, instructors often find it challenging to write good items that ask students to do more than memorize facts and details. In this workshop we will explore how to create effective classroom multiple-choice exams that are based on sound learning objectives and how you can use information from your exams to improve your teaching. After completing the workshop you will be able to:
• Describe various levels of learning objectives
• Explain the strengths and weaknesses of multiple-choice exams
• Identify common errors when writing multiple-choice items
• Create multiple-choice items that assess various levels of learning
• Use exam results for feedback and to evaluate instructional effectiveness
The KEY to Effective Testing

To maximize your testing, you should aim to integrate all the major components of a course: OBJECTIVES, INSTRUCTION, ASSESSMENT, and EVALUATION.

OBJECTIVES: Specific statements of the goals of the instruction; the objectives express what the students should be able to do or know as a result of taking the course; also, the objectives should indicate the cognitive level of performance expected (e.g., basic knowledge level, deeper comprehension level, or application level).

INSTRUCTION: This consists of all the usual elements of the curriculum designed to teach a course, including lesson plans, study guides, and reading and homework assignments; the instruction should correspond directly to the course objectives.

ASSESSMENT: The process of gathering, describing, or quantifying information about performance; the testing component of the course; the amount of weight given to the different subject matter areas on the test should match the relative importance of each of the course objectives as well as the emphasis given to each subject area during instruction.

EVALUATION: Examining student performance and comparing and judging its quality; determining whether or not the learner has met the course objectives and how well.
Summary of How Evaluation, Assessment, Measurement and Testing Terms Are Related

Commonly used assessment and measurement terms are related, and understanding how they connect with one another can help you better integrate your testing and teaching.

Evaluation: Examining information about many components of the thing being evaluated (e.g., student work, schools, or a specific educational program) and comparing or judging its quality, worth or effectiveness in order to make decisions. Evaluation is based on assessment.

Assessment: The process of gathering, describing, or quantifying information about performance. Assessment includes measurement.

Measurement: The process of assigning numbers to qualities or characteristics of an object or person according to some rule or scale, and analyzing that data based on psychometric and statistical theory. A specific way to measure performance is testing.

Testing: A method used to measure the level of achievement or performance.
Course Learning Objectives

Course objectives should contain clear statements about what the instructor wants students to know and be able to do by the end of the semester. If objectives are clearly and specifically defined, the instructor will have an effective means of evaluating what the students learned. Course objectives should not be so specific that the creativity of the instructor and student are stifled, nor should they be so vague that the students are left without direction.

An example of a well constructed objective might be: “Students in Psychology 100 will be able to demonstrate their knowledge of Erikson’s Psychosocial Stages of Development by naming the 8 stages in order and describing the psychosocial crises at each stage.” Note that the objective is written in terms of what the student will be able to do, not what the instructor will teach. Learning objectives should focus on what the students should be able to do or know at the end of the semester.

Do not use words that can be open to interpretation or are unclear, as in: “Students should have an understanding of Erikson’s theory of development.” How would you measure “an understanding” or “an awareness” or “an appreciation”?

In beginning to write course learning objectives you may find it helpful to write some general statements about some concepts, topics, and principles of course content. From those general statements you can then write specific objectives for class sessions. Bloom specified different abilities and behaviors that are related to thinking processes in his Taxonomy of Educational Objectives. This taxonomy can be helpful in outlining your course learning objectives.

Reference: Hellyer, S. (n.d.). A teaching handbook for university faculty. Chapter 1: Course objectives. Retrieved October 1, 1998 from Indiana University Purdue University Indianapolis Web site: http://www.iupui.edu/~profdev/handbook/chap1.html
Abilities and Behaviors Related to Bloom’s Taxonomy of Educational Objectives

Knowledge – Recognizes students’ ability to use rote memorization and recall certain facts.
• Test questions focus on identification and recall of information

Comprehension – Involves students’ ability to read course content, extrapolate and interpret important information and put others’ ideas into their own words.
• Test questions focus on use of facts, rules and principles

Application – Students take new concepts and apply them to another situation.
• Test questions focus on applying facts or principles

Analysis – Students have the ability to take new information and break it down into parts to differentiate between them.
• Test questions focus on separation of a whole into component parts

Synthesis – Students are able to take various pieces of information and form a whole, creating a pattern where one did not previously exist.
• Test questions focus on combining ideas to form a new whole

Evaluation – Involves students’ ability to look at someone else’s ideas or principles and see the worth of the work and the value of the conclusions.
• Test questions focus on developing opinions, judgments or decisions
Reference: Hellyer, S. (n.d.). A teaching handbook for university faculty. Chapter 1: Course objectives. Retrieved October 1, 1998 from Indiana University Purdue University Indianapolis Web site: http://www.iupui.edu/~profdev/handbook/chap1.html
Illustrative Action Verbs for Defining Objectives using Bloom’s Taxonomy

Knowledge: Cite, define, identify, label, list, match, name, recognize, reproduce, select, state

Comprehension: Classify, convert, describe, distinguish between, explain, extend, give examples, illustrate, interpret, paraphrase, summarize, translate

Application: Apply, arrange, compute, construct, demonstrate, discover, modify, operate, predict, prepare, produce, relate, show, solve, use

Analysis: Analyze, associate, determine, diagram, differentiate, discriminate, distinguish, estimate, infer, order, outline, point out, separate, subdivide

Synthesis: Combine, compile, compose, construct, create, design, develop, devise, formulate, integrate, modify, organize, plan, propose, rearrange, reorganize, revise, rewrite, tell, write

Evaluation: Appraise, assess, compare, conclude, contrast, criticize, discriminate, evaluate, judge, justify, support, weigh
Reference: Gronlund, N. E. (1998). Assessment of student achievement. Boston: Allyn and Bacon.
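One practical use of such verb lists is screening draft objectives for vague, unmeasurable wording of the kind warned against earlier (“understand”, “appreciate”). The sketch below is a rough illustration using a small subset of the verbs above; the draft objectives and the “vague” word list are assumptions for the example.

    # Minimal sketch: flagging vague verbs in draft objectives and matching
    # measurable verbs to a taxonomy level. Verb sets are a subset of the
    # table above; the example objectives are invented.
    BLOOM_VERBS = {
        "Knowledge": {"define", "identify", "list", "name", "state"},
        "Comprehension": {"classify", "describe", "explain", "summarize"},
        "Application": {"apply", "compute", "demonstrate", "solve", "use"},
        "Analysis": {"analyze", "differentiate", "distinguish", "outline"},
        "Synthesis": {"compose", "create", "design", "formulate", "plan"},
        "Evaluation": {"appraise", "assess", "judge", "justify", "weigh"},
    }
    VAGUE = {"understand", "appreciate", "know", "learn"}

    def check_objective(text):
        words = set(text.lower().split())
        for level, verbs in BLOOM_VERBS.items():
            if verbs & words:
                return f"OK ({level})"
        if VAGUE & words:
            return "vague -- replace with a measurable action verb"
        return "no recognized action verb"

    print(check_objective("The student will summarize the main events of a story"))
    print(check_objective("Students should understand Erikson's theory"))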
Examples of Instructional Objectives for the Cognitive Domain

1. The student will recall the four major food groups without error. (Knowledge)
2. By the end of the semester, the student will summarize the main events of a story in grammatically correct English. (Comprehension)
3. Given a presidential speech, the student will be able to point out the positions that attack a political opponent personally rather than the opponent’s political programs. (Analysis)
4. Given a short story, the student will write a different but plausible ending. (Synthesis)
5. Given fractions not covered in class, the student will multiply them on paper with 85 percent accuracy. (Application)
6. Given a description of a country’s economic system, the student will defend it by basing arguments on principles of socialism. (Evaluation)
7. From memory, with 80 percent accuracy, the student will match each United States General with his most famous battle. (Knowledge)
8. The student will describe the interrelationships among acts in a play. (Analysis)

Reference: Kubiszyn, K., & Borich, G. (1984). Educational testing and measurement: Classroom application and practice. Glenview, IL: Scott, Foresman, pp. 53-55.
Resources on Bloom’s Taxonomy of the Cognitive Domain and Writing Educational Objectives

Bloom, B.S. (1956). Taxonomy of educational objectives, Vol. 1. New York: McKay.

Jacobs, L.C. and Chase, C.I. (1992). Developing and using tests effectively: A guide for faculty. San Francisco: Jossey-Bass.

Web resources:

Allen, T. (1998). The taxonomy of educational objectives. Retrieved November 3, 2003 from the Humboldt State University Web site: http://www.humboldt.edu/~tha1/bloomtax.html

Bixler, B. (2002). Writing educational goals and objectives. Retrieved November 3, 2003 from the Pennsylvania State University Web site: http://www.personal.psu.edu/staff/b/x/bxb11/Objectives/

Bloom’s taxonomy. (2003). Retrieved November 3, 2003 from the University of Victoria Counseling Services Web site: http://www.coun.uvic.ca/learn/program/hndouts/bloom.html

Clark, D. (2002). Learning domains or Bloom’s taxonomy. Retrieved November 3, 2003 from http://www.nwlink.com/~donclark/hrd/bloom.html

Huitt, W. (2000). Bloom et al.’s taxonomy of the cognitive domain. Retrieved November 3, 2003 from the Valdosta State University Educational Psychology Web site: http://chiron.valdosta.edu/whuitt/col/cogsys/bloom.html

Krumme, G. (2001). Major categories on the taxonomy of educational objectives. Retrieved November 3, 2003 from the University of Washington Web site: http://faculty.washington.edu/krumme/guides/bloom.html

Writing educational goals and objectives. (2001). Retrieved November 3, 2003 from the University of Mississippi School of Pharmacy Bureau of Pharmaceutical Services Web site: http://www.pharmd.org/thebureau/N.htm
Test Blueprint

Once you know the learning objectives and item types you want to include in your test you should create a test blueprint. A test blueprint, also known as test specifications, consists of a matrix, or chart, representing the number of questions you want in your test within each topic and level of objective. The blueprint identifies the objectives and skills that are to be tested and the relative weight on the test given to each. The blueprint can help you ensure that you are obtaining the desired coverage of topics and level of objective. Once you create your test blueprint you can begin writing your items!

Example: 40-item exam

                 Topic A   Topic B   Topic C   Topic D   TOTAL
  Knowledge         1         2         1         1       5 (12.5%)
  Comprehension     2         1         2         2       7 (17.5%)
  Application       4         4         3         4      15 (37.5%)
  Analysis          3         2         3         2      10 (25%)
  Synthesis         -         1         -         1       2 (5%)
  Evaluation        -         -         1         -       1 (2.5%)
  TOTAL            10        10        10        10      40 (100%)
Once you create your blueprint you should write your items to match the level of objective within each topic area.
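Because the blueprint is just a matrix, its row and column totals can be checked mechanically. The sketch below rebuilds the 40-item example above and verifies the totals; the placement of the single Synthesis and Evaluation items across topics follows the reconstruction above and is otherwise an assumption.

    # Minimal sketch: the 40-item blueprint above as a matrix, with its
    # row and column totals checked mechanically.
    blueprint = {
        "Knowledge":     {"A": 1, "B": 2, "C": 1, "D": 1},
        "Comprehension": {"A": 2, "B": 1, "C": 2, "D": 2},
        "Application":   {"A": 4, "B": 4, "C": 3, "D": 4},
        "Analysis":      {"A": 3, "B": 2, "C": 3, "D": 2},
        "Synthesis":     {"A": 0, "B": 1, "C": 0, "D": 1},
        "Evaluation":    {"A": 0, "B": 0, "C": 1, "D": 0},
    }

    total = sum(n for row in blueprint.values() for n in row.values())
    for level, row in blueprint.items():
        count = sum(row.values())
        print(f"{level:13s} {count:2d} items ({100 * count / total:.1f}%)")

    for topic in "ABCD":  # each topic should carry 10 items (25%)
        assert sum(row[topic] for row in blueprint.values()) == 10
    assert total == 40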
Description of Multiple-Choice Items Multiple-Choice Items: Multiple-choice items can be used to measure knowledge outcomes and various types of learning outcomes. They are most widely used for measuring knowledge, comprehension, and application outcomes. The multiple-choice item provides the most useful format for measuring achievement at various levels of learning. When selection-type items are to be used (multiple-choice, true-false, matching, check all that apply) an effective procedure is to start each item as a multiple-choice item and switch to another item type only when the learning outcome and content make it desirable to do so. For example, (1) when there are only two possible alternatives, a shift can be made to a true-false item; and (2) when there are a number of similar factors to be related, a shift can be made to a matching item. Strengths: 1. Learning outcomes from simple to complex can be measured. 2. Highly structured and clear tasks are provided. 3. A broad sample of achievement can be measured. 4. Incorrect alternatives provide diagnostic information. 5. Scores are less influenced by guessing than true-false items. 6. Scores are more reliable than subjectively scored items (e.g., essays). 7. Scoring is easy, objective, and reliable. 8. Item analysis can reveal how difficult each item was and how well it discriminated between the strong and weaker students in the class 9. Performance can be compared from class to class and year to year 10. Can cover a lot of material very efficiently (about one item per minute of testing time). 11. Items can be written so that students must discriminate among options that vary in degree of correctness. 12. Avoids the absolute judgments found in True-False tests.
Limitations: 1. Constructing good items is time consuming. 2. It is frequently difficult to find plausible distractors. 3. This item is ineffective for measuring some types of problem solving and the ability to organize and express ideas. 4. Real-world problem solving differs – a different process is involved in proposing a solution versus selecting a solution from a set of alternatives. 5. Scores can be influenced by reading ability. 6. There is a lack of feedback on individual thought processes – it is difficult to determine why individual students selected incorrect responses. 7. Students can sometimes read more into the question than was intended. 8. Often focus on testing factual information and fails to test higher levels of cognitive thinking. 9. Sometimes there is more than one defensible “correct” answer.
Last revised August 19, 2004
12 10. They place a high degree of dependence on the student’s reading ability and the instructor’s writing ability. 11. Does not provide a measure of writing ability. 12. May encourage guessing. Helpful Hints: • Base each item on an educational or instructional objective of the course, not trivial information. • Try to write items in which there is one and only one correct or clearly best answer. • The phrase that introduces the item (stem) should clearly state the problem. • Test only a single idea in each item. • Be sure wrong answer choices (distractors) are at least plausible. • Incorporate common errors of students in distractors. • The position of the correct answer should vary randomly from item to item. • Include from three to five options for each item. • Avoid overlapping alternatives (see Example 3 following). • The length of the response options should be about the same within each item (preferably short). • There should be no grammatical clues to the correct answer. • Format the items vertically, not horizontally (i.e., list the choices vertically) • The response options should be indented and in column form. • Word the stem positively; avoid negative phrasing such as “not” or “except.” If this cannot be avoided, the negative words should always be highlighted by underlining or capitalization: Which of the following is NOT an example …… • Avoid excessive use of negatives and/or double negatives. • Avoid the excessive use of “All of the above” and “None of the above” in the response alternatives. In the case of “All of the above”, students only need to have partial information in order to answer the question. Students need to know that only two of the options are correct (in a four or more option question) to determine that “All of the above” is the correct answer choice. Conversely, students only need to eliminate one answer choice as implausible in order to eliminate “All of the above” as an answer choice. Similarly, with “None of the above”, when used as the correct answer choice, information is gained about students’ ability to detect incorrect answers. However, the item does not reveal if students know the correct answer to the question.
Example 1
The stem of the original item below fails to present the problem adequately or to set a frame of reference for responding.

Original:
1. World War II was:
   A. The result of the failure of the League of Nations.
   B. Horrible.
   C. Fought in Europe, Asia, and Africa.
   D. Fought during the period of 1939-1945.

Revised:
1. In which of these time periods was World War II fought?
   A. 1914-1917
   B. 1929-1934
   C. 1939-1945
   D. 1951-1955
   E. 1961-1969
Example 2
There should be no grammatical clues to the correct answer.

Original:
1. Albert Einstein was a:
   A. Anthropologist.
   B. Astronomer.
   C. Chemist.
   D. Mathematician.

Revised:
1. Who was Albert Einstein?
   A. An anthropologist.
   B. An astronomer.
   C. A chemist.
   D. A mathematician.
Example 3
Alternatives should not overlap (e.g., in the original form of this item, if either of the first two alternatives is correct, "C" is also correct).

Original:
1. During what age period is thumb-sucking likely to produce the greatest psychological trauma?
   A. Infancy
   B. Preschool period
   C. Before adolescence
   D. During adolescence
   E. After adolescence

Revised:
1. During what age period is thumb-sucking likely to produce the greatest psychological trauma?
   A. From birth to 2 years old
   B. From 2 years to 5 years old
   C. From 5 years to 12 years old
   D. From 12 years to 20 years old
   E. 20 years of age or older
Example 4
Example of how greater similarity among alternatives increases the difficulty of the item.

Easy:
1. Who was the President of the U.S. during the War of 1812?
   A. Grover Cleveland
   B. Abraham Lincoln
   C. James Madison
   D. Harry Truman
   E. George Washington

More Difficult:
1. Who was President of the U.S. during the War of 1812?
   A. John Q. Adams
   B. Andrew Jackson
   C. Thomas Jefferson
   D. James Madison
   E. George Washington

Reference: Marshall, J. C., & Hales, L. W. (1971). Classroom test construction. Reading, MA: Addison-Wesley.
Multiple-Choice Item Writing Guidelines

Multiple-choice questions typically have 3 parts: a stem, the correct answer (called the key), and several wrong answers (called distractors).

Procedural Rules:
• Use either the best answer or the correct answer format.
  • Best answer format refers to a list of options that can all be correct in the sense that each has an advantage, but one of them is the best.
  • Correct answer format refers to one and only one right answer.
• Format the items vertically, not horizontally (i.e., list the choices vertically).
• Allow time for editing and other types of item revisions.
• Use good grammar, punctuation, and spelling consistently.
• Minimize the time required to read each item.
• Avoid trick items.
• Use the active voice.
• The ideal question will be answered correctly by 60-65% of the tested population.
• Have your questions peer-reviewed.
• Avoid giving unintended cues, such as making the correct answer longer than the distractors.
Content-related Rules:
• Base each item on an educational or instructional objective of the course, not trivial information.
• Test for important or significant information.
• Focus on a single problem or idea for each test item.
• Keep the vocabulary consistent with the examinees' level of understanding.
• Avoid cueing one item with another; keep items independent of one another.
• Use the author's examples as a basis for developing your items.
• Avoid overly specific knowledge when developing items.
• Avoid textbook, verbatim phrasing when developing the items.
• Avoid items based on opinions.
• Use multiple-choice to measure higher level thinking.
• Be sensitive to cultural and gender issues.
• Use case-based questions that use a common text to which a set of questions refers.
Stem Construction Rules:
• State the stem in either question form or completion form.
• When using a completion form, don't leave a blank for completion in the beginning or middle of the stem.
• Ensure that the directions in the stem are clear, and that wording lets the examinee know exactly what is being asked.
• Avoid window dressing (excessive verbiage) in the stem.
• Word the stem positively; avoid negative phrasing such as "not" or "except." If this cannot be avoided, the negative words should always be highlighted by underlining or capitalization: Which of the following is NOT an example ...
• Include the central idea and most of the phrasing in the stem.
• Avoid giving clues by linking the stem to the answer (e.g., with a stem ending "... is an example of an:", test-wise students will know the correct answer must start with a vowel).
General Option Development Rules:
• Place options in logical or numerical order.
• Use letters in front of options rather than numbers; numerical answers in numbered items may be confusing to students.
• Keep options independent; options should not be overlapping.
• Keep all options homogeneous in content.
• Keep the length of options fairly consistent.
• Avoid, or use sparingly, the phrase "all of the above."
• Avoid, or use sparingly, the phrase "none of the above."
• Avoid the use of the phrase "I don't know."
• Phrase options positively, not negatively.
• Avoid distractors that can clue test-wise examinees; for example, absurd options, formal prompts, or semantic (overly specific or overly general) clues.
• Avoid giving clues through the use of faulty grammatical construction.
• Avoid specific determiners, such as "never" and "always."
• Position the correct option so that it appears about the same number of times in each possible position for a set of items.
• Make sure that there is one and only one correct option.
Distractor (incorrect options) Development Rules:
• Use plausible distractors.
• Incorporate common errors of students in distractors.
• Avoid technically phrased distractors.
• Use familiar yet incorrect phrases as distractors.
• Use true statements that do not correctly answer the item.
• Avoid the use of humor when developing options.
• Distractors that are not chosen by any examinees should be replaced.
Suggestions for Writing Good Multiple Choice Items:
• Present practical or real-world situations to the students.
• Present the student with a diagram of equipment and ask for application, analysis, or evaluation.
• Present actual quotations taken from newspapers or other published sources and ask for the interpretation or evaluation of these quotations.
• Use pictorial materials that require students to apply principles and concepts.
• Use charts, tables, or figures that require interpretation.
References:
Carneson, J., Delpierre, G., & Masters, K. (n.d.). Designing and managing multiple choice questions: Appendix B, designing MCQs – do's and don'ts. Retrieved November 3, 2003 from the University of Cape Town Web site: http://www.uct.ac.za/projects/cbe/mcqman/mcqappb.html
Haladyna, T. M. (1999). Developing and validating multiple-choice test items, 2nd ed. Mahwah, NJ: Lawrence Erlbaum Associates.
Haladyna, T. M. (1989). Taxonomy of multiple-choice item-writing rules. Applied Measurement in Education, 2(1), 37-50.
Jacobs, L. C. (2002). How to write better tests: A handbook for improving test construction skills. Retrieved November 3, 2003 from Indiana University Bloomington Evaluation Services & Testing Web site: http://www.indiana.edu/~best/write_better_tests.shtml
Sevenair, J. P., & Burkett, A. R. (1997). Item writing guidelines. Retrieved November 3, 2003 from the Xavier University of Louisiana Web site: http://webusers.xula.edu/jsevenai/objective/guidelines.html
Writing multiple choice questions that demand critical thinking. (2002). Retrieved November 3, 2003 from the University of Oregon Teaching Effectiveness Program Web site: http://tep.uoregon.edu/resources/assessment/multiplechoicequestions/mc4critthink.html
Guidelines to Writing Test Items
• Begin writing items well ahead of the time when they will be used; allow time for revision.
• Match items to intended outcomes at the proper difficulty level to provide a valid measure of the instructional objectives.
• Be sure each item deals with an important aspect of the content area and not with trivia.
• Be sure that the problem posed is clear and unambiguous.
• Be sure that each item is independent of all other items (i.e., a hint to an answer should not be unintentionally embedded in another item).
• Be sure the item has one correct or best answer on which experts would agree.
• Prevent unintended clues to the answer in the statement or question (e.g., grammatical inconsistencies such as 'a' or 'an' give clues).
• Avoid duplication of the textbook in writing test items; don't lift quotes directly from any textual materials.
• Avoid trick or catch questions in an achievement test. (Don't waste time testing how well the student can interpret your intentions.)
• On a test with different question formats (e.g., multiple-choice and true-false), group all items of similar format together.
• Questions should follow an easy-to-difficult progression.
• Space the items to eliminate overcrowding.
• Place diagrams and tables above the item using the information, not below.
Preparing Your Students for Taking Multiple-Choice Tests

1. Specify objectives or give study questions.
   • Students should not be forced to guess what will be on a test.
   • Give students specific study questions or topics, then draw the test items from those questions.
   • There should be many study questions that are comprehensive and cover all the important ideas in the course.

2. Try to reduce frustration for the creative student.
   • Avoid "all of these," "none of these," and "both A and B" answer choices.

3. Defeat the "test-wise" strategies of students who don't study. Be aware of the general rules of thumb that students use to guess on multiple-choice exams and try to counter them:
   a. Pick the longest answer.
      o Make sure the longest answer is correct only part of the time; try to make options equal length.
   b. When in doubt, pick "C."
      o Make sure the correct answer letter varies.
   c. Never pick an answer which uses the word "always" or "never" in it.
      o Make sure this option is correct part of the time, or avoid using "always" and "never" in the option choices.
   d. If two answers express opposites, pick one or the other and ignore other alternatives.
      o Sometimes offer opposites when neither is correct, or offer two pairs of opposites.
   e. If in doubt, guess.
      o Use five alternatives instead of three or four to reduce guessing.
   f. Pick the scientific-sounding answer.
      o Use scientific-sounding jargon in wrong answers.
   g. Don't pick an answer which is too simple or obvious.
      o Sometimes make the simple, obvious answer the correct one.
   h. Pick a word which you remember was related to the topic.
      o When creating distractors, use terminology from the same area of the text as the right answer, but use those words incorrectly so the wrong answers are definitely wrong.

Reference: Dewey, R. A. (1998, January 20). Writing multiple choice items which require comprehension. Retrieved November 3, 2003 from http://www.psywww.com/selfquiz/aboutq.htm
Sample Multiple-Choice Items Related to Bloom's Taxonomy

Knowledge Items:

Outcome: Identifies the meaning of a term.
Reliability is the same as:
A. consistency.
B. relevancy.
C. representativeness.
D. usefulness.

Outcome: Identifies the order of events.
What is the first step in constructing an achievement test?
A. Decide on test length.
B. Identify the intended learning outcomes.
C. Prepare a table of specifications.
D. Select the item types to use.

Comprehension Items:

Outcome: Identifies an example of a term.
Which one of the following statements contains a specific determiner?
A. America is a continent.
B. America was discovered in 1492.
C. America has some big industries.
D. America's population is increasing.

Outcome: Interprets the meaning of an idea.
The statement that "test reliability is a necessary but not sufficient condition of test validity" means that:
A. a reliable test will have a certain degree of validity.
B. a valid test will have a certain degree of reliability.
C. a reliable test may be completely invalid and a valid test completely unreliable.

Outcome: Identifies an example of a concept or principle.
Which of the following is an example of a criterion-referenced interpretation?
A. Derik earned the highest score in science.
B. Erik completed his experiment faster than his classmates.
C. Edna's test score was higher than 50 percent of the class.
D. Tricia set up her laboratory equipment in five minutes.
Outcome: Predicts the most probable effect of an action.
What is most likely to happen to the reliability of the scores for a multiple-choice test where the number of alternatives for each item is changed from three to four?
A. It will decrease.
B. It will increase.
C. It will stay the same.
D. There is no basis for making a prediction.
Application Items

Outcome: Distinguishes between properly and improperly stated outcomes.
Which one of the following learning outcomes is properly stated in terms of student performance?
A. Develops an appreciation of the importance of testing.
B. Explains the purpose of test specifications.
C. Learns how to write good test items.
D. Realizes the importance of validity.

Outcome: Improves defective test items.
Directions: Read the following test item and then indicate the best change to make to improve the item.

Which one of the following types of learning outcomes is most difficult to evaluate objectively?
1. A concept.
2. An application.
3. An appreciation.
4. None of the above.

The best change to make in the previous item would be to:
A. change the stem to incomplete-statement form.
B. use letters instead of numbers for each alternative.
C. remove the indefinite articles "a" and "an" from the alternatives.
D. replace "none of the above" with "an interpretation."
Analysis Items

Directions: Read the following comments a teacher made about testing. Then answer the questions that follow by circling the letter of the best answer.

"Students go to school to learn, not to take tests. In addition, tests cannot be used to indicate a student's absolute level of learning. All tests can do is rank students in order of achievement, and this relative ranking is influenced by guessing, bluffing, and the subjective opinions of the teacher doing the scoring. The teaching-learning process would benefit if we did away with tests and depended on student self-evaluation."
Outcome: Recognizes unstated assumptions.
1. Which one of the following unstated assumptions is this teacher making?
A. Students go to school to learn.
B. Teachers use essay tests primarily.
C. Tests make no contribution to learning.
D. Tests do not indicate a student's absolute level of learning.

Outcome: Identifies the meaning of a term.
2. Which one of the following types of test is this teacher primarily talking about?
A. Diagnostic test.
B. Formative test.
C. Pretest.
D. Summative test.
Synthesis Item (see paragraph for analysis items)

Outcome: Identifies relationships.
3. Which one of the following propositions is most essential to the final conclusion?
A. Effective self-evaluation does not require the use of tests.
B. Tests place students in rank order only.
C. Test scores are influenced by factors other than achievement.
D. Students do not go to school to take tests.
Reference: Gronlund, N. E. (1998). Assessment of student achievement. Boston: Allyn and Bacon.
More Sample Multiple-Choice Items

Knowledge:
1. In the area of physical science, which one of the following definitions describes the term "polarization"?
A. The separation of electric charges by friction.
B. The ionization of atoms by high temperatures.
C. The interference of sound waves in a closed chamber.
D. The excitation of electrons by high frequency light.
E. The vibration of transverse waves in a single plane.
Simple recall of the correct definition of polarization is required.
Comprehension:
2. Which one of the following describes what takes place in the so-called PREPARATION stage of the creative process, as applied to the solution of a particular problem?
A. The problem is identified and defined.
B. All available information about the problem is collected.
C. An attempt is made to see if the proposed solution to the problem is acceptable.
D. The person goes through some experience leading to a general idea of how the problem can be solved.
E. The person sets the problem aside, and gets involved with some other unrelated activity.

The knowledge of the five stages of the creative process must be recalled (knowledge), and the student is tested for an understanding (comprehension) of the meaning of each term, in this case, "preparation."
Application:
3. Which one of the following memory systems does a piano-tuner mainly use in his occupation?
A. Echoic memory
B. Short-term memory
C. Long-term memory
D. Mono-auditory memory
E. None of the above

This question tests for the application of previously acquired knowledge (the various memory systems).
Analysis:
4. Read carefully through the paragraph below, and decide which of the options A-D is correct.

"The basic premise of pragmatism is that questions posed by speculative metaphysical propositions can often be answered by determining what the practical consequences of the acceptance of a particular metaphysical proposition are in this life. Practical consequences are taken as the criterion for assessing the relevance of all statements or ideas about truth, norm and hope."

A. The word "acceptance" should be replaced by "rejection."
B. The word "often" should be replaced by "only."
C. The word "speculative" should be replaced by "hypothetical."
D. The word "criterion" should be replaced by "measure."
This question requires prior knowledge of and understanding about the concept of pragmatism. The student is tested on his/her ability to analyze whether a word fits with the accepted definition of pragmatism.
Evaluation:
5. Judge the sentence in italics according to the criteria given below:

"The United States took part in the Gulf War against Iraq BECAUSE of the lack of civil liberties imposed on the Kurds by Saddam Hussein's regime."

A. The assertion and the reason are both correct, and the reason is valid.
B. The assertion and the reason are both correct, but the reason is invalid.
C. The assertion is correct but the reason is incorrect.
D. The assertion is incorrect but the reason is correct.
E. Both the assertion and the reason are incorrect.

A knowledge and understanding of Middle East politics is assumed. The student is tested on the ability to evaluate the relationship between cause and effect in the sentence in terms of predefined criteria.
Reference: Carneson, J., Delpierre, G., & Masters, K. (n.d.). Designing and managing multiple choice questions: Appendix C, multiple choice questions and Bloom’s taxonomy. Retrieved November 3, 2003 from the University of Cape Town Web site: http://www.uct.ac.za/projects/cbe/mcqman/mcqappc.html
Good versus Poor Multiple-Choice Items

Presented here are some possible multiple-choice questions in the areas of statistics, biology, and communication. Even though each question within a pair assesses the same content, those in bold encourage the learner to think about the problem in more depth, to use and apply their knowledge, and not simply recall memorized information. Because the questions in bold encourage more meaningful processing of information, they are considered more effective. *The correct answers have been starred.

Statistics
1a. The mean of a distribution of test scores is the:
a) Most frequently occurring score
b) 50th percentile
c) Arithmetic average*
d) Measure of score range

1b. A university developed an aptitude test to use for admission to its Honors Program. The test was administered to a group of seven applicants who obtained the following scores: 70, 72, 72, 80, 89, 94, 98. The mean score on the aptitude test is:
a) 72
b) 80
c) 82*
d) 90
Biology
1a. Suppose you thoroughly and adequately examined a particular type of cell, using the transmission electron microscope, and discovered that it completely lacked ribosomes. You would then conclude that this cell also lacked:
a) A nucleus
b) DNA
c) Cellulose
d) Protein synthesis*

1b. Ribosomes are important for:
a) The nucleus
b) DNA
c) Cellulose
d) Protein synthesis*
Communication
1a. What is an example of pseudolistening (a pitfall to effective listening)?
a) daydreaming while nodding your head*
b) sidetracking a conversation
c) premature replying
d) paying attention to the context

1b. While Amy is presenting her proposal to the group, Josh is thinking about his weekend fishing trip. Even though he is not listening to a word Amy is saying, he manages to occasionally nod his head in agreement. Josh's behavior is an example of:
a) pseudolistening*
b) premature replying
c) attentiveness to the context
d) conversation sidetracking
Activity: Identifying Flawed Multiple-Choice Items

For each pair of items, decide which item is better and why.

1A. The promiscuous use of sprays, oils, and antiseptics in the nose during acute colds is a pernicious practice because it may have a deleterious effect on:
a. the sinuses
b. red blood cells
c. white blood cells
d. the olfactory nerve

1B. Frequent use of sprays, oils, and antiseptics in the nose during a bad cold may result in:
a. the spreading of the infection to the sinuses
b. damage to the olfactory nerve
c. destruction of white blood cells
d. congestion of the mucous membrane in the nose

2A. In 1965, the death rate from accidents of all types per 100,000 population in the 15-24 age group was:
a. 59.0
b. 59.1
c. 59.2
d. 59.3

2B. In 1965, the leading cause of death per 100,000 population in the 15-24 age group was from:
a. respiratory disease
b. cancer
c. accidents
d. rheumatic heart disease

3A. About how many calories are recommended daily for a 14-year-old who is 62 in. tall, weighs 103 lbs., and is moderately active?
a. 1,500
b. 2,000
c. 2,500
d. 3,000

3B. About how many calories are recommended daily for a 14-year-old who is 62 in. tall, weighs 103 lbs., and is moderately active?
a. 0
b. 2,000
c. 2,500
d. 3,000
4A. Which of the following is a category in the taxonomy of the cognitive domain?
A. Reasoning ability
B. Critical thinking
C. Rote learning
D. All of the above
E. None of the above

4B. What is the most complex level in the taxonomy of the cognitive domain?
A. Knowledge
B. Synthesis
C. Evaluation
D. Analysis
E. Comprehension
Answers: Identifying Flawed Multiple-Choice Items

Pair #1: 1B is the better item. 1A is wordy and uses vocabulary that may be unfamiliar to many students. 1B not only asks what part of the body is affected (sinuses) but also what the result is (spreading of infection).

Pair #2: 2B is the better item. The difference between the options in 2A is trivial. Also, 2A asks for memorization of factual information.

Pair #3: 3A is the better item. 3B contains a distractor that is not plausible (0 calories).

Pair #4: 4B is the better item. 4A asks for simple identification of a category, whereas 4B asks for differentiation between the levels. 4A also contains "all of the above" and "none of the above" as option choices, which should be avoided, if possible.

Pair #5: 5A is the better item. 5A cites the source of the information, whereas 5B can be construed as an opinion question.
Scenario-Based Problem Solving Item Set

Presented here are some scenario-based problem solving item sets in statistics and biology. This method provides a basis for testing complex thinking, application of knowledge, and integration of material. It is a structured and well-organized method of assessment, ensuring ease of scoring. Note that it may be time consuming to write appropriate scenarios.

Statistics
Two researchers were studying the relationship between amount of sleep each night and calories burned on an exercise bike for 42 men and women. They were interested in whether people who slept more had more energy to use during their exercise session. They obtained a correlation of .28, which has a two-tailed probability of .08. Alpha was .10.

1. Which is an example of a properly written research question?
a) Is there a relationship between amount of sleep and energy expended?*
b) Does amount of sleep correlate with energy used?
c) What is the cause of energy expended?
d) What is the value of rho?

2. What is the correct term for the variable amount of sleep?
a) Dependent*
b) Independent
c) Predictor
d) y

3. What is the correct statistical null hypothesis?
a) There is no correlation between sleep and energy expended
b) Rho equals zero*
c) R equals zero
d) Rho equals r

4. What conclusions should you draw regarding the null hypothesis?
a) Reject*
b) Accept
c) Cannot determine without more information
5. What conclusions should you draw regarding this study?
a) The correlation was significant
b) The correlation was not significant
c) A small relationship exists*
d) No relationship exists
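The decision in item 4 follows the standard rule: reject the null hypothesis when the obtained probability falls below alpha (here .08 < .10). As a minimal illustration, not from the original handout, the rule can be expressed directly in code:

```python
def null_hypothesis_decision(p_value, alpha):
    """Standard decision rule: reject the null hypothesis when p < alpha."""
    return "reject" if p_value < alpha else "fail to reject"

print(null_hypothesis_decision(0.08, 0.10))  # -> 'reject', matching item 4's starred answer
```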
Reference: Haladyna, T. M. (1994). Developing and validating multiple-choice test items, 1st ed. Hillsdale, NJ: Lawrence Erlbaum Associates.
Biology
One day you meet a student watching a wasp drag a paralyzed grasshopper down a small hole in the ground. When asked what he is doing he replies, "I'm watching that wasp store paralyzed grasshoppers in her nest to feed her offspring."

1. Which of the following is the best description of his reply?
a) He is not a careful observer.
b) He is stating a conclusion only partly derived from his observation.*
c) He is stating a conclusion entirely drawn from his observation.
d) He is making no assumptions.

2. Which of the following additional observations would add the most strength to the student's reply in Question 1?
a) Observing the wasp digging a similar hole.
b) Observing the wasp dragging more grasshoppers into the hole.
c) Digging into the hole and observing wasp eggs on the paralyzed grasshopper.*
d) Observing adult wasps emerging from the hole a month later.

3. Both of you wait until the wasp leaves the area, then you dig into the hole and observe three paralyzed grasshoppers, each with a white egg on its side. The student states that this evidence supports his reply in Question 1. Which of the following assumptions is he making?
a) The eggs are grasshopper eggs.
b) The wasp laid the eggs.*
c) The wasp dug the hole.
d) The wasp will return with another grasshopper.
4. You take the white eggs to the Biology laboratory. Ten days later immature wasps hatched from the eggs. The student states that this evidence supports his reply in Question 1. Which of the following assumptions is he making?
a) The wasp dug the hole.
b) The wasp stung the grasshoppers.
c) The grasshoppers were dead.
d) A paralyzed grasshopper cannot lay an egg.*
Reference: Donovan, M. P., & Allen, R. D. (n.d.). Analytical problems in biology. Morgantown, WV: Alpha Editions.
Additional reading: Terry, T. M. (1980). The narrative exam – an approach to creative organization of multiple-choice tests. Journal of College Science Teaching, 9(3), 156-158.
An Alternative Multiple-Choice Method

There are a number of different ways multiple-choice exams can be used in the classroom. One such example comes from Stanford University. Presented here is a summary of the Human Biology Project implemented at this institution.
The Human Biology Project

The Stanford Learning Lab implemented a new approach to assessing student learning by using weekly on-line problem sets for a Stanford Human Biology course, The Human Organism. The web-based problem sets created by the Stanford Learning Lab allowed a large lecture class to focus on the individual student, permitting personal and rapid feedback.

The Human Biology class at Stanford University is a large undergraduate course with 2 professors, 5 assistants, and 208 students. It covers the topic of physiology, or how animals (including humans) work. The course consists of four one-hour lectures and a one-hour discussion section each week. During Spring Quarter 1998, the faculty team provided a problem set to the students via the Web at the end of each week's lectures. Graded by computer, the correct answers to the sets were posted on the Web. In addition to selecting a multiple-choice answer to each question, students were required to submit a short "rationale" explaining their answers. The faculty team sorted responses to make it easier to explore frequently-missed questions. The course assistant then used this information to tailor class instruction.

Sample Question and Rationale:
1. Which of the following has/have intrinsic pacemaker characteristics?
a) Medulla
b) Pons
c) Sinoatrial node
d) Atrioventricular node

Ideal rationale: SA node is the normal pacemaker for the entire heart. AV node also has pacemaker potential, but is overshadowed by SA node. Medulla has pacemaker potential for breathing rhythm as well. Pons helps refine rhythm, but does not have pacemaker potential.
Less-than-ideal rationales:
• Offering an incomplete answer: "Normally the SA node is responsible for generating heart rate, and it is able to do this because of its intrinsic rhythm. The AV node also has an intrinsic rhythm, but it is 'overshadowed' by that of the SA node."
• Providing a quotation from the book: "The sinoatrial node is the pacemaker of the mammalian heart."
• Providing irrelevant information: "Stretch receptors are located in the aortic arch and the carotid sinus. They have the ability to respond to changes in pressure."
• Restating the answer: "The SA node, AV node, and medulla all possess intrinsic pacemaker characteristics as they all serve as intrinsic pacemakers."
• Blind appeal to authority: "This answer is right because Professor Heller said that it was, and Professor Heller is cool."
• Worst rationale: No rationale submitted.
References:
Schaeffer, E., Michalchik, V., Martin, M., Birks, H., & Nash, J. (1999). Web-based problem sets in the human biology program: Fall 1998. Retrieved November 3, 2003 from Stanford University, Stanford Learning Lab Web site: http://sll.stanford.edu/projects/humbioa/HumBio_2a2b_98.pdf
Nash, J., & Schaeffer, E. (1999, January 11). Web-based coursework proves useful. Speaking of Computers, 49. Retrieved November 3, 2003 from http://acomp.stanford.edu/acpubs/SOC/Back_Issues/SOC49/humbio.html
Guidelines for Administering Examinations

One of the basic problems in education is determining how well students have learned the material covered in a course. It is quite possible that a particular student may know the material being tested extremely well but still perform poorly on examinations. If one conceives of an examination as a measurement device analogous to, for example, a ruler, then the accuracy of the assessment of how "well" someone knows the course material is a function of the quality of the examination. However, the actual administration of the examination may also affect a student's performance. Presented below is a list of general principles to consider when designing and administering examinations.

1. Give complete instructions as to how the examination is to be taken. It is helpful to indicate the number of points each section of the examination counts or the amount of time to spend on each question. This indication of the relative importance of each question helps students allocate their time and efforts wisely.

2. If the student is allowed to take any aids into the examination room (such as a calculator, notes, or the textbook), be sure to state specifically what is allowed.

3. The examination should test the lesson or course objectives. The lesson assignments themselves should provide preparation for taking the final examination in content as well as in the practice of answering certain kinds of questions. For example, if the lesson assignments ask all essay questions, it would be inappropriate for the examination to consist of 200 multiple-choice questions. Practice taking the completed test yourself; you should count on the students taking about four times the amount of time it takes you to complete the test.

4. For final examinations, structure the test to cover the scope of the entire course. The examination should be comprehensive enough to test adequately the student's learning of the course material. It is usually a good idea to use a variety of different types of questions on the examination (e.g., multiple-choice, essay, etc.) because certain subject matter areas can be covered most effectively with certain types of items. However, items of the same type should be kept together when possible.

5. Prior to the examination, tell the students what types of questions will be on the test (essay, multiple-choice, etc.). If possible, it is a good practice to allow students access to past (retired) examinations so that they have some idea of what to expect. Also, if you plan to administer essay exams, it is a good idea to share the general grading scheme ahead of time so that the students know the criteria by which they will be evaluated.

6. Present to the students a list of review questions or a list of topics to be covered on the examination, along with an indication of the relative emphasis on each.

7. Give detailed study suggestions.

8. Indicate how much the examination will count toward determining the final grade.
Analyzing Multiple-Choice Item Responses

Understanding how to interpret and use information based on student test scores is as important as knowing how to construct a well-designed test. Using feedback from your test to guide and improve instruction is an essential part of the process. Statistical summaries of item responses can provide useful information for reviewing a multiple-choice test. Three of these statistics are:

Item difficulty: the percentage of students that correctly answered the item.
• Also referred to as the p-value.
• The range is from 0% to 100%, or more typically written as a proportion from 0.0 to 1.00.
• The higher the value, the easier the item.
• Calculation: Divide the number of students who got the item correct by the total number of students who answered it.
• P-values above 0.90 indicate very easy items that should not be reused on subsequent tests. If almost all of the students can get the item correct, it is a concept probably not worth testing.
• P-values below 0.20 indicate very difficult items that should be reviewed for possibly confusing language, removed from subsequent tests, and/or highlighted as an area for reinstruction. If almost all of the students get the item wrong, there is either a problem with the item or students did not get the concept.
• Ideal value: Slightly higher than midway between chance (1.00 divided by the number of choices) and a perfect score (1.00) for the item. For a 5-option multiple-choice question, chance is 0.20, so the ideal value is about 0.60 (60%).

Item discrimination: the point-biserial relationship between how well students did on the item and their total test score.
• Also referred to as the point-biserial correlation (PBS).
• The range is from -1.00 to 1.00.
• The higher the value, the more discriminating the item. A highly discriminating item indicates that students who had high test scores got the item correct, whereas students who had low test scores got the item incorrect.
• Items with discrimination values near or less than zero should be removed from the test. Such values indicate that students who overall did poorly on the test did better on that item than students who overall did well; the item may be confusing for your better scoring students in some way.
• Acceptable range: 0.20 or higher.
• Ideal value: The closer to 1.00 the better.
• Calculation:

  PBS = ((X̄_C − X̄_T) / S.D._Total) × √(p / q), where

  X̄_C = the mean total score for persons who responded correctly to the item
  X̄_T = the mean total score for all persons
  p = the difficulty value for the item
  q = 1 − p
  S.D._Total = the standard deviation of total test scores

Reliability coefficient: a measure of the amount of measurement error associated with a test score.
• The range is from 0.0 to 1.0.
• The higher the value, the more reliable the overall test score.
• Typically, the internal consistency reliability is measured. This indicates how well the items are correlated with one another.
• High reliability indicates that the items are all measuring the same thing, or general construct (e.g., knowledge of how to calculate integrals for a calculus course).
• With multiple-choice items that are scored correct/incorrect, the Kuder-Richardson formula 20 (KR-20) is often used to calculate the internal consistency reliability:

  KR-20 = (K / (K − 1)) × (1 − Σpq / σ²_x), where

  K = number of items
  p = proportion of persons who responded correctly to an item (i.e., the difficulty value)
  q = proportion of persons who responded incorrectly to an item (i.e., 1 − p)
  σ²_x = total score variance

• Two ways to improve the reliability of the test are to (1) increase the number of questions in the test or (2) use items that have high discrimination values.
• Acceptable range: 0.60 or higher.
• Ideal value: 1.00.
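All three statistics are simple to compute once item responses have been scored 0/1. The following Python sketch is not part of the original handout; it assumes responses are stored as a list of per-student lists (1 for a correct answer, 0 otherwise) and implements the p-value, point-biserial, and KR-20 formulas exactly as defined above.

```python
import math

# responses[s][i] == 1 if student s answered item i correctly, else 0.

def item_difficulty(responses, i):
    """p-value: proportion of students who answered item i correctly."""
    return sum(row[i] for row in responses) / len(responses)

def item_discrimination(responses, i):
    """Point-biserial: ((mean_correct - mean_total) / sd_total) * sqrt(p / q)."""
    totals = [sum(row) for row in responses]          # each student's total score
    n = len(totals)
    mean_t = sum(totals) / n
    sd_t = math.sqrt(sum((t - mean_t) ** 2 for t in totals) / n)
    p = item_difficulty(responses, i)
    if p in (0.0, 1.0) or sd_t == 0.0:                # undefined for constant items/scores
        return 0.0
    correct_totals = [t for row, t in zip(responses, totals) if row[i] == 1]
    mean_c = sum(correct_totals) / len(correct_totals)
    return (mean_c - mean_t) / sd_t * math.sqrt(p / (1.0 - p))

def kr20(responses):
    """KR-20 internal consistency: (K/(K-1)) * (1 - sum(p*q) / total score variance)."""
    k = len(responses[0])                             # number of items
    totals = [sum(row) for row in responses]
    n = len(totals)
    mean_t = sum(totals) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n
    sum_pq = sum(item_difficulty(responses, i) * (1 - item_difficulty(responses, i))
                 for i in range(k))
    return (k / (k - 1)) * (1 - sum_pq / var_t)
```

For example, with the tiny response matrix `[[1, 1, 0], [1, 0, 0], [0, 1, 1], [1, 1, 1]]`, `item_difficulty(responses, 0)` returns 0.75 (three of four students answered item 1 correctly).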
Another useful item review technique is distractor evaluation. The distractor should be considered an important part of the item. Nearly 50 years of research shows that there is a relationship between the distractors students choose and total test score. The quality of the distractors influences student performance on a test item. Although the correct answer must be truly correct, it is just as important that the distractors be incorrect. Distractors should appeal to low scorers who have not mastered the material, whereas high scorers should infrequently select the distractors. Reviewing the options can reveal potential errors of judgment and inadequate performance of distractors. These poor distractors can be revised, replaced, or removed.

One way to study responses to distractors is with a frequency table. This table tells you the number and/or percent of students that selected a given distractor. Distractors that are selected by few or no students should be removed or replaced; they are likely so implausible to students that hardly anyone selects them.

• Definition: The incorrect alternatives in a multiple-choice item.
• Reported as: The frequency (count), or number of students, that selected each incorrect alternative.
• Acceptable range: Each distractor should be selected by at least a few students.
• Ideal value: Distractors should be equally popular.
• Interpretation:
  o Distractors that are selected by few or no students should be removed or replaced.
  o One distractor that is selected by as many or more students than the correct answer may indicate a confusing item and/or options.
• The number of people choosing a distractor can be lower or higher than expected because of:
  o Partial knowledge
  o A poorly constructed item
  o A distractor outside of the area being tested
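A frequency table of this kind takes only a few lines of code to produce. This is a minimal sketch, not from the handout: it assumes `answers` holds the raw option letter each student chose for each item, `key` holds the correct letter per item, and the "selected by only a few students" threshold (`few=2` here) is an arbitrary illustration, not a standard cutoff.

```python
from collections import Counter

def distractor_table(answers, key, i, few=2):
    """Print how often each option was chosen on item i, flagging weak distractors."""
    counts = Counter(row[i] for row in answers)
    for option in sorted(counts):
        n = counts[option]
        marker = "*" if option == key[i] else " "   # star the correct answer
        note = ""
        if option != key[i] and n <= few:
            note = "  <- rarely chosen; revise or replace"
        print(f"{marker} {option}: {n}{note}")
```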
References:
DeVellis, R. F. (1991). Scale development: Theory and applications. Newbury Park: Sage Publications.
Haladyna, T. M. (1999). Developing and validating multiple-choice test items, 2nd ed. Mahwah, NJ: Lawrence Erlbaum Associates.
Suen, H. K. (1990). Principles of test theories. Hillsdale, NJ: Lawrence Erlbaum Associates.
Activity: Item Analysis

Below is a sample item analysis performed by MEC that shows the summary table of item statistics for all items on a multiple-choice classroom exam. Review the item difficulty (P), the discrimination (R(IT)), and the distractors (options B-E).

Item Analysis (sample of 10 items) – correct answer is "A"
Summary Table of Test Item Statistics <test name>
N TOTAL = 932    MEAN TOTAL = 69.4    S.D. TOTAL = 10.2    ALPHA = .84

ITEM    P     R(IT)    NC    MC      MI     OMIT    A     B     C     D     E
 1.    0.72   0.34    667   71.56   67.66    1     667   187    37    30    10
 2.    0.90   0.21    840   70.11   69.02    1     840     1    76     9     5
 3.    0.60   0.39    561   72.66   65.47    0     561   233    46    88     4
 4.    0.99  -0.06    923   69.34   69.90    0     923     3     3     3     0
 5.    0.94   0.14    876   69.76   68.23    0     876     0    12    24    20
 6.    0.77  -0.01    716   69.34   69.57    0     716    16    25    35   140
 7.    0.47   0.31    432   72.76   66.16    3     432   107    68   165   157
 8.    0.12   0.08    114   71.61   68.39    8     114   218   264   153   175
 9.    0.08   0.04     75   70.78   69.03    0      75    64   120    67   606
10.    0.35   0.42    330   75.24   63.54    0     330    98    74   183   247
 .
 .
40.

Which item(s) would you remove altogether from the test? Why?

Which distractor(s) would you revise? Why?

Which items are working well?
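Before reading the flagged items themselves, the rules of thumb from the preceding pages can be applied mechanically as a first screening pass. The helper below is a hypothetical sketch, not from the handout; it simply encodes the thresholds stated earlier (p above .90 or below .20, discrimination at or below 0, acceptable discrimination from .20 up).

```python
def review_item(p, r):
    """Screen one item by its difficulty (p) and point-biserial discrimination (r)."""
    flags = []
    if p > 0.90:
        flags.append("very easy (p > .90): concept may not be worth retesting")
    if p < 0.20:
        flags.append("very difficult (p < .20): check wording or reteach")
    if r <= 0.0:
        flags.append("non-discriminating (r <= 0): remove or rewrite")
    elif r < 0.20:
        flags.append("low discrimination (r < .20): review the distractors")
    return flags or ["keep"]

# e.g., item 10 from the table above:
print(review_item(0.35, 0.42))   # -> ['keep']
```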
TEST TYPES

True/False

Good for:
• Knowledge level content
• Evaluating student understanding of popular misconceptions
• Concepts with two logical responses

Advantages:
• Can test large amounts of content
• Students can answer 3-4 questions per minute

Disadvantages:
• They are easy
• It is difficult to discriminate between students who know the material and students who don't
• Students have a 50-50 chance of getting the right answer by guessing
• Need a large number of items for high reliability

Tips for Writing Good True/False Items:
• Avoid double negatives.
• Avoid long/complex sentences.
• Use specific determiners with caution: never, only, all, none, always, could, might, can, may, sometimes, generally, some, few.
• Use only one central idea in each item.
• Don't emphasize the trivial.
• Use exact quantitative language.
• Don't lift items straight from the book.
• Make more items false than true (60/40). (Students are more likely to answer true.)
Matching

Good for:
• Knowledge level
• Some comprehension level, if appropriately constructed

Types:
• Terms with definitions
• Phrases with other phrases
• Causes with effects
• Parts with larger units
• Problems with solutions

Advantages:
• Maximum coverage at knowledge level in a minimum amount of space/prep time
• Valuable in content areas that have a lot of facts

Disadvantages:
• Time consuming for students
• Not good for higher levels of learning

Tips for Writing Good Matching Items:
• Use 15 items or fewer.
• Give good directions on the basis for matching.
• Use items in the response column more than once (reduces the effects of guessing).
• Use homogeneous material in each exercise.
• Make all responses plausible.
• Put all items on a single page.
• Put responses in some logical order (chronological, alphabetical, etc.).
• Responses should be short.
Multiple Choice

Good for:
• Application, synthesis, analysis, and evaluation levels

Types:
• Question/Right answer
• Incomplete statement
• Best answer

Advantages:
• Very effective
• Versatile at all levels
• Minimum of writing for student
• Guessing reduced
• Can cover broad range of content

Disadvantages:
• Difficult to construct good test items
• Difficult to come up with plausible distractors/alternative responses

Tips for Writing Good Multiple Choice Items:
• Stem should present a single, clearly formulated problem.
• Stem should be in simple, understandable language; delete extraneous words.
• Avoid "all of the above" (it can be answered on partial knowledge, e.g., knowing that one option is incorrect or that two are correct while unsure of the third).
• Avoid "none of the above."
• Make all distractors plausible/homogeneous.
• Don't overlap response alternatives (overlap decreases discrimination between students who know the material and those who don't).
• Don't use double negatives.
• Present alternatives in logical or numerical order.
• Place the correct answer at random ("A" is most often used).
• Make each item independent of others on the test.
• A way to judge a good stem: students who know the content should be able to answer before reading the alternatives.
• List alternatives on separate lines, indent, separate by a blank line, and use letters rather than numbers for alternative answers.
• Use more than 3 alternatives; 4 is best.
Short Answer

Good for:
• Application, synthesis, analysis, and evaluation levels

Advantages:
• Easy to construct
• Good for "who," "what," "where," "when" content
• Minimizes guessing
• Encourages more intensive study; the student must know the answer rather than merely recognize it

Disadvantages:
• May overemphasize memorization of facts
• Take care: questions may have more than one correct answer
• Scoring is laborious

Tips for Writing Good Short Answer Items:
• When using with definitions: supply the term, not the definition, for a better judge of student knowledge.
• For numbers, indicate the degree of precision/units expected.
• Use direct questions, not incomplete statements.
• If you do use incomplete statements, don't use more than 2 blanks within an item.
• Arrange blanks to make scoring easy.
• Try to phrase the question so there is only one possible answer.
Essay

Good for:
• Application, synthesis, and evaluation levels

Types:
• Extended response: synthesis and evaluation levels; a lot of freedom in answers
• Restricted response: more consistent scoring; outlines the parameters of responses

Advantages:
• Students less likely to guess
• Easy to construct
• Stimulates more study
• Allows students to demonstrate ability to organize knowledge, express opinions, show originality

Disadvantages:
• Can limit the amount of material tested, and therefore has decreased validity
• Subjective, potentially unreliable scoring
• Time consuming to score

Tips for Writing Good Essay Items:
• Provide reasonable time limits for thinking and writing.
• Avoid letting students answer a choice of questions. (You won't get a good idea of the breadth of student achievement when each student answers only a subset of the questions.)
• Give a definitive task to the student: compare, analyze, evaluate, etc.
• Use a checklist point system to score against a model answer: write an outline and determine how many points to assign to each part.
• Score one question at a time, across all papers, at the same time.
Oral Exams

Good for:
• Knowledge, synthesis, evaluation levels

Advantages:
• Useful as an instructional tool: allows students to learn at the same time as testing
• Allows the teacher to give clues to facilitate learning
• Useful to test speech and foreign language competencies

Disadvantages:
• Time consuming to give and take
• Students could perform poorly because they haven't had much practice with this format
• Provides no written record without checklists
Student Portfolios

Good for:
• Knowledge, application, synthesis, evaluation levels

Advantages:
• Can assess compatible skills: writing, documentation, critical thinking, problem solving
• Can allow the student to present the totality of their learning
• Students become active participants in the evaluation process

Disadvantages:
• Can be difficult and time consuming to grade
Performance

Good for:
• Application of knowledge, skills, abilities

Advantages:
• Measures some skills and abilities not possible to measure in other ways

Disadvantages:
• Cannot be used in some fields of study
• Difficult to construct
• Difficult to grade
• Time-consuming to give and take
My Definitions for Assessment and Evaluation

Assessment is the gathering of information in order to help us see our students' strengths and weaknesses, and also to see whether the established goals are being reached.

Evaluation is the culminating step of the process: an overview of the gathered results used to make decisions or judgments about students' learning, needs, and outcomes, as well as to analyze the instructor's teaching methods.
Diagnostic test:
1. Action Research: An investigation with a specific purpose.
2. Affective Outcomes: Responses, based on what was said and how it was said.
3. Annual Report: A report done every year.
4. Assessment: Ways of gathering information in order to see if the established goals are being achieved.
6. Assessment Tool: An instrument used to gather information.
7. Assessment Literacy: Knowledge of how to make an assessment tool.
8. Authentic Assessment: A well done assessment.
9. Benchmark: A unit of measurement.
13. Course mapping: A syllabus.
18. Educational Goals: What you expect your students to perform.
19. Formative assessment: Students learn from it.
25. Norm: A rule.
26. Portfolio: A collection of everything done in class.
28. Process: A method of doing something.
29. Program assessment
30. Reliability: Correct, trustworthy.
31. Rubric: A guideline.
35. Validity: Well done.
Erver Azurdia.
Oral Presentation Rubric

NONVERBAL SKILLS

EYE CONTACT
4: Holds attention of entire audience with the use of direct eye contact, seldom looking at notes.
3: Consistent use of direct eye contact with audience, but still returns to notes.
2: Displayed minimal eye contact with audience, while reading mostly from the notes.
1: No eye contact with audience, as entire report is read from notes.

BODY LANGUAGE
4: Movements seem fluid and help the audience visualize.
3: Made movements or gestures that enhance articulation.
2: Very little movement or descriptive gestures.
1: No movement or descriptive gestures.

POISE
4: Student displays relaxed, self-confident nature about self, with no mistakes.
3: Makes minor mistakes, but quickly recovers from them; displays little or no tension.
2: Displays mild tension; has trouble recovering from mistakes.
1: Tension and nervousness is obvious; has trouble recovering from mistakes.

COMMENTS:

VERBAL SKILLS

ENTHUSIASM
4: Demonstrates a strong, positive feeling about topic during entire presentation.
3: Occasionally shows positive feelings about topic.
2: Shows some negativity toward topic presented.
1: Shows absolutely no interest in topic presented.

ELOCUTION
4: Student uses a clear voice and correct, precise pronunciation of terms so that all audience members can hear the presentation.
3: Student's voice is clear. Student pronounces most words correctly. Most audience members can hear the presentation.
2: Student's voice is low. Student incorrectly pronounces terms. Audience members have difficulty hearing the presentation.
1: Student mumbles, incorrectly pronounces terms, and speaks too quietly for a majority of students to hear.

COMMENTS:

CONTENT

SUBJECT KNOWLEDGE
4: Student demonstrates full knowledge by answering all class questions with explanations and elaboration.
3: Student is at ease with expected answers to all questions, without elaboration.
2: Student is uncomfortable with information and is able to answer only rudimentary questions.
1: Student does not have grasp of information; student cannot answer questions about subject.

ORGANIZATION
4: Student presents information in logical, interesting sequence which audience can follow.
3: Student presents information in logical sequence which audience can follow.
2: Audience has difficulty following presentation because student jumps around.
1: Audience cannot understand presentation because there is no sequence of information.

MECHANICS
4: Presentation has no misspellings or grammatical errors.
3: Presentation has no more than two misspellings and/or grammatical errors.
2: Presentation has three misspellings and/or grammatical errors.
1: Student's presentation has four or more spelling and/or grammatical errors.

COMMENTS:
Benchmarks/Content Standards:
Specific Product/Performance: General Rubric for Painting Project

Criteria:

COMPOSITION: Student's composition displays balance of the overall Elements of Design (line, space, shape, color, value, texture).

COLOR THEORY: Student has clearly applied a color scheme (i.e., monochromatic, analogous, primary, secondary, tertiary, neutral, complementary, split complementary, warm, cool).

PARTICIPATION: Student asks/answers questions; is clearly involved with project from start to finish (on task); responsible for clean-up.

CRAFTSMANSHIP: Student displays proper application of paint according to the style the teacher instructed them to mimic. The overall painting is free from smudges, "dog ears," and tears.

Scale:

COMPOSITION
4: Student's work is highly effective when balancing the applicable Elements of Design.
3: Student's work is effective when balancing the applicable Elements of Design; 1 element may need to be worked on (i.e., variety of value not clearly defined but other elements have been balanced).
2: Student's work is moderately effective when balancing the applicable Elements of Design (2-3 elements need to be adjusted).
1: Student's work is ineffective when balancing the applicable Elements of Design (student is not aware of the elements).

COLOR THEORY
4: Student's work is highly effective in defining a clear color scheme; colors within the scheme mix with one another to create values or, in the case of flat application, the colors are used throughout.
3: Student's work is effective in displaying a color scheme; there is a dominance of some colors but the student may have neglected some areas (i.e., using black for shadows instead of overlapping color).
2: Student's work is moderately effective in displaying a color scheme. The student can name the scheme he/she has chosen but is uncertain how to mix the colors so there is a dominance.
1: Student's work is ineffective in displaying a color scheme.

PARTICIPATION
4: Student's participation is highly effective; working diligently every day.
3: Student's participation is effective; may be asked once to "get back to work."
2: Student's participation is moderately effective; is sporadic but participates more than not.
1: Student's participation is non-existent.

CRAFTSMANSHIP
4: Student's craftsmanship is highly effective.
3: Student's craftsmanship is effective (some minute areas of carelessness).
2: Student's craftsmanship is moderately effective; some areas could have been adjusted for a better appearance.
1: Student's craftsmanship is ineffective.

Adapted from materials provided by Jay McTighe.
Portfolio Table of Contents: Sample Recording Form

Student Name:                              Term/Semester/Year:

Item | Date | Entered by | Reason for Inclusion
Portfolio Assessment: Sample Rating Scale

Student:                              Term/Semester/Year:
Date:

Criteria to be Assessed/Evaluated (rate each: Excellent 5, Very Good 4, Good 3, Adequate 2, Needs Much Improvement 1):
• table of contents
• representative of achievements or progress this reporting period
• includes a variety of processes (e.g., reading, writing, listening, speaking, viewing, representing)
• includes evidence of student reflection
• includes evidence of goal setting and readjustment of goals

Anecdotal Summary Notes
This student can:

This student needs:
Sample Portfolio Assessment: Selection of Portfolio Items

Student:                              Date:
Portfolio Entry for:

The following criteria have been used to determine whether ____________ (student's name) is developing the ability to make relevant selections for his/her portfolio.

Teacher Comments
____ Does the item selected meet the defined focus for the portfolio?
____ Does the item selected demonstrate the student's best effort/work?
____ Is the item dated and inserted in chronological order?
____ Other?

You do a good job of ____________ and you can improve ____________

Student Comments
I plan to improve by ____________
Student Reflection: Sample Self-assessment

Student Name: ______________________________
The attached portfolio item is (e.g., first draft, poetry, concept map): ____________

This piece of work demonstrates that I can:
____ take risks
____ support ideas with evidence or reasons
____ persevere
____ organize related ideas
____ collaborate
____ write using a variety of sentence structures
____ use a writing process
____ use effective spelling strategies
____ participate in discussion
____ self-edit
____ other: __________________________________________________

Please notice:

Now I am planning to:

Student Signature: ____________________________________________  Date: ____________
Portfolio Item: Sample Collaborative Form

Student Name:                              Date:
Project:

Student Comments
Two reasons that I chose this item are

I want you to notice

Next time I might

Other comments

Teacher Comments
Two positive things that I noticed are

One specific area to work on is

Other comments
Sample Product Assessment: Descriptive Paragraph

Criteria (mark Yes or No for each, with Comments):

The paragraph:
• is about one topic.
• has an interesting, informative topic sentence.
• includes adequate detail in the body, and each sentence is about the topic.
• includes vivid adjectives and strong verbs.
• uses linking words that show clear relationships between the sentences.
• has an interesting concluding sentence that relates to the main idea in the topic sentence.
• has complete sentences.
• is punctuated correctly.
• is capitalized correctly.
• has all words spelled correctly.
Sample Anecdotal Record Form for Small Group Learning

Identify two or three criteria with which to assess each small group. Observe and comment about the extent to which the group or individual members achieve the criteria. Also note support needed or provided by the teacher or peers.

Sample Criteria:
1. moves into groups quickly and quietly
2. encourages all members to participate in discussion
3. asks questions to clarify meaning

Group Members | Date | Criteria # | Comments
Nene Poster Contest Rubric • Posters must be original artwork created by the student • Entries must be hand drawn, and may be colored with felt-tip marker, bright paint, or bold crayon colors (no clip art or other pre-drawn items) • All posters must be on paper at least 8-1/2 X 11 inches and no larger than 11 X 17 inches
CATEGORY: Attractiveness
4: The poster is exceptionally attractive in terms of design, layout, and neatness.
3: The poster is attractive in terms of design, layout, and neatness.
2: The poster is acceptably attractive though it may be a bit messy.
1: The poster is distractingly messy or very poorly designed. It is not attractive.

CATEGORY: Drawing-Originality
4: Drawings show an exceptional degree of student creativity in their creation and/or display.
3: Drawings on the poster reflect student creativity in their creation and/or display.
2: The drawings are made by the student, but are based on the designs or ideas of others.
1: Drawings show no creativity.

CATEGORY: Interpretation of Book
4: Creative interpretation of a book scene using a unique point of view. The emotion of the scene can be felt by the viewer.
3: Scene from the book is clearly shown in the poster.
2: Characters in the book are shown on the poster, but it is unclear which scene is depicted.
1: Scene from book is not shown in the poster.
Rubric for PowerPoint Presentation - Time Travels

CATEGORY: Background
4: Background does not detract from text or other graphics. Choice of background is appropriate for this project.
3: Background does not detract from text or other graphics. Choice of background could have been better suited for the project.
2: Background does not detract from text or other graphics. Choice of background does not fit project.
1: Background makes it difficult to see text or competes with other graphics on the page.

CATEGORY: Text - Font Choice & Formatting
4: Font formats (e.g., color, bold, italic) have been carefully planned to enhance readability and content.
3: Font formats have been carefully planned to enhance readability.
2: Font formatting has been carefully planned to complement the content. It may be a little hard to read.
1: Font formatting makes it very difficult to read the material.
CATEGORY: Content Accuracy
4: All content throughout the presentation is accurate. There are no factual errors.
3: Most of the content is accurate but there is one piece of information that might be inaccurate.
2: The content is generally accurate, but one piece of information is clearly flawed or inaccurate.
1: Content is typically confusing or contains more than one factual error. It is difficult to understand the time period that was chosen.

CATEGORY: Spelling and Grammar
4: Presentation has no misspellings or grammatical errors.
3: Presentation has 1-2 misspellings, but no grammatical errors.
2: Presentation has 1-2 grammatical errors but no misspellings.
1: Presentation has more than 2 grammatical and/or spelling errors.

CATEGORY: Use of Graphics
4: All graphics are attractive (size and colors) and support the theme/content of the presentation.
3: A few graphics are not attractive but all support the theme/content of the presentation.
2: All graphics are attractive but a few do not seem to support the theme/content of the presentation.
1: Several graphics are unattractive AND detract from the content of the presentation.
CATEGORY: Effectiveness
4: Project includes all material needed to gain a comfortable understanding of the time period chosen.
3: Project includes most material needed to gain a comfortable understanding of the time period chosen.
2: Project is lacking several key elements and has inaccuracies.
1: Project is missing more than two key elements.

CATEGORY: Presentation
4: Student presented the material with confidence.
3: Student presented material but could have been more confident.
2: Student had many difficulties presenting materials.
1: Student was unable to complete presentation before the class.

Source: http://www.scholastic.com
RUBRIC 41: WRITING TO EXPRESS
DIRECTIONS: This form is designed to help you evaluate expressive writing assignments. Read the statements below. Then indicate the number from the following scale that reflects your assessment of the student's work. To assess general writing skills, see the more generic Rubric 37: Writing Assignments.
1 = Weak   2 = Moderately Weak   3 = Average   4 = Moderately Strong   5 = Strong
1. The student introduces the topic or experience in a way that draws in the audience. 1 2 3 4 5
2. The student clearly states the topic or experience. 1 2 3 4 5
3. The student focuses on his/her personal thoughts and feelings about the topic or experience. 1 2 3 4 5
4. The student uses memorable sensory description in relaying specific details. 1 2 3 4 5
5. The student presents events in chronological order or in another order that the audience can follow. 1 2 3 4 5
6. If appropriate, the student compares his/her reactions to the topic or event to his/her reactions to another topic or event. 1 2 3 4 5
7. The student expresses the personal meaning or value of the topic or event. 1 2 3 4 5
8. The student concludes in a way that reiterates his/her attitude toward the topic or experience. 1 2 3 4 5
9. The spelling, punctuation, and grammar on the writing assignment are accurate. 1 2 3 4 5
10. The writing assignment is neatly typed or handwritten. 1 2 3 4 5
Additional Comments: ___________________________________________________________ ______________________________________________________________________________ ______________________________________________________________________________ ______________________________________________________________________________ ______________________________________________________________________________ ______________________________________________________________________________
Total Points/Grade: ___________
Copyright © by Holt, Rinehart and Winston. All rights reserved.
Defining a Rubric
A rubric is a tool that has the potential for helping a teacher formatively assess a student performance during the teaching/learning process by clearly establishing the standards and quality expectations. It assists in customizing the student feedback: what a student has done well, what weaknesses exist, and how or what might be done to correct or improve the performance. It gives students a fair and honest opportunity for self-assessment of their work and allows them to set, monitor, and achieve their personal learning goals. It assists parents in understanding the tasks and the standards by which their child's growth and progress will be measured. Rubrics also provide the teacher and district leaders with the option to later summatively evaluate their students' performances with a higher degree of consistency. Information obtained from the summative use of rubrics can be utilized to report student progress toward the agreed-upon learning goals or outcomes.
Making Sense of the Definition
Two concepts are imperative if one is to make sense of the above definition. One must look closely at two words that are often used but seldom clearly or universally understood.
Assessment: From the French word assire, meaning to sit beside and guide, it is the process of identifying what's right, what's wrong, and how to fix it. Its purpose is depicting and "coaching" the growth of an individual: where they were when they began, and where they are able to develop or advance.

Evaluation: From the French word evaluer, to value, it is the process of sorting, selecting, and labeling, such as grading, ranking, etc. Its purpose is depicting and reporting progress of the individual against external standards, norms, and/or the performance of age mates.
How to Design a Rubric

When creating a rubric, the answer to the question, "How do I design one?" may be found by using a decision-making grid entitled "Rubric Design Principles: Guide." This guide allows educators to match the purpose/users of the rubric to five (5) design principles: Word Choice, Visual Appeal, Student's Role, "Fix" Correctives, and "Why" Statements. The principles appear vertically on the left side of the grid, with explanations of each included; the purpose/users are listed across the top. Designing a rubric begins by selecting the purpose and users of the rubric from the top of the grid. Once determined, the design principles may be addressed by paying attention to the indicators and suggestions listed vertically in the column beneath the selected heading. For example, if a teacher wants to create a rubric that will be shared with students for a performance assessment task (PAT), conscious attention should be paid to the suggestions for the five design principles in the column under "Great Potential for Learner and Teacher." But if the rubric for the PAT is being designed to ensure inter-rater reliability between two teachers who will be scoring student work, and it is not intended to be shared with the learners, then adhering to the suggestions listed under "Great Potential for Teachers To Be Consistent" may be adequate.
Two Teacher Tips for Designing More User-Friendly Rubrics
The "Rubric Design Principles Guide" models two tips invented to make using rubrics effective and efficient. One is the use of "skinny" columns. They are thin columns drawn between the vertical columns. When used in a rubric, the skinny columns allow a teacher to honor a student’s improvement from an initial review to subsequent reviews when the improvement was not adequate to advance the student to the next level of competency.
A teacher could place pluses (+) in the thin columns, hopefully maintaining the student’s motivation toward continued improvement, rather than creating a picture that no improvement had occurred.
Two Teacher Tips for Designing More User-Friendly Rubrics (continued)
The second tip modeled on the grid that could be incorporated on a rubric, is the wide column entitled "Legend." It was invented to encourage students ranking at the highest proficiency to continue to look for ways to make their work outstanding.
It is often called the "Legend in Your Own Time" column. It has no indicators listed. It is simply an open invitation for students to extend themselves beyond the stated expectations/requirements.
Both the skinny columns and the "Legend in Your Own Time" tips were developed to take rubrics beyond merely a grading tool and make them a coaching tool. The two tips were invented to encourage the learner to adopt a continuous-improvement mindset rather than merely asking, "What can I do to get a grade?" Rubrics are like training wheels for a learner: they should be used to help a learner become self-reliant, self-directed, and self-assessing. If used effectively, rubrics develop a strong sense of student ownership in achievement.
Rubrics in the Classroom: When to Use
When Do I Use a Rubric?
Rubrics are expensive in terms of the time and energy they require to design and implement, so the decision to use a rubric must be weighed carefully. Rubrics are best suited for situations where a wide range of variation exists between what's considered very proficient and what's considered not yet proficient. Teachers have found rubrics to be very useful in providing guidance and feedback to students where skills and processes are the targets being monitored. Examples of skills or processes that adapt well to being rubriced include: the writing process, the application of the method of scientific inquiry, thinking skills (e.g., constructing support, comparing, problem solving), and life-long learner skills (e.g., collaborative worker, quality producer).
Rubrics in the Classroom: When to Use (continued)
Methods other than rubrics are more conducive to monitoring quantities or amounts of factual information known by a learner. These methods may include tests, quizzes, checklists, etc.
Helpful Hint: Don't rubric everything. Some teachers reserve rubrics for processes and skills in which students are having difficulty demonstrating a high degree of proficiency. Others use rubrics to scaffold new performance tasks or introduce new skills and processes. However, or whenever, the decision is made to use a rubric, best results usually occur when students are involved in the work of designing the rubric, as well as in the feedback loop and in the reporting out to stakeholders (e.g., parents, school board members, community).
UNIVERSIDAD MARIANO GALVEZ DE GUATEMALA ESCUELA DE IDIOMAS PROFESORADO EN INGLES LICDA. EVELYN R. QUIROA 2009
RUBRICS
From an Assessment Workshop presented at Honolulu Community College on August 31, 2004, by Dr. Mary Allen, The California State University System
In general, a rubric is a scoring guide used in subjective assessments. A rubric implies that a rule defining the criteria of an assessment system is followed in evaluation. A rubric can be an explicit description of performance characteristics corresponding to a point on a rating scale. A scoring rubric makes explicit the expected qualities of performance on a rating scale, or the definition of a single scoring point on a scale. Rubrics are explicit schemes for classifying products or behaviors into categories that vary along a continuum. They can be used to classify virtually any product or behavior, such as essays, research reports, portfolios, works of art, recitals, oral presentations, performances, and group activities. Judgments can be self-assessments by students, or judgments can be made by others, such as faculty, other students, or field-work supervisors. Rubrics can be used to provide formative feedback to students, to grade students, and/or to assess programs. Rubrics have many strengths:
• Complex products or behaviors can be examined efficiently.
• Developing a rubric helps to precisely define faculty expectations.
• Well-trained reviewers apply the same criteria and standards, so rubrics are useful for assessments involving multiple reviewers.
• Summaries of results can reveal patterns of student strengths and areas of concern.
• Rubrics are criterion-referenced, rather than norm-referenced. Raters ask, "Did the student meet the criteria for level 5 of the rubric?" rather than "How well did this student do compared to other students?" This is more compatible with cooperative and collaborative learning environments than competitive grading schemes, and is essential when using rubrics for program assessment because you want to learn how well students have met your standards.
• Ratings can be done by students to assess their own work, or they can be done by others, such as peers, field-work supervisors, or faculty.
Developing a Rubric
It is often easier to adapt a rubric that someone else has created, but if you are starting from scratch, here are some steps that might make the task easier:
• Identify what you are assessing (e.g., critical thinking).
• Identify the characteristics of what you are assessing (e.g., appropriate use of evidence, recognition of logical fallacies).
• Describe the best work you could expect using these characteristics. This describes the top category.
• Describe the worst acceptable product using these characteristics. This describes the lowest acceptable category.
• Describe an unacceptable product. This describes the lowest category.
• Develop descriptions of intermediate-level products and assign them to intermediate categories. You might develop a scale that runs from 1 to 5 (unacceptable, marginal, acceptable, good, outstanding), 1 to 3 (novice, competent, exemplary), or any other set that is meaningful.
• Ask colleagues who were not involved in the rubric's development to apply it to some products or behaviors and revise as needed to eliminate ambiguities.
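To make the resulting structure concrete, here is a minimal Python sketch, offered as an illustration only; the trait name and descriptors below are invented, not taken from the workshop materials. It stores a five-point rubric as a score-to-descriptor mapping, matching the steps above (top category, lowest acceptable category, unacceptable category, and the intermediate levels).

# A five-point rubric for one invented characteristic, "use of evidence".
use_of_evidence = {
    5: "Outstanding: every claim is supported by well-chosen, relevant evidence.",
    4: "Good: most claims are supported by relevant evidence.",
    3: "Acceptable: some claims are supported; evidence is uneven.",
    2: "Marginal: evidence is scarce or only loosely tied to the claims.",
    1: "Unacceptable: claims are asserted with no supporting evidence.",
}

def descriptor(score: int) -> str:
    """Look up the performance description for a given rating."""
    return use_of_evidence[score]

print(descriptor(4))  # Good: most claims are supported by relevant evidence.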
Suggestions for Using Scoring Rubrics for Grading and Program Assessment
1. Hand out the grading rubric with an assignment so students will know your expectations and how they'll be graded. This should help students master your learning objectives by guiding their work in appropriate directions.
2. Use a rubric for grading student work, including essay questions on exams, and return the rubric with the grading on it. Faculty save time writing extensive comments; they just circle or highlight relevant segments of the rubric. Each row in the rubric could have a different array of possible points, reflecting its relative importance for determining the overall grade. Points (or point ranges) possible for each cell in the rubric could be printed on the rubric, and a column for points for each row and comments section(s) could be added.
3. Develop a rubric with your students for an assignment or group project. Students can then monitor themselves and their peers using agreed-upon criteria that they helped develop. (Many faculty find that students will create higher standards for themselves than faculty would impose on them.)
4. Have students apply your rubric to some sample products (e.g., lab reports) before they create their own. Faculty report that students are quite accurate when doing this, and this process should help them evaluate their own products as they develop them.
5. Have students exchange paper drafts and give peer feedback using the rubric, then give students a few days before the final drafts are turned in to you. (You might also require that they turn in the draft and scored rubric with their final paper.)
6. Have students self-assess their products using the grading rubric and hand in the self-assessment with the product; then faculty and students can compare
self- and faculty-generated evaluations.
7. Use the rubric for program assessment. Faculty can use it in classes and aggregate the data across sections; faculty can independently assess student products (e.g., portfolios) and then aggregate the data; or faculty can participate in group readings in which they review student products together and discuss what they found. Field-work supervisors or community professionals also may be invited to assess student work using rubrics. A well-designed rubric should allow evaluators to efficiently focus on specific learning objectives while reviewing complex student products, such as theses, without getting bogged down in the details. Rubrics should be pilot tested, and evaluators should be "normed" or "calibrated" before they apply the rubrics (i.e., they should agree on appropriate classifications for a set of student products that vary in quality). If two evaluators apply the rubric to each product, inter-rater reliability can be examined (see the sketch below). Once the data are collected, faculty discuss results to identify program strengths and areas of concern, "closing the loop" by using the assessment data to make changes to improve student learning.
8. Faculty can get "double duty" out of their grading by using a common rubric that is used for grading and program assessment. Individual faculty may elect to use the common rubric in different ways, combining it with other grading components as they see fit.
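Point 7 notes that inter-rater reliability can be examined when two evaluators apply the rubric to each product. The workshop materials do not prescribe a method; as one common illustration, percent agreement and Cohen's kappa can be computed from the two raters' scores. The data below is invented, and this is a minimal sketch rather than a full reliability analysis.

from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Proportion of products on which the two raters gave the same score."""
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return matches / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Invented rubric scores from two raters for eight student portfolios.
rater_1 = [5, 4, 4, 3, 2, 5, 3, 4]
rater_2 = [5, 4, 3, 3, 2, 4, 3, 4]
print(percent_agreement(rater_1, rater_2))        # 0.75
print(round(cohens_kappa(rater_1, rater_2), 2))   # 0.65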
GRADING: Creating and Using Rubrics

Grading is not simply a matter of assigning number or letter grades. It is a process that may involve some or all of these activities:
• Creating effective assignments
• Establishing standards and criteria
• Setting curves
• Making decisions about effort and improvement
• Deciding which comments would be the most useful in guiding each student's learning
• Designing assignments and exams that promote the course objectives
• Assessing student learning and teaching effectiveness
Effective grading requires an understanding of how grading may function as a tool for learning, an acceptance that some grades will be based on subjective criteria, and a willingness to listen to and communicate with students. It is important to help students focus on the learning process rather than on "getting the grade," while at the same time acknowledging the importance that grades hold for students. And, since GSIs are students themselves, it is important to balance the requirements of effective grading with other workload and professional commitments. This section contains general tips on how to make your grading both more effective and more efficient. You will also find specific suggestions here about designing assignments, setting standards and policies, using grading rubrics, and writing comments on student work. You might also find relevant information in other sections of this online guide, for example, Working with Student Writing, Academic Misconduct, and Improving Your Teaching.

Before You Grade: Designing Assignments
As a GSI, you may or may not have input into the course assignments you will grade. Some faculty prefer to design the course assignments themselves; others ask for substantial input from GSIs. Course assignments can be very particular: they depend on the content and objectives of the course, the teaching methods and style of the instructor, the level and background of the students, and the given discipline. However, there are questions to take into account if you are designing assignments:
• What do you want the students to learn? What are the goals and objectives of the course? How does the assignment contribute to those goals and objectives?
• What skills do you want students to employ: to solve, to argue, to create, to analyze, to explain, to demonstrate, to apply, etc.?
• How well focused is the assignment? Are the instructions clear and concise? Does the assignment give the students a clearly defined, unambiguous task?
• How long is the assignment going to be? Do you want students to engage in research that goes beyond the course content, or do you want them to stick to the course materials? What should the assignment format be?
• When will the assignment be due, and how much time will you need to grade it? When will the assignment be returned to students? Will you allow students to rewrite the assignment if necessary?
• Can this assignment be realistically completed given the knowledge, ability, and time constraints of the students? Is it clearly related to the course content? Are the research materials needed to complete the assignment available in sufficient quantity?
• Is it possible for you to grade this assignment effectively, given your workload and other commitments?
• How is this assignment going to contribute to the student's final course grade?

The Grading Process: Steps in the Process
The process of assigning grades can be broken down into stages:
1. Establish a set of criteria (or understand the criteria given to you by your department or the professor of the course) by thinking about what students must do to complete the assignment successfully, and weight each criterion accordingly with respect to the final grade (a short sketch of this arithmetic follows the list). If there is more than one GSI for the course, try to establish the criteria jointly to ensure consistency. To make criteria easier to keep in mind when grading, divide them into areas such as clarity of expression, understanding of material, and quality of argumentation (or whatever areas are relevant to your field and the assignment).
2. Read through all the papers quickly. Try to pick out model papers, for example a model 'A' paper, a model 'B' paper, and so forth. This will give you a good overall sense of the quality of the assignments.
3. Read through the papers more carefully, writing comments and assigning preliminary grades. Write the grades in pencil in case you want to change them later.
4. Sort the papers by grade range and compare them to make sure that you have assigned grades consistently; that is, that all of the B papers are of the same quality, that all of the A papers are of the same quality, etc.
5. Write the final grades in pen, or record them in whatever way you are instructed.
6. If there is more than one GSI or grader for your course, you might want to exchange a couple of papers from each grade range to ensure that you are assigning grades in the same way.
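The weighting in stage 1 is simple arithmetic. Here is a minimal sketch, with invented criteria, weights, and 1-5 scores; the guide itself does not specify any particular scheme.

# Weighted combination of per-criterion scores into a percentage grade.
# Criteria and weights are invented examples; the weights sum to 1.0.
weights = {
    "clarity of expression": 0.30,
    "understanding of material": 0.40,
    "quality of argumentation": 0.30,
}

def final_percentage(scores, weights, max_score=5):
    """Scale each 1-5 criterion score by its weight and express as a percent."""
    return 100 * sum(weights[c] * scores[c] / max_score for c in weights)

paper = {"clarity of expression": 4,
         "understanding of material": 5,
         "quality of argumentation": 3}
print(round(final_percentage(paper, weights), 1))  # 82.0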
Communicating with Students: Writing Comments on Student Work
Your written comments on students' work should be used to help them understand the strengths and weaknesses of their work, and to make clear how their work has or has not achieved the goals and standards set in the class. Here are some suggestions on how to make your comments meaningful to students. For more detailed advice about writing comments on papers, see Comments on Student Writing.
• Think about the sorts of comments that you find helpful and unhelpful. For example, avoid one-word comments such as "good" or "unclear." If you think that something is good or unclear, explain in concrete terms why you think so.
• Think about the extent to which you want to comment on each aspect of the assignment. For example, how important are punctuation and spelling? Is it enough to have one or two comments on grammar or syntax, or would more extensive comments be appropriate?
• Don't overwhelm the student with a lot of different comments. Approximately one or two comments per page is enough. Focus on a couple of major points rather than comment on everything.
• Write specific comments in the margin and more general comments at the end of the assignment. General comments give the students an overall sense of what went right or wrong and how they might improve their work in the future. Specific comments identify particular parts of the assignment that are right or wrong and explain why.
• What has been omitted from the paper or exam response is as important as what has been included. Ask questions to point out something that's missing or to suggest improvements. Try to give the students a good overall sense of how they might improve their work.
• Don't comment exclusively on weaknesses. Identify strengths and explain them. This helps students know their progress, and helps them build their skills. Write as many comments on good work as on bad work. In addition to commenting on things the student does well, think about how the student might work to improve his or her work even further.
• Write legibly or type your comments.
• Don't be sarcastic or make jokes. What seems funny to you may be hurtful to students and not provide the guidance they need for improvement.
• Discuss difficult cases with other GSIs or the instructor in charge.
• Keep a record of common problems and interesting ideas, and discuss them in class.
• Make sure you have adequately explained the reason for the grade.
Questions to Ask Yourself When Writing Comments
• What were the strengths in this piece of work? What were the weaknesses? What stands out as memorable or interesting?
• Does the work have a clear thesis or main point, either explicit or implicit? Is it clear what point the author is trying to make and why?
• Are the main points and ideas clear? Are they specific enough? Are they clearly related to the assignment? Does the author provide sufficient evidence or argumentative support?
• Is the writing clear, concise, coherent, and easy and interesting to read? Are the grammar and syntax acceptable? Is the writing style appropriate? Does the author understand all of the words and phrases that he or she is using?
• Does the work have a clear, logical structure? Are the transitions clear? Is there one main point per paragraph?
• Are the factual claims correct? Does the author provide the appropriate citations and bibliographical references?

Steps to Creating a Rubric
Creating a rubric takes time and requires thought and experimentation. Here, you can see the steps we used to create a rubric for an essay assignment in a large-enrollment, intro-level sociology course. See also Tips on Using Rubrics Effectively and a Rubric Worksheet you can use to make rubrics of your own.
1. Define the traits or learning outcomes you want to assess (usually in nouns or noun phrases).
2. Choose what kind of scale you want to use: analytic or holistic? Five-point scale, three-point scale, letter grades, or a scale of your own devising?
3. Draw a table in which you describe the characteristics of student work at each point on your scale.
4. Test it out!
Example: Sociology 3AC Essay Assignment
Write a 7-8 page essay in which you make an argument about the relationship between social factors and educational opportunity. To complete the assignment, you will use electronic databases to gather data on three different high schools (including your own). You will use this data to locate each school within the larger social structure and to support your argument about the relationship between social status and public school quality. In your paper you should also reflect on how your own personal educational opportunities have been influenced by the social factors you identify. Course readings and materials should be used as background, to define sociological concepts and to place your argument within a broader discussion of the relationship between social status and individual opportunity. Your paper should be clearly organized, proofread for grammar and spelling, and all scholarly ideas must be cited using the ASA style manual.

Using the four-step process for this assignment:
1. Define the traits or learning outcomes you want to assess (usually in nouns or noun phrases):
• Argument
• Use and interpretation of data
• Reflection on personal experiences
• Application of course readings and materials
• Organization, writing, and mechanics
2. Choose the kind of scale you want to use: analytic or holistic; five-point scale, three-point scale, letter grades, or a scale of your own devising.
For this assignment, we decided to grade each trait individually because there seemed to be too many independent variables to grade holistically. We decided to use a five-point scale for each trait, but we could have used a three-point scale, or a descriptive scale, as follows. This choice, again, depends on the complexity of the assignment and the kind of information you want to convey to students.
3. Draw a table in which you describe the characteristics of student work at each point on your scale.

RUBRIC SCALES

Five-Point Scale
Grade/Point: Characteristics
5: Argument pertains to relationship between social factors and educational opportunity and is clearly stated and defensible.
4: Argument pertains to relationship between social factors and educational opportunity and is defensible, but it is not clearly stated.
3: Argument pertains to relationship between social factors and educational opportunity but is not defensible using the evidence available.
2: Argument is presented, but it does not pertain to relationship between social factors and educational opportunity.
1: Social factors and educational opportunity are discussed, but no argument is presented.
Three-Point Scale
Grade/Point: Characteristics
3: Argument pertains to relationship between social factors and educational opportunity and is clearly stated and defensible.
2: Argument pertains to relationship between social factors and educational opportunity but may not be clear or sufficiently narrow in scope.
1: Social factors and educational opportunity are discussed, but no argument is presented.
Simplified Three-Point Scale
Ideal Outcome: Argument pertains to relationship between social factors and educational opportunity and is clearly stated and defensible.
Scale: 3 | 2 | 1

Simplified Three-Point Scale, numbers replaced with descriptive terms
Ideal Outcome: Argument pertains to relationship between social factors and educational opportunity and is clearly stated and defensible.
Scale: Proficient | Fair | Inadequate
Final Analytic Rubric

Argument
5: Argument pertains to relationship between social factors and educational opportunity and is clearly stated and defensible.
4: Argument pertains to relationship between social factors and educational opportunity and is defensible, but it is not clearly stated.
3: Argument pertains to relationship between social factors and educational opportunity but is not defensible using the evidence available.
2: Argument is presented, but it does not pertain to relationship between social factors and educational opportunity.
1: Social factors and educational opportunity are discussed, but no argument is presented.
Interpretation and Use of Data
5: The data is accurately interpreted to identify each school's position within a larger social structure, and sufficient data is used to defend the main argument.
4: The data is accurately interpreted to identify each school's position within a larger social structure, and data is used to defend the main argument, but it might not be sufficient.
3: Data is used to defend the main argument, but it is not accurately interpreted to identify each school's position within a larger social structure, and it might not be sufficient.
2: Data is used to defend the main argument, but it is insufficient, and no effort is made to identify the school's position within a larger social structure.
1: Data is provided, but it is not used to defend the main argument.
Reflection on Personal Experiences
5: Personal educational experiences are examined thoughtfully and critically to identify significance of external social factors and support the main argument.
4: Personal educational experiences are examined thoughtfully and critically to identify significance of external social factors, but relation to the main argument may not be clear.
3: Personal educational experiences are examined, but not in a way that reflects understanding of the external factors shaping individual opportunity. Relation to the main argument also may not be clear.
2: Personal educational experiences are discussed, but not in a way that reflects understanding of the external factors shaping individual opportunity. No effort is made to relate experiences back to the main argument.
1: Personal educational experiences are mentioned, but in a perfunctory way.
Application of Course Readings and Materials
5: Demonstrates solid understanding of the major themes of the course, using course readings to accurately define sociological concepts and to place the argument within a broader discussion of the relationship between social status and individual opportunity.
4: Uses course readings to define sociological concepts and place the argument within a broader framework, but does not always demonstrate solid understanding of the major themes.
3: Uses course readings to place the argument within a broader framework, but sociological concepts are poorly defined or not defined at all.
2: Course readings are used, but paper does not place the argument within a broader framework or define sociological concepts.
1: Course readings are only mentioned, with no clear understanding of the relationship between the paper and course themes.
Organization, Writing, and Mechanics
5: Clear organization and natural "flow" (with an introduction, transition sentences to connect major ideas, and conclusion) with few or no grammar or spelling errors. Scholarly ideas are cited correctly using the ASA style guide.
4: Clear organization (introduction, transition sentences to connect major ideas, and conclusion), but writing might not always be fluid, and might contain some grammar or spelling errors. Scholarly ideas are cited correctly using the ASA style guide.
3: Organization unclear or the paper is marred by significant grammar or spelling errors (but not both). Scholarly ideas are cited correctly using the ASA style guide.
2: Organization unclear and the paper is marred by significant grammar and spelling errors. Scholarly ideas are cited correctly using the ASA style guide.
1: Effort to cite is made, but the scholarly ideas are not cited correctly. (Automatic "F" if ideas are not cited at all.)
Holistic Rubric
For some assignments, you may choose to use a holistic rubric, or one scale for the whole assignment. This type of rubric is particularly useful when the variables you want to assess just cannot be usefully separated. We chose not to use a holistic rubric for this assignment because we wanted to be able to grade each trait separately, but we've completed a holistic version here for comparative purposes.

Grade/Point: Characteristics
A: The paper is driven by a clearly stated, defensible argument about the relationship between social factors and educational opportunity. Sufficient data is used to defend the argument, and the data is accurately interpreted to identify each school's position within a larger social structure. Personal educational experiences are examined thoughtfully and critically to identify significance of external social factors and support the main argument. Paper reflects solid understanding of the major themes of the course, using course readings to accurately define sociological concepts and to place the argument within a broader discussion of the relationship between social status and individual opportunity. Paper is clearly organized (with an introduction, transition sentences to connect major ideas, and conclusion) and has few or no grammar or spelling errors. Scholarly ideas are cited correctly using the ASA style guide.
B: The paper is driven by a defensible argument about the relationship between social factors and public school quality, but it may not be stated as clearly and consistently throughout the essay as in an "A" paper. The argument is defended using sufficient data, reflection on personal experiences, and course readings, but the use of this evidence does not always demonstrate a clear understanding of how to locate the school or community within a larger class structure, how social factors influence personal experience, or the broader significance of course concepts. Essay is clearly organized, but might benefit from more careful attention to transitional sentences. Scholarly ideas are cited accurately, using the ASA style sheet, and the writing is polished, with few grammar or spelling errors.
C: The paper contains an argument about the relationship between social factors and public school quality, but the argument may not be defensible using the evidence available. Data, course readings, and personal experiences are used to defend the argument, but in a perfunctory way, without demonstrating an understanding of how social factors are identified or how they shape personal experience. Scholarly ideas are cited accurately, using the ASA style sheet. Essay may have either significant organizational or proofreading errors, but not both.
D: The paper does not have an argument, or is missing a major component of the evidence requested (data, course readings, or personal experiences). Alternatively, or in addition, the paper suffers from significant organizational and proofreading errors. Scholarly ideas are cited, but without following ASA guidelines.
F: The paper does not provide an argument and contains only one component of the evidence requested, if any. The paper suffers from significant organizational and proofreading errors. If scholarly ideas are not cited, paper receives an automatic "F."
Tips on Using Rubrics

Think through your learning objectives. Put some thought into the various traits, or learning outcomes, you want the assignment to assess. The process of creating a rubric can often help clarify the assignment itself. If the assignment has been well articulated, with clear and specific learning goals in mind, the language for your rubric can come straight from the assignment as written. Otherwise, try to unpack the assignment, identifying areas that are not articulated clearly. If the learning objectives are too vague, your rubric will be less useful (and your students will have a difficult time understanding your expectations). If, on the other hand, your stated objectives are too mechanistic or specific, your rubric will not accurately reflect your grading expectations. For help in articulating learning objectives, see Bloom's Taxonomy.

Decide what kind of scale you will use. Decide whether the traits you have identified should be assessed separately or holistically. If the assignment is complex, with many variables in play, you might need a scale for each trait ("Analytic Rubric"). If the assignment is not as complex, or the variables seem too interdependent to be separated, you might choose to create one scale for the entire assignment ("Holistic Rubric"). Do you want to use a letter-grade scale, a point scale (which can be translated into a grade at the end), or some other scale of your own devising (e.g., "Proficient," "Fair," "Inadequate," etc.)? This decision will depend, again, on how complex the assignment is, how it will be weighed in the students' final grade, and what information you want to convey to students about their grade.

Describe the characteristics of student work at each point on your scale. Once you have defined the learning outcomes being assessed and the scale you want to employ, create a table to think through the characteristics of student work at every point or grade on your scale. You might find it helpful to use the rubric worksheet that follows. Instructors are used to articulating the ideal outcome of a given assignment. It can be more challenging (but often far more helpful to the students) to articulate the differences, for example, between "C" and "B" work. If you have samples of student work from past years, look them over to identify the various levels of accomplishment. Start by describing the "ideal" outcome, then the "acceptable" outcome, then the "unacceptable" outcome, and fill in the blanks in between. If you don't have student work, try to imagine the steps students will take to complete the assignment, the difficulties they might encounter, and the lower-level achievements we might take for granted.

Test your rubric on student work. It is essential to try your rubric out and make sure it accurately reflects your grading expectations (as well as those of the instructor and other GSIs). If available, use sample work from previous semesters. Otherwise, test your rubric on a sampling of student papers and then revise the rubric before you grade the rest. Make sure, however, that you are not altering substantially the grading criteria you laid out for your students.

Use your rubric to give constructive feedback to students. Consider handing the rubric out with students' returned work. You can use the rubric to facilitate the process of justifying grades and to provide students with clear instructions about how they can do better next time. Some instructors prefer not to hand out the rubric, at least in the form that they use it in grading. An abbreviated form of the rubric can be developed for student communication both before the paper is handed in and when it's handed back after grading.

Use your rubric to clarify your assignments and to improve your teaching. The process of creating a rubric can help you create assignments tailored to clear and specific learning objectives. Next time you teach the assignment, use your rubric to fine-tune the assignment description, and consider handing out the rubric with the assignment itself. Rubrics can also provide you, as the teacher, with important feedback on how well your students are meeting the learning outcomes you've laid out for them. If most of your students are scoring a "2" on "Clarity and Strength of Argument," then you know that next time you teach the course you need to devote more classroom time to this learning goal.
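The last tip, watching for a trait on which most students score low, amounts to tallying the score distribution per trait across the class. Here is a small illustrative sketch; the traits and scores are hypothetical, not drawn from the guide.

from collections import Counter

# One dict per student: trait -> rubric score (1-5). All values invented.
class_scores = [
    {"argument": 4, "use of data": 2, "organization": 4},
    {"argument": 3, "use of data": 2, "organization": 5},
    {"argument": 5, "use of data": 2, "organization": 4},
    {"argument": 4, "use of data": 3, "organization": 3},
]

# Tally how many students earned each score on each trait.
for trait in class_scores[0]:
    distribution = Counter(student[trait] for student in class_scores)
    print(trait, dict(sorted(distribution.items())))
# Here "use of data" clusters at 2, flagging a learning goal that needs
# more classroom time next term.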
WORKSHEET
List the traits you want the assignment to measure (usually in nouns or noun phrases):
__________________________________________________________________________________________________
__________________________________________________________________________________________________
__________________________________________________________________________________________________
Use the following chart to create a rubric. Either fill out one chart for the entire assignment (holistic rubric) or fill out one chart for each trait or learning objective (analytic rubric).
Trait / Assignment being Assessed: ______________________________________________________________________
Grade / Points:
Characteristics:
http://gsi.berkeley.edu/teachingguide2009/grading/rubricsSteps.html
DOCUMENT 5 UNIVERSIDAD MARIANO GALVEZ DE GUATEMALA ESCUELA DE IDIOMAS PROFESORADO EN EL IDIOMA INGLES EVALUATION AND ASSESSMENT LICDA EVELYN R. QUIROA
Summary of How Evaluation, Assessment, Measurement and Testing Terms Are Related
Commonly used assessment and measurement terms are related, and understanding how they connect with one another can help you better integrate your testing and teaching.
Evaluation: Examining information about many components of the thing being evaluated (e.g., student work, schools, or a specific educational program) and comparing or judging its quality, worth or effectiveness in order to make decisions. Evaluation is based on...
Assessment: The process of gathering, describing, or quantifying information about performance. Assessment includes...
Measurement: The process of assigning numbers to qualities or characteristics of an object or person according to some rule or scale, and analyzing that data based on psychometric and statistical theory. A specific way to measure performance is...
Testing: A method used to measure the level of achievement or performance.
Abilities and Behaviors Related to Bloom's Taxonomy of Educational Objectives
Knowledge – Recognizes students' ability to use rote memorization and recall certain facts.
• Test questions focus on identification and recall of information
Comprehension – Involves students' ability to read course content, extrapolate and interpret important information, and put others' ideas into their own words.
• Test questions focus on use of facts, rules and principles
Application – Students take new concepts and apply them to another situation.
• Test questions focus on applying facts or principles
Analysis – Students have the ability to take new information and break it down into parts to differentiate between them.
• Test questions focus on separation of a whole into component parts
Synthesis – Students are able to take various pieces of information and form a whole, creating a pattern where one did not previously exist.
• Test questions focus on combining ideas to form a new whole
Evaluation – Involves students' ability to look at someone else's ideas or principles and see the worth of the work and the value of the conclusions.
• Test questions focus on developing opinions, judgments or decisions

Examples of Instructional Objectives for the Cognitive Domain
1. The student will recall the four major food groups without error. (Knowledge)
2. By the end of the semester, the student will summarize the main events of a story in grammatically correct English. (Comprehension)
3. Given a presidential speech, the student will be able to point out the positions that attack a political opponent personally rather than the opponent's political programs. (Analysis)
4. Given a short story, the student will write a different but plausible ending. (Synthesis)
5. Given fractions not covered in class, the student will multiply them on paper with 85 percent accuracy. (Application)
6. Given a description of a country's economic system, the student will defend it by basing arguments on principles of socialism. (Evaluation)
7. From memory, with 80 percent accuracy, the student will match each United States General with his most famous battle. (Knowledge)
8. The student will describe the interrelationships among acts in a play. (Analysis)
Test Blueprint
Once you know the learning objectives and item types you want to include in your test, you should create a test blueprint. A test blueprint, also known as test specifications, consists of a matrix, or chart, representing the number of questions you want in your test within each topic and level of objective. The blueprint identifies the objectives and skills that are to be tested and the relative weight on the test given to each, and it can help you ensure that you are obtaining the desired coverage of topics and levels of objective. Once you create your blueprint, you can begin writing your items, matching each item to the level of objective within its topic area.
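As an illustrative sketch of such a matrix (the topics, levels, and item counts below are invented), a blueprint can be tabulated and checked for total length and the relative weight of each topic:

# Rows are topics, columns are levels of objective, cells are planned item counts.
blueprint = {
    "verb tenses":    {"knowledge": 4, "comprehension": 3, "application": 3},
    "reading skills": {"knowledge": 2, "comprehension": 4, "application": 2},
    "vocabulary":     {"knowledge": 5, "comprehension": 1, "application": 1},
}

total_items = sum(sum(levels.values()) for levels in blueprint.values())
print("total items:", total_items)  # 25
for topic, levels in blueprint.items():
    weight = 100 * sum(levels.values()) / total_items
    print(f"{topic}: {sum(levels.values())} items, {weight:.0f}% of the test")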
Description of Multiple-Choice Items
Multiple-choice items can be used to measure knowledge outcomes and various types of learning outcomes. They are most widely used for measuring knowledge, comprehension, and application outcomes. The multiple-choice item provides the most useful format for measuring achievement at various levels of learning. When selection-type items are to be used (multiple-choice, true-false, matching, check all that apply), an effective procedure is to start each item as a multiple-choice item and switch to another item type only when the learning outcome and content make it desirable to do so. For example, (1) when there are only two possible alternatives, a shift can be made to a true-false item; and (2) when there are a number of similar factors to be related, a shift can be made to a matching item.
Strengths:
1. Learning outcomes from simple to complex can be measured.
2. Highly structured and clear tasks are provided.
3. A broad sample of achievement can be measured.
4. Incorrect alternatives provide diagnostic information.
5. Scores are less influenced by guessing than true-false items.
6. Scores are more reliable than subjectively scored items (e.g., essays).
7. Scoring is easy, objective, and reliable.
8. Item analysis can reveal how difficult each item was and how well it discriminated between the strong and weaker students in the class (see the item-analysis sketch after the examples below).
9. Performance can be compared from class to class and year to year.
10. Can cover a lot of material very efficiently (about one item per minute of testing time).
11. Items can be written so that students must discriminate among options that vary in degree of correctness.
12. Avoids the absolute judgments found in true-false tests.
Limitations:
1. Constructing good items is time consuming.
2. It is frequently difficult to find plausible distracters.
3. This item type is ineffective for measuring some types of problem solving and the ability to organize and express ideas.
4. Real-world problem solving differs – a different process is involved in proposing a solution versus selecting a solution from a set of alternatives.
5. Scores can be influenced by reading ability.
6. There is a lack of feedback on individual thought processes – it is difficult to determine why individual students selected incorrect responses.
7. Students can sometimes read more into the question than was intended.
8. Items often focus on testing factual information and fail to test higher levels of cognitive thinking.
9. Sometimes there is more than one defensible "correct" answer.
10. They place a high degree of dependence on the student's reading ability and the instructor's writing ability.
11. Does not provide a measure of writing ability.
12. May encourage guessing.
Helpful Hints:
• Base each item on an educational or instructional objective of the course, not trivial information.
• Try to write items in which there is one and only one correct or clearly best answer.
• The phrase that introduces the item (stem) should clearly state the problem.
• Test only a single idea in each item.
• Be sure wrong answer choices (distracters) are at least plausible.
• Incorporate common errors of students in distracters.
• The position of the correct answer should vary randomly from item to item.
• Include from three to five options for each item.
• Avoid overlapping alternatives (see Example 3 following).
• The length of the response options should be about the same within each item (preferably short).
• There should be no grammatical clues to the correct answer.
• Format the items vertically, not horizontally (i.e., list the choices vertically).
• The response options should be indented and in column form.
• Word the stem positively; avoid negative phrasing such as "not" or "except." If this cannot be avoided, the negative words should always be highlighted by underlining or capitalization: Which of the following is NOT an example ...
• Avoid excessive use of negatives and/or double negatives.
• Avoid the excessive use of "All of the above" and "None of the above" in the response alternatives. In the case of "All of the above," students only need to have partial information in order to answer the question. Students need to know that only two of the options are correct (in a four or more option question) to determine that "All of the above" is the correct answer choice. Conversely, students only need to eliminate one answer choice as implausible in order to eliminate "All of the above" as an answer choice. Similarly, with "None of the above," when used as the correct answer choice, information is gained about students' ability to detect incorrect answers. However, the item does not reveal if students know the correct answer to the question.

Example 1
The stem of the original item below fails to present the problem adequately or to set a frame of reference for responding.
Original
1. World War II was:
A. The result of the failure of the League of Nations.
B. Horrible.
C. Fought in Europe, Asia, and Africa.
D. Fought during the period of 1939-1945.
Revised
1. In which of these time periods was World War II fought?
A. 1914-1917
B. 1929-1934
C. 1939-1945
D. 1951-1955
E. 1961-1969

Example 2
There should be no grammatical clues to the correct answer.
Original
1. Albert Einstein was a:
A. Anthropologist.
B. Astronomer.
C. Chemist.
D. Mathematician.
Revised
1. Who was Albert Einstein?
A. An anthropologist.
B. An astronomer.
C. A chemist.
D. A mathematician.

Example 3
Alternatives should not overlap (e.g., in the original form of this item, if either of the first two alternatives is correct, "C" is also correct).
Original
1. During what age period is thumb-sucking likely to produce the greatest psychological trauma?
A. Infancy
B. Preschool period
C. Before adolescence
D. During adolescence
E. After adolescence
Revised
1. During what age period is thumb-sucking likely to produce the greatest psychological trauma?
A. From birth to 2 years old
B. From 2 years to 5 years old
C. From 5 years to 12 years old
D. From 12 years to 20 years old
E. 20 years of age or older

Example 4
Example of how greater similarity among alternatives increases the difficulty of the item.
Easy
1. Who was the President of the U.S. during the War of 1812?
A. Grover Cleveland
B. Abraham Lincoln
C. James Madison
D. Harry Truman
E. George Washington

More Difficult
1. Who was President of the U.S. during the War of 1812?
A. John Q. Adams
B. Andrew Jackson
C. Thomas Jefferson
D. James Madison
E. George Washington
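Strength 8 in the list above says item analysis can reveal each item's difficulty and how well it discriminated between strong and weak students. By a common convention in test analysis (not spelled out in this handout), the difficulty index p is the proportion of students answering the item correctly, and a simple discrimination index D compares the top- and bottom-scoring groups. A sketch with invented data:

def difficulty(item_results):
    """item_results: 1 (correct) or 0 (incorrect) for each student."""
    return sum(item_results) / len(item_results)

def discrimination(item_results, total_scores, fraction=0.27):
    """D = p(top group) - p(bottom group), groups ranked by total test score."""
    k = max(1, int(len(total_scores) * fraction))
    ranked = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    bottom, top = ranked[:k], ranked[-k:]
    p_top = sum(item_results[i] for i in top) / k
    p_bottom = sum(item_results[i] for i in bottom) / k
    return p_top - p_bottom

item = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]              # one item, ten students
totals = [92, 85, 40, 78, 35, 88, 70, 45, 95, 80]  # each student's test total
print(difficulty(item))              # 0.7: 70% answered correctly
print(discrimination(item, totals))  # 1.0: strong students fared far better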
Multiple-Choice Item Writing Guidelines
Multiple-choice questions typically have 3 parts: a stem, the correct answer – called the key, and several wrong answers, called distracters.
Procedural Rules:
• Use either the best answer or the correct answer format.
• Best answer format refers to a list of options that can all be correct in the sense that each has an advantage, but one of them is the best.
• Correct answer format refers to one and only one right answer.
• Format the items vertically, not horizontally (i.e., list the choices vertically).
• Allow time for editing and other types of item revisions.
• Use good grammar, punctuation, and spelling consistently.
• Minimize the time required to read each item.
• Avoid trick items.
• Use the active voice.
• The ideal question will be answered by 60-65% of the tested population.
• Have your questions peer-reviewed.
• Avoid giving unintended cues, such as making the correct answer longer in length than the distracters.
Content-related Rules:
• Base each item on an educational or instructional objective of the course, not trivial information.
• Test for important or significant information.
• Focus on a single problem or idea for each test item.
• Keep the vocabulary consistent with the examinees' level of understanding.
• Avoid cueing one item with another; keep items independent of one another.
• Use the author's examples as a basis for developing your items.
• Avoid overly specific knowledge when developing items.
• Avoid textbook, verbatim phrasing when developing the items.
• Avoid items based on opinions.
• Use multiple-choice to measure higher level thinking.
• Be sensitive to cultural and gender issues.
• Use case-based questions that use a common text to which a set of questions refers.
Stem Construction Rules:
• State the stem in either question form or completion form.
• When using a completion form, don't leave a blank for completion in the beginning or middle of the stem.
• Ensure that the directions in the stem are clear and that the wording lets the examinee know exactly what is being asked.
• Avoid window dressing (excessive verbiage) in the stem.
• Word the stem positively; avoid negative phrasing such as "not" or "except." If this cannot be avoided, the negative word should always be highlighted by underlining or capitalization: Which of the following is NOT an example …
• Include the central idea and most of the phrasing in the stem.
• Avoid giving clues by linking the stem to the answer (e.g., a stem ending "… is an example of an:" tells test-wise students that the correct answer must start with a vowel).

General Option Development Rules:
• Place options in logical or numerical order.
• Use letters in front of options rather than numbers; numerical answers in numbered items may be confusing to students.
• Keep options independent; options should not be overlapping.
• Keep all options homogeneous in content.
• Keep the length of options fairly consistent.
• Avoid, or use sparingly, the phrase "all of the above."
• Avoid, or use sparingly, the phrase "none of the above."
• Avoid the use of the phrase "I don't know."
• Phrase options positively, not negatively.
• Avoid distracters that can clue test-wise examinees; for example, absurd options, formal prompts, or semantic (overly specific or overly general) clues.
• Avoid giving clues through the use of faulty grammatical construction.
• Avoid specific determiners, such as "never" and "always."
• Position the correct option so that it appears about the same number of times in each possible position for a set of items.
• Make sure that there is one and only one correct option.

Distracter (Incorrect Option) Development Rules:
• Use plausible distracters.
• Incorporate common errors of students in distracters.
• Avoid technically phrased distracters.
• Use familiar yet incorrect phrases as distracters.
• Use true statements that do not correctly answer the item.
• Avoid the use of humor when developing options.
• Replace distracters that are not chosen by any examinees.

Suggestions for Writing Good Multiple-Choice Items:
• Present practical or real-world situations to the students.
• Present the student with a diagram of equipment and ask for application, analysis, or evaluation.
• Present actual quotations taken from newspapers or other published sources and ask for the interpretation or evaluation of these quotations.
• Use pictorial materials that require students to apply principles and concepts.
• Use charts, tables, or figures that require interpretation.

Guidelines to Writing Test Items
• Begin writing items well ahead of the time when they will be used; allow time for revision.
• Match items to intended outcomes at the proper difficulty level to provide a valid measure of the instructional objectives.
• Be sure each item deals with an important aspect of the content area and not with trivia.
• Be sure the problem posed is clear and unambiguous.
• Be sure each item is independent of all other items (i.e., a hint to an answer should not be unintentionally embedded in another item).
• Be sure the item has one correct or best answer on which experts would agree.
• Prevent unintended clues to the answer in the statement or question (e.g., grammatical inconsistencies such as "a" or "an" give clues).
• Avoid duplicating the textbook in writing test items; don't lift quotes directly from any textual materials.
• Avoid trick or catch questions in an achievement test (don't waste time testing how well the student can interpret your intentions).
• On a test with different question formats (e.g., multiple choice and true-false), group all items of similar format together.
• Questions should follow an easy-to-difficult progression.
• Space the items to eliminate overcrowding.
• Place diagrams and tables above the items that use the information, not below.
Revised Bloom's Taxonomy -- Verbs, Materials/Situations That Require This Level of Thinking, and Potential Activities and Products

REMEMBERING
Verbs: Tell, List, Describe, Relate, Locate, Write, Find, State, Name, Identify, Label, Recall, Define, Recognise, Match, Reproduce, Memorise, Draw, Select, Recite
Materials/Situations: Events, people, newspapers, magazine articles, definitions, videos, dramas, textbooks, films, television programs, recordings, media presentations
Potential Activities and Products: Make a list of the main events. Make a timeline of events. Make a facts chart. Write a list of any pieces of information you can remember. List all the … in the story. Make a chart showing … Make an acrostic. Recite a poem.

UNDERSTANDING
Verbs: Explain, Interpret, Outline, Discuss, Distinguish, Predict, Restate, Translate, Compare, Describe, Relate, Generalise, Summarise, Put into your own words, Paraphrase, Convert, Demonstrate, Visualise, Find out more information about
Materials/Situations: Speech, stories, drama, cartoons, diagrams, graphs, summaries, outlines, analogies, posters, bulletin boards
Potential Activities and Products: Cut out or draw pictures to show a particular event. Illustrate what you think the main idea was. Make a cartoon strip showing the sequence of events. Retell the story in your own words. Paint a picture of some aspect you like. Write a summary report of an event. Prepare a flow chart to illustrate the sequence of events. Make a colouring book.

APPLYING
Verbs: Solve, Show, Use, Illustrate, Construct, Complete, Examine, Classify, Choose, Interpret, Make, Put together, Change, Apply, Produce, Translate, Calculate, Manipulate, Modify, Put into practice
Materials/Situations: Diagrams, sculptures, illustrations, dramatisations, forecasts, problems, puzzles, organisations, classifications, rules, systems, routines
Potential Activities and Products: Construct a model to demonstrate how it will work. Make a diorama to illustrate an important event. Make a scrapbook about the areas of study. Make a papier-mache map to include relevant information about an event. Take a collection of photographs to demonstrate a particular point. Make up a puzzle game showing the ideas from an area of study. Make a clay model of an item in the area. Design a market strategy for your product. Dress a doll in costume. Paint a mural. Write a textbook outline.

ANALYZING
Verbs: Analyse, Distinguish, Examine, Compare, Contrast, Investigate, Categorise, Identify, Explain, Separate, Advertise, Take apart, Differentiate, Subdivide, Deduce
Materials/Situations: Surveys, questionnaires, arguments, models, displays, demonstrations, diagrams, systems, conclusions, reports, graphed information
Potential Activities and Products: Design a questionnaire to gather information. Write a commercial to sell a new product. Conduct an investigation to produce information to support a point of view. Construct a graph to illustrate selected information. Make a jigsaw puzzle. Make a family tree showing relationships. Put on a play about the study area. Write a biography of the person being studied. Prepare a report. Arrange a party and record it as a procedure. Review a piece of art, including form, colour, and texture.

EVALUATING
Verbs: Judge, Select, Choose, Decide, Justify, Debate, Verify, Argue, Recommend, Assess, Discuss, Rate, Prioritise, Determine, Critique, Evaluate, Criticise, Weigh, Value, Estimate, Defend
Materials/Situations: Recommendations, self-evaluations, group discussions, debates, court trials, standards, editorials, values
Potential Activities and Products: Prepare a list of criteria to judge a … show; remember to indicate priorities and ratings. Conduct a debate about a special issue. Make a booklet about 5 rules you see as important to convince others. Form a panel to discuss views. Write a letter to … advising on changes needed at … Write a half-yearly report. Present your point of view.

CREATING
Verbs: Create, Invent, Compose, Predict, Plan, Construct, Design, Imagine, Propose, Devise, Formulate, Combine, Hypothesize, Originate, Add to, Forecast
Materials/Situations: Experiments, games, songs, reports, poems, speculations, creations, art, inventions, drama, rules
Potential Activities and Products: Invent a machine to do a specific task. Design a building to house your study. Create a new product, give it a name, and then devise a marketing strategy. Write about your feelings in relation to … Design a record, book, or magazine cover. Sell an idea. Devise a way to … Compose a rhythm or put new words to an old song.
IV. Student Assessment

Chapter 12 -- Testing and Assessment Issues

Assessment of student achievement is an important part of the teaching and learning process. Given at the beginning of a course, assessments help you know where to begin and/or identify areas of remediation that must be addressed. Frequent assessments during the course help you and your students see the progress of learning and help identify problematic areas where students need more help or time. Given at the completion of instruction, assessments tell you how much has been learned by the end of a unit, by mid-semester, or by the end of the term. They provide the basis for making judgments on the grades to assign each student. This chapter provides an overview of assessments, with a focus on tests or examinations typically given in paper/pencil format or on the computer. It covers:

• Types of Learning Assessments
• Paper/Pencil or Computer Examinations (Tests)
• Step-by-Step Guidelines for Creating Tests
• Constructing Performance Tests
• General Tips about Testing
• Helping Students Learn from Tests
• Using Item Analysis to Test Your Test
• Cheating on Tests
• Alternative Methods of Assessment
• Resources on Testing
Types of Learning Assessments
• Examinations
  o Open-ended, such as essay and short-answer
  o Limited-choice, such as multiple choice, sentence completion, fill-in-blank, matching, true-false
  o Usually provided in pencil/paper format, sometimes involving scan response sheets or administered on a computer
• Written or Constructed Student Creations
  o Reports, papers, projects, products
  o Usually done outside of class and involving research or reviews of a variety of information sources; final products are assembled for submission
• Performances
  o Demonstrations, events, presentations
  o Students demonstrate skills and knowledge in simulated or authentic conditions; may focus on psychomotor skills but can also heavily involve cognitive skills and judgments, such as in counseling performances

Purposes of Effective Assessments
• Intended learning outcomes: Measure what students should know and/or be able to do to show they have mastered the learning outcomes.
• Knowledge and skills included in the instruction: Measure the amount and quality of student ability to use information, examples, practices, and other related activities provided during instruction.
• Enable generalization: Allow inference from the skills and knowledge tested that students have mastered the full range of skills and knowledge and the essential or key content points.

Alignment of Course Components
Assessments should align with the course objectives or specified learning outcomes and with the content and activities included in the class. The appropriate range of content should be assessed, as well as the key points covered during instructional activities. The assessment should allow the student to demonstrate his or her knowledge and skill in the subject area.
Paper/Pencil or Computer Examinations (Tests)
Constructing Tests
Limited-Choice vs. Open-Ended Questions
The term "limited-choice" is used here to describe test questions that require students to choose one or more of the given alternatives (multiple choice, true/false, matching), and "open-ended" is used to refer to questions that require students to formulate their own answers (sentence completion, short answer, essay).

Deciding Which Type of Test Questions to Use
Whether it is better to use open-ended or limited-choice test items depends on the circumstances and on the goals of the test. Each type of test has its own set of strengths and weaknesses. The advantages and disadvantages of the two main categories of test items are discussed below in terms of the various issues that are often considered when a test is being developed.
Table 1 -- Comparison of Limited-Choice and Open-Ended Tests

Issue                                               Limited Choice        Open-Ended
Level of learning objective (rule of thumb)         Recall, comprehension Problem solving, synthesizing
Content coverage                                    Wider sample          Greater depth
Practice and reward of writing and reading skills   No                    Yes
Reward of creativity and divergent thinking         No                    Yes
Feedback to instructor and student                  Limited but fast      Thorough but slow
Length of exam (time to complete)                   Short                 Long
Size of class                                       Larger                Smaller
Reliability in grading                              Very reliable         Requires work to become reliable
Exam construction and grading time                  Long/short            Short/long
Test reusability                                    High                  Low
Prevention of cheating                              Low                   High
Level of Learning Objective
In principle, both limited-choice and open-ended items can be used to test a wide range of learning objectives. In practice, most people find it easier to construct limited-choice items to test recall and comprehension, while open-ended items are used to test higher-level learning objectives, but other possibilities exist. Limited-choice items that require students to classify statements as fact or opinion go beyond rote learning, and focused essay questions can easily stay at the recall level.
Related Chapter -- For discussions of the different levels of learning outcomes, see Chapter 2 -- Determining Learning Objectives.

Content Coverage
Since more limited-choice than open-ended items can be used in exams of the same length, it is possible to sample more broadly over a body of subject matter with limited-choice items.

Scoring and Grading
Limited-choice exams allow faster and more consistent scoring than open-ended exams. Open-ended exams require individual review and judgment of student responses and, therefore, take longer to score and may be scored more subjectively, even by the same reviewer. Unless graders are available, it is very difficult to give long open-ended exams and provide timely feedback in a high-enrollment course. Exams that consist mainly of limited-choice items are usually more practical under these circumstances.

Test Construction
Constructing good limited-choice exams takes much longer than open-ended exams because items must contain the pertinent information students need to answer the question, a set of appropriate distractors, and the correct answer. In addition, none of the information should include clues that point to the correct answer or be written in a way that unnecessarily confuses student reading and interpretation of the item. Open-ended exams, on the other hand, may be constructed quickly and easily because they usually consist of one or two direct statements or questions asking students to respond in writing. The chances of providing unwanted clues are greatly reduced, and there is no opportunity to confuse students with distractors that may be too close to the right answer.

Length of Exam and Student Response Time
Whether using limited-choice or open-ended exams, instructors should consider how much time students might need to respond thoughtfully to the test items. One frequent complaint from students is that they knew the material but there were so many items that they could not answer every question or could not take time to provide thoughtful answers. Asking a colleague or a graduate student who has taken the course to complete the test may give you some estimate of how many items can be completed during the time provided.

Reusability of Exam Items
In general, exams consisting of a large number of limited-choice items are easier to reuse than those consisting of only a few essay questions. More items are more difficult for students to remember and transmit to those who will take the exam later (provided the printed exam does not get into circulation). If a large item bank is built, so that different exams can be randomly generated from the same pool of questions, limited-choice items are highly reusable.
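As an illustration of drawing different exams from the same pool, here is a minimal sketch; the bank structure and names are hypothetical, not from the handbook:

```python
import random

# Hypothetical item bank: each entry is (topic, question_text).
item_bank = [
    ("photosynthesis", "Which gas is consumed during photosynthesis?"),
    ("photosynthesis", "Where in the cell does photosynthesis occur?"),
    ("respiration", "Which molecule stores the energy released by respiration?"),
    ("respiration", "Which gas is produced during aerobic respiration?"),
    # ... a large bank makes regenerated exams hard to memorize ...
]

def generate_exam(bank, n_items, seed):
    """Randomly sample n_items from the bank; a new seed yields a new form."""
    rng = random.Random(seed)
    return rng.sample(bank, n_items)

exam_form_a = generate_exam(item_bank, 3, seed=1)
exam_form_b = generate_exam(item_bank, 3, seed=2)
```

Recording the seed used for each offering makes a given form reproducible if a grading question comes up later.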
Prevention of Cheating

Limited-choice exams provide more opportunities for cheating than do open-ended exams, since single letters or numbers are far easier to see or hear than extensive text. Cheating on limited-choice items can be minimized in several ways, such as using alternative test forms and controlling students' seating arrangements.

Writing Limited-Choice Test Questions

In the discussion of limited-choice items below, the term "stem" is used to refer to the part of the item that asks the question. The terms "responses," "choices," "options," and "alternatives" are used to refer to the parts of the item that will be used to answer the question.
Example
Stem: Who is the author of Jane Eyre?
Responses:
A) Emily Bronte
B) Charlotte Bronte
C) Thomas Hardy
D) George Eliot

Multiple-Choice Items

Advantages -- Multiple-choice items are considered to be among the most versatile of all item types. They can be used to test students' ability to recall facts as well as their understanding and ability to apply learning. Multiple-choice items can also provide an excellent basis for post-test discussion, especially if the discussion addresses why the incorrect responses were wrong as well as why the correct responses were right.

Disadvantages -- Unfortunately, good multiple-choice items are difficult and time-consuming to construct. They may also appear too discriminating (picky) to students, especially when the alternatives are well constructed, and they are open to misinterpretation by students who read more into questions than is there.
Suggestions for Constructing Multiple-Choice Items
Concerns about the general construction of questions
• Use negatively stated items sparingly. When they are used, underline or otherwise visually emphasize the negative word, and avoid the word "not" in a multiple-choice question wherever possible.
• Be certain there is only one best or correct response to the stem.
• Keep the number of alternatives at five or fewer. Beyond five alternatives, poor alternatives are likely.
• Randomly distribute correct responses among the alternative positions so that there are no discernible patterns in the answer sequence (e.g., ABBABBABB). Try to have a nearly equal proportion of As, Bs, Cs, etc., as the correct answers (see the sketch following this list).

Concerns about the construction of the stem portion of the question
• Use the stem to present the problem or question as clearly as possible.
• Use direct questions rather than incomplete statements for the stem.
• Include as much of the item as possible in the stem so that alternatives can be kept brief. However, when applying definitions, it is recommended that you place the terms in the stem and use the definitions as options, although this makes the questions rather long.

Concerns about the construction of the responses or options of the question
• List options on separate lines rather than including them as part of the stem, so that all options can be clearly distinguished.
• Keep all alternatives in a similar format (i.e., all phrases, all sentences, etc.).
• Be certain that all options are plausible responses to the stem. Poor alternatives should not be included just for the sake of having more options.
• Check all choices for grammatical consistency with the stem.
• Try to make the alternatives for an item approximately the same length. Making the correct response consistently longer is a common error.
• Use misconceptions students have displayed in class, or errors commonly made by students in the class, as the basis for incorrect alternatives.
• Use "all of the above" and "none of the above" sparingly, since students often choose these alternatives on the basis of incomplete knowledge.
• Use capital letters (A, B, C, D, E) as response signs rather than lower-case letters ("a" gets confused with "d" and "c" with "e" if the quality of the typeface or duplication is poor).
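As a concrete illustration of the randomize-the-key-position rule above, here is a minimal sketch; the item structure and function name are hypothetical, not part of the original guidelines:

```python
import random
from collections import Counter

# A hypothetical item: a stem, a list of options, and the index of the key.
items = [
    {"stem": "Who is the author of Jane Eyre?",
     "options": ["Emily Bronte", "Charlotte Bronte", "Thomas Hardy", "George Eliot"],
     "key": 1},
    # ... more items ...
]

def shuffle_options(item, rng):
    """Shuffle an item's options while tracking where the key lands."""
    order = list(range(len(item["options"])))
    rng.shuffle(order)
    item["options"] = [item["options"][i] for i in order]
    item["key"] = order.index(item["key"])

rng = random.Random(42)  # fixed seed so the form is reproducible
for item in items:
    shuffle_options(item, rng)

# Check that correct answers are spread roughly evenly across A, B, C, ...
positions = Counter("ABCDE"[item["key"]] for item in items)
print(positions)
```

Printing the position counts at the end is a quick way to verify there is no discernible pattern before the test is duplicated.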
True/False Items

Advantages -- True/false items are relatively easy to prepare, since each item comes rather directly from the content. They offer the instructor the opportunity to write questions that cover more content than most other item types, since students can respond to many questions in the time allowed. They are easy to score accurately and quickly.

Disadvantages -- True/false items, however, may not give a true estimate of the students' knowledge, since students have a 50/50 chance of guessing the correct answer. They are very poor for diagnosing students' strengths and weaknesses and are generally considered "tricky" by students. Since true/false questions tend to be either extremely easy or extremely difficult, they do not discriminate between students of varying ability as well as other types of questions do.
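That 50/50 guessing risk can be made concrete with a quick back-of-the-envelope binomial calculation (my own illustration, not from the handbook); it also shows why the advice below to use a large number of items helps:

```python
from math import comb

def p_at_least(n, k, p=0.5):
    """Probability of k or more correct out of n items by guessing alone."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# On a 10-item T/F quiz, a pure guesser reaches 70% (7+ correct) about 17%
# of the time; on a 75-item test the same cutoff is nearly unreachable.
print(round(p_at_least(10, 7), 3))   # ~0.172
print(round(p_at_least(75, 53), 6))  # ~0.0002
```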
Suggestions for Constructing True/False Items
• Keep language as simple and clear as possible.
• Use a relatively large number of items (75 or more when the entire test is T/F).
• Avoid taking statements verbatim from the text.
• Be aware that extremely long or complicated statements will test reading skill rather than content knowledge.
• Require students to circle or underline a typed "T" or "F" rather than to fill in a "T" or "F" next to the statement. This allows scorers to avoid having to interpret confusing handwriting.
• Avoid the use of negatives, especially double negatives. Never use "not."
• Avoid ambiguous or tricky items.
• Be certain that the statements used are entirely true or entirely false. Statements that are either partially true or partially false cause unnecessary ambiguity.
• Use certain key words sparingly, since they tip students off to the correct answers. The words "all," "always," "never," "every," "none," and "only" usually indicate a false statement, whereas the words "generally," "sometimes," "usually," "maybe," and "often" are frequently used in true statements.
• Use precise terms, such as "50% of the time," rather than less precise terms, such as "several," "seldom," and "frequently."
• Use more false than true items, but not by more than about 15%; false items tend to discriminate more than true items.
• Avoid patterns in answers, such as all true, all false, or alternation.
Matching Items

Advantages -- Matching items are generally quite brief and are especially suitable for who, what, when, and where questions. They can, however, also be used to have students discriminate among concepts and apply them. They permit efficient use of space when there are a number of similar types of information to be tested. They are easy to score accurately and quickly.

Disadvantages -- Among the drawbacks of matching items are that they are difficult to use to measure learning beyond recognition of basic factual knowledge, and they are usually poor for diagnosing student strengths and weaknesses. Matching items are appropriate in only a limited number of situations, and they are difficult to construct since parallel information is required.
Example -- Notice the relative width of the columns in the "Cities of the World Quiz" below. Also notice that the directions tell the learner what to do and answer possible questions about the format of the quiz.
Cities of the World Quiz

Directions: A description of, or fact about, a major city in the world appears as part of each numbered question. The city names are listed on the right. Write the capital letter corresponding to the correct city on the line beside each question. You may use cities from the list more than once. Some cities may not be described at all.

___ 1. The Seine river divides this city into two famous banks.         A. Kyoto
___ 2. This obscure Roman fortress city suffered four major fires       B. Madison
       on its way to becoming capital of an empire larger than Rome.    C. London
___ 3. The capital city of the Island of Taiwan.                        D. Paris
___ 4. Once a capital of the Roman empire, this city became the         E. Tallahassee
       capital of the Eastern Orthodox faith.                           F. Chicago
___ 5. The tallest building in the world is located in this city.       G. Rome
___ 6. Called the "City of Big Shoulders," this city was once home      H. Lisbon
       to the world's largest stockyards.                               I. Moscow
___ 7. Home city to the Statue of Liberty.                              J. Taipei
___ 8. Located on a continental divide, this city's builders reversed   K. Constantinople
       the direction of flow of water in the City's river.              L. Beijing
___ 9. This city was once the winter capital of Japan.                  M. New York
___10. The Kremlin is located in this city.                             N. Kuala Lumpur
                                                                         O. Capetown
Suggestions for Constructing Matching Items
• Use only homogeneous material in a set of matching items; i.e., dates and places should not be in the same set.
• Use the more involved expressions in the stem, and keep the responses short and simple.
• Supply directions that clearly state the basis for the matching, indicating whether or not a response can be used more than once and stating where the answer should be placed.
• Be certain there are never multiple correct choices for one premise, although a choice may be used as the correct answer for more than one premise.
• Avoid giving inadvertent grammatical clues to the correct choice by checking that choices match each other in terms of tense, number, and part of speech.
• Arrange items in the response column in some logical order -- alphabetical, numerical, chronological -- so students can find them easily.
• Avoid breaking a set of items (premises and choices) over two pages.
• Use no more than 15 items in one set.
• Provide more choices than premises to make "process-of-elimination" guessing less effective.
• Number each premise for ease in later discussions.
• Use capital letters for the choice signs rather than lower-case letters. Insist that a capital letter be written in the area where the answer is placed.
Writing Open-Ended Test Questions

Completion Items

Advantages -- Completion items are especially useful in assessing mastery of factual information when a specific word or phrase is important to know. They preclude the kind of guessing that is possible on limited-choice items, since they require a definite response rather than simple recognition of the correct answer. Because only a short answer is required, their use on a test can enable a wide sampling of content.

Disadvantages -- Completion items, however, tend to test only rote and repetitive responses, and they may encourage a fragmented study style, since memorization of bits and pieces of information can result in higher test scores. They are more difficult to score than forced-choice items, and scoring often must be done by the test writer, since more than one answer may have to be considered correct.
Suggestions for Constructing Completion Items
• Use original questions rather than taking questions from the text.
• Provide clear and concise cues about the expected response.
• Use vocabulary and phrasing that comes from the text or class presentation.
• Provide explicit directions, when possible, as to what amount of variation will be accepted in the answers.
• Avoid using a long quote with multiple blanks to complete.
• Require only one word or phrase in each blank.
• Facilitate scoring by having the students write their responses on lines arranged in a column to the left of the items.
• Ask students to fill in only important terms or expressions.
• Avoid providing grammatical clues to the correct answer by using "a," "an," etc., instead of specific modifiers.
• Assign much more credit for completion items than for T/F or matching items.
Essay/Short-Answer Items

Advantages -- Short-answer items, those limited to fewer than five full sentences, are interchangeable with completion items. Essay items, on the other hand, allow expression of both breadth and depth of learning, and they encourage originality, creativity, and divergent thinking. Written items offer students the opportunity to use their own judgment, writing styles, and vocabularies. They are less time-consuming to prepare than any other item type.

Disadvantages -- Unfortunately, tests consisting only of written items permit only a limited sampling of content learning due to the time required for students to respond. Essay items are not efficient for assessing knowledge of basic facts, and they provide students more opportunity for bluffing and rambling than do limited-choice items. They favor students who possess good writing skills and neat handwriting, and they are pitfalls for students who tend to go off on tangents or misunderstand the main point of the question. The main disadvantage, however, is that essay items are difficult and time-consuming to score and are potentially subject to biased and unreliable scoring.
Suggestions for Constructing Essay/Short-Answer Items
• Use novel problems or material whenever possible, but only if they relate to class learning.
• Make essay questions comprehensive rather than focused on small units of content.
• Require students to demonstrate command of background information by asking them to provide supporting evidence for claims and assertions.
• Provide clear directions as to the expectations.
• Allow students an appropriate amount of time. It is helpful to give students some guidelines on how much time to use on each question, as well as the desired length and format of the response, e.g., full sentences, phrases only, outline, and so on.
• Inform students, in advance, about the proportional value of each item in comparison to the total grade.
• Keep grading in mind while creating the questions. Jot down notes on what you expect to see in student answers that would indicate mastery of the subject matter.
Step-by-Step Guidelines for Creating Tests
• Determine which types of items are best for the testing situation, and then write them.
• Write explicit directions for the test sections, indicating the credit given for each section.
• Organize the layout (group like items together; start with easy items; number the items).
• Make the answer key.
• Review patterns of responses (avoid sequences such as ABABCABABC).
• Use alphabetic, chronological, or numerical sequences to determine how response choices are organized; this will help you avoid falling into a pattern of responses.
• Consider scoring:
  o Weight test points according to the type of item, the learning assessed, and the student effort involved.
  o Score test papers anonymously.
  o Observe student confidentiality.
• Review the final product:
  o Are the items concise?
  o Have inadvertent clues been avoided?
  o Do the numbers of items written for each objective or topic area represent the emphasis placed on them during instruction?
  o Do the difficulty levels of the items seem appropriate?
  o Is the length of the test appropriate?
  o Are the test items readable (understandable)?
  o Have spelling errors and typos been corrected?
• Ask an "outside reviewer," if available, to critique the test for content, difficulty level, and timing.
• Make final changes and then duplicate the test.

Constructing Performance Tests
Advantages -- The truest measure of whether a learner is able to do something is to watch the learner do it. Performance tests provide an opportunity to make this kind of measurement. As long as you are able to observe the student’s performance of some prescribed task, your confidence in the student’s ability is affirmed.
Disadvantages -- While we may see a performance successfully completed, we have no assurance that the performance situation was typical. Nor does a failure to perform tell us what the student has done or will do in the future.

General Guidelines
Different kinds of tests will be appropriate, depending upon some of the following general guidelines.

• It is important to base the test on the specific skills or competencies that the course is promoting. A course in family therapy, for example, might include performance tests on various aspects that are covered in the course, such as recording client data, conducting an opening interview, and leading a therapy session. Developing a performance test involves isolating particular demonstrative skills that have been taught and establishing ways in which the level of skill can be assessed for each student. You might, for example, decide that the best way a student can demonstrate counseling skills, such as active listening, would be to have the student play the role of therapist in a simulated session.

• Good performance tests specify criteria on which successful performance will be judged. For curriculum areas in which it is possible to define mastery clearly, it is desirable to do so (e.g., "the student will be able to tread water for five minutes"). In most areas, however, effective performance is a complex blend of art and skill, and particular components are very subtle and hard to isolate. In these cases, it is often useful to try to highlight some observable characteristics and to define what would constitute adequate performance.

Example -- In a test of teaching, students might be expected to demonstrate clarity, organization, discussion skills, reinforcement of student responses, and the like. Operational definitions for specific components to be evaluated may be phrased like the following excerpt from a teaching observation checklist: "Praises student contributions -- The instructor acknowledges that s/he values student contributions by making some agreeable verbal response to the contributions. The instructor may say 'That's a good point,' 'Right, well done,' or the like." Such information is helpful to the student as well as the instructor who will be rating the performance.

• Define the task as clearly as possible rather than simply alerting students to the fact that their performance will be observed or rated. It is helpful to give students precise instructions on how the test will be structured, including how long they will have to complete the task, the conditions under which they will perform the task, and other factors that will allow them to anticipate and prepare for the test. If possible, set up a new testing situation by asking a student or colleague to go through a trial run before using the test with students so that unanticipated problems can be detected and eliminated.

• It is important to give the same test or same kind of test to each student. When possible, it is best to arrange uniform conditions surrounding a performance-testing situation. Students can be given the same materials to work with, or the same task. Often, however, particularly in professional practice situations, it is difficult to control the context of a performance-testing situation. One nursing student may be evaluated while dealing with an especially troublesome patient, while another will be working with a helpful patient. In these situations, documenting and allowing for the contextual influences on the performance is an extremely important part of the evaluation.

In summary, the effectiveness of performance testing is directly related to how appropriate the test is, given the course objectives; how clearly the tasks are defined; how well the criteria for successful performance have been identified and conveyed; and how uniform the testing is for all students involved.
Related Chapter -- For a discussion of grading students in a performance situation, see Chapter 13 -- Grading.
General Tips about Testing
• Use a variety of item types. It is often advantageous to include a mix of item types (multiple choice, true/false, essay) on a written exam or to mix types of exams (a performance component with a written component). Weaknesses connected with one kind of item or component, or in students' test-taking skills, will be minimized. If a mix of item types is used on one exam, items of the same type should be grouped together.
• Be cautious about test banks. You should be cautious about using tests written by others. Items developed by a previous instructor or by a textbook publisher can save a lot of time, but they should be checked for accuracy, for appropriateness to the given course, and for whether they are written according to the standards of test construction.
• Test early. You will find it helpful to test early in the semester and, if results are poor, consider discounting the first test. Students often need a practice test to understand the format each instructor uses and to anticipate the best way to prepare for and take particular tests.
• Test frequently. Frequent testing helps students avoid getting behind, provides you with multiple sources of information to use in computing the final course grade (thus minimizing the effect of "bad days"), and gives students regular feedback.
Suggestion from an instructor in Information Management Systems, College of Business -- "I give quizzes every week. (I don't count two of the quizzes, giving the students a chance for a cut and to drop their lowest grade. Some of the quizzes do not require study or effort, but they do tell me if the students are understanding very general concepts…) The students studied for the quizzes, and I believe they did better on the mid-term and final exam than students in other sections of the course, largely because of the quizzes."
• Test in proportion to the emphasis a topic was given in class. It is important to test various topics in proportion to the emphasis they have been given in class. Students will expect this practice and will study with this expectation.
• Show items to colleagues before printing the test. Written exams should be proofread with care and, when possible, a second person should be asked to proofread them. Tiny mistakes, such as mis-numbering the responses, can cause problems later. Also, check carefully for missing pages after collating.
• Reuse effective test items. If enough test items are developed and kept out of circulation between tests, it is possible to develop a test item bank from which items that are known to be effective can be reused on multiple versions or offerings of a test. (See Using Item Analysis to Test Your Test for information on how to determine the effectiveness of test items.)
• Do not use a series of questions in which answering successfully depends on knowing the correct answer to a previous item. Generally, on either a written or performance test, it is wise to avoid having separate items or tasks depend upon answers or skills required in previous items or tasks. A student's initial mistake will be perpetuated over the course of succeeding items or tasks, penalizing the student repeatedly for one error.
• Pilot-test the exam. A good way to detect test errors in advance is by pilot-testing the exam. You can take the test yourself or ask colleagues and/or former students to critique it.
• Be aware of the needs of special students. It is important to anticipate special considerations that learning disabled students or non-native speakers may need. You must decide whether or not these students will be allowed the use of dictionaries, extra time, separate testing sites, or other special conditions.
• Bring extra copies of the test to class. Having too few copies of a written exam can be a disaster. You can avoid problems by bringing more copies of the exam than you think will be needed. Also, when duplicating the test, be certain that no pages are missing. Missing pages can pose a serious problem unless a contingency has been planned.
• Do not interrupt students while they are taking the exam. Before the exam, students can be informed that they should check the board periodically for instructions or corrections. You can minimize interruptions during the exam by writing on the board any instructions or corrections that need to be made after the exam has begun and then calling students' attention to them.

Helping Students Learn from Tests

Testing's most important function is to serve as an educational tool, not simply as a basis for grading. Not only do tests direct students' studying, but they can also provide important corrective feedback for the student.

Returning Test Papers
Returning test papers promptly is appreciated by students and conforms to traditional learning principles. However, if you do not plan to discuss the papers, do not hand them back at the beginning of the hour, or you risk losing students' attention for the rest of the hour. Although students appreciate your returning examinations to them, there may be some question as to whether you should return multiple-choice examinations. Multiple-choice items are difficult to construct, and you may not want the items to "get out." However, you can return separate answer sheets so that your marking and arithmetic can be checked, and allow students to have copies of the examination while you go through the test. If you follow this method, however, certain questions arise. Does such a procedure destroy the validity of the items in future tests? Do the students benefit from an exam review? These are experimental questions to which we have only partial answers, but evidence suggests that validity is not lost and that students do learn from their corrected papers, even when they do not get to keep them. Although you may not wish to spend class time quibbling over some individual items, you should make known your willingness to discuss the test individually with students.

Providing Feedback for Essays and Short-Answer Tests
The comments written on essays and short-answer tests are far more important than the grade. What kinds of comments are helpful? Look for problems that arise from a lack of ability to see relationships, implications, or applications of material. Help students find alternative ways of looking at the problem rather than simply noting that something is wrong. Comments that provide correction and guidance may not achieve their purpose if students become so discouraged that they give up. The motivational as well as the cognitive aspects of comments need to be considered. Misconceptions must be identified, but not in overwhelming number. Encouragement and guidance for improvement should set the overall tone.
Suggestion -- When you review an essay or short-answer test in class, describe what you had expected in a “good” or “satisfactory” answer and then discuss common inadequacies. Read an example of a good answer (without identifying the student) and construct a synthetic “poor” answer as contrast.
Reviewing Limited-Choice Tests

A small-group exercise is a technique for helping students learn from mistakes while reducing their tendency to complain about the appropriateness or fairness of test items. Instructors using this technique break the class into small groups of five to eight students. Each group discusses the test for part of the class period. When they have finished, unresolved questions are referred to the instructor as the expert. This method seems to permit dissipation of the aggressions aroused and to limit arguments to points where there are several aggrieved students.

Dealing with Special Problems

What about the student who comes to your office in great anger or with a desperate appeal for sympathy but with no educationally valid reason for changing the test grade? First, listen. Engaging in a debate will simply prolong the unpleasantness. If you decide not to change the grade once you have heard the student out, try to convert the discussion from one of stonewall resistance to problem solving. Try to help the student find alternative modes of study that will produce better results. ("What can we do to help you do better next time?") Encourage the student to shift from blaming you or the test toward motivation to work more effectively.
Suggestion -- A technique that will reduce the number of students coming to your office in a state of high emotion is to ask students who have complaints about grades to write a paragraph describing their complaint or point of view. State your willingness to go over the test with anyone who brings in such a paragraph. This technique has a calming effect, resulting in fewer unfounded complaints and more rational discussion with those who do come to your office.

While these suggestions may save you some bitter moments, they cannot substitute for the time (and it takes lots) devoted to the construction of good tests.

[Adapted with permission from: A Guidebook for University of Michigan Teaching Assistants. Center for Research on Learning and Teaching, University of Michigan; and from: Teaching Tips: A Guidebook for the Beginning College Teacher (9th ed.) by W. J. McKeachie (1994).]
Using Item Analysis to Test Your Test
After a test has been administered, a good way to judge its quality, particularly in the case of a limited-choice test, is to perform an item analysis. It is especially important to do this when test items will be reused or when there is sufficient doubt about students' test results to consider dropping some items as invalid when computing the final grade. Machine-scannable test forms can be used, or software purchased, to provide item analysis. It is also possible to perform an item analysis without a computer, especially if the test is short and the class size is small.

Procedures for Computing Difficulty and Discrimination Indices
• Score each test by marking correct answers and putting the total number of correct answers on the test.
• Sort the papers in numerical order (highest to lowest) according to the total score.
• Determine the upper, middle, and lower groups. One way to do this is to call the top 27% (some people use the top third) of the papers the "upper group," the bottom 27% (some people use the bottom third) the "lower group," and the remaining papers the "middle group."
• Summarize the number correct and number wrong for each group.
• Calculate the difficulty index for each item by adding the number of students from all groups who chose the correct response and dividing that sum by the total number of students who took the test. The difficulty index will range from 0 to 1, with a difficult item being indicated by an index of less than .50 and an easy item being indicated by an index of over .80.
• Calculate the discrimination index by first calculating, for both the upper-group and lower-group students, the percentage of students who answered each item correctly. Subtract the percentage of lower-group students from the percentage of upper-group students to get the index. The index will range from -1 to +1, with discrimination over .3 being desirable and a negative index indicating a possibly flawed item. (A short code sketch following Table 2 below illustrates these steps.)
Table 2 illustrates item analysis for a simple set of scores for 37 students on a 10-item test. The names of the 10 students (approximately 27% of the total students) with the highest scores are listed as the upper group; the 10 students with the lowest scores (again, approximately 27%) are listed as the lower group; and the remaining 17 are listed as the middle group. On item 1, for example, the difficulty index was calculated by totaling the correct responses and dividing by the number of students (19/37 = .51). The item appears to be on the difficult end of the range. The discrimination index for the same item was obtained by first calculating the percent correct for both the upper and lower groups -- 20% and 90% respectively -- then subtracting the percentage for the lower group from that of the upper group (.20 - .90 = -.70). This negative discrimination index indicates that the item is probably flawed. Note that the students who scored poorly on the exam as a whole did well on this item and the students who got the top total scores on the exam did poorly -- the reverse of what one would expect. A mistake in the answer key or some error in the question that only the more discriminating students would catch might be the cause. If the answer key is correct, this item should be dropped from the test. Such items should be revised before being used on a test again. Table 2
Table 2 -- Sample Test Grid for 10 Items

                        Item number
                     1    2    3    4    5    6    7    8    9   10
Upper Group
  Ellen              C    C    C    C    C    C    C    C    C    C
  John               C    C    C    C    C    C    C    C    C    C
  Albert             W    C    C    C    C    C    C    C    C    C
  Joanne             W    W    C    C    C    C    C    C    C    C
  Maria              W    C    C    C    C    C    C    C    C    C
  Anne               W    C    C    C    C    C    C    C    C    C
  Doris              W    C    C    C    C    C    C    C    C    C
  Joshua             W    C    C    C    C    C    C    C    C    C
  Barbara            W    C    C    C    C    C    C    C    C    C
  Michael            W    C    C    C    W    C    C    C    C    C
  # Correct          2    9   10   10    9   10   10   10   10   10
  # Wrong            8    1    0    0    1    0    0    0    0    0

Middle Group (17 students)
  # Correct          8   12   12   13   12   13   11   11   12   12
  # Wrong            9    5    5    4    5    4    6    6    5    5

Lower Group
  Lucille            C    C    C    C    W    C    W    C    W    C
  Joseph             C    C    C    C    W    C    W    C    C    C
  Charles            W    W    C    C    C    C    W    C    C    W
  Leslie             C    C    C    C    C    C    W    C    C    W
  Jerome             C    C    C    C    C    C    W    C    C    C
  Nancy              C    C    C    C    C    C    W    W    C    W
  Judith             C    C    W    C    C    C    W    W    W    W
  Ralph              C    W    W    W    C    C    C    W    W    W
  Beth               C    C    W    W    W    W    W    W    W    C
  Donald             C    W    C    C    W    C    W    W    W    C
  # Correct          9    7    7    8    6    9    1    5    5    5
  # Wrong            1    3    3    2    4    1    9    5    5    5

Difficulty Index   .51  .76  .78  .84  .73  .86  .59  .70  .73  .73
Discrimination
Index              -.7   .2   .3   .2   .3   .1   .9   .5   .5   .5

C = Correct; W = Wrong
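The arithmetic above is easy to script once scores are in a list or spreadsheet. Below is a minimal sketch of the computation; the function name and data layout are my own illustration, not part of the handbook. Applied to the full Table 2 data, item 1 would come out at a difficulty of .51 and a discrimination of -.70.

```python
def item_analysis(scores, item_correct, frac=0.27):
    """Compute (difficulty, discrimination) for one item.

    scores:       each student's total test score
    item_correct: True/False for each student on this one item
    frac:         size of the upper/lower groups (top and bottom 27%)
    """
    n = len(scores)
    k = max(1, round(frac * n))
    # Rank students from highest to lowest total score.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    upper, lower = ranked[:k], ranked[-k:]
    difficulty = sum(item_correct) / n                # < .50 hard, > .80 easy
    p_upper = sum(item_correct[i] for i in upper) / k
    p_lower = sum(item_correct[i] for i in lower) / k
    discrimination = p_upper - p_lower                # > .3 desirable; negative = flawed
    return difficulty, discrimination

# Tiny illustration: an item answered mostly by the *weakest* students,
# mirroring the pattern of item 1 in Table 2, comes out negative.
scores = [9, 8, 8, 5, 4, 3]
item = [False, False, True, True, True, True]
print(item_analysis(scores, item))  # ≈ (0.67, -1.0): flag the item for review
```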
[Adapted with permission from: Teaching at the Ohio State University: A Handbook. Center for Teaching Excellence (1990).]
Cheating on Tests
The University has an Academic Honor Code that calls for the coordinated efforts of faculty members and students to uphold academic integrity and combat academic dishonesty, including cheating and plagiarism. The Academic Honor Code includes descriptions of violations of the code, statements of student and faculty responsibilities for upholding the code, and explanations of academic penalties for violating the code. A description of the Academic Honor Code can be found in the current Student Handbook.

Preventing Cheating
• Reduce the pressure. The first action you can take is to reduce the pressure on students. While you cannot influence the general academic atmosphere that places heavy emphasis on grades, you can influence the pressure in your own course. One method to accomplish this is to provide students with several opportunities to demonstrate their achievement of course objectives rather than relying upon a single examination.
• Make reasonable demands. A second way to reduce cheating is to make sensible demands. Write fair tests and design reasonable assignments. Some cheating is simply the result of frustration and desperation arising from assignments that are too long to be completed adequately or tests that require the memorization of trivial information. Remember, some students view cheating as a way of getting back at an unreasonable instructor.
• Treat students as individuals. Work to develop and maintain each student's sense that she is an individual with a personal relationship both with you and with her classmates. Students are not as likely to cheat in situations where they are known as individuals, whereas they may be tempted to cheat in situations where they feel they are anonymous members of a crowd. If a large course has regular meetings in small discussion or laboratory sections, there is likely to be less cheating if the test is administered in these groups than if the test is administered en masse. Moreover, if it is in their regular classroom, students will perform better.
• Show an interest in your students. Cheating is more likely to occur when students think the instructor is uninterested and unconcerned. Instructors often feel that any show of active proctoring will indicate that they do not trust the students. However, it is possible to convey a sense of alert helpfulness while walking between desks and watching for questions.
• Use alternate seating. The most common form of cheating is copying from another student's paper. To minimize opportunities for copying, try to recruit proctors and administer exams in a room that is large enough to enable students to sit in alternate seats. Before students arrive, write on the chalkboard: "Please sit in alternate seats."
• Use alternate test forms. Another way to reduce cheating is to use two or more alternative forms of the test. This can be achieved by simply scrambling the order of the test items. Instructors who want the test items to follow the order in which the material was covered in the course can scramble the items within topic areas (see the sketch below).
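A minimal sketch of generating such scrambled forms, shuffling item order within topic blocks so the course sequence is preserved; the item structure and names here are hypothetical, not prescribed by the handbook:

```python
import random
from itertools import groupby

# Hypothetical master test: items tagged by topic, in course order.
master = [("topic1", "Q1"), ("topic1", "Q2"), ("topic1", "Q3"),
          ("topic2", "Q4"), ("topic2", "Q5"), ("topic3", "Q6")]

def scrambled_form(items, seed):
    """Shuffle items within each topic block, leaving the topic order intact."""
    rng = random.Random(seed)
    form = []
    # groupby works here because the master list is already in topic order.
    for _, block in groupby(items, key=lambda it: it[0]):
        block = list(block)
        rng.shuffle(block)
        form.extend(block)
    return form

form_a = scrambled_form(master, seed=1)
form_b = scrambled_form(master, seed=2)
```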
• Be careful about extra copies. Do not leave copies of tests lying around your office, the typist's office, or the photocopy room.
[Adapted from: Teaching Tips: A Guidebook for the Beginning College Teacher (9th ed.), by W. J. McKeachie, Lexington, MA: D. C. Heath. (1994).]
Handling Cheating

Despite preventive measures, almost every instructor must at some time or other face the problem of what to do about a student who is cheating. Policies for handling cheating are set by the University as well as by departments. FSU's Faculty Handbook, Chapter 8, provides specific information about university policy.
Alternative Methods of Assessment
There are assessment devices, other than tests, that can be used to provide measures of student performance, including:

• Essays
• Term papers
• Research reviews
• Reports
• Case studies
• Portfolios
• Projects
• Performances
• Peer evaluation
• Mastery
• Simulations
Just as with tests, the underlying principles to keep in mind as you introduce alternative assessment tools are validity and reliability. A tool you use for measurement will be valid as long as it measures student learning of the goals and objectives set for the course. The measurement will be reliable if you would expect to get similar results from administering the chosen assessment tool to the same group of people again. It will have the additional benefit of reusability if it can be used in multiple instances, including future classes. Available media and equipment influence the choice of assessment tools, but the principal factor involved in the choice of an assessment strategy is the overall design of the course. Good design principles demand that the assessment strategies be chosen as part of the overall instructional plan before the course actually starts.
• Essays are written assignments in which the student is the source of information. Essays report on things the student knows or thinks; reference is not a major part of an essay, but an essay requires the student to use high-level thinking skills. There may be preparatory activities involved with writing an essay. For instance, an assignment may ask a student to read several articles from several viewpoints and then derive his own viewpoint from the articles. The expectation in giving the assignment is that the student will apply reasoning skills and reach a conclusion that is well reasoned. The actual position reached is not the main value of the essay, and it should not be evaluated unless your objective is to have students' opinions agree with your own.

Essays expose student reasoning processes. A practice essay (one that is not figured into the course grade) assigned near the beginning of a course gives the instructor an idea of whether the student's reasoning skills are adequate for pursuing the course. If a change in reasoning skills is a desired course outcome, an essay assigned near the end of the course is a good way to tell whether the desired skills have been attained.
• Reports (and term papers) usually have a specific topic that may be assigned by the instructor or selected by the student. When a student reports on a set of facts or events, the accuracy of the student's description is the main concern. The report often includes a provision for commentary by the student; the student's commentary is presumed to reflect the student's point of view accurately about the facts, events, or issues of the report. Research on which a report is based may be of any variety, including experimentation and documentation, and the amount of research for a report varies. In a report based on documentation, credit for quotations and concepts should be included.
Research reviews ask a student to find out what research about a topic area has been done. Unless the student is asked to synthesize the results of the research, they offer little room for a student’s free expression or creativity. If the same research review is assigned to a group of students or a class, duplication of the research found should anticipated. The assignment measures the ability of the student to use available research tools and the ability to judge whether articles found qualify as appropriate references for the subject at hand.
Case studies are often associated with problem-based learning. They are used to assess a learner's ability to analyze, make decisions, and solve problems. (Related chapter: learning through case studies is addressed in Chapter 8 – Using Active Learning in the Classroom.) Case studies measure depth of learning to a greater extent than most limited-choice tests, which focus on memorization skills. When a case study is used as an assessment tool, the instructor usually creates a case that can be completed within the allocated assessment time. Like the case studies used for learning, these contain a number of circumstance descriptions, which provide guidance through the project. While there are self-consistent solutions, there are no "perfectly right" answers. Students look at the circumstances, bring them into their own personalized conceptual frameworks, and then try to provide solutions. Therefore, answers may be phrased in various ways, but they must include all salient points and exclude inaccurate points.
Advantages: Case studies assess the readiness of learners to use their skills in real-world contexts. The student may expose both the process of working through a case and the results obtained, which allows the assessment to focus on either the process or the product. The student should be informed ahead of time as to whether process or product will receive the greater weight. Cases are equally easy to present in computer-based, hard-copy, or audio-based form, and they lend themselves to either collaborative or individual efforts. Student answers to case-type problems may, with the permission of the students involved, become models for answers to cases delivered in subsequent classes. If prior student work is used to model poor performance, care must be taken to keep the student's name and identity anonymous; it is probably better to construct poor examples if they are needed, rather than use student work.
Limitations: Cases may require considerable effort for the teacher to create or select. The tendency toward collaboration makes it difficult for teachers to assess individual contributions to team efforts. Case studies take a long time to grade, and grading of them tends to be highly subjective. This subjectivity is greatly reduced, however, by the use of guidelines and rubrics that specify what features will be assessed in a case-study answer.
Contextual considerations: Case studies can be used with almost any available media. The cases can be presented in print, audio, video, computer-based, or Internet-based forms, depending on what the learners have available, and proposed solutions can be designed for presentation via any of these media as well. Where the Internet is available, the use of collaborative learning tools such as those included in Blackboard allows tracking of individual participation in collaborative activities, provided all parties have agreed to use these tools. This means that where cases are assigned to groups, individual contributions to the group effort can be assessed in a limited sense. A tracking system may note the number of contributions an individual makes to a discussion, but it rarely evaluates the quality of those contributions. Checking samples from an archive of the discussion often gives a truer picture of individual student efforts.
Portfolios are collections of work that students assemble to demonstrate how their work has progressed and developed over time. Over the length of the course, the student will have been involved with and probably completed several projects. In most cases, it is wise for teachers to set standards for portfolio contents in terms of the number, quality, and size of the projects to be included. Fine and performing arts classes are likely to include portfolios among the assessment tools used, and other instances where tangible products result from student efforts are also good places to use portfolios. For instance, a business class may collect a portfolio that includes a business plan, some presentations, and an implementation timeline as evidence of completion of course objectives.
Advantages: Assessment of portfolios encourages extended thinking and reasoning. Where privacy is an issue, the sharing of project contents may be strictly controlled. The student is completely responsible for decisions about what to include and is therefore led to consider the quality of what will be submitted. Except for setting standards, teachers do not have much to do with the construction of the portfolio and are free to guide learners toward completion of their portfolio projects. A wide variety of media choices are available for the types of projects that may be included. Portfolios are useful in a wide range of subject areas: besides the arts, where it makes sense to let students demonstrate their capability with produced work, areas such as writing and even mathematics have been assessed with portfolio-type assessment tools. In writing, student style and organizational skills are demonstrated with portfolios; in mathematics, data analysis and problem solving may be addressed with portfolio work.
Limitations: Grading is highly subjective and can take a long time, and the teacher sees only the successes of the student, since failed projects are left out of grading considerations. Portfolios can show development across a broad spectrum of work or allow students to concentrate extensively in very narrow areas of a field of study; this strength of the portfolio approach also allows students to mask weaknesses in other areas. Extensive use of portfolios may encourage students to repeat earlier successes rather than to undertake new challenges.
Contextual considerations: Projects must be constructed in a form compatible with the available media. Mediating the projects in the portfolio to match the delivery system's capabilities may distort the value of a project. For example, a portfolio of artwork loses scale information and impact when displayed on a computer monitor. Where such distortions are likely to happen, the student should be made aware of them and asked to take measures to offset them (through labeling, etc.). Also, the act of handing in a portfolio puts at risk a large and potentially valuable body of the student's work.
Term papers are valuable to students because they provide an opportunity to become experts in small but relevant areas of the field. They should be limited to one per term. Term papers are long for most students and are a significant part of the work for the term. A term paper should be introduced early in a course and collected near the course's end. It should contain some requirement for research, and a strong indication that the student has mastered the course material as it was presented over the term. A term paper is a type of project, and the characteristics and recommendations for projects apply (see the Projects entry below).
Suggestions for making term papers more effective measures of learning:
• Ask students to write to readers other than you, such as peers, experts in the field, or specific journals.
• Clarify what the final term paper should do: classify, explain, summarize, demonstrate, generate, or design.
• Let students know your expectations concerning:
  o Academic discourse conventions
  o Level of formality
  o Structure: introductions, bodies, conclusions, and internal organization options
  o Formatting instructions: length, margins, typing, cover page, page numbering, and documentation style (give samples, if possible)
  o Charts and graphics
• Assist in the writing process:
  o Have students bring in drafts and respond to each other's work.
  o Hold one-on-one conferences.
  o Photocopy a past student's draft and critique it as a class.
  o Encourage students to take their drafts to the Writing Center.
  o Schedule the workload.
• When grading term papers:
  o Avoid being overly directive, commenting on every grammatical error or global problem, or making vague or generic comments.
  o Respond to strengths and weaknesses.
• When responding, save yourself time by:
  o Marking patterns in grammatical errors, or having students find the errors.
  o Focusing on three or four major issues.
  o Having students peer-review the term papers before turning them in.
  o Having students visit the Writing Center.
  o Carefully designing your assignment.
  o Using a grading rubric.
Projects: A teacher may request one large project or several smaller projects during a student's course of study. The assessment of student performance on the project(s) collected may make up the whole or a part of the student's course grade. Monitoring the student's ongoing progress toward completion of a project moves the emphasis of instruction and assessment away from outcome and toward process. Large projects create an opportunity for the instructor and the student to work with each other: typically, an instructor will assign a large project, then check the student's progress at various stages, offering advice for changes along the way.
Advantages: Students doing projects work outside the boundaries of the classroom, so classroom-time constraints play a limited role in how the projects turn out. Student effort goes to organizational activities along with problem solving. This allows the instructor to see the student's work at its pinnacle, much as with a portfolio. Presentation of some or all of the projects by their creators can be a classroom enhancement activity; students should know in advance whether all projects or only a sample will be presented. The student-to-teacher interactivity involved in most projects provides feedback to learners at the most appropriate time in the learning process: while the project is being done. Interaction between teacher and student also helps keep the student on schedule toward project completion.
Limitations: Since so much of the project is done outside the classroom setting, it is very difficult to monitor what the student is doing while completing the project(s), and different levels of acceptable outside help may need to be defined. The pacing of projects comes from both the student and the instructor, and hence does not faithfully represent the student's own ability to set a pace. Getting feedback to the student in a timely manner requires large amounts of teacher time and frequent teacher check-ins.
Contextual considerations: Interactive projects are very well suited to Internet-based teaching tools. Tools for project interactivity over the Internet, such as "Common Space" (for writing), are becoming readily available at low cost. These tools greatly facilitate working with this strategy, although learning to use them may take away from available contact time. Feedback on projects must be timely, since pacing can easily be upset by long teacher delays. A low class-size limit will permit you to provide timely feedback to all students.
Performances require a student to perform in a classroom or at a special facility where performances of the type in question are done. Typically, there will be some measurement of the quality of the performance. The instructor alone, the instructor and the rest of the class, or a larger invited group may view the performance.
Advantages: Established criteria help the student by clarifying the standards by which the performance will be judged, by letting students see the strengths and weaknesses they have relative to those standards, and by establishing that the grading system used is consistent.
Limitations: The number of criteria that could be applied to the measurement of a student performance is very large, and no single checklist is likely to encompass every aspect of what students do when they are asked to perform. The richness of available criteria does not make assessment of a live performance impossible, however; instead, it makes precision in matching the chosen criteria to the learning objectives more critical than ever. For example, if you were making a checklist for a dance performance, you would almost certainly include criteria related to the execution of the dance. But would you include criteria related to the choreography? Probably only if the student was responsible for the choreography. Would you include a measure for the warmth of a student's smile while dancing? That depends on whether your instruction encouraged smiling as something that enhances stage presence. Be as fair to the students as possible in the checklist criteria you include: if your learning objectives are at particular levels, the performances on your checklist should be at the same levels. Making up scales and rubrics provides guidance on how to measure what are often spontaneous and subjective performances. In general, students should be made aware of the objective criteria on which performances will be measured.
Peer evaluations: The idea behind the use of peer evaluation as an assessment tool is that a student's peers, who have had to work with and contend with that student, have a good idea of the student's level of contribution. The higher the grade level of the students, the more likely this is to be the case. By asking students to review other students' products, results, or performance, you can take account of experiences in which you were not directly involved. Peer evaluation often involves a measurement instrument distributed multiple times, which presumes that all of a student's peers participate in the evaluation. Sometimes the number of instruments to be completed and counted may be reduced by varying both the assignment of partners and the assignment of which peers review whom. Some formula for compiling the instrument results yields an indication of each student's peer evaluation score; a minimal sketch of one such formula follows below. Peer evaluation, when used at all, should be only a part, rather than the whole, of a student's final grade.
Advantages: Peer evaluation fills a gap in the usual assessment process that exists because so much of a student's performance is unseen by the instructor. Peer evaluation instruments, while subjective on an individual basis, provide data through which a quantitative measure of subjective judgments is accumulated. Feeding the measured information back to the student, in carefully chosen circumstances, may motivate improvements in performance. These instruments also measure group dynamics, which matters in environments that value active participation in collaborative activities.
Limitations: Peer evaluation may measure student popularity or some phenomenon other than the one the instructor wants to assess. Although the instruments are used to gather data, the data is an accumulation of subjective judgments, and summarizing the instrument results is time consuming. A set of standards should be provided and explained to students, or students may agree among themselves on the standards they will use in judging peer performance.
Contextual considerations: In computer-mediated, distance-education environments, collaboration is often a significant part of the learning experience, and peer evaluation is the only tool that measures this collaboration from the group's point of view. The negative influence of poor in-group performance by some students may be brought to light. Where computers mediate the communication processes of the educational environment, they can be used to aid in gathering and summing peer evaluation data, making peer evaluation strategies easier to use.
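The chapter does not prescribe a particular compiling formula, so the following minimal sketch assumes a simple one: each student's score is the mean of the ratings received from peers, with self-ratings excluded. The names and ratings are hypothetical.

from statistics import mean

def peer_scores(ratings):
    # ratings: dict mapping each rater to a dict of ratee -> rating (1-5).
    received = {}
    for rater, given in ratings.items():
        for ratee, rating in given.items():
            if ratee != rater:  # ignore self-ratings
                received.setdefault(ratee, []).append(rating)
    # Each student's peer-evaluation score is the mean of ratings received.
    return {student: round(mean(rs), 2) for student, rs in received.items()}

# Hypothetical instrument results from a four-person team.
ratings = {
    "Ana": {"Ben": 4, "Cai": 5, "Dee": 3},
    "Ben": {"Ana": 5, "Cai": 4, "Dee": 3},
    "Cai": {"Ana": 5, "Ben": 4, "Dee": 2},
    "Dee": {"Ana": 4, "Ben": 3, "Cai": 4},
}
print(peer_scores(ratings))  # {'Ben': 3.67, 'Cai': 4.33, 'Dee': 2.67, 'Ana': 4.67}

A score compiled this way would then enter the final grade only as one component, as recommended above.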
Mastery models: When it is important that a skill be mastered, an "all or nothing" approach, similar to pass/fail, may be the best indicator. This is particularly true when a skill that will see later use in the learning process is first being learned. To assess using a mastery model, it is typical to assign a project that involves the use of the new skill, so that handing in a successfully completed project serves as an indication that the skill has been mastered. In many cases, the learner may take an unlimited number of tries without penalty, and passes once mastery has been demonstrated. In essence, this model is like pass/fail for the steps along the way to achieving course objectives.
Advantages: Mastery models are not competitive: everyone can and will master the skill. Such models have high validity and reliability, and they provide a clear and direct measure of success in reaching learning objectives. Students are also able to avoid relearning redundant material.
Limitations: Mastery models are applicable only in skill-learning situations. While they measure the mastery of a skill, the different levels of mastery beyond the minimum competency level are not measured. Because everyone passes eventually, mastery leaves open the question of how to assign grades: when everyone succeeds, there needs to be a difference between an A and a D, but this method incorporates no distribution of grades that could be used to determine the difference. While applying the principles of mastery learning helps students get through courses, the non-competitive nature of the learning makes it difficult to assess inside a competitive framework.
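A minimal sketch of the record-keeping this model implies (the data layout and names are our assumption, not something the chapter specifies): attempts are counted but never penalized, and a skill is simply flagged once mastery is demonstrated.

def record_attempt(gradebook, student, skill, passed):
    # Mastery ("all or nothing") model: unlimited tries without penalty;
    # the skill is marked mastered as soon as one attempt succeeds.
    entry = gradebook.setdefault((student, skill), {"attempts": 0, "mastered": False})
    entry["attempts"] += 1
    entry["mastered"] = entry["mastered"] or passed
    return entry

gradebook = {}
record_attempt(gradebook, "student_a", "cite sources correctly", passed=False)
record_attempt(gradebook, "student_a", "cite sources correctly", passed=True)
print(gradebook[("student_a", "cite sources correctly")])
# {'attempts': 2, 'mastered': True}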
Simulations: In an assessment that uses simulation, students are placed into an environment that, in many significant ways, looks and behaves like the environment where their learning will actually be applied. They are given opportunities to perform in the simulated environment, and some record of their performance is used as the basis for assessment.
Advantages: The use of simulations reduces the student's exposure to situations that could have strong negative consequences if performed improperly in the real-world environment. The classic case is the simulated airplane cockpit, which monitors student performance in handling the controls of an airplane but will not crash. Simulations do not have to reach this level of complexity, however; students have learned social skills such as job interviewing by playing their roles as if they were really going through an interview. A simulation provides a more accurate measure of performance than just asking students to describe how they would do something, and it can be repeated for continuous improvement until a standard level of performance is reached.
Limitations: Simulations have to be designed for each situation in which performance is being assessed, and designing and building them is costly and time consuming. Once a student knows that the situation is a simulation, the stresses associated with real-world performance are significantly reduced, resulting in an inaccurate measure of the student's actual capacity to perform.
Contextual considerations: The simulator must be tested and calibrated at the student's location. Many simulations are done with computers, which makes their assessment results easy to pass on to other computers.
Resources on Testing
Books/Articles
• Anderson, P. S. (1987). The MDT innovation: Machine scoring of fill-in-the-blank tests. (ERIC Document Reproduction Service No. ED 307 287)
• Astin, A. W. (1991). Assessment for excellence: The philosophy and practice of assessment and evaluation in higher education. New York: American Council on Education/Oryx Press.
• Ben-Chaim, D., & Zoller, U. (1997). Examination-type preferences of secondary school students and their teachers in the science disciplines. Instructional Science, 25 (5), 347-367.
• Bloom, B. S., & Madaus, G. (1981). Evaluation to improve learning. New York: McGraw-Hill.
• Boaler, J. (1998). Alternative approaches to teaching, learning and assessing mathematics. Evaluation and Program Planning, 21 (2), 129-141.
• Cashin, W. E. (1987). Improving essay tests. (Idea Paper No. 17). Manhattan, KS: Kansas State University, Center for Faculty Evaluation & Development.
• Clegg, V. L., & Cashin, W. E. (1986). Improving multiple-choice tests. (Idea Paper No. 16). Manhattan, KS: Kansas State University, Center for Faculty Evaluation & Development.
• Cooke, J. C., Drennan, J. D., & Drennan, P. (1997). Peer evaluation as a real-life learning tool. The Technology Teacher, 23-27.
• Cross, K. P., & Angelo, T. A. (1993). Classroom assessment techniques: A handbook for college teachers (2nd ed.). San Francisco: Jossey-Bass.
• Duffy, T. M., & Cunningham, D. J. (1996). Constructivism: Implications for the design and delivery of instruction. In D. H. Jonassen (Ed.), Handbook of research for educational communications and technology (pp. 170-195). New York: Lawrence Erlbaum Associates.
• Erwin, T. D. (1991). Assessing student learning and development: A guide to the principles, goals and methods of determining college outcomes. San Francisco: Jossey-Bass.
• GLE: Grade Level Examination. Ensuring academic success (1991). San Diego, CA: Tudor Publishing. (ERIC Document Reproduction Service No. ED 363 620)
• Hansen, J. D., & Dexter, L. (1997). Quality multiple-choice test questions: Item-writing guidelines and an analysis of auditing test banks. Journal of Education for Business, 73 (2), 94-97.
• Jacobs, L., & Chase, C. (1992). Developing and using tests effectively: A guide for faculty. San Francisco: Jossey-Bass.
• LaPierre, S. D. (1992). Mastery-level measurement: An alternative to norm-referenced intelligence testing. Reston, VA: National Art Education Association. (ERIC Document Reproduction Service No. ED 346 024)
• Mager, R. F. (1997). Measuring instructional results (3rd ed.). Atlanta, GA: Center for Effective Performance.
• McKeachie, W. J. (1994). Tests and examinations. In W. J. McKeachie (Ed.), Teaching tips: Strategies, research, and theory for college and university teachers (9th ed., pp. 71-93). Lexington, MA: D.C. Heath.
• Mehrens, W. A., & Lehmann, I. J. (1991). Measurement and evaluation in education and psychology (4th ed.). New York: Holt, Rinehart & Winston.
• Metzger, R. L., Boschee, P. F., Haugen, T., & Schnobrich, B. L. (1979). The classroom as learning context: Changing rooms affects performance. Journal of Educational Psychology, 71, 440-442.
• Miller, H. G., Williams, R. G., & Haladyna, T. M. (1978). Beyond facts: Objective ways to measure thinking. Englewood Cliffs, NJ: Educational Technology Publications.
• Milton, O. (1978). On college teaching: A guide to contemporary practices. San Francisco: Jossey-Bass.
• Myerberg, N. J. (1996). Performance on different test types by racial/ethnic group and gender. (ERIC Document Reproduction Service No. ED 400 290)
• Natal, D. (1998). Online assessment: What, why, how? (ERIC Document Reproduction Service No. ED 419 552)
• Newmann, F. M., & Archbald, D. A. (1992). The nature of authentic academic achievement. In H. Berlak, T. Burgess, J. Raven, & T. Romberg (Eds.), Toward a new science of educational testing and assessment (pp. 71-83). Albany, NY: State University of New York Press.
• Nitko, A. J. (1983). Item analysis: Using information from pupils to improve the quality of items. In A. J. Nitko (Ed.), Educational tests and measurement: An introduction (pp. 284-301). New York: Harcourt Brace Jovanovich.
• Ory, J. C. (1979). Improving your test questions. Urbana-Champaign: University of Illinois, Office of Instructional Resources.
• Ory, J., & Ryan, K. (1993). Tips for improving testing and grading. Newbury Park, CA: Sage.
• Reckase, M. D. (1997, March). Constructs assessed by portfolios: How do they differ from those assessed by other educational tests? Paper presented at the annual meeting of the American Educational Research Association, Chicago, IL.
• Resnick, L. B., & Resnick, D. P. (1992). Assessing the thinking curriculum: New tools for educational reform. In B. R. Gifford & M. C. O'Connor (Eds.), Changing assessments: Alternative views of aptitude, achievement and instruction (pp. 37-75). Boston: Kluwer Academic Publishers.
• Roos, L. L., Wise, S. L., Yoes, M. E., & Rocklin, T. R. Conducting self-adapted testing using MicroCAT. Educational and Psychological Measurement, 56 (5), 821-827.
• Smith, C. R., & McBeath, R. J. (1992). Constructing matching test items. In R. J. McBeath (Ed.), Instructing and evaluating in higher education: A guidebook for planning learning outcomes (pp. 199-223). Englewood Cliffs, NJ: Educational Technology Publications.
• Straetmans, G. J. J. M., & Eggen, T. J. H. M. (1998, January-February). Computerized adaptive testing: What it is and how it works. Educational Technology, 45-51.
• Svinicki, M. D. (1976). The test: Uses, construction and evaluation. Engineering Education, 66 (5), 408-411.
• White, E. M. (1985). Teaching and assessing writing. San Francisco: Jossey-Bass.
• Zaremba, S. B., & Schultz, M. T. (1993). An analysis of traditional classroom assessment techniques and a discussion of alternative methods of assessment. (ERIC Document Reproduction Service No. ED 365 404)
Website
• Principles and Indicators for Student Assessment Systems. FairTest, The National Center for Fair & Open Testing (accessed November 3, 2005).
Universidad Mariano Gálvez de Guatemala
Evaluation and Assessment Techniques
Evelyn R. Quiroa, M.Ed. 2012
NAME: Erver Moisés Azurdia Sandoval   DATE: _____________   ID #: 50761114201

TEST ITEM TYPES
INSTRUCTIONS: Complete the following chart based on test item types.

MULTIPLE CHOICE
  Characteristics: valid and reliable.
  Pros & cons: easy to grade / easy to copy.
  Skill levels assessed: up to analyzing.
  Rules and tips: don't overuse "none of these."

TRUE OR FALSE
  Characteristics: valid and reliable.
  Pros & cons: easy to grade / easy to copy.
  Skill levels assessed: up to analyzing.
  Rules and tips: a 4-to-6 ratio.

MATCHING
  Characteristics: structure.
  Pros & cons: easy to grade / too elemental.
  Skill levels assessed: remembering.
  Rules and tips: must be well structured.

ESSAY QUESTIONS
  Characteristics: the student must know what he/she is doing.
  Pros & cons: difficult to grade / takes too long.
  Skill levels assessed: creating.
  Rules and tips: instructions must be clear.

FILL IN THE GAPS
  Characteristics: valid and reliable.
  Pros & cons: easy to check and copy.
  Skill levels assessed: remembering.

OPEN-ENDED QUESTIONS
  Skill levels assessed: analyzing.
UNIVERSIDAD MARIANO GALVEZ DE GUATEMALA ESCUELA DE IDIOMAS PROFESORADO EN INGLES LICDA. EVELYN R. QUIROA 2008
The Advantages of Rubrics
Part one in a five-part series
What is a rubric? A rubric is a scoring guide that seeks to evaluate a student's performance based on the sum of a full range of criteria rather than a single numerical score. A rubric is an authentic assessment tool used to measure students' work.
o Authentic assessment is used to evaluate students' work by measuring the product according to real-life criteria. The same criteria used to judge a published author would be used to evaluate students' writing.
o Although the same criteria are considered, expectations vary according to one's level of expertise. The performance level of a novice is expected to be lower than that of an expert and would be reflected in different standards. For example, in evaluating a story, a first-grade author may not be expected to write a coherent paragraph to earn a high evaluation, while a tenth grader would need to write coherent paragraphs in order to earn high marks.
A rubric is a working guide for students and teachers, usually handed out before the assignment begins, in order to get students to think about the criteria on which their work will be judged. A rubric enhances the quality of direct instruction. Rubrics can be created for any content area, including math, science, history, writing, foreign languages, drama, art, music, and even cooking! Once developed, they can be modified easily for various grade levels. The following rubric was created by a group of postgraduate education students at the University of San Francisco, but could be developed easily by a group of elementary students.
Chocolate chip cookie rubric
The cookie elements the students chose to judge were:
• Number of chocolate chips
• Texture
• Color
• Taste
• Richness (flavor)

4 - Delicious: Chocolate chip in every bite. Chewy. Golden brown. Home-baked taste. Rich, creamy, high-fat flavor.
3 - Good: Chocolate chips in about 75 percent of the bites taken. Chewy in the middle, but crispy on the edges. Either brown from overcooking, or light from being 25 percent raw. Quality store-bought taste. Medium fat content.
2 - Needs Improvement: Chocolate chips in 50 percent of the bites taken. Texture is either crispy/crunchy from overcooking or doesn't hold together because it is at least 50 percent uncooked. Either dark brown from overcooking or light from undercooking. Tasteless. Low-fat content.
1 - Poor: Too few or too many chocolate chips. Texture resembles a dog biscuit. Burned. Store-bought flavor with a preservative aftertaste (stale, hard, chalky). Non-fat content.

Here's how the table looks, row by row (Delicious / Good / Needs Improvement / Poor):
Number of chocolate chips: Chips in every bite / Chips in about 75% of bites / Chocolate in 50% of bites / Too few or too many chips
Texture: Chewy / Chewy in middle, crisp on edges / Either crispy-crunchy or 50% uncooked / Resembles a dog biscuit
Color: Golden brown / Either brown from overcooking or light from being 25% raw / Either dark brown from overcooking or light from undercooking / Burned
Taste: Home-baked taste / Quality store-bought taste / Tasteless / Store-bought flavor, preservative aftertaste (stale, hard, chalky)
Richness: Rich, creamy, high-fat flavor / Medium fat content / Low-fat content / Non-fat content
Why use rubrics? Many experts believe that rubrics improve students' end products and therefore increase learning. When teachers evaluate papers or projects, they know implicitly what makes a good final product and why. When students receive rubrics beforehand, they understand how they will be evaluated and can prepare accordingly. Developing a grid and making it available as a tool for students' use will provide the scaffolding necessary to improve the quality of their work and increase their knowledge.
In brief:
• Prepare rubrics as guides students can use to build on current knowledge.
• Consider rubrics as part of your planning time, not as an additional time commitment to your preparation. Once a rubric is created, it can be used for a variety of activities.
• Reviewing, reconceptualizing, and revisiting the same concepts from different angles improves students' understanding of the lesson.
• An established rubric can be used, or slightly modified, and applied to many activities. For example, the standards for excellence in a writing rubric remain constant throughout the school year; what changes is students' competence and your teaching strategy. Because the essentials remain constant, it is not necessary to create a completely new rubric for every activity.
There are many advantages to using rubrics:
• Teachers can increase the quality of their direct instruction by providing focus, emphasis, and attention to particular details as a model for students.
• Students have explicit guidelines regarding teacher expectations.
• Students can use rubrics as a tool to develop their abilities.
• Teachers can reuse rubrics for various activities.
Create an Original Rubric
Part two in a five-part series
Learning to create rubrics is like learning anything valuable: it takes an initial time investment. Once the task becomes second nature, it actually saves time while producing a higher-quality student product. The following template will help you get started:
1. Determine the concepts to be taught. What are the essential learning objectives?
2. Choose the criteria to be evaluated. Name the evidence to be produced.
3. Develop a grid. Plug in the concepts and criteria (see the sketch after this list).
4. Share the rubric with students before they begin writing.
5. Evaluate the end product. Compare individual students' work with the rubric to determine whether they have mastered the content.
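To illustrate steps 3 and 5 of the template, here is a minimal sketch of a rubric grid as a data structure and of the comparison against it. The concepts, abbreviated level descriptions, and the passing level of 3 are taken loosely from the fiction-writing example that follows; the function name and code layout are our own assumptions.

# Step 3, a grid: each concept maps score levels (here just 4 and 1,
# abbreviated) to the criteria a student's work must meet at that level.
fiction_rubric = {
    "plot":       {4: "both plot parts fully developed",
                   1: "neither plot part developed"},
    "setting":    {4: "both setting parts fully developed",
                   1: "neither setting part developed"},
    "characters": {4: "main characters fully developed, vivid image",
                   1: "characters not developed or named"},
}

def mastered(scores, passing_level=3):
    # Step 5: compare a student's scores with the rubric.
    # Every emphasized concept must reach the passing level.
    return all(level >= passing_level for level in scores.values())

# Hypothetical evaluation of one student's story, scored 1-4 per concept.
scores = {"plot": 4, "setting": 3, "characters": 2}
print(mastered(scores))  # False: characters falls below 3 and would be re-taught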
Fiction-writing content rubric (criteria scored 4 to 1):

PLOT: "What" and "Why"
4: Both plot parts are fully developed.
3: One of the plot parts is fully developed, and the less developed part is at least addressed.
2: Both plot parts are addressed but not fully developed.
1: Neither plot part is fully developed.

SETTING: "When" and "Where"
4: Both setting parts are fully developed.
3: One of the setting parts is fully developed, and the less developed part is at least addressed.
2: Both setting parts of the story are addressed but not fully developed.
1: Neither setting part is developed.

CHARACTERS: "Who" (described by behavior, appearance, personality, and character traits)
4: The main characters are fully developed with much descriptive detail. The reader has a vivid image of the characters.
3: The main characters are developed with some descriptive detail. The reader has a vague idea of the characters.
2: The main characters are identified by name only.
1: None of the characters are developed or named.
In the above example, the concepts include the plot, setting, and characters. The criteria are the who, what, where, when, and why parts of the story. The grid is the physical layout of the rubric. Sharing the rubric and going over it step by step is necessary so that students will understand the standards by which their work will be judged.
The evaluation is the objective grade determined by the teacher, and the teacher determines the passing grade. For instance, if all three concepts were emphasized, a passing grade of 3 in all three concepts might be required. If any part of the story fell below a score of 3, then that particular concept would need to be re-taught and rewritten with specific teacher feedback. In another example, suppose a teacher emphasized only one concept, such as character development; a passing grade of 3 in character development might constitute a passing grade for the whole project. The purpose in writing all three parts of the story would be to gain writing experience and get feedback for future work.
Share the rubric with students prior to starting the project. It should be visible at all times, posted on a bulletin board or distributed as a handout. Rubrics help focus teaching and learning time by directing attention to the key concepts and standards that students must meet.
Analytic vs. Holistic Rubrics
Part three in a five-part series
What's the difference between analytic and holistic rubrics? Analytic rubrics identify and assess components of a finished product. Holistic rubrics assess student work as a whole.
Which one is better? Neither rubric is better than the other. Both have a place in authentic assessment, depending on the following:
• Who is being taught? Because there is less detail to analyze in the holistic rubric, younger students may be able to integrate it into their schema better than the analytic rubric.
• How many teachers are scoring the product? Different teachers have different ideas about what constitutes acceptable criteria. The extra detail in the analytic rubric will help multiple graders emphasize the same criteria.
Recall the analytic rubric from part two and compare it with the holistic rubric below:
Fiction Writing Content Rubric – HOLISTIC
5 – The plot, setting, and characters are developed fully and organized well. The who, what, where, when, and why are explained using interesting language and sufficient detail.
4 – Most parts of the story mentioned in a score of 5 above are developed and organized well. A couple of aspects may need to be more fully or more interestingly developed.
3 – Some aspects of the story are developed and organized well, but not as much detail or organization is expressed as in a score of 4.
2 – A few parts of the story are developed somewhat. Organization and language usage need improvement.
1 – Parts of the story are addressed without attention to detail or organization.
Rubric Reminders:
1. Neither the analytic nor the holistic rubric is better than the other one.
2. Consider your students and grader(s) when deciding which type to use.
3. For modeling, present to your students anchor products or exemplars of products at various levels of development.
How to Weight Rubrics
Part four in a five-part series
What is a weighted rubric?
• A weighted rubric is an analytic rubric in which certain concepts are judged more heavily than others. If, in a creative writing assignment, a teacher stresses character development, he or she might consider weighting the characters part of the rubric more heavily than the plot or setting.
• Remember that the purpose of creative writing is to evoke emotion from the reader. The writing needs to be interesting, sad, exciting, mysterious, or whatever the author decides. One way to develop the intended emotion is to focus on each concept separately within the context of creative writing.
Advantages
A weighted rubric clearly communicates to students and their parents which parts of the project are most important to learn for a particular activity. Weights can be changed to stress different aspects of a project: one week a teacher may focus on character development, and in the next week or two, plot may take precedence.
A weighted rubric focuses attention on specific aspects of a project. When learning something new, it is difficult to assimilate all of the necessary details into a coherent final product; likewise, it is difficult to learn new things in isolation or out of context. A weighted rubric devised from quality projects allows new learners to focus on what is being taught, while providing meaningful context to support the entire experience.
Different ways to weight rubrics
1. Refer to the analytic rubric in part two of this series. If you have just focused on character development, simply require students to achieve a passing score of 3.00 in characters, recognizing that the other parts are also necessary for quality fiction writing.
2. Assign numeric weights to different concepts. Characters might be worth 50 percent, and the setting and plot might be worth 25 percent each. When grading a story, the teacher would put twice as much weight on characters as on either setting or plot. A passing score of at least 2.00 points, with 1.50 coming from characters, would be required. After a lesson on how to develop the plot, that concept might be worth 50 percent while the setting and characters would be worth 25 percent each. (A worked sketch of this arithmetic follows below.)
3. To achieve a cumulative effect after the second lesson, the plot and characters might be worth 40 percent each, and the setting might be worth 20 percent.
Summary
Weighted rubrics are useful for explicitly describing to students and parents which concepts take priority over others for certain activities. In designing weighted rubrics, it is important not to lose sight of the purpose of an activity by getting bogged down in meaningless details, such as the number of adjectives and verbs used or the number of pages written.
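Here is a minimal worked sketch of the weighting arithmetic in item 2 above; the function and variable names are our own, and the student's scores are hypothetical.

def weighted_score(scores, weights):
    # Weighted analytic rubric: multiply each concept's level (1-4)
    # by its weight and sum the resulting points.
    return sum(weights[concept] * level for concept, level in scores.items())

weights = {"plot": 0.25, "setting": 0.25, "characters": 0.50}
scores = {"plot": 3, "setting": 2, "characters": 3}

total = weighted_score(scores, weights)  # 0.75 + 0.50 + 1.50 = 2.75
points_from_characters = weights["characters"] * scores["characters"]  # 1.50
passed = total >= 2.00 and points_from_characters >= 1.50
print(total, passed)  # 2.75 True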
The purpose of creative writing is to evoke a response from the reader. Using written words to elicit emotion effectively requires skill and understanding of the language. The concepts are the form by which good writing is judged; the important criteria become how the author uses language to achieve his or her goals.

Weighted fiction-writing content rubric (each cell shows the criteria and the weighted points, weight x level):

PLOT: "What" and "Why" (25%)
4: Both plot parts are fully developed. (.25 x 4 = 1.00 point)
3: One of the plot parts is fully developed, and the less developed part is at least addressed. (.25 x 3 = .75 point)
2: Both plot parts are addressed but not fully developed. (.25 x 2 = .50 point)
1: Neither plot part is fully developed. (.25 x 1 = .25 point)

SETTING: "When" and "Where" (25%)
4: Both setting parts are fully developed. (.25 x 4 = 1.00 point)
3: One of the setting parts is fully developed, and the less developed part is at least addressed. (.25 x 3 = .75 point)
2: Both setting parts of the story are addressed but not fully developed. (.25 x 2 = .50 point)
1: Neither setting part is developed. (.25 x 1 = .25 point)

CHARACTERS: "Who" (described by appearance, personality, character traits, and behavior) (50%)
4: The main characters are fully developed with much descriptive detail; the reader has a vivid image of the characters. (.50 x 4 = 2.00 points)
3: The main characters are developed with some descriptive detail; the reader has a vague idea of the characters. (.50 x 3 = 1.50 points)
2: The main characters are identified by name only. (.50 x 2 = 1.00 point)
1: None of the characters are developed or named. (.50 x 1 = .50 point)
Student-Generated Rubrics
Part five in a five-part series
Why should students create their own rubrics? Reading or listening to a teacher's expectations is very different for a student from creating and accomplishing his or her own goals. The purpose of inviting students to develop their own evaluation structure is to improve their motivation, interest, and performance in the project. As students' overall participation in school increases, they are likely to excel in it.
How can students create their own rubrics? Students are intrinsically motivated to design their own assessment tool after experiencing project-based learning. Once students have invested a significant amount of time, effort, and energy into a project, they naturally want to participate in deciding how it will be evaluated. The knowledge gained through experience in a particular field of study provides the foundation for creating a useful rubric.
Background
I decided to try out student-created rubrics with my class when we did a project on bridges. The purpose of the project was for students to:
• learn basic physics concepts.
• apply fundamental mathematics principles.
• develop technical reading and writing skills.
My third-grade class began the Bridge Project by poring through books, handouts, magazine articles, Internet sites, and pictures of bridges. The class was divided into four work groups of five students each. Each group decided on its own "Company Name" as well as who would fill the following department-head positions: project director, architect, carpenter, transportation chief, and accountant. All students were required to help out in every department. Each group received $1.5 million (hypothetically) to purchase land and supplies.
Rubric development
I created the preliminary outline by listing the learning outcomes that were to be emphasized in the project. The outcomes were then divided into suitable categories, and sample products were displayed and discussed. I proceeded to introduce the idea of the rubric to the students, who then generated many ideas for the rubric criteria. Students were asked to think about which parts of the design, construction, budget, and building journal were the most significant to overall bridge quality. Together, the class came up with four different rubrics.
The budget rubric is provided as an example:

Budget Criteria (4 Excellent / 3 Good / 2 Fair / 1 Unacceptable)

Legibility
4: Completely legible.
3: The budget shows two or three marks or stains, but is legible.
2: The budget is barely legible, with numerous marks or stains.
1: The budget is messy and illegible.

Supplies & Materials Accountability
4: Completely accounted for.
3: Five-sixths of the materials and labor are accounted for.
2: Two-thirds of the materials and labor are accounted for.
1: Materials and labor are not accounted for.

Ledger Activity
4: All daily activities are recorded.
3: Five-sixths of the daily balance of funds is indicated.
2: Two-thirds of the daily balance of funds is indicated.
1: The daily balance of funds is nonexistent.

Ledger Balance
4: The daily fund record is completely accurate.
3: The daily balance has two or three inaccuracies.
2: The daily fund balance has more than three inaccuracies.
1: The daily fund balance is inaccurate.
Summary
The experience students gain through an authentic project enables them to understand the various aspects necessary for creating a valuable piece of work. Knowledge that has deep meaning provides the basis for students to judge objectively their own work as well as that of others. Developing a rubric is a reflective process that extends the experience and the knowledge gained beyond simply turning in a project for a teacher-initiated grade.
RUBRIC EXAMPLES TO ANALYZE

Appendix A: Sample Holistic Rubric
• Always prepared and attends class
• Participates constructively in class
• Exhibits preparedness and punctuality in class/class work
RUBRIC FOR ORAL PRESENTATION

AWARENESS OF AUDIENCE
SUITABILITY: match of material and presentation style to audience.
Information (including explanation and instruction):
  Distinguished (4): significantly increases audience understanding and knowledge of topic
  Proficient (3): raises audience understanding and awareness of most points
  Apprentice (2): raises audience understanding and knowledge of some points
  Novice (1): fails to increase audience understanding or knowledge of topic
Persuasion:
  Distinguished (4): effectively convinces the audience to recognize the validity of a point of view
  Proficient (3): point of view is clear, but development or support is inconclusive or incomplete
  Apprentice (2): point of view may be clear, but lacks development or support
  Novice (1): fails to effectively convince the audience
Entertainment:
  Distinguished (4): uses humor appropriately to make significant points about the topic, consistent with the interest of the audience
  Proficient (3): achieves moderate success in using humor
  Apprentice (2): humor attempted but inconsistent or weak
  Novice (1): no use of humor, or humor used inappropriately

STRENGTH OF MATERIAL, ORGANIZATION
CONTENT
Focus:
  Distinguished (4): purpose and subject are defined clearly; information and logic are self-consistent (a persuasive speech anticipates opposition and provides counterexample(s))
  Proficient (3): has some success defining purpose and subject; information and logic generally self-consistent
  Apprentice (2): attempts to define purpose and subject; has contradictory information and/or logic
  Novice (1): subject and purpose are not clearly defined; muddled
Quality of material:
  Distinguished (4): pertinent examples, facts, and/or statistics
  Proficient (3): some examples, facts, and/or statistics that support the subject
  Apprentice (2): weak examples, facts, and/or statistics, which do not adequately support the subject
  Novice (1): very weak or no support of subject through use of examples, facts, and/or statistics
Sufficiency:
  Distinguished (4): conclusions or ideas are supported by data or evidence
  Proficient (3): includes some data or evidence which supports conclusions or ideas
  Apprentice (2): includes very thin data or evidence in support of ideas or conclusions
  Novice (1): totally insufficient support for ideas or conclusions
ORGANIZATION
Introduction:
  Distinguished (4): introduction has a strong purpose statement which captivates the audience and narrows the topic
  Proficient (3): introductory statement informs the audience of the general purpose of the presentation
  Apprentice (2): introduction of subject fails to make the audience aware of the purpose of the presentation
  Novice (1): no introductory statement, or an introductory statement which confuses the audience
Core:
  Distinguished (4): topic is narrowed, researched, and organized
  Proficient (3): topic needs to be narrowed, research extended, and/or organization tightened
  Apprentice (2): topic too broad, insufficiently researched, and/or haphazardly delivered
  Novice (1): topic is general, vague, and/or disorganized
Closing:
  Distinguished (4): audience informed, major ideas summarized, audience left with a full understanding of the presenter's position
  Proficient (3): may need to refine summary or final idea
  Apprentice (2): major ideas may need to be summarized, or audience is left with a vague idea to remember
  Novice (1): major ideas left unclear, audience left with no new ideas

DELIVERY
POISE/APPEARANCE:
  Distinguished (4): relaxed, self-confident, and appropriately dressed for purpose or audience
  Proficient (3): quick recovery from minor mistakes; appropriately dressed
  Apprentice (2): some tension or indifference apparent, and possibly inappropriate dress for purpose or audience
  Novice (1): nervous, tension obvious, and/or inappropriately dressed for purpose or audience
BODY LANGUAGE:
  Distinguished (4): natural movement and descriptive gestures which display energy, create mood, and help the audience visualize
  Proficient (3): movements and gestures generally enhance delivery
  Apprentice (2): insufficient movement and/or awkward gestures
  Novice (1): no movement or descriptive gestures
EYE CONTACT:
  Distinguished (4): builds trust and holds attention by direct eye contact with all parts of the audience
  Proficient (3): fairly consistent use of direct eye contact with the audience
  Apprentice (2): occasional but unsustained eye contact with the audience
  Novice (1): no effort to make eye contact with the audience
VOICE:
  Distinguished (4): fluctuation in volume and inflection helps to maintain audience interest and emphasize key points
  Proficient (3): satisfactory variation of volume and inflection
  Apprentice (2): uneven volume with little or no inflection
  Novice (1): low volume and/or monotonous tone causes the audience to disengage
PACING:
  Distinguished (4): good use of pause, giving sentences drama; length matches allotted time
  Proficient (3): pattern of delivery generally successful; slight mismatch between length and allotted time
  Apprentice (2): uneven or inappropriate patterns of delivery, and/or length does not match allotted time
  Novice (1): delivery is either too rushed or too slow, and/or length does not match allotted time
PRESENTATION AIDS:
  Distinguished (4): clear, appropriate, not over-used, and beneficial to the speech
  Proficient (3): used, and add some clarity and dimension to the speech
  Apprentice (2): attempted, but unclear, inappropriate, or overused
  Novice (1): none used or attempted
Rubric for Paper Based on an Interview

A
- message is clear, understandable, and thought-provoking
- engaging, creative, and thoughtful
- precise, vivid, and sophisticated vocabulary; varied patterns and lengths of sentences
- coherent and organized structure
- chosen form effectively and innovatively conveys content
- relevant and intriguing use of details to convey personality and experience of the person interviewed
- few surface feature errors; only noticeable if looking for them

B
- clear and thoughtful
- complex, precise vocabulary and varied sentences
- logical organization
- chosen form effectively conveys content
- relevant and careful use of details to convey personality and experience of the person interviewed
- few surface feature errors; occasional spelling or punctuation errors

C
- quite well developed and detailed
- generally precise vocabulary and complex sentence structures containing minimal errors
- obvious organization
- chosen form appropriate for content
- relevant use of details to convey personality and experience of the person interviewed
- generally few surface feature errors; some punctuation, spelling, or pronoun reference errors

D
- direct and usually clear
- straightforward vocabulary and effective sentences that are rarely complex or varied
- organization evident
- chosen form generally appropriate for content
- competent use of details to convey personality and experience of the person interviewed
- surface feature errors such as comma splices, spelling, or pronoun reference errors

REWRITE
- limited clarity and thought
- unsophisticated and, at times, inappropriate vocabulary with simple sentences
- evidence of some organization
- chosen form rarely conveys content effectively
- inconsistent use of details to convey personality and experience of the person interviewed
- surface feature errors may at times distract the reader
Collaboration Rubric
Each row is scored Beginning (1), Developing (2), Accomplished (3), or Exemplary (4); row scores are summed for a total.

Contribute
Research & Gather Information
  1: Does not collect any information that relates to the topic.
  2: Collects very little information; some relates to the topic.
  3: Collects some basic information; most relates to the topic.
  4: Collects a great deal of information; all relates to the topic.
Share Information
  1: Does not relay any information to teammates.
  2: Relays very little information; some relates to the topic.
  3: Relays some basic information; most relates to the topic.
  4: Relays a great deal of information; all relates to the topic.
Be Punctual
  1: Does not hand in any assignments.
  2: Hands in most assignments late.
  3: Hands in most assignments on time.
  4: Hands in all assignments on time.

Take Responsibility
Fulfill Team Role's Duties
  1: Does not perform any duties of the assigned team role.
  2: Performs very few duties.
  3: Performs nearly all duties.
  4: Performs all duties of the assigned team role.
Participate in Science Conference
  1: Does not speak during the science conference.
  2: Either gives too little information or gives information which is irrelevant to the topic.
  3: Offers some information; most is relevant.
  4: Offers a fair amount of important information; all is relevant.
Share Equally
  1: Always relies on others to do the work.
  2: Rarely does the assigned work; often needs reminding.
  3: Usually does the assigned work; rarely needs reminding.
  4: Always does the assigned work without having to be reminded.

Value Others' Viewpoints
Listen to Other Teammates
  1: Is always talking; never allows anyone else to speak.
  2: Usually does most of the talking; rarely allows others to speak.
  3: Listens, but sometimes talks too much.
  4: Listens and speaks a fair amount.
Cooperate with Teammates
  1: Usually argues with teammates.
  2: Sometimes argues.
  3: Rarely argues.
  4: Never argues with teammates.
Make Fair Decisions
  1: Usually wants to have things their way.
  2: Often sides with friends instead of considering all views.
  3: Usually considers all views.
  4: Always helps the team reach a fair decision.
THE USE OF PORTFOLIO ASSESSMENT IN EVALUATION
Meg Sewell, Mary Marczak, & Melanie Horn

WHAT IS PORTFOLIO ASSESSMENT?
In program evaluation, as in other areas, a picture can be worth a thousand words. As an evaluation tool for community-based programs, we can think of a portfolio as a kind of scrapbook or photo album that records the progress and activities of the program and its participants, and showcases them to interested parties both within and outside of the program. While portfolio assessment has been used predominantly in educational settings to document the progress and achievements of individual children and adolescents, it has the potential to be a valuable tool for program assessment as well. Many programs do keep such albums or scrapbooks and use them informally as a means of conveying their pride in the program, but most do not consider using them systematically as part of their formal program evaluation. However, the concepts and philosophy behind portfolios can apply to community evaluation, where portfolios can provide windows into community practices, procedures, and outcomes, perhaps better than more traditional measures.
Portfolio assessment has become widely used in educational settings as a way to examine and measure progress, by documenting the process of learning or change as it occurs. Portfolios extend beyond test scores to include substantive descriptions or examples of what the student is doing and experiencing. Fundamental to "authentic assessment" or "performance assessment" in educational theory is the principle that children and adolescents should demonstrate, rather than tell about, what they know and can do (Cole, Ryan, & Kick, 1995). Documenting progress toward higher-order goals, such as application of skills and synthesis of experience, requires obtaining information beyond what standardized or norm-based tests can provide. In authentic assessment, information or data is collected from various sources, through multiple methods, and over multiple points in time (Shaklee, Barbour, Ambrose, & Hansford, 1997). Contents of portfolios (sometimes called "artifacts" or "evidence") can include drawings, photos, video or audio tapes, writing or other work samples, computer disks, and copies of standardized or program-specific tests. Data sources can include parents, staff, and other community members who know the participants or program, as well as the self-reflections of participants themselves. Portfolio assessment provides a practical strategy for systematically collecting and organizing such data.
PORTFOLIO ASSESSMENT IS MOST USEFUL FOR:
*Evaluating programs that have flexible or individualized goals or outcomes. For example, within a program with the general purpose of enhancing children's social skills, some individual children may need to become less aggressive while other, shy children may need to become more assertive. Each child's portfolio assessment would be geared to his or her individual needs and goals.
*Allowing individuals and programs in the community (those being evaluated) to be involved in their own change and decisions to change.
*Providing information that gives meaningful insight into behavior and related change. Because portfolio assessment emphasizes the process of change or growth, at multiple points in time, it may be easier to see patterns.
*Providing a tool that can ensure communication and accountability to a range of audiences. Participants, their families, funders, and members of the community at large who may not have much sophistication in interpreting statistical data can often appreciate more visual or experiential "evidence" of success.
*Allowing for the possibility of assessing some of the more complex and important aspects of many constructs (rather than just the ones that are easiest to measure).
PORTFOLIO ASSESSMENT IS NOT AS USEFUL FOR:
*Evaluating programs that have very concrete, uniform goals or purposes. For example, it would be unnecessary to compile a portfolio of individualized "evidence" in a program whose sole purpose is full immunization of all children in a community by the age of five years. The required immunizations are the same, and the evidence is generally clear and straightforward.
*Allowing you to rank participants or programs in a quantitative or standardized way (although evaluators or program staff may be able to make subjective judgments of relative merit).
*Comparing participants or programs to standardized norms. While portfolios can (and often do) include some standardized test scores along with other kinds of "evidence", this is not the main purpose of the portfolio.
USING PORTFOLIO ASSESSMENT WITH THE STATE STRENGTHENING EVALUATION GUIDE

Tier 1 - Program Definition
Using portfolios can help you to document the needs and assets of the community of interest. Portfolios can also help you to clarify the identity of your program and allow you to document the "thinking" behind the development of and throughout the program. Ideally, the process of deciding on criteria for the portfolio will flow directly from the program objectives that have been established in designing the program. However, in a new or existing program where the original objectives are not as clearly defined as they need to be, program developers and staff may be able to clarify their own thinking by visualizing what successful outcomes would look like, and what they would accept as "evidence". Thus, thinking about portfolio criteria may contribute to clearer thinking and better definition of program objectives.

Tier 2 - Accountability
Critical to any form of assessment is accountability. In the educational arena, for example, teachers are accountable to themselves, their students, the students' families, the schools, and society. The portfolio is an assessment practice that can inform all of these constituents.
The process of selecting "evidence" for inclusion in portfolios involves ongoing dialogue and feedback between participants and service providers.

Tier 3 - Understanding and Refining
Portfolio assessment of the program or participants provides a means of conducting assessments throughout the life of the program, as the program addresses the evolving needs and assets of participants and of the community involved. This helps to maintain focus on the outcomes of the program and the steps necessary to meet them, while ensuring that the implementation is in line with the vision established in Tier 1.
Tier 4 - Progress Toward Outcomes
Items are selected for inclusion in the portfolio because they provide "evidence" of progress toward selected outcomes. Whether the outcomes selected are specific to individual participants or apply to entire communities, the portfolio documents steps toward achievement. Usually it is most helpful for this selection to take place at regular intervals, in the context of conferences or discussions among participants and staff.

Tier 5 - Program Impact
One of the greatest strengths of portfolio assessment in program evaluation may be its power as a tool to communicate program impact to those outside of the program. While this kind of data may not take the place of statistics about numbers served, costs, or test scores, many policy makers, funders, and community members find visual or descriptive evidence of successes of individuals or programs to be very persuasive.
ADVANTAGES OF USING PORTFOLIO ASSESSMENT
*Allows the evaluators to see the student, group, or community as an individual, each unique with its own characteristics, needs, and strengths.
*Serves as a cross-section lens, providing a basis for future analysis and planning. By viewing the total pattern of the community or of individual participants, one can identify areas of strength and weakness, and barriers to success.
*Serves as a concrete vehicle for communication, providing ongoing communication or exchanges of information among those involved.
*Promotes a shift in ownership; communities and participants can take an active role in examining where they have been and where they want to go.
*Offers the possibility of addressing shortcomings of traditional assessment, and of assessing the more complex and important aspects of an area or topic.
*Covers a broad scope of knowledge and information, from many different people who know the program or person in different contexts (e.g., participants, parents, teachers or staff, peers, or community leaders).
DISADVANTAGES OF USING PORTFOLIO ASSESSMENT
*May be seen as less reliable or fair than more quantitative evaluations such as test scores.
*Can be very time consuming for teachers or program staff to organize and evaluate the contents, especially if portfolios have to be done in addition to traditional testing and grading.
*Having to develop your own individualized criteria can be difficult or unfamiliar at first.
*If goals and criteria are not clear, the portfolio can be just a miscellaneous collection of artifacts that don't show patterns of growth or achievement.
*Like any other form of qualitative data, data from portfolio assessments can be difficult to analyze or aggregate to show change.
HOW TO USE PORTFOLIO ASSESSMENT

Design and Development
Three main factors guide the design and development of a portfolio: 1) purpose, 2) assessment criteria, and 3) evidence (Barton & Collins, 1997).

1) Purpose
The primary concern in getting started is knowing the purpose that the portfolio will serve. This decision defines the operational guidelines for collecting materials. For example, is the goal to use the portfolio as data to inform program development? To report progress? To identify special needs? For program accountability? For all of these?

2) Assessment Criteria
Once the purpose or goal of the portfolio is clear, decisions are made about what will be considered success (criteria or standards), and what strategies are necessary to meet the goals. Items are then selected for inclusion in the portfolio because they provide evidence of meeting criteria, or of making progress toward goals.
3) Evidence
In collecting data, many things need to be considered. What sources of evidence should be used? How much evidence do we need to make good decisions and determinations? How often should we collect evidence? How congruent should the sources of evidence be? How can we make sense of the evidence that is collected? How should evidence be used to modify the program and the evaluation? According to Barton and Collins (1997), evidence can include artifacts (items produced in the normal course of classroom or program activities), reproductions (documentation of interviews or projects done outside of the classroom or program), attestations (statements and observations by staff or others about the participant), and productions (items prepared especially for the portfolio, such as participant reflections on their learning or choices). Each item is selected because it adds some new information related to attainment of the goals.
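To make these design factors concrete, the sketch below shows one way a program might record purpose, criteria, and typed evidence so that every item can be traced back to a goal. It is a minimal, purely illustrative Python sketch: the class names, fields, and helper methods are our own inventions, not anything prescribed by Barton and Collins (1997); only the four evidence types come from the source.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

# The four evidence types named by Barton and Collins (1997).
EVIDENCE_TYPES = {"artifact", "reproduction", "attestation", "production"}

@dataclass
class EvidenceItem:
    """One piece of portfolio 'evidence', tagged by type, goal, and date."""
    description: str    # e.g. "audio tape of peer-tutoring session" (hypothetical)
    evidence_type: str  # one of EVIDENCE_TYPES
    goal: str           # the program goal this item documents
    collected_on: date  # when it was added (portfolios are dynamic, not one-shot)

    def __post_init__(self):
        if self.evidence_type not in EVIDENCE_TYPES:
            raise ValueError(f"unknown evidence type: {self.evidence_type}")

@dataclass
class Portfolio:
    participant: str
    purpose: str            # decided first: inform development, report progress, ...
    criteria: List[str]     # what will be considered success
    items: List[EvidenceItem] = field(default_factory=list)

    def add(self, item: EvidenceItem) -> None:
        # Each item should add new information about attainment of a goal.
        self.items.append(item)

    def evidence_for(self, goal: str) -> List[EvidenceItem]:
        """All items documenting progress toward one goal, oldest first."""
        return sorted((i for i in self.items if i.goal == goal),
                      key=lambda i: i.collected_on)
```

Keeping the evidence type and collection date on every item makes it easy to check that a portfolio stays multisourced and dynamic, two of the distinguishing characteristics discussed below.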
Steps of Portfolio Assessment
Although many variations of portfolio assessment are in use, most fall into two basic types: process portfolios and product portfolios (Cole, Ryan, & Kick, 1995). These are not the only kinds of portfolios in use, nor are they pure types clearly distinct from each other. It may be more helpful to think of these as two steps in the portfolio assessment process, as the participant(s) and staff reflectively select items from their process portfolios for inclusion in the product portfolio.

Step 1: The first step is to develop a process portfolio, which documents growth over time toward a goal. Documentation includes statements of the end goals, criteria, and plans for the future. This should include baseline information, or items describing the participant's performance or mastery level at the beginning of the program. Other items are "works in progress", selected at many interim points to demonstrate steps toward mastery. At this stage, the portfolio is a formative evaluation tool, probably most useful for the internal information of the participant(s) and staff as they plan for the future.

Step 2: The next step is to develop a product portfolio (also known as a "best pieces portfolio"), which includes examples of the best efforts of a participant, community, or program. These also include "final evidence", or items which demonstrate attainment of the end goals. Product or "best pieces" portfolios encourage reflection about change or learning. The program participants, either individually or in groups, are involved in selecting the content, the criteria for selection, the criteria for judging merit, and the "evidence" that the criteria have been met (Winograd & Jones, 1992). For individuals and communities alike, this provides opportunities for a sense of ownership and strength. It helps to showcase or communicate the accomplishments of the person or program. At this stage, the portfolio is an example of summative evaluation, and may be particularly useful as a public relations tool.

Distinguishing Characteristics
Certain characteristics are essential to the development of any type of portfolio used for assessment. According to Barton and Collins (1997), portfolios should be:

1) Multisourced (allowing for the opportunity to evaluate a variety of specific evidence)
Multiple data sources include both people (statements and observations of participants, teachers or program staff, parents, and community members) and artifacts (anything from test scores to photos, drawings, journals, and audio or video tapes of performances).

2) Authentic (context and evidence are directly linked)
The items selected or produced for evidence should be related to program activities, as well as the goals and criteria. If the portfolio is assessing the effect of a program on participants or communities, then the "evidence" should reflect the activities of the program rather than skills that were gained elsewhere. For example, if a child's musical performance skills were gained through private piano lessons, not through 4-H activities, an audio tape would be irrelevant in his 4-H portfolio. If a 4-H activity involved the same child in teaching other children to play, a tape might be relevant.

3) Dynamic (capturing growth and change)
An important feature of portfolio assessment is that data or evidence is added at many points in time, not just as "before and after" measures. Rather than including only the best work, the portfolio should include examples of different stages of mastery. At least some of the items are self-selected. This allows a much richer understanding of the process of change.
4) Explicit (purpose and goals are clearly defined)
The students or program participants should know in advance what is expected of them, so that they can take responsibility for developing their evidence.

5) Integrated (evidence should establish a correspondence between program activities and life experiences)
Participants should be asked to demonstrate how they can apply their skills or knowledge to real-life situations.

6) Based on ownership (the participant helps determine evidence to include and goals to be met)
The portfolio assessment process should require that the participants engage in some reflection and self-evaluation as they select the evidence to include and set or modify their goals. They are not simply being evaluated or graded by others.

7) Multipurposed (allowing assessment of the effectiveness of the program while assessing performance of the participant)
A well-designed portfolio assessment process evaluates the effectiveness of your intervention at the same time that it evaluates the growth of individuals or communities. It also serves as a communication tool when shared with family, other staff, or community members. In school settings, it can be passed on to other teachers or staff as a child moves from one grade level to another.

Analyzing and Reporting Data
As with any qualitative assessment method, analysis of portfolio data can pose challenges. Methods of analysis will vary depending on the purpose of the portfolio and the types of data collected (Patton, 1990). However, if goals and criteria have been clearly defined, the "evidence" in the portfolio makes it relatively easy to demonstrate that the individual or population has moved from a baseline level of performance to achievement of particular goals. It should also be possible to report some aggregated or comparative results, even if participants have individualized goals within a program. For example, in a teen peer tutoring program, you might report that "X% of participants met or exceeded two or more of their personal goals within this time frame", even if one teen's primary goal was to gain public speaking skills and another's main goal was to raise his grade point average by mastering study skills. Comparing across programs, you might be able to say that the participants in Town X on average mastered 4 new skills in the course of six months, while those in Town Y mastered only 2, and speculate that lower attendance rates in Town Y could account for the difference.

Subjectivity of judgments is often cited as a concern in this type of assessment (Bateson, 1994). However, in educational settings, teachers or staff using portfolio assessment often choose to periodically compare notes by independently rating the same portfolio to see if they are in agreement on scoring (Barton & Collins, 1997). This provides a simple check on reliability, and can be reported very simply. For example, a local program could report: "To ensure some consistency in assessment standards, every 5th portfolio (or 20%) was assessed by more than one staff member. Agreement between raters, or inter-rater reliability, was 88%".
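Both kinds of reporting described here, the aggregate goal count and the inter-rater check, come down to simple arithmetic. The following Python sketch is a hypothetical illustration: the data values, band labels, and function names are invented for the example, not taken from any of the cited authors.

```python
def percent_meeting_goals(goals_met, threshold=2):
    """Share of participants who met or exceeded `threshold` personal goals."""
    hits = sum(1 for met in goals_met if met >= threshold)
    return 100.0 * hits / len(goals_met)

def percent_agreement(ratings_a, ratings_b):
    """Inter-rater reliability as simple percent agreement: the share of
    double-rated portfolios on which two raters chose the same score band."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("both raters must rate the same portfolios")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return 100.0 * matches / len(ratings_a)

# Hypothetical data: number of personal goals met by five teens...
print(f"{percent_meeting_goals([3, 1, 2, 0, 4]):.0f}% met two or more goals")  # 60%

# ...and two raters independently scoring a sample of double-rated portfolios.
rater_1 = ["25-20", "19-15", "19-15", "14-10", "25-20", "19-15", "09-05", "19-15"]
rater_2 = ["25-20", "19-15", "14-10", "14-10", "25-20", "19-15", "09-05", "19-15"]
print(f"Inter-rater agreement: {percent_agreement(rater_1, rater_2):.0f}%")  # 88%
```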
There are many books and articles that address the problems of analyzing and reporting on qualitative data in more depth than can be covered here. The basic issues of reliability, validity and generalizability are relevant even when using qualitative methods, and various strategies have been developed to address them. Those who are considering using portfolio assessment in evaluation are encouraged to refer to some of the sources listed below for more in-depth information.
ANNOTATED BIBLIOGRAPHY

Barton, J., & Collins, A. (Eds.) (1997). Portfolio assessment: A handbook for educators. Menlo Park, CA: Addison-Wesley Publishing Co.
A book about portfolio assessment written by and for teachers. The main goal is to give practical suggestions for creating portfolios so as to meet the unique needs and purposes of any classroom. The book includes information about designing portfolios, essential steps to make portfolios work, actual cases of portfolios in action, a compendium of portfolio implementation tips that save time and trouble, how to use portfolios to assess both teacher and student performance, and a summary of practical issues of portfolio development and implementation. This book is very clear, easy to follow, and can easily serve as a bridge between the use of portfolios in the classroom and the application of portfolios in community evaluations.

Bateson, D. (1994). Psychometric and philosophic problems in "authentic" assessment: Performance tasks and portfolios. Alberta Journal of Educational Research, 40(2), 233-245.
Considers issues of reliability and validity in assessment, which are as important in "authentic assessment" methods as in more traditional methods. Care needs to be exercised so that these increasingly popular new methods are not perceived as unfair or invalid.

Cole, D. J., Ryan, C. W., & Kick, F. (1995). Portfolios across the curriculum and beyond. Thousand Oaks, CA: Corwin Press.
The authors discuss the development of authentic assessment and how it has led to portfolio usage. Guidelines are given for planning portfolios, how to use them, selection of portfolio contents, reporting strategies, and use of portfolios in the classroom. In addition, a chapter focuses on the development of a professional portfolio.

Courts, P. L., & McInerny, K. H. (1993). Assessment in higher education: Politics, pedagogy, and portfolios. London: Praeger.
The authors describe a project using portfolios to train teachers to assess exceptional potential in underserved populations. The portfolio includes observations of the children's behavior in the school, home, and community. The underlying assumption of the project is that teachers learn to recognize exceptional potential if they are provided with authentic examples of such behavior. Results indicated that participating teachers experienced a sense of empowerment as a consequence of the project and became both involved in and committed to the project.

Glasgow, N. A. (1997). New curriculum for new times: A guide to student-centered, problem-based learning. Thousand Oaks, CA: Corwin Press.
This book is an attempt to identify and define current practices and present alternatives that can better meet the needs of a wider range of students in facilitating literacy and readiness for life outside the classroom. Discussion centers on the current curriculum and the need for instruction that meets the changing educational context. Included is information about portfolio assessment, design, and implementation, as well as examples of a new curricular style that promotes flexible and individualistic instruction.

Maurer, R. E. (1996). Designing alternative assessments for interdisciplinary curriculum in middle and secondary schools. Boston: Allyn and Bacon.
This book explains how to design an assessment system that can authentically evaluate students' progress in an interdisciplinary curriculum. It offers step-by-step procedures, checklists, tables, charts, graphs, guides, worksheets, and examples of successful assessment methods. Specific to portfolio assessment, this book shows how portfolios can be used to measure learning, and provides some information on types and development of portfolios.

Patton, M. Q. (1990). Qualitative evaluation and research methods, 2nd ed. Newbury Park, CA: Sage.
A good general reference on issues of qualitative methods, and strategies for analysis and interpretation of qualitative data.

Shaklee, B. D., Barbour, N. E., Ambrose, R., & Hansford, S. J. (1997). Designing and using portfolios. Boston: Allyn and Bacon.
Discusses the history of portfolio assessment, decisions that need to be made before beginning the portfolio assessment process (e.g., what it will look like, who should be involved, what should be assessed, how the assessment will be accomplished), designing a portfolio system (e.g., criteria and standards), using portfolio results in planning, and issues related to assessment practices (e.g., accountability).
Shaklee, B. D., & Viechnicki, K. J. (1995). A qualitative approach to portfolios: The Early Assessment for Exceptional Potential Model. Journal for the Education of the Gifted, 18(2), 156-170.
The creation of a portfolio assessment model based on qualitative research principles is examined by the authors. The portfolio framework assumptions for classrooms are: designing authentic learning opportunities; the interaction of assessment, curriculum, and instruction; multiple criteria derived from multiple sources; and systematic teacher preparation. Additionally, the authors examine the qualitative research procedures embedded in the development of the Early Assessment for Exceptional Potential model. Preliminary results are provided for the credibility, transferability, dependability, and confirmability of the design.

Winograd, P., & Jones, D. L. (1992). The use of portfolios in performance assessment. New Directions for Educational Reform, 1(2), 37-50.
The authors examine the use of portfolios in performance assessment. Suggestions are offered to educators interested in using portfolios to help students become better readers and writers. Addresses concerns related to portfolios' usefulness. Educators need support in learning how to use portfolios, including their design, management, and interpretation.
PLEASE VISIT THE FOLLOWING WEBSITE TO ACQUIRE MORE INFORMATION NEEDED FOR THIS COURSE IN REFERENCE TO PORTFOLIO ASSESSMENT: http://www.pgcps.org/~elc/portfolio.html
Based on the attached document, create a diagram comparing the three types of evaluation, and include WHEN, WHERE, WHY & HOW. Make sure to post your assignment through the access icon provided so that I can process your grade.
Diagnostic
• Why: The purposes are to determine students' knowledge and skills, their learning needs, and their motivational and interest levels.
• When: Usually occurs at the beginning of the school year and before each unit of study.
• Where: In the classroom, or anywhere the teacher considers appropriate.
• How: Diagnostic assessment tools such as the Writing Strategies Questionnaire and the Reading Interest/Attitude Inventory in this guide can provide support for instructional decisions.

Formative
• Why: Learner assessment generally looks at how an individual learner performed on a learning task; it assesses a student's learning.
• When: At the time needed, at the end of the day, or as weekly progress.
• Where: In the classroom, or anywhere the teacher considers appropriate.
• How: Involvement in constructing their own assessment instruments, or in adapting ones the teacher has made, allows students to focus on what they are trying to achieve, develops their thinking skills, and helps them to become reflective learners.

Summative
• Why: By looking at the group, the instructional designer can evaluate the learning materials and the learning process.
• When: At the mid-point or end of the course.
• Where: In the classroom, or anywhere the teacher considers appropriate.
• How: Summative judgments are based upon criteria derived from curriculum objectives. By sharing these objectives with the students and involving them in designing the evaluation instruments, teachers enable students to understand and internalize the criteria by which their progress will be determined.
UNIVERSIDAD MARIANO GALVEZ ESCUELA DE IDIOMAS PROFESORADO EN INGLES TECNICAS DE EVALUACION 2009
"PORTFOLIO RUBRIC"
STUDENT NAME:________________________________ DATE:__________ I.D.#:_______________________

Criteria: Required items | Concepts | Reflection/Critique | Overall Presentation

25-20 points
• Required items: All required items are included, with a significant number of additions.
• Concepts: Items clearly demonstrate that the desired learning outcomes for the term have been achieved. The student has gained a significant understanding of the concepts and applications.
• Reflection/Critique: Reflections illustrate the ability to effectively critique work, and to suggest constructive practical alternatives.
• Overall Presentation: Items are clearly introduced, well organized, and creatively displayed, showing connection between items.

19-15 points
• Required items: All required items are included, with a few additions.
• Concepts: Items clearly demonstrate most of the desired learning outcomes for the term. The student has gained a general understanding of the concepts and applications.
• Reflection/Critique: Reflections illustrate the ability to critique work, and to suggest constructive practical alternatives.
• Overall Presentation: Items are introduced and well organized, showing connection between items.

14-10 points
• Required items: All required items are included.
• Concepts: Items demonstrate some of the desired learning outcomes for the term. The student has gained some understanding of the concepts and attempts to apply them.
• Reflection/Critique: Reflections illustrate an attempt to critique work, and to suggest alternatives.
• Overall Presentation: Items are introduced and somewhat organized, showing some connection between items.

09-05 points
• Required items: A significant number of required items are missing.
• Concepts: Items do not demonstrate basic learning outcomes for the term. The student has limited understanding of the concepts.
• Reflection/Critique: Reflections illustrate a minimal ability to critique work.
• Overall Presentation: Items are not introduced and lack organization.

0 points
• No work submitted.
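Assuming each of the four criteria is scored on the 25-point bands shown above (so that the overall grade appears to total out of 100), the grading arithmetic can be checked mechanically. The Python sketch below is only an illustration of that assumption; the function and constant names are ours, not part of the rubric.

```python
# The four criteria of the rubric, each scored out of 25 points.
CRITERIA = ("Required items", "Concepts", "Reflection/Critique", "Overall Presentation")

# Point bands as printed in the rubric; note they leave 1-4 points unassigned.
BANDS = {"25-20": (20, 25), "19-15": (15, 19), "14-10": (10, 14),
         "09-05": (5, 9), "0": (0, 0)}

def band_label(points):
    """Map a per-criterion point value back to its descriptive band."""
    for label, (low, high) in BANDS.items():
        if low <= points <= high:
            return label
    raise ValueError(f"{points} falls outside every band in the rubric")

def portfolio_total(scores):
    """Sum the four per-criterion scores into a grade out of 100."""
    if set(scores) != set(CRITERIA):
        raise ValueError("score each criterion exactly once")
    for criterion, points in scores.items():
        band_label(points)  # raises if a score falls outside the printed bands
    return sum(scores.values())

total = portfolio_total({"Required items": 22, "Concepts": 18,
                         "Reflection/Critique": 15, "Overall Presentation": 12})
print(total)  # 67 out of 100
```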
UNIVERSIDAD MARIANO GALVEZ DE GUATEMALA FACULTAD DE HUMANIDADES ESCUELA DE IDIOMAS LICDA. EVELYN R. QUIROA
ASSESSMENT AND EVALUATION VOCABULARY (DIAGNOSTIC)
1. Action Research
2. Affective Outcomes
3. Annual Report
4. Assessment
5. Assessment Cycle
6. Assessment Tool
7. Assessment Literacy
8. Authentic Assessment
9. Benchmark
10. Cohort
11. Course-embedded assessment
12. Course-level assessment
13. Course mapping
14. Criterion Referenced Tests
15. Curriculum Map
16. Diagnostic Evaluation
17. Direct Assessment
18. Educational Goals
19. Formative assessment
20. General Education Assessment
21. Holistic Scoring
22. Learning outcomes
23. Measurable Criteria
24. Metacognition
25. Norm
26. Portfolio
27. Primary Trait Method
28. Process
29. Program assessment
30. Reliability
31. Rubric
32. Self-efficacy
33. Senior Project
34. Summative assessment
35. Validity
UNIVERSIDAD MARIANO GALVEZ DE GUATEMALA FACULTAD DE HUMANIDADES ESCUELA DE IDIOMAS LICDA. EVELYN R. QUIROA
What is assessment and evaluation?
Assessment is defined as data-gathering strategies, analyses, and reporting processes that provide information that can be used to determine whether or not intended outcomes are being achieved. Evaluation uses assessment information to support decisions on maintaining, changing, or discarding instructional or programmatic practices. These strategies can inform:
• The nature and extent of learning,
• Curricular decision making,
• The correspondence between learning and the aims and objectives of teaching, and
• The relationship between learning and the environments in which learning takes place.
Evaluation is the culminating act of interpreting the information gathered for the purpose of making decisions or judgments about students' learning and needs, often at reporting time. Assessment and evaluation are integral components of the teaching-learning cycle. The main purposes are to guide and improve learning and instruction. Effectively planned assessment and evaluation can promote learning, build confidence, and develop students' understanding of themselves as learners. Assessment data assists the teacher in planning and adapting for further instruction. As well, teachers can enhance students' understanding of their own progress by involving them in gathering their own data, and by sharing teacher-gathered data with them. Such participation makes it possible for students to identify personal learning goals.

Types of Assessment and Evaluation
There are three types of assessment and evaluation that occur regularly throughout the school year: diagnostic, formative, and summative.

Diagnostic assessment and evaluation usually occur at the beginning of the school year and before each unit of study. The purposes are to determine students' knowledge and skills, their learning needs, and their motivational and interest levels. By examining the results of diagnostic assessment, teachers can determine where to begin instruction and what concepts or skills to emphasize. Diagnostic assessment provides information essential to teachers in selecting relevant learning objectives and in designing appropriate learning experiences for all students, individually and as group members. Keeping diagnostic instruments for comparison and further reference enables teachers and students to determine progress and future direction. Diagnostic assessment tools such as the Writing Strategies Questionnaire and the Reading Interest/Attitude Inventory in this guide can provide support for instructional decisions.
Formative assessment and evaluation focus on the processes and products of learning. Formative assessment is continuous and is meant to inform the student, the parent/guardian, and the teacher of the student's progress toward the curriculum objectives. This type of assessment and evaluation provides information upon which instructional decisions and adaptations can be made and provides students with directions for future learning. Involvement in constructing their own assessment instruments or in adapting ones the teacher has made allows students to focus on what they are trying to achieve, develops their thinking skills, and helps them to become reflective learners. As well, peer assessment is a useful formative evaluation technique. For peer assessment to be successful, students must be provided with assistance and the opportunity to observe a model peer assessment session. Through peer assessment students have the opportunity to become critical and creative thinkers who can clearly communicate ideas and thoughts to others. Instruments such as checklists or learning logs, and interviews or conferences provide useful data.

Summative assessment and evaluation occur most often at the end of a unit of instruction and at term or year end when students are ready to demonstrate achievement of curriculum objectives. The main purposes are to determine knowledge, skills, abilities, and attitudes that have developed over a given period of time; to summarize student progress; and to report this progress to students, parents/guardians, and teachers. Summative judgements are based upon criteria derived from curriculum objectives. By sharing these objectives with the students and involving them in designing the evaluation instruments, teachers enable students to understand and internalize the criteria by which their progress will be determined.

Often assessment and evaluation results provide both formative and summative information. For example, summative evaluation can be used formatively to make decisions about changes to instructional strategies, curriculum topics, or learning environment. Similarly, formative evaluation assists teachers in making summative judgements about student progress and determining where further instruction is necessary for individuals or groups. The suggested assessment techniques included in various sections of this guide may be used for each type of evaluation.
Workshops Comment
I went to twelve workshops and I got a lot from each one of them; I wish I could have gone to more, but it wasn't possible. All of them were great, but to mention three, I would say that the one given by M.A. Wendy del Aguila was a really good one, in which I learned a lot from the speaker, who, aside from being astonishingly beautiful, made me realize that there are new tools available for teachers to help students become better at Business English. Another one was the one we had in the Aula Magna, given by Philip Haines: "The What, Why and How of ESP". All of us have been teaching English for several years, but it has usually been in a school and it hasn't been for a specific purpose. Some of us work as freelancers, where we have to teach doctors, engineers, veterinarians, etc. Most of the time we don't know much of the vocabulary they use, but in this workshop I learned that it isn't necessary to know everything; we just have to know how to handle the situation. Just use ESP. Another one that caught my attention was the one given by M.A. Ivette Menendez, "Ubiquitous Learning". She showed us fun and easy ways of making a web page, which we could use to help our students and to make our classes more interesting to them.
It is great that we didn't go to the same workshops, so through your summaries I could get a sense of what the other workshops were about; that way I won't miss much. Thanks for such great summaries.
WOW PROJECT
DESCRIPTION: Your group will create a new kind of machine, object, fruit, or vegetable, and you will provide concrete evidence of how the invention works, looks, tastes, smells, and feels. Your group has to bring pictures to class, or the invention itself if possible.
OBJECTIVE: Students will be able to apply descriptive words in describing their new invention.
INSTRUCTIONS:
• Decide what sort of invention you will make.
• Write a well-structured descriptive paragraph that explains what it is and what it does.
• Plan and draw what it will look like. You will have to prepare an advertisement to present your invention and try to sell it to your friends.
TIME: You will have today and tomorrow to plan and make your new invention.
DATE OF ACTIVITY:
• We will unveil it tomorrow, since we have two periods of class.
• You will have 10 minutes to present your invention.