Art | Design + Science and the Power of Metrics, Contribution #2014-01


Art | Design + Science: Lessons from Informal Learning and the Value of Metrics1

Melita Morales2 and Jennifer Bissonnette3

Edna Lawrence Nature Lab Rhode Island School of Design Providence, RI

Supported in part by National Science Foundation grant #1004057 to Rhode Island Experimental Program to Stimulate Competitive Research (EPSCoR)

1. Contribution #2014-01 to the Edna Lawrence Nature Lab, Rhode Island School of Design, Providence, RI

2. EPSCoR | STEAM Communications and Engagement Coordinator, Edna Lawrence Nature Lab, Rhode Island School of Design
3. Biological Programs Designer, Edna Lawrence Nature Lab, Rhode Island School of Design


Table of Contents

INTRODUCTION
DESIGNING YOUR PROJECT PLAN
  Mapping a Project
  Choosing a Learning Outcome
  Picking an evaluation method
  Picking a Population Sample
METHODS OF DATA COLLECTION
  Surveys
  The Interview
  Observations
  Personal Meaning Maps
  New Technologies
ANALYZING YOUR DATA
  Visual Representations
  Comparing Your Data – Statistics
REPORTING YOUR RESULTS
  The Scientific Paper
  Science Conference Poster
APPENDIX A - Logic Model Case Studies
APPENDIX B – Example of Learning Impact
APPENDIX C – References for Further Study
APPENDIX D – Chi square table
APPENDIX E – Mann Whitney U test table for alpha level 0.05



INTRODUCTION

How do we know what we know? When can we categorize what we know as universal knowledge, and when does it remain an individual perspective? Artists and scientists supposedly take two different inroads to investigate and construct knowledge. The two fields are often viewed as polar opposites, set apart by their disciplinary methodologies, and there is a charge to the different modes of inquiry that can attract or repel practitioners of the other. This is especially true for collaborative projects in which both the artist and the scientist must find common ground for assessing and critiquing the validity, or impact, of the work being done. Andrew Yang warns, “If immigrants to a discipline don’t bother to learn the rules, remain ignorant of founding assumptions, or work unintelligibly separate from the values or concerns of natives, then their creative contributions will not be recognizable as any sort of contribution at all.”4 In authentic interdisciplinary work, artists and scientists are called to step across disciplinary boundaries in a way that does not marginalize or trivialize the work that happens in either field.5 By acknowledging connected relationships and allowing for a flow of knowledge between the personal qualitative and the empirical quantitative, crossing disciplinary boundaries can yield results that are different from those achieved when working alone.

The focus of this document is to familiarize artists with the inquiry methods of scientists, and to serve as a guide to understanding a formalized approach to assessment and evaluation. It is for the creator of an art exhibition or project who is asked to cross over into a more quantitative critique. To be clear, there is more than one way to know the world, and this is not an endorsement of the scientific method as the only way. It is meant to guide the artist in taking a step, within collaborative work, toward the processes that define scientific exploration.

When granting agencies such as the National Science Foundation (NSF) fund art|design + science projects or collaborations, certain formalities specific to the science culture should be understood and respected. Science progresses through the use of the scientific method, which begins with a series of observations that lead to the construction of a hypothesis. This hypothesis is then tested for its validity through carefully constructed experimentation, the results of which are analyzed, and the findings and suggestions for further study are discussed.

4. Yang, Andrew. “Interdisciplinarity as Critical Inquiry: Visualizing the Art/Bioscience Interface.” Interdisciplinary Science Reviews 36(1), 2011: 42-54.
5. Marshall, Julia. “Transdisciplinarity and Art Integration: Toward a New Understanding of Art-Based Learning Across the Curriculum.” Studies in Art Education 55(6), 2006: 17-24.



As a society, we have come to rely on the scientific method to advance our understanding of how the world works, using quantitative data to reveal viable causal relationships. Conversely, much of what we know about the impact of art resides in qualitative narratives, stories of transformation and inspiration that shift perspectives about the world. Scientist + artist|designer collaborations often create a hybrid result that can be considered a facet of informal learning.

Informal learning spaces are one example where artists and scientists are called to overlay their knowledge of design, aesthetics and content to generate work that is engaging, memorable and transformative. The field of informal learning has grown to encompass all learning that takes place in museums, nature centers, aquaria, botanical gardens and other outside-of-school environments. Referred to as free-choice, complementary, informal and out-of-school learning, informal learning environments offer participants a chance to engage with the information with self-direction, moving through content with varying degrees of prior knowledge and a variety of constraints.6 Informal learning recognizes “personal experience as fundamental” and can be identified as seeking “experience vs. information, exploration vs. explanation, and meaning making vs. transmission-absorption.”7 If it is accurate that “people get most of their knowledge about science from someplace other than school or formal education,”8 there is great benefit to our society for artists and scientists to work together to find the best ways possible to communicate scientific information.

Assessment is a critical component of all creative and experimental projects, but discussion of this requirement is particularly relevant in projects that receive federal funding, in which a certain level of accountability is implied. Funders want to see that the success of their investment will be formally assessed. In the case of an emerging funding field such as art|design + science collaborations, it is also critical to document and explain the process undertaken in the research. If we look closely at the evaluation of informal settings, it is not such a foreign concept to the artist. The act of gathering data to assess the success of initial research goals runs parallel to the idea of critique. Artists are trained to both solicit and participate in critique as a critical part of the peer evaluation of their work. They are often taught to request formative and summative evaluation in order to iterate on their first idea as well as come closer to successfully communicating their idea or perspective. In the introduction to the book The Critique Handbook, Buster and Crawford write, “the critique is both a deadline and a marker of a perpetual beginning, a freeze-frame moment in the context of a continuous studio practice.” Critique helps identify work at a point in its evolution, propelling, through the deconstruction of a work, the student and peers to locate and articulate where the artist succeeds or falls short in communicating their idea. Analysis and knowledge use come into play, pushing the viewers and the art maker toward

6. Judy Diamond, Jessica J. Luke & David H. Uttal. Practical Evaluation Guide: Tools for Museums and Other Informal Educational Settings. AltaMira Press: Lanham, Maryland. 2009. 11.
7. Ted Ansbacher. “On Making Exhibits Engaging and Interesting.” Curator 45(3), 2002: 167-173.
8. Oregon State University. “Surveys Confirm Enormous Value of Science Museums, ‘Free Choice’ Learning.” ScienceDaily. Accessed June 10, 2014. http://bit.ly/1lkDC2r



higher order thinking. Artists learn to utilize logical reasoning during critique, finding evidence to support their declarations of meaning. Informal learning assessment similarly looks for evidence, qualitative and quantitative, that answers the question, “How can we know what impact this work has?” The scope of the work, in addition to its assessment, offers an intersection between what others have done, the data available, and your interpretation in communicating scientific information. In a report for the MacArthur Foundation, assessment is defined as “The production of knowledge useful for individuals, groups and communities to improve practices toward valued goals.”9 Evaluation is defined as “Judgments made regarding how well goals are being achieved and how valuable the totality of all outcomes is.”10 The process of creating a metric follows the scientific method, which calls for the design of an experiment in which a hypothesis is tested, data collected, and conclusions drawn based on the data. What is revealed should be useful for both assessing and evaluating your project. A quote from an NSF report on the impacts of evaluating informal learning hints at one important area of crossover between the fields of art and science:

“…evaluation is not just for preparing good proposals. It is also an integral part of running good projects. During crucial stages of program development, evaluation documents or measures achievements or outcomes against intended goals and objectives (while also being open to unanticipated outcomes as well). All forms of evaluation play an important role in planning, enabling “reflective practice” and facilitating project team/institutional learning… Utilizing all forms of evaluation helps to ensure the progress and success of your efforts.”11

Learning ideally becomes visible and, at best, repeatable, adding to the growing body of research on the benefits of using art and experiential modes of inquiry to teach scientific information.

Many variables complicate the landscape of informal learning assessment, as these interactions take place everywhere from crowded public venues to the solitary space of a human-computer interaction in a home. This can make a controlled evaluation challenging. The following sections should serve both to inform you of different informal learning assessment practices and to broaden your perspective on how to gain the most accurate and authentic insight into the success of your project. As all artists know, there are many ways to get to a final outcome, and assessment is no less of a creative journey.

9. Jay Lemke, Robert Lecusay, Mike Cole, and Vera Michalchik. “Documenting and Assessing Learning in Informal and Media-Rich Environments.” A report to The MacArthur Foundation (2012). 11.
10. Documenting and Assessing Learning in Informal and Media-Rich Environments, 11.
11. National Science Foundation, The Division of Research on Formal and Informal Learning. Framework for Evaluating Impacts of Informal Science Education Projects. By Sue Allen, Patricia B. Campbell, Lynn D. Dierking, Barbara N. Flagg, Alan J. Friedman, Cecilia Garibay, Randi Korn, Gary Silverstein, and David A. Ucko. Ed. Alan J. Friedman. (Washington, D.C.: The United States Government Printing Office, 2008), 20. (Available at: http://insci.org/resources/Eval_Framework.pdf)



DESIGNING YOUR PROJECT PLAN

Mapping a Project

Ideally, all informal learning projects are able to articulate the goals and purpose of their exhibits. For example, a goal might be to introduce a participant to the concept of global warming, or to the idea that small changes in the strength of one species can lead to extinction for others. By starting with the project goals as a constraint, you can articulate why and for whom you are creating this project. You can also build in a desired area of impact. In their National Science Foundation report on informal learning, Allen et al. write, “You first think about what you want to accomplish with the target audience you feel you can best reach and then describe how the particular type of project will enable these outcomes to be accomplished.”12 This does not mean that you have a product designed at the outset. Rather, you can begin to shape your project around your intended purpose.

In order to continue a discussion on project mapping, it is important to be clear about terms. The following words will be used to describe different components of a project and its evaluation plan.

Goals are the overall purpose of the work you are doing. This is a broad statement about what the program intends to accomplish.

EXAMPLE: I am creating a work of art to increase interest and engagement in concepts of climate change as it affects marine life.

Objectives are the expected achievements that are defined, specific and measurable, drawn from the statement of the goal.

EXAMPLE: By interacting with images of plankton and being introduced to information regarding plankton’s production of much of the Earth’s oxygen, participants’ connection to ocean life will shift, making them personally want to understand the creatures as indicators of climate change.

Activities are the modules of the project with which the audience will be engaged, the interactions that will happen to achieve the objectives.

EXAMPLE: Participants will listen to musical compositions set to microscopy videos that make plankton more charismatic and relatable.

Outputs are the actual deliverables: the people affected, the products distributed, or the program provided. Outputs are what you did and who you reached.

EXAMPLE: 250 people will be exposed to the video. Of those exposed to the video, 40 will be chosen across gender and age to complete a survey targeting their level of interest. The exhibit of the video will also be recorded.

12. Framework for Evaluating Impacts of Informal Science Education Projects, 25.



Outcomes are the desired immediate, intermediate and longer-term effects of the project on changing behaviors, attitudes, knowledge or perceptions. They are the evidence that relates to the achievement of your goal. Outcomes should also include unintended outcomes. (These are also detailed later on in this document.)

EXAMPLE: It is my hope that of the 40 people who will take the survey, 75% demonstrate an increase in curiosity about what plankton are and why they are important in climate change. Additionally, I would like to see that more than 50% of the people who viewed the video (125 people) had dwell times above the national average for viewing a piece of art.

Impact is the change that might occur over long periods of time from the one exposure, or from continued exposure, to this exhibit or other similar exhibits. Changes could be in individuals, communities or policies.

EXAMPLE: It is my goal that two years after watching the video, 75% of the viewers who took the survey will report that they independently sought out information on plankton.

External Factors describe the context of your study that may affect your project operations and outcomes. These are often factors that are beyond your control or address the limitations of your study.

EXAMPLE: Due to the time of evening that the gallery will be able to host the video exhibit, we will not be able to get substantial numbers of children between the ages of 8-14 to view the exhibit. Therefore we will be unable, with this particular study, to make generalizations about this age group.

A logic model is a mapping tool to assist in articulating one’s journey in getting to the overarching goal (see Appendix A), a road map for declaring what you are trying to accomplish and how. Logic models are working diagrams that purposefully and consciously demonstrate relationships among the goals of your work and the objectives, activities, outputs and outcomes. The logic model can also be laid out in if/then statements to expose the conceptual relationship between the categories: IF [program activity] THEN [program objective], and IF [program objective] THEN [program goal]. Dissecting a project in such a way can initially seem like the antithesis of the creative process, and it may not be the right tool for all artistic projects. Whether you use a logic model or an alternative mapping tool, it should help guide you to the problem you are trying to solve while leaving room for the surprises that surface along the way. Beginning with your goal and working backward, called backwards design, helps make sure that the activities planned, or the outputs, will actually get you to your desired outcomes. Settling on at least one testable project will be hard work. It will involve times of inspired ideation that have no seeming direction as well as periods of concentrated, linear effort. The oscillation between these two modes of working is a quality that is important not to lose in your efforts. Yet there is also a need to formulate ideas about how you can assess your project in the early stages as it begins to take shape. The logic model, or project map, should not replace the brainstorming that goes into designing your actual final output. Many artists are familiar with the design thinking process. Currently incorporated into both formal and



informal settings, it models the fluctuation between convergent and divergent thinking as the work is developed (see Figure 1). This is often an intuitive part of the creative process. The ideation phase is a time for expansion around how to achieve your goal, and the experimentation phase is a time to test whether your chosen communication method worked to achieve those goals. The stages of ideation and exploration allow for an opening up of possibilities and ideas, which then require tailoring in order to test how successfully they communicate. Through this process, you can come up with many possibilities for how to tackle your key questions, and refine the ideas to at least one output.

Figure 1: Phases of the Design Thinking process, from IDEO’s Educator’s Toolkit.13

13. IDEO. “Design Thinking Toolkit for Educators, 2nd Edition.” Accessed on 4/12/14. http://www.designthinkingforeducators.com/DTtoolkit_v1_062711.pdf, 15.



Choosing a Learning Outcome

When exhibitions or programs are set up in informal learning spaces, evaluators want to be able to test how effective they are. As mentioned earlier, objectives are just one portion of the project plan, yet they are specific enough to help determine your evaluation method. To be able to evaluate the success of a learning module, the learning outcome must be clearly articulated. Below are learning outcomes that encompass typical goals for informal learning environments.

1. Awareness, knowledge or understanding of phenomena that demonstrate shifts in implicit knowledge around key ideas or concrete information such as vocabulary. This could be a shift in knowledge retention or recall.
2. Increased engagement or interest, or the degree to which we are invested in something, can add to what is retained or remembered. It can also be a factor in raising curiosity about a subject, which motivates further exploration of the topic.
3. A shift in attitude suggests long-term changes in the way a participant responds to, or feels about, the content or phenomena. It can be seen in a shift in the relationship a participant has toward the content.
4. Behavior changes are measured by what a participant might do differently, the actions that are taken after engaging with the project.
5. Skills are tested by looking at participants’ changes in areas such as critical thinking, interpreting, making predictions and drawing conclusions.

In order to most accurately assess these impacts, it is important to be specific about what you are trying to achieve. Lumping many of these outcomes together can confuse the data and make it impossible to measure anything at all. While whittling it down to just one specific impact may seem like a minuscule, incremental measurement, the priority is to be able to accurately assess the outcomes. The more specific you are in defining what you would like those to be, the more accurate your data will be.



Picking an evaluation method

Evaluations that test at multiple points in a project can give more accurate and revealing data. Front-end evaluation can determine what participants are interested in, what they know, and the demographics of participants at the beginning of a study. Formative evaluation allows the project creators to alter their direction based on what is discovered about how the activity is actually working and for whom. This is often done with a prototype or test audience as a way to improve a strategy before it is set in concrete. Remedial evaluation offers a look at a finished project after an initial exposure to the greater public, ideally with time to make one last round of improvements before the final evaluation of a project. Ultimately, the summative evaluation tells about the impact of the project: it collects data on how well your activities and outputs were able to meet your goal. Ideally the data from a summative evaluation would meet all your proposed objectives. Be prepared for the possibility that it won’t. This is not a failure, especially if this is one of the first times you are evaluating an art project in this way. There are many nuances, which we will get into later, to the types of evaluation chosen and how to improve the method. Often, the surprises that emerge in the data can lead to unexpected insights about the project or the population sampled. When determining what evaluation method will be most appropriate for your project, it is important to separate quantitative assessments from qualitative assessments.

Quantitative methods attempt to classify diverse opinions or behaviors into established categories. They are good for establishing numerical patterns and comparisons based on a limited set of variables. Quantitative responses can be quickly turned into visual charts of the data, such as histograms and line, bar or pie graphs. Quantitative evaluation methods are often used for making generalizations about larger populations from an appropriately sized sample population.
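Because quantitative responses lend themselves to quick visual summaries, a minimal sketch of charting them is shown below, assuming Python with the matplotlib library is available; the question wording and response counts are hypothetical placeholders.

```python
# Minimal sketch: charting tallied quantitative survey responses as a bar graph.
# The question and counts below are hypothetical placeholders.
import matplotlib.pyplot as plt

# Tallied answers to a 5-point interval (Likert-style) question
labels = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]
counts = [4, 7, 12, 10, 7]  # number of respondents choosing each option

plt.bar(labels, counts)
plt.title("Plankton are important indicators of climate change (n = 40)")
plt.ylabel("Number of respondents")
plt.xticks(rotation=30, ha="right")
plt.tight_layout()
plt.savefig("responses.png")  # or plt.show() for interactive use
```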

Qualitative methods emphasize a depth of understanding. They often examine individual cases and utilize direct quotations, open-ended narrative, detailed reporting of events, and behavioral observation. Qualitative methodologies can be good for more complex data, getting a wider range of feedback, or uncovering prevalent trends in thoughts and opinions. The population sample size tends to be small within a given quota.

Two key components of either qualitative or quantitative methods are validity and reliability. Validity attests to the idea that the tool you decided to use is in fact measuring what you set out to measure. For example, if you are looking at behavior as an indicator of learning, it is important that you know the behavior in question is actually a real indicator. Diamond et al. offer, “observing [that] a visitor is looking at a label is not necessarily evidence that they have read the label.”14 Reliability refers to how replicable a method is. The method you choose, when tried by someone else, should measure the same thing, in the same way, each time it is used. For example, “A question in an interview that is asked a different way each time may not be reliable because it elicits a different kind of response depending on

14. Practical evaluation Guide, 46.



the way that it is asked.”15 Validity and reliability are what turn a collection of findings into research that could be useful for others interested in exploring the same topics, taking your data a step further, or digging deeper into one emergent aspect.

Complications to the evaluation process

As mentioned before, there are many complications to assessing informal learning. Below are just a few to be aware of and think through as you design your study.

It is difficult to reveal a participant’s prior knowledge, or implicit knowledge. Often unconscious, this knowledge shapes the way a participant interacts with the work, and cannot be assumed for all participants.

Another variable is “that the very act of measuring and reporting on a person or an organization changes their behavior.”16 The evaluator wants to observe the interaction of the participants in the activity without any behavior change or modification. When being observed, a layer of self-consciousness will cause many people to alter their normal actions.

Participants’ actions can be socially influenced by the people around them, such as whether they came with a school group, came with family, or are exploring the exhibit individually.

Coverage error results from not all members of the target population for the study having an equal chance of being selected in the sample. For example, if a survey is targeting the general population, but being administered online, the sample population will only consist of those participants who have access to the internet.

Non-response error is a bias created by a differential in response rates of subgroups. For example, if the survey’s target population was between the ages of 20-30 and 30-40, and a majority of the responders were in the 30-40 bracket, the results would be skewed.

Measurement error is attributed to bias in the questions (see above) or bias by the interviewer, which affects survey results.

As surveys become the norm, people have reportedly exhibited signs of survey fatigue and are less willing to put in the time or energy required.

Social desirability bias refers to survey participants’ unwillingness to disclose personal information about themselves, or their wish to present themselves in a favorable light.17

15. Practical Evaluation Guide, 46.
16. Elizabeth Merritt. “You Get What You Measure.” Center for the Future of Museums Blog, 11/17/2009. Retrieved 6/3/13 from http://futureofmuseums.blogspot.com/2009/11/blog-post.html
17. Caroline Roberts. “Mixing modes of data collection in surveys: A methodological review.” ESRC National Centre for Research Methods Briefing Paper. City University, London. March 2007. 7.



Picking a Population Sample

Once it has been decided what you are going to test for, and whether your assessment method is qualitative or quantitative, a sample population must be generated. Who the participants are and how many there are will have a great effect on what data is generated and whether or not it is valid and reliable enough to make generalizations about a larger population. In quantitative studies, it is important to have larger sample population sizes so that the findings are less skewed. The mathematics involved in figuring out a sample size gets into statistics and probability beyond the scope of this paper; however, the main ideas are presented here as an overview and introduction. Please consult the references listed in Appendix C for further study.

Guidelines for quantitative sample sizes

Sample sizes should depend on the participant groups you want to compare. You should have at least 5-10 participants for each participant group you want to compare (see Table 1).18 In this example, 90 total participants would be suggested.

Table 1: Sample size example for comparing 3 participant groups from three different schools.

Grade   School A   School B   School C
K-2     10         10         10
3-5     10         10         10
6-8     10         10         10

How many people you need for your sample size depends on the sampling error you are willing to accept. It is most common to accept a 3% sampling error. The sampling error, or margin of error, is a measure of the potential error that comes from gathering data with only a portion of the entire population. The greater the population size, the less your sample size has to increase to stay within a 3% error (see Table 2).

Table 2: The sample size needed for different population sizes, given a 3% or 10% sampling error.

Population Size   +/- 3% Sampling Error   +/- 10% Sampling Error
100               92                      49
500               341                     81
1000              516                     88
5000              880                     94
10,000            964                     95
50,000            1045                    96
100,000           1056                    96
1,000,000         1066                    96
100,000,000       1067                    96
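If you want to compute values like those in Table 2 for your own population, the sketch below shows one way to do it in Python, assuming the common defaults behind such tables (a 95% confidence level, i.e. z = 1.96, and p = 0.5); exact results may differ from the published table by one or two depending on rounding conventions.

```python
# Minimal sketch: required sample size for a finite population.
# Assumes a 95% confidence level (z = 1.96) and p = 0.5 (maximum variability),
# the usual defaults behind published sample-size tables such as Table 2.
import math

def sample_size(population, margin_of_error, z=1.96, p=0.5):
    n0 = (z ** 2) * p * (1 - p) / margin_of_error ** 2   # infinite-population size
    n = n0 / (1 + (n0 - 1) / population)                  # finite-population correction
    return math.ceil(n)

for pop in [100, 500, 1000, 5000, 10000, 100000]:
    print(pop, sample_size(pop, 0.03), sample_size(pop, 0.10))
# e.g. a population of 1000 needs roughly 517 people at +/-3% and 88 at +/-10%,
# closely matching Table 2.
```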

18. Practical Evaluation Guide, 48



The confidence interval describes the range of values within which you expect the result for the general population to fall, given the sampling error of your sample.

EXAMPLE: If you have a +/-4% sampling error for a question that 63% of the sample population respondents answer ‘yes’ to, then you can be “sure” that between 59% (63-4) and 67% (63+4) of the general population will also answer ‘yes.’ The range of values between 59% and 67% is your confidence interval (see Figure 2).19

Figure 2: The top part of this graphic depicts the relative likelihood that the "true" percentage is in a particular area (if at least half the sample participants, or 50%, respond). The bottom portion shows the 95% confidence intervals as horizontal line segments, the corresponding margins of error (on the left), and sample sizes (on the right).

19. Fadethree, “Marginoferror95.” English Wikipedia. Licensed under Public domain via Wikimedia Commons. Accessed on August 18, 2014. http://bit.ly/1q2Vs8O



Confidence level is a percentage given to how confident you are in the given margin of error, which depends on the sample size. The greater your sample size, the smaller your margin of error, and the higher your confidence level percentage.

EXAMPLE: Continuing from the above example, you could be 90% sure, or 95% sure, that the respondents would respond “yes” to the question based on a +/-4% margin of error.

The formula relating the margin of error to the confidence level and sample size is the following:

ME = z * √( p(1 - p) / n )

Where:
• ME = Margin of Error (or Sampling Error)
• z = z-score: the statistical variable predetermined for set confidence levels, e.g. 1.645 for a 90% confidence level, 1.96 for a 95% confidence level, and 2.58 for a 99% confidence level (there are many published tables with this set variable available). The z-score of an individual value can be figured out using the following formula:

z = (data value - mean) / standard deviation

• p = prior judgment of the correct value, or 0.5 if unknown
• n = sample size
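To see the formula in action, the sketch below recomputes the running example (63% of respondents answering ‘yes’ with a roughly +/-4% margin of error at a 95% confidence level); the sample size of 560 is an assumed figure chosen for illustration, not a number taken from the example above.

```python
# Minimal sketch of the margin-of-error formula above.
# Assumption: a hypothetical sample of n = 560 respondents, 63% answering "yes".
import math

z = 1.96        # z-score for a 95% confidence level
p_hat = 0.63    # proportion of the sample answering "yes"
n = 560         # sample size (hypothetical)

margin_of_error = z * math.sqrt(p_hat * (1 - p_hat) / n)
low, high = p_hat - margin_of_error, p_hat + margin_of_error

print(f"ME = {margin_of_error:.3f}")          # ~0.040, i.e. +/-4%
print(f"95% CI = {low:.2f} to {high:.2f}")    # ~0.59 to 0.67
```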

Another factor to take into account is the response rate. If you send out 400 surveys and 200 of them are returned, your response rate is 50%. Typically a 20% response rate is considered good and a 30% response rate is considered really good. Therefore, if you need to collect 60 responses to establish a particular confidence level in your results, you would have to send out 300 survey requests.
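A quick sketch of that arithmetic, with the response rate treated as a stated assumption:

```python
# Minimal sketch: how many surveys to send to reach a target number of completed
# responses, given an assumed response rate (here the 20% figure mentioned above).
import math

needed_responses = 60     # responses required for your chosen confidence level
expected_rate = 0.20      # assumed response rate

surveys_to_send = math.ceil(needed_responses / expected_rate)
print(surveys_to_send)    # -> 300, matching the example above
```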

Who you select for your participants dictates how accurately you can make generalizations about the larger public. Two methods typically used are systematic and representative sampling.

Systematic sampling offers equivalent numbers from each desired group. For example, if you were sampling for gender, you would make sure to have an equal number of participants in the male and the female categories.

Representative sampling selects participants in proportion to how they are found in the general population you are interested in. For example, if you were testing for ethnic group and one group, say Latinos, represents 25% of the population you are interested in, they would make up 25% of your sample population. Once you have figured out the target numbers you need for an accurate sample population with a declared margin of error, you have to decide what system you will use to actually pick the participants. This is to minimize the bias with which you choose your sample population.



Below are some example options to help randomize your choice of participants:

1. Position yourself by the door of an appropriate venue (if not the place of exhibition). Choose the first male that walks in the door and ask for his participation. Return to the door and choose the first female that walks through the door. You could also create your own method, such as every 5th male/female that walks through the door.
2. Look at membership lists, student rosters or class rosters and pull out names in a randomized order for your study, returning to the list if you need more participants.
3. Many references also note that visitors shift in demographic throughout the day, so you may want to choose participants systematically in different hour slots, such as 10am – 12pm and 1pm – 3pm.
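For option 2 above, the sketch below shows one way to pull a randomized order of names from a roster; the file name and sample size are hypothetical.

```python
# Minimal sketch: drawing participants from a membership or class roster in a
# randomized order (option 2 above). File name and sample size are hypothetical.
import random

with open("member_roster.txt") as f:
    roster = [line.strip() for line in f if line.strip()]

random.shuffle(roster)          # randomize the whole list once
participants = roster[:30]      # the first 30 names become the initial sample
backups = roster[30:]           # return to the list if you need more participants

print(participants)
```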

Guidelines for qualitative sample sizes

The following guidelines can help in gathering participants for a qualitative assessment, which tends to require a specific type of participant, with qualities relevant to the research question. The goal is to develop logical generalizations by studying a few cases in depth.20

Extreme Case Sampling Choose participants that represent the “best” and “worst” case representatives. Interviewing in depth will offer insight into the extremes of what does and does not work well.

Maximum Variation Sampling Select a diverse group of participants to find central themes and main outcomes. This could represent geographic, socioeconomic, racial or gender diversity.

Homogenous Sampling Select a very specific, condensed group on which to concentrate, such as youth between 15-18, or males over 35 born and raised in Foxpoint.

Typical Case Sampling This selection is based on who would typically be interacting with your project. For example, if your project is a book for kids, ask a librarian for the typical age, gender, etc. that might check out the book. Your typical sample should be as close to that as possible.

Critical Case Sampling Participants to study are determined by their position as an indicator. For example, if you were trying to create a project that could be understood by the general public, you could a) present it to a group of highly educated participants and generalize that if they can’t understand it, most people probably cannot; or b) present it to a group of under-educated participants and generalize that if they can understand it, most people probably would be able to. You would not choose both to inform your analysis, but the one that would impact the direction of your study the most.

Chain Sampling A study participant would refer you to someone else to interview, who would in turn refer you to someone else to interview.

20. Practical evaluation Guide, 53.



Informants An informant would be someone from within the program who could provide you with ongoing information about your project.

Not every project will need every type of sampling, and it is most important that you are able to articulate the reasons for your choice. Beginning with either a qualitative or quantitative method, you can then selectively decide upon your desired sample population. And, as Diamond et al. mention with a note of empathy, “Finally, there are always occasions in which time or resources are simply limited, so you end up sampling whoever is available and taking into account possible sources of bias as you examine the findings.”21 The more defined your goal and target audience, the better equipped you will be to assemble the right study participants, asking questions that will yield results that are ultimately relevant to what you want to know.

Before moving on from a discussion about study participants, it must be made clear that by law, you are required to inform participants that they are a part of your study. Institutional Review Boards (IRBs) have been set up to approve, monitor and review research studies involving human participants. IRB guidelines dictate that informed consent generally involves the following:22

• An explanation of the purpose of the research
• A description of what you will ask the participant to do, how long it will take, and whether or not the participant will be compensated for their time
• A description of any risks involved (either physical, psychological, social or criminal)
• A description of any benefits to the participant or society as a whole
• The degree to which the information will be kept confidential
• Who to contact if there are any questions about the research or the participant’s rights
• A statement that the individual is free to not participate and can stop participating at any time.

Ensuring that your participants are on board to give their time and experiences will serve to create a more reliable pool of participants for you. Depending on your project, a “signature” might be made by accepting the terms and proceeding past the first page of an online document that includes the details of informed consent. At other times, a sign outside an exhibit that contains details of your study can serve to alert participants that their entrance into the exhibit signifies consent. For some projects, a physical signature may be the only way to ensure the participants are informed that they are a part of a study. Whichever way you decide is best for your study, this is not an optional part of the evaluation process.

21. Practical Evaluation Guide, 54.
22. “Human Research Protection Program.” Yale University. 2009. Accessed 7/24/14. http://www.yale.edu/hrpp/participants/rights.html



METHODS OF DATA COLLECTION

Surveys

Surveys and interviews are two methods often used to find out information from participants. Although obtained in different ways, both methods require a thoughtful approach to the questions asked. Preparing good questions will lead to answers that can offer the information you are interested in. As you construct each question, refer back to your main topic to double check that the question you ask will work to get at your initial research question. “The quality of an interview depends largely on how the questions are asked. Questions should be free from bias, allowing participants to answer based on their own personal opinions.”23

In interviews and surveys, qualitative questions are open-ended and allow the participant to fill in with their own words. The answers can be dissected for patterns or trends. The results can be summarized in a final narrative that uses direct quotes from the responses. Quantitative questions are structured, and their responses can be explored with statistical analysis. Using a mixture of both qualitative and quantitative questions will give you more room for pulling out narrative pieces as well as making generalizations based on empirical data.

Surveys can be created online using software such as (but not limited to) Google Forms, Survey Monkey and Survey Planet. A tool like Survey Monkey also offers some analytical tools that generate graphs for you, whereas with Google Forms you will have to create the graphs manually. Paper surveys can also be passed out or mailed out. Additionally, surveys can be given as a pretest and posttest. This method works very well for comparing and contrasting answers to questions that are worded exactly the same yet contain different information from the participant before and after the interaction with your work. This method also works well with a control group, allowing the evaluator to compare the posttest results between those who have interacted with the work and those who have not.

Below are examples of qualitative survey questions:
What can you tell us about your experience with the ocean?
What feeling did the plankton music composition leave you with?
What part of the exhibit did you find most engaging and why?

Below are some examples of different categories of quantitative survey questions.

Dichotomous questions have only two choices.24

23. Practical Evaluation Guide, 76.
24. William Trochim. “Research Methods Knowledge Base.” 2006. Accessed at http://www.socialresearchmethods.net/kb/questype.php



EXAMPLES:
What is your gender? Male/Female
Do you believe in evolution? Yes/No

Recall and retention questions both offer ways to uncover what knowledge has been translated and demonstrate a shift in awareness of content. Recall tests for specific information, whereas retention tests for the information in a given context.

EXAMPLES of Recall questions:
1. What is a zooplankton? __________________________
2. What percentage of the earth’s oxygen does plankton produce? ___________
3. Define plankton. ___________________________

EXAMPLE of a Retention question:
Check all of the following that are true for plankton:
o Plankton can be plants or animals
o Can play an important role in absorbing nutrients dissolved in the water
o Can go through photosynthesis
o They drift in the current
o Weighed together, plankton outweigh all the other sea animals

Level of Measurement questions offer an opportunity to gather information on a scaled system where the numbers have no relevance to value.

EXAMPLE of a Nominal Question:
Circle the choice that best describes your last science learning environment:
1. College Course
2. Workshop
3. Reading an article/book/journal
4. TV series

EXAMPLE of an Ordinal Question:
Rank how you like to gather scientific information from most favorite (1) to least (4):
__ peer reviewed journal
__ text book
__ online forums



__ documentaries

EXAMPLE of an Interval Question:
All people who believe in global warming are crazy.
1 strongly disagree   2 disagree   3 neutral   4 agree   5 strongly agree

Bi-polar interval questions have a neutral point in the middle of the scale. At times you may not want to offer a neutral option, so you can offer a 1-4 scale instead.

EXAMPLE of a Cumulative Question (these questions are tiered in such a way that if a participant checks one, they probably agree with all the choices above it):
Please check each statement for which your answer would be ‘Yes’:
__ Are you driven to vote for a president based on his platform of reform measures to address global warming?
__ Are you driven to make changes to your neighborhood culture, taking community action to address climate change?
__ Are you driven to work with a neighbor to address climate change?
__ Are you driven to change your habits to address climate change?

Filter or Contingency questions require a participant to answer one question in order to assess whether the subsequent questions are relevant. It is suggested that these questions have no more than two or three levels, as the participant can get confused or lost in the sublevels of the initial question.

EXAMPLE of a Filter Question:
Have you ever looked at plankton under a microscope? Yes/No
If yes, how many times have you looked at plankton in the past year?
o once
o 1-10 times
o 11-20 times
o 21-35 times
o more than 35 times

Filtered questions can also visually guide the participant using arrows and boxes for clarity.



As you construct your questions, some things to look out for are:
• Can the question be misunderstood?
• What assumptions does the question make?
• How personal is the wording? (feelings about something vs. personal satisfaction)
• Is the wording too direct, calling up stressful memories?
• Does the wording contain universal terminology?

Biased questions lead the participant to respond in a certain way.25
EXAMPLE:
Biased question: Don’t you agree that climate change is a problem?
Unbiased question: Is climate change a problem?

Double-barreled questions can cause confusion about how to answer and may make the participant quit the survey.
EXAMPLE:
Double-barreled question: Do you agree that climate change is a problem and that the National Science Foundation should be working diligently to fund solutions?
Revised questions: Is climate change a problem? (If the participant responds yes:) Should the National Science Foundation be responsible for funding solutions?

Questions that do not have enough specificity can confuse the participant.
EXAMPLE:
Too confusing: What do you think about climate change?
Revised: What is your opinion of the effect of human behavior on climate change?

25. Purdue University Online Writing Lab. accessed 6/14/14. downloaded from https://owl.english.purdue.edu/owl/resource/559/06/



The Interview

Another method of asking questions of study participants is the interview. The interview can be conducted face-to-face and recorded on paper or digitally. It can be administered over the phone or through a phone touch-pad response system. As digital tools increase, so too do the options, such as online face-to-face conversation via Skype or FaceTime. Whichever method seems best for your particular study, the importance again resides in the crafting of the questions. The goal, as with surveys, is to conduct an interview that elicits responses that are not biased by the interviewer. Interviews can seek both qualitative and quantitative answers. Qualitative interviews tend to be less structured and informal. The interviewer can follow up answers with further questions, taking the conversation in unexpected, unplanned directions. The quantitative interview tends to be more structured and formal in comparison, asking from a pre-arranged set of questions in a consistent manner.

Informal conversational interviews are unstructured and yield useful insights as conversations naturally evolve in discussion of a topic. This method of interviewing often puts the participant most at ease and can allow for deep questioning. This also allows the evaluator to reference the context of answers and nuances of individual difference.

Drawbacks include participants wanting to please the evaluator as they guess at the evaluator’s belief system. Another might be that a participant changes their mind throughout the conversation, negating the data already set forth.

Semi-structured interviews are conducted with topics and issues specified, yet are flexible enough for topics to shift depending on circumstances. This allows for questions to be asked as fit for each participant.

Focus groups are semi-structured interviews in which participants are grouped by a certain similarity, such as personal interests or experiences, and interviewed together. The group dynamics and interactions between members of the focus group yield data not easily accessed in an individual interview. One drawback can be that members of the focus group are unwilling to share an opinion that is not common to the majority of the group. Expert panels can also be called in to discuss a certain topic with an evaluator. For example, if you are gearing your project toward 12-14 year olds, an expert panel could be called in to answer the questions and assist in writing questions that get at your main goals.

Structured Interviews are the most useful for collecting data suited to statistical analysis. For these interviews, the questions and response categories are determined ahead of time. This will allow the responses to be sorted, summarized and compared more easily.



The following are tips for maximizing the potential of all interviews:26

• Decide on your study participants ahead of time. Arrange your interview times and locations, allowing for plenty of time to conduct the interviews and take notes in between. Remember to allow for participants’ informed consent, telling them about the survey and how the results will be used.
• Arrange a time and place convenient and comfortable for talking. If participants are adults with kids, make sure to provide an activity for the kids while their parents participate in the study.
• Make your surveys consistent and focused so you are not wasting the participant’s time.
• Ask the most personal questions last. This will allow participants to get a sense of you before divulging information about themselves that they might not yet be comfortable discussing. Offer a category for more personal information, such as an age range (between 20-25; 26-35; 36-45; 46-55; over 55) or income level (under $30,000; $30,001-$60,000; $60,001-$90,000; over $90,000).
• Ask only one question at a time. It will be hard for people to remember and respond to multi-level questions.
• During structured interviews, write the response categories on a card so that people can read them as they answer and do not have to memorize them; otherwise people tend to pick whichever answer they happen to remember.

With both questionnaires and interviews, it is helpful and informative to conduct a preliminary test. This will give you information about how participants interpret your questions, as well as how to restructure them to get the answers you need in order to gather data for specific comparisons.

26. Practical Evaluation Guide, 70-73.



Observations

Observing and recording the behaviors of individuals interacting with your work or exhibit can give you a great deal of information that, when compiled and analyzed, can create a data set. Observation methods range from the informal to the formal, and differ depending on what kind of information you are hoping to get from your observations, similar to the survey or interview process.

Counting visitors is simply counting how many people are exposed to the exhibit. It can be measured mechanically, such as by a counter or by how many tickets are sold, or individually by an evaluator who stands in one place and counts. Measuring this outcome is desirable if part of your goal is simply to increase exposure to the content. You can gather information on superficial qualities such as how many families interact with the work, or whether different groups come by an exhibit at different times. A modern form of automatic counting would be a counter that records data on how many visitors accessed a website; in fact, Google Analytics offers varying degrees of data on each unique IP address visiting a site. However, bear in mind that observing visitorship as a method of evaluating success has been criticized for expanding the population accessing a work simply for the sake of increasing visitor traffic, without evaluating the success of other outcomes.

Tracking movements can be helpful if there are multiple rooms or pieces of work to your project. The evaluator can track movement within the exhibition space by recording the participant’s actions on a premade floorplan, designating where stops were made and for how long. Each person observed would be recorded on a separate floorplan. It is helpful to create criteria for the actions ahead of time, setting categories for heavy, medium or light use. Another example might be fast and slow movement through an exhibit, or intensive, focused or minimal interaction. A classic use of observing movement is to record dwell time. Dwell time is traditionally measured through careful observation, at times in a live exhibit or often through watching recorded video. This is considered a valuable effort in that research has shown a correlative relationship between museum dwell time and visitor learning. It is reported that a person will spend 17 seconds on average in front of a work of art.27
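If you log dwell times from observation or recorded video, a short sketch like the one below can summarize them and compare them against a benchmark such as the 17-second average cited above; the observed times are hypothetical placeholders.

```python
# Minimal sketch: summarizing observed dwell times (in seconds) at one piece.
# The observations below are hypothetical placeholders.
from statistics import mean, median

dwell_times = [8, 12, 25, 17, 40, 5, 22, 31, 14, 9]  # seconds per observed visitor

print(f"mean dwell time:   {mean(dwell_times):.1f} s")
print(f"median dwell time: {median(dwell_times):.1f} s")

benchmark = 17  # reported average time spent in front of a work of art (see note 27)
above = sum(1 for t in dwell_times if t > benchmark)
print(f"{above} of {len(dwell_times)} visitors stayed longer than {benchmark} s")
```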

Basic behavioral observations record what you see the participants do as they interact with your work. It is important to make sure these are as accurate as possible and don’t make false presumptions, such as recording ‘looking at a label’ as ‘reading a label.’ It is suggested that you include a predetermined list of behavior codes to make the observations easier to record. Diamond et al. suggest the following list (see Table 3) as an example:

27. Jeffrey K. Smith and Lisa F. Smith. “Spending Time on Art.” Empirical Studies of the Arts 19(2), 2001: 229-236.



Table 3: Example behaviors and shorthand codes to use when observing participants at an exhibit.

Code   Behavior
le     Looked at exhibit only
man    Manipulated exhibit
ce     Comment, exhibit related
cn     Comment, not exhibit related
qe     Question, exhibit related
lat    Look at label/graphic
ra     Read label aloud
nn     None of the above

Beyond counting and tracking, basic behavior observations can give feedback on what participants are actually doing when interacting with the work. This can help give information about what needs to change with a work during the formative stages of building a piece. During a summative phase, it can give information about what could be done differently in another exhibit. It can also be helpful to record conversations; again, only record if the participants are aware of being recorded for the purpose of your research. This can give insight into what conversations your work inspires. Patterns in the questions and confusions participants voice might also emerge as feedback.
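Once observations are coded with shorthand like that in Table 3, tallying them is straightforward; below is a minimal sketch using a hypothetical observation log.

```python
# Minimal sketch: tallying the shorthand behavior codes from Table 3.
# The observation log below is a hypothetical sequence recorded at one exhibit.
from collections import Counter

observations = ["le", "man", "ce", "le", "lat", "ra", "man", "man", "qe", "nn"]

tally = Counter(observations)
total = len(observations)

for code, count in tally.most_common():
    print(f"{code:4s} {count:3d}  ({count / total:.0%})")
```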



Personal Meaning Maps

A method used to test conceptual change is the personal meaning map (PMM). This method is effective in that it exposes the prior knowledge that a participant brings with them. John Falk, who popularized the method, states, “PMM does not assume that all learners enter with comparable knowledge and experience, nor does it require that an individual produce a ‘right’ answer in order to demonstrate learning.”28 Participants are given a piece of paper with a word or image in the middle. This word/image should pertain to the main theme of your research question. Participants are invited to write down everything they can think of that relates to the cue in the middle. They may do this as a web, with lines connecting ideas, or as a list. After the participant has had a chance to interact with your work, they go back into their map with another color and alter, add to, or change what they had written before. A variation on the PMM process is to interview the participant after their first map making, adding notes that further develop and explain their web. This would be repeated afterward (see Figure 3).

Figure 3: An example of Personal Meaning Map assessing conceptual and content knowledge before and after interaction with project.

28. John Falk, Theano Moussouri, and Douglas Coulson. “The effect of Visitors’ Agendas on Museum Learning.” Curator June 1998. 41(2). 109.



New Technologies

As modern technology gets smaller and smarter, it is no surprise that it has even infiltrated the assessment sector. Many of the technologies mentioned below are in early stages of development and testing. They are being refined to more accurately assess non-verbal behaviors, which a participant is often not even aware of expressing.

Camera-based engagement methods are currently being used to assess high and low levels of engagement, also referred to as flow and boredom. While still in its infancy, the approach works by "teaching" digital devices, such as iPads, how to read students' faces for engagement. Later, the device's camera is used to evaluate when the interest of the participant shifted intensity levels.29 Evaluators and teachers can use this feedback to gauge where their interactive game/video/digital presentation may need more work.

Posture measurement monitors engagement through the participant’s posture. The idea behind it is to use the “psychobiological response to audiovisual stimuli in seated individuals, using human kinetic (body movement) and non-verbal behavior as objective, subconscious proxies of instantaneous engagement with stimulus.”30 This can be done through pressure sensors on the chair seat and back of the chair, which read the distribution of weight. Shifts in head pose have been measured using a headband with an accelerometer, as well as mounted reflectors.

Radio frequency identification (RFID) and mobile phone GPS tracking have more recently been used to gather passive information on the location of visitors to an exhibit. Initially, this method involved actively swiping a card with an RFID tag at the exhibit, but it has moved toward less obtrusive methods that automatically gather data without the visitor having to do anything physically. One example of this was implemented at the Exploratorium in San Francisco. Visitors carried a card, or wore one around their neck, which tracked which exhibits they went to. The card could also trigger a photo to be taken at different locations. After the visit, the participant could look up their visitor info on a personal page, which gave them statistics about what they did while at the Exploratorium as well as a gallery of their photos.31 Beyond the technologies mentioned above, there are many advances we have yet to utilize for this purpose. The metrics proposed here are by no means exhaustive, and it may be that you have to create a new method to get at the answers you want. All the metrics listed above just touch the surface, and each can be more thoroughly explored individually. There is a vast amount of research available on perfecting each one of these metric choices.

29. Holly Yettick. "Computers 'Read' Students' Faces to Measure Engagement." Education Week. June 2, 2014. Downloaded on 7/15/15 from http://bit.ly/1pq6HsN
30. Harry J. Witchel, Carina Westling, Aoife Healy, Nachiappan Chokalingam, and Rob Needham. "Comparing Four Technologies for Measuring Postural Micromovements During Monitor Engagement." European Conference on Cognitive Ergonomics, August 2012. 189-192. Accessed 6/24/14. http://bit.ly/1uMw9Jw
31. Timothy Baldwin and Lejoe Thomas Kuiakose. "Cheap, Accurate RFID Tracking of Museum Visitors for Personalized Content Delivery." Museums and the Web Conference. April 2009. Accessed on 8/4/14. http://bit.ly/1pqoiRm



ANALYZING YOUR DATA

Once you have all the data returned to you, the next step is to sort through it to make sense of what it is telling you. Anyone who is familiar with statistics will be able to tell you there are many ways to look at the data and pull sets of information from it. Exploratory data analysis is a formalized way of charting and graphing your results so that the information they contain is easily revealed. Below are a number of ways to map out your data and make sense of it.

Visual Representations

One of the great strengths rigorous arts training builds in a person is the capacity to understand and interpret the world through images. New techniques for visually communicating research findings are pushing what we can learn through novel representations of data. While data visualization is an area in which scientists readily reach out to artists, it is important to emphasize that the artist has much more to offer here than a pretty colored graph. Traditionally, scientists pull from a small toolbox when graphically representing quantitative data. A bar graph (see Figure 4) has the independent variable along the x-axis (different age groups) and the dependent variable along the y-axis (minutes of dwell time). Bar graphs are used to show quick comparisons of data in "buckets" along the x-axis. Very similar to bar graphs, histograms (see Figure 5) have ranges of information along the x-axis (age 0-90) versus distinct buckets (0-5; 6-10; 11-15, etc.). The columns of data in a histogram occur right next to each other as a continuous range, generating a curve. Line graphs (see Figure 6) similarly represent data by connecting points along an x- and y-axis to demonstrate trends through continuous lines. Alternatively, you can use a pie chart (see Figure 7) to display percentages or parts of a whole. Graphs are considered figures and should have their label, or description, underneath the graphic, as in the examples below.

Figure 4. Bar graph representing % of respondents' answers to the questions a) Do you think climate change should be discussed in depth in 6th – 8th grade? and b) What importance level would you give learning content about global warming? (x-axis: respondents' answer (Yes/No) and importance level (High/Low); y-axis: % of respondents)

Figure 5. Histogram representing time spent in specific plankton exhibit as compared to overall time spent at the art exhibit. (x-axis: time spent in interpretive plankton environment installation, in minutes; y-axis: total time spent at the art exhibition, in minutes)



Figure 6. Line graph displaying the number of visitors of different ages (< 21, 22-35, 36-50, > 55 years old) on each day of the week of the exhibit. (x-axis: attendance by day of the week, Monday through Sunday; y-axis: number of attendees)

Figure 7. Pie chart displaying the mean dwell times per exhibit per visit (wall photographs, video projection, interactive video game, reading zone, live tanks) for a randomized sample of 40 museum visitors.
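If you happen to work in Python, a minimal sketch using matplotlib (not a tool discussed in this guide; the category labels and numbers below are invented for illustration) shows how a bar graph and a histogram of dwell-time data might be produced:

```python
import matplotlib.pyplot as plt

# Hypothetical exhibit categories and hypothetical mean dwell times (minutes)
categories = ["wall photographs", "video projection", "video game", "reading zone", "live tanks"]
mean_dwell = [4.2, 6.8, 9.5, 3.1, 7.4]
# Hypothetical per-visitor dwell times (minutes) for the histogram
dwell_times = [3.2, 5.1, 4.8, 6.0, 5.5, 7.2, 4.4, 5.9, 6.3, 5.0]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

ax1.bar(categories, mean_dwell)      # bar graph: distinct "buckets" along the x-axis
ax1.set_ylabel("mean dwell time (minutes)")

ax2.hist(dwell_times, bins=5)        # histogram: continuous ranges along the x-axis
ax2.set_xlabel("dwell time (minutes)")
ax2.set_ylabel("number of visitors")

fig.tight_layout()
plt.show()
```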



Tables, on the other hand, are not meant to visually communicate complex relationship patterns in data. Rather, they are for showing the exact numerical values found for small data sets. Tables are labeled at the top of the graphic (see Table 4).

Table 4. Adjectives used by readers to describe book on plankton

Adjective picked | # of study participants
Informative      | 13
Educational      | 11
Inspiring        | 8
Beautiful        | 5
Confusing        | 2
Inaccurate       | 1

Whichever method you choose, keep in mind that tables and graphs have been used to communicate scientific information with varying degrees of success. Diamond, et al, suggest that a "person should be able to interpret it just on the basis of what is presented in the graph and the legend."32 Essentially, if the graph doesn't make the information clearer to you or the viewer, there is no point in using it. In fact, as artists, this can be an area where your visual training gives you expertise in communicating the information effectively. A few tips are:
• Make sure to label your axes
• Use colors that are "color blind" safe and can be read as values
• Labels shouldn't block graphic information
• Type choice is legible over decorative
• Words generally run from left to right vs. vertically

32. Practical evaluation Guide, 96.



Comparing Your Data – Statistics

Responses from quantitative data usually fall into three main categories:
• Counting, which yields a whole number (integer), such as how many males between 18 and 25 saw the video exhibit.
• Measures, which are collected from variables such as approximate distance of interaction with the exhibit, or time spent in front of the exhibit.
• Ratios, which relate two variables to each other and yield a percentage. An example would be how long a person spent on each page of the book, divided by the total time spent reading the book.

After collecting a particular type of data, you will have to figure out how you are going to use it. This may mean showing single values, small sets of values, or summarizing the values of an entire data set.

DESCRIPTIVE STATISTICS

Summaries of data (descriptive statistics) are divided into two main kinds: measures of central tendency and measures of variability.

Central tendency refers to a “middle” value of a given set of data using the mean, the median, or the mode.

The Mean is an estimate of the center of the overall distribution of data, found by adding all of the values observed and dividing by the total number of observations. EXAMPLE: Let's say you were recording dwell time for 13 different people at your artwork and yielded this set of numbers in seconds: {3,6,2,5,4,7,4,5,2,2,2,5,9}. The total sum of dwell times would be 56. When you divide this by the total number of observations (13), you have the mean dwell time, 4.3 seconds.

The Median is the midpoint value of a distribution. EXAMPLE: Using the dwell time data set above, we would reorder the numbers as {2,2,2,2,3,4,4,5,5,5,6,7,9}. The median value would be 4.

The Mode is the most common number in the data set, and is primarily used when finding a mean does not make sense and a whole number is needed. Taking the data set above, the modal value would be 2, as it appears 4 times in the data set.
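If you prefer to check these values with a few lines of code, a minimal sketch using Python's standard statistics module (an assumption of this sketch, not a tool named in this guide) reproduces the dwell-time example above:

```python
import statistics

dwell_times = [3, 6, 2, 5, 4, 7, 4, 5, 2, 2, 2, 5, 9]  # seconds, 13 observations

print(statistics.mean(dwell_times))    # approximately 4.3 seconds
print(statistics.median(dwell_times))  # 4
print(statistics.mode(dwell_times))    # 2 (appears four times)
```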

Variability is an important factor to think about when looking at the central tendency in a set of numbers. It describes the spread of the number set around the mean. The level of variability offers information on how close to, or how far from, the mean the values fall. EXAMPLE: Given the following data sets, although the mean is the same (60), the variability of the numbers around the mean is very different: a: {20, 40, 60, 80, 100} and b: {50, 55, 60, 65, 70}. All data sets will express variability. This is normal, as there are many factors involved in the collection of the data. The mathematical representation of this variability is usually calculated as the "standard deviation," or s.d. There is another term you may see, variance, which is the square of the standard deviation. We typically use s.d. because it has the same units as the original measurements. The formula for computing the s.d. might look a little scary:

s = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2}

But if you break it down, all it is saying is that you take the square root of the average of the squared differences between each data point and the sample mean. First, you take the sum (Σ) of the squared difference between each data point x_i and the mean of all your data points x̄; squaring gets rid of the problem of adding positive and negative differences. You then divide by the number of data points you collected and take the square root of the whole thing.

EXAMPLE: Say you have 8 data points: 2, 3, 4, 4, 5, 6, 8, 8. These 8 points have a mean (x̄) of 5. First, calculate the difference of each data point from the mean, and square it:

(2−5)² = (−3)² = 9        (5−5)² = 0² = 0
(3−5)² = (−2)² = 4        (6−5)² = 1² = 1
(4−5)² = (−1)² = 1        (8−5)² = 3² = 9
(4−5)² = (−1)² = 1        (8−5)² = 3² = 9

These 8 squared differences have a mean of 4.25, and that is the variance of your population; the square root of this is approximately 2.06. So your standard deviation is 2.06.
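The same calculation can be verified in a couple of lines. A minimal sketch, again assuming Python's standard statistics module: statistics.pstdev divides by N, as in the formula above, while statistics.stdev would divide by N − 1 for a sample estimate.

```python
import statistics

data = [2, 3, 4, 4, 5, 6, 8, 8]

print(statistics.mean(data))       # 5
print(statistics.pvariance(data))  # 4.25  (population variance)
print(statistics.pstdev(data))     # about 2.06 (population standard deviation)
```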

But what does that mean? First, let's assume that your data are normally distributed. This means that your data are symmetrically distributed around your mean, a plot of which would look like this:



Figure 8. Bell curve, showing an ideal normal distribution.

The middle of the chart represents the mean, and the data are clustered around this central point, with fewer very large or very small values. A real-world example might look like this, with a sample size of 30 and the average amount of time spent at the exhibit being 5.5 minutes:

Figure 9. Sample data, normally distributed. (x-axis: number of minutes at exhibit; y-axis: number of visitors)



But wait, here’s another data set, also with sample size = 30 and a mean of 5.5, but it looks very different:

Figure 10. Sample data set, with normal distribution and small standard deviation. (x-axis: number of minutes at exhibit; y-axis: number of visitors)

The data in Figure 10 have a much smaller standard deviation than those in Figure 9, meaning that they are more tightly grouped around the mean, with fewer extremes. You can visualize the standard deviation (often shown in calculations as s or the Greek letter sigma: σ) as how spread out your data are from the mean. Roughly 68% of your data will fall within 1 standard deviation of the mean (the red area in Figure 11), 95% will fall within 2 standard deviations of the mean (red and green), and adding in the blue, which encompasses 3 standard deviations from the mean, will capture 99.7% of your sample.

Figure 11. Normal distribution, showing amount of data captured in one (red), two (green) and three (blue) standard deviations from the mean. Source: www.robertniles.com/stats/stdev.shtml



So, two things: 1) To make a histogram or bar chart like this, you will have to determine "bins" for your data that make sense. For example, if you are plotting "number of minutes at exhibit," people won't be staying exactly 5 or 6 minutes; they will be staying for 5 minutes 25 seconds, or 6 minutes 3 seconds, etc., so you might, for example, put every data point in the 5 minutes 31 seconds to 6 minutes 29 seconds range into the "6 minute" bin. The extreme of too many bins is a chart where every bar has a count of "1" because each data point sits in its own bin; the other extreme is to have only two or three columns. Neither really shows you how your data are spread out (a short code sketch of bin choices appears after point 2 below). An example of too few bins for the same data set:

Figure 12. An extreme example of data separated into too few bins. (x-axis: number of minutes at exhibit; y-axis: number of visitors)

2) This is where your study sample size comes into play. The more data collected, the more likely the variation is to approach a normal distribution (see page 10). Normally distributed data are also referred to as "parametric data." If your data don't cluster into a nice bell curve around a central tendency, they're considered to be "non-parametric data." These terms are important when it comes to deciding which statistical test to run on your data to determine if your findings are "significant."

INFERENTIAL STATISTICS

After looking at the tendencies and variability of your data, you may also wish to make comparisons between groups, or to assess before and after effects. In this case you will want to determine if any differences are statistically significant, using inferential statistics. Results are considered to be statistically significant when the differences between groups are larger than what could have occurred by chance alone. It means that any differences observed most likely reflect an actual difference between the populations that cannot be accounted for just by variation in the samples. The level at which you are willing to accept that the two (or more) groups have a significant difference is called the alpha (α) level, given as a number between 0 and 1. Subtract the alpha level from 1 to get the corresponding % confidence. For example, if your alpha (α) level is .05, it means you are willing to accept a 5% chance that an observed difference is not real, or, conversely, you require a 95% chance that there is a true difference. The convention is to use an alpha (α) level of .05, .01 or .001 to denote a statistically significant difference. The alpha level is important to choose in part because, as we will see later, we often must reference pre-determined tables that give us a cut-off number beyond which we can reject or accept that there is a significant difference, and these tables are set up by α levels. The p-value is the actual statistical significance that your test reveals, and is usually determined as part of the statistical testing you run on your data, whether using Excel's data analysis toolkit or another statistical program. A p-value less than what you've set your alpha level at means that you are willing to say there is a significant difference. There are many, many statistical tests that can be run on data to determine differences between groups, influencing factors, correlations between factors, and other parameters. For the purpose of this paper, we will introduce just a couple of options that relate to the most common analyses called for in looking at the outcomes of art exhibits and installations, which involve comparisons between groups or before/after effects on the same group.
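As a minimal sketch of the binning choice discussed in point 1 above, assuming Python and numpy (neither is named in this guide) and invented visit times, you can compare a too-coarse binning with one-minute bins:

```python
import numpy as np

# Hypothetical visit lengths in minutes (sample size 30)
minutes = np.array([5.4, 6.1, 4.9, 5.5, 7.2, 3.8, 5.0, 6.6, 5.9, 4.4,
                    5.3, 6.0, 5.7, 4.1, 6.8, 5.2, 7.9, 3.5, 5.6, 6.3,
                    4.7, 5.8, 6.5, 5.1, 4.3, 7.1, 5.5, 6.2, 4.8, 5.9])

# Two bins hide the shape of the distribution; one-minute bins show the spread
counts_coarse, edges_coarse = np.histogram(minutes, bins=2)
counts_fine, edges_fine = np.histogram(minutes, bins=range(3, 10))  # edges at 3, 4, ..., 9 minutes

print(counts_coarse, edges_coarse)
print(counts_fine, edges_fine)
```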

Parametric Tests

Parametric tests can be powerful ways of analyzing your data as long as the data are normally distributed. An analysis of variance (ANOVA) is the most common technique for comparing the results of two or more groups. What it tests is the likelihood, given the sample means and standard deviations, that the distributions are separate, and therefore that you can claim a significant difference between them. Below you can see what the graphs of two groups' distributions look like. In the case of low variability, it seems pretty straightforward to say that these are two distinct groups, but as variability increases and/or the means come closer together, you can see where it would be handy to have a test that tells you as unequivocally as possible whether there are actual statistical differences between the groups.

Figure 13. Three sets of data, all with the same means but differing levels of variance, highlighting the need for robust statistical tests, which can determine significant differences between data sets. Source: http://bit.ly/1wuvP5n



T-test

The t-test is closely related to ANOVA and compares two samples, for example two different groups' dwell-time responses to an exhibit – maybe male/female, or one school vs. another. If measurements are taken on the same group twice, for example in the case of before/after measurements, you will use what is called a paired t-test. If the two groups are independent, you will use an unpaired t-test. Choosing the correct test has important implications for the validity of your results! People will typically use the Excel data analysis toolkit or an online t-test calculator, but the math isn't all that complicated once you know your mean (X̄) and standard deviation (s) for the two groups. For two populations with equal sample sizes (n) and equal variances, the statistic is given as:

t = \frac{\bar{X}_1 - \bar{X}_2}{s_{X_1 X_2} \sqrt{2/n}} \quad \text{where} \quad s_{X_1 X_2} = \sqrt{\tfrac{1}{2}\left(s_{X_1}^2 + s_{X_2}^2\right)}

Analysis of variance (ANOVA) is the general name of the test, which can be used to compare two or more groups and assess whether they are drawn from separate populations. Again, the Excel data analysis toolkit or other software can run this test for you.
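Most statistical packages wrap these calculations. A minimal sketch assuming Python's scipy.stats (not a tool named in this guide) with invented dwell-time numbers; whether you reach for the paired or unpaired test depends on whether the groups are independent:

```python
from scipy import stats

# Hypothetical dwell times (minutes) for two independent groups of visitors
school_a = [4.1, 5.3, 6.0, 4.8, 5.5, 6.2, 4.9, 5.1]
school_b = [6.5, 7.1, 5.9, 6.8, 7.4, 6.2, 7.0, 6.6]

print(stats.ttest_ind(school_a, school_b))   # unpaired t-test: significant if the p-value < alpha

# Paired t-test: the same visitors measured before and after an exhibit change
before = [3.0, 4.2, 5.1, 3.8, 4.5, 4.0]
after = [4.1, 4.8, 5.9, 4.6, 5.2, 4.3]
print(stats.ttest_rel(before, after))

# One-way ANOVA for three or more independent groups
group_c = [5.0, 5.6, 6.1, 5.8, 5.3, 6.0, 5.5, 5.9]
print(stats.f_oneway(school_a, school_b, group_c))
```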

Non-Parametric Tests

Sometimes your data will not be normally distributed, either because, despite your best efforts, your data set is not sufficiently large, or because of the nature of the data themselves. Statistical tests designed to handle non-normally distributed data are called non-parametric tests, and three of the most common are the chi-square test, the Mann-Whitney U test, and the Kruskal-Wallis test.

Chi square test - A chi-square test is used to compare a sample population against an idealized, or previously defined, population in which the data are defined in categories – for example, the number of people selecting each of several visualization styles, compared to selections being equally distributed across all of the options.

Example data set, looking at the number of people (sample of 50) who preferred a given data visualization:

            | Visualization A | Visualization B | Visualization C | Visualization D | Visualization E
Observed    | 3               | 8               | 12              | 13              | 14
Equal Dist. | 10              | 10              | 10              | 10              | 10



The formula for computing the chi-square (χ²) statistic is:

\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}

This just means that you take the sum, over all categories, of the squared difference between the observed and expected counts, divided by the expected count. You will then need to look up your χ² statistic in a χ² table (Appendix D) and see if it meets the threshold for your chosen significance level. Please note that you will need to determine the degrees of freedom (d.f.) in your test. For the example in the table above, it will be the number of categories minus 1, or 5 − 1, so d.f. = 4. For the example data set, the statistic is:

χ² = (3−10)²/10 + (8−10)²/10 + (12−10)²/10 + (13−10)²/10 + (14−10)²/10 = 4.9 + 0.4 + 0.4 + 0.9 + 1.6 = 8.2

With d.f. = 4, we look it up in the χ² table, which shows that the value would have to be 9.49 or higher for us to say, with 95% confidence, that there is a significant difference. Our statistical conclusion must be that we cannot conclude the visualizations were chosen with significantly different frequencies. The statistic is, however, above the critical value for the 90% confidence level (α = 0.10), and people will sometimes report results at a level of significance other than the conventional 95%.
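If you happen to work in Python, the same numbers can be fed to scipy.stats.chisquare (an assumption of this sketch, not a tool named in this guide); by default it tests against equal expected counts, matching the example above:

```python
from scipy import stats

observed = [3, 8, 12, 13, 14]       # preferences for visualizations A-E (n = 50)
expected = [10, 10, 10, 10, 10]     # equal distribution across the five options

chi2, p_value = stats.chisquare(observed, f_exp=expected)
print(chi2)      # 8.2
print(p_value)   # about 0.08, so not significant at alpha = 0.05
```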

Mann-Whitney U Test – This test is also sometimes called the Wilcoxon rank-sum test. In it, your data are all listed together and ranked according to value, with the comparison made on whether a larger proportion of high ranks vs. low ranks can be attributed to one of the two groups. For small sample sizes, this can even be done by hand. There are certain assumptions that must be met in order for the Mann-Whitney U test to be used, including that:
1. The observations of both groups are independent from each other.
2. The responses are ordinal – that is, you can assign them ranks of greater or lesser value.
3. The data for each group follow a similar distribution pattern.

The statistic is calculated according to the formula:

U_a = n_a n_b + \frac{n_a(n_a + 1)}{2} - R_a \quad \text{and} \quad U_b = n_a n_b + \frac{n_b(n_b + 1)}{2} - R_b

where U is the statistic for a group, n_a is the size of the first group, n_b is the size of the second group, R_a is the sum of the ranks for the first group, and R_b is the sum of the ranks for the second group. You will then use the smaller of the two U values to consult a significance table.

Combine the two data sets, and then rank them in order from lowest to highest. In the event that there are equally ranked items, assign each of them the average of the ranks they would have occupied. For example, in this data set the 2nd, 3rd, and 4th numbers are all "3," so we assign each of them the rank of 3, which is the average of 2, 3, and 4. In this way we avoid giving a higher rank to the same number in different data sets, and we then pick up the ranking of the next number in line with "5," because technically the 2nd, 3rd, and 4th ranks are accounted for.

Group A Data | Group B Data | Rank
1            |              | 1
3            |              | 3
             | 3            | 3
3            |              | 3
4            |              | 5
5            |              | 6.5
             | 5            | 6.5
6            |              | 8.5
6            |              | 8.5
7            |              | 10.5
             | 7            | 10.5
8            |              | 12
             | 9            | 13.5
             | 9            | 13.5
             | 14           | 15
             | 18           | 16
             | 20           | 17

Using the formula above, with n_a = 9, n_b = 8, R_a = 58 and R_b = 95, we get:

U_a = (9 × 8) + 9 × (9 + 1)/2 − 58 = 72 + 45 − 58 = 59
U_b = (9 × 8) + 8 × (8 + 1)/2 − 95 = 72 + 36 − 95 = 13

Using the smaller of these, 13, we look the value up on a Mann-Whitney U test table for a significance level of .05, which gives 15. Since 13 is smaller than this number, the groups are seen as significantly different at a p value ≤ 0.05. There are also online Mann-Whitney U-value calculators, such as the one found at http://www.socscistatistics.com/tests/mannwhitney/default2.aspx, that will calculate the statistic for you.
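The same comparison can be run in code. A minimal sketch assuming Python's scipy.stats (not a tool named in this guide); the group values follow the example table above. Note that software may report U under a different convention than the hand formula, so rely on the p-value rather than matching the statistic exactly:

```python
from scipy import stats

# The two groups from the ranked example above
group_a = [1, 3, 3, 4, 5, 6, 6, 7, 8]
group_b = [3, 5, 7, 9, 9, 14, 18, 20]

u_stat, p_value = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
print(u_stat, p_value)   # compare p_value to your chosen alpha level (e.g. 0.05)
```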

Kruskal-Wallis Test - This test is a variation of the Mann-Whitney U test, and allows for comparison between 3 or more groups. It is available in many statistical software packages, or online through websites such as Wikipedia at http://en.wikipedia.org/wiki/Kruskal–Wallis_one-way_analysis_of_variance.
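A minimal sketch of the same test, again assuming Python's scipy.stats and three invented groups of dwell times:

```python
from scipy import stats

# Hypothetical dwell times (minutes) for visitors at three different installations
installation_1 = [2.1, 3.4, 2.8, 4.0, 3.1, 2.5]
installation_2 = [4.6, 5.2, 4.9, 6.1, 5.5, 4.8]
installation_3 = [3.0, 3.9, 4.2, 3.5, 4.4, 3.7]

h_stat, p_value = stats.kruskal(installation_1, installation_2, installation_3)
print(h_stat, p_value)   # significant if p_value < alpha (e.g. 0.05)
```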



Statistical testing for Survey Results using the Likert Scale

The Likert scale is the typical scale used in responses to surveys, in which respondents are asked to rate their response to questions along a continuum, typically in the form "strongly agree, agree, not sure/undecided, disagree, strongly disagree." This kind of data does not lend itself to computing a mean and s.d.; descriptive statistics usually report the mode of the responses, distributions are reported as percentages, and results are displayed as bar graphs or pie charts. In order to statistically analyze this kind of data, categories need to be given a corresponding ordinal number, i.e. strongly disagree = 1, disagree = 2, etc. These can then be used as the data points in a non-parametric test, using the Mann-Whitney U test in the case of a comparison between two groups, or the Kruskal-Wallis test for 3 or more groups.
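A minimal sketch of this coding step, assuming Python and scipy.stats (not tools named in this guide) and two invented groups of survey responses compared with the Mann-Whitney U test:

```python
from scipy import stats

# Map Likert categories to ordinal numbers
scale = {"strongly disagree": 1, "disagree": 2, "not sure": 3, "agree": 4, "strongly agree": 5}

# Hypothetical responses from two groups of visitors
group_1 = ["agree", "strongly agree", "agree", "not sure", "strongly agree", "agree"]
group_2 = ["disagree", "not sure", "agree", "disagree", "not sure", "strongly disagree"]

scores_1 = [scale[r] for r in group_1]   # ordinal data points for group 1
scores_2 = [scale[r] for r in group_2]   # ordinal data points for group 2

u_stat, p_value = stats.mannwhitneyu(scores_1, scores_2, alternative="two-sided")
print(u_stat, p_value)   # compare p_value to your chosen alpha level
```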



REPORTING YOUR RESULTS

The central questions students at RISD are being asked revolve around the following topics:
• How do visual artists reveal scientific data to uncover new patterns and relationships?
• How do artists represent scientific information to make it more accessible to the public?
• What type of collaborative platforms allow for artists and scientists to combine the different ways they approach a problem to yield a wider variety of results?

After coming up with a project plan, implementing an evaluation and assessment method, and analyzing your results, the information you reveal becomes a part of the body of research focused on answering the questions above. Therefore, there is value in presenting your findings to a larger audience. It is also critical to communicate them to the intended users of your final report. Your final report should be in a format that communicates effectively to the intended stakeholders, especially after all the energy and time put into your study.

The Scientific Paper

The written report is the traditional format for presenting your project. It is a way to share summative information about the process of the study and its results. These reports are detailed and are written as a continuous narrative, with charts and graphs included when necessary. The paper will contain the following elements:

Abstract: The abstract is a brief overview that summarizes the entire project. Generally consolidated into a paragraph, it states the purpose of the study, the methods used, the primary results and the importance of the findings. It tells the who, what, where, when and why of the story. The abstract should pull out just the most essential aspects of the study.

Introduction: Your introduction should give a context for your study. This is the place to mention the problem setting, the "burning" guiding question and the objective of your study, or prior work/history of the study. It is the stage setting that prompted the work you chose to study. This is the place to mention important findings from your literature review.

Methods: Artists might describe this part of the report as the process write-up. In the methods section, you write what you did, in brief. This would be the place to describe where you collected samples, what technologies you used for your observations, or what you did to come to the final activities and outputs.

Results: The results section will contain the collection of findings from your qualitative and quantitative evaluation methods. Looking back on your desired outcomes, pick the information from your data collection that is relevant to the overall story of your project. You might decide to present your quantitative results in graphs as a summary, or include choice quotes from your qualitative narratives.

Discussion: Your conclusions or summary should link back directly to your main question. You can include your thoughts on why specific results are relevant, perhaps including how they relate to the wider body of literature on this topic. The conclusion can also include recommendations based on your findings, and implications for further studies. If you have more questions because of the data you collected, this is the place to mention them.



Bibliography: Always include a list of references that were cited in your report. Generally, these will be in American Psychological Association (APA) format or Chicago Manual of Style format.

Acknowledgements: Many people choose to include acknowledgements in the beginning of their papers. There may be requirements that include listing those who funded your project, logos, or partner academic institutions.

Science Conference Poster

The science poster follows the conventional outline of scientific papers, but is much briefer. Because it is meant for display, font size, the addition of visuals and succinct language are important for getting the point across. One difference between the poster and the paper is that the poster, displayed at a conference, offers an opportunity for communication between the researcher and the audience. The viewer will not be sitting and reading a lot of text; rather, it is an interactive opportunity to gain an overview of process and results, and follow up with questions for the researcher. There have been lively discussions on the aesthetics of the science poster, and whether a text-based format is the best, or most interesting, format for communicating the above information. As art|design + science projects continue to borrow practices from each other and learn the language most common to the other discipline, it is important to keep the ultimate goal in focus. The poster is meant to communicate the story of your study, from process to product, to results. Creating the most effective format for a science poster could be a guiding question for a study in and of itself. Yet, unless this is going to be your main undertaking, and you truly test for effectiveness, it is important to remember your audience. If you are creating a poster for a science conference, a majority of the people attending will be expecting, and fluent in, traditional poster design. While making your poster as interesting and unique as possible, you also want to think about balancing change with the risk of alienating your audience through obscurity, or a visual language that mostly artists are fluent in. That being said, as you walk that line of communicating your study, you are encouraged to push boundaries and explore technologies that let you communicate with great impact.



APPENDIX A - Logic Model Case Studies

EXAMPLE 1:

EXAMPLE 2:

APPENDIX B – Example of Learning Impact

Hypothetical Exhibition: "Plants: unsung heroes of our planet"33

Project goals as stated in grant proposal: We aim to help visitors appreciate the fundamental role that plants play in our ecosystems; to encourage visitors to marvel at the role of plants as carbon dioxide consumers and oxygen producers; to realize that, in spite of their immobility, plants are highly complex and sophisticated living things; and to address some common misconceptions about plants.

Thus, the intended project impacts would be:
1) Knowledge: Visitors will understand aspects of the basic chemistry, properties, and role of plants in ecosystems.
2) Attitude: Visitors will appreciate plants, both in terms of their sophistication as organisms and their vital role on planet earth.

Relevant findings would then serve as evidence for these impacts.

Evidence of impact on knowledge (from hypothetical results):

• Visitors knew that plants create their own food: When asked to sort cards with written characteristics of living things, 60% of adults leaving the exhibition could correctly identify "create their own food" as characteristic of plants but not animals. When asked to explain their choice in more detail, a smaller percent, 40%, understood that plants assembled their food from simpler materials. A common misconception, even among those who knew that plants make their own food, was that this food is sucked by plants from the soil (35% of adults). There was no control or comparison group, but reference to the literature on literacy (citations) suggested that only 25% of adults in the U.S. population believe that plants make their own food.

• Visitors understood plants' role in atmospheric gaseous exchange: In exit interviews, 50% of visitors mentioned plants' role in oxygenating the atmosphere. While they may have known this before viewing the exhibition, 30% quoted specific plants listed in the exhibition as highly efficient oxygenators, showing that they had remembered detailed information.

• Visitors became more aware that plants tie up carbon: Concept maps created by adult visitors on the topic of "ways plants help us" were more likely to include carbon sequestration after visitors had gone through the exhibition than before. Specifically, 20% of adults added this feature to their own concept maps after seeing the exhibition. (There was no control group for this finding.)

33. Framework for Evaluating Impacts of Informal Science Education Projects, 50.



• Visitors learned that most of a tree's material comes from carbon dioxide in the air, not from the soil: 25% of visitors who had seen the exhibition correctly identified "the air" as the source of most of the weight of a tree. This number was significantly higher than the 10% from a comparison group who had not seen the exhibition. While answering this question, 15% of adults explicitly mentioned that this fact had surprised them.

• Visitors already knew that plants move: In a card-sorting task carried out by two groups of visitors (those who had and had not seen the exhibition), there was no significant difference in the number of visitors who correctly identified movement as a behavior of plants (80% versus 83%). However, discussion with visitors suggested that this might have been because the question was misleading: visitors' most common example of plants moving was because of wind, rather than self-initiated movement.

Evidence of impact on attitude (from hypothetical results):

• Overall appreciation: In describing what the exhibition was about, 60% of visitors recognized that it was to help people to appreciate plants.

• Visitors appreciated the sophistication of plants: In exit interviews, 20% of visitors mentioned that plants were more adaptable / flexible than they had realized. Behavioral observations also showed that the time-lapse videos were particularly effective in this way: visitors frequently commented on the cleverness or capacities of plants while watching them (35% of observed groups). A few even described the plants as "smart."

• Visitors appreciated the environmental contribution of plants: In exit interviews, 70% of visitors talked about the environmental role of plants, and 35% specifically mentioned that this was a valuable or even vital contribution. 25% mentioned concern about the fate of the earth's jungles, both as food for animals and as planetary storage for carbon. No comparable data was found from other sources.

• Visitors sustained their appreciation over time: Email follow-up interviews with visitors three months after their visit provided some evidence that visitors had sustained these attitudes over time. Specifically, 50% recalled the purpose of the exhibition as helping people appreciate plants, and this was not significantly different from the 60% found during exit interviews. 75% said they had discussed plants with friends or family since the exhibition and the majority of these mentioned a sense of appreciation for plants or concern about their decline as part of the conversation.

Notes about reporting: Hypothetically, suppose that 20% of visitors in exit interviews said they wanted to go home and plant more trees in their yards. This finding would be identified as an unanticipated outcome.



Table 5: Summary of Impacts of Plants Exhibition




APPENDIX C – References for Further Study The following resources are an introduction to the vast amount of research on creating successful program evaluations. Each one of these will also have a resource/bibliography section that can further aid your research.

Informal Learning 1. J. H. Falk and Lynn D. Dierking. Lessons Without Limit: How Free Choice Learning is Transforming Education. Alta Mira Press: Lanham, MD (2002). The authors detail the rising popularity of free-choice learning as well as the research that supports its value in forming an understanding of the world. Heavy emphasis is placed on what goes into creating lifetime learners. 2. J. H. Falk and Lynn D. Dierking. Learning from Museums: Visitor Experiences and the Making of Meaning. Altamira Press: Walnut Creek, CA (2000). Through years of research, the authors offer a picture of what the museum experience is like for a visitor, as well as the learning process in general. The book sets the stage for making informed decisions on how to create engaging and effective exhibits and programs. 3. Ted Ansbacher. “On Making Exhibits Engaging and Interesting.” Curator: The Museum Journal. 45(3), 2002: 167-173 The author clarifies the goals of experience-based exhibits versus traditional information dissemination exhibits. He offers a see-and-do analysis method to evaluate whether exhibits have achieved meaningful goals. 4. Philip Bell, Bruce Lewenstein, Andrew W. Shouse, and Michael A. Feder. Learning Science in Informal Environments: People, Places and Pursuits. National Research Council of the National Academies, National Academies Press: Washington, D.C. (2009). This report, completed for the National Research Council, synthesizes the current field-based research, literature, and psychological and anthropological studies of learning to create a framework, which can then inform the next phase of research into informal science learning. Contributors included researchers, educators, exhibit designers and programmers and looked at museums, zoos, aquariums, state parks, botanical gardens and natural history museums among others.

Formulating Successful Evaluations of Informal Learning Environments 1. Judy Diamond, Jessica J. Luke & David H. Uttal. Practical evaluation Guide: Tools for museums and other informal educational settings. Alta Mira Press: Lanham, Maryland. 2009. 11. This is a great overview of how to create an evaluation from your initial question, from choosing survey participants to analyzing your results. The authors also include a thick recommended resource list. They provide thorough explanations, define terms and offer the pros and cons of different qualitative and quantitative methods.



2. National Science Foundation. The Division of Research on Formal and Informal Learning. Framework for Evaluating Impacts of Informal Science Education Projects. By Sue Allen, Patricia B. Campbell, Lynn D. Dierking, Barbara N. Flagg, Alan J. Friedman, Cecilia Garibay, Randi Korn, Gary Silverstein, and David A. Ucko. Ed Alan J. Friedman. (Washington D.C.: The United States Government Printing Office, 2008), 20. (Available at: http://insci.org/resources/Eval_Framework.pdf) In this report completed for the National Science Foundation, aspects of informal science learning are discussed with a focus on summative evaluation techniques used to assess their impact. There are many examples of project evaluations from media environments to exhibitions, and targeting different outcomes such as attitudes and conceptual shift. 3. Jay Lemke, Robert Lecusay, Mike Cole, Vera Michalchik. “Documenting and Assessing Learning in Informal and Media-Rich Environments.” A report to The MacArthur Foundation (2012). This report to the MacArthur Foundations offers a review of current research into the evaluation of informal learning spaces gathered through a series of meetings with 25 experts in the field. It offers a set of recommendations based on these discussions and studies, arguing for a broader set of value based outcomes. 4. Joy Frechtling, Laure Sharp, Susan Bercowitz and Gary Silverstein. “User-Friendly Handbook for Mixed Method Evaluations.” A Report for the National Science Foundation. August, 1997. This resource is a handbook housed online through the NSF website for creating mixed method evaluations. Many terms are defined within the field of statistics as well as thoughtful questions posed, as to the evaluators intent and focus. 5. Matthew B Miles and A. Michael Huberman. Qualitative Data Analysis. Sage Publications (1994). This text presents the fundamentals of research design and data management. Concrete examples of research studies are presented to the reader. 6. Roger Tourangeau, Lance J. Rips and Kenneth Rasinski. The Psychology of Survey Response. Cambridge University Press (2000). In this text, the psychology of survey responses is examined, such as how and why people answer the way they do, what effects certain responses and what can cause response error. The author also looks at valuable conclusions that can be drawn using research from cognitive psychology, social psychology and survey methodology. 7. Institute of Museum and Library Services. Research: Evaluation Resource. Accessed from: http://www.imls.gov/research/evaluation_resources.aspx This online resource offers links to articles and websites useful in creating evaluations, as well as examples of surveys sent out from libraries and museums. 8. William Trochim. “Research Methods Knowledge Base.” Web Center for Social Research Methods. 2006. accessed http://www.socialresearchmethods.net/kb/questype.php This is an online text-book that addresses much of what would be covered in an undergraduate or graduate research methods class. The text is hyperlinked in order to easily move through the text and find further information on new concepts and vocabulary.



9. Maxwell L. Anderson. “Metrics of Success in Art Museums.” The Getty Leadership Institute at CGU (2004). Accessed from http://www.cgu.edu/pdffiles/gli/metrics.pdf The authors of this essay address the shift in how museums are held accountable to boards for their impact. The authors suggest ways of identifying and measuring institutional success. 10. Elizabeth Merritt. “You Get What You Measure.” Center for the Future of Museums Blog, 11/17/2009. Retrieved 6/3/14 from http://futureofmuseums.blogspot.com/2009/11/blogpost.html In this blogpost, the author addresses what it means to be accountable in today’s informal learning culture. She discusses some common measures museums are using to address their accountability such as dwell time, collection sizes and attendance levels. 11. Caroline Roberts. “Mixing modes of data collection in surveys: A methodological review.” ESRC National Centre for Research Methods Briefing Paper. City University, London. March, 2007. In this paper, the author reviews mixed-mode methods currently being used to conduct evaluations. She specifically addresses some of the complications that arise when using a mixed-method approach. 12. John R. Houser and Gerald M. Katz. “Metrics: You Are What You Measure.” European Management Journal, 16, 5, (October), 1998. 516-528. With a focus on corporate company interest, this article provides great insight into the effect of creating an evaluation on the company itself.

Alternative Technologies in Evaluating Informal Learning 1. Holly Yettick. “Computers ‘Read’ Students’ Faces to Measure Engagement.” Education Week. June 2, 2014. Downloaded on 7/15/14 from http://bit.ly/1pq6HsN Article introduces recent work being done to create software that can read engagement through unconscious facial movement. Presents possibility for use of this technology within a classroom setting. 2. Harry J. Witchel, Carina Westling, Aoife Healy, Nachiappan Chokalingam. and Rob Needham. “Comparing Four Technologies for Measuring Postural Micromovements During Monitor Engagement.” European Conference on Cognitive Ergonomics August 2012. 189-192. Accessed 6/ 24/14. http://bit.ly/1uMw9Jw Study compares readings from a head-mounted accelerometer, single camera sagittal motion tracking, and force distribution changes using floor-mounted force plates against a Vicon 8camera motion capture system in order to obtain objective data on users of interactional content. 3. Timothy Baldwin and Lejoe Thomas Kuiakose. “Cheap, Accurate RFID Tracking of Museum Visitors for Personalized Content Delivery.” Museums and the Web Conference. April 2009. Accessed on 8/4/14 http://bit.ly/1pqoiRm Study completed to track visitors’ movement throughout a museum using RFID (radio frequency identification) technology. Both the accuracy and viability are shown to have great potential in further personalizing a museum visitor’s experience.



4. Sherry Hsi and Holly Fait. “RFID Enhances Visitor’s Museum Experience at the Exploratorium.” Communications of the ACM. 48(9) 2005. 60-65. Researchers look at potential for RFID cards to give visitors a personalized experience beyond the physical trip to the museum. Also addresses the resistance caused by negative public perception of possible privacy invasion with RFID. 5. Sidney D’Mello, Patrick Chipman, Art Graesser. “Posture As a Predictor of Learner’s Affective Engagement.” University of Memphis. Memphis, TN. Researchers link the affected states of flow (high engagement) and boredom (low engagement) using a pressure sensitive chair and the tutoring program AutoTutor. They also used the test to investigate the reliability of a computer differentiating between boredom and flow. 6. Julian Bickersteth and Christopher Ainsley. “Mobile Phones and Visitor Tracking.” Presented at Museums and the Web 2011. Philadelphia, PA, USA. April 5-9, 2011. This article introduces a pilot study in the retail sector that tracks visitors using mobile phone WiFi, Bluetooth and TMSI signals. It is proposed that such studies are useful in reviewing how the visitor experience is monitored and tracked in museums.

Art + Science 1. Yang, Andrew. "Interdisciplinarity as Critical Inquiry: Visualizing the Art/Bioscience Interface." Interdisciplinary Science Reviews 36(1), 2011: 42-54. The author discusses the value of learning the vocabulary, methods and procedures that are unique to the "other" discipline as a critical piece of interdisciplinary, collaborative work. 2. Marshall, Julia. "Transdisciplinarity and Art Integration: Toward a New Understanding of Art-Based Learning Across the Curriculum." Studies in Art Education 55(6). 2006:17-24. The author sets forth a way of looking at transdisciplinarity that values the expertise of different disciplines and allows students open pathways to move between the boundaries without losing the rigor of research study. She places emphasis on the student journal or notebook as a place to house questions and mapping of the process.

Case Studies 1. J. H. Falk. “School Field Trips: Assessing Their Long-Term Impact.” Curator: The Museum Journal. 40(3) 1997: 211-218 128 students from 4th grade, 8th grade and adult are interviewed about elementary school fieldtrips. Nearly 100% remember content/subject matter learned on the field trip, even many years later. 2. Alan Brown and Rebecca Ratzkin. Clayton Lord (Ed.). Counting Beans: Intrinsic Impact and the Value of Art. Theater Bay Area. 2012.



Alongside essays by many contemporary art leaders is a write-up of a study completed with consulting firm Wolf/Brown to assess the true value of art. This piece is important for ongoing discussion around what to measure when assessing impact of the arts. 3. Oregon State University. "Surveys Confirm Enormous Value of Science Museums, 'Free Choice' Learning." This study focuses on the California Science Center in Los Angeles, and demonstrates results that uphold the concept that people get much of their knowledge about science from places other than formal learning institutions. 4. Melanie Young, Steve Burrow and Phillipa Dement. "Evaluating Origins: How a Museum Exhibit is Experienced by Visitors." Museum i-D. Accessed from http://www.museumid.com/idea-detail.asp?id=18 Education Research, Manager and Curator work to study how people at the National Museum Wales interact with the exhibits. Data plotted through GIS software allows for visual patterns to emerge for interpretation. 5. David Francis, Maggie Esson, and Andrew Moss. "Following Visitors and What It Tells Us." International Zoo Educators Journal. 49 (2007) 20-24. Researchers at the Chester Zoo, UK track visitors' movements to better understand how they are engaging with exhibits, specifically as it relates to dwell times. 6. Jeffrey K. Smith and Lisa F. Smith. "Spending Time on Art." Empirical Studies of the Arts. 19(2). 2001. 229-236. In this study, dwell time realities are empirically tested to find out how long people actually spend looking at artwork. This information is also used to look for patterns in age, gender and group size.



APPENDIX D – Chi square table

(Source: http://bit.ly/1H5Bikz)




APPENDIX E – Mann Whitney U test table for alpha level 0.05

Critical values of U (use the smaller of the two calculated U values; the difference is significant at the 0.05 level if your U is less than or equal to the table value). Columns give n1, rows give n2:

n2\n1 |  2 |  3 |  4 |  5 |  6 |  7 |  8 |  9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20
   2  |  - |  - |  - |  - |  - |  - |  0 |  0 |  0 |  0 |  1 |  1 |  1 |  1 |  1 |  2 |  2 |  2 |  2
   3  |  - |  - |  - |  0 |  1 |  1 |  2 |  2 |  3 |  3 |  4 |  4 |  5 |  5 |  6 |  6 |  7 |  7 |  8
   4  |  - |  - |  0 |  1 |  2 |  3 |  4 |  4 |  5 |  6 |  7 |  8 |  9 | 10 | 11 | 11 | 12 | 13 | 14
   5  |  - |  0 |  1 |  2 |  3 |  5 |  6 |  7 |  8 |  9 | 11 | 12 | 13 | 14 | 15 | 17 | 18 | 19 | 20
   6  |  - |  1 |  2 |  3 |  5 |  6 |  7 | 10 | 11 | 13 | 14 | 16 | 17 | 19 | 21 | 22 | 24 | 25 | 27
   7  |  - |  1 |  3 |  5 |  6 |  8 | 10 | 12 | 14 | 16 | 18 | 20 | 22 | 24 | 26 | 28 | 30 | 32 | 34
   8  |  0 |  2 |  4 |  6 |  7 | 10 | 13 | 15 | 17 | 19 | 22 | 24 | 26 | 29 | 31 | 34 | 36 | 38 | 41
   9  |  0 |  2 |  4 |  7 | 10 | 12 | 15 | 17 | 20 | 23 | 26 | 28 | 31 | 34 | 37 | 39 | 42 | 45 | 48
  10  |  0 |  3 |  5 |  8 | 11 | 14 | 17 | 20 | 23 | 26 | 29 | 33 | 36 | 39 | 42 | 45 | 48 | 52 | 55
  11  |  0 |  3 |  6 |  9 | 13 | 16 | 19 | 23 | 26 | 30 | 33 | 37 | 40 | 44 | 47 | 51 | 55 | 58 | 62
  12  |  1 |  4 |  7 | 11 | 14 | 18 | 22 | 26 | 29 | 33 | 37 | 41 | 45 | 49 | 53 | 57 | 61 | 65 | 69
  13  |  1 |  4 |  8 | 12 | 16 | 20 | 24 | 28 | 33 | 37 | 41 | 45 | 50 | 54 | 59 | 63 | 67 | 72 | 76
  14  |  1 |  5 |  9 | 13 | 17 | 22 | 26 | 31 | 36 | 40 | 45 | 50 | 55 | 59 | 64 | 67 | 74 | 78 | 83
  15  |  1 |  5 | 10 | 14 | 19 | 24 | 29 | 34 | 39 | 44 | 49 | 54 | 59 | 64 | 70 | 75 | 80 | 85 | 90
  16  |  1 |  6 | 11 | 15 | 21 | 26 | 31 | 37 | 42 | 47 | 53 | 59 | 64 | 70 | 75 | 81 | 86 | 92 | 98
  17  |  2 |  6 | 11 | 17 | 22 | 28 | 34 | 39 | 45 | 51 | 57 | 63 | 67 | 75 | 81 | 87 | 93 | 99 | 105
  18  |  2 |  7 | 12 | 18 | 24 | 30 | 36 | 42 | 48 | 55 | 61 | 67 | 74 | 80 | 86 | 93 | 99 | 106 | 112
  19  |  2 |  7 | 13 | 19 | 25 | 32 | 38 | 45 | 52 | 58 | 65 | 72 | 78 | 85 | 92 | 99 | 106 | 113 | 119
  20  |  2 |  8 | 13 | 20 | 27 | 34 | 41 | 48 | 55 | 62 | 69 | 76 | 83 | 90 | 98 | 105 | 112 | 119 | 127
