Archive for May, 2010

LoTi vs. TICS

Projects Reports
 Posted by jeremy on May 27th, 2010

A grad student emailed me about the TICS. Specifically, the issue was how it compares to the Levels of Technology Implementation (LoTi) survey. Here are some of the things I came up with:

1. LoTi is a commercial survey, which has its advantages, but it is also not as open to scrutiny – or adaptation – as the TICS.

2. The TICS is aligned to the NETS-T while LoTi has its own framework.

3. Going by LoTi’s site, it’s clear that it is not in conformance with the latest AERA, APA, NCME Standards for Educational and Psychological Testing (which very few instruments are).

4. The LoTi concerns implementation of technology while the TICS measures self-efficacy regarding technology integration. Self-efficacy is mentioned twice on the LoTi website, once in an article that doesn’t mention LoTi (Eastin & LaRose, 2000), and in another article (Moersch, 1995) that hypothesizes self-efficacy’s relationship with LoTi (but no data is gathered).

Given these differences, it may be that the TICS is better suited for teacher education environments where…
1) the free and adaptable nature of the TICS is desirable,
2) the alignment to NCATE-accepted national standards is needed,
3) the academic rigor of the AERA, APA, NCME standards would be understood, and
4) the predictive nature of self-efficacy is advantageous.

The LoTi may be better suited for in-practice contexts where…
1) the LoTi’s cost is a small issue and its corporate backing is seen as legitimatizing,
2) accreditation isn’t an issue, so alignment to the NETS-T is less important,
3) conformance to research standards is not a concern, and
4) the concern is about actual practice, not potential practice.

On David’s Heresy

Open Education Opinion
 Posted by jeremy on May 12th, 2010

The respectable David Wiley proposes that proponents of “Open Educational Resources” simplify their message. “Open” is too confusing, so we should use terms that policymakers already understand. We should just say that, “Educational materials created with public dollars should be placed in the public domain” (emph. added).

David’s heresy is in the exclusion of GPL-like conditions on reuse. “Public Domain” means the user can do anything with the work, whereas some people (e.g. ESR) only consider something “open” when its use includes what CC would call share-alike. I remember this (inconclusive) discussion from David’s OER course last year, and it’s just a content-level version of the endless BSD-GPL argument.

Given that using the term “Public Domain” as David advocates does not encompass what many feel to be the essence of “openness,” I would argue against its use.

I’ve explored the issues of neologisms in complex-yet-important emergent theories (specifically pertaining to validity in assessment), and I don’t think there is a magic bullet answer. Those trying to clarify complex issues (like OER) can…

  1. Use an existing term in its original, accepted sense (“Public Domain”), but then they risk excluding the aspects of the concept that extend beyond that old term. Conversely, the old term may describe aspects that are not included in the new concept. There may be a reason for which the old term wasn’t used to describe the new concept.
  2. Use an existing term, but redefine it (“open”). Theorists who use this method must explain the new meaning and overcome the established meaning. The resulting confusion is often paralyzing. I think this is David’s central point, and it is valid.
  3. Invent a new term from scratch. The advantage is that there will be no confusion; the disadvantage is that there will be a lot of ignorance. You will still have to explain what the term means. See my neologism VIAR.
  4. Invent a new term by adding modifiers to existing nouns. Proponents of the term “Unobstructed License” seem to have taken this approach. As David noted, this yields a term that is often more confusing than either the noun or the modifier. Oh, and the new term still needs explanation.

The bottom line is that new and complex concepts have to be explained. There are effective and ineffective ways of explaining them, but, short of telepathy, we have to go through that explanatory process.

Some may argue that policymakers do not need to grok OERs; they just need to know enough to vote for policy that supports and encourages their use. I would disagree: Cursory understandings will be easily swayed by disinformation. Having policymakers who grok OERs is the goal. :)

A Collision of Technology and Policy

Open Education Opinion
 Posted by jeremy on May 11th, 2010

Given my complaints that TEDxNYED was an echo chamber, I hadn’t planned on reviewing the talk with which I most agreed. Plans change. I find myself struggling with policy that restricts my openness and, worse, encourages others to be less open. So, I pulled up David Wiley’s talk.

I won’t go over his premises. I admit that his definitions of “open” and “education” are debatable, but the parallels he draws between policy reaction to the printing press and policy reaction to new media are pitch-perfect. According to David’s account, when the printing press made information cheaper and easier to distribute…

Instead of obliging the demand that exists, [policymakers in the Church] ramp up production of indulgences…. and they push for stricter laws against access to vernacular copies of the scriptures.

Though their punishments were more severe and wielded with more authority, this reaction is very similar to how many institutions (including mine) are stumbling into online education.

In my experience that’s only half the problem. Policymakers eventually must allow – even encourage – faculty to explore the world of online education, but in a way that embodies the worst of colonialism and embrace-and-extend. Policymakers scaffold the transition to online learning with assumption-laden guidance that perpetuates the worst instructional paradigms. And there is always an expansion of control, which is the antithesis of openness.

Yes, I liked David’s TEDxNYED talk, but I would like to think that it wasn’t just because I agreed with it. (There were plenty of other talks with which I agreed, but still didn’t like.) I think it was because he brought something new to the discussion. I’m fairly plugged into the open ed world and I had never heard the collision of policy and technology so succinctly and appropriately presented.

How some standards are messed up

Lessons Opinion
 Posted by jeremy on May 5th, 2010

People in assessment are often at the pointy-end of standards. They have to translate national, regional, and even institutional standards into measurable terms. The problem is that very few standards are written to be measured; rather, they embody a committee-negotiated collective set of values.

Consider the standards of the National Board for Professional Teaching Standards (NBPTS). Here’s one from adolescent (high school) science that doesn’t look too daunting:

Accomplished Adolescence and Young Adulthood/Science teachers employ a deliberately sequenced variety of research-driven instructional strategies and select, adapt, and create instructional resources to support active student exploration and understanding of science.

At first glance, it is obvious that this standard contains more than one outcome; it’s a comma-delimited list of outcomes. But a bigger problem soon becomes apparent: There are two separate lists that are meant to be cross-tabulated. The verbs select, adapt, and create each relate to the pair of objects exploration and understanding. This multiplies the number of measurable outcomes contained in the standard. I count a total of seven, but there may be more:

Accomplished Adolescence and Young Adulthood/Science teachers…

  1. employ a deliberately sequenced variety of research-driven instructional strategies.
  2. select instructional resources to support active student exploration of science.
  3. select instructional resources to support active student understanding of science.
  4. adapt instructional resources to support active student exploration of science.
  5. adapt instructional resources to support active student understanding of science.
  6. create instructional resources to support active student exploration of science.
  7. create instructional resources to support active student understanding of science.
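
The cross tabulation above can be sketched in a few lines of Python. This is only an illustration of the counting, not anything the NBPTS publishes: the variable names are made up, and the verb and object lists are simply copied from the standard as quoted.

```python
from itertools import product

# Verbs and objects taken from the second clause of the quoted standard.
verbs = ["select", "adapt", "create"]
objects = ["exploration", "understanding"]

# The first clause stands alone as a single outcome.
outcomes = [
    "employ a deliberately sequenced variety of research-driven instructional strategies"
]

# Every verb pairs with every object, so the clause expands to 3 x 2 outcomes.
for verb, obj in product(verbs, objects):
    outcomes.append(
        f"{verb} instructional resources to support active student {obj} of science"
    )

for number, outcome in enumerate(outcomes, start=1):
    print(f"{number}. {outcome}")

print(len(outcomes))  # 7
```

Adding one more verb or object to either list would grow the total by a whole row or column of outcomes, which is why standards written this way balloon so quickly.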

There are only 6 NBPTS standards for adolescent science (for comparison, there are 12 for adolescent math), but those 6 standards break out into 28 distinct outcomes (including the seven listed above):

Accomplished Adolescence and Young Adulthood/Science teachers…

  1. know how students learn.
  2. know their students as individuals.
  3. determine students’ understandings of science.
  4. determine students’ individual backgrounds. (What is this supposed to mean?)
  5. have a broad and current knowledge of science and science education.
  6. have in-depth knowledge of one of the subfields of science.
  7. use their subfield knowledge to set important learning goals.
  8. use their subfield knowledge to set appropriate learning goals.
  9. employ a deliberately sequenced variety of research-driven instructional strategies.
  10. select instructional resources to support active student exploration of science.
  11. select instructional resources to support active student understanding of science.
  12. adapt instructional resources to support active student exploration of science.
  13. adapt instructional resources to support active student understanding of science.
  14. create instructional resources to support active student exploration of science.
  15. create instructional resources to support active student understanding of science.
  16. spark student interest in science.
  17. promote active learning so all students achieve meaningful growth toward learning goals.
  18. promote active learning so all students achieve demonstrable growth toward learning goals.
  19. promote sustained learning so all students achieve meaningful growth toward learning goals.
  20. promote sustained learning so all students achieve demonstrable growth toward learning goals.
  21. create safe learning environments that foster high expectations for each student’s successful science learning.
  22. create safe learning environments in which students experience and incorporate the values inherent in the practice of science.
  23. create supportive learning environments that foster high expectations for each student’s successful science learning.
  24. create supportive learning environments in which students experience and incorporate the values inherent in the practice of science.
  25. create stimulating learning environments that foster high expectations for each student’s successful science learning.
  26. create stimulating learning environments in which students experience and incorporate the values inherent in the practice of science.
  27. ensure that all students succeed in the study of science (including those from groups that have historically not been encouraged to enter the world of science and that experience ongoing barriers).
  28. ensure that all students understand the importance and relevance of science (including those from groups that have historically not been encouraged to enter the world of science and that experience ongoing barriers).

Suddenly the task of gathering evidence that an individual has or has not met this standard is enormous.

Ten Steps to Great Rubrics

Lessons Teaching
 Posted by jeremy on May 5th, 2010

I sat down this morning to write out some thoughts on a rubric I developed for a course this semester. The more I wrote, the more I rambled. I’ve concluded that each of these points needs to be elaborated individually, but for now, here’s a brain-dump.

—–

Introduction

Last semester I taught on very short notice a course entitled “Understanding Educational Research.” It’s essentially a thesis prep class, but because different advisors have different concepts of the thesis, I chose to play it safe by basing the coursework on a topical review of literature that may or may not lead into the student’s thesis. Because this lit review would be a major assignment (50% of the final grade), I knew I needed a solid rubric and I set out deliberately to develop one through the following steps.

1. Start with good evidence and theory

I believe the best rubrics embody the best thinking in their respective field. Unless you are the leader in your field, this means you need to go out and see what others are saying. Find something someone else has done, whether empirical or theoretical, and build your rubric around it. Generally speaking, I’m a fan of both versions of Bloom’s Taxonomy and the lesser-known Krathwohl’s Taxonomy.

Specific to the topic of master’s thesis lit reviews, I found an unpublished article by two friends to be hugely helpful. These friends adapted a rubric from Doing a Literature Review: Releasing the Social Science Imagination (Hart, 1999), and then used it to evaluate 30 theses. Their article was the first assigned reading for my course, and my students spent much more time discussing the rubric than they did going over their evaluation results.

2. Involve other people

Whether you talk to colleagues, your students, or (preferably) both, get someone else to look over the rubric early and often. In my case, the students worked in groups to determine which of Hart’s criteria were applicable to our class assignment, and then collaboratively crafted draft rubrics during the next three class sessions. I served primarily as a sieve, sorting out the contributions of each group and keeping the standards adequately elevated. Which is a nice segue into…

3. Aim high

If your rubric doesn’t describe the heights to which you believe your students may soar, you can only blame yourself when their work disappoints you. Few students will ever do more than they’re told. And why should they? It is not their responsibility to guess what extra work will get them a higher grade. It is imperative that your rubrics include what you know they can do, even if they don’t know it yet.

I had to employ a little subterfuge to raise my students’ expectations. OK, I flat-out lied to them. The article (described above) I assigned at the beginning of class wasn’t really a pre-pub version of some friends’ article; it was Scholars Before Researchers: On the Centrality of the Dissertation Literature Review in Research Preparation (Boote & Beile, 2005). I had removed every mention of “doctoral” and “dissertation” and replaced them with “master’s” and “thesis.” So when my master’s students were contemplating which criteria they would meet for their lit reviews, they were working from suggestions for doctoral students. I changed the names in the reference so they wouldn’t find the original article and catch me in my ruse.

It worked perfectly. It wasn’t until after all the lit reviews were submitted that I revealed the intrigue to my students. Yes, I saw a metaphorical dagger or two being flung my direction, but I haven’t fielded a single formal complaint. And, I believe, their work was much better when they held themselves to such a high standard.

4. Avoid subjective terms and judgments of quantity

Most rubrics fail to achieve greatness in part because they rely on overly subjective judgments. Terms like rarely, some, clearly, and (my personal favorite) nearly always are often used to distinguish between levels of performance. But these terms leave so much latitude to the rater that nearly every result is debatable. Other rubrics try to avoid this pitfall by quantifying degrees of frequency (e.g. “Students correctly cite their sources 70%-89% of the time”). This practice only conveys the impression of objectivity because the criteria are typically not actually measured. Neither subjective terms nor pseudoquantification is advisable.

This was an issue with many of Hart’s original performance levels for lit reviews. Consider the following levels for one of his criteria (emph. added):

Criterion: Placed the research in the historical context of the field.

  1. History of topic not discussed.
  2. Some mention of history of topic.
  3. Critically examined history of topic.

Notice the subjective terms in the top two performance levels. The difference between no discussion, some discussion, and critically discussed is endlessly discussable. But, this is what we typically see on good rubrics. What other options do we have?

Rather than vary the degree to which a student has performed the same verb, we can find different verbs that describe progressively stronger performance. In my case I grabbed verbs from Bloom’s original taxonomy. Here is the row from our rubric that corresponds to Hart’s row above:

Criterion: Placed the research in the historical context of the field.

  1. Mentions the history of the topic, but does not describe it.
  2. Describes the topic’s history in isolation from external influences.
  3. Frames the history of the topic in relevant social, scientific, and educational events/attitudes.
  4. Compares the target topic’s history with histories of related topics.

5. Purposefully weight each criterion

Many rubrics assign the same value to each criterion. While it is possible that they all be equally important, I believe that most of the time this phenomenon is the result of laziness on our part. We don’t want to think about how much “grammar and spelling” should be worth compared to “addressing the topic.” How we determine the weight assigned to each criterion (importance? difficulty? frequency?) is for another blog post.

The students’ input was invaluable for this issue on our rubric. They conveyed sincerity in their arguments for why one row should be worth more than another, and the final rubric – by which their work was judged – represents their collective opinions. The weights ranged from 6% for defining key terms to 20% for summarizing the methods researchers have used to explore the topic.
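
To make the effect of weighting concrete, here is a minimal sketch in Python. Only the 6% and 20% weights come from the rubric described above; the other two criteria, their weights, and the student's scores are hypothetical placeholders chosen so the weights sum to 100.

```python
# Weights in percent. Only the 6% and 20% figures come from the rubric described
# above; the other criteria and their weights are hypothetical placeholders.
weights = {
    "defines key terms": 6,
    "summarizes research methods": 20,
    "criterion C (hypothetical)": 24,
    "criterion D (hypothetical)": 50,
}


def weighted_score(scores, weights):
    """Combine per-criterion scores (0-100) into one weighted total."""
    total_weight = sum(weights.values())
    if total_weight != 100:
        raise ValueError(f"weights sum to {total_weight}, expected 100")
    return sum(scores[name] * weight for name, weight in weights.items()) / total_weight


# Hypothetical student: weak on key terms, strong on methods.
scores = {
    "defines key terms": 70,
    "summarizes research methods": 100,
    "criterion C (hypothetical)": 85,
    "criterion D (hypothetical)": 85,
}

print(weighted_score(scores, weights))           # weighted total
print(sum(scores.values()) / len(scores))        # equal-weight total, for comparison
```

With these made-up numbers the weighted total comes out higher than the equal-weight average, because the low score sits on the lightest criterion.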

6. Use non-linear performance levels

I would say 95% of the rubrics I have seen attempt to fit their performance levels to an equal-interval scale. That is, they put the same distance between each level. For example, Hart’s rubric (shown above) used a 1-2-3 scale. But what if the space – perhaps measured by effort – between the second and third levels isn’t the same as the space between the first and second? Rather than blindly following this convention, great rubrics may deliberately space out their performance levels unequally.

For our lit review rubric, we chose 70-80-85-100 for two reasons: First, in our opinion, an A-level paper needed to meet the highest criteria. A 70-80-90-100 distribution would have allowed someone to claim an “A-” without ever performing at the highest level. Second, the effort required to move from the second to third levels was consistently less than that required to move from the third to fourth level.
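
The first reason can be checked with a little arithmetic. The sketch below is a simplification under stated assumptions: it ignores criterion weights, uses a hypothetical five-criterion paper, and assumes (as the paragraph above implies) that 90 is where an “A-” begins.

```python
# Point values for performance levels 1-4.
nonlinear = {1: 70, 2: 80, 3: 85, 4: 100}  # the scale described above
linear = {1: 70, 2: 80, 3: 90, 4: 100}     # the equal-interval alternative


def total(levels, scale):
    """Unweighted mean score for a list of per-criterion performance levels."""
    return sum(scale[level] for level in levels) / len(levels)


# Hypothetical paper that reaches level 3 on every one of five criteria
# and never performs at the highest level.
all_threes = [3, 3, 3, 3, 3]

print(total(all_threes, linear))     # 90.0, enough for an assumed A- cutoff of 90
print(total(all_threes, nonlinear))  # 85.0, short of that cutoff
```

The gaps between adjacent levels on the non-linear scale are 10, 5, and 15 points, which mirrors the second reason: the step from the third to the fourth level is meant to cost more effort than the step from the second to the third.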

7. Either include zero as a performance level, or do not describe no-performance

One of my biggest pet peeves is a rubric that assigns a value of 1 to the lowest level of performance and contains a description of null performance at that level. Looking at Hart’s rubric above, notice that a student who doesn’t do anything under that criterion still receives a 1-out-of-3 score. This would allow students to claim A-level credit when they neglected a criterion that had been important enough to include on the rubric. Taken to the extreme, a blank paper would earn 33% credit.

A better way would be to choose between 1) including a null description with a zero-credit performance level, or 2) letting your lowest performance level be greater than zero, but describe some minimal performance at that level.

For our lit review rubric, we chose the latter. The value of the minimum performance level is 70%, but it is possible that the student will not even accomplish that level. There is a note at the bottom of the rubric which states that students will receive a score lower than 70% if they fail to fulfill those minimum requirements.
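
The blank-paper arithmetic is easy to verify. This is a small sketch with a hypothetical count of five criteria; it compares a 1-2-3 scale whose lowest level describes null performance against the first option above, where null performance earns zero.

```python
def percent_credit(levels, top_level):
    """Share of the maximum possible points earned, as a percentage."""
    return 100 * sum(levels) / (top_level * len(levels))


# A blank paper scored on five criteria (the count is hypothetical).
blank_on_1_to_3 = [1] * 5  # lowest level is worth 1 and describes "not discussed"
blank_on_0_to_3 = [0] * 5  # null performance is worth zero

print(round(percent_credit(blank_on_1_to_3, top_level=3)))  # 33
print(round(percent_credit(blank_on_0_to_3, top_level=3)))  # 0
```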

8. Check for understanding (both before and after the assignment)

When a rubric is handed out at the same time the task is assigned, it makes sense to check that the students actually understand what is being asked of them. If students’ literature reviews were scored with Hart’s rubric, the students would need to know what is meant by “critically examines the history of the topic.” Additionally, as the assignments are scored, the rater should watch for common misunderstandings so they can be cleared up the next time the rubric is used.

Because the students helped develop our lit review rubric, I assumed they had a good understanding of what was expected. I was wrong. For example, given that these are master’s students in a department of education and that they are all current, former, or future educators, there appeared to be confusion surrounding the term method. Some interpreted it, as I had intended, to imply research methods, but others took it to mean teaching methods. I will be clarifying this distinction on future versions of the rubric.

9. Analyze the results

No assessment tool works well the first time it is used. Commercial tests go through rounds of pilot testing before they are released. State tests… usually need more, but let’s hold ourselves to a high standard. Rubric-derived scores need to be tabulated for each criterion and each level of performance, and then the resulting patterns should be evaluated for their appropriateness. Was there one criterion on which many students scored very low? How can we fix that next time? Do we need to adjust the rubric or the instruction?
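
The tabulation described above can be done by hand or in a spreadsheet; as one possible sketch, here it is in Python. The criterion names and the four students' results are invented for illustration.

```python
from collections import Counter

# Hypothetical results: each student's performance level (1-4) per criterion.
results = [
    {"history": 3, "methods": 4, "rhetoric": 2},
    {"history": 4, "methods": 3, "rhetoric": 2},
    {"history": 3, "methods": 3, "rhetoric": 1},
    {"history": 2, "methods": 4, "rhetoric": 2},
]


def tabulate(results):
    """Count how many students landed in each level, criterion by criterion."""
    table = {}
    for student in results:
        for criterion, level in student.items():
            table.setdefault(criterion, Counter())[level] += 1
    return table


table = tabulate(results)
for criterion, counts in table.items():
    # A Counter returns 0 for levels nobody reached, so empty cells print as 0.
    row = "  ".join(f"L{level}:{counts[level]}" for level in range(1, 5))
    print(f"{criterion:<10} {row}")
```

A row where nearly every count sits in the lowest levels is the kind of pattern worth a second look, whether the cause turns out to be the rubric or the instruction.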

If you are concerned that the results may depend on who scored the assignment, you should have multiple raters independently score the same students’ work. This check for inter-rater reliability will tell you if more work needs to be done on the rubric, or perhaps on training scorers to use it.
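
One common statistic for this check is Cohen's kappa, which corrects the raw agreement rate between two raters for the agreement expected by chance. It is offered here as one reasonable choice, not the only one. The function below is a plain-Python sketch, and the two raters' scores are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who scored the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the raters gave the same level.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance, from each rater's marginal proportions.
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used one identical level throughout
    return (observed - expected) / (1 - expected)


# Hypothetical levels (1-4) given by two raters to the same ten papers
# on a single criterion.
rater_a = [3, 4, 2, 3, 3, 1, 4, 2, 3, 4]
rater_b = [3, 4, 2, 2, 3, 1, 4, 3, 3, 4]

print(round(cohens_kappa(rater_a, rater_b), 2))  # 0.71
```

This unweighted form treats a disagreement of one level the same as a disagreement of three. Because rubric levels are ordered, a weighted kappa may be a better fit when near-misses should count for something.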

In most cases an internal consistency reliability analysis (Cronbach’s alpha, KR-20, etc.) is not appropriate for rubric results. Internal consistency checks that your high-scorers aren’t losing points on the easy sections, and that your low-scorers are not getting high marks on the difficult sections. The criteria on a rubric are chosen because they represent various ways in which the quality of the student’s work may vary. We would expect criterion scores to be relatively independent, a trait which internal consistency would mistake as a lack of reliability.
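
To see why, it helps to compute Cronbach's alpha on two small, deliberately artificial data sets: one where the criterion scores are unrelated to each other, and one where they rise and fall together. The function follows the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores); the data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows are students, columns are criterion scores."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two criteria")
    columns = list(zip(*rows))
    item_variance = sum(variance(column) for column in columns)
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_variance / total_variance)


# Two criteria whose scores are unrelated across four hypothetical students.
independent = [[70, 70], [70, 100], [100, 70], [100, 100]]

# Two criteria that move in lockstep across four hypothetical students.
together = [[70, 70], [80, 80], [85, 85], [100, 100]]

print(round(cronbach_alpha(independent), 2))  # 0.0
print(round(cronbach_alpha(together), 2))     # 1.0
```

The independent scores produce an alpha of zero even though nothing is wrong with the scoring. That is the mistaken verdict described above: the statistic reads independence between criteria as unreliability.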

For our lit review rubric, I shaded each cell according to how many students ended up in each performance level. The shading (see below) revealed some welcome and disconcerting patterns. First, average scores for each criterion ranged between 81% and 92%, with an average total score of 83%. This is an appropriate result for such a difficult assignment. Second, very few students achieved the highest performance level for two criteria, which I believe was due to an interaction between the criteria and the specific topics students chose for their lit reviews. Third, students did not do well on the rhetoric criterion, which included organization, grammar/spelling, and APA style.

10. Revise as necessary

The problems uncovered by the analysis (if you didn’t find any problems, look again) can be classified into two gross categories: Problems with the rubric, and problems with the instruction. If the issues concern clarity or applicability of the criteria, then revise them. On the other hand, if students (or a certain set of students) scored lower than you expected, it might not be a problem with the rubric, but with the students’ preparedness. It would be a shame to revise or scrap a functioning rubric just because it didn’t give the results we wanted. Instead, consider altering the instructional activities, and then reusing the rubric to track any changes in student performance.

On my shaded copy of our lit review rubric, I highlighted the problematic cells and inserted endnotes describing the problems and possible solutions. I am not teaching this course next semester, so I need to record my concerns right away. With these notes, I can pick up and revise the rubric the next time I use it.

Conclusion

Whether having a great rubric is worth all this work is a valid question. But a wonderful aspect of rubrics is that they can be reused each semester (or even multiple times within a semester) without impacting the students’ results. So you can put in a little time now, then a little time next year, and develop the rubric in baby steps. So then, it’s not a question of whether it’s worth the time, but of how much time it is worth.