Testimony on Standardized Testing

Tuesday, May 14th, 2019

May 14, 2019

New Jersey Joint Committee on the Public Schools

Comments on State Assessments, Equity, and Accountability

Submitted by Christopher H. Tienken, EdD

Associate Professor of Education Leadership, Management, and Policy

Seton Hall University

South Orange, NJ

Good morning Senator Rice and Assemblywoman Jasey and honorable members of the Joint Committee.

My comments today come from years of research on the topic of standardized testing as a professor and from experiences with testing as a former assistant superintendent, principal, and teacher. Overall, the large body of research on the usefulness of standardized test results suggests they are blunt and inaccurate measures of the quality of teaching and learning that take place in schools, and they do nothing to address inequality of achievement.

The achievement gap itself is an offensive term that suggests there is something wrong with students, specifically students of color and students from poverty, because those are the students who are always identified as having an achievement gap. The term suggests that those students lack something that Caucasian students have.

The achievement gap is a distraction – it is a symptom of a much larger problem that exists in our society: the enactment of policies that favor one group over others – policies that create, by design, inequalities of opportunity. These include tax policies that widen income inequality, housing policies that segregate communities, labor policies that keep specific groups of people on the margins, and even our own school funding policies that have clearly created winners and losers in ways that are completely inequitable.

Unfortunately, high school exit exams, or any standardized tests for that matter, have no history of closing opportunity gaps, achievement gaps, or any other gaps. If they did, New Jersey would not have any gaps, as we have had high school exit exams for decades.

Standardized test results do not accurately tell us what or how well students learn, or how much they know about a specific topic. The results tell us more about the social and economic conditions in which students live and grow than what they know and can do.

Predicting the Results

Colleagues and I have conducted a series of studies in New Jersey, Connecticut, Massachusetts, Iowa, Michigan, and now in Ohio, in which the results from standardized tests were predicted by knowing only a few demographic factors found in the U.S. Census data about the community and families served by schools.

The findings from these various studies suggest that there are serious flaws built into education accountability systems that rely on standardized test results (e.g., see Tienken, Colella, Angelillo, Fox, McCahill, & Wolfe, 2017).

Most recently, we predicted the percentages of New Jersey high school students who would score at Level 4 or above on the PARCC Algebra I and English 10 assessments for 75% and 71% of the school districts, respectively, using just two demographic variables: the percentage of families in a community with income less than $35,000 a year and the percentage of families in a community with income greater than $200,000 a year (Maroun, 2018). We conducted similar studies and found similar results with the HSPA and the NJASK using some other variables.

Our models can identify how much a particular variable affects students’ scores. That allows us to identify the most important demographic characteristics as they relate to the test results. For example, by looking at just one characteristic – the percentage of families in a given community living in poverty – we were able to account for 50% of the test score in English language arts. That is, just one demographic factor accounts for half of the score. Regardless of the district, we were able to predict the results. The interesting thing is that all the factors we found that predicted test scores accurately were outside the control of the school and really told us more about where the student lived than how the student learned.
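To make the idea of a single variable "accounting for" part of a score concrete, here is a minimal sketch of a one-variable linear regression and its coefficient of determination (R²). It assumes that R² is the kind of statistic meant by "account for"; it is not the model used in the studies cited. The function names and the district values are hypothetical and made up purely for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variation in ys that the fitted line reproduces."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_resid / ss_total


# Made-up values for illustration only; NOT data from any study.
# One entry per hypothetical district.
poverty_pct = [5, 10, 15, 20, 25, 30, 35, 40]        # % of families in poverty
proficient_pct = [62, 70, 48, 58, 40, 52, 30, 44]    # % of students proficient

slope, intercept = fit_line(poverty_pct, proficient_pct)
share = r_squared(poverty_pct, proficient_pct, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={share:.2f}")
```

An R² near 0.5 on real district data would correspond to a single community characteristic reproducing about half of the variation in district results, without using any information about what happens inside the schools.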

In a national study I conducted several years ago, I looked at the scores for various groups of students from all the states that had high school exit exams. At that time it was 26 states; now it is just a dozen, as most states have realized the exams are relatively useless. But what I found, without exception, was that the group of students categorized as economically disadvantaged scored consistently lower than non-disadvantaged students. This finding was not groundbreaking, but it was consistent with the findings from the NAEP test as well. However, the interesting thing was that the finding holds true regardless of the district. That is, even in what are labeled Blue Ribbon or high-quality schools and districts, the scores of students from poverty are still lower.

That is because the tests are picking up the noise from students’ lives, not their potential as human beings, not how much they learn, not the kind of people they are, not their hopes, passions, and interests.

To be clear, this doesn’t mean that money determines how much students can learn. In fact, that couldn’t be further from the truth. Study after study demonstrates that students from poverty learn as much in a school year as students not in poverty – they just start at a different place. So if everyone is in a race, and they all run at relatively the same speed, yet one group starts 50 yards behind or in front, it is not hard to see who will always be in the front.

Though some proponents of standardized assessments claim that scores can be used to measure year-to-year academic growth, we’ve found that there’s simply too much noise in the scores for them to be useful indicators of learning or teaching. In fact, the inventor of the Student Growth Percentile (SGP) used right here in New Jersey, Damian Betebenner, stated in his September 2011 article that “the results of standardized assessments should never be used as the sole determinant of education or educator quality” – yet here we are, still debating whether a standardized test should be used to determine high school graduation.

Nationally known tests like the SAT suffer from the same issues, yet they are more pernicious when it comes to the opportunity gap and poverty. For example, there is about a 150-point difference between the scores of students living in families making $40,000 a year and those in families making $80,000 a year, and almost a 300-point difference between students in families making $40,000 a year and those in families making $180,000. We have many families of both kinds in New Jersey.

In short, the results from standardized tests do not close any gaps whatsoever – they actually create perception gaps: They increase the negative portrayals of students from poverty and students of color. They reinforce stereotypes and are used to justify policies that strip certain communities of badly needed resources.

Using standardized test results for high-stakes decisions does little to inform a system of education, and it ensures that certain groups of students will have to jump through more hoops and pay a higher price to graduate than other groups of students. Again, students who need the most get the least and do more work than everyone else. How that is equitable is beyond my comprehension.

Accountability 3.0: Assessment to Inform Learning

The following comments come directly from an article that appeared in the Kappa Delta Pi Record and was distributed to the Committee.

At the end of the day, this entire argument over high school exit exams comes down to accountability. In its most basic sense, education accountability at the state level is about answering the questions: How is the school doing, and are students learning? To fully answer those questions, a comprehensive accountability program should address how well schools address the economic, social-emotional, socio-civic, and avocational interests/hobbies of students (Dewey, 1916). This type of accountability requires a layered system that provides multiple measures and data points. The data points would be captured from the district, state, and regional accreditation layers.

The District Layer

The first layer of the comprehensive accountability system resides at the school district level. School districts should be accountable for assembling a portfolio of district-wide indicators that provide information on how well students are developing academically, socio-civically, and avocationally. The district level is ideal for providing in-depth information because districts can draw upon the many types of teacher-made assessments to help paint a picture of student development.

Districts can use high-quality, teacher-designed assessments that foster effective teaching methods. Examples include assessing reading levels through running records and readers’ workshop formats, writing prompts, literary analyses, and problem-based assessments that include socio-civic concepts and use of mathematics. Schools also can be judged on the types of avocational opportunities (clubs, hobbies, and organizations) they offer and how many students take advantage of those pursuits or have a hobby activity outside of school.

Although some might not want to accept it, over time, assessments made by teachers are better indicators of student achievement than standardized tests. For example, high school GPA, derived from teacher assessments, is a better predictor of first-year college success and four-year persistence than the SAT – that is according to the College Board’s own data from all SAT takers and a large study by the University of California, Berkeley, based on 80,000 students in the UC system (College Board, 2012; Geiser and Santelices, 2007). Also, high school GPA is less discriminatory against students from poverty and students of color than the SAT.

Existing Models

The New York Performance Standards Consortium is a group of almost 40 public schools that has developed authentic and problem-based assessments in areas such as higher-order thinking, writing, mathematical problem-solving, technology use, science research, appreciation and performance in the arts, service learning, and career skills. The schools use outside experts from universities and the community, along with the teachers, to audit assessment quality and results, review student work, and provide real-world feedback to students.

A clear framework for a district layer accountability structure already exists. The program known as the Nebraska School-based Teacher-led Assessment and Reporting System (STARS) was first implemented in Nebraska during the 2000–2001 school year under former Nebraska Commissioner of Education Doug Christensen (Dappen and Isernhagen, 2005). The Partnership for 21st Century Skills (2005) called it the “nation’s most innovative assessment system” (p. 13).

The program operated successfully until the 2009 school year, when the political winds changed and an NCLB-friendly state legislature switched to an all-commercial, standardized test-based system. But the framework, including state policy documents, assessments, and protocols, still exists, and state education leaders could easily reinvigorate the system without having to reinvent the accountability wheel.

The State Layer

The second layer involves the state department of education, in which state personnel serve a three-part role: (a) assessor, (b) auditor, and (c) professional developer. In the role of assessor, the state would administer low-stakes, nonintrusive, off-the-shelf standardized assessments of basic skills such as arithmetic and reading comprehension in grades 3-8 and high school. Such tests can be administered in 30 or 45 minutes, be finished in one day, and are inexpensive to administer and score. The results would carry little weight in the overall accountability system, but they would satisfy the federal ESSA testing requirement for compliance purposes.

The more important roles for state education personnel are those of auditor and professional developer. State personnel provide and/or facilitate job-embedded professional development for teachers on quality assessment design, problem-based activity development, and scoring protocols and processes. State personnel also provide an auditing system in which they audit a percentage of district-level accountability assessments to maintain quality control of the scoring processes.

National Accreditation Layer

The final layer is the capstone of the multidimensional accountability system: accreditation from third-party regional accreditation organizations. For instance, the process used by the Middle States Association of Colleges and Schools (2014) includes 12 components that cover all aspects of education at the school level: school mission, governance and leadership, school improvement planning, finances, facilities, system organization and staff, health and safety, information resources, educational program, assessment and evidence of student learning, student services, and student life and student activities.

National accreditation involves a comprehensive, multi-year process of intensive self-study by the school and district, a rigorous external review capped by a multi-day visitation by an independent team of accreditation auditors, and a detailed visitation report written by the team.

Accreditation looks at how schools are functioning on a broad range of components that affect all areas of schooling. When compared to national accreditation, the current system of QSAC review seems to be nothing more than bureaucratic hairspray to make an otherwise ineffective process look good.

Closing Argument

The time is right to revise New Jersey’s ESSA plan to downplay the role of standardized test results and develop a multi-layer system of accountability to inform teaching and learning.

A three-layered approach to accountability provides triangulated data points from which to inform all areas of the education process. The layered approach brings a sense of balance in which one indicator cannot make or break the rating of a school district. The entire structure acts to provide feedback about school quality to the public and provides actionable formative data that school personnel can use for more evidence-informed school enhancement efforts.


College Board. (2012). 2012 college bound seniors: Total group profile report. Author.

Dappen, L., & Isernhagen, J. C. (2005). Nebraska STARS: Assessment for learning. Planning and Changing, 36(3&4), 147–156.

Dewey, J. (1916). Democracy and education. New York, NY: Macmillan.

Every Student Succeeds Act (ESSA). Pub. L. No. 114–95 § 114 Stat. 1177 (2015).

Geiser, S., & Santelices, M. V. (2007). Validity of high-school grades in predicting student success beyond the freshman year: High-school record vs standardized tests as indicators of four-year college outcomes. Berkeley, CA: Center for Studies in Higher Education, University of California, Berkeley.

Maroun, J. (2018). The predictive power of out-of-school community and family level demographic factors on district level student performance on the New Jersey PARCC in Algebra 1 and English language arts 10. Retrieved from: https://scholarship.shu.edu/dissertations/2506/

Middle States Association of Colleges and Schools. (2014). Standards for accreditation for schools. Philadelphia, PA: Commissions on Elementary and Secondary Schools. Retrieved from http://www.msa-cess.org/Customized/Uploads/ByDate/2015/April_2015/April_23rd_2015/Standards%20for%20Accreditation%20for%20Schools%20(2014)69218.pdf

No Child Left Behind Act of 2001, 20 U.S.C.A. § 6301 et seq. (West, 2003).

Partnership for 21st Century Skills. (2005). The road to 21st century learning: A policymakers’ guide to 21st century skills. Author.

Tienken, C.H. (2018). Accountability for learning. Kappa Delta Pi Record.

Tienken, C.H., Colella, A.J., Angelillo, C., Fox, M., McCahill, K., & Wolfe, A. (2017). Predicting middle school state standardized test results using family and community demographic data. Research on Middle Level Education, 40(1), 1–13. Retrieved from: https://www.tandfonline.com/doi/full/10.1080/19404476.2016.1252304
