\documentclass[10pt]{article}
\usepackage{fullpage}
\usepackage{setspace}
\usepackage{parskip}
\usepackage{titlesec}
\usepackage[section]{placeins}
\usepackage{xcolor}
\usepackage{breakcites}
\usepackage{lineno}
\usepackage{hyphenat}
\PassOptionsToPackage{hyphens}{url}
\usepackage[colorlinks = true,
linkcolor = blue,
urlcolor = blue,
citecolor = blue,
anchorcolor = blue]{hyperref}
\usepackage{etoolbox}
\makeatletter
\patchcmd\@combinedblfloats{\box\@outputbox}{\unvbox\@outputbox}{}{%
\errmessage{\noexpand\@combinedblfloats could not be patched}%
}%
\makeatother
\usepackage{natbib}
\renewenvironment{abstract}
{{\bfseries\noindent{\abstractname}\par\nobreak}\footnotesize}
{\bigskip}
\titlespacing{\section}{0pt}{*3}{*1}
\titlespacing{\subsection}{0pt}{*2}{*0.5}
\titlespacing{\subsubsection}{0pt}{*1.5}{0pt}
\usepackage{authblk}
\usepackage{graphicx}
\usepackage[space]{grffile}
\usepackage{latexsym}
\usepackage{textcomp}
\usepackage{longtable}
\usepackage{tabulary}
\usepackage{booktabs,array,multirow}
\usepackage{amsfonts,amsmath,amssymb}
\providecommand\citet{\cite}
\providecommand\citep{\cite}
\providecommand\citealt{\cite}
% You can conditionalize code for latexml or normal latex using this.
\newif\iflatexml\latexmlfalse
\providecommand{\tightlist}{\setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}%
\AtBeginDocument{\DeclareGraphicsExtensions{.pdf,.PDF,.eps,.EPS,.png,.PNG,.tif,.TIF,.jpg,.JPG,.jpeg,.JPEG}}
\usepackage[utf8]{inputenc}
\usepackage[english]{babel}
\usepackage{float}
\begin{document}
\title{Calculating LongString in Excel to Detect Careless Responders~~~~}
\author[1]{Richard N. Landers}%
\affil[1]{Affiliation not available}%
\vspace{-1em}
\date{\today}
\begingroup
\let\center\flushleft
\let\endcenter\endflushleft
\maketitle
\endgroup
\sloppy
Careless responding is one of the most fundamental challenges of survey
research. We need our respondents to respond honestly and with effort,
but when they don't, we need to be able to detect and remove them from
our datasets. A few years ago, Meade and Craig (2012)~published an
article in~\emph{Psychological Methods}~exploring a significant number
of techniques for doing exactly this, ultimately recommending a
combination of three detection techniques for rigorous data cleaning,
which, let's face it, is a necessary step when analyzing any internet
survey. ~These techniques are even-odd consistency, maximum longstring,
and Mahalanobis D:
\begin{enumerate}
\tightlist
\item
\textbf{The even-odd consistency}~\textbf{index}~involves calculating
the subscale means for each measure on your survey, split by even and
odd items. ~For example, the mean of items 1, 3, 5, and 7 would become
one subscale whereas the mean of items 2, 4, 6, and 8 would become the
other. ~Next, you take all of the even subscales and pair them with
all of the odd subscales across all of the measures in your survey,
calculate a correlation, and then apply the Spearman-Brown prophecy
formula to adjust the value up to a scale of -1 to 1.
\item
\textbf{Maximum LongString~}is the largest value for LongString across
all scales on your survey, where LongString is the number of identical
response in a row. ~Meade and Craig recommended LongString would be
most useful when the items were randomly ordered.
\item
\textbf{Mahalnobis D~}is~calculated from the regression of scale means
onto all the scores that inform them. In a sense, you are trying to
see if responses to individual items correspond with the scale mean
they created consistently across individuals. Some conceptualizations
of this index regress participant number onto scores, which
conceptually accomplishes basically the same thing.
\end{enumerate}
In all three cases, the next step is to create a histogram of the values
and see if you see any outliers.
\subsection*{Calculating Careless Responding
Indices}
Of these three, Mahalanobis D is the most easily calculated, because
saving Mahalanobis D values is a core feature in regression toolkits. It
is done easily in SPSS, SAS, R, etc.
The second, the even-odd consistency index, is a bit harder but still
fundamentally not too tricky; you just need to really understand how
your statistical software works. ~Each step, individually, is simple:
calculate scale means, calculate a correlation, apply a formula.
The third, Max LongString, is the most intuitively understandable but
also, often unexpectedly, the most difficult to calculate. ~I imagine
that the non-technically-inclined generally count by hand -- ``this
person has a maximum of 5 identical answers in a row, the next person
has 3\ldots{}''
\href{https://www.ibm.com/developerworks/community/forums/html/topic?id=a6e61287-222c-427b-9f94-0976621dc243}{An
SPSS macro already exists to do this}, although it's not terribly
intuitive. ~You need to manually change pieces of the code in order to
customize the function to your own data.
Given that, I decided to port the SPSS macro into Excel and make it a
little easier to use.
\subsection*{An Excel Macro to Calculate
LongString}
\texttt{Function\ LongString(cells\ As\ Range)\ ~~~\ Dim\ cell\ As\ Range\ ~~~\ Dim\ run\ As\ Integer\ \ ~~\ Dim\ firstrow\ As\ Boolean\ ~~~\ Dim\ maxrun\ As\ Integer\ ~~~\ Dim\ lastvalue\ As\ String\ \ ~\ ~\ firstrow\ =\ True\ ~~~\ run\ =\ 1\ \ ~~\ maxrun\ =\ 1\ \ ~~\ \ For\ Each\ cell\ In\ cells\ ~~~~~~~\ If\ firstrow\ =\ True\ Then\ ~~~~~~~~~~~\ firstrow\ =\ False\ ~~~~~~~~~~~\ lastvalue\ =\ cell.Value\ \ ~~~~~~\ Else\ ~~~~~~~~~~~\ If\ cell.Value\ =\ lastvalue\ Then\ ~~~~~~~~~~~~~~~\ run\ =\ run\ +\ 1\ ~~~~~~~~~~~~~~~\ maxrun\ =\ Application.Max(run,\ maxrun)\ ~~~~~~~~~~~\ Else\ ~~~~~~~~~~~~~~~\ run\ =\ 1\ ~~~~~~~~~~~\ End\ If\ ~~~~~~~~~~~\ lastvalue\ =\ cell.Value\ ~~~~~~~\ End\ If\ ~~~\ Next\ cell\ \ ~~~\ LongString\ =\ maxrun\ End\ Function}
\subsection*{To Use This Code Yourself}
\begin{enumerate}
\tightlist
\item
With Excel open, press Alt+F11 to open the VBA Editor.
\item
Copy/paste the code block above into the VBA Editor.
\item
Close the VBA Editor (return to Excel).
\item
In an empty cell, simply type~\textbf{=LONGSTRING()}~and put the cell
range of your scale's survey items inside. ~For example, if your first
scale was between B2 and G2, you'd use~\textbf{=LONGSTRING(B2:G2)}
\item
Repeat this for each scale you've used. ~For example, if you measured
five personality dimensions, you'd have five longstrings calculated.
\item
Finally, in a new cell use the~\textbf{=MAX()}~function to determine
the largest of that set. ~For example, if you put your five
LongStrings in H2 to L2, you'd use~\textbf{=MAX(H2:L2)}
\end{enumerate}
That's it! Importantly, the cells needs to be in Excel in the order they
were administered. ~If you used randomly ordered items, this adds an
additional layer of complexity, because you'll need to recreate the
original order for each participant first before you can apply
LongString. That takes a bit of Excel magic, but if you need to do this,
I recommend you read up on~\textbf{=INDIRECT()}~and~\textbf{=OFFSET()},
which will help you get that original order back, assuming you saved
that item order somewhere.
\subsection*{Final Steps}
Once you have Max LongString calculated for each
participant,~\href{https://support.office.com/en-us/article/Create-a-histogram-b6814e9e-5860-4113-ba51-e3a1b9ee1bbe}{create
a histogram}~of those values to see if any strange outliers appear. If
you see clear outliers (i.e., a cluster of very high Max LongString
values off by itself, away from the main distribution), congratulations,
because it's obvious which cases you should drop. ~~If you don't see
clear outliers, then it's probably safer to ignore LongString for this
analysis.
Another potential issue with Max LongString is that if you have scales
with varying numbers of items, shorter scales can be lost when you
calculate the Max. To avoid that problem, try converting each of your
longstrings into a proportion. For example, for a 4-item scale, try
\textbf{=LONGSTRING(H4:H7)/4} and then calculate MAX() on those results.
\section*{References}
{\label{121272}}
Meade AW, \& Craig SB (2012). Identifying careless responses in survey
data.~Psychological Methods, 17~(3), 437-55.
\par\null
\selectlanguage{english}
\FloatBarrier
\end{document}