Calculating Descriptive Statistics

There are two major classes of statistics: descriptive statistics and inferential statistics. Descriptive statistics are computed to reveal characteristics of the sample data set and to describe study variables. Inferential statistics are computed to gain information about effects and associations in the population being studied. For some types of studies, descriptive statistics will be the only approach to analysis of the data. For other studies, descriptive statistics are the first step in the data analysis process, to be followed by inferential statistics. For all studies that involve numerical data, descriptive statistics are crucial in understanding the fundamental properties of the variables being studied. Exercise 27 focuses only on descriptive statistics and will illustrate the most common descriptive statistics computed in nursing research and provide examples using actual clinical data from empirical publications.

MEASURES OF CENTRAL TENDENCY

A measure of central tendency is a statistic that represents the center or middle of a frequency distribution. The three measures of central tendency commonly used in nursing research are the mode, median (MD), and mean (X̄). The mean is the arithmetic average of all of a variable's values. The median is the exact middle value (or the average of the middle two values if there is an even number of observations). The mode is the most commonly occurring value or values (see Exercise 8).

The following data have been collected from veterans with rheumatoid arthritis (Tran, Hooker, Cipher, & Reimold, 2009). The values in Table 27-1 were extracted from a larger sample of veterans who had a history of biologic medication use (e.g., infliximab [Remicade], etanercept [Enbrel]). Table 27-1 contains data collected from 10 veterans who had stopped taking biologic medications, and the variable represents the number of years that each veteran had taken the medication before stopping.
Because the number of study subjects represented below is 10, the correct statistical notation to reflect that number is:

n = 10

Note that the n is lowercase, because we are referring to a sample of veterans. If the data being presented represented the entire population of veterans, the correct notation would be the uppercase N. Because most nursing research is conducted using samples, not populations, all formulas in the subsequent exercises will incorporate the sample notation, n.

Mode

The mode is the numerical value or score that occurs with the greatest frequency; it does not necessarily indicate the center of the data set. The data in Table 27-1 contain two modes: 1.5 and 3.0. Each of these numbers occurred twice in the data set. When two modes exist, the data set is referred to as bimodal; a data set that contains more than two modes would be multimodal.

Median

The median (MD) is the score at the exact center of the ungrouped frequency distribution. It is the 50th percentile. To obtain the MD, sort the values from lowest to highest. If the number of values is an odd number, exactly 50% of the values are above the MD and 50% are below it. If the number of values is an even number, the MD is the average of the two middle values. Thus the MD may not be an actual value in the data set. For example, the data in Table 27-1 consist of 10 observations, and therefore the MD is calculated as the average of the two middle values.

$$MD = \frac{1.5 + 2.0}{2} = 1.75$$

Mean

The most commonly reported measure of central tendency is the mean. The mean is the sum of the scores divided by the number of scores being summed. Thus like the MD, the mean may not be a member of the data set.
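
The three measures defined above can also be checked by computer. The following is a minimal sketch in Python using the standard library's statistics module. The function name central_tendency and the list years are illustrative assumptions, and the values are hypothetical; they are not the data from Table 27-1.

```python
from statistics import mean, median, multimode


def central_tendency(values):
    """Return n, the mode(s), the median, and the mean of a list of numbers."""
    return {
        "n": len(values),                     # sample size (lowercase n)
        "modes": sorted(multimode(values)),   # every value tied for highest frequency
        "median": median(values),             # averages the two middle values when n is even
        "mean": mean(values),                 # sum of the values divided by n
    }


# Hypothetical values for illustration only; NOT the data from Table 27-1.
years = [0.5, 1.0, 1.0, 1.5, 2.0, 2.5, 3.0, 3.0, 3.5, 4.0]

summary = central_tendency(years)
print(summary)
```

With these hypothetical values, n is 10, the data set is bimodal (1.0 and 3.0 each occur twice), the median is the average of the two middle values, (2.0 + 2.5) ÷ 2 = 2.25, and the mean is 22.0 ÷ 10 = 2.2. Note that neither the median nor the mean is a member of the data set.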