Program Evaluation Toolkit for Harm Reduction Organizations

Analyzing Your Quantitative Data

The quantitative data analysis process typically follows four steps:

Validate Your Data

Edit Your Data

Code Your Data

Analyze

Step 1: Validate the data

Data validation consists of ensuring that all the data collected for your program evaluation has been cleaned, is complete, and is labeled and stored properly. When using tables, labels are often the top row and may be called fields, columns, or variables. At this point, the evaluator and/or members of the program team will review all the quantitative data sets to remove any duplicates and unwanted data points. This is also where all identifying information about participants that is not relevant to the evaluation should be removed. Data of this nature usually consists of names, addresses, phone numbers, and personal or protected health information. Once complete, the result is a strong quantitative data pool that is accurate, relevant, and usable.
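
As a rough illustration of this step, the sketch below removes exact duplicate records and then drops identifying fields, using only the Python standard library. The field names, the sample records, and the list of identifying fields are all hypothetical and would need to be adapted to your own data set.

```python
# Hypothetical list of identifying fields; adjust it to match your own data.
IDENTIFYING_FIELDS = {"name", "address", "phone"}


def validate_records(records):
    """Remove exact duplicate records, then drop identifying fields."""
    seen = set()
    cleaned = []
    for record in records:
        # Compare full records first, so that two different people who gave
        # the same answers are not mistaken for duplicates of each other.
        key = tuple(sorted(record.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(
            {field: value for field, value in record.items()
             if field not in IDENTIFYING_FIELDS}
        )
    return cleaned


# Hypothetical sample: the first two rows are the same record entered twice.
sample = [
    {"name": "A. Example", "phone": "555-0100", "age": "34", "housing": "car"},
    {"name": "A. Example", "phone": "555-0100", "age": "34", "housing": "car"},
    {"name": "B. Example", "phone": "555-0101", "age": "34", "housing": "car"},
]
cleaned = validate_records(sample)
print(cleaned)
```

Duplicates are checked before the identifying fields are removed. Doing it the other way around would make two different respondents with identical answers look like one duplicated record.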

Step 2: Edit the data

The purpose of data editing is to ensure that the data is clear and understandable to viewers and to those who may analyze it. A common situation involves shortening data so that tables don’t include unnecessarily long entries that break visual flow. This usually involves reading through the data to identify raw data output that can be converted to formats that are easier for a computer to read and analyze. For example:

  • If a column is titled Housing, the answer “I am housed in an automobile or car” can be converted into “car.”
  • Empty answers may need to be converted into null or other software terms indicating that a field is empty, whereas questions that the respondent chose not to answer may be titled “Skipped Question” or NA.
  • Columns are often added which re-order or derive data from other responses, even qualitative ones. Using the response “I use heroin and cocaine”, you may want to add a column for ‘Number of Drugs Used’ with 2 as the value.
  • Computers treat text differently than numbers, so it is often necessary to convert a reply such as “three” to the numeral 3, or to remove units such as years, meters, hours, etc. In the case of units, the text usually becomes part of the column label to ensure clarity, so that a column entitled “distance” with an answer “3 miles” becomes “Distance in Miles” with the answer "3."
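
The conversions described above can be sketched in a few lines of Python. This is a minimal example rather than a complete cleaning routine: the function name edit_number and the small word-to-number table are hypothetical, and it assumes skipped questions are recorded as "NA".

```python
import re

# Small, hypothetical lookup table; extend it for the words in your data.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}


def edit_number(raw):
    """Convert answers such as "three" or "3 miles" into a number.

    Empty answers become None; "NA" (a skipped question) is kept as "NA".
    """
    if raw is None or raw.strip() == "":
        return None
    text = raw.strip().lower()
    if text == "na":
        return "NA"
    if text in NUMBER_WORDS:
        return NUMBER_WORDS[text]
    # Keep the leading number and discard a trailing unit such as "miles".
    match = re.match(r"(\d+(?:\.\d+)?)", text)
    if match:
        number = float(match.group(1))
        return int(number) if number.is_integer() else number
    # Anything else is returned unchanged so that a person can review it.
    return raw


print(edit_number("three"))    # 3
print(edit_number("3 miles"))  # 3
```

Answers the function does not recognize are returned unchanged instead of being guessed at, which keeps the editing objective and leaves unclear cases for a person to review.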

This editing involves using reason to decipher the meaning or fill in missing information, where appropriate. While editing, the goal is to make data unambiguous and clearer to a viewer. It is very important to remain objective and avoid biased editing. Biased editing can occur when the editor:

  • Attempts to remove or rephrase information that they don’t agree with.
  • Removes or alters data just to make analysis less complex or easier.
  • Attempts to add information that tells a story that they think is important.
  • Attempts to add information based on what they know about the respondent.

This deep dive into your quantitative data can be both time consuming and tedious, so it is important to allocate enough time for this effort. Rather than waiting until the end of data collection to edit everything at once, consider editing segments of validated data throughout the data collection process.

Step 3: Code the data

Data coding refers to the process of grouping and assigning value to the quantitative responses. By coding data, you can take large sets of information and break them down into simplified brackets or categories. Below is an example of how to code quantitative data received from a survey.

  • Example: You received 2,000 completed surveys and, as part of your analysis, need to describe the ages of the survey respondents. Instead of examining each age individually, you can create "age" categories and code each response into one of them to condense the amount of information you have to comb through during the analysis. Based on the ages reported, the categories could be 18-24 years old, 25-35 years old, 36-50 years old, and 51-70 years old. Now, instead of looking at 2,000 entries, you would only have to examine the four categories to analyze the age distribution among respondents.
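
The age example above could be coded along these lines. The bracket boundaries come from the example; the sample ages, the function name code_age, and the "Other" label for ages outside every bracket are assumptions made for illustration.

```python
from collections import Counter

# Age brackets from the example above: (label, lowest age, highest age).
AGE_BRACKETS = [
    ("18-24", 18, 24),
    ("25-35", 25, 35),
    ("36-50", 36, 50),
    ("51-70", 51, 70),
]


def code_age(age):
    """Return the bracket label for an age."""
    for label, low, high in AGE_BRACKETS:
        if low <= age <= high:
            return label
    # Assumption: ages outside every bracket stay visible instead of vanishing.
    return "Other"


ages = [19, 22, 30, 35, 36, 41, 52, 68, 75]  # hypothetical responses
counts = Counter(code_age(age) for age in ages)
for label, count in counts.items():
    print(label, count)
```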

Step 4: Analyze the data

The most commonly used quantitative data analysis method is descriptive statistics. Descriptive statistics refers to analyzing data in a way that helps to describe or summarize the relationships and patterns that are present. Essentially, it takes large amounts of data and breaks them down into several categories of useful information to examine "what happened."

 

Table (4.9). Here are some common examples of descriptive analysis.

Mean: the numerical average
Median: the midpoint of a data set when the values are ordered from smallest to largest
Mode: the most common value
Percentage: a ratio expressed as a fraction of 100
Frequency: the number of occurrences
Range: the largest number minus the smallest number in the data set
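
For readers working outside a spreadsheet, the measures in the table can be computed with Python's built-in statistics module. The list of ages below is hypothetical and is here only to show the calculations.

```python
import statistics
from collections import Counter

ages = [22, 25, 25, 31, 40, 47]  # hypothetical ages of six respondents

mean_age = statistics.mean(ages)      # numerical average
median_age = statistics.median(ages)  # midpoint of the sorted values
mode_age = statistics.mode(ages)      # most common value
age_range = max(ages) - min(ages)     # largest minus smallest
frequency = Counter(ages)             # number of occurrences of each age
# Percentage of respondents who reported an age of 25.
percent_aged_25 = 100 * frequency[25] / len(ages)

print("Mean:", round(mean_age, 1))
print("Median:", median_age)
print("Mode:", mode_age)
print("Range:", age_range)
print("Frequency of 25:", frequency[25])
print("Percentage aged 25:", round(percent_aged_25, 1))
```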

Note: Descriptive analysis can help reveal outliers, which are data points likely to be incorrect or highly abnormal, such as when someone enters their age as 7,591. These outliers are often removed so as not to skew critical data points, such as "average age." Excluding outliers should be done with care, as some results may be true but abnormal. Start with data that is undeniably incorrect, such as “our clinic is open 28 hours a day.” Statistical methods for identifying outliers can be found in the Quantitative Analysis resources section below.
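
One simple way to start with undeniably incorrect values is a plausibility check, sketched below. The function name split_implausible and the bounds of 0 to 120 years are assumptions chosen for illustration. This is not a statistical outlier test, and flagged values are set aside for review, not deleted.

```python
def split_implausible(values, lowest, highest):
    """Separate values inside a plausible range from those outside it."""
    kept = [v for v in values if lowest <= v <= highest]
    flagged = [v for v in values if not (lowest <= v <= highest)]
    return kept, flagged


ages = [19, 34, 7591, 52, 88]  # hypothetical ages, including a typing error
# Assumed bounds: a human age is taken to lie between 0 and 120 years.
kept, flagged = split_implausible(ages, lowest=0, highest=120)

print("Kept:", kept)
print("Flagged for review:", flagged)
```

Note that 88 stays in the data: it is unusual for some programs but entirely possible, which is the kind of true but abnormal result that should not be removed automatically.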

Inferential statistics goes a step beyond descriptive statistics by using the same quantitative data to draw conclusions (or inferences) and make predictions about the larger population. Common examples of inferential analysis are correlation (measuring the strength and direction of the relationship between two variables) and regression (modeling how one variable changes as another changes, which allows prediction). Inferential analysis is more complex than descriptive analysis and typically requires a more advanced understanding of statistics to appropriately apply it to your program evaluation.
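
To give a sense of what these two techniques calculate, the sketch below computes a Pearson correlation coefficient and a least-squares regression line by hand. The variable names and data are hypothetical, and the example stops short of the significance testing needed to draw conclusions about a larger population.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: strength and direction of a linear relationship."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def least_squares_line(xs, ys):
    """Simple linear regression: slope and intercept of the best-fit line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: monthly visits and supply kits received per participant.
visits = [1, 2, 3, 4, 5]
kits = [3, 5, 7, 9, 11]

r = pearson_r(visits, kits)
slope, intercept = least_squares_line(visits, kits)
print("Correlation:", r)
print("Predicted kits for 6 visits:", slope * 6 + intercept)
```

A correlation near 1 or -1 indicates a strong linear relationship and a value near 0 a weak one, while the regression line is what makes a prediction for a new value possible.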

HELPFUL TOOLS

For descriptive statistics, Microsoft Excel and Google Sheets are commonly used.

For inferential statistics, tools such as SPSS, SAS, or Stata are commonly used.

Here are some resources to learn more about inferential analysis:

When engaging in your quantitative analysis process, it is helpful to keep the following in mind:

  • Include numbers with your percentages. When writing up your findings, remember that every percentage should also indicate the total number the ratio is based on. Including percentages alone can be misleading because they don’t on their own offer insight into what the ratio means. For example, just saying that “50% of our clients have been linked to psychosocial services” does not paint as complete a picture as “50% of our 10 clients have been linked to psychosocial services.” Often, the size of the sample is established in a shorthand where the letter "n" is meant to show the number of responses in the sample, such as “50% (n=10).”
  • Interpret, don’t just describe. While it might be tempting to write up findings based on exactly what the numbers say, a key opportunity the analysis process offers is the ability to interpret meaning from the data. Consider what the data is telling you. What are your takeaways? What could this information mean? These, along with the descriptive data, are the insights that you should look to include in your findings.
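
The "n" shorthand for reporting percentages can be produced consistently with a small helper such as the one below. The function name percent_with_n is hypothetical, and rounding to a whole percentage is an assumption you may want to change.

```python
def percent_with_n(count, total):
    """Format a percentage together with the sample size it is based on."""
    if total == 0:
        # Avoid dividing by zero when there were no responses at all.
        return "NA (n=0)"
    return f"{round(100 * count / total)}% (n={total})"


# Hypothetical figures: 5 of 10 clients linked to psychosocial services.
print(percent_with_n(5, 10))  # 50% (n=10)
```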

Here are some resources on conducting a quantitative analysis: