
SPSS FOR WINDOWS

Cleborne D. Maddux, Ph.D.
Professor

maddux@unr.edu

Department of Counseling and Educational Psychology

College of Education
The University of Nevada, Reno

Tutorial #5 - USING SPSS FOR WINDOWS TO CREATE A DATA FILE AND RUN A FREQUENCIES ANALYSIS

Print out this tutorial so you will have a printed copy in your hand while you work.

In this tutorial, you will learn how to use SPSS for WINDOWS to obtain a FREQUENCIES run.

An SPSS FREQUENCIES run is one of the first things you will want to do with any new research data after you have gathered it and constructed a data file in SPSS. This run lists every obtained value for each variable, along with the number of subjects who were assigned that value.

For example, if you have a variable named "group" that is coded 1 for experimental group and 2 for control group, a FREQUENCIES run will show you how many subjects are designated with a 1 in this category and how many are designated with a 2. (However, it will NOT calculate statistics such as means for each of these groups. To do that, you will have to use the MEANS command.)

FREQUENCIES is useful when you are at the beginning of your analyses, and you are trying to find any mistakes you have made in your data entry. Since you will know how many were in each group, this run will be a good way for you to check on the accuracy of the way you entered the data. If you know you had five in each group, but the FREQUENCIES run shows 5 in the experimental group, 4 in the control group, and one "missing case," you will know that you made a mistake in your data entry. The same process will help you double-check every variable.
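The check described above amounts to counting how often each code occurs and reporting blanks separately. Here is a minimal Python sketch of that idea (it is not SPSS itself). The ten-case `group` list and the use of `None` for a blank cell are assumptions made up to mirror the example in the paragraph:

```python
from collections import Counter

# Hypothetical ten-case "group" column: 1 = experimental, 2 = control.
# None stands for a cell left blank during data entry (an assumption).
group = [1, 1, 1, 1, 1, 2, 2, 2, 2, None]

# Separate the valid codes from the blanks, then count each code.
valid = [g for g in group if g is not None]
counts = Counter(valid)
missing = len(group) - len(valid)

for value in sorted(counts):
    print(f"group = {value}: {counts[value]} cases")
print(f"missing: {missing} cases")
```

If you expected five cases in each group, a count of four plus one missing case points you straight to the row that needs to be fixed.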

Sometimes you might not want to include every variable in your data in a FREQUENCIES run, however. For example, if you have several thousand cases with data on a continuous variable that ranges from 0 to 1000, you might not want to include that variable in the run, since the printout for that variable alone will take up many pages. (It will show each obtained value from 0 to 1000 and how many cases earned that value.)

Before beginning this tutorial, you might want to review the last SPSS FOR WINDOWS tutorial on obtaining a DESCRIPTIVES run.


The first thing you should do is make an SPSS data file. Here is the data to use:

Suppose you have scores on 40 students (20 in an experimental group, and 20 in a control group). Here are the scores:

EXPERIMENTAL: 80, 83, 90, 87, 80, 83, 91, 90, 80, 83, 85, 87, 89, 80, 91, 88, 83, 90, 85, 89
CONTROL: 50, 55, 53, 55, 50, 60, 49, 50, 51, 52, 55, 56, 60, 63, 65, 78, 54, 50, 61, 49

_____ 1. Begin by starting SPSS FOR WINDOWS.

_____ 2. If asked, choose the option to "Type in data."

Remember that we will use the traditional convention of making each ROW an individual subject, and each COLUMN a variable in the study.

_____ 3. We will first name the variables. Click on the Variable View tab in the lower left of the screen. (These are directions for SPSS version 10 and up. For earlier versions, variables are named by first placing the cursor at the top of the column to be named, and then clicking on the Data menu item and choosing Define Variable.)

For this data, we will need three variables (columns):
a. One will contain an identification number from 1 to 40, and should be named id.
b. The next will contain all 40 scores, and should be named test.
c. The last variable will contain a code number defining the group the subject is in, and should be named group.

_____ 4. When in the Variable View screen, put the cursor in the first cell in the column labeled Name and type id. You will notice that SPSS will fill in some of the other cells in that row. This is normal.

_____ 5. Put the cursor in the second cell under Name and type test.

_____ 6. Put the cursor in the third cell under Name and type group.

When you are finished naming the variables, the first few rows and columns of the screen should look like this:

Look at the group variable above. This is actually a "dummy variable" that we set up simply to tell us whether a score is an experimental group score or a control group score. We can tell SPSS how to handle that so that this designation will appear on printouts:

_____ 7. Click the cursor once in the row labeled group, the column headed Values. You will see a small gray box with three periods in it appear in that cell.

_____ 8. Click once on the gray box. A Value Labels box will open and should look like this:

SPSS Value Labels apply to categorical variables, and allow you to specify which category is designated by each possible value of that variable. So, we will want to tell SPSS that a Value of 1 means experimental group, while a Value of 2 means control group. Here is how to do that:

_____ 9. Put the cursor in the blank field next to Value and type a 1.

_____ 10. Put the cursor in the blank field next to Value Label and type experimental group.

_____ 11. Click the Add button once to record your value labels for that value.

_____ 12. Put the cursor in the blank field next to Value and type a 2.

_____ 13. Put the cursor in the blank field next to Value Label and type control group.

_____ 14. Click the Add button once to record your value labels for that value.

When you have finished step #14 above, the box should look like this:

_____ 15. Click the OK button once.

_____ 16. Click the Data View tab at the bottom left of the screen to return to the data. You will see that the three columns now contain the variable names you entered (id, test, and group).
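The value labels defined in steps 9 through 14 amount to a lookup table from code numbers to category names. Here is a rough Python sketch of the idea. The names `VALUE_LABELS` and `label_for` are hypothetical and chosen only for illustration; they are not part of SPSS:

```python
# Value labels for the "group" variable, as typed in steps 9 through 14.
VALUE_LABELS = {1: "experimental group", 2: "control group"}


def label_for(code):
    """Return the label for a group code, or the code itself if unlabeled."""
    return VALUE_LABELS.get(code, str(code))


# Code 3 has no label, so it is shown as the bare number.
for code in (1, 2, 3):
    print(code, "->", label_for(code))
```

An unlabeled code in a printout (such as the 3 here) is one more hint that a value was mistyped during data entry.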

Now, you are ready to enter the scores. The easiest way to do so is probably to enter one entire variable (column) at a time, beginning with the id variable.

_____ 17. While in the Data View screen, put the cursor in the first blank cell under id. Now type in a 1, press ENTER on your keyboard, type a 2, press ENTER on your keyboard, and continue until you have typed in the numbers from 1 to 40.

_____ 18. Now, we will enter the values in the test variable. Remember, these are the actual scores earned. We will enter the experimental group scores as the first 20 rows, and the control group scores in the next 20. Type them in now.

_____ 19. Now, we will enter the values in the group variable. Remember that these will be a 1 if the score in that row is an experimental group score, and a 2 if the score in that row is a control group score. Therefore, put a 1 in the first 20 rows and a 2 in the next 20 rows. Do that now.
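The layout built in steps 17 through 19 (one row per subject, one column per variable) can be sketched in Python as a list of small records. This is only an illustration of the structure, using the scores given earlier in this tutorial; the names `EXPERIMENTAL`, `CONTROL`, and `rows` are assumptions of the sketch:

```python
# Scores exactly as listed in the tutorial.
EXPERIMENTAL = [80, 83, 90, 87, 80, 83, 91, 90, 80, 83,
                85, 87, 89, 80, 91, 88, 83, 90, 85, 89]
CONTROL = [50, 55, 53, 55, 50, 60, 49, 50, 51, 52,
           55, 56, 60, 63, 65, 78, 54, 50, 61, 49]

# One row per subject: id runs from 1 to 40, and group is 1 for the
# first 20 rows (experimental) and 2 for the next 20 (control).
rows = []
for case_id, score in enumerate(EXPERIMENTAL + CONTROL, start=1):
    group = 1 if case_id <= len(EXPERIMENTAL) else 2
    rows.append({"id": case_id, "test": score, "group": group})

# Show the first few rows and the total number of cases.
for row in rows[:3]:
    print(row)
print("cases:", len(rows))
```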

If you have followed directions, here is what the first few rows should look like when you are finished:

At this point, you could elect to save this data in an SPSS data file (such files must have a .sav extension). This is important if you will be doing further analyses later, so that you don't have to re-enter the data. For this exercise, you should save the file on your diskette. Call it spsswinfreq.sav:

_____ 20. As usual in a WINDOWS program, save the file by clicking on File in the menu line at the top of the screen and choosing Save As. Navigate to the A: drive, and save the file as spsswinfreq.sav.


You now have the data successfully entered, the variables named, the value labels defined, and the entire thing saved on a diskette. You are now ready to run any analyses. You will begin with a FREQUENCIES run:

_____ 1. While in the data view screen, click once on Analyze in the menu line at the top of the screen, then Descriptive Statistics, and then Frequencies:

A Frequencies box will open and should look like this:

_____ 2. You must move the variables you want analyzed from the left field to the right field. To do so, click on the variable name, then click the RIGHT ARROW button in the middle of the screen. Do this for all three of the variables.

_____ 3. Now you must choose which statistics you want. Click the Statistics button at the bottom of the box.

_____ 4. When the statistics box opens, you can choose the statistics you want. For this exercise, make the box look like this:

_____ 5. Click the Continue button.

_____ 6. SPSS allows you to also order charts and graphs if you want. When you are returned to the previous box, click the Charts button, and when the Charts box appears, choose Histograms and With normal curve, then click the Continue button.

_____ 7. You will be returned to the original Frequencies box. Click OK to continue. The analysis will run, and you will be shown the output file on the screen.

You will no longer see your data, but don't worry. The file with the data is still there in memory, and you will see an icon for it on the bottom of your screen.

For now, let's concentrate on the output of the FREQUENCIES run. The first thing in the output file will be the statistics you ordered. They should look like this:

Much of the output above has little meaning EXCEPT to help you catch any data entry errors you have made when you typed in the data. For example, notice that for all three variables, there is no MISSING DATA. That is good, and means that there is a number typed in for all three variables and for all 40 subjects. Any missing data would be an error.

Notice there are 40 cases for all three variables. That is also as it should be.

Look at the means for each of the three variables. They also make sense. For example, the mean of the group variable is 1.5. This is correct, since there are 20 ones in this variable, and 20 twos. Therefore, if the mean had been anything except 1.5, we would have known there was an error in this variable.

Look at the minimum and maximum scores for each variable. They also make sense. Ditto for the mode and median of each variable. The range is also exactly what we would expect.
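If you would like to confirm these sanity checks by hand, the sketch below recomputes a few of the statistics with Python's standard `statistics` module. It is only a cross-check under the assumption that the data were entered exactly as listed in this tutorial, and it is not a reproduction of the SPSS output:

```python
import statistics

# Scores exactly as listed in the tutorial.
EXPERIMENTAL = [80, 83, 90, 87, 80, 83, 91, 90, 80, 83,
                85, 87, 89, 80, 91, 88, 83, 90, 85, 89]
CONTROL = [50, 55, 53, 55, 50, 60, 49, 50, 51, 52,
           55, 56, 60, 63, 65, 78, 54, 50, 61, 49]

ids = list(range(1, 41))          # the id variable
test = EXPERIMENTAL + CONTROL     # the test variable
group = [1] * 20 + [2] * 20       # the group variable

# Twenty 1s and twenty 2s must average exactly 1.5.
print("mean of group:", statistics.mean(group))
print("mean of id:", statistics.mean(ids))

# Summary of the test scores.
print("test min:", min(test), "max:", max(test),
      "range:", max(test) - min(test))
print("test mean:", statistics.mean(test))
print("test median:", statistics.median(test))

# Several scores tie for most frequent, so list every mode.
print("test modes:", sorted(statistics.multimode(test)))
```

Note that three scores tie as the most frequent value of the test variable, so a printout that shows only a single mode is reporting just one of them.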


Now, look at the next three boxes in the output.

These are FREQUENCY DISTRIBUTIONS for each of the three variables. This is the most important part of the printout, and getting frequency distributions is the main reason we use the SPSS FREQUENCIES analysis. You will recall that a frequency distribution lists each earned value for a variable, and the number of cases that have that value.

The first box on your printout is the frequency distribution for the id variable. Since we entered one unique number for each case, we should see a frequency of 1 for every value. If not, we will know we made a mistake. Both percent and valid percent statistics are included. The percent is the number of cases earning that value, expressed as a percentage of the total sample. So, the formula is the number earning that value divided by the total N, multiplied by 100. In this case, that should be (1/40) x 100 for every id number. (NOTE - We multiply by 100 to convert the proportion to a percentage.) So, there should be 2.5 listed as the percent for each id number, and that is what we find.

The valid percent would be the percentage of the valid cases - that is, the N would be only those cases for which there is an id number, but not counting any cases that we did not assign an id number to. In other words, the missing cases are not included in the calculation. In this case, there were no missing cases, so the percent and valid percent are exactly the same.

The cumulative percent for a row is the running total of the valid percent for that row and all the rows above it. The last row should therefore always show 100.
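The three percentages can be computed by hand in a few lines. The Python sketch below is one way to do it, assuming that a missing value is stored as `None` and that the cumulative percent is accumulated over the valid cases. The function name `frequency_table` is hypothetical:

```python
from collections import Counter


def frequency_table(values):
    """Build one row per distinct value; None is treated as missing."""
    total = len(values)
    valid = [v for v in values if v is not None]
    if not valid:
        return []
    counts = Counter(valid)
    rows, running = [], 0
    for value in sorted(counts):
        n = counts[value]
        running += n
        rows.append({
            "value": value,
            "frequency": n,
            # Percent uses the total N, including missing cases.
            "percent": 100 * n / total,
            # Valid percent leaves the missing cases out of N.
            "valid_percent": 100 * n / len(valid),
            # Cumulative percent is a running total over valid cases.
            "cumulative_percent": 100 * running / len(valid),
        })
    return rows


# The id variable: one case for each value from 1 to 40.
table = frequency_table(list(range(1, 41)))
print(table[0])
print(table[-1])

# A made-up column with one missing case, to show the two percents differ.
with_missing = frequency_table([1, 1, 2, None])
print(with_missing[0])
```

With no missing cases, percent and valid percent agree on every row. In the made-up four-case column, the value 1 accounts for 50 percent of all cases but about 66.7 percent of the valid ones.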

The next box is the frequency distribution for the test variable, which has far more meaning for this study. It shows how many students earned each unique score. As you can see, 3 people earned a score of 55.

The next box is the frequency distribution for the group variable. You can also see at a glance that there were no errors in entry on this variable.
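The counts in these two frequency distributions can be double-checked with a short Python sketch. It assumes the data were typed in exactly as listed in this tutorial, and the variable names are chosen only for illustration:

```python
from collections import Counter

# Scores exactly as listed in the tutorial.
EXPERIMENTAL = [80, 83, 90, 87, 80, 83, 91, 90, 80, 83,
                85, 87, 89, 80, 91, 88, 83, 90, 85, 89]
CONTROL = [50, 55, 53, 55, 50, 60, 49, 50, 51, 52,
           55, 56, 60, 63, 65, 78, 54, 50, 61, 49]

test = EXPERIMENTAL + CONTROL
group = [1] * len(EXPERIMENTAL) + [2] * len(CONTROL)

# Frequency distribution for each variable.
test_counts = Counter(test)
group_counts = Counter(group)

# List each earned score with the number of cases that have it.
for score in sorted(test_counts):
    print(f"{score:3d}  {test_counts[score]}")

print("score of 55:", test_counts[55], "cases")
print("group counts:", dict(group_counts))
```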


The next part of the output file is a HISTOGRAM for each of the variables. As you can see, histograms for the id and the group variables are relatively meaningless, and thus can be used for nothing more than finding errors. The histogram for the test variable is a little more meaningful, and lets you see the shape of the distribution. SPSS has picked the intervals to place on the histogram. Later, you will see how to specify these intervals for yourself.
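To get a feel for what choosing the intervals means, here is a rough text histogram of the test variable in Python. The interval width of 10 is an assumption of this sketch; it is not the width SPSS picks:

```python
from collections import Counter

# Scores exactly as listed in the tutorial.
EXPERIMENTAL = [80, 83, 90, 87, 80, 83, 91, 90, 80, 83,
                85, 87, 89, 80, 91, 88, 83, 90, 85, 89]
CONTROL = [50, 55, 53, 55, 50, 60, 49, 50, 51, 52,
           55, 56, 60, 63, 65, 78, 54, 50, 61, 49]
test = EXPERIMENTAL + CONTROL

BIN_WIDTH = 10  # assumed interval width; SPSS chooses its own

# Map each score to the lower bound of its interval, e.g. 83 -> 80.
bins = Counter((score // BIN_WIDTH) * BIN_WIDTH for score in test)

# Print one bar of asterisks per interval.
for lower in sorted(bins):
    upper = lower + BIN_WIDTH - 1
    print(f"{lower:3d}-{upper:3d} | {'*' * bins[lower]} ({bins[lower]})")
```

With this width the scores fall into two clusters, one in the 50s and one in the 80s, which reflects the gap between the control and experimental groups. A different width would change the bars but not the data.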


As you can probably guess, it is also possible to save an SPSS output file. Such files must have a .spo extension. For this exercise, save the output on your diskette. Call the file spsswinfreq.spo. Do that now.


NOTE

As you can see, FREQUENCIES is a good way to check for errors.

Now, please return to the data view screen and obtain a DESCRIPTIVES run on this data. If you have forgotten how to do so, review the previous tutorial, which took you step by step through this process.


I would like for you to turn in the following:

a. printout of the data file
b. printout of the FREQUENCIES output file
c. printout of the DESCRIPTIVES output file

Staple these three things together with the data file printout on top. Be sure your name is on everything. LABEL the top sheet under your name "spsswinfreq assignment."



 University of Nevada, Reno
Please direct questions to: maddux@unr.edu
URL of this document: http://unr.edu:80/homepage/maddux/stat/cep740/spsswinfreq.html