Professor

The University of Nevada, Reno

*In this tutorial, you will learn how to use SPSS for WINDOWS to obtain
a FREQUENCIES run.*

An SPSS FREQUENCIES run is one of the first things you will want to do with any new research data after you have gathered it and constructed a data file in SPSS. This run will list all the obtained values for every variable along with the numbers of subjects who were assigned that value.

For example, if you have a variable named "group" that is coded 1 for experimental group and 2 for control group, a FREQUENCIES run will show you how many subjects are designated with a 1 in this category and how many are designated with a 2. (However, it will NOT calculate statistics such as means for each of these groups. To do that, you will have to use a procedure such as the MEANS command.)
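The tallying idea can be illustrated outside SPSS. The following is a minimal Python sketch, not SPSS syntax; the list `group_codes` is hypothetical data made up for illustration.

```python
from collections import Counter

# Hypothetical codes for ten subjects: 1 = experimental, 2 = control.
group_codes = [1, 1, 2, 1, 2, 2, 1, 2, 1, 2]

# Count how many subjects carry each code, much as a frequency table does.
frequencies = Counter(group_codes)

for value in sorted(frequencies):
    print(f"value {value}: {frequencies[value]} subjects")
```

Note that the tally only counts cases per code; it says nothing about the mean score within each group.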

FREQUENCIES is useful when you are at the beginning of your analyses, and you are trying to find any mistakes you have made in your data entry. Since you will know how many were in each group, this run will be a good way for you to check on the accuracy of the way you entered the data. If you know you had five in each group, but the FREQUENCIES run shows 5 in the experimental group, 4 in the control group, and one "missing case," you will know that you made a mistake in your data entry. The same process will help you double-check every variable.
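The same check can be sketched in Python. This is only an illustration under stated assumptions: the entries in `entered_group` are hypothetical, and `None` stands in for a cell that was accidentally left blank.

```python
from collections import Counter

# Hypothetical entries for ten subjects; None marks a cell left blank by mistake.
entered_group = [1, 1, 1, 1, 1, 2, 2, None, 2, 2]

# What the researcher knows the design should contain: five per group.
expected = {1: 5, 2: 5}

counts = Counter(entered_group)
missing = counts.pop(None, 0)  # blank cells are reported separately

# Collect every code whose observed count differs from the expected count.
problems = {
    value: (counts.get(value, 0), n)
    for value, n in expected.items()
    if counts.get(value, 0) != n
}

print("missing cases:", missing)
for value, (observed, wanted) in problems.items():
    print(f"group {value}: found {observed}, expected {wanted}")
```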

Sometimes you might not want to include every variable in your data in a FREQUENCIES run, however. For example, if you have several thousand cases with data on a continuous variable that ranges from 0 to 1000, you might not want to include that variable in the run, since the printout for that variable alone will take up many pages. (It will show each obtained value from 0 to 1000 and how many cases earned that value.)

Before beginning this tutorial, you might want to review the last SPSS FOR WINDOWS tutorial on obtaining a DESCRIPTIVES run.

The first thing you should do is make an SPSS data file. Here is the data to use:

Suppose you have scores on 40 students (20 in an experimental group, and 20 in a control group). Here are the scores:

EXPERIMENTAL: 80, 83, 90, 87, 80, 83, 91, 90, 80, 83, 85, 87, 89, 80, 91, 88, 83, 90, 85, 89

CONTROL: 50, 55, 53, 55, 50, 60, 49, 50, 51, 52, 55, 56, 60, 63, 65, 78, 54, 50, 61, 49

_____ 1. Begin by starting **SPSS FOR WINDOWS**.

_____ 2. If asked, choose the option to "**Type in data.**"

Remember that we will use the traditional convention of making each **ROW**
an *individual subject*, and each **COLUMN** a *variable*.
_____ 3. We will first name the variables. Click on the **Variable View**
tab in the lower left of the screen. (These are directions for *SPSS version
10 and up*. For earlier versions, variables are named by first placing
the cursor at the top of the column to be named, and then clicking on the
For this data, we will need **three** variables (**columns**):

a. One will contain an identification number from 1 to 40, and should be named
**id**.

b. The next will contain all 40 scores, and should be named **test**.

c. The last variable will contain a code number defining the group the subject
is in, and should be named **group**.
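The three-column layout described above can be pictured with a short Python sketch. It is an illustration only, not part of SPSS; the dictionary-per-row representation and the name `rows` are assumptions chosen for clarity, and the group codes follow the 1 = experimental, 2 = control convention used in this tutorial.

```python
experimental = [80, 83, 90, 87, 80, 83, 91, 90, 80, 83,
                85, 87, 89, 80, 91, 88, 83, 90, 85, 89]
control = [50, 55, 53, 55, 50, 60, 49, 50, 51, 52,
           55, 56, 60, 63, 65, 78, 54, 50, 61, 49]

# One dict per subject (row); the keys are the variable names (columns).
rows = []
for score in experimental:
    rows.append({"id": len(rows) + 1, "test": score, "group": 1})
for score in control:
    rows.append({"id": len(rows) + 1, "test": score, "group": 2})

print(rows[0])    # first experimental subject
print(rows[-1])   # last control subject
```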

_____ 4. When in the **Variable View** screen, put the cursor in the first
cell in the column labeled **Name** and type **id**. You will notice that
SPSS will fill in some of the other cells in that row. This is normal.

_____ 5. Put the cursor in the second cell under **Name** and type **test**.

_____ 6. Put the cursor in the third cell under **Name** and type **group**.

When you are finished naming the variables, the first few rows and columns of the screen should look like this:

Look at the **group** variable above. This is actually a "dummy
variable" that we set up simply to tell us whether a score is an experimental
group score or a control group score. We can tell **SPSS** how to handle
that so that this designation will appear on printouts:

_____ 7. Click the cursor once in the row labeled **group**,
the column headed **Values**. You will see a small gray box with three periods
in it appear in that cell.

_____ 8. Click once on the gray box. A **Value Labels** box
will open and should look like this:

**SPSS Value Labels** apply to *categorical* variables,
and allow you to specify which category is designated by each possible value
of that variable. So, we will want to tell **SPSS** that a **1** designates the
experimental group and a **2** designates the control group.

_____ 9. Put the cursor in the blank field next to **Value**
and type a **1**.

_____ 10. Put the cursor in the blank field next to **Value Label**
and type **experimental group.**

_____ 11. Click the **Add** button once to record your **value
labels** for that value.

_____ 12. Put the cursor in the blank field next to **Value**
and type a **2**.

_____ 13. Put the cursor in the blank field next to **Value Label**
and type **control group.**

_____ 14. Click the **Add** button once to record your **value
labels** for that value.

When you have finished step #14 above, the box should look like this:

_____ 15. Click the **OK** button once.

_____ 16. Click the **Data View** tab at the bottom left of
the screen to return to the data. You will see that the three columns now contain
the variable names you entered (**id**, **test**, and **group**).
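Conceptually, value labels are a lookup from a stored code to a readable name. The Python sketch below illustrates that idea only; it is not how SPSS stores labels internally, and the names `value_labels` and `group_codes` are hypothetical.

```python
# Assumed mapping, mirroring the value labels for the group variable.
value_labels = {1: "experimental group", 2: "control group"}

# Codes as they would be stored in the data file: twenty 1s, then twenty 2s.
group_codes = [1] * 20 + [2] * 20

# Printouts show the label, while the stored data keep the numeric code.
labelled = [value_labels[code] for code in group_codes]

print(labelled[0])
print(labelled[-1])
```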

Now, you are ready to enter the scores. The easiest way to do so is probably
to enter one entire variable (column) at a time, beginning with the **id**
variable.

_____ 17. While in the **Data View** screen, put the cursor in the first
blank cell under **id**. Now type in a **1**, press **ENTER** on your
keyboard, type a **2**, press **ENTER** on your keyboard, and continue
until you have typed in the numbers from **1** through **40**.

_____ 18. Now, we will enter the values in the **test** variable. Remember,
these are the actual scores earned. We will enter the experimental group scores
as *the first 20 rows*, and the control group scores in *the last 20 rows*.
_____ 19. Now, we will enter the values in the **group** variable. Remember
that these will be a **1** if the score in that row is an *experimental*
group score, and a **2** if it is a *control* group score.
If you have followed directions, here is what the first few rows should look like when you are finished:

At this point, you could elect to save this data in an **SPSS**
**data file** (such files must have a **.sav** extension). This is important
if you will be doing further analyses later, so that you don't have to re-enter
the data. For this exercise, you *should* save the file on your
diskette. Call it

_____ 20. As usual in a **WINDOWS** program, save the file
by clicking on **File** in the menu line at the top of the screen,
migrate to the

You now have the data successfully **entered**, the variables
**named**, the value labels **defined**, and the entire thing **saved**
on a diskette. You are now ready to run any analyses. You will begin with a
**FREQUENCIES** run:

_____ 1. While in the data view screen, click once on **Analyze**
in the menu line at the top of the screen, then

A Frequencies box will open and should look like this:

_____ 2. You must move the variables you want analyzed from the
*left field* to the

_____ 3. Now you must choose which statistics you want. Click
the **Statistics** button at the bottom of the box.

_____ 4. When the statistics box opens, you can choose the statistics you want. For this exercise, make the box look like this:

_____ 5. Click the **Continue** button.

_____ 6. **SPSS** allows you to also order charts and graphs
if you want. When you are returned to the previous box, click the **Charts**
button, and when the **Charts box** appears, choose **Histograms** and
**With normal curve**, then click the **Continue** button.

_____ 7. You will be returned to the original **Frequencies box**.
Click **OK** to continue. The analysis will run, and you will be shown the
output file on the screen.

You will no longer see your data, but don't worry. The file with the data is still there in memory, and you will see an icon for it at the bottom of your screen.

For now, let's concentrate on the
output of the **FREQUENCIES** run. The first thing in the output file will
be the statistics you ordered. They should look like this:

Much of the output above has little meaning *EXCEPT*
to help you catch any data entry errors you have made when you typed in the
data. For example, notice that for all three variables, there is

Notice there are **40** cases for all **three variables**.
That is also as it should be.

Look at the **means** for each of the three variables. They
also make sense. For example, the **mean** of the **group** variable is
**1.5**. This is correct, since there are **20 ones** in this variable,
and **20 twos**. Therefore, if the **mean** had been anything *except* **1.5**, we would have known there was an error in this variable.

Look at the **minimum** and **maximum** scores for each
variable. They also make sense. Ditto for the **mode** and **median**
of each variable. The **range** is also exactly what we would expect.
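These sanity checks can be reproduced by hand or with a few lines of code. The sketch below uses Python's standard `statistics` module rather than SPSS; the dictionary names `variables` and `summary` are assumptions made for illustration. The mode is left out because the test scores have several equally frequent values.

```python
import statistics

experimental = [80, 83, 90, 87, 80, 83, 91, 90, 80, 83,
                85, 87, 89, 80, 91, 88, 83, 90, 85, 89]
control = [50, 55, 53, 55, 50, 60, 49, 50, 51, 52,
           55, 56, 60, 63, 65, 78, 54, 50, 61, 49]

variables = {
    "id": list(range(1, 41)),
    "test": experimental + control,
    "group": [1] * 20 + [2] * 20,
}

# Compute the same summary figures that serve as error checks in the text.
summary = {}
for name, values in variables.items():
    summary[name] = {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
    }
    print(name, summary[name])
```

If the **group** mean printed here were anything other than 1.5, or any **n** were not 40, that would point to a typing error in the lists.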

Now, look at the next **three** boxes in the output.

These are **FREQUENCY DISTRIBUTIONS** for each of the three
variables. This is the most important part of the printout, and getting frequency
distributions is the main reason we use the **SPSS FREQUENCIES** analysis.
You will recall that a frequency distribution lists each earned value for a
variable, and the number of cases that have that value.

The first box on your printout is the frequency distribution for
the **id** variable. Since we entered one unique number for each case, we
should see a **1** entered for every value. If not, we will know we made
a mistake. There are both **percent** and **valid percent** statistics
included. The **percent** is the percent that number of cases is of the total
sample. So, the formula is **the number earning that value** *divided*
by the **total number of cases** in the sample (multiplied by 100).

The valid percent would be the percentage of the **valid cases**;
that is, the **N** would be only those cases for which there is an **id**
number, but not counting any cases that we did *not* assign an id
number to. In other words, the

The cumulative percent is the percentage of cases in that row plus all previous rows.
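The three percentage columns can be worked through in a short Python sketch. This is an illustration under stated assumptions, not SPSS itself: the function name `frequency_table` and the sample data are hypothetical, `None` stands for a missing value, and the cumulative column is assumed to accumulate the valid percents.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (value, count, percent, valid percent, cumulative percent)."""
    total = len(values)                            # every case, missing or not
    valid = [v for v in values if v is not None]   # cases that have a value
    counts = Counter(valid)
    rows, cumulative = [], 0.0
    for value in sorted(counts):
        percent = 100 * counts[value] / total
        valid_percent = 100 * counts[value] / len(valid)
        cumulative += valid_percent   # assumption: running total of valid percents
        rows.append((value, counts[value], percent, valid_percent, cumulative))
    return rows

# Hypothetical group codes for ten cases, one of them left blank (None).
table = frequency_table([1, 1, 1, 1, 1, 2, 2, 2, 2, None])
for value, count, percent, valid_percent, cumulative in table:
    print(f"{value}: n={count} percent={percent:.1f} "
          f"valid={valid_percent:.1f} cumulative={cumulative:.1f}")
```

With no missing values, as in the tutorial data, the percent and valid percent columns are identical.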

The next box is the frequency distribution for the **test**
variable, which has far more meaning for this study. It shows how many students
earned each unique grade. As you can see, 3 people earned a score of 55.
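The count for any single score can be cross-checked against the raw data. Here is a minimal Python sketch of that tally, offered as an illustration rather than as SPSS output; `test_counts` is a name chosen for this example.

```python
from collections import Counter

experimental = [80, 83, 90, 87, 80, 83, 91, 90, 80, 83,
                85, 87, 89, 80, 91, 88, 83, 90, 85, 89]
control = [50, 55, 53, 55, 50, 60, 49, 50, 51, 52,
           55, 56, 60, 63, 65, 78, 54, 50, 61, 49]

# Tally how many students earned each distinct score.
test_counts = Counter(experimental + control)

for score in sorted(test_counts):
    print(f"{score:>3}  {test_counts[score]}")
```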

The next box is the frequency distribution for the **group**
variable. You can also see at a glance that there were no errors in entry on
this variable.

The next part of the output file is a **HISTOGRAM** for each of the variables.
As you can see, **histograms** for the **id** and the **group** variables
are relatively meaningless, and thus can be used for nothing more than finding
errors. The **histogram** for the **test** variable is a little more meaningful,
and lets you see the shape of the distribution. **SPSS** has picked the intervals
to place on the **histogram**. Later, you will see how to specify these intervals
for yourself.
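To see what grouping scores into intervals means, here is a rough text histogram in Python. It is only a sketch: the interval width of 5 in `bin_width` is an arbitrary assumption and is not necessarily what SPSS would choose for this data.

```python
from collections import Counter

scores = [80, 83, 90, 87, 80, 83, 91, 90, 80, 83,
          85, 87, 89, 80, 91, 88, 83, 90, 85, 89,
          50, 55, 53, 55, 50, 60, 49, 50, 51, 52,
          55, 56, 60, 63, 65, 78, 54, 50, 61, 49]

bin_width = 5  # assumed interval width, chosen only for illustration

# Map each score to the lower edge of its interval, then count per interval.
bins = Counter((score // bin_width) * bin_width for score in scores)

# Print one row of asterisks per interval, including empty intervals.
for start in range(min(bins), max(bins) + bin_width, bin_width):
    print(f"{start}-{start + bin_width - 1}: {'*' * bins[start]}")
```

The two clusters of asterisks, one in the 50s and one in the 80s, correspond to the control and experimental groups.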

As you can probably guess, it is also possible to **save**
an **SPSS output file**. Such files must have a **.spo** extension. For
this exercise, **save** the output on your diskette. Call the file **spsswinfreq.spo**.
Do that now.

As you can see, **FREQUENCIES** is a good way to check for errors.

Now, please return to the data view screen and run a **DESCRIPTIVES** run
on this data. If you have forgotten how to do so, review the previous tutorial
which took you step by step through this process.


*STATISTICS
COURSES TAUGHT BY CLEB MADDUX PAGE.*
