Transposing Data: Multiple Time Points

If you’re interested in longitudinal analysis, chances are that you have or someday will come across a data set that looks like this: many participants who each have data at multiple, discrete time points.

[Screenshot: the original wide-form data set]

This data set is a fairly simple example. For each participant, we have a) their age at the 5 different times and b) their score on a measure of delinquency at the 5 different times. But for most longitudinal data analyses, you’ll want to rearrange the data. Specifically, for this example, I need to know delinquency at each age, for each participant.

Part One

1. In SPSS, go to the “Data” menu and select “Restructure.”

[Screenshot: “Restructure” in the “Data” menu]

This will open the “Restructure Data Wizard.”

2. Select “Restructure Selected Variables into Cases”

The first step in restructuring this data is to create a “long form,” meaning that for each participant or ID number, you will have multiple lines of data.

[Screenshot: choosing “Restructure Selected Variables into Cases”]

The easiest way to remember this: You’re trying to make your data set look longer, so select the picture depicting a change from wide to long.

For those of you who are less visual, remember that we currently have multiple variables that actually represent different data collection points for the same person. Specifically, “age1,” “age2,” “age3,” and so on are different testing occasions for that participant.

3. How many groups of variables need to be created from the existing variables?

The wizard asks how many variable groups you want to create.
In this case, it’s pretty simple to figure this out.

Currently, we have two “variables” (i.e., age and delinquency) that were each collected at 5 different times. So we want to create two new “groups” to represent those variables. You will define those groups in the next step.
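
If it helps to see the counting spelled out, here is a minimal Python sketch (not something SPSS runs) that groups column names by their stem. The column names, including del1 through del5 for the delinquency scores, are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical column names for the wide-form file (assumed for illustration).
columns = ["id", "sex", "race",
           "age1", "age2", "age3", "age4", "age5",
           "del1", "del2", "del3", "del4", "del5"]

# Group every column that looks like <stem><number> under its stem.
groups = defaultdict(list)
for name in columns:
    match = re.fullmatch(r"([A-Za-z_]+)(\d+)", name)
    if match:
        groups[match.group(1)].append(name)

# Each stem is one "variable group" in the wizard's terms.
print(len(groups), dict(groups))
```

Two stems (age and del) means two variable groups, which is the number to enter in the wizard.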

[Screenshot: choosing the number of variable groups]

4. Specify how the data should be rearranged.

The first step here is to designate your case group identifier.

Again, this is relatively simple once you conceptualize it. We are creating multiple lines of data for each participant. So what needs to be repeated on each line? The participant ID. 

  • Under case group identification, choose “use selected variable” from the drop-down menu.
  • In our example, we are using Case Number (id) 

The wizard also allows you to designate variables that are constant, regardless of time point. In our example, sex and race do not change no matter how old the participant is.

  • Move sex and race to the “Fixed Variable(s)” position.

[Screenshot: case group identifier and fixed variables]

The box called “Variables to be Transposed” is where you will define your two variable groups; in this case, that means age and delinquency.

[Screenshot: the “Variables to be Transposed” box]

  • Move all variables representing age to the “Variables to be Transposed” area in the wizard (see above)
  • Next to Target Variable, where it originally says ‘trans1,’ name the variable “age.” This will be the name of the outcome variable. 

[Screenshot: naming the target variable “age”]

In the Target Variable drop-down menu, move to the second variable group (the default name is ‘trans2’). Name this new variable “del,” short for delinquency, and move the variables for delinquency over into the box.

You should now be able to click “next.”

5. The wizard will then ask you to create an index variable.

[Screenshot: choosing the number of index variables]

Here, you select how many index variables you want. An index variable is one the system generates to number the cases it has made for each participant (here, 1 through 5, one per time point). In this particular example, we just use one.

6. There are 3 more steps (out of 7) in the wizard.

Feel free to click through the remaining steps. For a simple transformation like this, you shouldn’t need to change anything from the default settings, so you can just click ‘Finish’.

This is the resulting data set.

[Screenshot: the resulting long-form data set]

Here you can see that each participant has a distinct line of data for each age at which they contributed data. You can also see the index variable that we generated, called “Index1.”
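
For anyone who wants to see the logic of Part One outside the wizard, here is a rough Python sketch of the same wide-to-long rearrangement. It is only an illustration, not what SPSS does internally: the data values are made up, and the source column names (age1 through age5, del1 through del5) are assumed.

```python
# Made-up wide-form data: one row per participant, five time points.
wide = [
    {"id": 1, "sex": "F", "race": 1,
     "age1": 10, "age2": 11, "age3": 12, "age4": 13, "age5": 14,
     "del1": 0, "del2": 1, "del3": 1, "del4": 2, "del5": 4},
    {"id": 2, "sex": "M", "race": 2,
     "age1": 11, "age2": 12, "age3": 13, "age4": 14, "age5": 15,
     "del1": 2, "del2": 2, "del3": 3, "del4": 1, "del5": 0},
]

FIXED = ["sex", "race"]    # constant across time points
GROUPS = ["age", "del"]    # one target variable per variable group
N_TIMES = 5                # number of time points (assumed)


def wide_to_long(rows):
    """Return one row per participant per time point."""
    long_rows = []
    for row in rows:
        for t in range(1, N_TIMES + 1):
            new_row = {"id": row["id"]}                 # case group identifier
            new_row.update({name: row[name] for name in FIXED})
            new_row["Index1"] = t                       # generated index variable
            for stem in GROUPS:
                new_row[stem] = row[f"{stem}{t}"]       # e.g. age3 -> age
            long_rows.append(new_row)
    return long_rows


long_data = wide_to_long(wide)
for r in long_data[:3]:
    print(r)
```

Two participants with five time points each give ten long-form rows, each carrying the id, the fixed variables, the index, and one age/del pair.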

However, the job doesn’t end here. The next step is switching the data set again. In fact, we are taking this long-form data and transposing it back to wide-form data, this time with one delinquency variable per age rather than per time point.


Part Two

1. Delete the index variable.

This is not “necessary,” but neither is the variable…so you may as well get it out of the way so it doesn’t accidentally cause you problems.

[Screenshot: deleting the index variable]
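
In code terms, deleting the index variable just means dropping one column. A tiny Python sketch, with made-up rows and the wizard's default name “Index1”:

```python
# Made-up long-form rows that still carry the generated index variable.
long_rows = [
    {"id": 1, "Index1": 1, "age": 10, "del": 0},
    {"id": 1, "Index1": 2, "age": 11, "del": 1},
]

# Keep every column except Index1.
cleaned = [{k: v for k, v in row.items() if k != "Index1"} for row in long_rows]
print(cleaned[0])
```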

2. Return to the ‘Restructure Data Wizard.’ 

If you need a refresher on where this is, return to Step 1 in Part One.

3. This time, select “Restructure selected cases into variables.”

Remember: we are going from long-form to wide-form this time, so pick the picture that shows that.

In non-visual terms…We have now created a data set with multiple cases for each participant, and each case has a value for age and delinquency. Now, we want to create variables that are “delinquency at [insert age].” So we need to transform these cases into variables.

[Screenshot: choosing “Restructure selected cases into variables”]

4. Choose your identifier and index variables.

Your identifier is still participant number (i.e., ID).

Recall that our goal for analyses is to create variables such as “delinquency at age 10,” “delinquency at age 11,” “delinquency at age 12,” and so on. Thus, we are indexing delinquency by participant age.

[Screenshot: choosing the identifier and index variables]

5. There are 3 more steps in the wizard, but you can skip these and just click ‘Finish.’

Once again, the defaults are pretty standard for the rest of the transformation, so you shouldn’t need to change anything.

This is the final data set.

Now we have a data set where there is only one line (i.e., case) per participant, with variables representing delinquency at each relevant age.

[Screenshot: the final wide-form data set]
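
The logic of Part Two can also be sketched in Python. Again, this is only an illustration with made-up data; the output names (del_10, del_11, and so on) are chosen for this example and are not necessarily what SPSS will call the new variables.

```python
# Made-up long-form rows with the index variable already deleted.
long_rows = [
    {"id": 1, "sex": "F", "age": 10, "del": 0},
    {"id": 1, "sex": "F", "age": 11, "del": 1},
    {"id": 1, "sex": "F", "age": 12, "del": 1},
    {"id": 2, "sex": "M", "age": 11, "del": 2},
    {"id": 2, "sex": "M", "age": 12, "del": 2},
    {"id": 2, "sex": "M", "age": 13, "del": 3},
]


def long_to_wide(rows, id_var="id", index_var="age", value_var="del"):
    """Return one row per participant with one value column per age."""
    ages = sorted({r[index_var] for r in rows})
    wide = {}
    for r in rows:
        out = wide.get(r[id_var])
        if out is None:
            # Start the participant's row with the identifier and fixed
            # variables, plus an empty (None) slot for every age in the data.
            out = {k: v for k, v in r.items() if k not in (index_var, value_var)}
            out.update({f"{value_var}_{a}": None for a in ages})
            wide[r[id_var]] = out
        out[f"{value_var}_{r[index_var]}"] = r[value_var]
    return list(wide.values())


for row in long_to_wide(long_rows):
    print(row)
```

Notice that a participant who was never measured at a given age ends up with an empty value for that age, which is what you should expect when participants start the study at different ages.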
