
Stata Tutorials

Data analysts typically spend much more time doing data management tasks of various types (e.g., creating or re-coding variables, merging or restructuring files, etc.) than they do performing statistical analyses per se.  In order to simulate real world conditions, the Stata assignment for this course will require you to carry out some data management tasks before you analyze the data and interpret the results.

In order to prepare yourself for the Stata assignment, you are strongly encouraged to work through the following Stata tutorials during the first 5 weeks of the term.  These tutorials will not be graded.  However, the data management skills you learn by doing them will be required for successful completion of the assignment.  They will also serve as a nice little library of examples for you, should you continue to use Stata after the course is done.

In the real world, it is vital for data analysts to document their work.  One important way to do that is by using syntax files, which are basically computer programs that record all of the commands issued to the statistical software.  In Stata, those syntax files are called DO-files.  You will be introduced to DO-files in Week 2.  For the text-based parts of Weeks 2-5, you should create DO-files that contain both your Stata commands and explanatory comments.  This will be excellent practice for the assignment.
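As a rough illustration of what "commands plus explanatory comments" can look like, here is a minimal sketch of a DO-file.  The file name, the variable heavy and the 3,000 lb cut-off are hypothetical choices made only for this example; auto is one of the example datasets that ships with Stata.

```stata
* week2_practice.do  (hypothetical file name)
* Purpose: show Stata commands alongside explanatory comments

sysuse auto, clear      // load an example dataset that ships with Stata

* Create an indicator variable from a continuous one
* (the 3,000 lb cut-off is arbitrary, for illustration only)
generate heavy = weight > 3000 if !missing(weight)
label variable heavy "Weight over 3,000 lbs"

* Cross-tabulate the new indicator against car origin
tabulate heavy foreign
```

Lines that begin with * are comments, and anything after // on a command line is also treated as a comment in a DO-file.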

Week 1:  Introduction to Stata

Week 2:  Using DO-files (i.e., syntax files)

Week 3:  Some basic data management tasks

Week 4:  UCLA Tutorials on aggregating & merging data files

  • Collapsing data across observations (i.e., aggregating over rows)
  • Combining (or merging) Stata data files
    • Please note that the merge command has been updated since this tutorial was written.  The merge commands shown in the tutorial still work, but they also generate the following note:  (note: you are using old merge syntax; see [D] merge for new syntax).  The difference is that the new merge syntax includes 1:1, 1:m, m:1 or m:m after merge to indicate the type of merge.  I suggest that you consult the help for merge (i.e., type help merge in the Command window), and then try using the new merge syntax in your DO-file for this tutorial.
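To give a feel for the new merge syntax, here is a minimal sketch of a one-to-one merge.  The two small datasets are hypothetical toy data typed in for this example (they are not the files used in the UCLA tutorial), and the temporary file name faminc is likewise just an illustration.

```stata
* Hypothetical toy data: one row per family with family income
clear
input famid faminc
1 40000
2 45000
3 75000
end
tempfile faminc
save `faminc'

* Hypothetical toy data: one row per family with a name
clear
input famid str4 name
1 "Bill"
2 "Art"
3 "Paul"
end

* New syntax: 1:1 says famid uniquely identifies rows in both files
merge 1:1 famid using `faminc'

* _merge records whether each row came from master, using, or both
tabulate _merge
```

Because every famid appears exactly once in each file, all rows should be matched.  If one file had several rows per family, m:1 or 1:m would be the appropriate choice instead of 1:1.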

Week 5:  UCLA Tutorials on reshaping (or restructuring) Stata data files

  • From WIDE to LONG
  • From LONG to WIDE
    • Please note that the use kids, clear command in Examples 1 and 2 will only work if you have the kids.dta data file saved in your working directory from an earlier tutorial.  It would be much safer to use this command instead:  use http://www.ats.ucla.edu/stat/stata/modules/kids, clear
    • For Example 3, the use dadmom1, clear command will probably not work, because the file dadmom1.dta probably will not exist in your working directory.  In order to make Example 3 work properly, replace that use command with the following code:

clear
input famid str4 name inc str3 dadmom
2  "Art"   22000   "dad"
1  "Bill"  30000   "dad"
3  "Paul"  25000   "dad"
1  "Bess"  15000   "mom"
3  "Pat"   50000   "mom"
2  "Amy"   18000   "mom"
end
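For reference, here is a minimal sketch of how these data might then be reshaped from long to wide.  It assumes the goal is one row per family with separate name and income variables for dad and mom; the exact reshape command used in the UCLA example may differ, so treat this only as an illustration.

```stata
* Enter the long-format data (one row per parent)
clear
input famid str4 name inc str3 dadmom
2  "Art"   22000   "dad"
1  "Bill"  30000   "dad"
3  "Paul"  25000   "dad"
1  "Bess"  15000   "mom"
3  "Pat"   50000   "mom"
2  "Amy"   18000   "mom"
end

* Reshape to one row per family; the string option is needed
* because dadmom holds text values ("dad", "mom") rather than numbers
reshape wide name inc, i(famid) j(dadmom) string

list
```

After the reshape, the values of dadmom become suffixes on the variable names, giving namedad, incdad, namemom and incmom.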

If you have questions about the tutorials, please post them to the class discussion forum. 

Many of the preceding text-based tutorials are from this UCLA website.  You may also find UCLA's Stata FAQ page helpful. 

StataCorp's video tutorials can be found via this web-page, or via StataCorp's YouTube Channel.  You can find links to other video and text-based tutorials on my Stata web-page.

Last revised:  23-Jul-2019