13 Tables

As well as tables of statistics for continuous data calculated with tabstat (introduced above), you can create tables of frequencies for categorical data. For example:

table gender

The basic use is:

table (variable) (variable)

where the first variable (or variable list) populates the rows of the table and the second the columns. Either rows or columns may be empty.

The table command displays only basic frequencies. To add further detail, including percentages and cumulative percentages, use the tab command (short for tabulate) instead of table:

tab gender

These commands create one-way tables of frequencies. A two-way table can be created either by giving one variable for the rows and one for the columns or, with table, by nesting two variables within the rows or within the columns.

tab gender teacher
tabulate gender teacher

Both commands produce the same output.
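As a rough sketch of what tabulate adds, the following uses the auto dataset that ships with Stata, because the course data are not reproduced here. The variables foreign and rep78 are assumed stand-ins for gender and teacher; the options row, column and missing are standard tabulate options.

```stata
* Assumption: the shipped auto data stand in for the course dataset
sysuse auto, clear

* One-way table: frequency, percent and cumulative percent
tabulate foreign

* One-way table that also counts missing values as a category
tabulate rep78, missing

* Two-way table with row and column percentages under each frequency
tabulate rep78 foreign, row column
```

In the two-way table each cell shows the frequency first, followed by the row and column percentages, and a key above the table explains the order.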

So, the syntax of table is actually:

table (rowvariables) (columnvariables)

where the parentheses are required only if you have more than one variable in the rows or the columns, or if you want to leave the rows empty.
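The row and column specifications can be tried out with a short sketch. It assumes Stata 17 or later (where this table syntax was introduced) and uses the shipped auto dataset, with foreign and rep78 as stand-ins for the course variables.

```stata
* Assumption: the shipped auto data stand in for the course dataset
sysuse auto, clear

* One variable: categories of foreign in the rows
table foreign

* Empty row specification: categories of foreign in the columns
table () (foreign)

* Two-way table: rep78 in the rows, foreign in the columns
table rep78 foreign

* Two variables nested in the rows, so parentheses are required
table (foreign rep78)
```

Comparing the last two commands shows the difference between crossing two variables (rows by columns) and nesting one inside the other within the rows.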

13.1 Tables with custom statistics

It is possible to create tables that show expected as well as observed frequency values, using the expected option of tabulate:

tabulate gender teacher, expected

and to add specific detail about a continuous variable by using the statistic() option of table.

For example:

table (teacher) (gender), statistic(mean maths)
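The statistic() option can be repeated to request several statistics in one table. The sketch below assumes Stata 17 or later and uses the shipped auto dataset, with price standing in for maths; it is meant to be run from a do-file, because the /// line continuation does not work in the Command window.

```stata
* Assumption: the shipped auto data stand in for the course dataset
sysuse auto, clear

* Mean price for each combination of rep78 (rows) and foreign (columns)
table (rep78) (foreign), statistic(mean price)

* Frequency, mean and standard deviation together;
* nformat() sets one decimal place for the mean and sd only
table (rep78) (foreign),          ///
    statistic(frequency)          ///
    statistic(mean price)         ///
    statistic(sd price)           ///
    nformat(%9.1f mean sd)
```

Each statistic() adds a further line within every cell of the table.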

13.2 Customized tables

Customizing tables in Stata relies on the very powerful collect set of commands. There is not time to cover these in detail here, and I recommend that you read the Stata blog post by Chuck Huber for more detail.

To follow this section you should run the file customtabledo.do, which you can download from the course web pages.

13.2.1 A tabulation with customized layout

First, look at the output of this command:

tabulate teacher gender, chi2 expected

and now let us see how to collect the results of this command and build a simple custom layout:

collect: tabulate teacher gender, chi2 expected

This creates a collection of all the output from the tabulate command (not just what is shown by default) which it stores in a set of “dimensions”:

collect dims

From the output here we see a dimension named result with five levels. We can look at these levels with the collect levelsof command:

collect levelsof result

and we see that the result dimension has the levels N, c, chi2, p and r: the number of observations, the number of columns, Pearson’s chi-squared statistic, the p-value for that statistic and the number of rows.
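The five levels correspond to the results that tabulate leaves behind in r(). The sketch below shows this, assuming Stata 17 or later and using the shipped auto dataset, with rep78 and foreign as stand-ins for teacher and gender.

```stata
* Assumption: the shipped auto data stand in for the course dataset
sysuse auto, clear

* Run tabulate on its own and list what it stores in r()
tabulate rep78 foreign, chi2 expected
return list

* Start from an empty collection so earlier results do not interfere
collect clear

* Run tabulate again, this time collecting its stored results
collect: tabulate rep78 foreign, chi2 expected

* List the dimensions of the collection and the number of levels in each
collect dims

* List the levels of the result dimension
collect levelsof result
```

The names shown by return list should match the levels reported by collect levelsof result.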

One of the complications of using tables and collections is the need to be aware of the dimensions and levels in a collection and their meanings. For now, let us use the collect layout command to create a simple table of output:

collect layout (result[N chi2 p])()

Like the table command, layout specifies first rows, then columns of your output.
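To see the effect of the row and column specifications, the same levels can be placed in either position. This is a sketch only, assuming Stata 17 or later and the shipped auto dataset in place of the course data.

```stata
* Assumption: the shipped auto data stand in for the course dataset
sysuse auto, clear
collect clear
collect: tabulate rep78 foreign, chi2 expected

* Results stacked in rows, as in the text
collect layout (result[N chi2 p]) ()

* The same results spread across columns instead
collect layout () (result[N chi2 p])
```

Each call to collect layout replaces the previous layout, so only the most recent arrangement is kept.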

We will check the labelling of these levels with

collect label list result, all

and modify them for our output:

collect label levels result chi2 "Test of Association" N "Count" p "Prob(Chi)", modify
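Putting the steps together, a minimal end-to-end sketch might look like the following. It assumes Stata 17 or later and uses the shipped auto dataset rather than the course data. The number format and the output file name customtable.html are illustrative choices, not part of the course materials.

```stata
* Assumption: the shipped auto data stand in for the course dataset
sysuse auto, clear
collect clear
collect: tabulate rep78 foreign, chi2 expected
collect layout (result[N chi2 p]) ()

* Relabel the levels that appear in the table
collect label levels result chi2 "Test of Association" N "Count" p "Prob(Chi)", modify

* Show the statistic and its p-value with three decimal places
collect style cell result[chi2 p], nformat(%6.3f)

* Display the table as it currently stands
collect preview

* Write the table to a file; the name customtable.html is hypothetical
collect export customtable.html, replace
```

collect preview redraws the table with the current labels and styles, so it is a convenient check after each change.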

This section gives the outline of a simple example of a customized table. There is much more that you can learn from the documentation.