The basic rules below apply to formulas in all of the analysis functions in RevoScaleR:
1. The interaction of two continuous variables is equivalent to the multiplication of those variables, and is thus continuous. That is, w:x takes the same values as the product w*x.
2. The interaction of two factor (categorical) variables is a categorical variable whose categories are all possible combinations of the categories of the original two variables. Thus, age:sex, if both are categorical, contains one category for each combination of an age category and a sex category.
3. The interaction of a continuous variable and a categorical variable results in an "interaction" variable in which the continuous variable is operated on within each category. Thus rxSummary(~income:sex) gives summary statistics for income within each sex category, and rxCube(~income:sex) computes average income within each sex category. For both rxSummary (this is a very recent change) and rxCube/rxCrossTab, ~income:sex is equivalent to income~sex; that is, the continuous variable can be put on the left-hand side of the ~.
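The three rules can be sketched with base R analogues on a small in-memory data frame. This is only an illustration of the semantics, not of how RevoScaleR evaluates formulas; the data frame `df` and all of its values are made up, and the base R functions `model.matrix`, `interaction`, and `tapply` stand in for the RevoScaleR functions.

```r
# Toy data; every name and value here is hypothetical.
df <- data.frame(
  w      = c(1, 2, 3, 4),
  x      = c(10, 20, 30, 40),
  income = c(100, 200, 300, 500),
  age    = factor(c("young", "old", "young", "old")),
  sex    = factor(c("F", "F", "M", "M"))
)

# Rule 1: continuous:continuous is the elementwise product.
mm <- model.matrix(~ w:x, data = df)
prod_col <- unname(mm[, "w:x"])            # same values as df$w * df$x

# Rule 2: factor:factor is a factor over all level combinations.
age_sex <- interaction(df$age, df$sex)
n_combined <- nlevels(age_sex)             # 2 age levels * 2 sex levels = 4

# Rule 3: continuous:factor operates on the continuous variable
# within each category (here, the mean of income within each sex).
income_by_sex <- tapply(df$income, df$sex, mean)   # F = 150, M = 400
```

With RevoScaleR loaded, the corresponding calls from the text would be along the lines of rxSummary(~income:sex, data = df) and rxCube(~income:sex, data = df).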
These rules extend to any number of continuous and categorical variables. All of the continuous variables are multiplied by each other, and all of the categorical variables are interacted to give a combined categorical variable. The resulting continuous variable is then operated on within each category of the resulting categorical variable.
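As a rough sketch of the multi-variable case, the base R code below mimics what a term such as w:x:age:sex describes: the continuous variables are multiplied, the factors are combined, and the product is averaged within each combined category. The data and names are hypothetical, and base R is used only as an analogue of the behavior described above.

```r
# Toy data; every name and value here is hypothetical.
df <- data.frame(
  w   = 1:8,
  x   = c(2, 2, 2, 2, 3, 3, 3, 3),
  age = factor(rep(c("young", "old"), times = 4)),
  sex = factor(rep(c("F", "M"), each = 4))
)

# All continuous variables are multiplied together.
wx <- df$w * df$x

# All categorical variables are interacted into one combined factor.
age_sex <- interaction(df$age, df$sex)

# The combined continuous variable is then operated on (here, averaged)
# within each category of the combined factor.
cell_means <- tapply(wx, age_sex, mean)
# young.F = 4, old.F = 6, young.M = 18, old.M = 21
```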