Notice: This website is an unofficial archive of the Microsoft Knowledge Base (hereinafter KB) and is intended to provide reliable access to content deleted from the Microsoft KB. All KB articles are owned by Microsoft Corporation.

QA: What are the rules for variable interactions in RevoScaleR formulas?



The basic rules below apply to formulas in all of the RevoScaleR analysis functions:

1. The interaction of two continuous variables is equivalent to the multiplication of those variables, and is thus continuous. That is, for each observation the term w:x takes the value of the arithmetic product w*x.

2. The interaction of two factor (categorical) variables is a categorical variable whose categories are all possible combinations of the categories of the original two variables. Thus age:sex, if both are categorical, has one category for every combination of an age category and a sex category.

3. The interaction of a continuous variable and a categorical variable results in an "interaction" variable in which the continuous variable is evaluated within each category. Thus rxSummary(~income:sex) gives summary statistics for income within each sex category, and rxCube(~income:sex) computes the average income within each sex category. For both rxSummary (a very recent change at the time of writing) and rxCube/rxCrossTab, ~income:sex is equivalent to income~sex; that is, the continuous variable can be placed on the left-hand side of the ~.
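To make the three rules concrete, here is a minimal sketch in plain Python rather than RevoScaleR. It only mimics the semantics described above and does not call any RevoScaleR function. The data values, the variable names, and the "young:F" style of level label are all assumptions made for illustration; the label format and level order that RevoScaleR itself produces may differ.

```python
from itertools import product
from statistics import mean

# Hypothetical toy data; none of these values come from the article.
w = [1.0, 2.0, 3.0]
x = [4.0, 5.0, 6.0]

# Rule 1: continuous:continuous is the observation-wise product.
w_x = [a * b for a, b in zip(w, x)]

# Rule 2: factor:factor has one level per combination of the original levels.
# The "age:sex" label format is an assumption made for this sketch.
age_levels = ["young", "old"]
sex_levels = ["F", "M"]
age_sex_levels = [f"{a}:{s}" for a, s in product(age_levels, sex_levels)]

# Rule 3: continuous:factor evaluates the continuous variable within each level.
# Here the statistic is the mean, as rxCube is described as computing.
income = [30.0, 50.0, 40.0, 60.0]
sex = ["F", "M", "F", "M"]
mean_income_by_sex = {
    level: mean(v for v, s in zip(income, sex) if s == level)
    for level in sex_levels
}

print(w_x)                 # [4.0, 10.0, 18.0]
print(age_sex_levels)      # ['young:F', 'young:M', 'old:F', 'old:M']
print(mean_income_by_sex)  # {'F': 35.0, 'M': 55.0}
```

The third result is the kind of per-category average that the article attributes to rxCube(~income:sex).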

These rules extend to any number of continuous and categorical variables: all of the continuous variables are multiplied together, all of the categorical variables are interacted to give a single combined categorical variable, and the resulting continuous variable is then evaluated within each category of the combined categorical variable.
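The general case can be sketched the same way. The plain-Python example below is an illustration of the described semantics under stated assumptions, not RevoScaleR code. The helper name interact, the column names, and the data are hypothetical, and the mean stands in for whatever statistic the analysis function computes. It requires Python 3.8 or later for math.prod.

```python
from collections import defaultdict
from math import prod
from statistics import mean

def interact(continuous, categorical):
    """Group the row-wise product of the continuous columns by the
    combined category formed from the categorical columns.

    Hypothetical helper for illustration; not part of RevoScaleR.
    """
    groups = defaultdict(list)
    # zip(*columns) turns a list of columns into an iterator of rows.
    for nums, cats in zip(zip(*continuous), zip(*categorical)):
        groups[cats].append(prod(nums))
    return dict(groups)

# Hypothetical data: two continuous and two categorical variables.
hours = [10.0, 20.0, 30.0, 40.0]
rate = [2.0, 3.0, 2.0, 1.0]
age = ["young", "young", "old", "old"]
sex = ["F", "M", "F", "F"]

# Analogue of a term such as hours:rate:age:sex.
cells = interact([hours, rate], [age, sex])
cell_means = {cell: mean(values) for cell, values in cells.items()}

print(cell_means)
# {('young', 'F'): 20.0, ('young', 'M'): 60.0, ('old', 'F'): 50.0}
```

Note that this sketch only reports combinations that actually occur in the data; the ("old", "M") cell is absent here because no observation falls into it.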





Article Info
Article ID : 3104248
Revision : 1
Created on : 1/7/2017
Published on : 10/29/2015