You can use the same RevoScaleR functions to process huge data sets stored on disk as you use to analyze in-memory data frames, because RevoScaleR functions are implemented as 'chunking' (external-memory) algorithms that operate on one block of data at a time. A chunking algorithm follows this general process, illustrated in the sketch after the list:
1. Initialization: initialize the intermediate results needed to compute the final statistics.
2. Read data: read a chunk of data (a set of observations of the variables).
3. Transform data: perform transformations and row selections on the chunk as needed; write out the data if the task is only an import or data step.
4. Process data: compute intermediate results for the chunk.
5. Update results: combine the results for the chunk with those from previous chunks.
6. Repeat steps 2 through 5 (possibly in parallel) until all of the data has been processed.
7. Process results: when the results from all chunks are complete, perform the final computations and return the results.
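
To make the steps concrete, here is a minimal sketch of the chunking pattern written in plain R. It is an illustration only, not RevoScaleR's internal implementation; the function `chunked_mean`, the file name, and the column name are hypothetical, and the parser assumes a simple comma-separated file with a header row and no quoted fields.

```r
# Compute the mean of one numeric column in a large CSV file without
# ever loading the whole file into memory. Hypothetical sketch of the
# chunking pattern; not RevoScaleR's actual implementation.
chunked_mean <- function(file, column, chunk_size = 10000) {
  con <- file(file, open = "r")
  on.exit(close(con))

  # Read the header row to locate the column of interest
  header  <- strsplit(readLines(con, n = 1), ",")[[1]]
  col_idx <- match(column, header)

  # Step 1: initialize intermediate results
  total <- 0
  n     <- 0

  repeat {
    # Step 2: read a chunk of observations
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0) break

    # Step 3: transform the chunk (parse out the column of interest;
    # unparseable values become NA)
    values <- suppressWarnings(as.numeric(
      vapply(strsplit(lines, ","), function(row) row[col_idx], character(1))
    ))

    # Step 4: compute intermediate results for this chunk
    chunk_sum <- sum(values, na.rm = TRUE)
    chunk_n   <- sum(!is.na(values))

    # Step 5: update: combine with results from previous chunks
    total <- total + chunk_sum
    n     <- n + chunk_n
  }

  # Final step: compute and return the final statistic
  total / n
}

# Example (hypothetical file and column):
# chunked_mean("big.csv", "x")
```

In RevoScaleR itself, this loop is internal: a single call such as `rxSummary(~ ArrDelay, data = myData)` performs the chunked reading, per-chunk computation, and combination of results automatically.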