Tuning Options for ScaleR Text Imports

Windows/Linux Block Size

When choosing a block size, set rowsPerRead so that each block holds roughly 10 million elements (rows × columns) or fewer:
- With 20 columns, rowsPerRead=500e3 (10 million elements)
- With 1000 columns, rowsPerRead=1000 (1 million elements)
Blocks of this size are small enough that multiple blocks can be processed per read.
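
The sizing rule above is plain arithmetic, and a minimal sketch may make it easier to apply to other column counts. The function name `rows_per_read` and the default target of 10 million elements are illustrative assumptions taken from the guidance above; nothing here is part of ScaleR itself, and Python is used only to show the calculation.

```python
def rows_per_read(num_columns, target_elements=10_000_000):
    """Largest row count per block that keeps rows * columns at or below the target.

    Hypothetical helper: the 10M default mirrors the rule of thumb in the text.
    """
    if num_columns < 1:
        raise ValueError("num_columns must be at least 1")
    # Integer division gives the upper bound; never return fewer than 1 row.
    return max(1, target_elements // num_columns)


print(rows_per_read(20))    # 500000, the 20-column example from the text
print(rows_per_read(1000))  # 10000, an upper bound; the text's 1000 rows is well below it
```

The result is a ceiling rather than a recommendation. The article's own 1000-column example uses a value ten times smaller, consistent with the advice to prefer smaller blocks.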
Use blocksPerRead > 1:
- The exact value depends on how much RAM is available
- Having multiple blocks in memory simultaneously generally improves performance
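
As a rough way to relate blocksPerRead to available RAM, the sketch below estimates how many blocks fit in a given memory budget. It rests on assumptions not stated in the article: every element is taken to occupy 8 bytes (a double-precision value), the budget figure is hypothetical, and the working memory needed by the analysis itself is ignored. The helper name `blocks_that_fit` is illustrative and is not a ScaleR function.

```python
def blocks_that_fit(rows_per_block, num_columns, ram_budget_bytes, bytes_per_element=8):
    """Estimate how many blocks fit in a RAM budget (assumes 8-byte elements)."""
    block_bytes = rows_per_block * num_columns * bytes_per_element
    if block_bytes < 1:
        raise ValueError("block dimensions and element size must be positive")
    # Always allow at least one block, even if the budget is smaller than a block.
    return max(1, ram_budget_bytes // block_bytes)


# A 20-column block of 500,000 rows is about 80 MB under the 8-byte assumption.
print(blocks_that_fit(500_000, 20, ram_budget_bytes=10**9))  # 12
```

Treat the result as an upper bound and start lower, since actual element sizes and the memory needed by the computation will vary.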
It is easy to increase blocksPerRead but expensive to re-block, so err on the side of smaller blocks.
If you use rxSplit() or rxDataStep() to create samples (e.g. training and validation sets), use rxDataStep() afterwards to re-block the results according to the same principle.

Applies to:

Revolution Analytics

Keywords: kb

Article Info

Article ID	:	3104210
Revision	:	1
Created on	:	1/7/2017
Published on	:	10/29/2015
Exists online	:	False
Views	:	306
