If your text data is not separated by commas or tabs, you must specify the delimiter using the columnDelimiters argument. (This is not actually an argument to rxImport, but to the underlying RxTextData data source object.) In normal usage, this argument is a single character, such as columnDelimiters="\t" for tab-delimited data or columnDelimiters="," for comma-delimited data. However, each column may be delimited by a different character; all the delimiters must be concatenated together into a single character string. For example, if you have one column delimited by a comma, a second by a plus sign, and a third by a new line, you would use the argument columnDelimiters=",+\n".
So for the above data, how do I fix the code below to treat '|' as the delimiter?
hdfsFS <- RxHdfsFileSystem(hostName="dummy", port="dummy")
txtSource <- RxTextData("directory value/ file_name in hdfs", fileSystem=hdfsFS)
airData <- rxImport(inData=txtSource, outFile = "/tmp/test.xdf",stringsAsFactors = TRUE, missingValueString = "M", rowsPerRead = 200000, overwrite=TRUE)
rxSummary(~ id+val, data = airData)
2). To read pipe-delimited data, set delimiter = "|" in your RxTextData() call:
txtSource <- RxTextData("directory value/ file_name in hdfs", fileSystem=hdfsFS, delimiter = "|")
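For reference, here is a sketch of how the whole script from the question might look with that one change applied. It assumes the RevoScaleR package is loaded and that the host name, port, and file path placeholders from the question are replaced with real values for your cluster; everything else is taken unchanged from the original code.

```r
# Sketch only: assumes RevoScaleR is installed and an HDFS cluster is reachable.
library(RevoScaleR)

# Placeholders carried over from the question; substitute your own values.
hdfsFS <- RxHdfsFileSystem(hostName = "dummy", port = "dummy")

# The only change from the question: tell the text data source that
# columns are separated by a pipe character.
txtSource <- RxTextData("directory value/ file_name in hdfs",
                        fileSystem = hdfsFS,
                        delimiter = "|")

# Import into an .xdf file exactly as in the original code.
airData <- rxImport(inData = txtSource,
                    outFile = "/tmp/test.xdf",
                    stringsAsFactors = TRUE,
                    missingValueString = "M",
                    rowsPerRead = 200000,
                    overwrite = TRUE)

# Summarize the two columns named in the question.
rxSummary(~ id + val, data = airData)
```

If the import still produces a single wide column, check that the delimiter was passed to RxTextData() and not to rxImport(), since the data source is what parses the file.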