
Using ODBC connections in parallel code



Passing ODBC connections that were opened in the parent session to parallel worker processes may fail, as in the following example:
loaddata <- function(cn) {
  result <- sqlQuery(cn, 'select * from boston')
  return(head(result))
}

library(RODBC)

cn1 <- odbcConnect("RevoTestDB", uid='RevoTester', pwd='RevoTester')
cn2 <- odbcConnect("RevoTestDB", uid='RevoTester', pwd='RevoTester')
cn3 <- odbcConnect("RevoTestDB", uid='RevoTester', pwd='RevoTester')
cn4 <- odbcConnect("RevoTestDB", uid='RevoTester', pwd='RevoTester')

rxSetComputeContext('localpar')

system.time({
  z <- rxExec(loaddata, rxElemArg(list(cn1, cn2, cn3, cn4)),
              packagesToLoad='RODBC')
})

Error in do.call(.rxDoParFUN, as.list(args)) :
  task 1 failed - "first argument is not an open RODBC channel"
The problem is that the worker processes receive the ODBC connections in a closed state.

Connections are process-specific. Unless the workers are forked from the parent process (as multicore workers are), they cannot use connections opened by the parent: the channel object is copied to each worker, but the copy no longer refers to a live connection. To distribute ODBC computations across non-forked workers, open the connection on each worker as part of the distributed task.

Example:
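loaddata <- function() {
  library(RODBC)
  # Open the connection inside the worker process
  cn <- odbcConnect("RevoTestDB", uid='RevoTester', pwd='RevoTester')
  on.exit(odbcClose(cn))  # close the channel when the task finishes
  result <- sqlQuery(cn, 'select * from boston')
  return(head(result))
}

# Keep the results in z; system.time() only reports the elapsed time
system.time({
  z <- rxExec(loaddata, packagesToLoad='RODBC')
})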
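
The failing example at the top of this article handed each worker its own connection through rxElemArg. The same per-worker distribution can probably be kept by passing plain character strings instead of open channels, because strings survive the copy to another process. The following is only a sketch under that assumption: the function name loadquery, its query argument, and the queries list are hypothetical names introduced for illustration, and the same query is repeated four times only to mirror the four connections in the original example.

```r
library(RevoScaleR)

# Hypothetical helper: takes a query string and opens its own connection
# inside the worker, so no channel object crosses a process boundary.
loadquery <- function(query) {
  library(RODBC)
  cn <- odbcConnect("RevoTestDB", uid = "RevoTester", pwd = "RevoTester")
  on.exit(odbcClose(cn))  # release the connection when the task ends
  head(sqlQuery(cn, query))
}

# One element per task; rxElemArg hands each task a single element.
queries <- rep(list("select * from boston"), 4)

rxSetComputeContext("localpar")
z <- rxExec(loadquery, query = rxElemArg(queries), packagesToLoad = "RODBC")
```

In this sketch z would hold one result per element of queries. Each task opens and closes its own connection, so the database must allow as many simultaneous connections as there are workers.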


Article Info
Article ID : 3103824
Revision : 1
Created on : 1/7/2017
Published on : 11/1/2015
Exists online : False