Unable to dbInsert a .txt file #9

Open
pakom opened this issue Dec 22, 2015 · 0 comments

pakom commented Dec 22, 2015

Dear Roger,

I am trying to dbInsert a large .txt file as a data frame using the read_fwf function from the readr package. The file comes from OECD's PISA 2012 and its size is 1.1GB. It contains the responses to the student questionnaire. I work on a laptop with 4GB of RAM under Arch Linux (64-bit) and have about 250GB of free space on the hard drive. The size of the swap partition is 2GB. Here is the code that I use:

library(filehash)
library(readr)

setwd("/media/work")

dbCreate("tmpDB")

DB <- dbInit("tmpDB")

dbInsert(DB, "x", data.frame(read_fwf(
  file = "/media/PISA_2012/INT_STU12_DEC03.txt",
  fwf_positions(start = ranges.start, end = ranges.end,
                col_names = var.names), progress = FALSE)))

ranges.start, ranges.end and var.names are taken from the .sps file provided with the .txt data file.

The tmpDB file is created and DB is initialized in the R environment. The dbInsert call runs without any error or warning messages, but once it finishes the size of tmpDB is still 0B, dbList(DB) returns character(0), and the key x does not seem to exist.
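For reference, the three symptoms above (file size, key list, key existence) can be checked programmatically. The following is only a minimal sketch, not the code from the failing session: the database path `checkDB` and the small data frame `toy` are hypothetical stand-ins, and it assumes the default filehash database type, which keeps everything in a single file.

```r
# Minimal sketch: confirm that a dbInsert() call actually stored its value.
# "checkDB" and "toy" are hypothetical stand-ins, not part of the original report.
library(filehash)

db_path <- file.path(tempdir(), "checkDB")  # throwaway database file
if (file.exists(db_path)) unlink(db_path)   # start from a clean file
dbCreate(db_path)
db <- dbInit(db_path)

toy <- data.frame(id = 1:3, score = c(10.5, 20.1, 30.7))
dbInsert(db, "x", toy)

# The same three checks described in the text
size_on_disk <- file.info(db_path)$size  # should be larger than 0 after an insert
keys <- dbList(db)                       # should contain "x"
has_x <- dbExists(db, "x")               # should be TRUE

cat("file size (bytes):", size_on_disk, "\n")
cat("keys:", keys, "\n")
cat("key 'x' exists:", has_x, "\n")

# In-memory size of the object that was inserted
print(object.size(toy), units = "Kb")
```

With a small object, all three checks are expected to pass. Running the same checks on the full data frame, together with its object.size(), would show how large the object is at the point where the insert stops working.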

I tried smaller files from the same or previous cycles, and those of about 500MB work. I also tried taking just 200 lines from the file I have trouble with, and that works too. I thought this might be due to the size limit of my /tmp folder, which is the system's temporary folder and is limited to 1.8GB. So I installed the unixtools package and used the following to change R's temporary folder and check that it had changed:

> set.tempdir("/media/temp")
> tempdir()
[1] "/media/temp"
> tempfile()
[1] "/media/temp/file8fc7d43a8d6"

I ran the dbInsert code above again. However, the result is the same: tmpDB is still 0B and the key x does not exist.

What would be the reason for this behavior?

Regards
