R/import_data.R
import_mhealth_csv_chunked.Rd
import_mhealth_csv_chunked imports raw multi-channel accelerometer
data stored in the mHealth Specification, one chunk at a time.
import_mhealth_csv_chunked(filepath, chunk_samples = 180000)
list. The list contains two items. The first item is a generator
function; each time it is called, it returns a dataframe with at most
chunk_samples samples of imported data.
The second item is a close_connection function, which you can call at
any moment to stop loading and close the file.
This is a file I/O function used to import data stored in the mHealth Specification during algorithm validation.
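The generator and close_connection pair described above can be pictured as two closures sharing one open file connection. The following is a minimal base R sketch of that pattern, not the MIMSunit implementation: the name make_chunked_csv_reader, the choice to return an empty dataframe once the file is exhausted or closed, and the sample column names are all assumptions made for illustration.

```r
# Hypothetical sketch of a chunked CSV reader built from closures.
# This is NOT the package's code; names and behavior are assumptions.
make_chunked_csv_reader <- function(filepath, chunk_samples = 180000) {
  con <- file(filepath, open = "r")   # keep one connection open across calls
  header <- strsplit(readLines(con, n = 1), ",", fixed = TRUE)[[1]]
  closed <- FALSE

  next_chunk <- function() {
    # Assumed behavior: an empty dataframe signals "no more data"
    if (closed) return(data.frame())
    lines <- readLines(con, n = chunk_samples)
    if (length(lines) == 0) return(data.frame())
    read.csv(text = lines, header = FALSE, col.names = header,
             stringsAsFactors = FALSE)
  }

  close_connection <- function() {
    if (!closed) {
      close(con)
      closed <<- TRUE
    }
    invisible(NULL)
  }

  list(next_chunk, close_connection)
}

# Usage on a small temporary file (illustrative columns and values)
tmp <- tempfile(fileext = ".csv")
writeLines(c("HEADER_TIME_STAMP,X,Y,Z",
             "2017-03-16 12:25:50.000,0.1,0.2,0.3",
             "2017-03-16 12:25:50.012,0.1,0.2,0.3",
             "2017-03-16 12:25:50.025,0.1,0.2,0.3",
             "2017-03-16 12:25:50.037,0.1,0.2,0.3",
             "2017-03-16 12:25:50.050,0.1,0.2,0.3"), tmp)
reader <- make_chunked_csv_reader(tmp, chunk_samples = 2)
next_chunk <- reader[[1]]
close_connection <- reader[[2]]
sizes <- c(nrow(next_chunk()), nrow(next_chunk()),
           nrow(next_chunk()), nrow(next_chunk()))
close_connection()
after_close <- nrow(next_chunk())
```

Because both closures capture the same connection, the caller only ever holds one chunk in memory and can stop reading at any point by calling close_connection.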
default_ops = options()
options(digits.secs=3)
# Use the mhealth csv file shipped with the package
filepath = system.file('extdata', 'mhealth.csv', package='MIMSunit')
# Example 1
# Load chunks of 100 samples each
results = import_mhealth_csv_chunked(filepath, chunk_samples=100)
next_chunk = results[[1]]
close_connection = results[[2]]
# Inspect the data chunk by chunk; the time range shifts forward at each iteration.
n = 1
repeat {
  df = next_chunk()
  if (nrow(df) > 0) {
    print(paste('chunk', n))
    print(paste("df:", df[1, 1], '-', df[nrow(df), 1]))
    n = n + 1
  } else {
    break
  }
}
#> [1] "chunk 1"
#> [1] "df: 2017-03-16 12:25:50.0005 - 2017-03-16 12:25:51.2375"
#> [1] "chunk 2"
#> [1] "df: 2017-03-16 12:25:51.2635 - 2017-03-16 12:25:52.5005"
#> [1] "chunk 3"
#> [1] "df: 2017-03-16 12:25:52.5255 - 2017-03-16 12:25:53.7635"
#> [1] "chunk 4"
#> [1] "df: 2017-03-16 12:25:53.7875 - 2017-03-16 12:25:55.0255"
#> [1] "chunk 5"
#> [1] "df: 2017-03-16 12:25:55.0505 - 2017-03-16 12:25:55.9875"
# Close connection after reading all the data
close_connection()
# Example 2: close loading early
results = import_mhealth_csv_chunked(filepath, chunk_samples=1000)
next_chunk = results[[1]]
close_connection = results[[2]]
# Read the first chunk, then close the connection right away.
n = 1
repeat {
  df = next_chunk()
  if (nrow(df) > 0) {
    print(paste('chunk', n))
    print(paste("df:", df[1, 1], '-', df[nrow(df), 1]))
    n = n + 1
    close_connection()
  } else {
    break
  }
}
#> [1] "chunk 1"
#> [1] "df: 2017-03-16 12:25:50.0005 - 2017-03-16 12:25:55.9875"
# Restore default options
options(default_ops)
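Both examples above call close_connection by hand. One way to make sure the connection is always closed, even if processing a chunk raises an error, is to wrap the loop in a helper that registers the close call with on.exit. The sketch below is an illustration under assumptions: consume_chunks is a hypothetical helper that is not part of MIMSunit, and fake_results is a stand-in that mimics the documented two-item list so the code runs without the package.

```r
# Hypothetical helper (not part of MIMSunit): apply `f` to every
# non-empty chunk and always close the connection on exit.
consume_chunks <- function(results, f) {
  next_chunk <- results[[1]]
  close_connection <- results[[2]]
  on.exit(close_connection(), add = TRUE)  # runs even if f() fails
  out <- list()
  repeat {
    df <- next_chunk()
    if (nrow(df) == 0) break
    out[[length(out) + 1]] <- f(df)
  }
  out
}

# Stand-in for the list returned by import_mhealth_csv_chunked:
# item 1 yields chunks of up to 2 rows, item 2 marks the source as closed.
fake_results <- local({
  data <- data.frame(t = 1:5, x = c(0.1, 0.2, 0.3, 0.4, 0.5))
  pos <- 0
  closed <- FALSE
  list(
    function() {
      if (closed || pos >= nrow(data)) return(data[0, ])
      idx <- seq(pos + 1, min(pos + 2, nrow(data)))
      pos <<- max(idx)
      data[idx, ]
    },
    function() closed <<- TRUE
  )
})

# Number of rows seen in each chunk
chunk_rows <- unlist(consume_chunks(fake_results, nrow))
```

Assuming the real function returns the two-item list described above, the same helper could be called as consume_chunks(import_mhealth_csv_chunked(filepath, chunk_samples = 100), nrow).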