Recently I ran into a problem while trying to read a CSV file from a Scandinavian friend into a DataFrame. I was getting errors because the `latin1`-encoded names could not be parsed properly.
I tried running

    using DataFrames
    dataT = readtable("example.csv", encoding=:latin1)

but got this error:

    ArgumentError: Argument 'encoding' only supports ':utf8' currently.
The solution makes use of [StringEncodings.jl](https://github.com/nalimilan/StringEncodings.jl) to wrap the file data stream before presenting it to the `readtable` function.

    using DataFrames, StringEncodings

    f = open("example.csv", "r")
    s = StringDecoder(f, "LATIN1", "UTF-8")  # decode Latin-1 bytes to UTF-8 on the fly
    dataT = readtable(s)
    close(s)
    close(f)

The `StringDecoder` generates an `IO` stream that appears to be `utf8` to the `readtable` function.
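To make it clearer what "appears to be utf8" means at the byte level, here is a minimal sketch that does not use StringEncodings.jl at all. It relies on the fact that every Latin-1 byte value equals the Unicode code point of the character it represents, so the conversion can be done by hand for a small example. The name and the byte values are made up for illustration, and this is not necessarily how `StringDecoder` works internally.

```julia
# Latin-1 (ISO-8859-1) bytes for the name "Søren"; ø is the single byte 0xF8.
latin1_bytes = UInt8[0x53, 0xF8, 0x72, 0x65, 0x6E]

# Each Latin-1 byte equals the Unicode code point of its character, so
# converting byte -> Char and building a String re-encodes the text as UTF-8.
name = String(map(Char, latin1_bytes))

println(name)
println(length(latin1_bytes), " bytes in Latin-1, ", sizeof(name), " bytes in UTF-8")
```

The `ø` takes one byte in Latin-1 but two in UTF-8, which is why a parser that expects UTF-8 fails on the raw file. For real files, wrapping the stream as above is the better choice, since it converts the data as it is read rather than all at once.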