hadoop - Loading datetime field doesn't work in Pig Latin 0.12
I'm using Pig 0.12, and the documentation here says it supports the datetime data type:
http://pig.apache.org/docs/r0.12.0/basic.html#data-types
But the following load statement gives me an UnsupportedOperationException on the first field. The HDFS location contains tab-separated files whose first field is in the format yyyy-mm-dd.
rsa = LOAD '/mypath/*' USING PigStorage() AS ( hit_date:datetime, agency_id:long, agency_name:chararray, .... );
ERROR 2999: Unexpected internal error. null
java.lang.UnsupportedOperationException
    at parquet.pig.PigSchemaConverter.convertWithName(PigSchemaConverter.java:273)
    at parquet.pig.PigSchemaConverter.convert(PigSchemaConverter.java:248)
    at parquet.pig.PigSchemaConverter.convert(PigSchemaConverter.java:285)
    at parquet.pig.PigSchemaConverter.convertTypes(PigSchemaConverter.java:241)
    at parquet.pig.PigSchemaConverter.convert(PigSchemaConverter.java:234)
    at parquet.pig.TupleWriteSupport.<init>(TupleWriteSupport.java:63)
    at parquet.pig.ParquetStorer.getOutputFormat(ParquetStorer.java:103)
    at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:80)
Check the notes section below the data types in the documentation link you shared. It says:
"There is no native constant type for datetime field. You can use a ToDate udf with chararray constant as argument to generate a datetime value."
rsa = LOAD '/mypath/*' AS ( indatechar:chararray, agency_id:long, agency_name:chararray, .... );
convertdate = FOREACH rsa GENERATE ToDate(indatechar, 'yyyy-MM-dd') AS (indatedt:datetime);
ToDate takes format patterns in the style of Java's SimpleDateFormat: http://docs.oracle.com/javase/6/docs/api/java/text/SimpleDateFormat.html
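One thing to watch with the pattern string: in SimpleDateFormat-style patterns, uppercase MM means month and lowercase mm means minute, so a value like 2014-03-15 should be parsed with 'yyyy-MM-dd'. As far as I know, Pig's ToDate follows the same letter conventions, but treat that as an assumption and check it against your Pig version. Here is a small plain-Java sketch (not Pig code; the class and method names are made up for illustration) that shows the difference:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class: compares how "MM" and "mm" are interpreted.
class DatePatternDemo {

    // Parses text with the given pattern (UTC, US locale) into a Calendar.
    static Calendar parse(String pattern, String text) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"), Locale.US);
            cal.setTime(fmt.parse(text));
            return cal;
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse '" + text + "' with " + pattern, e);
        }
    }

    // Returns the month as 1..12 (Calendar.MONTH itself is zero-based).
    static int monthOf(String pattern, String text) {
        return parse(pattern, text).get(Calendar.MONTH) + 1;
    }

    // Returns the minute-of-hour that the pattern extracted from the text.
    static int minuteOf(String pattern, String text) {
        return parse(pattern, text).get(Calendar.MINUTE);
    }

    public static void main(String[] args) {
        String value = "2014-03-15";
        // With MM, the "03" is read as the month.
        System.out.println("yyyy-MM-dd month:  " + monthOf("yyyy-MM-dd", value));
        // With mm, the "03" is read as minutes and the month keeps its default.
        System.out.println("yyyy-mm-dd month:  " + monthOf("yyyy-mm-dd", value));
        System.out.println("yyyy-mm-dd minute: " + minuteOf("yyyy-mm-dd", value));
    }
}
```

With the lowercase pattern the parse does not fail, it just quietly gives you the wrong date, which is why it is easy to miss.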