Wednesday, November 25, 2015

[Impala] About "InvalidStorageDescriptorException: Invalid delimiter: 'X' " error

Hi,

In this post , i will talk about an error we face recently when we query data after loading data from xml file via Beeline. First, we get a generic error but after running query on Impala we get real reason and solved it.

Firstly, developers exported data from Exadata by using  sqoop with using delimiter '0x01' and they loaded data with HUE by using  Metastore Manager.

Everything is normal here, but when they issue an count operation mapreduce tasks stops with following error:

beeline> select  count(*) from schema3;
...

$ tail -f /var/log/hive/hadoop-cmf-hive-HIVESERVER2-HOST.log.out

...
2015-11-24 15:32:05,399 ERROR org.apache.hive.service.cli.operation.Operation: Error running hive query: 
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
        at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:315)
        at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:147)
        at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:70)
        at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:197)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
        at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:209)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
2015-11-24 15:34:39,645 INFO org.apache.hadoop.hive.ql.log.PerfLogger: <PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver
...

Hive version-> 1.1.0
Impala version-> 2.2.0

After some investigation, i couldn't see further so , I run query with Impala and it gave me following error:

Your query has the following error(s):


AnalysisException: Failed to load metadata for table: OURDB.schema3 CAUSED BY: TableLoadingException: Failed to load metadata for table: schema3 CAUSED BY: InvalidStorageDescriptorException: Invalid delimiter: '0x01'. Delimiter must be specified as a single character or as a decimal value in the range [-128:127]

Now, after that i had a small talk with developer and he says yes it is our delimiter in this case. After that i changed Hex(0x01) to decimal , it is 1 , and created the table again.


create table schemanew (user  string, 
fis   bigint,
t string, ...
 )   ROW FORMAT DELIMITED FIELDS TERMINATED BY '1' s;


load data  inpath '/data/....../ schema3.xml' OVERWRITE INTO TABLE schemanew ; 

After that i run query with impala & beeline and it gives no error.

 > select count(*) from schemanew ;
Query: select count(*) from schemanew 
+----------+
| count(*) |
+----------+
| 5509     |
+----------+

Ok, thats'all .

In case you face this error, wish it helps.

Ok, that is all
Thanks for reading.
Enjoy & share.




No comments :

Post a Comment