Integrating Hive 0.9.0 with HBase 0.94.3 – Identifying root cause for RuntimeException: Error while reading from task log url

The last post here covered integrating Hive 0.11.0 with HBase 0.94.2. Because of issue HIVE-4515, however, we currently cannot query HBase with varied queries. While the contributors fix the issue, we can use HBase 0.94.3 for our experiments.

The above post has details on configuring Hive with HBase and on the table creation process.
All the steps are the same as for the integration with HBase 0.94.2, with a few exceptions.

We just need to add the new HBase jars.

Adding necessary JARS:

  • guava-11.0.2.jar
  • protobuf-java-2.4.0a.jar
  • hbase-0.94.3.jar
  • hive-hbase-handler-0.9.0.jar
  • zookeeper-3.4.3.jar


We also need to add the protobuf jar to Hive's auxlib; otherwise Hive blows up with the error below:

Exception in thread "Thread-32" java.lang.RuntimeException: Error while reading from task log url
    at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getStackTraces(TaskLogProcessor.java:240)
    at org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:227)
    at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:92)
    at java.lang.Thread.run(Thread.java:724)
Caused by: java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:50060/tasklog?taskid=attempt_201310301622_0001_m_000000_2&start=-8193
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1625)
    at java.net.URL.openStream(URL.java:1037)
    at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getStackTraces(TaskLogProcessor.java:192)
    ... 3 more
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask

The root cause is not shown on the Hive console.

Getting the root cause of the error:

We are running into https://issues.apache.org/jira/browse/HIVE-1579
Hive blows up because it is not able to fetch the debug logs of a failed task.
To get the root cause, copy the URL from the error, change the taskid parameter to attemptid, and open it in the browser. That is, change:

http://localhost:50060/tasklog?taskid=attempt_201310301622_0001_m_000000_2&start=-8193
to,
http://localhost:50060/tasklog?attemptid=attempt_201310301622_0001_m_000000_2&start=-8193
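If you hit this often, the URL edit can be scripted. Here is a minimal Python sketch of the same change; the function name fix_tasklog_url is made up for this post and is not part of Hive or Hadoop.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def fix_tasklog_url(url):
    """Rename the `taskid` query parameter to `attemptid`; keep everything else."""
    parts = urlsplit(url)
    # Rebuild the query string, renaming only the one parameter name.
    params = [
        ("attemptid" if key == "taskid" else key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(params)))


# The URL copied from the Hive error message above.
broken = (
    "http://localhost:50060/tasklog"
    "?taskid=attempt_201310301622_0001_m_000000_2&start=-8193"
)
print(fix_tasklog_url(broken))
```

The printed URL is the one to open in the browser. A URL that already uses attemptid comes back unchanged.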

Root Cause:

Error running child : java.lang.NoClassDefFoundError: com/google/protobuf/Message
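A NoClassDefFoundError like this usually means no jar on the task's classpath provides the class. As a rough check, the Python sketch below lists which jars in a folder contain com/google/protobuf/Message.class. The helper name jars_providing is invented for this post, and the folder to scan (for example your Hive auxlib or HBase lib) depends on your install.

```python
import zipfile
from pathlib import Path
from typing import List

# Class named in the NoClassDefFoundError, as it is stored inside a jar.
MISSING_CLASS = "com/google/protobuf/Message.class"


def jars_providing(class_entry: str, jar_dir: str) -> List[str]:
    """Return the names of the jars in jar_dir that contain class_entry."""
    hits = []
    for jar in sorted(Path(jar_dir).glob("*.jar")):
        try:
            # A jar is a zip archive, so the zipfile module can list its entries.
            with zipfile.ZipFile(jar) as archive:
                if class_entry in archive.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that are named *.jar but are not valid archives.
            continue
    return hits
```

Calling jars_providing(MISSING_CLASS, "/path/to/hive/auxlib") (the path is a placeholder) and getting an empty list back would be consistent with the error above.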

Solution:

Add the protobuf jar from HBase's lib folder to Hive's auxlib folder as well.
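The copy itself is a one-off, but here is a small Python sketch of it in case you want to script your setup. The function name copy_protobuf_jars and the protobuf-java-*.jar file pattern are assumptions made for this post, and the two folder paths depend on where HBase and Hive are installed.

```python
import shutil
from pathlib import Path
from typing import List


def copy_protobuf_jars(hbase_lib: str, hive_auxlib: str) -> List[str]:
    """Copy protobuf-java-*.jar from hbase_lib into hive_auxlib.

    Returns the names of the jars that were copied.
    """
    source = Path(hbase_lib)
    target = Path(hive_auxlib)
    # Create the auxlib folder if it is not there yet.
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    for jar in sorted(source.glob("protobuf-java-*.jar")):
        shutil.copy2(jar, target / jar.name)
        copied.append(jar.name)
    return copied
```

For example, copy_protobuf_jars("/path/to/hbase/lib", "/path/to/hive/auxlib") with your own paths. An empty return value means no matching jar was found in the HBase lib folder.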
That's all for this post. Create an HBase table as in the last post, and create an external Hive table to query its data.

Refer to this post for details:
http://www.confusedcoders.com/bigdata/hive/hbase-hive-integration-querying-hbase-via-hive

Drop in your comments in case you face any issues.

Cheers \m/

 

Yash Sharma is a Big Data & Machine Learning Engineer and a newbie open-source contributor who plays guitar and enjoys teaching as a part-time hobby.
Talk to Yash about Distributed Systems and Data platform designs.
