We had an EMR cluster reboot and suddenly hit this error. The cause is independent of EMR, so it is worth sharing.
Error:
Caused by: java.net.NoRouteToHostException: No route to host
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
    at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
    at org.apache.hadoop.ipc.Client.call(Client.java:1451)
    ... 56 more
java.net.NoRouteToHostException: No Route to Host from ip-XXX-XXX-XXX-XXX/XXX-XXX-XXX-XXX to ip-YYY-YYY-YYY-YYY:PORT failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see: http://wiki.apache.org/hadoop/NoRouteToHost
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:758)
    at org.apache.hadoop.ipc.Client.call(Client.java:1479)
    at org.apache.hadoop.ipc.Client.call(Client.java:1412)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
    at com.sun.proxy.$Proxy14.getListing(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:573)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
Note: ip-XXX-XXX-XXX-XXX was our new cluster master’s IP, while ip-YYY-YYY-YYY-YYY was the old cluster master’s IP (that cluster had already been terminated).
Root cause:
We use an external Metastore for the cluster so that we can tear the cluster down and spin up a new one at any time. The Hive Metastore still keeps references to the old cluster’s HDFS locations if there are ‘MANAGED’ tables.
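To check whether a given table is one of the affected ones, something like the following could help. It is only a sketch: it assumes the plain Hive CLI output of DESCRIBE FORMATTED (with its "Location:" and "Table Type:" lines), and the function name and the table name in the usage comment are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: is_stale_managed_table is a hypothetical helper, not part of Hive.

# Reads `DESCRIBE FORMATTED <table>` output on stdin.
# Exits 0 when the table type is MANAGED_TABLE and its Location
# starts with the old NameNode prefix passed as $1; exits 1 otherwise.
is_stale_managed_table() {
  local old_prefix=$1
  awk -v prefix="$old_prefix" '
    $1 == "Location:"              { loc = $2 }
    $1 == "Table" && $2 == "Type:" { type = $3 }
    END { exit !(type == "MANAGED_TABLE" && index(loc, prefix) == 1) }
  '
}

# Usage (products.items is a hypothetical table name):
#   hive -e 'DESCRIBE FORMATTED products.items' \
#     | is_stale_managed_table "hdfs://ip-YYY-YYY-YYY-YYY:PORT" && echo "stale"
```

External tables are skipped on purpose, since only the managed ones caused the error for us.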
Fixes:
- Drop all managed tables, since their data was lost with the old cluster.
- Remove or update the references to the old cluster in the Metastore. This was not very useful for us, but it was good to know that it can be done.
$ hive --service metatool -listFSRoot
Initializing HiveMetaTool..
Listing FS Roots..
hdfs://ip-YYY.YYY.YYY.YYY:PORT/user/hive/warehouse
hdfs://ip-YYY.YYY.YYY.YYY:PORT/user/hive/warehouse/products.db

$ hive --config /etc/hive/conf/conf.server --service metatool -dryRun -updateLocation <new_value> <old_value>

$ hive --service metatool -updateLocation hdfs://ip-XXX.XXX.XXX.XXX:PORT/user/hive/warehouse hdfs://ip-YYY.YYY.YYY.YYY:PORT/user/hive/warehouse
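If the Metastore lists many roots, the dry-run commands can be generated from the -listFSRoot output instead of being typed by hand. The sketch below only prints the commands and does not run them. The function names are hypothetical, and it assumes that passing the bare hdfs://host:port prefix to -updateLocation is acceptable in your setup; we used the full warehouse path above.

```shell
#!/usr/bin/env bash
# Sketch only: both helpers are hypothetical, not part of Hive.

# Reads `hive --service metatool -listFSRoot` output on stdin and prints
# each distinct hdfs://host:port prefix once.
fs_root_prefixes() {
  grep -oE '^hdfs://[^/]+' | sort -u
}

# Reads the same output on stdin and prints (does not run) one dry-run
# command per listed prefix that differs from the new prefix given as $1.
print_dry_runs() {
  local new_prefix=$1 old_prefix
  fs_root_prefixes | while read -r old_prefix; do
    if [ "$old_prefix" = "$new_prefix" ]; then
      continue
    fi
    printf 'hive --service metatool -dryRun -updateLocation %s %s\n' \
      "$new_prefix" "$old_prefix"
  done
}

# Usage:
#   hive --service metatool -listFSRoot \
#     | print_dry_runs "hdfs://ip-XXX.XXX.XXX.XXX:PORT"
```

Review the printed commands, run the dry runs, and drop the -dryRun flag only once the reported changes look right.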
Final notes:
The -updateLocation command is very slow and can take hours, depending on the number of partitions and tables in your Metastore. It goes through all the tables and updates their references to the old locations.
More info in this very useful post. Hope that helps.
Cheers