Here is a collection of Spark configs that have helped make job runs faster. Most of the configs come with trade-offs but work very… Read More »Handpicked Spark configs to make the job runs faster
Yash Sharma is a Big Data & Machine Learning Engineer and a newbie open-source contributor who plays guitar and enjoys teaching as a part-time hobby. Talk to Yash about distributed systems and data platform designs.
This post is focussed on getting our dev env set up for our future experiments. The dev stack is pretty minimal for now: TensorFlow, Keras, Jupyter notebooks… Read More »Deep Learning Day 1 – Install Tensorflow, Keras and Jupyter for dev
I am writing this series of blogs to document my learnings as I travel through the data science world. I will be focussing on Deep… Read More »Deep Learning Day 0 – Knowing the technology
This is a starter post to get your dev environment set up for diving into Deep Learning. I have chosen to begin with TensorFlow and Keras… Read More »Setup PyCharm for Deep learning with TensorFlow, Keras and Jupyter (with virtualenv)
Sometimes we quickly need to check the schema of a parquet file, or to head the parquet file for some sample records. Here are some… Read More »How to view content of parquet files on S3/HDFS from Hadoop cluster using parquet-tools
Here is a small post on a very specific use case of Airflow. While it's not very difficult to do this, I found really… Read More »How to run incremental and range run scripts in airflow
We had an EMR cluster reboot and hit this error all of a sudden. The error is independent of EMR, so it is worth sharing. Error: Caused by:… Read More »Spark-sql java.net.NoRouteToHostException on cluster reboot
This was a fun debug activity for a Hive-on-S3 use case. Thought of writing a log of debug steps here before I lose the details.… Read More »Debugging : Hive DAG did not succeed due to VERTEX_FAILURE. Unable to rename output.
This post is about creating a Raspberry Pi-powered remote control car. Before we start, I hope you have got your Raspberry Pi ready and… Read More »How to create a raspberry pi powered car, remote controlled via mobile phone
This quick post is on two important things on the Pi. Enable Wi-Fi and connect to the internet: open the file for network interfaces, and look for a… Read More »SSH into a raspberry pi and connect pi to wifi
Here are my takeaway points from the Papers We Love meetup on "Design and implementation of log-structured file system". Here is the link to the… Read More »Papers we love – Summary of Design and implementation of log-structured file system