Posts

Showing posts from June, 2014

Introduction to Hadoop

My effort to learn Hadoop and MapReduce resulted in this presentation. If you like it and find it useful, do leave a comment.

Forecasting Retail Sales - Linear Regression with R and Hadoop

(Image from Responsemagazine.com.) A retail store tracks the volume of sales for each stock-keeping unit (SKU) that the store deals with. Given the sales for days 1 through 5, is it possible to predict the sales on days 6 through 10? Common sense dictates that sales will remain constant, so the average sales per day for the first 5 days will be the same as the average sales per day for the next 5 days. However, this may not always be the case if there is a rising or falling trend. If there is a strongly rising trend, caused by some strong promotional activity, then the assumption of constant sales will lead to a stock-out and loss of potential business. Similarly, if there is a strongly falling trend, then the same assumption will lead to accumulation of dead stock and hence a loss related to excessive inventory. Instead of days, the same analysis can be done on the basis of weeks, fortnights or even months. Net-net, given the sales over 5 periods of time, it is useful to be able to ...
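To make the contrast between the constant-sales assumption and a trend-based forecast concrete, here is a minimal sketch in plain Python. It is not the R and Hadoop implementation from the post; the sales figures are made up for illustration, and the helper names `fit_line` and `forecast` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def forecast(xs, ys, future_xs):
    """Fit a straight line to (xs, ys) and extrapolate it to future_xs."""
    intercept, slope = fit_line(xs, ys)
    return [intercept + slope * x for x in future_xs]


# Hypothetical daily sales of one SKU for days 1 through 5 (made-up numbers).
days = [1, 2, 3, 4, 5]
sales = [10, 12, 14, 16, 18]
future_days = [6, 7, 8, 9, 10]

# Forecast 1: follow the linear trend seen in the first 5 days.
trend_forecast = forecast(days, sales, future_days)

# Forecast 2: assume sales stay constant at the 5-day average.
constant_forecast = [sum(sales) / len(sales)] * len(future_days)

print("trend   :", trend_forecast)
print("constant:", constant_forecast)
```

With this rising series, the trend forecast climbs from 20 to 28 units over days 6 through 10, while the constant assumption stays at 14 units per day, which is the stock-out risk described above.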

HIVE and PIG to simplify Hadoop

[Note] Hadoop, IMHO, is history. Rather than waste time with all this, I suggest you check out my blog post on Spark with Python. When I was doing engineering at IIT, Kharagpur, the computers that we had were not even as powerful as a low-cost non-smart phone today and, other than the basic concept of programming, nothing that we learnt is of any relevance today. So when we start teaching a course on Business Analytics, which lies at the bleeding edge of current technology and business practices, there is simply no option but to take the Do-It-Yourself approach of first learning a subject and then teaching it to students. Fortunately, there are many kind and knowledgeable souls on this planet who have taken the pains to explain new and difficult concepts to ancients like us and, thanks to Google, it is not too difficult to locate them. Using this route, I first learnt what Data Science is and then created this compilation of tutorials and training materials th...