I was recently at a meeting with a leading Big Data vendor where a group presented their current data analysis pipeline, featuring Storm, Kafka, Elastic MapReduce, Spark, and a number of other plumbing pieces that ferried the data through.
One of the vendor’s lead technologists mentioned that the pipeline was pretty involved and sounded worthy of a good blog post. The same remark came up again when another piece of architecture was discussed.
On the surface, it seemed like a good thing that the company was working at the edge of this space. But as the day went on and the themes of architectural simplification and focus were discussed, it became apparent that they had spent far too much energy building a complex data and analytics pipeline in a space that is still nascent in the industry. Much of that effort would have been better spent on making it simpler to generate new data insights and on productizing them.
While it may seem somewhat meta and ironic to discuss the merits of blog posts in a blog post, this seemed like a good experience to share. Blogging about bleeding-edge technical pursuits at your company can have other benefits, such as strengthening your brand in the technical community, which can be a powerful recruiting boost. Just don’t conflate the useful research and POCs that you learn from and blog about with what you actually put into your production data architecture, and don’t lose sight of the actual analysis goal.
Photo Credit: Link by Rob Davies