Interested developers can now test-drive Microsoft’s Azure Data Factory service. According to the company, Data Factory is “a managed service to compose data storage, processing, and movement services into managed data production pipelines.” In other words, it helps developers integrate and process data from disparate sources.
A blog post about the cloud computing service explains, “Data processing is enabled initially through Hive, Pig and custom C# activities. Such activities can be used to clean data, mask data fields, and transform data in a wide variety of complex ways. The Hive and Pig activities can be run on an HDInsight cluster you create or you can allow Data Factory to fully manage the Hadoop cluster lifecycle on your behalf. Author your activities, combine them into a pipeline, set an execution schedule and you’re done – no Hadoop cluster setup or management. Data Factory also provides an up-to-the moment monitoring dashboard, which means you can deploy your data pipelines and immediately begin to view them as part of your monitoring dashboard.”
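To make the "author your activities, combine them into a pipeline" idea concrete, here is a minimal conceptual sketch in plain Python. It is not the Data Factory API, and it does not use Hive, Pig, or HDInsight. The names `clean`, `mask_email`, and `run_pipeline`, along with the sample records, are hypothetical and exist only to illustrate how cleaning and masking steps can be chained.

```python
from typing import Callable, Dict, List

# Hypothetical types for illustration; not part of any Azure SDK.
Record = Dict[str, str]
Activity = Callable[[List[Record]], List[Record]]


def clean(records: List[Record]) -> List[Record]:
    """Drop records with no email and strip whitespace from every value."""
    return [
        {key: value.strip() for key, value in record.items()}
        for record in records
        if record.get("email", "").strip()
    ]


def mask_email(records: List[Record]) -> List[Record]:
    """Keep the first character of the email's local part and mask the rest."""
    masked = []
    for record in records:
        local, _, domain = record["email"].partition("@")
        masked.append({**record, "email": local[:1] + "***@" + domain})
    return masked


def run_pipeline(activities: List[Activity], records: List[Record]) -> List[Record]:
    """Run each activity in order, feeding its output to the next one."""
    for activity in activities:
        records = activity(records)
    return records


# Assumed sample data: one valid record and one with a missing email.
raw = [
    {"name": " Ada ", "email": "ada@example.com"},
    {"name": "Bob", "email": ""},
]
result = run_pipeline([clean, mask_email], raw)
print(result)
```

In the real service, the equivalent steps would be declared as activities and scheduled by Data Factory rather than called in-process like this.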