Google Launches Cloud DataFlow, a Cloud Computing Based Data Pipeline Management Tool

The beta version of Cloud DataFlow is intended for managing large data pipelines fed by social media activity, other cloud-based feeds, and data flows from multiple online platforms.

Google announced at its annual developer conference last weekend that it has started several new initiatives to improve the experience of users of its cloud computing services. These initiatives include a number of cloud-based features and capabilities aimed at increasing customer satisfaction. The most important of them is the launch of Cloud DataFlow in beta for managing cloud-based data pipelines.

According to the company's official announcement, the new tool helps users manage large, highly complex data pipelines through its powerful drill-down options. Cloud DataFlow can handle both streaming and batch data, processing and managing it according to the user's requirements.

Discussing the tool's technical features, Brian Goldfarb, Head of Product Marketing at Google Cloud Platform, explained that it uses the same API for both complex and simple pipelines. This lets developers concentrate on their work with the data while Cloud DataFlow manages the pipelines. The tool takes into account all communication between pipeline segments and can handle the traffic among them using different kinds of logic, such as aggregation by sliding window, map operations, keys, and others.
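To make "aggregation by sliding window" concrete, here is a minimal sketch in plain Python. It does not use the Cloud DataFlow API; the function name `sliding_window_counts` and its parameters are hypothetical, chosen only for illustration. It assumes events arrive as (timestamp, key) pairs with non-negative integer timestamps, and it counts events per key in overlapping windows.

```python
from collections import defaultdict


def sliding_window_counts(events, window_size, slide):
    """Count events per key in each sliding window.

    events: iterable of (timestamp_seconds, key) pairs, timestamps >= 0.
    Returns {window_start: {key: count}}, where each window covers
    [window_start, window_start + window_size).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for timestamp, key in events:
        # Latest window start (aligned to `slide`) that contains the event.
        start = (timestamp // slide) * slide
        # Walk back through every earlier window that still covers the event.
        while start > timestamp - window_size and start >= 0:
            counts[start][key] += 1
            start -= slide
    return {start: dict(per_key) for start, per_key in sorted(counts.items())}


if __name__ == "__main__":
    sample = [(0, "a"), (5, "a"), (12, "b"), (14, "a")]
    # 10-second windows that start every 5 seconds.
    print(sliding_window_counts(sample, window_size=10, slide=5))
```

Because the windows overlap, a single event can be counted in more than one window, which is the main difference from fixed, non-overlapping windows.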

The tool can produce graphical presentations of the desired reports from large volumes of pipeline data. It also provides monitoring information for the cloud application, an automatic debugging feature, and service-level metrics.

“Cloud Dataflow handles both batch and streaming data,” Goldfarb pointed out. “Imagine analyzing millions of tweets posted during a worldwide event in real time. In one pipeline segment, you read the tweets. In the next segment you extract tags. In another segment, you classify tweets by sentiment (positive, negative, or other). In the next segment, you filter for keywords. And so on. Map/Reduce, an older paradigm for handling large data sets, doesn’t readily deal with such real-time data, and doesn’t easily apply to such long, complex pipelines,” he added.
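The segments Goldfarb describes can be sketched as a chain of Python generators. This is only an illustration of the pipeline idea, not Cloud DataFlow code: the stage functions, the tiny word lists used for sentiment, and the sample tweets are all assumptions made up for the example.

```python
import re

# Toy word lists, assumed for illustration only.
POSITIVE = {"great", "love", "win"}
NEGATIVE = {"bad", "hate", "lose"}


def read_tweets(source):
    """Segment 1: read raw tweet text from any iterable."""
    for text in source:
        yield {"text": text}


def extract_tags(tweets):
    """Segment 2: pull out lowercase hashtags."""
    for tweet in tweets:
        tweet["tags"] = re.findall(r"#(\w+)", tweet["text"].lower())
        yield tweet


def classify_sentiment(tweets):
    """Segment 3: label each tweet positive, negative, or other."""
    for tweet in tweets:
        words = set(re.findall(r"\w+", tweet["text"].lower()))
        if words & POSITIVE:
            tweet["sentiment"] = "positive"
        elif words & NEGATIVE:
            tweet["sentiment"] = "negative"
        else:
            tweet["sentiment"] = "other"
        yield tweet


def filter_keywords(tweets, keywords):
    """Segment 4: keep only tweets tagged with a wanted keyword."""
    for tweet in tweets:
        if keywords & set(tweet["tags"]):
            yield tweet


def run_pipeline(source, keywords):
    """Chain the segments; each tweet flows through one at a time."""
    stages = filter_keywords(
        classify_sentiment(extract_tags(read_tweets(source))), keywords
    )
    return list(stages)


if __name__ == "__main__":
    sample = [
        "Love this match #WorldCup",
        "Bad call by the referee #worldcup",
        "Lunch time #food",
        "Kickoff soon #WorldCup",
    ]
    for tweet in run_pipeline(sample, {"worldcup"}):
        print(tweet["sentiment"], tweet["tags"])
```

Since generators are lazy, the same chain works whether the source is a finite list (batch) or an iterator that keeps producing tweets (streaming), which loosely mirrors the single-API point made in the article.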