X1: Scalable Analysis
Co-designing Big Data Analytics Systems with Modern Networks
Data is now counted among the most important assets in academia and industry. It has been estimated that companies that adopt Big Data Analytics can increase productivity by 10% more than companies that do not, and that Big Data practices in Europe could add 1.9% to GDP between 2014 and 2020. In recent years, many different Big Data Analytics Systems, such as MapReduce, Spark, and more recently TensorFlow, have been used to process large amounts of data. A basic need of all these systems is that data centre networks provide high throughput for large parallel dataflows. Examples are the massive shuffle traffic in a MapReduce application and the traffic to and from a central parameter server that stores the trained model parameters in distributed machine learning systems such as TensorFlow. Furthermore, these systems are increasingly deployed across multiple data centres, including the edges of the network, to support scenarios in which data needs to be pre-processed close to its sources; such deployments might even include mobile end-devices.