Spark

Towards Dependency-Aware Cache Management for Data Analytics Applications

Memory caches are being used aggressively in today’s data analytics systems such as Spark, Tez, and Piccolo. The significant performance impact of caches and their limited sizes call for efficient cache management in data analytics clusters. However, …

LRC: Dependency-Aware Cache Management for Data Analytics Clusters

Participated in the implementation of the online LRC module in Spark.