Spark

Apache Spark (https://spark.apache.org) is a unified analytics engine for big data processing, with built-in modules for streaming, SQL, machine learning, and graph processing. Spark is widely deployed in industry.

The log set was collected by aggregating logs from the Spark system in our lab at CUHK, which comprises 32 machines. The logs are aggregated at the machine level. Three machines have since been repaired, and some logs were lost. The logs are large (over 2GB) and are provided as-is, without further modification or labelling. They cover both normal and abnormal application runs.

Note that Spark_2k.log is a sample log. The raw logs can be requested from Zenodo: https://doi.org/10.5281/zenodo.1144100
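
As a starting point for exploring the sample, the following is a minimal Python sketch that splits each line into fields and tallies log levels. It assumes the lines follow Spark's default log4j layout (`%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n`); this README does not specify the line format, so check it against the file first. The names `parse_line` and `count_levels` are illustrative and not part of the dataset.

```python
import re
from collections import Counter

# Assumption: each entry looks like "<yy/MM/dd> <HH:mm:ss> <LEVEL> <Component>: <message>",
# i.e. Spark's default log4j pattern. Adjust the regex if the file differs.
LINE_RE = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<component>[^:\s]+): (?P<content>.*)$"
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match the assumed layout."""
    match = LINE_RE.match(line.rstrip("\n"))
    return match.groupdict() if match else None


def count_levels(lines):
    """Count log levels over an iterable of lines.

    Lines that do not match (for example, stack-trace continuations)
    are tallied under the key "UNPARSED" rather than dropped silently.
    """
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        counts[record["level"] if record else "UNPARSED"] += 1
    return counts
```

With the sample file in the working directory, `count_levels(open("Spark_2k.log", encoding="utf-8", errors="replace"))` would return a `Counter` keyed by level. A large "UNPARSED" count would suggest the assumed layout does not fit the data.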
