Pacific-Design.com

    

PySpark - Compute Average in Parallel


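from pyspark import SparkContext

sc = SparkContext.getOrCreate()  # reuses the shell's context if one exists
a = sc.parallelize([1, 3, 4, 5, 8, 9, 12, 45, 67, 88, 99])

def average(x, y):
    return (x + y) / 2.0

def add(x, y):
    return x + y

# Caution: pairwise averaging is not associative, so this result depends on
# how the data is partitioned and is NOT the arithmetic mean.
b = a.reduce(average)

# Correct approach: reduce to a sum, then divide by the element count.
SUM = a.reduce(add)            # 341
COUNT = a.count()              # 11
AVERAGE = SUM / float(COUNT)   # 31.0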
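
The listing above reduces the same data with a pairwise average and with a sum. The sketch below, in plain Python with no Spark required, illustrates why only the sum-and-count route gives a stable mean. The two partition layouts and the helper names (reduce_partitioned, mean_partitioned, seq_op, comb_op) are assumptions made for illustration; they only imitate how a reduce might run per partition and then merge partial results.

```python
from functools import reduce

data = [1, 3, 4, 5, 8, 9, 12, 45, 67, 88, 99]

# Two hypothetical partition layouts standing in for how a cluster
# might split the same data.
partitions_a = [data[:4], data[4:8], data[8:]]
partitions_b = [data[:6], data[6:]]


def pairwise_average(x, y):
    return (x + y) / 2.0


def reduce_partitioned(func, partitions):
    """Reduce each partition on its own, then reduce the partial results."""
    partials = [reduce(func, part) for part in partitions]
    return reduce(func, partials)


# Pairwise averaging is not associative: the answer changes with the layout.
wrong_a = reduce_partitioned(pairwise_average, partitions_a)
wrong_b = reduce_partitioned(pairwise_average, partitions_b)


def seq_op(acc, value):
    """Fold one element into a (running_sum, running_count) pair."""
    return (acc[0] + value, acc[1] + 1)


def comb_op(left, right):
    """Merge two (sum, count) pairs from different partitions."""
    return (left[0] + right[0], left[1] + right[1])


def mean_partitioned(partitions):
    partials = [reduce(seq_op, part, (0, 0)) for part in partitions]
    total, count = reduce(comb_op, partials, (0, 0))
    return total / float(count)


# Carrying (sum, count) pairs gives the same mean for any layout.
mean_a = mean_partitioned(partitions_a)
mean_b = mean_partitioned(partitions_b)

print(wrong_a, wrong_b)
print(mean_a, mean_b)
```

In PySpark, the same two functions could likely be passed as a.aggregate((0, 0), seq_op, comb_op) to get the (sum, count) pair in a single pass, and a.mean() returns the average directly.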