Sunday, January 20, 2019

Apache Spark Code Collection

Read a 'csv' File

lines = sc.textFile("..../u.data")
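The snippets in this post use `sc` without defining it. The interactive `pyspark` shell creates it for you, but a standalone script has to build its own. A minimal sketch, assuming a single-machine "local" master; the helper name `create_context` and the app name are hypothetical and not from the original post:

```python
def create_context(app_name="RatingsHistogram"):
    """Return a SparkContext that runs locally on one machine."""
    # Imported inside the function so that merely defining the helper
    # does not require pyspark to be installed.
    from pyspark import SparkConf, SparkContext

    # "local" runs Spark in a single thread on this machine (assumption:
    # no cluster is involved).
    conf = SparkConf().setMaster("local").setAppName(app_name)

    # getOrCreate reuses a context that is already running instead of
    # failing when a second one is requested.
    return SparkContext.getOrCreate(conf)
```

With this in place, `sc = create_context()` would come before the `sc.textFile(...)` call.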

Get the First Line of an RDD

lines = sc.textFile("..../u.data")

firstRow=lines.first()

Count How Often Each Value Appears and Show the Results

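import collections

lines = sc.textFile("/Users/edmondlegaspi/Desktop/Datasets/u.data")

# Keep only the third whitespace-separated field of each line
ratings = lines.map(lambda x: x.split()[2])

# Count how many times each value appears (returned as a dict on the driver)
results = ratings.countByValue()

# Sort by key so the results print in order
sortedResults = collections.OrderedDict(sorted(results.items()))

for key, value in sortedResults.items():
    print(key, value)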
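To check what this pipeline computes without starting Spark, the same steps can be sketched in plain Python. `collections.Counter` stands in for `countByValue()` here; it is a local substitute, not a Spark call. The sample rows are made up for illustration, and the column meanings in the comment are an assumption based on the `split()[2]` in the code above:

```python
import collections

# Hypothetical sample rows, assumed layout: user id, item id, rating, timestamp.
sample_lines = [
    "196\t242\t3\t881250949",
    "186\t302\t3\t891717742",
    "22\t377\t1\t878887116",
    "244\t51\t2\t880606923",
    "166\t346\t1\t886397596",
]


def count_ratings(lines):
    """Count how often the third field of each line appears, sorted by key."""
    # Same extraction as lines.map(lambda x: x.split()[2])
    ratings = (line.split()[2] for line in lines)
    # Local stand-in for ratings.countByValue()
    counts = collections.Counter(ratings)
    # Same sort-by-key step as in the Spark version
    return collections.OrderedDict(sorted(counts.items()))


sorted_counts = count_ratings(sample_lines)
for rating, count in sorted_counts.items():
    print(rating, count)
```

Note that the keys are strings, so they sort lexicographically. That works for single-digit values, but values with more than one digit would need `int(...)` applied before sorting.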


