Part 0 Deliverables:
Sample program that alphabetizes data:
- RDD API
- Dataset API
- DataFrame API
Algorithm:
- split the data into words
- collect the words into a single array
- sort the array alphabetically
- print the sorted array
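The steps above can be sketched in plain Python, without Spark, to show the intended data flow. This is only an illustration under stated assumptions: the function name alphabetize and the sample input are hypothetical, words are split on whitespace, and they are lowercased so that sorting ignores case. The actual deliverables would express the same steps through the RDD, Dataset, and DataFrame APIs.

```python
def alphabetize(lines):
    """Return every word found in `lines`, sorted alphabetically."""
    words = []
    for line in lines:
        # Separate the data by word (assumption: whitespace-delimited,
        # lowercased so that "Dog" and "dog" sort together).
        words.extend(line.lower().split())  # aggregate into one array
    words.sort()                            # alphabetize the array
    return words


if __name__ == "__main__":
    # Hypothetical sample input, one string per line of data.
    sample = ["the quick brown fox", "jumps over the lazy dog"]
    for word in alphabetize(sample):        # print out the array
        print(word)
```

Duplicates are kept, because the algorithm sorts the words and does not ask for them to be deduplicated.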
Sample program that returns the word with the highest count:
- RDD API
- Dataset API
- DataFrame API
Algorithm:
- split the data into words
- map words to key-value pairs where key = word and value = count
- aggregate the counts by key
- print the word(s) with the highest frequency
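A minimal plain-Python sketch of these steps follows, again without Spark and only to illustrate the logic. The function name most_frequent_words and the sample input are hypothetical. It assumes whitespace-delimited, lowercased words and that each word starts with a count of 1 before aggregation. Ties are all returned, matching the "word(s)" wording above.

```python
from collections import defaultdict


def most_frequent_words(lines):
    """Return (words, count) for the word(s) with the highest frequency."""
    counts = defaultdict(int)
    for line in lines:
        # Separate the data by word (assumption: whitespace-delimited, lowercased).
        for word in line.lower().split():
            key, value = word, 1      # map the word to a (word, 1) pair
            counts[key] += value      # aggregate the counts by key
    if not counts:
        return [], 0                  # no input words at all
    highest = max(counts.values())
    # Keep every word that reaches the highest count, sorted for stable output.
    winners = sorted(w for w, c in counts.items() if c == highest)
    return winners, highest


if __name__ == "__main__":
    # Hypothetical sample input, one string per line of data.
    sample = ["the quick brown fox", "jumps over the lazy dog"]
    words, count = most_frequent_words(sample)
    print(words, count)
```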