Icp 11 - bhargavi1411/BigDataProgramming GitHub Wiki

Name : Bhargavi Saipoojitha Chennupati

Class Id : 4

Task : Apache Spark Streaming

Task 1 : Spark Streaming using Log File Generator

Below are the code screenshots for generating log files and running a word count on each log file.

Log file generator code:
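The generator code is not reproduced in text on this page, so the following is only a minimal sketch of what such a script might look like. The function name `generate_log_files`, the `log` directory, the sample lines, and the file names are all assumptions for illustration, not the original file.py. Each file is written under a dot-prefixed temporary name and then renamed, since Spark's file stream expects files to appear in the watched directory atomically.

```python
import os
import random
import time

# Hypothetical sample content; the original generator's text is not shown.
SAMPLE_LINES = [
    "INFO user logged in",
    "WARN disk usage high",
    "ERROR connection timed out",
    "INFO request served",
]


def generate_log_files(directory="log", num_files=5, lines_per_file=10,
                       delay=0.0, seed=None):
    """Write num_files small log files into directory and return their paths."""
    rng = random.Random(seed)
    os.makedirs(directory, exist_ok=True)
    paths = []
    for i in range(num_files):
        final_path = os.path.join(directory, f"log{i}.txt")
        # Write under a hidden temporary name first, then rename, so a
        # directory watcher never sees a half-written file.
        tmp_path = os.path.join(directory, f".log{i}.txt.tmp")
        with open(tmp_path, "w", encoding="utf-8") as handle:
            for _ in range(lines_per_file):
                handle.write(rng.choice(SAMPLE_LINES) + "\n")
        os.replace(tmp_path, final_path)
        paths.append(final_path)
        time.sleep(delay)  # pause so files can land in different batches
    return paths
```

Calling `generate_log_files("log", num_files=30, delay=5)` would add one new file every five seconds, which gives a running stream something to pick up.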

Word Count On each log file code:
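As a rough sketch of the word count side, assuming the PySpark DStream API (`pyspark.streaming`) is available and that log files arrive in a directory named `log`. The function names, the five-second batch interval, and the directory are assumptions, not the original streaming.py. `count_words` is a plain-Python reference for what each batch is expected to compute.

```python
from collections import Counter


def split_words(line):
    """Split one log line into whitespace-separated words."""
    return line.split()


def count_words(lines):
    """Plain-Python equivalent of the per-batch word count."""
    counts = Counter()
    for line in lines:
        counts.update(split_words(line))
    return dict(counts)


def run_log_word_count(directory="log", batch_seconds=5):
    """Watch a directory and print word counts for each new batch of files."""
    # Imported here so the helpers above work without Spark installed.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext("local[2]", "LogFileWordCount")
    ssc = StreamingContext(sc, batch_seconds)
    lines = ssc.textFileStream(directory)  # only files added after start
    counts = (lines.flatMap(split_words)
                   .map(lambda word: (word, 1))
                   .reduceByKey(lambda a, b: a + b))
    counts.pprint()
    ssc.start()
    ssc.awaitTermination()
```

`run_log_word_count("log")` blocks until the job is stopped. Because the file stream only sees files created after it starts, the streaming job has to be running before the generator writes anything.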

Output:

First execute streaming.py and, after a short wait, execute file.py.

Task 2 : Spark Streaming for TCP Socket

Below is the code screenshot for streaming words through port 5000; the word count appears in the PyCharm terminal.
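A minimal sketch of a socket word count, assuming the PySpark DStream API and a text source already listening on `localhost:5000` (for example netcat started with something like `nc -lk 5000`; the exact flags depend on the netcat build). The names `to_pairs` and `run_socket_word_count` and the batch interval are assumptions, not the original wordcount.py.

```python
def to_pairs(line):
    """Turn one line of text into (word, 1) pairs."""
    return [(word, 1) for word in line.split()]


def run_socket_word_count(host="localhost", port=5000, batch_seconds=5):
    """Read lines from a TCP socket and print word counts per batch."""
    # Imported here so to_pairs can be used without Spark installed.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    # local[2]: one thread for the socket receiver, one for processing.
    sc = SparkContext("local[2]", "SocketWordCount")
    ssc = StreamingContext(sc, batch_seconds)
    lines = ssc.socketTextStream(host, port)
    counts = lines.flatMap(to_pairs).reduceByKey(lambda a, b: a + b)
    counts.pprint()
    ssc.start()
    ssc.awaitTermination()
```

Each line typed into the netcat session becomes part of the next batch, and `pprint()` writes the counts for that batch to the terminal.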

Output:

First start netcat listening on port 5000 in the command prompt, and then execute wordcount.py.

Below is the code screenshot for streaming a file through a TCP socket and running the word count on it:
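The sending side of this task could be sketched with Python's standard `socket` module as follows. This is an assumption about how a script like socketlisten.py might work, not its actual contents: the function names, the port 5000 default, the delay between lines, and the choice to skip blank lines are all hypothetical. The receiving side would be a Spark Streaming job that reads from the same host and port.

```python
import socket
import time


def encode_lines(path):
    """Yield each non-empty line of the file as newline-terminated UTF-8 bytes."""
    with open(path, encoding="utf-8") as source:
        for line in source:
            text = line.strip()
            if text:  # skipping blank lines is a choice made for this sketch
                yield (text + "\n").encode("utf-8")


def serve_file(path, host="localhost", port=5000, delay=0.5):
    """Wait for one client, then send the file to it line by line."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server.bind((host, port))
        server.listen(1)
        conn, _addr = server.accept()  # blocks until the Spark job connects
        with conn:
            for payload in encode_lines(path):
                conn.sendall(payload)
                time.sleep(delay)  # spread the lines over several batches
```

The server has to be listening before the Spark job starts, since the job connects as a client; that matches the run order of listener first, word count second.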

Output:

First execute socketlisten.py and then execute fullwordcount.py.

Bonus : Spark Streaming for Character Frequency using TCP Socket.

Below is the code for computing character frequency using Spark Streaming:
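A character-frequency job differs from a word count only in how each line is split. The sketch below assumes the PySpark DStream API and a text source on `localhost:5000`; the function names and the decision to ignore whitespace characters are assumptions, not the original charfreq.py.

```python
def to_chars(line):
    """Split a line into characters, ignoring whitespace (an assumption)."""
    return [ch for ch in line if not ch.isspace()]


def run_char_frequency(host="localhost", port=5000, batch_seconds=5):
    """Read lines from a TCP socket and print character counts per batch."""
    # Imported here so to_chars can be used without Spark installed.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext("local[2]", "CharFrequency")
    ssc = StreamingContext(sc, batch_seconds)
    lines = ssc.socketTextStream(host, port)
    counts = (lines.flatMap(to_chars)
                   .map(lambda ch: (ch, 1))
                   .reduceByKey(lambda a, b: a + b))
    counts.pprint()
    ssc.start()
    ssc.awaitTermination()
```

Counts here are case-sensitive; lowercasing each line inside `to_chars` would merge `A` and `a` if that is preferred.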

Output:

First start netcat on port 5000 in the command prompt, and then execute charfreq.py.