Streaming link: - veeraravi/Spark-notes GitHub Wiki

==== Streaming ====
http://spark.apache.org/docs/1.6.2/streaming-programming-guide.html#a-quick-example
http://spark.apache.org/docs/1.6.2/api/scala/index.html#org.apache.spark.streaming.StreamingContext
https://prateekvjoshi.com/2015/12/22/analyzing-real-time-data-with-spark-streaming-in-python/
https://prateekvjoshi.com/2015/12/29/performing-windowed-computations-on-streaming-data-using-spark-in-python/
http://henning.kropponline.de/2015/03/22/spark-streaming-simple-example/
http://www.techsquids.com/bd/spark-socket-streaming-example-windows/

==== Cassandra ====
https://teddyma.gitbooks.io/learncassandra/content/about/the_cap_theorem.html

==== Spark setup ====
http://letsprog.com/apache-spark-tutorial-java-maven-eclipse/
https://hortonworks.com/tutorial/setting-up-a-spark-development-environment-with-java/

==== SPARK-AVRO ====
https://docs.databricks.com/spark/latest/data-sources/read-avro.html

==== SPARK to DB ====
http://www.sparkexpert.com/2015/04/17/save-apache-spark-dataframe-to-database/
https://people.csail.mit.edu/matei/papers/2015/sigmod_spark_sql.pdf
https://forums.databricks.com/questions/8736/get-mysqlrdbms-data-using-spark-streaming.html

==== Dataset ====
http://xinhstechblog.blogspot.co.uk/2016/07/
https://databricks.com/blog/2016/01/04/introducing-apache-spark-datasets.html
https://www.linkedin.com/pulse/apache-spark-rdd-vs-dataframe-dataset-chandan-prakash
https://jaceklaskowski.gitbooks.io/mastering-apache-spark/spark-sql-StructType.html
http://why-not-learn-something.blogspot.co.uk/2016/07/apache-spark-rdd-vs-dataframe-vs-dataset.html
http://www.agildata.com/apache-spark-rdd-vs-dataframe-vs-dataset/
https://databricks.com/blog/2017/08/31/cost-based-optimizer-in-apache-spark-2-2.html
http://www.kdnuggets.com/2016/02/apache-spark-rdd-dataframe-dataset.html
https://blog.codecentric.de/en/2016/07/spark-2-0-datasets-case-classes/

http://blog.madhukaraphatak.com/introduction-to-spark-two-part-2/
http://blog.madhukaraphatak.com/categories/spark-two/
https://github.com/phatak-dev/spark2.0-examples/blob/master/src/main/scala/com/madhukaraphatak/examples/sparktwo/DatasetVsDataFrame.scala
https://hortonworks.com/tutorial/learning-spark-sql-with-zeppelin/

==== KAFKA ====
https://www.javaworld.com/article/3060078/big-data/big-data-messaging-with-kafka-part-1.html
http://cloudurable.com/blog/kafka-tutorial-kafka-producer/index.html
https://dzone.com/articles/kafka-producer-in-java

==== WEB HDFS ====
https://bighadoop.wordpress.com/2013/06/02/hadoop-rest-api-webhdfs/
https://hortonworks.com/blog/webhdfs-http-rest-access-to-hdfs/
http://pivotalhd.docs.pivotal.io/docs/using-webhdfs-rest-api.html

==== Off heap ====
https://dzone.com/articles/heap-vs-heap-memory-usage
https://stackoverflow.com/questions/43330902/spark-off-heap-memory-config-and-tungsten
https://www.pgs-soft.com/spark-memory-management-part-1-push-it-to-the-limits/

==== OOZIE ====

==== SERDE ====
