ICP 07 : Cassandra - acikgozmehmet/BigDataProgramming GitHub Wiki

ICP-07: Independent Column based No SQL Tool - Cassandra

Installation:

  1. Download Python 2.7.X (https://www.python.org/downloads/release/python-2717/)
  2. Install Python and during installation select Entire python paths
  3. Download Cassandra Version 3.11.6 or latest (http://cassandra.apache.org/download/)
  4. Extract it in you localsystem, i.e., C Drive
  5. Add envrionemnt variable CASSNADRA_HOME with value "C:\apache-cassandra-3.11.6"
  6. Navigate to "C:\apache-cassandra-3.11.6\bin" folder
  7. Open command terminal window as administratior, navigate to "C:\apache-cassandra-3.11.6\bin" and Type "cassandra.bat -f" for starting cassandra server on localhost
  8. Wait for server to initialize completely for atleast 3 - 5 minutes
  9. open new terminal windows navigate to "C:\apache-cassandra-3.11.6\bin" and type "cqlsh"

How to run Cassandra:

After installation, open the cmd with admin privileges and execute the following command to initiliaze the cassandra

C:\Program Files\apache-cassandra-3.11.6\bin>> cassandra.bat

Then open another cmd with admin priviligies again and type the following command to get the shell

C:\Program Files\apache-cassandra-3.11.6\bin >> c:\python27\python cqlsh.py (If you have several Python versions installed on your system)


To create new keyspace

create keyspace test with replication={'class':'SimpleStrategy', 'replication_factor':1};

To display all keyspaces

desc keyspaces;

To use created keyspace

use test;

To create new table called emp

test> create table emp (employee_id int PRIMARY KEY, department text, lastname text, years_with_company int, hiredate text, jobtitle text, salary int, managerid int);

To load file into created new tables

test> copy emp (employee_id, department, lastname, years_with_company, hiredate, jobtitle, salary, managerid) from 'D:\UMKC\__Spring2020\CS5590BDP\Lesson7\employee_entries.csv' WITH DELIMITER='|' AND HEADER = TRUE;

To print the data loaded in employees

Exercises

1. List the empID,ename,jobtitle,and hiredate of employee from the employee table.

select employee_id, lastname, jobtitle, hiredate from emp;

2. List the name,salary of the employees who are clerks.

select lastname, salary from emp where jobtitle ='clerks' allow filtering;

3. List the name,job,salary of every employee joined on ‘february 18,2000’,

select lastname, jobtitle, salary from emp where hiredate='2000-02-18' allow filtering;

4. List name and annual salary of all the employees.

select lastname, salary from emp;

5. Display employees’ names, salary and manager values of those employees whose salary is 45000 from EMP table using SELECT statement.

select lastname, salary, managerid from emp where salary=45000 allow filtering;

Bonus

Write a query that displays last names with first letter capitalized and all other letters in lowercase start with G.

CREATE CUSTOM INDEX fn_prefix ON emp (lastname) USING 'org.apache.cassandra.index.sasi.SASIIndex';

SELECT * FROM emp WHERE lastname LIKE 'G%';

References:

  1. http://cassandra.apache.org/
  2. UMKC Course Notes - CSEE5590-0001/490-0003: Big Data Programming