ICP 07 : Cassandra - acikgozmehmet/BigDataProgramming GitHub Wiki
ICP-07: Independent Column based No SQL Tool - Cassandra
Installation:
- Download Python 2.7.X (https://www.python.org/downloads/release/python-2717/)
- Install Python and during installation select Entire python paths
- Download Cassandra Version 3.11.6 or latest (http://cassandra.apache.org/download/)
- Extract it in you localsystem, i.e., C Drive
- Add envrionemnt variable CASSNADRA_HOME with value "C:\apache-cassandra-3.11.6"
- Navigate to "C:\apache-cassandra-3.11.6\bin" folder
- Open command terminal window as administratior, navigate to "C:\apache-cassandra-3.11.6\bin" and Type "cassandra.bat -f" for starting cassandra server on localhost
- Wait for server to initialize completely for atleast 3 - 5 minutes
- open new terminal windows navigate to "C:\apache-cassandra-3.11.6\bin" and type "cqlsh"
How to run Cassandra:
After installation, open the cmd with admin privileges and execute the following command to initiliaze the cassandra
C:\Program Files\apache-cassandra-3.11.6\bin>> cassandra.bat
Then open another cmd with admin priviligies again and type the following command to get the shell
C:\Program Files\apache-cassandra-3.11.6\bin >> c:\python27\python cqlsh.py
(If you have several Python versions installed on your system)
To create new keyspace
create keyspace test with replication={'class':'SimpleStrategy', 'replication_factor':1};
To display all keyspaces
desc keyspaces;
To use created keyspace
use test;
To create new table called emp
test> create table emp (employee_id int PRIMARY KEY, department text, lastname text, years_with_company int, hiredate text, jobtitle text, salary int, managerid int);
To load file into created new tables
test> copy emp (employee_id, department, lastname, years_with_company, hiredate, jobtitle, salary, managerid) from 'D:\UMKC\__Spring2020\CS5590BDP\Lesson7\employee_entries.csv' WITH DELIMITER='|' AND HEADER = TRUE;
To print the data loaded in employees
Exercises
1. List the empID,ename,jobtitle,and hiredate of employee from the employee table.
select employee_id, lastname, jobtitle, hiredate from emp;
2. List the name,salary of the employees who are clerks.
select lastname, salary from emp where jobtitle ='clerks' allow filtering;
3. List the name,job,salary of every employee joined on ‘february 18,2000’,
select lastname, jobtitle, salary from emp where hiredate='2000-02-18' allow filtering;
4. List name and annual salary of all the employees.
select lastname, salary from emp;
5. Display employees’ names, salary and manager values of those employees whose salary is 45000 from EMP table using SELECT statement.
select lastname, salary, managerid from emp where salary=45000 allow filtering;
Bonus
Write a query that displays last names with first letter capitalized and all other letters in lowercase start with G.
CREATE CUSTOM INDEX fn_prefix ON emp (lastname) USING 'org.apache.cassandra.index.sasi.SASIIndex';
SELECT * FROM emp WHERE lastname LIKE 'G%';
References:
- http://cassandra.apache.org/
- UMKC Course Notes - CSEE5590-0001/490-0003: Big Data Programming