3. Working with relational database - upalr/Python-camp GitHub Wiki
1 Introduction to relational databases
2 Creating a database engine in Python
2.1 Creating a database engine
- 'sqlite:///Northwind.sqlite': Connection string
- sqlite: Type of the database you are connecting
- Northwind.sqlite: Name of the database
2.2 Getting table names
3 Querying relational databases in Python
3.1 Basic SQL query
3.2 Workflow of SQL querying
3.3 Your first SQL query
problem with the columns name
3.4 Set the DataFrame column names
3.5 Using the context manager
you can use the context manager contract to open a connection which will save you the trouble of closing the connection later or save you the trouble for forgetting to close it.
Example 1: Filtering your database records using SQL's WHERE
There are a couple more standard SQL query chops that will aid you in your journey to becoming an SQL ninja.
Let's say, for example that you wanted to get all records from the Customer table of the Chinook database for which the Country is 'Canada'. You can do this very easily in SQL using a SELECT statement followed by a WHERE clause as follows:
SELECT * FROM Customer WHERE Country = 'Canada'
In fact, you can filter any SELECT statement by any condition using a WHERE clause. This is called filtering your records.
Example 2: Ordering your SQL records with ORDER BY
You can also order your SQL query results. For example, if you wanted to get all records from the Customer table of the Chinook database and order them in increasing order by the column SupportRepId, you could do so with the following query:
"SELECT * FROM Customer ORDER BY SupportRepId"
In fact, you can order any SELECT statement by any column.
4 Querying relational databases directly with pandas
4.1 The pandas way to query
4 lines of code is good but you can do it in one line
5 Advanced Querying: exploiting table relationships
5.1 Tables are linked
5.2 JOINing tables
5.3 INNER JOIN in Python (pandas)
Example 1 : Filtering your INNER JOIN
Milliseconds
is actually from Track
# Execute query and store records in DataFrame: df
df = pd.read_sql_query("SELECT * FROM PlaylistTrack INNER JOIN Track on PlaylistTrack.TrackId = Track.TrackId WHERE Milliseconds < 250000", engine)
# Print head of DataFrame
print(df.head())