Any computer program requires data for processing. Data provided by user as well as data generated during or as a result of processing is stored in computer’s memory (RAM) which is temporary in nature. In order to retain the data permanently so that it can be retrieved and used whenever required, it is stored in computer files.
Computer files may store data in human readable text form, or computer program may apply certain encryption on it before storing. However, data files are largely unstructured. This leads to data redundancy (unnecessary repetition of data in multiple files) and compromises data integrity.
Database is a program independent and organised collection of data which removes data redundancy and ensures data integrity. Relational database model was proposed by E.F.Codd in 1970. In a relational database, data is organised in one or more tables (called relations). Generally, each table represents one entity.
Relational database design involves identifying independent entities involved and defining table structure for each entity. It also involves identifying relationship amongst the tables. There may be a one to one or one to many type of relationship between two tables.
An entity is characterized by attributes. Name of product and its price are attributes of product entity. The product table consists of columns, each corresponding to attribute. One row of product table describes one example of product. A row is also called tuple or record. Here is a sample product table:
Each column, representing attribute of entity is also called field in database terminology. In addition, table in a relational database consists of an attribute which uniquely identifies each record. In above example, productID field is unique for each record. Such a field/attribute is called primary key.
Another table shown below is customer table with Customer_name field and customerId as primary key.
If primary key of one table appears in design of another table to establish relationship, it becomes foreign key. Let us think of an invoices table. Each invoice record contains ProductID field using which name of product can be fetched from products table. Invoice table also contains CustomerID field which enables fetching customer’s name from customer table. In invoices table, Invoice No is the primary key, but ProductID and CustomerID are foreign keys.
This approach removes data redundancy. For example, instead of name of product or name of customer need not appear again in invoices table. Moreover if you change name of product or customer, you need not make multiple changes in invoices table.
In order to ensure data integrity, different constraints can be set on fields while designing a table.
Constraints enforce restrictions on data that a column can contain. They help in maintaining integrity and reliability of data in the table. Following clauses are used in definition of one or more columns of a table to enforce constraints:
PRIMARY KEY: Only one column in a table can be defined to be a primary key. Value of this table will uniquely identify each row (record) in the table. The primary key can be set to AUTOINCREMENT if its type is INTEGER. In that case, its value need not be manually filled.
NOT NULL: By default value for any column in a row can be left as null. NOT NULL constraint ensures that while filling a new row in the table or updating existing row, contents of specified column are not allowed to be null.
FOREIGN KEY: A column in one table may refer to primary key column in other. This is useful to establish relationship between the tables. However, to ensure referential integrity between tables, FOREIGN KEY constraint is imposed by using REFERENCES clause in table definition.
All RDBMS systems use SQL (Structured Query Language) for querying and manipulating data stored in tables of database. SQL was designed by Donald Chamberlin and R.F.Boyce. It was initially named as SEQUEL, but later on changed to SQL. It was adopted by ANSI as a standard in 1986.
Oracle, the first commercial RDBMS software was released in 1979, by Relational Software Inc. Later the company itself was renamed as Oracle Corporation. MySQL is an open source RDBMS. MySQL was owned and sponsored by the Swedish company MySQL AB, now owned by Oracle Corporation.
SQLite database is also open source, but serverless, and needs no configuration. Entire database is a single disk file that can be placed anywhere in operating system's file system. SQLite commands are similar to standard SQL. Unlike other mainstream languages like C, C++ or Java, SQL is not an imperative language. It is a set-based declarative language. Primary purpose of SQL is to query and manipulate data stored in relational database. The statements in SQL are classified as:
SQL statements that deal with creation, modification and deleting table in database come under data Definition Language (DDL) of SQL.
Primary operation of SQL is to query the data stored in tables. It is done by SELECT statement. Various clauses, for example WHERE, ORDER, GROUP etc. are used along with SELECT to fetch the data conditionally and present it appropriately.
Data manipulation Language (DML) statements in SQL are about inserting, deleting and updating data in table. COMMIT and ROLLBACK statements come under Transaction control category. They are generally used in stored procedures. If one or more data manipulation statements are likely to cause error, the changes are undone by ROLLBACK. If however, no errors are found, the transactions are confirmed by COMMIT.
Most RDBMS softwares are multiuser systems. User management statements are under Data control category. It involves statements to create and drop user, grant and revoke roles and privileges to a user.
Except for SELECT and DML statements, others are generally used by Database Administrators and not used when a program written in languages like Python tries to interface with the database. The front-end programs generally need to perform CRUD (CREATE, RETRIEVE, UPDATE and DELETE) operations on database.
SQLite is freely downloadable from its official web site https://www.sqlite.org/download.html. This page contains pre-compiled binaries for all major operating systems. A bundle of command-line tools contains command-line shell and other utilities to manage SQLite database files.
We shall download latest version of SQLite (version 3.25.1) along with command-line tools (https://www.sqlite.org/2018/sqlite-tools-win32-x86-3250100.zip for Windows) and extract the archive. To create a new SQLite database, enter following command from Windows Command prompt:
SQLite prompt appears as below:
SQLite version 3.25.1 2018-09-18 20:20:44 Enter ".help" for usage hints. sqlite>
To create a table use SQL CREATE TABLE query as follows: (Note that the statement ends with a semicolon)
sqlite> create table marks(name text, age int, marks int);
Add a record in the above table, use INSERT query
sqlite> insert into marks values('Shri', 21, 75);
To retrieve record, use SELECT query as below:
sqlite> select * from marks; Shri|21|75
Easiest way to use MySQL on your computer is to install Apache/MySQL bundle of your operating system. XAMPP for Windows is one web solution stack consisting of Apache http server and community developed fork of MySQL database along with interpreters for PHP and Perl languages.
Download XAMPP installer from https://www.apachefriends.org/download.html and start XAMPP control panel.
Click on Shell button on the right hand side to open XAMPP console and start mysql from command prompt:
You can now perform SQL operations in the command line mode.
There are many RDBMS software products in use today. Oracle, MySQL, SQL Server, SQLite being few popular ones. Many individuals and organizations developed their own Python modules acting as interface between respective databases and Python. However, each of these modules had differing nomenclature and functionality. As a result there was no portability among the database modules. If change in database is required, the database connectivity programs had to be rewritten.
A Special Interest Group was setup for database module standardization. Its recommendations are documented in Python Enhancement Proposal for DB-API. These standards were further modified to DB-API 2.0 by another enhancement proposal (PEP-249)
Standard Python distribution has in built support for SQLite database connectivity. It contains sqlite3 module which adheres to DB-API 2.0. The sqlite3 module is a DB-API V2 compliant module written by Gerhad Haring.
Similarly PyMySQL is a DB_API compatible module that is required for interacting with MySQL. You can also use MySQL Connector/Python driver available on MySQL official download page https://dev.mysql.com/downloads/connector/python/. However, these modules are not ported in Python's standard library and hence need to be installed before use. PyMySQL can be installed using pip installer:
pip install PyMySQL
DB-API has recommended that first of all a connection object representing the database is to be obtained. In order to establish connection with a SQLite database, sqlite3 module needs to be imported and connect() function needs to be executed.
import sqlite3 db=sqlite3.connect('test.db')
Similarly, for MySQL database
import PyMySQL db = PyMySQL.connect("localhost","root","","testdb" )
In case of PyMySQL, the connect() function requires server's name or IP address ('localhost' in this case), username(root) and password (not set in this case) and the name of database with which connection is to be established.
The connect() function returns a connection object referring to existing database. Following methods are defined in the connection class
|cursor()||returns a Cursor object which uses this Connection.|
|commit()||commit pending transactions to the databse.|
|rollback()||causes a transaction to be rolled back to the starting point.|
|close()||Closes the connection to the database permanently.|
This discussion from this point onwards is in context of SQLite database, but is perfectly applicable to MySQL database also. You only need to change 'import sqlite3' statement to 'import PyMySQL'.
A cursor is a Python object that enables you to work with the database. It acts as a handle for a given SQL query; it allows retrieval of one or more rows of the result. Hence, cursor object is obtained from connection to execute SQL queries.
SQL query is executed using execute() method of cursor object. In previous topic CREATE TABLE query was described. A string containing CREATE QUERY statement is the argument to execute() method.
cur.execute('''CREATE TABLE marks ( StudentID INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT (20) NOT NULL);''')
Lastly, the connection established should be explicitly closed.
Save following code as createdb.py and run it. You should see test.db created in the current folder
import sqlite3 db=sqlite3.connect('sqlitetest.db') cur=db.cursor() cur.execute('''CREATE TABLE marks ( StudentID INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT (20) NOT NULL, age INTEGER, marks INTEGER);''') db.close()
Once again execute() method of cursor object should be called with a string argument representing INSERT query syntax.
We have already created marks table having three fields, name, age and marks. To begin with, a new record is added in the table by executing INSERT query as shown below:
cur.execute("INSERT INTO marks (name, age, marks) VALUES ('Kiran',20,75);"
Then we accept user input for name, age and marks
nm=input("enter name: ") a=int(input("enter age: ")) m=int(input("enter marks: "))
The fixed values in Insert query are now replaced by variables nm, a and m. In order to concatenate the query string, a and m variables are converted to string by str() function.
sql="INSERT INTO marks (name, age, marks) VALUES ('"+nm+"','"+str(a)+"','"+str(m)+"');"
The connection class defines commit() and rollback() methods. Changes in database are finalised only if execute() method runs successfully by commit() method. Otherwise, any changes are undone by rollback() method.
Following script accepts user input and adds a new record in marks table
db=sqlite3.connect("sqlitetest.db") nm=input("enter name: ") a=int(input("enter age: ")) m=int(input("enter marks: ")) sql="INSERT INTO student (name, age, marks) VALUES ('"+nm+"','"+str(a)+"','"+str(m)+"');"
cur=db.cursor() cur.execute(sql) db.commit() print ("one record added successfully")
print ("error in operation") db.rollback() db.close()
Run above code and give input as below:
enter name: Komal enter age: 23 enter marks: 65
You can use SQlite console to verify the additions
sqlite> select * from marks; Komal|23|65
SELECT query forms a result set containing all records returned as a response to query. The execute() method uses a string representing SELECT query statement. There are two prominent methods as per DB-API standard.
fetchone(): Fetches next available record from the result set. It is a tuple consisting of values of each column of the fetched record.
Following code snippet retrieves and prints one record at a time till the result set is exhausted.
sql="SELECT * from marks;" cur=db.cursor() cur.execute(sql) while True: record=cur.fetchone() if record==None: break print (record)
fetchall(): Fetches all remaining records in the form of list of tuples. Each tuple corresponds to one record and contains values of each column in the table.
Following code snippet fetches all records and prints one at a time by using for statement.
sql="SELECT * from student;" cur=db.cursor() cur.execute(sql) set=cur.fetchall() for record in set: print (record)
Both code snippets will display all records in marks table.
We now find out how to execute UPDATE query through Python program. Following code first displays record belonging to name input by user.
nm=input("enter name") sql="SELECT * from marks WHERE name='"+nm+"';" cur=db.cursor() cur.execute(sql) record=cur.fetchone() print (record)
Then another input is accepted from user as new value for marks field. It is used to update respective record in student table
m=int(input("enter new marks: ")) sql="UPDATE marks SET marks='"+str(m)+"' WHERE name='"+nm+"';" try: cur.execute(sql) db.commit() print ("record updated successfully") except: print ("error in update operation") db.rollback()
enter nameAmol (4, 'Amol', 22, 61) enter new marks: 75 record updated successfully
The complete code is as follows:
import sqlite3 db=sqlite3.connect("sqlitetest.db") nm=input("enter name") sql="SELECT * from marks WHERE name='"+nm+"';" cur=db.cursor() cur.execute(sql) record=cur.fetchone() print (record) m=int(input("enter new marks: ")) sql="UPDATE marks SET marks='"+str(m)+"' WHERE name='"+nm+"';" try: cur.execute(sql) db.commit() print ("record updated successfully") except: print ("error in update operation") db.rollback() db.close()
Similarly a specific record from a table can be deleted by providing DELETE query as a parameter to execute() method.
Following code accepts name and displays its record. Then user is asked to confirm delete operation after which the corresponding record is deleted permanently from the table.
import sqlite3 db=sqlite3.connect("sqlitetest.db") nm=input("enter name") sql="SELECT * from marks WHERE name='"+nm+"';" cur=db.cursor() cur.execute(sql) record=cur.fetchone() print (record) res=input("Do you want to delete this record? (Y/N)") sql="DELETE FROM student WHERE name='"+nm+"';" if res=='Y': try: cur.execute(sql) db.commit() print ("record deleted successfully") except: print ("error in update operation") db.rollback() db.close()
When above code is run, following output will be shown in Python console:
enter nameRita (3, 'Rita', 21, 60) Do you want to delete this record? (Y/N)Y record deleted successfully
Thus we can perform CRUD operations on SQLite/MySQL database using DB-API compliant Python modules.