21. Oracle Database Architecture (IMPORTANT)
Oracle Database Architecture:
It is divided into 3 parts:
1. Logical Part (System Global Area)
2. Background Processes
3. Physical Part (where the data is stored)
System Global Area: Basically the portion of RAM where the logical part of the database lives. When the database is started, the System Global Area is allocated in RAM. There are several components under the System Global Area.
Shared Pool:
Library Cache: This receives the hash code generated by the server process. Inside it there is a component called the SQL Area. The SQL queries of every user connected to the database, along with their hash codes, are stored in this SQL Area. The server process does a syntax check here in the Library Cache. If there is a syntax error, the server process returns the query and the syntax error to the user process. If there is no error, the query moves on to the Data Dictionary.
Data Dictionary: This contains the metadata (information about data) of all objects of the database: which objects (tables, views) are stored in the database, who owns them, where their physical address is, what the roles and privileges of the connected user are, what the columns of the objects are, and so on. All of this is stored in the Data Dictionary.
The server process does a semantic check against this Data Dictionary. For example, the query may reference a table or view that doesn't exist in the database, or a field requested by the user may not be present in the requested table or view. That's not a syntax error, so it reaches this layer, and the semantic check catches it here. The server process then returns this error to the user process.
If all the above checks pass, the server process then checks whether the user has permission to execute this query or not. All these checks are done in this Data Dictionary layer.
Once all the checking is done, the server process moves on to the Optimizer.
Optimizer: Generates multiple plans for the query execution. Just as, when planning a trip, we weigh the resources available and our priorities, draw up several plans, and pick the best one, the Optimizer creates multiple execution plans and the server process chooses the one that will use the fewest resources and take the least time to execute.
Resources are things needed to execute a query, such as available RAM and processor capacity. The server process considers these and chooses the optimum execution plan.
With that plan, the server process goes to the Data File in the physical layer.
Database Buffer Cache: The server process copies the data block needed by the user's query into this area. At this point the block is called a clean block. From here the server process returns the output.
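As a quick aside, we can ask the database to show us the plan the Optimizer chose for a statement. A minimal sketch, assuming the hypothetical `employees` table used in the examples later on this page:

```sql
-- Ask the optimizer to generate and store a plan for this query
EXPLAIN PLAN FOR
  SELECT * FROM employees WHERE employee_id = 102;

-- Display the chosen execution plan from the plan table
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```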
Background Processes: When we start our database these processes get started. These processes manage the entire database. They are:
SMON (System Monitor): Performs instance recovery.
PMON (Process Monitor): Monitors and cleans up after failed user processes.
DBWn (Database Writer, the "lazy writer"): When the CKPT process instructs it, it writes all the dirty blocks from the Database Buffer Cache to the data files. It is called the lazy writer because it never writes on its own; it only writes when CKPT instructs it to.
CKPT (Checkpoint): Executes every 3 seconds. Its job is to keep the three kinds of files in the physical area, the data files, the control files, and the redo log files, in sync. It tells the DBWn process to write all the dirty blocks from the Database Buffer Cache to the data files. It writes when the user commits.
There are 2 types:
- Incremental Checkpoint: Executes every 3 seconds. When the user commits, the SCN generated in that nanosecond is added as a suffix to that user's DML statements, which the Log Writer puts into the redo log file. The job of CKPT is to look for any new SCN among those suffixes; if there is a log entry with a new SCN, that SCN is written into the control file. It also tells DBWn to write the dirty blocks into the data files.
- Full Checkpoint: Executes when the database is shut down or when there is a log switch. Same function as the incremental checkpoint; only the trigger is different.
Suppose I run a DML query and another user also runs some DML. Both of our DML statements are moved from the Redo Log Buffer Cache to the redo log files by LGWR. Then I commit my changes, but the other user doesn't commit. Now, since both of our statements are in the redo log file, how will the system know which statements are committed and which are not?
Somehow we have to identify which statements are committed and when; we will need that information later during instance recovery. From the moment the database is started, a number is generated every nanosecond by the system itself. It is called the SCN (System Change Number). When I commit, that SCN gets attached as a suffix to all of my (and only my) DML statements. That is how the system recognises that those statements were committed, and by which user.
The job of the incremental CKPT is to wake up every 3 seconds, search for new SCNs, write them into the control file, and update the headers of the data files with the SCN, so that the data files, control files, and redo log files all remain in sync.
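We can actually see the current SCN, and check that the control file and the data file headers agree on the checkpoint SCN, using the views `v$database` and `v$datafile_header`:

```sql
-- Current SCN of the database (keeps increasing)
SELECT current_scn FROM v$database;

-- Checkpoint SCN as recorded in the control file
SELECT checkpoint_change# FROM v$database;

-- Checkpoint SCN as recorded in each data file header;
-- these should match the control file while everything is in sync
SELECT file#, checkpoint_change# FROM v$datafile_header;
```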
LGWR (Log Writer): This process writes the contents of the Redo Log Buffer Cache into the redo log files. The writing happens in 4 scenarios:
1. Every 3 seconds
2. If one third of the Redo Log Buffer Cache is full
3. On commit
4. If 1 MB of redo is filled
These statements are written because they help the database during instance recovery.
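To watch which redo log group LGWR is currently writing to, we can query the view `v$log`:

```sql
-- STATUS is CURRENT for the group LGWR is writing to,
-- and ACTIVE or INACTIVE for the others
SELECT group#, sequence#, status FROM v$log;
```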
Physical Part: The data is stored here. It is made up of 3 kinds of files:
Data File: In data files, all our data (tables and views) is stored in the form of data blocks. The size of Oracle data blocks can be 2 KB, 4 KB, 8 KB, 16 KB, or 32 KB; the industry-standard practice is the default block size of 8 KB. The server process comes to the data file, finds the data block where the data for the user's query is stored, takes a copy of that data block, and carries it to the Database Buffer Cache of the System Global Area.
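We can check the block size of our own database from the SQL prompt:

```sql
-- Default block size of the database in bytes (8192 = 8 KB)
SHOW PARAMETER db_block_size

-- The same value through the dictionary
SELECT value FROM v$parameter WHERE name = 'db_block_size';
```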
Data Block: A table is not stored in physical memory the way we see it all together on the screen. The values are stored randomly, wherever space could be found for them. For example, for a table like:

| Name | Id |
|------|----|
| Cobb | 10 |
| Din  | 12 |

the values (12, Cobb, etc.) will be stored randomly. They only carry the addresses of the other records, along with the table address, to connect them. So a data block is not a table; the values of a single table can be scattered across separate data blocks, with just addresses connecting them. A data block can also hold data from multiple tables, since they are stored randomly.
4 types of data block: Clean Block, Clear Block (or New Block), Pinned Block, Dirty Block.
Major data files: These get created by default whenever we create a database.
1. system01.dbf: In this data file all the system-related information and the Data Dictionary are present. When we stop the database, the System Global Area becomes blank, so they have to live somewhere physically: the Data Dictionary of the System Global Area exists physically in the system01.dbf data file.
2. sysaux01.dbf: This file takes a snapshot of the database every hour (the interval can be set to suit the organisation). This is used for checking the health of the database; also, when we want to produce a report on the database, we use this sysaux01.dbf data file.
3. users01.dbf: This file stores all the users we create who can connect to the database. All information related to users is stored in the users01.dbf data file.
4. undotbs01.dbf: The most important one.
Think of a rollback. Suppose we change data with `UPDATE employees SET salary = 15000 WHERE employee_id = 102;`, and the salary was 10000 before. Because of this query, the server process dedicated to that user changes the data in the Database Buffer Cache and creates a dirty block with the new data. The user still hasn't committed.
Now, if another user wants to see the salary of employee 102, they will obviously see the old record, since the commit hasn't happened. This works through the undotbs01.dbf data file: the old record gets saved in this undo data file, and in the main data file the address of the undo record is kept. So when the other user runs the query to see the salary of employee 102, the server process goes to the data file, picks up the address of the undo record, reads the data from there, and takes it to the Database Buffer Cache to display.
If the user who ran the update commits, the old record in the undotbs01.dbf data file is erased. If that user rolls back, the record from the undo data file is written back into the main data file.
Now, why is this file important? When the first user runs the update query, the data block of that record gets locked so that no one else can make changes at that time. If someone changed that record and at the same moment someone else also changed it, the data would not be consistent; hence the lock. But with that lock in place, no one would even be able to see that record during that time. That is why this undo file is used: it lets others read the record during the locking period. Only when the original user who updated the record commits or rolls back is the lock removed.
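A minimal two-session sketch of this read-consistency behaviour, using the same hypothetical `employees` table:

```sql
-- Session 1: update but do not commit yet; the block is locked
UPDATE employees SET salary = 15000 WHERE employee_id = 102;

-- Session 2: still sees the old value (10000), served from undo
SELECT salary FROM employees WHERE employee_id = 102;

-- Session 1: commit; the lock is released
COMMIT;

-- Session 2: now sees the new value (15000)
SELECT salary FROM employees WHERE employee_id = 102;
```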
5. temp01.dbf: Important. When someone does an ORDER BY, a GROUP BY, or some other calculation, the temp01.dbf data file is used to do all the work. From there the result returns to the Database Buffer Cache and is displayed. The reason is that when such a calculation runs on an extremely large dataset, say millions of records, the Database Buffer Cache is not big enough to hold everything; so this file is used for such large, complex calculations.
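The temp files of our own database can be listed with the dictionary view `dba_temp_files`:

```sql
-- Temp files with their sizes in bytes
SELECT file_name, bytes FROM dba_temp_files;
```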
We can create our own data files after this to store our own data.
Control File: The heart of the database. There are 2 default control files when the database is created: control01.ctl and control02.ctl. They both have the same content; if they ever differ, the database will not start. A control file contains the name of the database, the database creation date and time, the metadata of all the data files, and checkpoint information. It is also the physical repository of backup information: when a backup was done, how much was backed up, which backups have become obsolete, and so on; all of this is stored in the control files.
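The names and locations of the control files can be listed with the view `v$controlfile`:

```sql
-- Paths of all control file copies
SELECT name FROM v$controlfile;
```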
Redo Log File: The contents of the redo log files are all the DDL and DML statements. redo01.log, redo02.log, and redo03.log are the default redo log files.
For example, in the morning a user runs a DQL query. That query's hash code is stored in the SQL Area of the Library Cache. In the afternoon the same user executes the same query again, and the same process follows: the server process parses the query, generates the hash code, and takes it to the SQL Area. There it finds that the same hash code is already available. So it skips the syntax and semantic checks and follows the same execution plan it used before to run the query. This is called soft parsing. The other kind of parsing, where the whole process is done, is called hard parsing.
Now, if the same query is run by a different user, the syntax check is skipped, but the semantic check in the Data Dictionary still happens, to check whether the new user has permission to run this query or not.
All these areas have a limited size. Say the SQL Area has a size of 5 MB; then only a limited number of queries can be stored in it. A small query (a simple SELECT from a single table) takes about 55 bytes, so roughly 5 × 1024 × 1024 / 55 ≈ 95,325 such queries can fit in the SQL Area.
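We can peek at the statements cached in the SQL Area through the view `v$sql`; a statement that already appears here can be soft parsed on its next execution. The LIKE filter below is just an example pattern for the hypothetical `employees` queries:

```sql
-- Statements currently cached in the shared SQL area,
-- with how often they were parsed versus executed
SELECT sql_text, parse_calls, executions
FROM   v$sql
WHERE  sql_text LIKE '%employees%';
```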
Execution Process of DQL Queries:
1. When a user gets connected to a database, they connect to the instance of the database. The instance is the combination of the System Global Area and the Background Processes.
2. When the user connects to the instance, a process called the User Process is created. On the server side a process is also generated, called the Server Process; it does all the work to execute the query.
For every user there is a unique, dedicated User Process and Server Process. The query written by the user is carried by the User Process to the instance. The Server Process receives the query and then performs certain tasks.
3. First task, parse the query: it breaks the query into pieces (depending on the server process, either individual characters or individual words) and generates a hash code. This hash code is stored in the Library Cache of the Shared Pool of the System Global Area.
4. Then the server process does a syntax check in the Library Cache. If there is a syntax error, the server process returns the query and the syntax error to the user process. If there is no error, the query moves to the Data Dictionary.
5. The server process does a semantic check in the Data Dictionary: for example, the query references a table or view which doesn't exist in the database, or a field requested by the user is not present in the requested table or view. That's not a syntax error, so it reaches this layer, and the semantic check catches it here. The server process then returns this error to the user process.
6. If all the above checks pass, the server process then checks whether the user has permission to execute this query or not. All these checks are done in this Data Dictionary layer.
7. Once all the checking is done, the server process moves on to the Optimizer.
8. From the Optimizer's plans, the server process chooses the one which will use the fewest resources and also take the least time to execute. Then it goes to the data file in the physical area.
9. The server process comes to the data file, finds the data block where the data for the user's query is stored, takes a copy of that data block, and carries it to the Database Buffer Cache of the System Global Area.
10. In the Database Buffer Cache that block is, at this point, called a clean block. From here the server process returns the output.
Execution process of DML queries: For example, the user has run `UPDATE employees SET salary = 15000 WHERE employee_id = 102;`
The first steps are the same as before:
1. Creation of the User Process and Server Process.
2. The Server Process takes the query from the User Process, parses it, and generates the hash code.
3. It does a syntax check in the Library Cache and a semantic check in the Data Dictionary.
4. The Optimizer generates multiple execution plans; the Server Process takes the optimum one and goes to the data file.
5. It then copies the data block or blocks related to the query into the Database Buffer Cache as clean data blocks.
Let's say the salary of employee 102 was 10000. The server process changes it to 15000 according to the query. The moment the data is changed, the clean data block turns into a dirty data block. The update is present only in the Database Buffer Cache.
In the case of DML and DDL, those statements are written into the Redo Log Buffer Cache of the System Global Area. From there they are written from the Redo Log Buffer Cache into the redo log files. The writing is done by the LGWR (Log Writer) background process.
Now all the DML statements of multiple users are in the redo log file. When a user commits, the System Change Number (SCN) is attached to that user's DML statements. The background process CKPT searches for these new SCNs, puts them into the control file, and instructs DBWn to write the data blocks related to those DML statements into the data files.
When a user runs a DML statement, after all the steps of copying those data blocks from the data file to the Database Buffer Cache, a lock is applied on the original data blocks in the data file, so that meanwhile no other user can run a DML statement on those records. During that time a copy of the records is stored in the undotbs01.dbf data file, and its address is kept in the data file. If some other user wants to see those records in the meantime, the server process picks up that address and takes the value from the undotbs01.dbf data file. When the original user commits or rolls back, the lock is removed.
Process of Commit and Rollback:
When the commit happens, the background process CKPT tells the background process DBWn to write all the dirty blocks from the Database Buffer Cache to the data files. If the user rolls back, the old record from undotbs01.dbf is written back over the original records in the data file, the lock is removed, and the dirty block is discarded from the Database Buffer Cache.
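A small sketch of commit versus rollback with the same hypothetical table:

```sql
-- Change the data; a dirty block is created in the buffer cache
UPDATE employees SET salary = 15000 WHERE employee_id = 102;

-- Undo the change: the old value is restored from undo
ROLLBACK;

-- Change it again, and this time make it permanent
UPDATE employees SET salary = 15000 WHERE employee_id = 102;
COMMIT;
```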
When the database is started, the RAM gets divided into these areas according to the architecture: the System Global Area with its database buffers, redo buffers, and so on.
To manage the entire process, Oracle provides around 4,000 views. The main views are covered below.
Parameter file:
There are 2 types: the spfile (server parameter file) and the pfile (parameter file).
Naming convention for the spfile: spfile(sid).ora, e.g. spfileprod.ora (a binary file).
Naming convention for the pfile: init(sid).ora, e.g. initprod.ora (a text file).
Other than one being binary and the other being a text file, both are the same. In the spfile we cannot change parameters manually (being binary, it is changed through ALTER SYSTEM), while in the pfile we can edit the parameters manually.
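A minimal sketch of changing parameters through `ALTER SYSTEM` (the parameter names and values here are just examples):

```sql
-- Record the change in the spfile only; takes effect at next startup
ALTER SYSTEM SET processes = 300 SCOPE = SPFILE;

-- Change in memory only; lost at the next restart
ALTER SYSTEM SET open_cursors = 500 SCOPE = MEMORY;

-- Change both in memory and in the spfile
ALTER SYSTEM SET open_cursors = 500 SCOPE = BOTH;
```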
There is one more file: the password file. This stores the passwords of SYSDBA users; only the passwords of users with the SYSDBA role are stored here, not those of all users.
Unless the sys user logs in and starts the database, no one else will be able to log in to the database. It is like when we leave a flat: we only take the key to the main gate with us, not the keys to the other rooms, because until we open the main gate no one gets access to them. Similarly, until the sys user opens the lock (logs in and starts the database), other users can't authenticate themselves.
Password file naming convention: orapw(sid), e.g. orapwprod.
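The users recorded in the password file can be listed with the view `v$pwfile_users`:

```sql
-- Users who can authenticate through the password file,
-- with the administrative privilege (e.g. SYSDBA) they hold
SELECT * FROM v$pwfile_users;
```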
Let's see where these files (data files, control files, redo log files, parameter and password files) live physically:
/u01/app/oracle/oradata/PROD
All the data, control, and redo log files are located inside this path, which sits under the Oracle base location.
All the data files:
system01.dbf, sysaux01.dbf, users01.dbf, undotbs01.dbf, temp01.dbf
All the control files:
control01.ctl, control02.ctl
All the redo log files:
redo01.log, redo02.log, redo03.log
All these files are binary files.
The parameter and password files are stored in the Oracle home location:
/u01/app/oracle/product/19.0.0/dbhome_1/dbs
spfile: spfileprod.ora
pfile: It's not there yet; we will create it. Log in to the database as the sys user and run `CREATE PFILE FROM SPFILE;`. The file gets created: initprod.ora. In the pfile all the parameter values are listed, including the location of the control files.
To see these files from the SQL prompt:
`SHOW PARAMETER spfile`
`SHOW PARAMETER control_files`
Password file:
orapwprod
Important views:
`DESC dba_data_files`
From here we can see what our data files are and where they are located.
`SELECT file_name FROM dba_data_files;`
Log files, the view for watching them:
`DESC v$logfile`
`SELECT member FROM v$logfile;`
There are 2 types of views: static views (the dba_*, all_*, and user_* views) and dynamic performance views (the v$* views).
To see every view given by Oracle:
`SELECT table_name FROM dict;`
To see the database name and its open mode:
`SELECT name, open_mode FROM v$database;`