For any SQL Server DBA, the basic architecture of SQL Server should always be implanted in his/her memory. In this topic, I will explain the architecture in simple terms and how SQL Server uses the Transaction log file.
When SQL Server has to fetch rows, it would first search in the Data cache of the Buffer pool(SQL Server memory area) if the data required is already present. If it is present, it is fetched and returned to the Client. If it is not present, then it would fetch the rows from the data files present on the disk, places in the data cache and then returns the results back to the client. In a similar way, if an update operation needed to be performed by SQL Server, it would check in Data cache, updates them and sends back the result. Now a question arises, when will those changes made in the Data cache is written to the data files? The answer is the checkpoint. Checkpoint is a process that is run by SQL Server in the following cases:
1. Automatically using the recovery interval setting defined( default is 0 which means SQL Server calls Checkpoint for every less than a minute usually.)
2. When a Transaction log backup is performed
3. Explicitly issuing the CHECKPOINT command
These are the three primary cases wherein the checkpoint process is triggered. When it is triggered, those pages in the data cache which are not yet written to the data files on the disk are flushed to the data files. Those pages which are present in the Data cache and are different from the data in the data files are known as dirty pages.
Apart from Checkpoint, the data is also flushed to disk by a process known as Lazy Writer when the instance is under memory pressure.
Suppose, SQL Server is crashed. It needs to make sure that there is consistent and valid data present on the disk. When SQL Server is started, it would run a process called recovery for each database. Recovery means it would perform a redo and undo process of transactions.
Each and every action performed by SQL Server is recorded in the Transaction log file. In order to maintain consistency and durability, SQL Server first records the action in Transaction log file before making the change in Data cache. This is called Write-ahead Log protocol. A transaction may be committed or uncommitted, but when checkpoint occurs the dirty pages of these transactions are written to disk.
During recovery process, SQL server checks for all those transactions which are committed but are not yet written to disk by reading from the Transaction log file and are rolled forward(redo) meaning that they are hardened to disk. And, all those incomplete transactions which are not yet committed or those with no commit or rollback statements are rolled back(undo) from the data files. This is done to ensure atomicity and consistency of data. A recovery will also run at the time of restore database operation. After the initialization of files, copy and redo phase of the restoration process, SQL Server makes the database available while the undo process happens simultaneously when the users are accessing the database.