SQL SERVER-Transaction Log File Architecture

The transaction log is used to guarantee the data integrity of the database and for data recovery. Understanding the physical architecture can improve effectiveness in managing transaction logs.

The SQL Server Database Engine divides each physical log file internally into a number of virtual log files. Virtual log files have no fixed size, and there is no fixed number of virtual log files for a physical log file. The Database Engine chooses the size of the virtual log files dynamically while it is creating or extending log files. The Database Engine tries to maintain a small number of virtual files. The size of the virtual files after a log file has been extended is the sum of the size of the existing log and the size of the new file increment. We cannot configure size or number of virtual log files. Too many VLF can slow down database startup and also log backup and restore operations.

When the database is created, the logical log file begins at the start of the physical log file. New log records are added at the end of the logical log and expand toward the end of the physical log. Log truncation frees any virtual logs whose records all appear in front of the minimum recovery log sequence number (MinLSN). The MinLSN is the log sequence number of the oldest log record that is required for a successful database-wide rollback.

The transaction log in the example database would look similar to the one in the following illustration.

Transaction log architecture

Transaction log

 

When the end of the logical log reaches the end of the physical log file, the new log records wrap around to the start of the physical log file.

This cycle keep repeating and repeating.

If the log contains multiple physical log files, the logical log will move through all the physical log files before it wraps back to the start of the first physical log file.

So we can see that there is no random writing in log file and it is always circular so there is no chance to gain performance even someone create multiple log file.

Remember writing in log file is always synchronous so in time of abnormal shutdown SQL Server can redo or undo transaction.

For performance gain we should put log file on fastest drive available.

SQL Server-When was Sql server last restart?

Here are some ways to know a time when  SQL server last Restart?

Below is some way that i came across while i was working on this.

  • sys.dm_os_sys_info has new column sqlserver_start_time since SQL Server 2008.
SELECT sqlserver_start_time
FROM sys.dm_os_sys_info
  • Here  to  is another way from creation of Tempdb. We know Tempdb is created again when server restarts.
SELECT  crdate AS sqlserver_start_time
FROM    sysdatabases
WHERE   name = 'tempdb'
  • Another way is to read from xp_readerrorlog(But Beware what if someone cleared ErrorLog by executing “EXEC sp_cycle_errorlog ;”.
DECLARE  @logger TABLE (dt datetime, Process varchar(50), txt varchar(MAX))

INSERT @logger exec xp_readerrorlog

SELECT TOP 1 sqlserver_start_time FROM @logger

Here are some other way by which we can have near by time of SQl Server Restart.

  • First Session entry in sys.dm_exec_sessions.
SELECT  login_time
FROM    sys.dm_exec_sessions
WHERE   session_id = 1
  • Lazy Writer Login time.
SELECT  login_time
FROM    master..sysprocesses
WHERE   cmd = 'LAZY WRITER'


SQL SERVER-Non-uniform Memory Access(NUMA)

Microsoft SQL Server is non-uniform memory access (NUMA) aware, and performs well on NUMA hardware without special configuration. As clock speed and the number of processors increase, it becomes increasingly difficult to reduce the memory latency required to use this additional processing power. To circumvent this, hardware vendors provide large L3 caches, but this is only a limited solution. NUMA architecture provides a scalable solution to this problem. SQL Server has been designed to take advantage of NUMA-based computers without requiring any application changes.

NUMA Concepts:

The trend in hardware has been towards more than one system bus, each serving a small set of processors.

NUMA node is group of processors which has its own memory and possibly its own I/O channels. However, each CPU can access memory associated with the other groups in a coherent way.

The number of CPUs within a NUMA node depends on the hardware vendor. It is faster to access local memory than the memory associated with other NUMA nodes. This is the reason for the name, non-uniform memory access architecture.

Local memory is the memory that is on the same node as the CPU currently running the thread.

Foreign Memory/remote memory  is  memory that does not belong to the node on which the thread is currently running.

NUMA ratio is  ratio of the cost to access foreign memory over that for local memory.

If the NUMA ratio is 1, it is symmetric multiprocessing (SMP). The greater the ratio, the more it costs to access the memory of other nodes. Windows applications that are not NUMA aware (including SQL Server 2000 SP3 and earlier) sometimes perform poorly on NUMA hardware.

The main benefit of NUMA is scalability. The NUMA architecture was designed to surpass the scalability limits of the SMP architecture. With SMP, all memory access is posted to the same shared memory bus. This works fine for a relatively small number of CPUs, but not when you have dozens, even hundreds, of CPUs competing for access to the shared memory bus. NUMA alleviates these bottlenecks by limiting the number of CPUs on any one memory bus and connecting the various nodes by means of a high speed interconnection.

Hardware NUMA:

Computers with hardware NUMA have more than one system bus, each serving a small set of processors.The number of CPUs within a NUMA node depends on the hardware vendor. Your hardware manufacturer can tell you if your computer supports hardware NUMA.

If you have hardware NUMA, it may be configured to use interleaved memory instead of NUMA. In that case, Windows and therefore SQL Server will not recognize it as NUMA. Run the following query to find the number of memory nodes available to SQL Server:

SELECT DISTINCT memory_node_id
FROM sys.dm_os_memory_clerks

If SQL Server returns only a single memory node (node 0), either you do not have hardware NUMA, or the hardware is configured as interleaved (non-NUMA). If you think your hardware NUMA is configured incorrectly, contact your hardware vendor to enable NUMA. SQL Server ignores NUMA configuration when hardware NUMA has four or less CPUs and at least one node has only one CPU.

Soft-NUMA

  • SQL Server allows you to group CPUs into nodes referred to as soft-NUMA.
  • You usually configure soft-NUMA when you have many CPUs and do not have hardware NUMA, but you can also use soft-NUMA to subdivide hardware NUMA nodes into smaller groups.
  • Only the SQL Server scheduler and SQL Server Network Interface (SNI) are soft-NUMA aware.
  • Memory nodes are created based on hardware NUMA and therefore not impacted by soft-NUMA. So, for example, if you have an SMP computer with eight CPUs and you create four soft-NUMA nodes with two CPUs each, you will only have one memory node serving all four NUMA nodes. Soft-NUMA does not provide memory to CPU affinity.
  • The benefits of soft-NUMA include reducing I/O and lazy writer bottlenecks on computers with many CPUs and no hardware NUMA.
  • There is a single I/O thread and a single lazy writer thread for each NUMA node.
  • Depending on the usage of the database, these single threads may be a significant performance bottleneck. Configuring four soft-NUMA nodes provides four I/O threads and four lazy writer threads, which could increase performance.
         you cannot create a soft-NUMA that includes CPUs from different hardware NUMA nodes. For example, if your hardware has eight CPUs (0..7) and you have two hardware NUMA nodes (0-3 and 4-7), you can create soft-NUMA by combining CPU(0,1) and CPU(2,3). You cannot create soft-NUMA using CPU (1, 5), but you can use CPU affinity to affinitize an instance of SQL Server to CPUs from different NUMA nodes. So in the previous example, if SQL Server uses CPUs 0-3, you will have one I/O thread and one lazy writer thread. If, in the previous example SQL Server uses CPUs 1, 2, 5, and 6, you will access two NUMA nodes and have two I/O threads and two lazy writer threads.

SQL Server-Instant File Initialization

Data and log files are initialized to overwrite any existing data left on the disk from previously deleted files. Data and log files are first initialized by filling the files with zeros when you perform one of the following operations:

  • Create a database.
  • Add files, log or data, to an existing database.
  • Increase the size of an existing file (including autogrow operations).
  • Restore a database or filegroup.

File initialization causes these operations to take longer. However, when data is written to the files for the first time, the operating system does not have to fill the files with zeros.

Instant File Initialization:

Instant File initialization  was introduced from SQL Server 2005.

In SQL Server, data files can be initialized instantaneously. This allows for fast execution of the previously mentioned file operations. Instant file initialization reclaims used disk space without filling that space with zeros. Instead, disk content is overwritten as new data is written to the files. Log files cannot be initialized instantaneously.

SQL Server -Tail Log Backups

Tail Log Backup: Tail Log Back up is Back up of Tail of Log Just Before Before Restore operation of Database.

  • As it is Log Backup so we can take Tail Log Backup when recovery model Database either Full or Bulk Logged.

Like any log backup, a tail-log backup is taken by using the BACKUP LOG statement and it is  recommend that you take a tail-log backup in the following situations:

  • If the database is online and you plan to perform a restore operation on the database, before starting the restore operation, back up the tail of the log using WITH NORECOVERY:
    
    BACKUP LOG database_name TO <backup_device> WITH NORECOVERY
    
    -- If the database is damaged, use WITH CONTINUE_AFTER_ERROR, as follows:
    
    BACKUP LOG database_name TO <backup_device> WITH CONTINUE_AFTER_ERROR
    
    
  • Tail Lob Backup capture the tail of log even database is in offline mode,Database is damaged or missing data file.This might cause incomplete metadata from the restore information commands and msdb. However, only the metadata is incomplete; the captured log is complete and usable.
  • If a tail-log backup has incomplete metadata, in the backupset table, has_incomplete_metadata is set to 1. Also, in the output of RESTORE HEADERONLY, HasIncompleteMetadata is set to 1.
  • If the metadata in a tail-log backup is incomplete, the backupfilegroup table will be missing most of the information about filegroups at the time of the tail-log backup. Most of the backupfilegroup table columns are NULL.

 

 

 

 

 

 

SQL Server-Collation

Collation: Collation is set of rules which Determines how data are sorted and compared.

collations can be specified at many levels. When we install an instance of SQL Server ,we can specify the default collation for that instance. Each time we create a database,we can specify the default collation used for the database. If we do not specify a collation, the default collation for the database is the default collation for the instance.

  • We can get list of Collation by Querying.
select  * from  ::fn_helpcollations()
Let’s take example of two collation to know how it differ in behavior in context of Data insertion.
1.SQL_Latin1_General_CP1_CS_AS (Case Senstive,Accent-Sensitive).
2.SQL_Latin1_General_CP1_CI_AI (Case Insenstive,Accent-Insensitive).
first we create different table for both collation.
  • Employee_CS for (case sensitive collation i.e. SQL_Latin1_General_CP1_CS_AS )
  • Employee_CI for (case insensitive collation i.e. SQL_Latin1_General_CP1_CI_AI )
let’s Create table  Employee_CS,we will define collation at column level in this table which is case  and accent sensitive.
--Creating table Employee_CS ,define collation at column level which is case sensitive and accent sensitive
CREATE TABLE Employee_CS
    (
      id INT IDENTITY(1, 1) ,
      NAME NVARCHAR(100) COLLATE SQL_Latin1_General_CP1_CS_AS PRIMARY KEY
    )
 We have made Column Name Primary key in Table Employee_CS.Let’s insert same name but with different case.

 --insert to value for name which is not same  for case sensitive,both are different
INSERT INTO dbo.Employee_CS
        ( NAME )
VALUES  ( 'Bill'),('BILL')
--Insertion will be completed without error even we have primary key on Name column cause of collation we define on it.

We got Successful insertion of the data in the table cause Bill and BILL is different while we are defining case sensitive collation type on Column.

  • Let’s Create  Another table Employee_CI and define collation at column level which is case sensitive.
--Creating table Employee_CI ,define collation at column level which is case insensitive ans accent insensitive

CREATE TABLE Employee_CI
    (
      id INT IDENTITY(1, 1) ,
      NAME NVARCHAR(100) COLLATE SQL_Latin1_General_CP1_CI_AI PRIMARY KEY
    )

Let’s insert Same name Bill and BILL into Name of Employee_CI.


--insert to value for name which is same for case Insensitive,both are different in term of case sensitive
INSERT INTO dbo.Employee_CI
 ( NAME )
VALUES ( 'Bill'),('BILL')
--Insertion steps throws error "Violation of PRIMARY KEY constraint"
  • As you run insertion step it throws error cause as collation is case insensitive.
  • Now Question is how we can compare column that having different collation?
  • Let’s insert Value in Table Employee_CI so we can compare Column Name with table Employee_CS
INSERT INTO dbo.Employee_CI
        ( NAME )
VALUES  ( 'Bill'),('BILLS')
  • When we write simple joining statement it throws error that collation conflict occur.
SELECT * FROM dbo.Employee_CI EI INNER JOIN dbo.Employee_CS ES ON EI.NAME = ES.NAME
--Cannot resolve the collation conflict between "SQL_Latin1_General_CP1_CS_AS" and "SQL_Latin1_General_CP1_CI_AI" in the equal to operation.
  • We have to provide collation name where we are joining table based on column comparison as given below.
--You can give Default collation by stating DATABASE_DEFAULT.but you first got to know what collation current database have.
SELECT * FROM dbo.Employee_CI EI INNER JOIN dbo.Employee_CS ES
ON EI.NAME COLLATE SQL_Latin1_General_CP1_CI_AI= ES.NAME COLLATE SQL_Latin1_General_CP1_CI_AI
  • Here is code by which we can get collation at database level and server level.
SELECT DATABASEPROPERTYEX('sandbox','collation')--Gives Database level collation
SELECT SERVERPROPERTY('collation')--Gives Server Level collation

SQL Server-BackUp History

Many times we need backup history, when which type of backup is taken?

Is back up was compressed? So I think let me make a Script which gives list of backup history.

Let me share One interesting fact which I observed when I was working on  this Script.

Let me describe scenario.


  • We have two server 1.Test server and 2.Local server
  • I need to restore database on Local serer from Test server.
  • But Database does not exist on Local server.
  • So I will create new database.Then Restore Full back up which is taken from Test server.
  • now i will run following script for newly created and restored database.
  • Note it will show one record for which database is restored.
  • Reason is very simple cause it is backup right now for newly created Database as well as it is for Test server.as we have restored database to Local server from backup.so it will show one record for backup history on Local server without taking backup.
SELECT  ms.software_name [software] ,
		s.user_name ,
        s.database_name ,
        CASE ms.is_compressed
          WHEN 1 THEN 'Compressed'
          ELSE 'DeCompressed'
        END AS [BackupMethod] ,
         s.recovery_model ,
        CASE s.type
          WHEN 'D' THEN 'FULL'
          WHEN 'L' THEN 'Transaction Log'
          WHEN 'I' THEN 'Differential'
        END AS BackupType ,
        m.physical_device_name ,
        CASE ms.is_compressed
          WHEN 1
          THEN CAST(s.compressed_backup_size / 1048576 AS NUMERIC(18, 2))
          ELSE CAST(s.backup_size / 1048576 AS NUMERIC(18, 2))
        END AS [Size in MB] ,

        s.backup_start_date ,
        CAST(DATEDIFF(s, s.backup_start_date, s.backup_finish_date) AS VARCHAR(4)) TimeTaken ,
        s.first_lsn ,
        s.last_lsn ,
        s.server_name
FROM    msdb.dbo.backupset s
        INNER JOIN msdb.dbo.backupmediafamily m ON s.media_set_id = m.media_set_id
        INNER JOIN msdb.dbo.backupmediaset ms ON ms.media_set_id = s.media_set_id
WHERE   s.database_name = DB_NAME()