Question 1

What is the difference between UNION and UNION ALL?

Accepted Answer

UNION combines the result sets of two structurally compatible queries into a single result set. The difference between UNION and UNION ALL is that UNION discards duplicate records while UNION ALL keeps them. Note that UNION ALL will generally perform better than UNION, since UNION requires the server to do the extra work of removing duplicates. So in situations where you are sure there will be no duplicates, or where duplicates are not a problem, UNION ALL is recommended for performance reasons. Let's have a look at the examples below, which show the usage of both. The first uses UNION and the second uses UNION ALL.

Union example code
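
As a minimal sketch of the difference, assuming two made-up table variables (@A and @B and their values are illustrative, not taken from the original example):

```sql
-- Hypothetical sample data; names and values are illustrative only.
DECLARE @A TABLE (City VARCHAR(20));
DECLARE @B TABLE (City VARCHAR(20));
INSERT INTO @A VALUES ('Delhi'), ('Pune');
INSERT INTO @B VALUES ('Pune'), ('Mumbai');

-- UNION removes duplicates: 'Pune' appears once (3 rows).
SELECT City FROM @A
UNION
SELECT City FROM @B;

-- UNION ALL keeps duplicates: 'Pune' appears twice (4 rows).
SELECT City FROM @A
UNION ALL
SELECT City FROM @B;
```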

Question 2

List all joins supported in SQL?

Accepted Answer

Inner Join- This is the simplest and most widely used JOIN. It is the default when the JOIN keyword is used without specifying a type. It returns only the records from both tables that satisfy the matching condition.
LEFT JOIN- This is also called LEFT OUTER JOIN. Outer joins cover situations where we want all records from one table and only the matching records from the other, and they come in two types, LEFT and RIGHT. We use LEFT JOIN when we want all records from the LEFT table and only the matching records from the RIGHT one. When a LEFT row has no match in the right table, all columns from the right table are NULL for that row, while the LEFT table still contributes all of its records.
RIGHT JOIN- This is also called RIGHT OUTER JOIN. It is just the reverse of LEFT JOIN. The result set has all records from the right table but only the matching records from the left one. Even if the ON clause finds no matching record, the row from the right table is still returned, with NULL values in the corresponding columns of the LEFT table.
FULL JOIN- It is also called FULL OUTER JOIN. It combines the characteristics of LEFT and RIGHT outer joins. The result set contains every row from both tables: matched rows are combined, and unmatched rows from either side are returned with NULLs in the other side's columns. We can also say that it generally gives the same result as applying UNION to the LEFT and RIGHT OUTER JOIN result sets.
CROSS JOIN- This is the Cartesian product of two tables, where each row from the first table is joined with every row of the second table. A SELECT statement on two tables separated by a comma and without any WHERE condition gives the same result as CROSS JOIN.
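
The join types above can be compared side by side with a small sketch. The table variables @Customers and @Orders and their rows are hypothetical, chosen so that one customer has no order and one order has no customer:

```sql
-- Hypothetical tables for illustration.
DECLARE @Customers TABLE (CustomerId INT, Name VARCHAR(20));
DECLARE @Orders TABLE (OrderId INT, CustomerId INT);
INSERT INTO @Customers VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO @Orders VALUES (10, 1), (11, 3);

-- INNER JOIN: only Asha, who has a matching order (1 row).
SELECT c.Name, o.OrderId FROM @Customers c
INNER JOIN @Orders o ON o.CustomerId = c.CustomerId;

-- LEFT JOIN: both customers; OrderId is NULL for Ravi (2 rows).
SELECT c.Name, o.OrderId FROM @Customers c
LEFT JOIN @Orders o ON o.CustomerId = c.CustomerId;

-- RIGHT JOIN: both orders; Name is NULL for order 11 (2 rows).
SELECT c.Name, o.OrderId FROM @Customers c
RIGHT JOIN @Orders o ON o.CustomerId = c.CustomerId;

-- FULL JOIN: matched row plus unmatched rows from both sides (3 rows).
SELECT c.Name, o.OrderId FROM @Customers c
FULL JOIN @Orders o ON o.CustomerId = c.CustomerId;

-- CROSS JOIN: every customer paired with every order (2 x 2 = 4 rows).
SELECT c.Name, o.OrderId FROM @Customers c
CROSS JOIN @Orders o;
```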

Question 3

How many system databases do we have in SQL Server?

Accepted Answer

There are four main system databases:
Master: It contains the system catalogs that keep data about disk space, file allocations, usage, system-wide configuration settings, login accounts, the existence of other databases, and the existence of other SQL Servers (for distributed operations). If this database does not exist or is corrupted, the SQL Server instance cannot start. Although we can create user objects in the master database, it is not advised to do so. This database should always remain as static as possible. If the master database is rebuilt, all user objects in it will be lost.
Model: It is basically a template database. Each time you create a new database, SQL Server makes a copy of model to form the basis of the new database. Any changes made to this database, such as database size, collation, recovery model, and other configuration, are applied to any database created afterward.
Tempdb: The temporary database, tempdb, is a workspace. It is unique among the databases because it is re-created, not recovered, each time SQL Server is started.
Msdb: This database is used by the SQL Server Agent service, which performs scheduled activities such as backup and replication tasks.

system databases
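
As a quick check, the system databases can be listed from the sys.databases catalog view. This is a minimal sketch that filters by name:

```sql
-- List the four main system databases on the current instance.
SELECT name, database_id
FROM sys.databases
WHERE name IN ('master', 'model', 'msdb', 'tempdb');
```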

Question 4

What are normalization and denormalization in SQL?

Accepted Answer

Normalization and denormalization are database design strategies. Normalization is the process of minimizing insertion, deletion and update anomalies by eliminating redundant data. Denormalization is the reverse process, in which redundancy is deliberately added to the data to improve the performance of a particular application, so data integrity has to be maintained with extra care. Normalization also prevents wasted disk space by reducing or eliminating redundancy.

Normalization v/s denormalization in SQL

Question 5

What is the difference between a clustered and a non-clustered index?

Accepted Answer

Indexes are used to speed up searches in the database. A real-world comparison is a book: its page numbers and the keyword index, usually at the back of the book, work much like database indexes. We can go straight to a page if we know its page number, and if we know the keyword we are looking for, the keyword section makes the job easier because keywords are linked to page numbers. There are two types of indexes in a SQL database: clustered and non-clustered. The page numbers of the book are similar to the clustered index, while the keyword section represents non-clustered indexes. Both exist in the database as B-Tree structures. Let's go into the details of each index.

Clustered Index

We can have only one clustered index per table. When we create a primary key, by default it creates a clustered index on that key. We can also specify the index type explicitly if we would like the primary key to use a non-clustered index instead. A table is called a heap if it does not have a clustered index. As we said earlier, indexes are B-Tree structures. In a clustered index, the leaf nodes are the data pages of the table, so the data is physically sorted by the clustered index key. This is why it is generally the faster of the two index types.

Non-clustered Index

Earlier it was possible to have only 249 non-clustered indexes on a table, but from SQL Server 2008 onward we can have 999. A non-clustered index gets created by default when we create a unique key constraint, and we can also choose the index type while creating the unique constraint. It is not mandatory to have any non-clustered index. The leaf nodes of a non-clustered index are index pages that hold either the clustering key or the RID needed to reach the actual row. If the table does not have a clustered index, the leaf nodes point to the physical location of the row, which is called the RID; if a clustered index is present, the leaf nodes hold the clustering key. It is often described as quicker than a clustered index for DML operations such as adding, deleting and updating records, although every additional index adds some write overhead.
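
The defaults described above can be sketched as follows. The table dbo.Books and all constraint and index names are hypothetical:

```sql
-- Hypothetical table; all names are illustrative.
CREATE TABLE dbo.Books (
    BookId INT NOT NULL,
    Isbn CHAR(13) NOT NULL,
    Title VARCHAR(200) NOT NULL,
    -- A primary key creates a clustered index by default;
    -- CLUSTERED is spelled out here only for clarity.
    CONSTRAINT PK_Books PRIMARY KEY CLUSTERED (BookId),
    -- A unique constraint creates a non-clustered index by default.
    CONSTRAINT UQ_Books_Isbn UNIQUE NONCLUSTERED (Isbn)
);

-- An additional non-clustered index to speed up searches by title.
CREATE NONCLUSTERED INDEX IX_Books_Title ON dbo.Books (Title);
```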

Question 6

What are magic tables in SQL Server?

Accepted Answer

Magic tables are an integral part of triggers and make them easier to write. There are two magic tables, one called inserted and the other called deleted. They are built by SQL Server itself to hold data while a trigger is being processed, and they contain the inserted, deleted and updated records of the DML operation on a table. When a trigger fires for an INSERT operation, the newly inserted values are stored in the INSERTED magic table. When a trigger fires for a DELETE operation, the removed records are stored in the DELETED magic table. For an UPDATE operation, the old values are stored in DELETED and the new values in INSERTED.

INSERTED Magic table

Let me explain the INSERTED magic table with the example below. In the screenshot below we have a trigger for the INSERT operation, and we can see the two tables INSERTED and DELETED, which we can use inside the trigger for any manipulation.

INSERTED Magic table code
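
A minimal sketch of an INSERT trigger that reads the inserted table. The tables dbo.Employee and dbo.EmployeeAudit and the trigger name are assumptions made for this illustration:

```sql
-- Hypothetical tables; names are assumptions for this sketch.
CREATE TABLE dbo.Employee (EmpId INT PRIMARY KEY, EmpName VARCHAR(50));
CREATE TABLE dbo.EmployeeAudit (EmpId INT, AuditAction VARCHAR(10), LoggedAt DATETIME);
GO
CREATE TRIGGER dbo.trg_Employee_Insert ON dbo.Employee
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- "inserted" holds the rows added by the triggering statement.
    INSERT INTO dbo.EmployeeAudit (EmpId, AuditAction, LoggedAt)
    SELECT EmpId, 'INSERT', GETDATE() FROM inserted;
END;
GO
INSERT INTO dbo.Employee VALUES (1, 'Asha');
-- The audit table now has one row for EmpId 1.
SELECT EmpId, AuditAction, LoggedAt FROM dbo.EmployeeAudit;
```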

DELETED Magic table

The example below illustrates the use of the DELETED magic table, which comes into the picture when we have an UPDATE or DELETE trigger. This table holds the old version of each record that was updated or deleted.

DELETED Magic table code
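
A minimal sketch of an UPDATE trigger that reads old values from deleted and new values from inserted. The tables dbo.Product and dbo.ProductPriceHistory are hypothetical:

```sql
-- Hypothetical tables; names are assumptions for this sketch.
CREATE TABLE dbo.Product (ProductId INT PRIMARY KEY, Price DECIMAL(10,2));
CREATE TABLE dbo.ProductPriceHistory (
    ProductId INT,
    OldPrice DECIMAL(10,2),
    NewPrice DECIMAL(10,2)
);
GO
CREATE TRIGGER dbo.trg_Product_Update ON dbo.Product
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- "deleted" holds the old row values, "inserted" the new ones.
    INSERT INTO dbo.ProductPriceHistory (ProductId, OldPrice, NewPrice)
    SELECT d.ProductId, d.Price, i.Price
    FROM deleted d
    INNER JOIN inserted i ON i.ProductId = d.ProductId;
END;
GO
INSERT INTO dbo.Product VALUES (1, 100.00);
UPDATE dbo.Product SET Price = 120.00 WHERE ProductId = 1;
-- One history row records the change from the old price to the new one.
SELECT ProductId, OldPrice, NewPrice FROM dbo.ProductPriceHistory;
```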

Question 7

How do you implement one-to-one, one-to-many, and many-to-many relationships while designing tables?

Accepted Answer

SQL Server supports all three relationship types. Let me explain each relationship one by one:

One to One – This type of relationship can be held in a single table, and in some cases we can use two tables as well. As the name suggests, there is only a single record on each side, primary and secondary. A person can have only one passport, not more than one. In the example below we have two tables, Person and Passport, in a one to one relationship built with a foreign key plus a unique key constraint on that foreign key. In this case, the person ID, which is the primary key in one table, works as the foreign key in the other.

One to One
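
A minimal sketch of the Person and Passport design described above. The column names and types are assumptions:

```sql
-- Column names and types are illustrative.
CREATE TABLE dbo.Person (
    PersonId INT PRIMARY KEY,
    FullName VARCHAR(100) NOT NULL
);
CREATE TABLE dbo.Passport (
    PassportId INT PRIMARY KEY,
    PassportNumber VARCHAR(20) NOT NULL,
    -- UNIQUE on the foreign key limits each person to one passport.
    PersonId INT NOT NULL UNIQUE
        REFERENCES dbo.Person (PersonId)
);
```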

One to Many – This type of relationship is always supported by two tables. It allows one or more entries in the secondary table for each entry in the primary table, while each record in the secondary table points to exactly one entry in the primary table. Let me explain with the example below, where we have two tables, Book and Author. A book can have more than one writer, so a book can be referenced by more than one entry in the Author table, but there is always a single entry for the book in the Book table. This type of relationship is supported by a primary key-foreign key relationship. Here Book Id works as a foreign key in the Author table to support one to many.

One to Many
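
A minimal sketch of the Book and Author design described above, with Book Id as a foreign key in the Author table. Column names and types are assumptions:

```sql
-- Column names and types are illustrative.
CREATE TABLE dbo.Book (
    BookId INT PRIMARY KEY,
    Title VARCHAR(200) NOT NULL
);
CREATE TABLE dbo.Author (
    AuthorId INT PRIMARY KEY,
    AuthorName VARCHAR(100) NOT NULL,
    -- No UNIQUE constraint here, so many author rows can point to one book.
    BookId INT NOT NULL REFERENCES dbo.Book (BookId)
);
```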

Many to Many – This type of relationship is realized with three tables, where one table works as a join table between the other two. In the example below we have Students, Enrollments, and Classes as three tables, where the Enrollments table works as a bridge between Students and Classes. One student can enroll in multiple classes, and one class can be mapped to multiple students.

Many to Many
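
A minimal sketch of the Students, Enrollments, and Classes design described above. Column names and types are assumptions:

```sql
-- Column names and types are illustrative.
CREATE TABLE dbo.Students (StudentId INT PRIMARY KEY, StudentName VARCHAR(100));
CREATE TABLE dbo.Classes (ClassId INT PRIMARY KEY, ClassName VARCHAR(100));

-- The bridge table holds one row per (student, class) pair.
CREATE TABLE dbo.Enrollments (
    StudentId INT NOT NULL REFERENCES dbo.Students (StudentId),
    ClassId INT NOT NULL REFERENCES dbo.Classes (ClassId),
    -- The composite key stops the same enrollment from being stored twice.
    PRIMARY KEY (StudentId, ClassId)
);
```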

One of the most frequently posed SQL interview questions for freshers, be ready for it.

Question 8

When to use SQL temp tables vs table variables?

Accepted Answer

There are several scenarios in which we use temp tables. One is when we have no DDL or DML access to a table: you can use your existing read access to copy the data into a SQL Server temp table and make modifications there. Another is when you don't have permission to create a table in the current database: you can create a SQL Server temp table that you control. Finally, you may be in a situation where you need the data to be visible only in the current session.

Temp table names begin with a hash sign (#) followed by the table name, for instance #Table_name. SQL temp tables are created in the tempdb database. A local SQL Server temp table is available only to the current session. It can't be accessed or used by procedures or queries outside of the session. One frequently used scenario for SQL Server temp tables is a loop in the query: we need to process data and need a place to store it for the loop to read, and a temp table gives a quick and efficient way to do so. Another reason to use SQL Server temp tables is expensive processing. Suppose we build a join, and every time we need records from that result set the join has to be processed again. Why not process this result set once and put the records into a SQL temp table? Then the rest of the SQL query can refer to the temp table name. Not only does this save costly processing, it may even make our code look a little cleaner.

The SQL temp table is dropped once the session ends. Most of the time we will see an explicit DROP TABLE #tbl command, but it is not mandatory. Temp tables are always created in the tempdb database. SQL Server resolves name conflicts by adding a suffix to the names of temp tables that share the same name.

SQL temp tables vs table variables
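
A minimal sketch of a local temp table used to hold an intermediate result. The table name #CustomerTotals and its rows are made up for illustration:

```sql
-- Local temp table: visible only to the session that creates it.
CREATE TABLE #CustomerTotals (CustomerId INT, TotalAmount DECIMAL(12,2));

-- In practice these rows would come from an expensive join or aggregate.
INSERT INTO #CustomerTotals VALUES (1, 1500.00), (2, 300.00);

-- Later statements in the session reuse the stored result.
SELECT CustomerId, TotalAmount
FROM #CustomerTotals
WHERE TotalAmount > 1000;

-- Optional: the table is dropped automatically when the session ends.
DROP TABLE #CustomerTotals;
```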

Global SQL temp tables (named with a ## prefix) are helpful when we need to reference a table from all sessions. Anybody can add, modify, or delete records in the table. Also note that anybody can DROP the table. Like local SQL Server temp tables, they are dropped once the creating session ends and no other session is still referencing the table. We can always use the DROP command to clean up manually.

Table variables behave more like other variables. It is a common belief that table variables exist only in memory, but that is simply not correct. They live in the tempdb database just like local SQL Server temp tables. Also like local SQL temp tables, table variables are accessible only within the session in which they are declared. More precisely, a table variable is available only inside the current batch.

Code
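
A minimal sketch of a table variable and its batch scope. The variable name @TopCustomers and its rows are made up for illustration:

```sql
-- A table variable is declared like any other variable.
DECLARE @TopCustomers TABLE (
    CustomerId INT PRIMARY KEY,
    TotalAmount DECIMAL(12,2)
);
INSERT INTO @TopCustomers VALUES (1, 1500.00), (2, 300.00);

SELECT CustomerId FROM @TopCustomers WHERE TotalAmount > 1000;
GO
-- GO ends the batch, so @TopCustomers is no longer in scope from here on.
```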

If performance is the criterion, table variables are helpful with small quantities of data. Generally, a SQL Server temp table is more helpful when sifting through a lot of data. So most of the time we are likely to see a SQL Server temp table used rather than a table variable.

Question 9

What is a self join? Explain with an example.

Accepted Answer

A self join is one in which a table is joined to itself to get the desired output. We can understand self join through the scenario below:

Let's assume we have the following table structure for Employee:

EmpId- The key identifier for the employee
MgrID- The key identifier for the manager, which maps to EmpId
empname- The employee name

self join code

If we need to extract employee and manager information from the above table, the natural option is a self join. We can self join the Employee table by joining MgrID with EmpId. The query below shows the self join:

self join code

There are other ways to get a similar result set. This one uses a LEFT OUTER JOIN, which also keeps employees who have no manager:

self join code
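
A minimal sketch of both queries against the Employee structure described above (EmpId, MgrID, empname). The table variable and its sample rows are made up for illustration:

```sql
-- Sample rows are illustrative; Meera has no manager.
DECLARE @Employee TABLE (EmpId INT, MgrID INT NULL, empname VARCHAR(50));
INSERT INTO @Employee VALUES (1, NULL, 'Meera'), (2, 1, 'Arun'), (3, 1, 'Kiran');

-- Self join: each employee paired with their manager (2 rows).
SELECT e.empname AS Employee, m.empname AS Manager
FROM @Employee e
INNER JOIN @Employee m ON e.MgrID = m.EmpId;

-- LEFT OUTER JOIN variant: also returns Meera with a NULL manager (3 rows).
SELECT e.empname AS Employee, m.empname AS Manager
FROM @Employee e
LEFT OUTER JOIN @Employee m ON e.MgrID = m.EmpId;
```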

Question 10

What are the Advantages of Using CTE?

Accepted Answer

A CTE (common table expression) is a temporary named result set that can be referenced within a SELECT, INSERT, UPDATE, DELETE or CREATE VIEW statement. It exists only for the duration of that statement. It is similar to a derived table. We can also use a CTE inside a VIEW in some scenarios:

WITH common_table_expression syntax:

WITH expression_name [ ( column_name [ ,...n ] ) ]
AS
( CTE_query_definition )

expression_name: A valid identifier for the common table expression. It must be different from any other expression_name defined within the same WITH clause, but it can be the same as the name of a base table or view. Any reference to it in the query uses the common table expression and not the base object.

column_name: Specifies a column name in the common table expression. Column names must be unique within the CTE, and the number of column names specified must match the number of columns in CTE_query_definition. If the query definition supplies distinct names for all columns, the column names are optional.

CTE_query_definition: A SELECT statement whose result set populates the common table expression. It must meet the same requirements as for creating a view, except that a CTE cannot define another CTE.

Some benefits of using a CTE:

It supports recursive queries. It holds a query result under the name given in the CTE definition, which helps when we need to query the result of another query. The result is available only while the SQL statement is running, which suits filtered data that is needed for subsequent processing.
It improves the readability of queries without affecting performance.
It can be referenced multiple times in the SQL query.
It can be used instead of a view where the metadata does not need to be stored.

There are two types of CTE:

Recursive CTE: This type references itself within the CTE. It is useful for hierarchical data, since it keeps executing until the whole hierarchy has been returned. A recursive CTE must contain at least two query definitions, an anchor member and a recursive member, and in SQL Server the two must be joined by UNION ALL (multiple anchor members may be combined with UNION ALL, UNION, INTERSECT, or EXCEPT). In the example below we have an Employee table with hierarchical data (the Employee and Manager relationship).

Recursive CTE code
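
A minimal sketch of a recursive CTE over an employee and manager hierarchy. The table variable, its rows, and the CTE name OrgChart are assumptions made for illustration:

```sql
-- Sample hierarchy; rows are illustrative.
DECLARE @Employee TABLE (EmpId INT, MgrID INT NULL, empname VARCHAR(50));
INSERT INTO @Employee VALUES
    (1, NULL, 'Meera'), (2, 1, 'Arun'), (3, 2, 'Kiran');

WITH OrgChart (EmpId, empname, Lvl) AS (
    -- Anchor member: employees with no manager.
    SELECT EmpId, empname, 0
    FROM @Employee
    WHERE MgrID IS NULL
    UNION ALL
    -- Recursive member: direct reports of the rows found so far.
    SELECT e.EmpId, e.empname, o.Lvl + 1
    FROM @Employee e
    INNER JOIN OrgChart o ON e.MgrID = o.EmpId
)
SELECT EmpId, empname, Lvl
FROM OrgChart
ORDER BY Lvl, EmpId;
```

Note that the statement before WITH must end with a semicolon, as the INSERT does here.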

Non-Recursive CTE: This type does not reference itself. It is much simpler than a recursive CTE.

Non-Recursive CTE code
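
A minimal sketch of a non-recursive CTE. The table variable @Orders, its rows, and the CTE name CustomerTotals are made up for illustration:

```sql
-- Sample rows are illustrative.
DECLARE @Orders TABLE (CustomerId INT, Amount DECIMAL(10,2));
INSERT INTO @Orders VALUES (1, 500), (1, 700), (2, 300);

-- The CTE names an aggregate so the outer query can filter on it.
WITH CustomerTotals AS (
    SELECT CustomerId, SUM(Amount) AS TotalAmount
    FROM @Orders
    GROUP BY CustomerId
)
SELECT CustomerId, TotalAmount
FROM CustomerTotals
WHERE TotalAmount > 1000;  -- only customer 1 (500 + 700) qualifies
```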

Question 11

Explain dirty read in SQL Server.

Accepted Answer

One of the most common issues that occurs when running concurrent transactions is the dirty read problem. A dirty read happens when a transaction is permitted to read data that is being changed by another transaction that is running at the same time but has not yet committed.

If the transaction that modifies the data commits, the dirty read problem never occurs. But if the transaction that made the changes is rolled back after the other transaction has read the data, the latter transaction holds dirty data that does not actually exist.

Let us try to understand this with a scenario in which a user tries to buy a product. A transaction performs the purchase for the user, and its first step is to update ItemsInStock.

Before the transaction there are 12 items in stock; the transaction updates this to 11 items. The transaction then communicates with a third-party payment gateway. If at this point another transaction, say Transaction 2, reads ItemsInStock for laptops, it reads 11. However, if the user who triggered the first transaction, Transaction 1, then turns out to have insufficient funds in his account, Transaction 1 is rolled back and the ItemsInStock column reverts to 12. Transaction 2 still has 11 items because it read uncommitted data from the ItemsInStock column. This is dirty data, and the problem is called the dirty read problem.

Dirty read in SQL server
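
The scenario can be sketched with two sessions. The table dbo.Inventory, its columns, and the ten-second delay standing in for the payment gateway call are assumptions made for illustration; the two parts must be run in separate connections:

```sql
-- Setup (hypothetical table).
CREATE TABLE dbo.Inventory (ProductName VARCHAR(50) PRIMARY KEY, ItemsInStock INT);
INSERT INTO dbo.Inventory VALUES ('Laptop', 12);
GO

-- Session 1 (Transaction 1): update, wait, then roll back.
BEGIN TRANSACTION;
UPDATE dbo.Inventory SET ItemsInStock = 11 WHERE ProductName = 'Laptop';
WAITFOR DELAY '00:00:10';  -- stands in for the payment gateway call
ROLLBACK TRANSACTION;      -- payment failed, stock reverts to 12
GO

-- Session 2 (Transaction 2): run in another connection while session 1 waits.
-- READ UNCOMMITTED allows the uncommitted value 11 to be read.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT ItemsInStock FROM dbo.Inventory WHERE ProductName = 'Laptop';
```

Under the default READ COMMITTED isolation level, session 2 would instead wait for session 1 to finish (or read the last committed value if row versioning is enabled).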

Question 12

What is the difference between Function and Stored procedure?

Accepted Answer

Both are sets of SQL statements encapsulated inside a block to complete some task, but they differ in many ways.

Stored Procedure- The set of SQL statements in a stored procedure is a pre-compiled object: it is compiled the first time it runs, and the compiled plan is then saved and reused.
Functions- These are evaluated every time they are called from a statement. A function must always return a value, and it cannot perform DML operations on database tables.

Some major differences between the two:

A function must return a value, while returning a value is optional in a stored procedure.
A function supports only input parameters, while a stored procedure can have both input and output parameters.
A function can be called from a stored procedure, but not vice versa.
A procedure allows SELECT as well as DML operations, but a function allows only SELECT.
A stored procedure cannot be used in the WHERE/HAVING/SELECT sections of a query, while a function can, and functions are used there frequently.
Transactions are possible in stored procedures but not in functions.
An exception can be handled in a stored procedure with a TRY CATCH block, but not in a function.
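
Several of these differences can be seen in a small sketch. The names dbo.fn_AddTax and dbo.usp_GetTotal and the 18% rate are hypothetical:

```sql
-- A function must return a value and takes only input parameters.
CREATE FUNCTION dbo.fn_AddTax (@Amount DECIMAL(10,2))
RETURNS DECIMAL(10,2)
AS
BEGIN
    RETURN @Amount * 1.18;
END;
GO
-- A procedure can have OUTPUT parameters and use TRY CATCH.
CREATE PROCEDURE dbo.usp_GetTotal
    @Amount DECIMAL(10,2),
    @Total DECIMAL(10,2) OUTPUT
AS
BEGIN
    BEGIN TRY
        -- A function can be called from inside a procedure.
        SET @Total = dbo.fn_AddTax(@Amount);
    END TRY
    BEGIN CATCH
        SET @Total = NULL;
    END CATCH
END;
GO
DECLARE @Result DECIMAL(10,2);
EXEC dbo.usp_GetTotal @Amount = 100, @Total = @Result OUTPUT;

-- The function can also be used directly inside a SELECT.
SELECT @Result AS FromProcedure, dbo.fn_AddTax(100) AS FromFunction;
```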

Question 13

What are the ACID properties in a database?

Accepted Answer

It is critical for any database to maintain data integrity and keep data consistent. This is one of the important considerations when choosing a database, and data architects evaluate databases on their ACID properties. ACID is an acronym that stands for Atomicity, Consistency, Isolation, Durability.

ACID property in a database

Let me explain the four pillars of ACID:

Atomicity- This property ensures that either all operations that are part of a database transaction commit or none of them do. Nothing is partially committed in the database; it is all or nothing.
Consistency- This property ensures that data is never left in a half-finished state. Changes to data always take the database from one valid state to another.
Isolation- This property ensures that all transactions run independently, without interfering with one another, until each transaction has finished.
Durability- This property is critical because it ensures that once a transaction is committed, there is a mechanism to recover its data even after a failure.
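
Atomicity in particular can be sketched with a transaction that fails halfway. The temp table #Account, its CHECK constraint, and the amounts are made up for illustration:

```sql
-- Hypothetical accounts; the CHECK constraint forbids negative balances.
CREATE TABLE #Account (
    AccountId INT PRIMARY KEY,
    Balance DECIMAL(10,2) CHECK (Balance >= 0)
);
INSERT INTO #Account VALUES (1, 100.00), (2, 50.00);

BEGIN TRY
    BEGIN TRANSACTION;
    -- First half of the transfer succeeds.
    UPDATE #Account SET Balance = Balance + 500 WHERE AccountId = 2;
    -- Second half violates the CHECK constraint and raises an error.
    UPDATE #Account SET Balance = Balance - 500 WHERE AccountId = 1;
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- Undo the first update too: all or nothing.
    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
END CATCH;

-- Both balances are back at their original values.
SELECT AccountId, Balance FROM #Account;
```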

Question 14

What are the different normalization forms?

Accepted Answer

Normalization is a database design technique that reduces redundancy and undesirable dependencies. It breaks large tables down into smaller ones linked by relationships. Edgar Codd introduced normalization and defined its first three forms (1NF, 2NF, 3NF). Raymond F. Boyce later added another form, which is named after both inventors (Boyce-Codd NF). The normalization forms available so far are:

Different normalization forms

1st Normal Form
2nd Normal Form
3rd Normal Form
Boyce-Codd NF
4th Normal Form
5th Normal Form
6th Normal Form

In most practical scenarios, however, database design goes only as far as 3NF. Now, let us understand the first four forms of normalization:

1NF: 1NF has the following characteristics:
- Each table cell should have a single value
- Each row needs to be unique

1NF Example

2NF: The second normal form is an extension of 1NF. It has two rules:
- It has to be in 1NF
- Every non-key column should depend on the whole primary key, which a single-column primary key guarantees

The above 1NF table can be extended to 2NF by dividing it into the two tables below:

2 NF Example

3NF: This normalization form also has two rules:
- It has to be in 2NF
- It should have no transitive dependency, that is, no structure in which a change in one non-key column forces a change in another non-key column

3 NF Example

For the above table structure, we can use the design below to support 3NF:

3NF Example
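
As a hypothetical illustration of the 3NF rule (the tables here are invented and are not the ones referred to above): if a staff table stored both a department name and a department location, the location would depend on the department rather than on the staff key. Moving the department columns into their own table removes that dependency:

```sql
-- Department facts live in one place only.
CREATE TABLE dbo.Department (
    DeptId INT PRIMARY KEY,
    DeptName VARCHAR(50) NOT NULL,
    DeptLocation VARCHAR(50) NOT NULL
);

-- Staff rows refer to the department by key instead of repeating
-- DeptName and DeptLocation on every row.
CREATE TABLE dbo.Staff (
    EmpId INT PRIMARY KEY,
    EmpName VARCHAR(50) NOT NULL,
    DeptId INT NOT NULL REFERENCES dbo.Department (DeptId)
);
```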

BCNF: This form is needed when a table has more than one candidate key.

Question 15

What is the primary key and foreign key?

Accepted Answer

The primary key is the single column or combination of columns that uniquely identifies individual rows in the table. It must be unique and cannot contain NULL values, which makes it different from a UNIQUE key. A table can have only one primary key, which may consist of a single column or of a combination of columns, called a composite key. A foreign key in a table is a reference to the primary key of another table. Below is a list of differences between the two keys, which look similar in nature:

A primary key cannot have NULL values, while a foreign key can.
By default, a primary key creates a clustered index on the table, which physically sorts the data by the key, while no index is created on a foreign key. We need to create indexes on foreign keys explicitly.
We can have a single primary key in a table, but we can have multiple foreign keys.

Defining primary key and foreign key
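
A minimal sketch of defining both keys. The tables dbo.Customer and dbo.CustomerOrder and all constraint names are hypothetical:

```sql
CREATE TABLE dbo.Customer (
    CustomerId INT NOT NULL,
    CONSTRAINT PK_Customer PRIMARY KEY (CustomerId)
);

CREATE TABLE dbo.CustomerOrder (
    OrderId INT NOT NULL
        CONSTRAINT PK_CustomerOrder PRIMARY KEY,
    -- A foreign key column may be NULL, unlike a primary key column.
    CustomerId INT NULL
        CONSTRAINT FK_CustomerOrder_Customer
        FOREIGN KEY REFERENCES dbo.Customer (CustomerId)
);

-- Foreign keys are not indexed automatically, so add one explicitly.
CREATE NONCLUSTERED INDEX IX_CustomerOrder_CustomerId
    ON dbo.CustomerOrder (CustomerId);
```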

Question 16

What is UDF in SQL and how many different types exist for UDF?

Accepted Answer

UDF stands for user-defined function. A UDF is designed to accept parameters, process them, and return the result of that processing. The result can be either a scalar value or a complete result set. UDFs are widely used in scripts, in stored procedures, and within other UDFs as well.

There are several benefits of using UDFs:

UDFs are designed with modular programming in mind. Once a UDF is created, we can call it multiple times, and because callers do not depend on its source code, we can modify it easily.
UDFs reduce compilation cost by caching the execution plan, so they can be reused without recompiling.
The WHERE clause can be an expensive operation, so if we have a filter on complex constraints we can encapsulate it in a UDF and call that UDF in the WHERE clause.

There are two types of UDF, based on the output value:

Scalar functions- The output of this type of function is a scalar value. It may accept zero or more parameters and can return any SQL data type other than text, ntext, image, cursor, and timestamp.

The scalar functions example
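
A minimal sketch of a scalar function. The name dbo.fn_FullName and its parameters are hypothetical:

```sql
-- Returns a single VARCHAR value built from two inputs.
CREATE FUNCTION dbo.fn_FullName (@First VARCHAR(50), @Last VARCHAR(50))
RETURNS VARCHAR(101)
AS
BEGIN
    RETURN @First + ' ' + @Last;
END;
GO
-- Scalar functions are called with their schema name.
SELECT dbo.fn_FullName('Asha', 'Rao') AS FullName;
```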

Table-valued functions- This type of function returns a table as its result set. It can accept zero or more parameters but always returns a table.

Table values functions example
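
A minimal sketch of an inline table-valued function. The table dbo.SalesOrder and the function name dbo.fn_OrdersByCustomer are hypothetical:

```sql
-- Hypothetical source table.
CREATE TABLE dbo.SalesOrder (
    OrderId INT PRIMARY KEY,
    CustomerId INT,
    Amount DECIMAL(10,2)
);
GO
-- Inline table-valued function: the body is a single SELECT.
CREATE FUNCTION dbo.fn_OrdersByCustomer (@CustomerId INT)
RETURNS TABLE
AS
RETURN (
    SELECT OrderId, Amount
    FROM dbo.SalesOrder
    WHERE CustomerId = @CustomerId
);
GO
-- The function is queried like a table.
SELECT OrderId, Amount FROM dbo.fn_OrdersByCustomer(1);
```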

A staple in SQL Server interview questions, be prepared to answer this one.

Question 17

List the differences between the DELETE and TRUNCATE commands.

Accepted Answer

Let me list some basic differences, and then we will try to understand them with examples as well:

TRUNCATE is much faster than DELETE. The reason is that TRUNCATE uses the transaction log far less. DELETE erases records one by one, and a transaction log entry is written for every record. TRUNCATE erases the records by deallocating the data pages and logs only the page deallocations.
Triggers are not fired by TRUNCATE, while DELETE fires triggers.
The identity column value is reset by TRUNCATE, while DELETE does not reset it.
TRUNCATE cannot be used on a table that is referenced by a foreign key constraint or that is used in replication.
DELETE falls into the DML category, while TRUNCATE falls into the DDL category of commands.

Please consider the example below, where records were deleted inside a transaction and rolling the transaction back reverted the changes:

SQL Server delete with rollback
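
A minimal sketch of a DELETE rolled back inside a transaction, followed by the identity reset that TRUNCATE performs. The temp table #Demo and its rows are made up for illustration:

```sql
CREATE TABLE #Demo (Id INT IDENTITY(1,1), Val VARCHAR(10));
INSERT INTO #Demo (Val) VALUES ('a'), ('b'), ('c');

BEGIN TRANSACTION;
DELETE FROM #Demo;
SELECT COUNT(*) AS RowsInsideTransaction FROM #Demo;  -- 0
ROLLBACK TRANSACTION;

SELECT COUNT(*) AS RowsAfterRollback FROM #Demo;      -- 3 again

-- TRUNCATE resets the identity seed; DELETE would not.
TRUNCATE TABLE #Demo;
INSERT INTO #Demo (Val) VALUES ('d');
SELECT Id, Val FROM #Demo;                            -- Id starts again at 1
```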

Question 18

List the differences between the WHERE and HAVING clauses.

Accepted Answer

The WHERE clause is the most widely used clause in SQL and is used for filtering the records of a result set. The HAVING clause does the same thing, but on the grouped result set. The WHERE clause is evaluated before the HAVING clause. Let me try to explain the differences with examples. A SELECT statement follows the syntax below:

command in SQL

The clauses are evaluated in a fixed logical order in which WHERE comes before GROUP BY and HAVING comes after it. This means that records are first filtered by the WHERE clause, and once the result set has been grouped, the HAVING clause comes into the picture.

Where clause
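
A minimal sketch that uses both clauses in one query. The table variable @Sales and its rows are made up for illustration:

```sql
-- Sample rows are illustrative.
DECLARE @Sales TABLE (Region VARCHAR(10), Amount INT);
INSERT INTO @Sales VALUES ('East', 100), ('East', 300), ('West', 50), ('West', 20);

SELECT Region, SUM(Amount) AS Total
FROM @Sales
WHERE Amount >= 50          -- filters individual rows before grouping
GROUP BY Region
HAVING SUM(Amount) > 100;   -- filters whole groups after aggregation
-- Only East (100 + 300 = 400) is returned; West keeps a single row of 50.
```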

Question 19

You find that a service pack (SP) is not applied to all the nodes across the cluster. How do you apply the SP only on the required nodes?

Accepted Answer

If you find that the product level is not consistent across all the nodes, you will need to fool the 2005 patch installer into only patching the nodes that need updating. To do so, you will have to perform the following steps:

1. Fail the Instance, Cluster, and MSDTC groups over to an unpatched node
2. Remove any successfully patched nodes from the failover candidates of the SQL Server Service of the instance group (do this using the Cluster Admin tool)
3. Run the patch
4. After the patch installs successfully, add the nodes removed in Step 2 back to the SQL Server Service of the instance group

Why do you need to do this? Well when the patch installer determines that not all nodes in the cluster are at the same patch level, a passive node operation will fail and will prevent you from moving forward with any further patching.

Question 20

How to apply service pack on Active / Active cluster Nodes?

Accepted Answer

1. Make a note of all node names (and/or IP addresses) and SQL Server virtual names, along with their preferred nodes. If there are more than three nodes, you may also need to note the possible owners for each SQL resource group. For this example, assume a cluster with node1 and node2, where SQL1 normally lives on node1 and SQL2 normally lives on node2.
2. To start with a clean slate and ensure that any previous updates have completed, restart both nodes if possible. Choose the physical node that you want to patch second and restart it (node2 in this example).
3. Restart the node you want to patch first (node1). Both active SQL instances are now running on node2. Some restarts are essential, but if you need to keep downtime to a minimum you can skip the first two restarts and just fail SQL1 over to node2. The main point is to always patch a passive node.
4. In Cluster Administrator, remove node1 from the possible owners lists of SQL1 and SQL2. This means that neither SQL instance can fail over to node1 while it is being patched.
5. Run the service pack executable on node1.
6. Restart node1.
7. Add node1 back into the possible owners lists of SQL1 and SQL2 and fail both instances over to node1.
8. Repeat steps 4 – 6 on node2.
9. Add node2 back into the possible owners lists of SQL1 and SQL2 and fail both instances over to node2. Check that the build level is correct and review the SQL Server error logs.
10. Fail SQL1 over to node1. Check the build levels and the SQL Server error logs.
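
The build-level check in the last steps can be done with a short query. This is a minimal sketch that uses only the built-in SERVERPROPERTY function; run it against each virtual instance (SQL1 and SQL2 in the example) after every failover.

```sql
-- Shows which physical node currently hosts the instance and its build level.
SELECT
    SERVERPROPERTY('ServerName')                  AS virtual_server_name,
    SERVERPROPERTY('ComputerNamePhysicalNetBIOS') AS current_physical_node,
    SERVERPROPERTY('ProductVersion')              AS build_number,
    SERVERPROPERTY('ProductLevel')                AS product_level;  -- for example RTM or SP1
```

The build number should be the same no matter which node owns the instance; a difference suggests that one node was missed.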

Question 21

What are the main events and columns helpful in troubleshooting performance issues using a profiler?

Accepted Answer

Events:

Event Group: Performance

Event: Showplan All (the BinaryData column must be selected)
Event: Showplan XML

Event Group: T-SQL

Event: SQL:BatchStarted
Event: SQL:BatchCompleted

Event Group: Stored Procedures

Event: RPC:Completed

Event Group: Locks

Event: Deadlock graph
Event: Lock:Deadlock Chain (series of events that leads to a deadlock)

Event Group: Sessions

Event: Existing Connection

Event Group: Security Audit

Event: Audit Login
Event: Audit Logout

Columns:

Below are the most common columns that help us in understanding the trace file to troubleshoot the problems.

TextData
ApplicationName
NTUserName
LoginName
CPU
Reads
Writes
Duration
SPID
StartTime
EndTime
DatabaseName
Error
HostName
LinkedServerName
NTDomainName
ServerName
SQLHandle

Not every column is available for every event; choose the appropriate columns based on the events selected.

Filters:

ApplicationName
DatabaseName
DBUserName
Error
HostName
NTUserName
NTDomainName

Question 22

What are the agents in replication?

Accepted Answer

Snapshot Agent: Copies the schema and data to the snapshot folder on the Distributor. Used in all types of replication.
Log Reader Agent: Sends transactions from the Publisher to the Distributor. Used in transactional replication.
Distribution Agent: Applies snapshots and transactions to all Subscribers. It runs at the Distributor for push subscriptions and at the Subscriber for pull subscriptions. Used in snapshot and transactional replication, including transactional replication with updatable subscriptions.
Queue Reader Agent: Runs at the Distributor and sends transactions from the Subscriber back to the Publisher. Used in transactional replication with updatable subscriptions.
Merge Agent: Applies the initial snapshot to Subscribers, then synchronizes subsequent changes and resolves any conflicts. Used in merge replication.

Question 23

Can we configure log shipping on a replicated database?

Accepted Answer

Yes. Log shipping can be used together with replication, with the following behavior. Replication does not continue after a log shipping failover. If a failover occurs, replication agents do not connect to the secondary, so transactions are not replicated to Subscribers. If a failback to the primary occurs, replication resumes. All transactions that log shipping copies from the secondary back to the primary are replicated to Subscribers.

For transactional replication, the behavior of log shipping depends on the sync with backup option. This option can be set on the publication database and the distribution database; in log shipping for the Publisher, only the setting on the publication database is relevant.

Setting this option on the publication database ensures that transactions are not delivered to the distribution database until they are backed up at the publication database. The last publication database backup can then be restored at the secondary server without any possibility of the distribution database having transactions that the restored publication database does not have. This option guarantees that if the Publisher fails over to a secondary server, consistency is maintained between the Publisher, Distributor, and Subscribers. Latency and throughput are affected because transactions cannot be delivered to the distribution database until they have been backed up at the Publisher.
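
As a hedged sketch, the option described above can be set and verified as follows. The database name AdventureWorks is only an assumption for illustration; substitute your own publication database.

```sql
-- Run at the Publisher. AdventureWorks is an assumed publication database name.
USE [AdventureWorks];
GO
EXEC sp_replicationdboption
    @dbname  = N'AdventureWorks',
    @optname = N'sync with backup',
    @value   = N'true';
GO
-- Returns 1 when the option is enabled and 0 when it is not.
SELECT DATABASEPROPERTYEX(N'AdventureWorks', 'IsSyncWithBackup') AS is_sync_with_backup;
```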

Question 24

What are the best RAID levels to use with SQL Server?

Accepted Answer

Before choosing the RAID (Redundant Array of Independent Disks) we should have a look into the usage of SQL Server files.

As a rule of thumb, “Data Files” need random access, “Log files” need sequential access, and “TempDB” should be on the fastest drive, separated from data and log files.

We have to consider the below factors while choosing the RAID level:

Reliability
Storage Efficiency
Random Read
Random Write
Sequential Read
Sequential Write
Cost

As an admin, you have to weigh all of these parameters when choosing the RAID level. In practice, the choice usually comes down to RAID 5 versus RAID 10.

Question 25

How to monitor latency in replication?

Accepted Answer

There are three methods.

Replication monitor
Replication commands
Tracer Tokens

Replication Monitor: In Replication Monitor, double-click the desired subscription in the list of all subscriptions. The window that opens has three tabs:
- Publisher to Distributor History
- Distributor to Subscriber History
- Undistributed Commands
Replication Commands:
- Publisher: sp_repltrans checks the pending transactions at the Publisher.
- Distributor: MSrepl_commands and MSrepl_transactions give the details of transactions and commands. The actual T-SQL data is in binary format. From the entry time, we can estimate the latency.
- Distributor: sp_browsereplcmds shows the xact_seqno along with the corresponding T-SQL command.
- sp_replmonitorsubscriptionpendingcmds: Shows the total number of pending commands to be applied at the Subscriber, along with the estimated time to apply them.
Tracer Tokens:

Available from Replication Monitor or via TSQL statements, Tracer Tokens are special timestamp transactions written to the Publisher’s Transaction Log and picked up by the Log Reader. They are then read by the Distribution Agent and written to the Subscriber. Timestamps for each step are recorded in tracking tables in the Distribution Database and can be displayed in Replication Monitor or via TSQL statements.

When the Log Reader picks up the token, it records the time in the MStracer_tokens table in the distribution database. The Distribution Agent then picks up the token and records the Subscriber write time in the MStracer_history table, also in the distribution database.

Below is the T-SQL code to use Tracer tokens to troubleshoot the latency issues.

-- A SQL Agent job to insert a new tracer token in the publication database.
USE [AdventureWorks]
GO
EXEC sys.sp_posttracertoken @publication = 
GO
-- Token tracking tables
USE Distribution
GO
-- publisher_commit
SELECT TOP 20 * FROM MStracer_tokens ORDER BY tracer_id DESC
-- subscriber_commit
SELECT TOP 20 * FROM MStracer_history ORDER BY parent_tracer_id DESC

Question 26

Explain the page, the fundamental unit of storage in SQL Server, and the different types of pages.

Accepted Answer

A page is the fundamental unit of storage in which data is kept. A page is 8 KB in size, and SQL Server uses the following types of pages.

Data
Index
Text/Image (LOB, ROW_OVERFLOW, XML)
GAM (Global Allocation Map)
SGAM (Shared Global Allocation Map)
PFS (Page Free Space)
IAM (Index Allocation Map)
BCM (Bulk Changed Map)
DCM (Differential Changed Map)

Question 27

Explain the difference between Sequence and Identity?

Accepted Answer

An identity column automatically generates an incrementing value for each newly inserted row. By default, the user cannot insert an explicit value into an identity column; doing so requires SET IDENTITY_INSERT ON for that table.

SEQUENCE is a feature introduced with SQL Server 2012, similar to Oracle’s sequence objects. A sequence object generates a series of numeric values according to its specification. The next value of a SEQUENCE object is obtained with the NEXT VALUE FOR clause.

IDENTITY is a column-level property tied to a particular table. It cannot be shared among multiple tables.
SEQUENCE is a user-defined object that is not tied to any particular table and can be shared by multiple tables.
An IDENTITY value can only be reseeded with DBCC CHECKIDENT, whereas a SEQUENCE object can be restarted at its initial value at any time with ALTER SEQUENCE ... RESTART.
A maximum value cannot be defined for IDENTITY (it is bounded only by the data type of the column), whereas a SEQUENCE object accepts an explicit MAXVALUE.
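
The differences above can be seen in a small example. This is a sketch only; the object names (dbo.OrderNumbers, dbo.OnlineOrders, dbo.StoreOrders) are hypothetical and it requires SQL Server 2012 or later.

```sql
-- One sequence with an explicit upper bound, shared by two tables.
CREATE SEQUENCE dbo.OrderNumbers
    AS INT
    START WITH 1
    INCREMENT BY 1
    MINVALUE 1
    MAXVALUE 100000
    NO CYCLE;
GO
CREATE TABLE dbo.OnlineOrders (
    OrderId  INT NOT NULL DEFAULT (NEXT VALUE FOR dbo.OrderNumbers) PRIMARY KEY,
    Customer NVARCHAR(100) NOT NULL
);
CREATE TABLE dbo.StoreOrders (
    OrderId  INT NOT NULL DEFAULT (NEXT VALUE FOR dbo.OrderNumbers) PRIMARY KEY,
    Customer NVARCHAR(100) NOT NULL
);
GO
-- Both inserts draw their OrderId from the same sequence.
INSERT INTO dbo.OnlineOrders (Customer) VALUES (N'Alice');
INSERT INTO dbo.StoreOrders  (Customer) VALUES (N'Bob');

-- A sequence value can also be read without touching any table.
SELECT NEXT VALUE FOR dbo.OrderNumbers AS next_order_id;

-- Restart the sequence at its initial value.
ALTER SEQUENCE dbo.OrderNumbers RESTART WITH 1;
```

Restarting a sequence that feeds a primary key, as in the last statement, will produce duplicate key errors on later inserts unless the tables are emptied first; it is shown here only to illustrate the reset.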

Question 28

What are dynamic management views? Share any 5 that you use on a regular basis.

Accepted Answer

Dynamic management views (DMVs) are a set of system views introduced with SQL Server 2005. They give the DBA internal information about the state of the system and how it is working.

sys.dm_exec_query_stats
sys.dm_exec_sql_text
sys.dm_os_buffer_descriptors
sys.dm_tran_locks - Locking and blocking
sys.dm_os_wait_stats - Wait stats
sys.dm_exec_requests – Running requests, including the percentage complete for a process
sys.dm_exec_sessions
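
As an illustration of how these views combine, the following sketch lists the currently running user requests with their statement text. It uses only the DMVs named above and their documented columns.

```sql
-- Running user requests with progress, waits, blocking, and statement text.
SELECT
    r.session_id,
    s.login_name,
    r.status,
    r.command,
    r.percent_complete,       -- populated only for some commands, such as BACKUP and RESTORE
    r.wait_type,
    r.blocking_session_id,
    t.text AS sql_text
FROM sys.dm_exec_requests AS r
JOIN sys.dm_exec_sessions AS s
    ON s.session_id = r.session_id
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE s.is_user_process = 1
  AND r.session_id <> @@SPID;   -- exclude this query itself
```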

Question 29

Name all available isolation levels in SQL Server?

Accepted Answer

SQL Server supports the following isolation levels.

Read Uncommitted – The lowest isolation level; it allows dirty reads. Changes made by one transaction are visible to other transactions before they are committed.
Read Committed – Does not allow dirty reads. Any data read by the transaction is committed data. While a transaction is updating a record, it holds an exclusive lock on it.
Repeatable Read – More restrictive than Read Committed: shared locks on the data read are held until the transaction completes, so other sessions cannot modify that data in the meantime. Phantom rows are still possible.
Serializable – The highest isolation level. It is similar to Repeatable Read but also prevents phantom reads, so transactions that reference the same records behave as if they ran serially.
Snapshot – Gives each transaction a consistent view of the data as of the start of the transaction, similar in effect to Serializable, but readers do not hold locks on the table, so it can be modified by other sessions. Old versions of rows are kept in TempDB, which is called row versioning. The related RCSI (Read Committed Snapshot Isolation) option applies the same row versioning to the Read Committed level.
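
A minimal sketch of how a session selects an isolation level. The database SalesDb and the table dbo.Accounts are hypothetical names used only for illustration.

```sql
-- Snapshot isolation must be allowed at the database level before it can be used.
ALTER DATABASE SalesDb SET ALLOW_SNAPSHOT_ISOLATION ON;
-- Separate option: makes the default READ COMMITTED level use row versioning (RCSI).
-- It needs exclusive access to the database while it is being switched on.
-- ALTER DATABASE SalesDb SET READ_COMMITTED_SNAPSHOT ON;
GO
USE SalesDb;
GO
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
    -- Reads see the data as it was when the transaction started.
    SELECT Balance FROM dbo.Accounts WHERE AccountId = 1;
COMMIT TRANSACTION;

-- The other levels are chosen with the same statement.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
```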

Question 30

Explain any 5 differences between CheckPoint & LazyWriter?

Accepted Answer

Checkpoint

A checkpoint occurs at the recovery interval, on backup or detach commands, and on the execution of certain DDL commands.
A checkpoint writes only dirty pages to disk. It does not release any memory.
The checkpoint is responsible for the database recovery point.
A checkpoint always writes an entry to the transaction log, whether it is triggered by the SQL engine or run manually.
Checkpoint activity can be monitored with the performance counter “SQL Server: Buffer Manager – Checkpoint pages/sec”.
You need sysadmin, db_owner, or db_backupoperator rights to run a checkpoint manually.

Lazy Writer

The lazy writer releases buffer pool memory when memory pressure occurs, to ensure there is enough free memory.
The lazy writer looks for least recently used (LRU) pages in the buffer pool (“cold” pages, not read or written recently) and releases the memory they occupy. It releases both dirty and clean pages: clean pages can be released without writing to disk, while dirty pages are released after they have been written to disk.
The lazy writer is only responsible for managing free buffer space. It does not affect or manage database recovery.
The lazy writer does not write any entry to the transaction log.
Lazy writer activity can be monitored with the performance counter “SQL Server: Buffer Manager – Lazy writes/sec”.
SQL Server runs the lazy writer on its own. A user cannot run it manually.
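
Both processes can be observed from T-SQL as well as from Performance Monitor. The following sketch issues a manual checkpoint and then reads the two Buffer Manager counters from a DMV.

```sql
-- Manually issue a checkpoint in the current database.
CHECKPOINT;
GO
-- The "/sec" counters in this DMV are cumulative raw values. To get a rate,
-- sample twice and divide the difference by the elapsed seconds.
SELECT RTRIM(counter_name) AS counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE N'%Buffer Manager%'
  AND counter_name IN (N'Checkpoint pages/sec', N'Lazy writes/sec');
```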

Question 31

Name any 5 components that can be installed with the SQL Server 2016 installation?

Accepted Answer

SQL Server Database Engine
SQL Server Analysis Services (SSAS)
SQL Server Reporting Services (SSRS)
SQL Server Integration Services (SSIS)
SQL Server Management Studio
SQL Server Profiler
SQL Server Configuration Manager
SQL Server CEIP (Telemetry Services)
SSIS CEIP (Telemetry Services)
Database Engine Tuning Advisor
R Services
Connectivity Components
Communication between clients and servers
Network libraries for DB-Library, ODBC, and OLE DB.
Documentation and Samples
Books Online

Question 32

How many types of SQL Server database backups and Recovery Models are there?

Accepted Answer

SQL Server supports the following types of database backups:

Full Backup – Contains a complete backup of the database.
Differential Backup – Contains the changes made since the last full backup. It is sometimes loosely called an incremental backup, but it is cumulative: each differential covers everything since the last full backup, not since the previous differential.
Log Backup – Contains the transaction log records generated since the last log backup, covering both committed and uncommitted transactions.
Copy-Only Backup – A special backup type that does not affect the sequence of regular backups and gives you a copy of the complete database.
Filegroup Backup – Backs up one or more filegroups instead of the complete database.
Partial Backup – Backs up the primary filegroup, every read/write filegroup, and optionally specified read-only files or filegroups.

SQL Server has 3 recovery models. Full and differential backups are available in all of them; log backups require the Full or Bulk-logged model:

Simple
Full
Bulk-logged
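
The backup types above map onto BACKUP statements roughly as follows. This is a sketch under assumptions: the database SalesDb and the folder D:\Backup\ are hypothetical.

```sql
-- Full backup: the base for differential and log backups.
BACKUP DATABASE SalesDb TO DISK = N'D:\Backup\SalesDb_full.bak' WITH INIT;

-- Differential backup: everything changed since the last full backup.
BACKUP DATABASE SalesDb TO DISK = N'D:\Backup\SalesDb_diff.bak' WITH DIFFERENTIAL, INIT;

-- Log backup: requires the Full or Bulk-logged recovery model.
BACKUP LOG SalesDb TO DISK = N'D:\Backup\SalesDb_log.trn' WITH INIT;

-- Copy-only backup: does not disturb the differential base or the log chain.
BACKUP DATABASE SalesDb TO DISK = N'D:\Backup\SalesDb_copy.bak' WITH COPY_ONLY, INIT;

-- Filegroup backup: only the named filegroup.
BACKUP DATABASE SalesDb FILEGROUP = N'PRIMARY'
    TO DISK = N'D:\Backup\SalesDb_fg_primary.bak' WITH INIT;

-- Check which recovery model the database uses.
SELECT name, recovery_model_desc FROM sys.databases WHERE name = N'SalesDb';
```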

Question 33

What’s the impact if we enable SQL Server Trace Flag 3042 for compressed backups?

Accepted Answer

With Compressed backups, SQL Server works on the pre-allocation algorithm to determine how much space the compressed backup will require and will allocate this space at the start of the backup process. At the time of backup completion, the backup process might expand or shrink the backup file on the basis of actual requirement.

Pre-Allocation helps SQL Server save the overhead of constantly growing the backup file.

When the database is very large and the difference between the uncompressed and compressed backup size is large, the pre-allocated space can differ significantly from the space actually needed. Trace flag 3042 bypasses the pre-allocation algorithm so that the backup file grows only as needed to reach its final size. This gives better space management when disk space is limited, at the cost of some overhead for SQL Server to grow the backup file repeatedly.

Enabling trace flag 3042 may cause a small spike in CPU and memory usage and may increase the backup duration, but usually not enough to cause performance issues.
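
A hedged sketch of enabling the trace flag around a compressed backup. The database name and backup path are hypothetical.

```sql
-- Enable trace flag 3042 globally (-1) and confirm its status.
DBCC TRACEON (3042, -1);
DBCC TRACESTATUS (3042);

-- With the flag on, the compressed backup file grows as needed
-- instead of being pre-allocated.
BACKUP DATABASE SalesDb
    TO DISK = N'D:\Backup\SalesDb_compressed.bak'
    WITH COMPRESSION, INIT;

-- Turn the flag off again if it is only wanted for this backup.
DBCC TRACEOFF (3042, -1);
```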

Question 34

What is pseudo-simple Recovery Model?

Accepted Answer

The pseudo-simple recovery model is a situational recovery model. It is not something you configure; it arises from circumstances. It is a situation where a database is configured in the FULL recovery model but behaves as if it were in the SIMPLE recovery model.

The database will behave like a SIMPLE recovery model until a full database backup is taken after switching the recovery model to Full.

For example, you have a database in the Simple recovery model and you switch it to the FULL recovery model. You need to perform a full backup after switching; otherwise, the database keeps behaving like Simple and keeps truncating the transaction log at each checkpoint.
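
One way to spot this state, sketched under the assumption of a hypothetical database named SalesDb: a database in the Full model with no log backup chain shows a NULL last_log_backup_lsn.

```sql
ALTER DATABASE SalesDb SET RECOVERY FULL;

-- recovery_model_desc = FULL together with last_log_backup_lsn IS NULL
-- indicates that the log backup chain has not started yet (pseudo-simple).
SELECT d.name, d.recovery_model_desc, rs.last_log_backup_lsn
FROM sys.databases AS d
JOIN sys.database_recovery_status AS rs
    ON rs.database_id = d.database_id
WHERE d.name = N'SalesDb';

-- A full backup starts the chain; after it the column is no longer NULL.
BACKUP DATABASE SalesDb TO DISK = N'D:\Backup\SalesDb_full.bak' WITH INIT;
```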

Question 35

Consider a scenario where the server performs a full backup on Sunday at 2 AM, a differential backup Monday to Saturday at 2 AM, and a transaction log backup every 30 minutes. A system crash is reported around Wednesday 7 AM. Please share the recovery path, with the required backups, to recover the system.

Accepted Answer

Restore last Sunday's 2 AM full backup, then Wednesday's 2 AM differential backup, followed in sequence by every transaction log backup taken after that differential, up to the last one before the crash. Restore every backup except the last one WITH NORECOVERY. This assumes all backups are valid; if any backup file is bad, the recovery path changes.
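
The recovery path can be sketched as a restore sequence. The database name, folder, and file names below are hypothetical; only the order and the NORECOVERY/RECOVERY options matter.

```sql
-- If the log file survived the crash, a tail-log backup taken first
-- would allow recovery right up to the failure:
-- BACKUP LOG SalesDb TO DISK = N'D:\Backup\SalesDb_tail.trn' WITH NO_TRUNCATE;

-- 1. Sunday 2 AM full backup.
RESTORE DATABASE SalesDb
    FROM DISK = N'D:\Backup\SalesDb_full_Sun_0200.bak'
    WITH NORECOVERY, REPLACE;

-- 2. Wednesday 2 AM differential backup (Monday and Tuesday are not needed).
RESTORE DATABASE SalesDb
    FROM DISK = N'D:\Backup\SalesDb_diff_Wed_0200.bak'
    WITH NORECOVERY;

-- 3. Every log backup after the differential, in order: 02:30, 03:00, and so on.
RESTORE LOG SalesDb
    FROM DISK = N'D:\Backup\SalesDb_log_Wed_0230.trn'
    WITH NORECOVERY;
-- Repeat RESTORE LOG for each later log backup up to the last available one.

-- 4. Bring the database online.
RESTORE DATABASE SalesDb WITH RECOVERY;
```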

Question 36

What types of functionality does SQL Server Agent provide?

Accepted Answer

SQL Server Agent provides the following functionality:

SQL Server Agent is a Windows service that comes with all SQL Server editions other than Express.
SQL Server Agent works like the task scheduler of the OS. It is used to schedule jobs, processes, and reports.
SQL Server Agent also provides built-in alerts for conditions such as blocking and deadlocks.
SQL Server Agent supports multiple types of job steps, such as T-SQL, CmdExec, PowerShell, SSAS, and SSIS, for scheduled execution.
SQL Server Agent provides proxies to secure and limit direct user access to critical subsystems.

Question 37

What are SQL Server Agent Fixed Database Roles with the definition?

Accepted Answer

SQL Server Agent has many features, and sometimes you need to give users rights to manage and view jobs. Granting everyone sysadmin permission is not appropriate.

SQL Server has 3 fixed database roles, in the msdb database, for permissions on SQL Server Agent and jobs.

SQLAgentUserRole – The least privileged role. Users with SQLAgentUserRole rights can manage only the Agent jobs they own; they cannot view other jobs on the server.
SQLAgentReaderRole – The second level of permission. In addition to the SQLAgentUserRole rights, users with this role can view all jobs available on the server, including multi-server jobs, along with their configuration and history, but cannot change jobs they do not own.
SQLAgentOperatorRole – The highest of the three. In addition to the SQLAgentReaderRole rights, users with this role can view operators, proxies, and alerts; execute, stop, or start all local jobs; delete the job history for any local job; and enable or disable all local jobs and schedules.
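
Granting one of these roles might look like the sketch below. The login name CORP\jobviewer is hypothetical and is assumed to exist already at the server level; ALTER ROLE ... ADD MEMBER requires SQL Server 2012 or later.

```sql
USE msdb;
GO
-- Map the (assumed) login into msdb and add it to the reader role.
CREATE USER [CORP\jobviewer] FOR LOGIN [CORP\jobviewer];
ALTER ROLE SQLAgentReaderRole ADD MEMBER [CORP\jobviewer];
GO
-- List the current members of the three SQL Server Agent roles.
SELECT r.name AS role_name, m.name AS member_name
FROM sys.database_role_members AS drm
JOIN sys.database_principals AS r ON r.principal_id = drm.role_principal_id
JOIN sys.database_principals AS m ON m.principal_id = drm.member_principal_id
WHERE r.name IN (N'SQLAgentUserRole', N'SQLAgentReaderRole', N'SQLAgentOperatorRole');
```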

Question 38

What is the Guest user account? Explain in brief.

Accepted Answer

The guest user account is created by default in every database. The guest user is not mapped to any login, but it can be used by any login that has no user of its own in the database. The guest user cannot be dropped, but it can be disabled by revoking its CONNECT permission in any database except master and tempdb.

Access to SQL Server has two parts, server and database. First, the login is authenticated at the server level so that the user can log in to SQL Server. Second, the mapping of the login to a database user is verified. If the login is not mapped to any user in the database and the guest user is enabled there, SQL Server automatically maps the login to guest and grants it database access.

One of Microsoft's security recommendations is to disable the guest user in every database except master and tempdb. With the guest user enabled, you are at risk of unauthorized access to data.
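
A short sketch of disabling the guest user and checking its state. Run it in a user database, not in master or tempdb.

```sql
-- Disable guest by removing its permission to connect to this database.
REVOKE CONNECT FROM guest;
GO
-- A row with permission_name = CONNECT and state_desc = GRANT means guest is
-- enabled; NULL in those columns means guest cannot connect.
SELECT pr.name, pe.permission_name, pe.state_desc
FROM sys.database_principals AS pr
LEFT JOIN sys.database_permissions AS pe
    ON pe.grantee_principal_id = pr.principal_id
   AND pe.permission_name = N'CONNECT'
WHERE pr.name = N'guest';
```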

Question 39

What is SQL injection and what problems does it cause?

Accepted Answer

SQL injection is a web hacking technique, carried out by unauthorized people or processes, that might destroy your database.

SQL injection is a web security vulnerability that lets attackers interfere with the queries an application sends to the database, typically by supplying input that the application concatenates into SQL text. Attackers use it to retrieve data belonging to other users or data they are not authorized to see. SQL injection can lead to system crashes, stolen data, data corruption, and so on.

Preventing SQL injection is not the task of one person or team. The complete support and architecture teams work together to prevent it. Developers and DBAs are responsible for database security and proper SQL code. Application developers are responsible for application code and database access methods. The infrastructure team is responsible for network, firewall, and OS security.

Proper SQL instance, OS, and firewall security, together with a well-written application, helps reduce the risk of SQL injection.
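
On the SQL code side, the usual defense is to pass input as parameters instead of concatenating it into the statement. The sketch below contrasts the two approaches; the table dbo.Users and its columns are hypothetical.

```sql
-- Input as an attacker might supply it.
DECLARE @UserName NVARCHAR(100) = N'x'' OR 1=1 --';

-- Unsafe: the input becomes part of the statement text, so the WHERE
-- clause is rewritten by the attacker. PRINT shows the resulting statement.
DECLARE @unsafe NVARCHAR(MAX) =
    N'SELECT UserName, Email FROM dbo.Users WHERE UserName = N''' + @UserName + N'''';
PRINT @unsafe;

-- Safer: the input is passed as a typed parameter and is only ever
-- compared as a value, never parsed as SQL.
EXEC sys.sp_executesql
    N'SELECT UserName, Email FROM dbo.Users WHERE UserName = @name',
    N'@name NVARCHAR(100)',
    @name = @UserName;
```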

Question 40

What are the primary differences between an index reorganization and an index rebuild?

Accepted Answer

Index Reorg

Reorganization is always an "online" operation.
Reorganization only affects the leaf level of an index.
Reorganization reorders data within the pages already allocated to the index.
Reorganization is always a fully logged operation.
Reorganization can be stopped or killed at any time with no need for a rollback; the work already done is kept.

Index Rebuild

Rebuild is an "offline" operation by default.
Rebuild creates a completely new B-tree structure for the index.
Rebuild uses new pages and allocations.
Rebuild can be a minimally logged operation.
Rebuild can be stopped or killed, but it must then roll back to complete transactionally.
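
The two operations are invoked as shown in this sketch. The table dbo.Orders and the index IX_Orders_CustomerId are hypothetical names.

```sql
-- Check fragmentation first to decide between the two operations.
SELECT i.name AS index_name, ps.avg_fragmentation_in_percent, ps.page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.Orders'), NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
    ON i.object_id = ps.object_id AND i.index_id = ps.index_id;

-- Reorganize: online, leaf level only, can be interrupted safely.
ALTER INDEX IX_Orders_CustomerId ON dbo.Orders REORGANIZE;

-- Rebuild: builds a new index structure; offline unless ONLINE = ON is used.
ALTER INDEX IX_Orders_CustomerId ON dbo.Orders REBUILD;

-- Online rebuild is available only in editions that support it:
-- ALTER INDEX IX_Orders_CustomerId ON dbo.Orders REBUILD WITH (ONLINE = ON);
```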

Question 41

Does rebuilding the clustered index also rebuild the nonclustered indexes on the table?

Accepted Answer

Each table can have only one clustered index and multiple nonclustered indexes. All nonclustered indexes use the clustered index key as their row locator, so they depend directly on the clustered index.

Because of this dependency, the question often comes up whether rebuilding the clustered index also rebuilds the nonclustered indexes on the table.

The answer is no. A clustered index rebuild does not rebuild or reorganize the nonclustered indexes. You need to plan nonclustered index rebuilds or reorganizations separately.

Question 42

List down best practices of tempDB configuration?

Accepted Answer

The collation of TempDB should be the same as the SQL Server instance collation.
The owner of the TempDB database should be sa.
Do not drop the guest user or revoke its permissions in the TempDB database.
Keep the recovery model SIMPLE only.
Configure TempDB files to grow automatically as required.
Ensure the TempDB drives have RAID protection, so that a single disk failure does not shut down SQL Server.
Keep the TempDB database on a separate set of disks.
Size the TempDB database according to the server load.
Configure the number of TempDB data files according to the available CPU cores:
- If the number of cores is < 8, the number of data files equals the number of logical processors.
- If the number of cores is between 8 and 32 inclusive, the number of data files equals 1/2 the number of logical processors.
- If the number of cores is > 32, the number of data files equals 1/4 the number of logical processors.
Make every data file the same size to allow optimal proportional-fill performance.
Place the TempDB database on a fast I/O subsystem.
Set the autogrowth increment of the TempDB data and log files to a reasonable size, so that the files do not grow by too small a value.
Guidelines for setting the FILEGROWTH increment for TempDB files:
- If the TempDB file size is > 0 and < 100 MB, the recommended FILEGROWTH increment is 10 MB.
- If the TempDB file size is >= 100 and < 200 MB, the recommended FILEGROWTH increment is 20 MB.
- If the TempDB file size is >= 200 MB, the recommended FILEGROWTH increment is 10% or a fixed value, depending on the requirement and the capabilities of the I/O system.
Do not shrink TempDB unless necessary.
Do not enable auto-create statistics and auto-update statistics.
Keep auto close OFF.
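
Several of these settings can be inspected and changed from T-SQL. In the sketch below, tempdev is assumed to be the logical name of the first data file (the usual default), while the sizes and the T:\TempDB\ path are purely illustrative.

```sql
-- Number of logical processors, as input for the file-count guideline.
SELECT cpu_count FROM sys.dm_os_sys_info;

-- Current TempDB files with their size and growth settings.
SELECT name, type_desc,
       size * 8 / 1024 AS size_mb,
       CASE WHEN is_percent_growth = 1
            THEN CAST(growth AS VARCHAR(10)) + '%'
            ELSE CAST(growth * 8 / 1024 AS VARCHAR(10)) + ' MB'
       END AS growth_setting
FROM tempdb.sys.database_files;

-- Resize an existing data file and give it a fixed growth increment.
ALTER DATABASE tempdb
    MODIFY FILE (NAME = N'tempdev', SIZE = 1024MB, FILEGROWTH = 256MB);

-- Add another data file of the same size on a dedicated drive.
ALTER DATABASE tempdb
    ADD FILE (NAME = N'tempdev2', FILENAME = N'T:\TempDB\tempdev2.ndf',
              SIZE = 1024MB, FILEGROWTH = 256MB);
```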

Question 43

What are the different types of SQL Server replication?

Accepted Answer

Snapshot replication – Snapshot replication works on a snapshot of the published objects. The Snapshot Agent generates the snapshot, and it is then applied to the subscriber. Snapshot replication overwrites the data at the subscriber each time a snapshot is applied. It is best suited for fairly static data or when the sync interval is not an issue. The subscriber does not always need to be connected.

Transactional replication – Transactional replication replicates each transaction for the article being published. For the initial setup, a snapshot or backup is required to copy the article data to the subscriber. After that, the Log Reader Agent reads changes from the transaction log and writes them to the distribution database, from where they are delivered to the subscriber. It is the most widely used type of replication.

Merge replication – Merge replication is the most complex type of replication; it allows changes to happen at both the publisher and the subscriber. Changes made at the publisher and subscriber are merged to keep the data consistent and uniform. As with transactional replication, the initial setup requires a snapshot or backup to copy the article data to the subscriber. After that, changes are tracked at each side and the Merge Agent synchronizes them between the publisher and the subscriber. The Merge Agent is capable of resolving conflicts that occur during data synchronization.

Question 44

What is the difference between Push and Pull Subscription?

Accepted Answer

Push and pull are the two types of subscription in replication. The type decides how data is replicated to the subscriber.

Push – In a push subscription, the publisher is responsible for sending data; changes are pushed from the publisher to the subscriber. The Distribution Agent runs on the Distributor to push changes. The Distributor can be configured on the Publisher or on a separate server. Changes can be pushed to subscribers on demand, continuously, or on a schedule.

Push is best suited when the subscriber is connected all the time and needs access to the latest data at all times.

Pull – In a pull subscription, the subscriber is responsible for fetching data; it requests changes from the publisher. The Distribution Agent runs on the subscriber to pull changes. The subscriber can pull data as and when needed.

Pull is best suited when subscribers are not always online. The subscriber can tolerate data lag and can wait for a delayed data refresh.

Question 45

How does log shipping work?

Accepted Answer

Log shipping has been a Microsoft SQL Server DR solution from the very beginning. In log shipping, the backup of a database on one server and its restore on a standby server happen automatically.

Log shipping works with 3 mandatory jobs and 1 optional job.

Backup Job – Responsible for taking transaction log backups. It runs on the primary server.
Copy Job – Runs on the secondary server and is responsible for copying the transaction log backups from the primary server to the secondary server.
Restore Job – Runs on the secondary server and restores the transaction log backups in sequential order.

These 3 jobs are created separately for each database configured in log shipping.

Alert Job – An optional job that monitors the log shipping thresholds and generates notifications. This job is instance-specific.

Question 46

Which are the basic system tables to track the information about the Log Shipping?

Accepted Answer

Log shipping is a disaster recovery solution from Microsoft. It comes with several system tables in the msdb database that hold its details and current status.
log_shipping_monitor_alert – Stores the alert configuration used to monitor thresholds and trigger a notification on violations.
log_shipping_monitor_error_detail – Shows the errors that occurred during log shipping.
log_shipping_monitor_history_detail – Stores the history of log shipping and its status. It can be referred to later for troubleshooting and reporting.
log_shipping_monitor_primary – Stores one record per primary database, with backup and monitoring thresholds.
log_shipping_monitor_secondary – Stores one record per secondary database, with the primary server, primary database, restore, and monitoring thresholds.
log_shipping_primary_databases – Stores a list of all databases serving as a primary database and enabled for log shipping.
log_shipping_secondary_databases – Stores a list of all databases serving as a secondary database in log shipping.

You can also use the Transaction Log Shipping Status report at the server level to view the current status.
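
As a sketch, the two monitor tables can be queried directly to see how far each side has progressed. Run the first query on the primary (or monitor) server and the second on the secondary (or monitor) server.

```sql
-- Last log backup taken for each primary database.
SELECT primary_server, primary_database,
       last_backup_file, last_backup_date, backup_threshold
FROM msdb.dbo.log_shipping_monitor_primary;

-- Last file copied and restored for each secondary database.
SELECT secondary_server, secondary_database,
       last_copied_file, last_copied_date,
       last_restored_file, last_restored_date,
       last_restored_latency, restore_threshold
FROM msdb.dbo.log_shipping_monitor_secondary;
```

Comparing last_backup_date with last_restored_date gives a rough picture of how far the secondary lags behind the primary.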

Question 47

Explain Principal, Mirror, and Witness Servers in DB Mirroring?

Accepted Answer

Principal Server: The main server, holding the primary copy of the database and serving client applications and requests.
Mirror Server: The secondary server, which holds a mirrored copy of the principal database and acts as a hot or warm standby server, depending on whether mirroring is configured as synchronous or asynchronous.
Witness Server: An optional server that enables automatic failover to the mirror if the principal becomes unavailable.

Question 48

When I configure database mirroring I’m receiving the below error,

“One or more of the server network addresses lacks a fully qualified domain name (FQDN). Specify the FQDN for each server, and click start mirroring again.”

Accepted Answer

The FQDN (fully qualified domain name) is the computer name of each server together with its domain name. It can be found by running the following from the command prompt:

IPCONFIG /ALL
Concatenate the “Host Name” and “Primary DNS Suffix”
Host Name . . . . . . . . . . . . : XYZ
Primary DNS Suffix . . . . . . . : corp.mycompany.com
The FQDN of your computer name will be XYZ.corp.mycompany.com.
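
Once the FQDN is known, it goes into the partner address when mirroring is configured with T-SQL instead of the wizard. This is a sketch under assumptions: the database SalesDb and the mirror host ABC are hypothetical, and port 5022 is only the commonly used endpoint port, so check the actual port first.

```sql
-- Find the port of the database mirroring endpoint on each server.
SELECT name, port, state_desc
FROM sys.tcp_endpoints
WHERE type_desc = N'DATABASE_MIRRORING';

-- On the mirror server first: point to the principal by its FQDN.
ALTER DATABASE SalesDb SET PARTNER = N'TCP://XYZ.corp.mycompany.com:5022';

-- Then on the principal server: point to the mirror by its FQDN.
ALTER DATABASE SalesDb SET PARTNER = N'TCP://ABC.corp.mycompany.com:5022';
```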

Question 49

Explain Physical Database Architecture In Brief?

Accepted Answer

The physical database architecture describes the way the database and its files are organized in SQL Server.

Pages and extents - A page is 8 KB in size, and a set of 8 contiguous pages is called an extent (64 KB). The page is the fundamental unit in which data is stored.
Physical database files and filegroups - The database files, visible in the file system, that store data and logs.
Table and index architecture - The database objects inside the database that store and give access to data.
Database - A database is a set of data and log files that reside on the file system and are managed by the operating system.
SQL instance - A SQL instance is a logical entity that controls databases. One SQL instance can have multiple databases. Security and accessibility are also part of the SQL instance.

Question 50

Explain One to One (1:1), One to Many (1:N), and Many to Many (M:N) mapping with an example?

Accepted Answer

One to One (1:1) – For each instance in the first entity there is one and only one instance in the second entity, and vice versa. For example, Employee and Employee ID: one employee can have only one Employee ID, and one Employee ID can be allocated to only one person.
One to Many (1:N) – For each instance in the first entity there can be one or more instances in the second entity, but for each instance in the second entity there is one and only one instance in the first entity. For example, Manager and Employee: one manager can have many employees, but one employee can have only one manager.
Many to Many (M:N) – For each instance in the first entity there can be one or more instances in the second entity, and vice versa. For example, Employee and Project: one employee can work on multiple projects, and one project can have multiple employees working on it.
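
The three mappings translate into table design roughly as in this sketch, which reuses the Manager, Employee, and Project examples. All table and column names are illustrative.

```sql
CREATE TABLE dbo.Manager (
    ManagerId INT NOT NULL PRIMARY KEY,
    Name      NVARCHAR(100) NOT NULL
);

-- 1:1 – each employee row has exactly one EmployeeId (the primary key).
-- 1:N – many employees may reference the same manager.
CREATE TABLE dbo.Employee (
    EmployeeId INT NOT NULL PRIMARY KEY,
    Name       NVARCHAR(100) NOT NULL,
    ManagerId  INT NOT NULL REFERENCES dbo.Manager (ManagerId)
);

CREATE TABLE dbo.Project (
    ProjectId INT NOT NULL PRIMARY KEY,
    Title     NVARCHAR(100) NOT NULL
);

-- M:N – a junction table pairs employees with projects.
CREATE TABLE dbo.EmployeeProject (
    EmployeeId INT NOT NULL REFERENCES dbo.Employee (EmployeeId),
    ProjectId  INT NOT NULL REFERENCES dbo.Project (ProjectId),
    PRIMARY KEY (EmployeeId, ProjectId)
);
```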

Question 51

Explain the Write-Ahead Transaction Log (WAL) protocol?

Accepted Answer

The write-ahead transaction log controls and defines how data modifications are recorded to disk. To maintain the ACID (Atomicity, Consistency, Isolation, and Durability) properties of a DBMS, SQL Server uses a write-ahead log (WAL), which guarantees that no data modification is written to disk before the associated log record is written to disk.

The write-ahead log works as follows:

SQL Server maintains a buffer cache that holds data pages as they are retrieved.
When a page is modified in the buffer cache, it is marked as dirty. The page is not immediately written back to disk.
A data page can be modified multiple times before it is written to disk, but each modification gets its own transaction log record.
The log records must be written to disk before the associated dirty page is written to disk and removed from the buffer cache.

Question 52

What is Row Versioning?

Accepted Answer

To understand row versioning, you should first understand RCSI (Read Committed Snapshot Isolation). Under RCSI, read operations do not take shared locks on the data, so the table can still be modified by other sessions while it is being read.

RCSI keeps the old data as versions in TempDB; this is called row versioning. Row versioning can increase overall system performance by reducing the resources used to manage locks.

When a transaction updates a row in the table, a new row version is generated. The previous version of the row is stored in the version store, and the new row version contains a pointer to that old version. With each further update the chain grows in the same way.

SQL Server continuously runs a cleanup task to remove old versions that are no longer in use. As long as a transaction is open, all versions of the rows modified by that transaction must be kept in the version store. Long-running open transactions and many old row versions can therefore cause TempDB to grow very large.
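
To see whether the version store is the cause of TempDB growth, a sketch like the following can help. It relies only on two documented DMVs.

```sql
-- Space currently reserved by the version store in TempDB (pages are 8 KB).
SELECT SUM(version_store_reserved_page_count) * 8 / 1024.0 AS version_store_mb
FROM tempdb.sys.dm_db_file_space_usage;

-- Longest-running transactions that are holding row versions.
SELECT TOP (10) session_id, transaction_id, elapsed_time_seconds
FROM sys.dm_tran_active_snapshot_database_transactions
ORDER BY elapsed_time_seconds DESC;
```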

Question 53

Is it possible to import data using T-SQL commands without using SQL Server Integration Services? If yes, please share the commands.

Accepted Answer

Sometimes we cannot perform such activities using SSIS because of multiple database types, different domains, older versions, etc.

We have several options for importing data with T-SQL and related tools.

BCP – BCP (bulk copy program) is a command-line utility used to import a large number of rows into SQL Server or to export them to text or CSV files.
Bulk Insert – BULK INSERT is used to load data files (text or CSV) into a database table in a user-specified format.
OpenRowSet – OPENROWSET is used to access remote data from an OLE DB data source. It is an alternative to a linked server for one-time or ad hoc connections.
OPENDATASOURCE - OPENDATASOURCE is used to access remote data on an ad hoc basis as part of a four-part object name, without using a linked server name.
OPENQUERY – OPENQUERY executes a specified pass-through query on the specified linked server. It can be referenced in the FROM clause of a query as if it were a table name, and it can also be the target table of an INSERT, UPDATE, or DELETE statement.
Linked Servers – Linked servers are configured to access data outside SQL Server, either from another SQL Server instance or from other database types (such as Oracle or DB2). They can issue queries and transactions against heterogeneous data sources.

This is a frequently asked SQL server interview question for experienced professionals.

Question 54

We have a 3-node cluster with one SQL 2005 instance and one SQL 2008 R2 instance. What is the minimum number of installations needed to patch all SQL instances in the cluster?

Accepted Answer

We need to run the SQL Server patch installation at least 4 times to patch both SQL instances on all 3 cluster nodes.

SQL 2005 - 1 installation. SQL 2005 supports remote installation, so the patch is installed on all cluster nodes in one go, but only for cluster objects such as the DB engine or agent.
SQL 2008 R2 – 3 installations. SQL 2008 R2 does not support remote installation, so we need to patch all 3 nodes separately.

Additionally, we may need to run the SQL 2005 setup on the other 2 nodes to patch non-cluster objects such as SSMS, but that is beyond the minimum.
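
The count above is simple arithmetic, sketched here in Python. The function name and the boolean flag (whether one setup run patches every node) are hypothetical, chosen only to mirror the reasoning in this answer.

```python
def minimum_patch_runs(node_count, instances):
    """instances maps an instance name to True if one setup run patches all nodes."""
    runs = 0
    for patches_all_nodes in instances.values():
        # One run in total, or one run per node.
        runs += 1 if patches_all_nodes else node_count
    return runs


print(minimum_patch_runs(3, {"SQL 2005": True, "SQL 2008 R2": False}))  # 4
```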

Question 55

If a transaction starts before the backup start time and commits before the full backup completes, which data would be included in the full backup? Please explain.

Accepted Answer

A full backup is a complete copy of the database with all its data. At the time of any crash or data recovery, it is the starting point: you select which full backup to use and plan the recovery path from there.

To be clear, SQL Server includes all data and transaction changes in the full backup up to the END TIME of the backup. If your backup took 2 hours, it will also contain the data changes and transactions that happened during those 2 hours.

A full backup reflects all data and transactions as of the backup completion time. It includes enough of the transaction log to cover all the changes made after the checkpoint taken at backup start, so that those changes can be applied during the database recovery process. The transaction in question committed before the backup finished, so its changes are included.

Expect to come across this popular question in SQL interview questions.

Question 56

What is DCM & BCM Page?

Accepted Answer

DCM (Differential Changed Map): The DCM is used by SQL Server for differential backups. DCM pages keep track of all extents that have changed since the last full database backup.

During a differential backup, the database engine reads the DCM page and backs up only those extents that have changed since the last full backup. A value of 1 means the extent has been modified and 0 means it has not. After each full backup, all these bits are reset to 0.

BCM (Bulk Changed Map): The BCM page is used in the bulk-logged recovery model to track extents changed by bulk-logged (minimally logged) operations.

During a log backup, the database engine reads the BCM page and includes all the extents that have been modified by bulk-logged operations. A value of 1 means the extent has been modified and 0 means it has not. After each log backup, all these bits are reset to 0.
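
Both maps behave like a bitmap with one bit per extent. The Python sketch below models that idea only; ChangeMap and its methods are hypothetical names, and real DCM and BCM pages have their own on-disk layout.

```python
class ChangeMap:
    """Toy bitmap with one bit per extent, in the spirit of a DCM or BCM page."""

    def __init__(self, extent_count):
        self.bits = [0] * extent_count

    def mark_changed(self, extent_id):
        # 1 means the extent was modified since the last reset.
        self.bits[extent_id] = 1

    def changed_extents(self):
        return [i for i, bit in enumerate(self.bits) if bit == 1]

    def reset(self):
        # A full backup resets a DCM; a log backup resets a BCM.
        self.bits = [0] * len(self.bits)


dcm = ChangeMap(extent_count=8)
dcm.mark_changed(2)
dcm.mark_changed(5)
differential = dcm.changed_extents()   # extents a differential backup would copy
dcm.reset()                            # the next full backup clears the map
print(differential)                    # [2, 5]
print(dcm.changed_extents())           # []
```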

Question 57

While trying to take a differential backup of the MASTER database, I get the error below. Differential backups are supported under all recovery models, so why does it fail for the MASTER database? Can you explain the reason?

“You can only perform a full backup of the master database. Use BACKUP DATABASE to back up the entire master database.”

Accepted Answer

To restore a differential backup of any database, the full backup must first be restored WITH NORECOVERY, which leaves the database in the restoring state, so the database is not accessible.

The MASTER database is the startup database for any SQL Server instance. The instance will be offline if the MASTER database is not accessible.

Combining both statements, we can see that a differential backup of the MASTER database would be useless because we could never restore it. That is why SQL Server does not allow you to take a differential backup of the MASTER database.

Question 58

Explain multi-server administration and its uses.

Accepted Answer

SQL Server provides multi-server administration, a feature for managing jobs on target servers from a master (central) server. Job and step information is stored on the master server. When jobs complete on the target servers, a notification is sent to the master server, so it always has up-to-date information. This is an enterprise-level solution for cases where a consistent set of jobs needs to run on numerous SQL Servers.

Question 59

What is a SQL Server Agent Proxy, and what are its subsystems?

Accepted Answer

As the name implies, a SQL Server Agent Proxy is an account that grants a user the privilege to execute a particular process or action when the user does not otherwise have the rights. SQL Server Agent proxies can be assigned to multiple subsystems:

ActiveX Script – Access to run ActiveX Scripts
Operating System (CmdExec) – Access to run Command line scripts
Replication Distributor – Replication Agent Rights
Replication Merge – Replication Agent Rights
Replication Queue Reader – Replication Agent Rights
Replication Snapshot – Replication Agent Rights
Replication Transaction-Log Reader – Replication Agent Rights
Analysis Services Command - SSAS execution rights
Analysis Services Query - SSAS execution rights
SSIS Package Execution – SSIS package execution rights
Unassigned Proxies – If the required option is not available, you can select this, like an Other option.

All these subsystems are available under Proxies in SSMS, and you can create proxies as per your requirements.

Question 60

How can SQL Server instances be hidden?

Accepted Answer

To help secure your SQL Server instance, it is advisable to hide it. A SQL Server instance can be marked as hidden from SQL Server Configuration Manager.

SQL Server Configuration Manager > Expand SQL Server Network Configuration > Right-click Protocols for the instance of SQL Server and select Properties > On the Flags tab, set Hide Instance to "Yes" and click OK or Apply.

You need to restart the instance of SQL Server.

Question 61

How can SQL injection be stopped or prevented?

Accepted Answer

SQL injection is a web hacking technique, carried out by unauthorized persons or processes, that might destroy your database.

The major risk is that SQL injection can cause a system crash, data theft, data corruption, etc.

Proper SQL instance, OS, and firewall security, together with a well-written application, can help reduce the risk of SQL injection.

Development\DBA

Validate or filter the SQL commands that are being passed by the front end
Validate data types and parameters
Use stored procedures with parameters in place of dynamic SQL
Remove old installers from application & database servers
Remove old backups, application files & user profiles
Restrict input containing a semicolon, EXEC, CAST, SET, two dashes, an apostrophe, special characters, etc. from being executed
Restrict the option of CMD execution or 3rd party execution
Grant limited or least possible rights to DB users

Infra\Server

Latest Patches
Restricted Access
Updated Antivirus

Network Administration

Allow traffic from required addresses or domains
Firewall settings to be reviewed on a regular basis to prevent SQL Injection attacks
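
The advice to pass parameters instead of building dynamic SQL can be shown with a short Python sketch. It uses the standard sqlite3 module as a stand-in for SQL Server, which is an assumption made only so the example runs anywhere; the users table and both function names are hypothetical.

```python
import sqlite3


def find_user_unsafe(conn, name):
    # Vulnerable: user input is concatenated into the SQL text.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized: the value is passed separately from the SQL text.
    query = "SELECT id, name FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

payload = "x' OR '1'='1"
print(len(find_user_unsafe(conn, payload)))  # 2, the injected OR matches every row
print(find_user_safe(conn, payload))         # [], the payload is treated as a plain name
```

The same principle applies to parameterized stored procedures: the input is always handled as data and never as part of the command.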

Question 62

How is PowerShell useful for a DBA? Give any 5 things that you have done using PowerShell.

Accepted Answer

PowerShell is a Windows scripting language, powerful enough to manage things from the core in a very efficient manner. PowerShell helps with deep automation and quick action on DBA activities. It is the new face of DBA automation.

We can perform multiple actions from a DBA perspective using PowerShell, such as:

Check the status of SQL Server service or SQL Server Agent service
Start/stop a SQL Server service
Find the SQL Server version/edition including the service pack level
Find the SQL Server operating system information such as the OS version, processor number, physical memory, etc.
Perform Backups
Script out a SQL Server Agent Job, based on a specific category
Kill all sessions connected to a SQL Server database

Question 63

Why do we execute Update Statistics?

Accepted Answer

UPDATE STATISTICS recalculates the query optimization statistics for a table or indexed view. Although query optimization statistics are automatically recomputed when the Auto Update Statistics option is on, in some cases a query may benefit from updating those statistics more frequently. UPDATE STATISTICS can use tempdb for its processing.

Please note that updating statistics causes queries to be recompiled. This may lead to performance issues on initial execution.

You can update statistics by 2 methods:

UPDATE STATISTICS - This needs ALTER permission on the table and offers finer control: you can update one table or specific statistics. It cannot be used for the complete database in one go.
SP_UPDATESTATS - This needs sysadmin rights on the SQL instance or ownership of the database. It runs the statistics update for all tables in the database.

A must-know for anyone heading into SQL Server interview, this is frequently asked in MS SQL interview questions.

Question 64

If a column has more than 50% NULL values, which index should we choose?

Accepted Answer

If a column in the table has more than 50% NULL values, the choice of index needs care.

An index is based on a B-Tree structure, and if more than 50% of the column values are NULL, all of that data ends up on one side of the tree, so a regular index gives little or no benefit for query execution.

SQL Server has a special kind of index, called a filtered index, that can be used here. You can create the index on the column for NON-NULL values only; NULL data is not included in the index.
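
The idea can be tried out with a small runnable sketch. It uses Python's sqlite3 module, whose partial indexes are a close analog of SQL Server filtered indexes; that substitution is an assumption made for portability, and the orders table and index name are hypothetical. In SQL Server the same effect comes from adding a WHERE clause to CREATE INDEX.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, discount_code TEXT)")

# Six of the ten rows have NULL in the column, so more than 50% is NULL.
rows = [(None,)] * 6 + [("SPRING",), ("SUMMER",), ("SPRING",), ("WINTER",)]
conn.executemany("INSERT INTO orders (discount_code) VALUES (?)", rows)

# Index only the rows where the column is NOT NULL.
conn.execute(
    "CREATE INDEX ix_orders_discount ON orders (discount_code) "
    "WHERE discount_code IS NOT NULL"
)

query = "SELECT id FROM orders WHERE discount_code = 'SPRING'"
plan = " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query))
matching_ids = sorted(row[0] for row in conn.execute(query))

print(plan)           # the plan text names the index the engine chose
print(matching_ids)   # [7, 9]
```

Because the NULL rows are left out, the index stays small and still serves equality lookups on the non-NULL values.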
