SQL Interview Guide
SQL Interview Guide for Full-Stack Developer (3+ Years Experience)
Table of Contents
- SQL Basics (DDL & DML)
- Data Types and Constraints
- Joins and Relationships
- Subqueries and CTEs
- Window Functions
- Indexes and Performance
- Transactions and Concurrency
- Stored Procedures, Functions, and Triggers
- Normalization and Denormalization
- Advanced Topics (Partitioning, JSON, etc.)
- Interview Questions
SQL Basics (DDL & DML)
Data Definition Language (DDL)
DDL commands define and modify database structures.
CREATE TABLE:
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY IDENTITY(1,1),
FirstName VARCHAR(50) NOT NULL,
LastName VARCHAR(50) NOT NULL,
DepartmentID INT,
Salary DECIMAL(10,2),
HireDate DATE DEFAULT GETDATE(),
FOREIGN KEY (DepartmentID) REFERENCES Departments(DepartmentID)
);
ALTER TABLE:
ALTER TABLE Employees ADD Email VARCHAR(100);
ALTER TABLE Employees DROP COLUMN Email;
ALTER TABLE Employees ALTER COLUMN Salary DECIMAL(12,2);
DROP TABLE:
DROP TABLE Employees;
Data Manipulation Language (DML)
DML commands manipulate data within tables.
INSERT:
INSERT INTO Employees (FirstName, LastName, DepartmentID, Salary)
VALUES ('John', 'Doe', 1, 50000.00);
-- Bulk insert
INSERT INTO Employees (FirstName, LastName, DepartmentID, Salary)
SELECT FirstName, LastName, DepartmentID, Salary FROM TempEmployees;
UPDATE:
UPDATE Employees
SET Salary = Salary * 1.1, DepartmentID = 2
WHERE EmployeeID = 1;
DELETE:
DELETE FROM Employees WHERE DepartmentID = 3;
-- Safe delete with EXISTS
DELETE e FROM Employees e
WHERE EXISTS (SELECT 1 FROM Departments d WHERE d.DepartmentID = e.DepartmentID AND d.IsActive = 0);
SELECT:
-- Basic select
SELECT * FROM Employees;
-- With conditions
SELECT FirstName, LastName, Salary
FROM Employees
WHERE Salary > 50000 AND DepartmentID IN (1,2,3);
-- With aliases
SELECT e.FirstName + ' ' + e.LastName AS FullName, d.DepartmentName
FROM Employees e
JOIN Departments d ON e.DepartmentID = d.DepartmentID;
Data Types and Constraints
Common Data Types
- Numeric: INT, BIGINT, DECIMAL(p,s), FLOAT
- String: VARCHAR(n), NVARCHAR(n), CHAR(n), TEXT
- Date/Time: DATE, DATETIME, DATETIME2, TIME
- Binary: VARBINARY, IMAGE
- Other: BIT, UNIQUEIDENTIFIER, XML, JSON (SQL Server 2016+)
Constraints
PRIMARY KEY: Ensures uniqueness and identifies rows.
CREATE TABLE Users (
UserID INT PRIMARY KEY,
Username VARCHAR(50) UNIQUE NOT NULL
);
FOREIGN KEY: Maintains referential integrity.
ALTER TABLE Orders ADD CONSTRAINT FK_Orders_Customers
FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID);
CHECK: Validates data against a condition.
ALTER TABLE Products ADD CONSTRAINT CHK_Price
CHECK (Price > 0 AND Price < 10000);
DEFAULT: Provides default values.
ALTER TABLE Users ADD CreatedDate DATETIME DEFAULT GETDATE();
UNIQUE: Ensures column values are unique.
CREATE UNIQUE INDEX IX_Users_Email ON Users(Email);
Joins and Relationships
Types of Joins
INNER JOIN: Returns matching rows from both tables.
SELECT e.FirstName, d.DepartmentName
FROM Employees e
INNER JOIN Departments d ON e.DepartmentID = d.DepartmentID;
LEFT JOIN: Returns all rows from left table, matching from right.
SELECT e.FirstName, d.DepartmentName
FROM Employees e
LEFT JOIN Departments d ON e.DepartmentID = d.DepartmentID;
RIGHT JOIN: Returns all rows from right table, matching from left.
SELECT e.FirstName, d.DepartmentName
FROM Employees e
RIGHT JOIN Departments d ON e.DepartmentID = d.DepartmentID;
FULL OUTER JOIN: Returns all rows from both tables.
SELECT e.FirstName, d.DepartmentName
FROM Employees e
FULL OUTER JOIN Departments d ON e.DepartmentID = d.DepartmentID;
CROSS JOIN: Cartesian product of both tables.
SELECT e.FirstName, d.DepartmentName
FROM Employees e
CROSS JOIN Departments d;
SELF JOIN: Join table to itself.
SELECT e1.FirstName AS Employee, e2.FirstName AS Manager
FROM Employees e1
LEFT JOIN Employees e2 ON e1.ManagerID = e2.EmployeeID;
Advanced Join Techniques
Multiple Joins:
SELECT o.OrderID, c.CustomerName, p.ProductName, od.Quantity
FROM Orders o
JOIN Customers c ON o.CustomerID = c.CustomerID
JOIN OrderDetails od ON o.OrderID = od.OrderID
JOIN Products p ON od.ProductID = p.ProductID;
JOIN with Aggregates:
SELECT d.DepartmentName, COUNT(e.EmployeeID) AS EmployeeCount, AVG(e.Salary) AS AvgSalary
FROM Departments d
LEFT JOIN Employees e ON d.DepartmentID = e.DepartmentID
GROUP BY d.DepartmentName;
Subqueries and CTEs
Subqueries
Queries nested within another query.
Scalar Subquery: Returns single value.
SELECT FirstName, LastName, Salary
FROM Employees
WHERE Salary > (SELECT AVG(Salary) FROM Employees);
Correlated Subquery: References outer query.
SELECT DepartmentName
FROM Departments d
WHERE EXISTS (
SELECT 1 FROM Employees e
WHERE e.DepartmentID = d.DepartmentID AND e.Salary > 100000
);
IN Subquery:
SELECT ProductName
FROM Products
WHERE ProductID IN (
SELECT ProductID FROM OrderDetails
WHERE Quantity > 10
);
Common Table Expressions (CTEs)
Temporary named result sets.
Simple CTE:
WITH HighPaidEmployees AS (
SELECT EmployeeID, FirstName, Salary
FROM Employees
WHERE Salary > 50000
)
SELECT * FROM HighPaidEmployees;
Recursive CTE:
WITH EmployeeHierarchy AS (
SELECT EmployeeID, ManagerID, FirstName, 0 AS Level
FROM Employees
WHERE ManagerID IS NULL
UNION ALL
SELECT e.EmployeeID, e.ManagerID, e.FirstName, eh.Level + 1
FROM Employees e
JOIN EmployeeHierarchy eh ON e.ManagerID = eh.EmployeeID
)
SELECT * FROM EmployeeHierarchy;
Multiple CTEs:
WITH DeptStats AS (
SELECT DepartmentID, COUNT(*) AS EmpCount, AVG(Salary) AS AvgSalary
FROM Employees
GROUP BY DepartmentID
),
TopDepts AS (
SELECT DepartmentID FROM DeptStats WHERE AvgSalary > 60000
)
SELECT d.DepartmentName, ds.EmpCount, ds.AvgSalary
FROM TopDepts td
JOIN DeptStats ds ON td.DepartmentID = ds.DepartmentID
JOIN Departments d ON ds.DepartmentID = d.DepartmentID;
Window Functions
Ranking Functions
ROW_NUMBER(): Assigns sequential numbers.
SELECT FirstName, Salary,
ROW_NUMBER() OVER (ORDER BY Salary DESC) AS Rank
FROM Employees;
RANK(): Assigns ranks, with ties.
SELECT FirstName, Salary,
RANK() OVER (ORDER BY Salary DESC) AS Rank
FROM Employees;
DENSE_RANK(): Assigns ranks without gaps.
SELECT FirstName, Salary,
DENSE_RANK() OVER (ORDER BY Salary DESC) AS Rank
FROM Employees;
Aggregate Window Functions
SUM, AVG, COUNT with OVER:
SELECT DepartmentID, FirstName, Salary,
SUM(Salary) OVER (PARTITION BY DepartmentID) AS DeptTotalSalary,
AVG(Salary) OVER (PARTITION BY DepartmentID) AS DeptAvgSalary
FROM Employees;
Analytic Functions
LAG() and LEAD(): Access previous/next rows.
SELECT EmployeeID, Salary,
LAG(Salary) OVER (ORDER BY EmployeeID) AS PrevSalary,
LEAD(Salary) OVER (ORDER BY EmployeeID) AS NextSalary
FROM Employees;
FIRST_VALUE() and LAST_VALUE():
SELECT DepartmentID, FirstName, Salary,
FIRST_VALUE(FirstName) OVER (PARTITION BY DepartmentID ORDER BY Salary DESC) AS HighestPaidInDept
FROM Employees;
Frame Clauses
SELECT EmployeeID, Salary,
SUM(Salary) OVER (ORDER BY EmployeeID ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal
FROM Employees;
Indexes and Performance
Types of Indexes
Clustered Index: Sorts and stores data physically.
CREATE CLUSTERED INDEX IX_Employees_ID ON Employees(EmployeeID);
Non-Clustered Index: Separate structure with pointers.
CREATE NONCLUSTERED INDEX IX_Employees_Name ON Employees(LastName, FirstName);
Unique Index:
CREATE UNIQUE INDEX IX_Users_Email ON Users(Email);
Composite Index:
CREATE INDEX IX_Employees_Dept_Salary ON Employees(DepartmentID, Salary DESC);
Index Best Practices
- Index foreign keys
- Consider selectivity (>80% unique values)
- Avoid over-indexing
- Use included columns for covering indexes
CREATE INDEX IX_Orders_Customer_Date ON Orders(CustomerID) INCLUDE (OrderDate, TotalAmount);
Query Optimization
EXPLAIN PLAN / Execution Plan Analysis:
SET SHOWPLAN_ALL ON;
SELECT * FROM Employees WHERE DepartmentID = 1;
SET SHOWPLAN_ALL OFF;
Index Usage Statistics:
SELECT * FROM sys.dm_db_index_usage_stats;
Missing Indexes:
SELECT * FROM sys.dm_db_missing_index_details;
Performance Tips
- Use appropriate data types
- Avoid SELECT *
- Use EXISTS instead of COUNT for existence checks
- Consider indexed views for complex aggregations
- Partition large tables
- Use query hints judiciously
Transactions and Concurrency
ACID Properties
- Atomicity: All or nothing
- Consistency: Data remains consistent
- Isolation: Transactions don't interfere
- Durability: Changes persist
Transaction Control
BEGIN TRANSACTION;
UPDATE Accounts SET Balance = Balance - 100 WHERE AccountID = 1;
UPDATE Accounts SET Balance = Balance + 100 WHERE AccountID = 2;
IF @@ERROR <> 0
ROLLBACK;
ELSE
COMMIT;
Isolation Levels
READ UNCOMMITTED: Allows dirty reads.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
READ COMMITTED: Default, prevents dirty reads.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
REPEATABLE READ: Prevents non-repeatable reads.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SERIALIZABLE: Highest isolation, prevents phantom reads.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
Concurrency Issues
- Dirty Read: Reading uncommitted data
- Non-repeatable Read: Same row returns different values
- Phantom Read: New rows appear in subsequent reads
- Lost Update: Concurrent updates overwrite each other
Locking
Shared Locks (S): For reading Exclusive Locks (X): For writing Update Locks (U): For potential updates
Deadlock Prevention:
- Access resources in consistent order
- Keep transactions short
- Use lower isolation levels when possible
- Implement retry logic
Stored Procedures, Functions, and Triggers
Stored Procedures
Precompiled SQL code stored in database.
CREATE PROCEDURE GetEmployeeByID
@EmployeeID INT
AS
BEGIN
SELECT * FROM Employees WHERE EmployeeID = @EmployeeID;
END;
-- Execute
EXEC GetEmployeeByID @EmployeeID = 1;
With Output Parameters:
CREATE PROCEDURE GetEmployeeCount
@DepartmentID INT,
@Count INT OUTPUT
AS
BEGIN
SELECT @Count = COUNT(*) FROM Employees WHERE DepartmentID = @DepartmentID;
END;
DECLARE @EmpCount INT;
EXEC GetEmployeeCount @DepartmentID = 1, @Count = @EmpCount OUTPUT;
Functions
Return values, can be used in queries.
Scalar Function:
CREATE FUNCTION GetFullName(@FirstName VARCHAR(50), @LastName VARCHAR(50))
RETURNS VARCHAR(101)
AS
BEGIN
RETURN @FirstName + ' ' + @LastName;
END;
-- Usage
SELECT dbo.GetFullName(FirstName, LastName) AS FullName FROM Employees;
Table-Valued Function:
CREATE FUNCTION GetEmployeesByDept(@DeptID INT)
RETURNS TABLE
AS
RETURN (
SELECT * FROM Employees WHERE DepartmentID = @DeptID
);
-- Usage
SELECT * FROM dbo.GetEmployeesByDept(1);
Triggers
Automatic execution on events.
AFTER Trigger:
CREATE TRIGGER AuditEmployeeInsert
ON Employees
AFTER INSERT
AS
BEGIN
INSERT INTO AuditLog (TableName, Action, ActionDate)
VALUES ('Employees', 'INSERT', GETDATE());
END;
INSTEAD OF Trigger:
CREATE TRIGGER PreventDelete
ON Employees
INSTEAD OF DELETE
AS
BEGIN
-- Custom logic instead of delete
PRINT 'Deletes not allowed';
END;
Using INSERTED/DELETED Tables:
CREATE TRIGGER AuditEmployeeUpdate
ON Employees
AFTER UPDATE
AS
BEGIN
INSERT INTO AuditLog (EmployeeID, OldSalary, NewSalary, ChangeDate)
SELECT i.EmployeeID, d.Salary, i.Salary, GETDATE()
FROM INSERTED i
JOIN DELETED d ON i.EmployeeID = d.EmployeeID;
END;
Normalization and Denormalization
Normal Forms
1NF: Eliminate repeating groups, atomic values.
- Before: Employee: John, Skills: SQL, Python
- After: EmployeeSkills table with EmployeeID, Skill
2NF: Remove partial dependencies.
- No non-key attributes depend on part of composite key.
3NF: Remove transitive dependencies.
- Non-key attributes depend only on key.
BCNF: Every determinant is a candidate key.
Example Normalization
Unnormalized:
CREATE TABLE Orders (
OrderID INT,
CustomerName VARCHAR(100),
CustomerAddress VARCHAR(200),
ProductName VARCHAR(100),
Quantity INT,
Price DECIMAL(10,2)
);
1NF/2NF/3NF:
CREATE TABLE Customers (
CustomerID INT PRIMARY KEY,
CustomerName VARCHAR(100),
CustomerAddress VARCHAR(200)
);
CREATE TABLE Products (
ProductID INT PRIMARY KEY,
ProductName VARCHAR(100),
Price DECIMAL(10,2)
);
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
CustomerID INT FOREIGN KEY REFERENCES Customers(CustomerID),
OrderDate DATE
);
CREATE TABLE OrderDetails (
OrderID INT FOREIGN KEY REFERENCES Orders(OrderID),
ProductID INT FOREIGN KEY REFERENCES Products(ProductID),
Quantity INT,
PRIMARY KEY (OrderID, ProductID)
);
Denormalization
Intentionally introduce redundancy for performance.
When to Denormalize:
- Read-heavy systems
- Complex joins impacting performance
- Reporting databases
- Data warehouses
Example:
CREATE TABLE OrderSummary (
OrderID INT PRIMARY KEY,
CustomerName VARCHAR(100), -- Denormalized
TotalAmount DECIMAL(10,2), -- Pre-calculated
OrderDate DATE
);
Advanced Topics
Partitioning
Divide large tables into smaller, manageable pieces.
Range Partitioning:
CREATE PARTITION FUNCTION pf_Orders(Date)
AS RANGE RIGHT FOR VALUES ('2020-01-01', '2021-01-01', '2022-01-01');
CREATE PARTITION SCHEME ps_Orders
AS PARTITION pf_Orders TO (fg1, fg2, fg3, fg4);
CREATE TABLE Orders (
OrderID INT,
OrderDate DATE,
-- other columns
) ON ps_Orders(OrderDate);
JSON Support (SQL Server 2016+)
Store and query JSON data.
CREATE TABLE Products (
ProductID INT PRIMARY KEY,
ProductData NVARCHAR(MAX) CHECK (ISJSON(ProductData) = 1)
);
-- Insert JSON
INSERT INTO Products VALUES (1, '{"name": "Laptop", "price": 999.99, "specs": {"ram": "16GB", "storage": "512GB"}}');
-- Query JSON
SELECT ProductID,
JSON_VALUE(ProductData, '$.name') AS Name,
JSON_VALUE(ProductData, '$.price') AS Price,
JSON_QUERY(ProductData, '$.specs') AS Specs
FROM Products;
Full-Text Search
Advanced text searching capabilities.
-- Create full-text index
CREATE FULLTEXT CATALOG ft_Catalog AS DEFAULT;
CREATE FULLTEXT INDEX ON Products(ProductDescription)
KEY INDEX PK_Products;
-- Search
SELECT * FROM Products
WHERE CONTAINS(ProductDescription, 'laptop OR computer');
Temporal Tables (SQL Server 2016+)
Track data changes over time.
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
FirstName VARCHAR(50),
LastName VARCHAR(50),
Salary DECIMAL(10,2),
ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START,
ValidTo DATETIME2 GENERATED ALWAYS AS ROW END,
PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
) WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.EmployeesHistory));
Interview Questions
Basic Level (1-2 Years Experience)
Q1: What is the difference between DELETE and TRUNCATE?
A: DELETE removes specific rows and can be rolled back; TRUNCATE removes all rows, resets identity, and cannot be rolled back. TRUNCATE is faster for large tables.
Q2: What is a primary key?
A: A constraint that uniquely identifies each row in a table. It cannot contain NULL values and must be unique.
Q3: Explain INNER JOIN vs LEFT JOIN.
A: INNER JOIN returns only matching rows from both tables. LEFT JOIN returns all rows from the left table and matching rows from the right table (NULL for non-matches).
Q4: What is normalization?
A: The process of organizing data to reduce redundancy and improve data integrity by dividing large tables into smaller, related tables.
Q5: How do you find duplicate records in a table?
A:
SELECT column1, column2, COUNT(*)
FROM table_name
GROUP BY column1, column2
HAVING COUNT(*) > 1;
Intermediate Level (2-3 Years Experience)
Q6: What is the difference between WHERE and HAVING?
A: WHERE filters rows before grouping; HAVING filters groups after aggregation. HAVING is used with GROUP BY.
Q7: Explain ACID properties.
A: Atomicity (all-or-nothing), Consistency (data remains valid), Isolation (transactions don't interfere), Durability (changes persist after commit).
Q8: What is a correlated subquery?
A: A subquery that references columns from the outer query. It executes once for each row in the outer query.
Q9: How do you optimize a slow query?
A: Check execution plan, add appropriate indexes, rewrite query, use EXISTS instead of IN for large datasets, avoid SELECT *, consider partitioning.
Q10: What is a CTE and when would you use it?
A: Common Table Expression - a temporary named result set. Use for recursive queries, simplifying complex queries, or when you need to reference the result multiple times.
Q11: Explain different types of indexes.
A: Clustered (sorts data physically), Non-clustered (separate structure), Unique (ensures uniqueness), Composite (multiple columns).
Q12: What is a deadlock and how to prevent it?
A: A situation where two transactions wait for each other to release locks. Prevent by accessing resources in consistent order, keeping transactions short, using lower isolation levels.
Q13: What is the difference between a function and a stored procedure?
A: Functions return values and can be used in SELECT statements; procedures perform actions and may not return values. Functions can't modify data.
Q14: How do you handle errors in stored procedures?
A: Use TRY-CATCH blocks, check @@ERROR, use RAISERROR for custom errors.
Q15: What is a trigger and when would you use it?
A: Code that executes automatically on INSERT/UPDATE/DELETE. Use for audit trails, maintaining derived columns, enforcing complex business rules.
Advanced Level (3+ Years Experience)
Q16: How do you implement pagination in SQL?
A:
-- SQL Server
SELECT * FROM (
SELECT *, ROW_NUMBER() OVER (ORDER BY column_name) AS RowNum
FROM table_name
) AS TempTable
WHERE RowNum BETWEEN @StartRow AND @EndRow;
-- Alternative
SELECT TOP (@PageSize) * FROM table_name
WHERE column_name > @LastValue
ORDER BY column_name;
Q17: Explain query execution order.
A: FROM, WHERE, GROUP BY, HAVING, SELECT, ORDER BY. Understanding this helps in writing efficient queries.
Q18: What is a covering index?
A: An index that includes all columns needed by a query, eliminating the need to access the table data.
Q19: How do you handle hierarchical data in SQL?
A: Use recursive CTEs, adjacency list model, or nested set model. Recursive CTEs are most common for tree structures.
Q20: What is the difference between UNION and UNION ALL?
A: UNION removes duplicates and sorts; UNION ALL includes duplicates and is faster. Use UNION ALL unless duplicates need to be removed.
Q21: How do you implement row-level security?
A:
-- Using views
CREATE VIEW SecureEmployees AS
SELECT * FROM Employees
WHERE DepartmentID IN (
SELECT DepartmentID FROM UserDepartments
WHERE UserID = CURRENT_USER
);
-- Or using functions
CREATE FUNCTION fn_GetAccessibleDepartments()
RETURNS TABLE AS RETURN (
SELECT DepartmentID FROM UserDepartments
WHERE UserID = CURRENT_USER
);
Q22: What are window functions and give examples?
A: Functions that perform calculations across a set of rows related to the current row. Examples: ROW_NUMBER(), RANK(), SUM() OVER(), LAG(), LEAD().
Q23: How do you optimize a database for high concurrency?
A: Use appropriate isolation levels, minimize lock duration, use row versioning, implement optimistic locking, partition tables, use connection pooling.
Q24: What is database sharding?
A: Horizontal partitioning of data across multiple databases/servers. Each shard contains a subset of data, improving scalability.
Q25: How do you handle large data migrations?
A: Use batch processing, disable constraints/indexes during migration, use bulk insert, implement logging, test rollback procedures, schedule during low-traffic periods.
Q26: What is the difference between OLTP and OLAP?
A: OLTP (Online Transaction Processing): Optimized for fast transactions, normalized data, high concurrency. OLAP (Online Analytical Processing): Optimized for complex queries, denormalized data, historical analysis.
Q27: How do you implement audit trails?
A:
-- Using triggers
CREATE TRIGGER AuditTrigger ON TableName
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
INSERT INTO AuditTable (TableName, Action, OldValues, NewValues, UserID, Timestamp)
SELECT 'TableName',
CASE WHEN EXISTS(SELECT * FROM deleted) AND EXISTS(SELECT * FROM inserted) THEN 'UPDATE'
WHEN EXISTS(SELECT * FROM inserted) THEN 'INSERT'
ELSE 'DELETE' END,
(SELECT * FROM deleted FOR JSON PATH),
(SELECT * FROM inserted FOR JSON PATH),
CURRENT_USER,
GETDATE();
END;
Q28: What is database replication and when to use it?
A: Process of copying data from one database to another. Use for high availability, load balancing, disaster recovery, reporting databases.
Q29: How do you handle database schema changes in production?
A: Use migration scripts, implement backward compatibility, use feature flags, schedule during maintenance windows, have rollback plans, test thoroughly.
Q30: What are the considerations for choosing between SQL and NoSQL?
A: SQL: Structured data, complex relationships, ACID transactions, established ecosystem. NoSQL: Flexible schemas, horizontal scaling, high write throughput, eventual consistency.
Best Practices
- Use appropriate data types - Choose smallest sufficient type
- Always use parameterized queries - Prevent SQL injection
- Index foreign keys - Improves join performance
- **Avoid SELECT *** - Specify required columns
- Use EXISTS instead of COUNT for existence checks
- Keep transactions short - Reduce locking
- Use stored procedures for complex business logic
- Implement proper error handling - Use TRY-CATCH
- Regularly maintain indexes - Rebuild/reorganize fragmented indexes
- Monitor query performance - Use execution plans and DMVs
Common Pitfalls
- Not handling NULL values properly - Use IS NULL, COALESCE
- Implicit conversions - Can prevent index usage
- Over-normalization - Can lead to complex joins
- Ignoring execution plans - Don't optimize blindly
- Not using transactions - Can lead to data inconsistency
- Hardcoding values - Use parameters or config tables
- Forgetting to commit/rollback - Can cause locks
- Not considering concurrency - Can cause deadlocks
- Ignoring data types in joins - Can cause performance issues
- Not testing with realistic data volumes - Performance may degrade
Performance Tips
- Use EXPLAIN PLAN to analyze queries
- Create indexes on WHERE, JOIN, and ORDER BY columns
- Use UNION ALL instead of UNION when possible
- Consider denormalization for read-heavy workloads
- Use query hints sparingly and only when necessary
- Archive old data to separate tables
- Use appropriate isolation levels
- Monitor and tune memory settings
- Use database mirroring/replication for read scaling
- Regularly update statistics for query optimizer
Other SQL Details
Details
Indexes
Indexes improve query performance by allowing faster data retrieval.
Creating an Index:
CREATE INDEX idx_employee_name ON Employees (LastName, FirstName);
Types: Clustered, Non-clustered, Unique.
Constraints
Rules enforced on data.
- PRIMARY KEY: Unique identifier.
- FOREIGN KEY: References primary key in another table.
- UNIQUE: Ensures unique values.
- NOT NULL: Prevents NULL values.
- CHECK: Validates data against a condition.
Example:
ALTER TABLE Employees ADD CONSTRAINT chk_salary CHECK (Salary > 0);
Normalization
Process of organizing data to reduce redundancy.
- 1NF: Eliminate repeating groups.
- 2NF: Remove partial dependencies.
- 3NF: Remove transitive dependencies.
Subqueries
Queries nested inside another query.
Example:
SELECT * FROM Employees WHERE Salary > (SELECT AVG(Salary) FROM Employees);
Window Functions
Perform calculations across a set of rows.
Example (ROW_NUMBER):
SELECT EmployeeID, FirstName, Salary,
ROW_NUMBER() OVER (ORDER BY Salary DESC) AS Rank
FROM Employees;
Pivoting Data
Transform rows into columns.
Example:
SELECT * FROM (
SELECT Department, Salary FROM Employees
) AS SourceTable
PIVOT (
AVG(Salary) FOR Department IN ([IT], [HR], [Finance])
) AS PivotTable;
Interview Questions and Answers
Q1: What is the difference between clustered and non-clustered indexes?
A: Clustered index sorts and stores data physically; non-clustered is a separate structure with pointers.
Q2: Explain normalization and its benefits.
A: Normalization reduces data redundancy and improves data integrity, but can impact query performance.
Q3: What is a correlated subquery?
A: A subquery that references columns from the outer query.
Q4: How do window functions differ from aggregate functions?
A: Window functions operate on a window of rows and return a value for each row; aggregates return one value.
Q5: What is denormalization?
A: Intentionally introducing redundancy for performance gains, often in data warehouses.
Comments
Post a Comment