SQL Indexing, DataPages - Yash-777/MyWorld GitHub Wiki
To check the MySQL version, run:
SELECT VERSION(); -- 8.0.33
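Other common ways to check the version, from a SQL session or from the operating-system shell on a host with the client tools installed:

```sql
-- From any SQL client session:
SELECT @@version;
SHOW VARIABLES LIKE 'version';

-- From the operating-system shell (not SQL):
-- mysql --version
-- mysqladmin version
```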
🛠️ Table Schema and Collation Comparison
🗂️ MySQL Indexing Summary: every InnoDB table has a clustered index — the PRIMARY KEY, the first UNIQUE NOT NULL key, or an auto-generated hidden GEN_CLUST_INDEX. Defining a PRIMARY KEY or UNIQUE NOT NULL key is recommended for better control and performance.
A MySQL script that estimates the allocated size of each field in a table and computes the total estimated row size, using this example table:
CREATE TABLE Employees (
Id INT AUTO_INCREMENT PRIMARY KEY,
Name VARCHAR(50),
Email VARCHAR(50),
Department VARCHAR(50)
);
🛠️ Bulk Data Insertion in MySQL Using Batching and Delay (Insert Up to 200,000 Rows)
Use a stored procedure to:
- Insert in smaller batches (e.g. 10,000 rows at a time).
- Add a short delay (SLEEP) between each batch.
- Ensure the connection doesn’t time out.
This avoids: Error Code: 2013. Lost connection to MySQL server during query
DROP TABLE IF EXISTS Employees;
-- Create the Employees table
CREATE TABLE Employees (
Id INT AUTO_INCREMENT PRIMARY KEY,
Name VARCHAR(50),
Email VARCHAR(50),
Department VARCHAR(50)
);
DROP PROCEDURE IF EXISTS InsertEmployees;
DELIMITER $$
CREATE PROCEDURE InsertEmployees()
BEGIN
DECLARE counter INT DEFAULT 1;
DECLARE batchSize INT DEFAULT 10000; -- Insert 10,000 rows per batch
DECLARE totalRows INT DEFAULT 200000;
WHILE counter <= totalRows DO
-- Inner loop to insert a batch
BEGIN
DECLARE i INT DEFAULT 0;
WHILE i < batchSize AND counter <= totalRows DO
INSERT INTO Employees (Name, Email, Department)
VALUES (
CONCAT('ABC ', counter),
CONCAT('abc', counter, '@pragimtech.com'),
CONCAT('Dept ', counter)
);
SET i = i + 1;
SET counter = counter + 1;
END WHILE;
END;
-- Optional progress output
SELECT CONCAT(counter - 1, ' rows inserted') AS progress;
-- Sleep for a short time to avoid timeout (adjust as needed)
DO SLEEP(0.5); -- 0.5 seconds
END WHILE;
END$$
DELIMITER ;
-- Call the procedure
CALL InsertEmployees();
-- Optional: Drop the procedure after running
DROP PROCEDURE InsertEmployees;
-- Check inserted row count
SELECT COUNT(*) FROM Employees;
To calculate the allocated size of each field (column) and the total size of a table row (and the overall table), start by inspecting the table structure and status:
📋 DESCRIBE Employees;
— Structure of Table
DESCRIBE Employees;
+-------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+----------------+
| Id | int | NO | PRI | NULL | auto_increment |
| Name | varchar(50) | YES | MUL | NULL | |
| Email | varchar(50) | YES | | NULL | |
| Department | varchar(50) | YES | MUL | NULL | |
+-------------+--------------+------+-----+---------+----------------+
📊 SHOW TABLE STATUS LIKE 'Employees';
— Table Info Summary
SHOW TABLE STATUS LIKE 'Employees';
+-----------+--------+---------+-------------+----------------------+--------------------+
| Name | Engine | Rows | Avg_row_len | Data_length (bytes) | Collation |
+-----------+--------+---------+-------------+----------------------+--------------------+
| employees | InnoDB | 199,384 | 76 | 15,220,736 | utf8mb4_0900_ai_ci |
+-----------+--------+---------+-------------+----------------------+--------------------+
📐 Estimated Field-wise Size Calculation (Using utf8mb4)
Assume:
- int = 4 bytes
- varchar(N) with utf8mb3 = N × 3 + 1 bytes (1-byte length prefix)
- varchar(N) with utf8mb4 = N × 4 + 1 bytes (1-byte length prefix)
🛠️ 📋 Estimated Column Size Table (based on utf8mb4)
🛠️ 📄 SQL Script Field-wise Size Calculation - Charset utf8mb3 and utf8mb4
-- 1. Set current schema and table
SET @schema_name = DATABASE(); -- use current database
SET @table_name = 'Employees'; -- your table name
-- 2. Get the character set of the table into a variable
SELECT
CCSA.CHARACTER_SET_NAME
INTO @table_charset
FROM INFORMATION_SCHEMA.TABLES T
JOIN INFORMATION_SCHEMA.COLLATION_CHARACTER_SET_APPLICABILITY CCSA
ON T.TABLE_COLLATION = CCSA.COLLATION_NAME
WHERE T.TABLE_SCHEMA = @schema_name
AND T.TABLE_NAME = @table_name;
-- 3. Column size estimation
SELECT
ORDINAL_POSITION,
COLUMN_NAME,
DATA_TYPE,
CHARACTER_SET_NAME,
CASE
-- Fixed-size types
WHEN DATA_TYPE IN ('tinyint') THEN 1
WHEN DATA_TYPE IN ('smallint') THEN 2
WHEN DATA_TYPE IN ('mediumint') THEN 3
WHEN DATA_TYPE IN ('int', 'integer') THEN 4
WHEN DATA_TYPE IN ('bigint') THEN 8
WHEN DATA_TYPE IN ('float') THEN 4
WHEN DATA_TYPE IN ('double', 'real') THEN 8
WHEN DATA_TYPE IN ('date') THEN 3
WHEN DATA_TYPE IN ('time') THEN 3
WHEN DATA_TYPE IN ('year') THEN 1
WHEN DATA_TYPE IN ('datetime') THEN 8
WHEN DATA_TYPE IN ('timestamp') THEN 4
WHEN DATA_TYPE IN ('bit') THEN CEIL(CHARACTER_MAXIMUM_LENGTH / 8)
-- Decimal types
WHEN DATA_TYPE IN ('decimal', 'numeric') THEN
CEIL((NUMERIC_PRECISION + 2) / 2)
-- VARCHAR and CHAR with charset fallback
WHEN DATA_TYPE IN ('varchar', 'char') THEN
CHARACTER_MAXIMUM_LENGTH *
CASE
WHEN CHARACTER_SET_NAME IS NOT NULL THEN
CASE CHARACTER_SET_NAME
WHEN 'utf8mb3' THEN 3
WHEN 'utf8mb4' THEN 4
ELSE 1
END
ELSE
CASE
WHEN @table_charset IN ('utf8', 'utf8mb3') THEN 3
WHEN @table_charset = 'utf8mb4' THEN 4
ELSE 1
END
END +
CASE WHEN DATA_TYPE = 'varchar' THEN 1 ELSE 0 END
-- TEXT and BLOB types
WHEN DATA_TYPE IN ('tinytext', 'tinyblob') THEN 255
WHEN DATA_TYPE IN ('text', 'blob') THEN 65535
WHEN DATA_TYPE IN ('mediumtext', 'mediumblob') THEN 16777215
WHEN DATA_TYPE IN ('longtext', 'longblob') THEN 4294967295
ELSE 0
END AS estimated_column_size_bytes
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = @schema_name
AND TABLE_NAME = @table_name
ORDER BY ORDINAL_POSITION;
output:
+------------------+-------------+-----------+---------------------+----------------------------+
| ORDINAL_POSITION | COLUMN_NAME | DATA_TYPE | CHARACTER_SET_NAME | estimated_column_size_bytes |
+------------------+-------------+-----------+---------------------+----------------------------+
| 1 | Id | int | NULL | 4 |
| 2 | Name | varchar | utf8mb4 | 201 |
| 3 | Email | varchar | utf8mb4 | 201 |
| 4 | Department | varchar | utf8mb4 | 201 |
+------------------+-------------+-----------+---------------------+----------------------------+
✅ Get Total Estimated Row Size - To get the sum of all column sizes:
SELECT
SUM(
CASE
WHEN DATA_TYPE IN ('tinyint') THEN 1
WHEN DATA_TYPE IN ('smallint') THEN 2
WHEN DATA_TYPE IN ('mediumint') THEN 3
WHEN DATA_TYPE IN ('int', 'integer') THEN 4
WHEN DATA_TYPE IN ('bigint') THEN 8
WHEN DATA_TYPE IN ('float') THEN 4
WHEN DATA_TYPE IN ('double', 'real') THEN 8
WHEN DATA_TYPE IN ('date') THEN 3
WHEN DATA_TYPE IN ('time') THEN 3
WHEN DATA_TYPE IN ('year') THEN 1
WHEN DATA_TYPE IN ('datetime') THEN 8
WHEN DATA_TYPE IN ('timestamp') THEN 4
WHEN DATA_TYPE IN ('bit') THEN CEIL(CHARACTER_MAXIMUM_LENGTH / 8)
WHEN DATA_TYPE IN ('decimal', 'numeric') THEN CEIL((NUMERIC_PRECISION + 2) / 2)
WHEN DATA_TYPE IN ('varchar') THEN CHARACTER_MAXIMUM_LENGTH *
CASE CHARACTER_SET_NAME
WHEN 'utf8' THEN 3
WHEN 'utf8mb3' THEN 3
WHEN 'utf8mb4' THEN 4
ELSE 1
END + 1
WHEN DATA_TYPE IN ('char') THEN CHARACTER_MAXIMUM_LENGTH *
CASE CHARACTER_SET_NAME
WHEN 'utf8' THEN 3
WHEN 'utf8mb3' THEN 3
WHEN 'utf8mb4' THEN 4
ELSE 1
END
WHEN DATA_TYPE IN ('tinytext') THEN 255
WHEN DATA_TYPE IN ('text') THEN 65535
WHEN DATA_TYPE IN ('mediumtext') THEN 16777215
WHEN DATA_TYPE IN ('longtext') THEN 4294967295
WHEN DATA_TYPE IN ('tinyblob') THEN 255
WHEN DATA_TYPE IN ('blob') THEN 65535
WHEN DATA_TYPE IN ('mediumblob') THEN 16777215
WHEN DATA_TYPE IN ('longblob') THEN 4294967295
ELSE 0
END
) AS total_estimated_row_size_bytes
FROM
INFORMATION_SCHEMA.COLUMNS
WHERE
TABLE_SCHEMA = @schema_name
AND TABLE_NAME = @table_name;
output:
total_estimated_row_size_bytes
------------------------------
607
+------------+---------------+------------------------------+----------------+
| Column | Type | Size Calculation | Approx. Size |
+------------+---------------+------------------------------+----------------+
| Id | int | Fixed size | 4 bytes |
| Name | varchar(50) | 50 × 4 + 1 = 201 | 201 bytes |
| Email | varchar(50) | 50 × 4 + 1 = 201 | 201 bytes |
| Department | varchar(50) | 50 × 4 + 1 = 201 | 201 bytes |
+------------+---------------+------------------------------+----------------+
| | | TOTAL | 607 bytes max |
+------------+---------------+------------------------------+----------------+
🧮 Quick Comparison:
| Metric | Value |
|---|---|
| Max Theoretical Row | 607 bytes |
| Avg Row Length (actual) | 76 bytes |
| Total Rows | 199,384 |
| Data Length (total) | 15,220,736 bytes (~15 MB) |
🔎 The large difference between 607 bytes theoretical max and 76 bytes average implies that:
- Most string fields are not filled up to their max (50 characters)
- Data is sparse or strings are typically shorter
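These averages can be verified directly (a sketch; column names follow the Employees table above):

```sql
-- Actual average lengths vs the declared varchar(50) maximum
SELECT
    AVG(CHAR_LENGTH(Name))       AS avg_name_chars,
    AVG(CHAR_LENGTH(Email))      AS avg_email_chars,
    AVG(CHAR_LENGTH(Department)) AS avg_dept_chars,
    AVG(LENGTH(Name))            AS avg_name_bytes  -- byte length under utf8mb4
FROM Employees;
```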
📌 Short Description: The query retrieves column metadata (name, type, and size details) for the table named Employees from the INFORMATION_SCHEMA.COLUMNS
view. It shows maximum lengths for string types and precision/scale for numeric types.
SELECT
ORDINAL_POSITION AS position,
COLUMN_NAME AS column_name,
DATA_TYPE AS data_type,
CHARACTER_MAXIMUM_LENGTH AS char_max_length,
NUMERIC_PRECISION AS numeric_precision,
NUMERIC_SCALE AS numeric_scale,
IS_NULLABLE AS is_nullable,
COLUMN_KEY AS key_type,
COLUMN_DEFAULT AS default_value,
EXTRA AS extra_attributes
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = DATABASE() -- ensures current DB context
AND TABLE_NAME = 'Employees'
ORDER BY ORDINAL_POSITION;
Output, formatted as a plain-text table:
+----------+-------------+------------+-------------------+-------------------+----------------+-------------+-----------+---------------+--------------------+
| Position | Column Name | Data Type | Char Max Length | Numeric Precision | Numeric Scale | Is Nullable | Key Type | Default Value | Extra Attributes |
+----------+-------------+------------+-------------------+-------------------+----------------+-------------+-----------+---------------+--------------------+
| 1 | Id | int | NULL | 10 | 0 | NO | PRI | NULL | auto_increment |
| 2 | Name | varchar | 50 | NULL | NULL | YES | MUL | NULL | |
| 3 | Email | varchar | 50 | NULL | NULL | YES | | NULL | |
| 4 | Department | varchar | 50 | NULL | NULL | YES | MUL | NULL | |
+----------+-------------+------------+-------------------+-------------------+----------------+-------------+-----------+---------------+--------------------+
❓ Why is key_type (COLUMN_KEY) empty for the Email column?
The COLUMN_KEY field in INFORMATION_SCHEMA.COLUMNS shows whether a column is part of a key/index:

| Value | Meaning |
|---|---|
| PRI | Primary key |
| UNI | Unique key |
| MUL | Non-unique index (duplicates allowed) |
| (empty) | No index/key on this column |

Here key_type is empty, which means there is no index (primary, unique, or normal) defined on the Email column.
🛠️ If You Expect It to Be Indexed:
- You might have forgotten to add an index.
- Or you intended Email to be unique, but forgot to enforce it.
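To check whether Email is indexed, and to enforce uniqueness if that was the intent (a sketch; the ALTER fails if duplicate emails already exist):

```sql
-- List any index entry on the Email column
SHOW INDEX FROM Employees WHERE Column_name = 'Email';

-- Enforce uniqueness going forward
ALTER TABLE Employees ADD UNIQUE INDEX IDX_Employees_Email (Email);
```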
How is data stored in a SQL database?

Data is stored in a series of data pages organized in a tree-like structure of root and leaf nodes (a B-Tree), which is also the structure of a clustered index. (Diagrams not reproduced.)
In MySQL, there is no explicit clustered vs. nonclustered index distinction like in SQL Server. However, MySQL supports:
- Primary key = clustered index (in InnoDB)
- Secondary index = nonclustered-like
So, to create an index on the Name column in MySQL (equivalent to a nonclustered index in SQL Server), you simply use:
CREATE INDEX IDX_Employees_Name ON Employees(Name);
📌 Additional Notes:
- In MySQL's InnoDB engine, the primary key is always the clustered index.
- Any secondary index (like above) stores a copy of the indexed column plus a reference to the primary key, which is used to look up the full row — similar to a nonclustered index with row locator in SQL Server.
- You can also use UNIQUE INDEX if you want to enforce uniqueness:
CREATE UNIQUE INDEX IDX_Employees_Email ON Employees(Email);
Example: Employees table with 200000 records
SELECT * FROM Employees;
SELECT COUNT(*) FROM Employees; -- 200000
DESCRIBE Employees;
SHOW INDEXES FROM Employees;
ANALYZE TABLE Employees;
EXPLAIN SELECT * from Employees;
Employees table with 200,000 records: EXPLAIN ANALYZE SELECT * FROM Employees WHERE Name = 'ABC 7';
Before Index - Table Scan
-> Filter: (employees.`Name` = 'ABC 7') (cost=20172 rows=19940) (actual time=0.111..150 rows=1 loops=1)
-> Table scan on Employees (cost=20172 rows=199396) (actual time=0.103..109 rows=200000 loops=1)
After Index - Index lookup
CREATE INDEX IDX_Employees_Name ON Employees(Name); -- Secondary index = nonclustered-like
-> Index lookup on Employees using IDX_Employees_Name (Name='ABC 7') (cost=0.35 rows=1) (actual time=0.0453..0.0487 rows=1 loops=1)
On Duplicate Index - Index lookup
CREATE INDEX IDX_Employees_Name_2 ON Employees(Name);
⚠️ Warning: 1831 Duplicate index 'IDX_Employees_Name_2' defined on the table 'employees'. This is deprecated and will be disallowed in a future release.
Records: 0 Duplicates: 0 Warnings: 1
-> Index lookup on Employees using IDX_Employees_Name (Name='ABC 7') (cost=0.35 rows=1) (actual time=0.0457..0.049 rows=1 loops=1)
In MySQL, when multiple indexes exist on the same column(s) in the same order (even if they have different names), they are considered redundant. However:
✅ Index Lookup Behavior:
MySQL will typically use the first-created index (based on internal order in the INFORMATION_SCHEMA.STATISTICS table) that satisfies the query optimizer's requirements.
So, the first B-tree structure (i.e., the first index created) is likely to be used for lookups, and the second will rarely (if ever) be used unless the first is dropped or becomes unavailable.
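To see which of the two duplicate B-trees the optimizer actually chooses, and to force the other for comparison (index names from the example above):

```sql
-- See which index the optimizer picks on its own
EXPLAIN SELECT * FROM Employees WHERE Name = 'ABC 7';

-- Force the duplicate index to compare the plans
EXPLAIN SELECT * FROM Employees FORCE INDEX (IDX_Employees_Name_2)
WHERE Name = 'ABC 7';
```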
Identify and delete the duplicate index.
DROP INDEX IDX_Employees_Name_2 ON Employees;
Why does MySQL allow CREATE INDEX on a column that's already the PRIMARY KEY?
(e.g.,
CREATE INDEX IDX_Employees_ID ON Employees(id);
when id is already the PRIMARY key)
🔍 Detailed Explanation
In InnoDB (the default storage engine):
- The PRIMARY KEY is a clustered index, and it already indexes the column (Id in your case).
- Creating a secondary index on the same column (Id) is allowed, but it creates a separate index structure, which:
  - Takes extra disk space
  - Requires extra maintenance on inserts/updates/deletes
  - Provides no performance benefit in this case
📌 Example
-- Suppose you already have this:
PRIMARY KEY (Id)
-- And then you do:
CREATE INDEX IDX_Employees_ID ON Employees(Id);
Now, MySQL has:
- A clustered index (PRIMARY)
- A redundant secondary index (IDX_Employees_ID) that indexes the same column
MySQL Index Creation Script (on Employees table)
MySQL allows redundant indexes:
CREATE INDEX IDX_Employees_ID ON Employees(id);
CREATE UNIQUE INDEX IDX_Employees_ID_UNIQUE ON Employees(id);
Before creating indexes, always check existing ones:
SHOW INDEX FROM Employees;
To study MySQL index creation on overlapping column combinations, we'll simulate different orderings. Below is a script that creates multiple indexes with varying column order for comparison:
CREATE UNIQUE INDEX IDX_Employees_Email ON Employees(Email); -- Secondary index = nonclustered-like
CREATE INDEX IDX_Employees_Name ON Employees(Name); -- Secondary index = nonclustered-like
-- 1. Index on (id, name, email)
CREATE INDEX idx_id_name_email_1 ON Employees(id, name, email);
CREATE INDEX idx_id_name_email_2 ON Employees(id, name, email); -- Identical index, redundant
-- 2. Index on (id, email)
CREATE INDEX idx_id_email ON Employees(id, email);
-- 3. Index on (name, email, id)
CREATE INDEX idx_name_email_id ON Employees(name, email, id);
-- 4. Index on (name, id, email)
CREATE INDEX idx_name_id_email ON Employees(name, id, email);
-- 5. Another variation of (name, email, id) - identical to idx_name_email_id
CREATE INDEX idx_name_email_id_2 ON Employees(name, email, id); -- Redundant
Indexes Recap (grok.com)
SELECT
TABLE_SCHEMA AS table_schema,
TABLE_NAME AS table_name,
INDEX_NAME AS `index`,
GROUP_CONCAT(COLUMN_NAME ORDER BY SEQ_IN_INDEX) AS index_columns,
NON_UNIQUE AS NON_UNIQUE
FROM
INFORMATION_SCHEMA.STATISTICS
WHERE
TABLE_SCHEMA = DATABASE() -- Applies to all tables in the current DB
AND TABLE_NAME = 'Employees'
-- AND INDEX_NAME != 'PRIMARY' -- AND SEQ_IN_INDEX = 1 (AND COLUMN_NAME != 'id')
GROUP BY
TABLE_SCHEMA, TABLE_NAME, INDEX_NAME, NON_UNIQUE
ORDER BY
TABLE_NAME, INDEX_NAME;
Understand Leftmost Prefix (grok.com)
| table_schema | table_name | index | index_columns | NON_UNIQUE |
|--------------|------------|--------------------------|----------------|------------|
| mydb | Employees | PRIMARY | Id | 0 |
| mydb | Employees | IDX_Employees_ID_UNIQUE | Id | 0 |
| mydb | Employees | IDX_Employees_ID | Id | 1 |
| mydb | Employees | IDX_Employees_Name | Name | 1 |
| mydb | Employees | IDX_Employees_Email | Email | 1 |
| mydb | Employees | idx_id_email | Id,Email | 1 |
| mydb | Employees | idx_id_name_email_1 | Id,Name,Email | 1 |
| mydb | Employees | idx_id_name_email_2 | Id,Name,Email | 1 |
| mydb | Employees | idx_name_email_id | Name,Email,Id | 1 |
| mydb | Employees | idx_name_email_id_2 | Name,Email,Id | 1 |
| mydb | Employees | idx_name_id_email | Name,Id,Email | 1 |
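The leftmost-prefix rule explains which of these composite indexes can serve a given WHERE clause: an index on (Name, Email, Id) can serve filters on Name, on Name and Email, or on all three, but not on Email alone. A sketch using the indexes listed above:

```sql
-- Can use idx_name_email_id (the leftmost column Name is filtered)
EXPLAIN SELECT * FROM Employees WHERE Name = 'ABC 7';
EXPLAIN SELECT * FROM Employees
WHERE Name = 'ABC 7' AND Email = 'abc7@pragimtech.com';

-- Cannot use idx_name_email_id (Email is not a leftmost prefix);
-- here the single-column IDX_Employees_Email would serve instead
EXPLAIN SELECT * FROM Employees WHERE Email = 'abc7@pragimtech.com';
```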
DROP INDEX IDX_Employees_ID_UNIQUE ON Employees;
DROP INDEX IDX_Employees_ID ON Employees;
Identify redundant or duplicate indexes to optimize query performance and reduce write overhead on the table. (grok.com)
-- Set the table names variable (comma-separated list, e.g., 'Employees,OtherTable')
-- SET @table_names = ''; -- Modify this to include desired table names or leave empty for all tables
SET @table_names = 'Employees';
-- Drop temporary tables if they exist
DROP TEMPORARY TABLE IF EXISTS selected_tables;
DROP TEMPORARY TABLE IF EXISTS selected_tables_copy;
DROP TEMPORARY TABLE IF EXISTS id_indexes;
DROP TEMPORARY TABLE IF EXISTS id_indexes_copy;
DROP TEMPORARY TABLE IF EXISTS general_redundant_indexes;
DROP TEMPORARY TABLE IF EXISTS id_redundant_indexes;
DROP TEMPORARY TABLE IF EXISTS redundant_indexes;
-- Create temporary table to store selected table names
CREATE TEMPORARY TABLE selected_tables (
table_name VARCHAR(64)
);
-- Create a copy of selected_tables to avoid 'Can't reopen table' error
CREATE TEMPORARY TABLE selected_tables_copy (
table_name VARCHAR(64)
);
-- Create temporary table to store single-column 'id' indexes
CREATE TEMPORARY TABLE id_indexes (
table_schema VARCHAR(64),
table_name VARCHAR(64),
index_name VARCHAR(64)
);
-- Create a copy of id_indexes to avoid 'Can't reopen table' error
CREATE TEMPORARY TABLE id_indexes_copy (
table_schema VARCHAR(64),
table_name VARCHAR(64),
index_name VARCHAR(64)
);
-- Create temporary table for general redundant indexes (non-PRIMARY, excluding single-column 'id' indexes)
CREATE TEMPORARY TABLE general_redundant_indexes (
table_schema VARCHAR(64),
table_name VARCHAR(64),
redundant_index VARCHAR(64),
kept_index VARCHAR(64),
index_columns TEXT,
drop_statement TEXT
);
-- Create temporary table for id-specific redundant indexes (against PRIMARY)
CREATE TEMPORARY TABLE id_redundant_indexes (
table_schema VARCHAR(64),
table_name VARCHAR(64),
redundant_index VARCHAR(64),
kept_index VARCHAR(64),
index_columns TEXT,
drop_statement TEXT
);
-- Create final storage table for redundant index info
CREATE TEMPORARY TABLE redundant_indexes (
table_schema VARCHAR(64),
table_name VARCHAR(64),
redundant_index VARCHAR(64),
kept_index VARCHAR(64),
index_columns TEXT,
drop_statement TEXT
);
-- Step 1: Parse table names from @table_names variable
INSERT INTO selected_tables (table_name)
SELECT TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(@table_names, ',', numbers.n), ',', -1)) AS table_name
FROM (
SELECT a.N + b.N * 10 + 1 AS n
FROM
(SELECT 0 AS N UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) a,
(SELECT 0 AS N UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) b
ORDER BY n
) numbers
WHERE
@table_names IS NOT NULL
AND @table_names != ''
AND n <= 1 + (LENGTH(@table_names) - LENGTH(REPLACE(@table_names, ',', '')))
UNION
SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = DATABASE()
AND (@table_names IS NULL OR @table_names = '');
-- Step 2: Copy selected_tables to selected_tables_copy
INSERT INTO selected_tables_copy (table_name)
SELECT table_name FROM selected_tables;
-- Step 3: Identify single-column 'id' indexes
INSERT INTO id_indexes (table_schema, table_name, index_name)
SELECT
TABLE_SCHEMA, TABLE_NAME, INDEX_NAME
FROM
INFORMATION_SCHEMA.STATISTICS
WHERE
TABLE_SCHEMA = DATABASE()
AND INDEX_NAME != 'PRIMARY'
AND COLUMN_NAME = 'id'
AND SEQ_IN_INDEX = 1
AND TABLE_NAME IN (SELECT table_name FROM selected_tables)
AND NOT EXISTS (
SELECT 1
FROM INFORMATION_SCHEMA.STATISTICS s2
WHERE s2.TABLE_SCHEMA = TABLE_SCHEMA
AND s2.TABLE_NAME = TABLE_NAME
AND s2.INDEX_NAME = INDEX_NAME
AND s2.SEQ_IN_INDEX > 1
);
-- Step 4: Copy id_indexes to id_indexes_copy
INSERT INTO id_indexes_copy (table_schema, table_name, index_name)
SELECT * FROM id_indexes;
-- Step 5: Insert general redundant indexes (non-PRIMARY, excluding single-column 'id' indexes)
INSERT INTO general_redundant_indexes (table_schema, table_name, redundant_index, kept_index, index_columns, drop_statement)
SELECT
s1.TABLE_SCHEMA,
s1.TABLE_NAME,
s1.INDEX_NAME AS redundant_index,
s2.INDEX_NAME AS kept_index,
GROUP_CONCAT(s1.COLUMN_NAME ORDER BY s1.SEQ_IN_INDEX) AS index_columns,
CONCAT('ALTER TABLE `', s1.TABLE_NAME, '` DROP INDEX `', s1.INDEX_NAME, '`;') AS drop_statement
FROM
INFORMATION_SCHEMA.STATISTICS s1
JOIN
INFORMATION_SCHEMA.STATISTICS s2
ON s1.TABLE_SCHEMA = s2.TABLE_SCHEMA
AND s1.TABLE_NAME = s2.TABLE_NAME
AND s1.SEQ_IN_INDEX = s2.SEQ_IN_INDEX
AND s1.COLUMN_NAME = s2.COLUMN_NAME
AND s1.INDEX_NAME != s2.INDEX_NAME
WHERE
s1.TABLE_SCHEMA = DATABASE()
AND s1.INDEX_NAME != 'PRIMARY'
AND s2.INDEX_NAME != 'PRIMARY'
AND s1.TABLE_NAME IN (SELECT table_name FROM selected_tables)
AND s2.TABLE_NAME IN (SELECT table_name FROM selected_tables_copy)
AND s1.INDEX_NAME NOT IN (
SELECT index_name
FROM id_indexes
WHERE table_schema = s1.TABLE_SCHEMA
AND table_name = s1.TABLE_NAME
)
AND s2.INDEX_NAME NOT IN (
SELECT index_name
FROM id_indexes_copy
WHERE table_schema = s2.TABLE_SCHEMA
AND table_name = s2.TABLE_NAME
)
GROUP BY
s1.TABLE_SCHEMA, s1.TABLE_NAME, s1.INDEX_NAME, s2.INDEX_NAME
HAVING
COUNT(*) = (
SELECT COUNT(*)
FROM INFORMATION_SCHEMA.STATISTICS s3
WHERE s3.TABLE_SCHEMA = s1.TABLE_SCHEMA
AND s3.TABLE_NAME = s1.TABLE_NAME
AND s3.INDEX_NAME = s1.INDEX_NAME
)
AND COUNT(*) = (
SELECT COUNT(*)
FROM INFORMATION_SCHEMA.STATISTICS s4
WHERE s4.TABLE_SCHEMA = s2.TABLE_SCHEMA
AND s4.TABLE_NAME = s2.TABLE_NAME
AND s4.INDEX_NAME = s2.INDEX_NAME
)
AND redundant_index > kept_index;
-- Step 6: Insert id-specific redundant indexes (single-column 'id' against PRIMARY)
INSERT INTO id_redundant_indexes (table_schema, table_name, redundant_index, kept_index, index_columns, drop_statement)
SELECT
s.TABLE_SCHEMA,
s.TABLE_NAME,
s.INDEX_NAME AS redundant_index,
'PRIMARY' AS kept_index,
'id' AS index_columns,
CONCAT('ALTER TABLE `', s.TABLE_NAME, '` DROP INDEX `', s.INDEX_NAME, '`;') AS drop_statement
FROM
INFORMATION_SCHEMA.STATISTICS s
WHERE
s.TABLE_SCHEMA = DATABASE()
AND s.INDEX_NAME != 'PRIMARY'
AND s.COLUMN_NAME = 'id'
AND s.SEQ_IN_INDEX = 1
AND s.TABLE_NAME IN (SELECT table_name FROM selected_tables)
AND NOT EXISTS (
SELECT 1
FROM INFORMATION_SCHEMA.STATISTICS s2
WHERE s2.TABLE_SCHEMA = s.TABLE_SCHEMA
AND s2.TABLE_NAME = s.TABLE_NAME
AND s2.INDEX_NAME = s.INDEX_NAME
AND s2.SEQ_IN_INDEX > 1
)
AND EXISTS (
SELECT 1
FROM INFORMATION_SCHEMA.STATISTICS s3
WHERE s3.TABLE_SCHEMA = s.TABLE_SCHEMA
AND s3.TABLE_NAME = s.TABLE_NAME
AND s3.INDEX_NAME = 'PRIMARY'
AND s3.COLUMN_NAME = 'id'
AND s3.SEQ_IN_INDEX = 1
);
-- Step 7: Combine results into final redundant_indexes table
INSERT INTO redundant_indexes
SELECT * FROM general_redundant_indexes where index_columns != 'id'
UNION
SELECT * FROM id_redundant_indexes;
-- Clean up temporary tables
DROP TEMPORARY TABLE IF EXISTS selected_tables;
DROP TEMPORARY TABLE IF EXISTS selected_tables_copy;
DROP TEMPORARY TABLE IF EXISTS id_indexes;
DROP TEMPORARY TABLE IF EXISTS id_indexes_copy;
DROP TEMPORARY TABLE IF EXISTS general_redundant_indexes;
DROP TEMPORARY TABLE IF EXISTS id_redundant_indexes;
-- Step 8: Select final results
SELECT * FROM redundant_indexes ORDER BY table_name, redundant_index;
🔥 Why Does MySQL Allow Duplicate Indexes?
Because:
- MySQL doesn't enforce index uniqueness by definition, only by column content (UNIQUE)
- It's up to the developer to manage redundancy
MySQL raises warning 1831 for an exact duplicate definition but still creates the index; removing redundant indexes is left to the developer or to external tools such as pt-duplicate-key-checker.
🧹 What Should You Do?
You can safely drop the redundant index:
DROP INDEX IDX_Employees_ID ON Employees;
And rely on the existing PRIMARY KEY (Id)
for any lookups, joins, or sorts on Id.
Here's a detailed explanation of SQL indexing in relation to your example, covering performance, behavior, and how SQL engines optimize queries.
References: SQL Server index structure (sqlshack.com), MySQL Visual Explain Plan
SQL indexing is a crucial technique for optimizing database performance, especially for large tables and frequently executed queries. It's akin to creating an index in a book, allowing the database system to quickly locate specific data without scanning the entire table.
Imagine a giant library without any kind of catalog or organization. If you wanted to find a specific book, you'd have to walk through every aisle, pull out every book, and check its title. This is like a database without an index.
Now, imagine that same library with a detailed index in the front. You look up the book title, and the index tells you exactly which shelf and row it's on. This is what an index does for your database.
SQL Server: What is SQL Indexing? (YouTube): Key/RID lookup, Unique Index, Clustered/Non-Clustered Index

RID Lookup / Nonclustered Index Structures vs. Key Lookup / Clustered Index Structures: in SQL Server, indexes are organized as B-trees. Each index row contains a key and a pointer to either the data row itself (a RID, when the table is a heap) or the clustered index key (when the table has a clustered index). (Diagrams not reproduced.)
An SQL index is a special lookup table that the database search engine can use to speed up data retrieval. It's like an alphabetical list of terms, with pointers to where they can be found.
- Read Performance Improvement: When you ask MySQL to find something, it first checks if there's an index that can help. If there is, it uses the index to quickly pinpoint the location of the data, instead of scanning through every single record. This is how it dramatically improves "read" (SELECT) performance.
- How MySQL Converts SQL to Native Operations and Fetches Records: When you write a SQL query like SELECT * FROM feature WHERE name = "Card View";, MySQL goes through several steps:
  - Parsing: MySQL first reads and understands your SQL query. It checks for syntax errors.
  - Optimization: This is where the magic happens! MySQL's "optimizer" (a smart piece of software) looks at your query and the available indexes. It tries to figure out the fastest way to get the data. It considers:
- Which indexes exist?
- How selective is the WHERE clause (how many rows will it likely match)?
- What's the cost of using an index versus a full table scan?
- Based on these factors, the optimizer generates an "execution plan." This plan is like a detailed instruction manual for fetching the data.
- Execution: MySQL then follows the execution plan.
  - Without an Index (Table Scan): If there's no suitable index, MySQL will perform a "full table scan." This means it literally goes through every single row in the feature table, one by one, checking if name equals "Card View". This is slow for large tables.
  - With an Index (Index Lookup): If there's an index on the name column, MySQL will:
    - Go to the index.
    - Quickly find "Card View" in the index (indexes are typically stored in a highly organized way, like a B-Tree, which allows for very fast lookups).
    - The index entry for "Card View" contains "pointers" (or direct references) to the actual rows in the feature table where "Card View" exists.
    - MySQL then directly jumps to those specific rows to retrieve the data. This is significantly faster.
- Write Performance (the trade-off): While indexes are great for reading, they come with a cost for writing (INSERT, UPDATE, DELETE).
  - Whenever you add, change, or remove a record, MySQL not only has to update the actual data in the table but also needs to update all the indexes that involve those columns.
- This extra work means that INSERT, UPDATE, and DELETE operations can be slower on tables with many indexes. It's a trade-off: faster reads for slightly slower writes.
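One common mitigation for bulk loads (a sketch; table and index names follow the earlier examples) is to drop secondary indexes before the load and rebuild them once afterwards, so the B-tree is built in a single pass instead of being updated row by row:

```sql
-- Drop the secondary index before a large bulk load
DROP INDEX IDX_Employees_Name ON Employees;

-- ... run the batched INSERTs here ...

-- Rebuild it once at the end (a single bulk index build)
CREATE INDEX IDX_Employees_Name ON Employees(Name);
```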
Here are the possible steps involved in SQL indexing, from planning to maintenance:
I. Planning and Design (Before Creation)
- Identify Candidate Columns:
  - Frequently queried columns: Columns used in WHERE, JOIN, ORDER BY, and GROUP BY clauses are prime candidates.
  - Columns with high selectivity: Columns with a wide range of distinct values (e.g., product_id, customer_name) are better candidates than columns with few distinct values (e.g., gender).
  - Foreign key columns: Indexing foreign key columns significantly improves join performance.
  - Primary keys: A clustered index is often automatically created on primary key columns, as they uniquely identify rows and are frequently used for lookups.
  - Columns with low NULL values: Indexes generally work better on columns that don't contain many NULL values.
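Selectivity can be estimated directly as the ratio of distinct values to total rows; values close to 1.0 mark good index candidates. A sketch against the Employees table from earlier:

```sql
-- Selectivity = distinct values / total rows (closer to 1.0 is better)
SELECT
    COUNT(DISTINCT Email) / COUNT(*)      AS email_selectivity,
    COUNT(DISTINCT Department) / COUNT(*) AS dept_selectivity
FROM Employees;
```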
- Understand Query Patterns:
  - What kind of queries are being run? Are they mostly SELECT statements? Are there many INSERT, UPDATE, or DELETE operations? Indexes improve SELECT performance but can slow down write operations due to the overhead of maintaining the index structure.
  - What are the most critical queries in terms of performance? Focus on optimizing these first.
- Choose the Right Index Type:
  - Clustered Index:
    - Determines the physical order of data in the table.
    - A table can have only one clustered index.
    - Best for range queries or when data needs to be retrieved in a specific sorted order.
    - Often created automatically on the primary key.
  - Non-Clustered Index:
    - Does not affect the physical order of data.
    - Creates a separate data structure with pointers to the actual data rows.
    - A table can have multiple non-clustered indexes.
    - Good for improving performance on queries that filter or sort on columns not included in the clustered index.
  - Unique Index:
    - Ensures that all values in the indexed column(s) are unique.
    - Can be clustered or non-clustered.
    - Used for data integrity as well as performance.
  - Composite/Multi-column Index:
    - Created on two or more columns.
    - Useful for queries that filter or sort on multiple columns together. The order of columns in the composite index matters.
  - Filtered Index (SQL Server specific):
    - Indexes a subset of rows in a table, based on a `WHERE` clause.
    - Reduces index size and maintenance overhead for specific queries.
  - Other specialized index types: e.g., full-text indexes for searching text, spatial indexes for geographic data, XML indexes for XML data.
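The column-order rule for composite indexes can be sketched with the `Employees` table from earlier (the index name `idx_dept_name` is illustrative):

```sql
-- Composite index: Department is the leftmost column.
CREATE INDEX idx_dept_name ON Employees (Department, Name);

-- Can use the index: the leftmost column (Department) is filtered.
EXPLAIN SELECT * FROM Employees WHERE Department = 'Dept 1' AND Name = 'ABC 1';

-- Usually cannot use this index efficiently: the leftmost column is skipped.
EXPLAIN SELECT * FROM Employees WHERE Name = 'ABC 1';
```

This is MySQL's leftmost-prefix rule: a composite index on `(Department, Name)` also serves queries on `Department` alone, but not queries on `Name` alone.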
- Consider the Trade-offs:
  - Storage Space: Indexes require additional disk space.
  - Write Performance: `INSERT`, `UPDATE`, and `DELETE` operations can be slower, as the index needs to be updated.
  - Maintenance Overhead: Indexes need to be maintained (reorganized/rebuilt) over time to reduce fragmentation.
II. Creation of Indexes
- Syntax: Use the `CREATE INDEX` statement. The basic syntax is:

```sql
-- Note: [CLUSTERED | NONCLUSTERED] is SQL Server syntax;
-- MySQL's InnoDB always clusters data by the primary key.
CREATE [UNIQUE] [CLUSTERED | NONCLUSTERED] INDEX index_name
ON table_name (column1 [ASC|DESC], column2 [ASC|DESC], ...);
```

  - `index_name`: A descriptive name for your index.
  - `table_name`: The table on which the index is to be created.
  - `column1, column2, ...`: The columns to include in the index. You can specify ascending (`ASC`) or descending (`DESC`) order.
- Execution: Run the `CREATE INDEX` command in your database management system (DBMS). This process can take time for large tables, especially if it's a clustered index (as it involves physically reordering the data).
III. Monitoring and Maintenance (After Creation)
- Monitor Index Usage:
  - Use database-specific tools or Dynamic Management Views (DMVs) (e.g., `sys.dm_db_index_usage_stats` in SQL Server) to track how frequently indexes are being used by queries.
  - Identify unused or underutilized indexes.
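In MySQL, the `sys` schema (bundled since 5.7) and `performance_schema` offer comparable views; a sketch:

```sql
-- Indexes that have not been used since the server last started.
SELECT * FROM sys.schema_unused_indexes
WHERE object_schema = DATABASE();

-- Per-index read/write counters from performance_schema.
SELECT OBJECT_NAME, INDEX_NAME, COUNT_READ, COUNT_WRITE
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE OBJECT_SCHEMA = DATABASE();
```

Note that these counters reset on server restart, so judge "unused" only after a representative workload period.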
- Analyze Query Execution Plans:
  - Use `EXPLAIN` (PostgreSQL, MySQL) or `SHOWPLAN` (SQL Server) to see how the database optimizer is using (or not using) your indexes. This helps in understanding whether your indexes are effective.
- Identify and Address Fragmentation:
  - Over time, as data is inserted, updated, and deleted, indexes can become fragmented. Fragmentation can degrade query performance.
  - Monitor fragmentation levels (e.g., `sys.dm_db_index_physical_stats` in SQL Server).
- Perform Index Maintenance:
  - Reorganize Index: A lighter operation that defragments the index without rebuilding it from scratch. It reorders the leaf-level pages of the index (e.g., `ALTER INDEX ... REORGANIZE` in SQL Server).
  - Rebuild Index: A more intensive operation that drops and re-creates the index. This removes fragmentation completely and can improve performance more significantly (e.g., `ALTER INDEX ... REBUILD` in SQL Server).
  - Automate these tasks using scheduled jobs or maintenance plans.
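MySQL has no `ALTER INDEX ... REBUILD`; the closest InnoDB equivalents rebuild the whole table together with its indexes. A sketch using the `Employees` table from earlier:

```sql
-- Rebuilds the table and its indexes; for InnoDB this is internally
-- mapped to a recreate followed by ANALYZE.
OPTIMIZE TABLE Employees;

-- Equivalent explicit rebuild.
ALTER TABLE Employees ENGINE=InnoDB;

-- Refresh the optimizer's index statistics.
ANALYZE TABLE Employees;
```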
- Remove Unnecessary Indexes:
  - If an index is not being used or is providing little to no performance benefit, it's best to remove it to reduce storage overhead and improve write performance.
  - Use the `DROP INDEX` statement:

```sql
DROP INDEX index_name ON table_name;           -- SQL Server, MySQL
DROP INDEX index_name;                         -- PostgreSQL, Oracle
ALTER TABLE table_name DROP INDEX index_name;  -- MySQL
```
- Review and Re-evaluate:
- Regularly review your indexing strategy as your data and query patterns evolve.
- Adjust indexes as needed based on performance analysis and business requirements.
By following these steps, you can effectively implement and manage SQL indexes to significantly improve the performance of your database applications.
How it's fetched in MySQL (Huge Records vs. 10 Records):
- Huge Records (e.g., 100,000,000 records):
  - Without Index: Finding a specific record would be like searching for a needle in a massive haystack. MySQL would have to check millions of rows, taking a very long time (seconds, minutes, or even longer).
  - With Index: The index acts like a detailed map. MySQL uses the map to go directly to the handful of locations where the "needle" might be. The difference in speed is enormous.
- 10 Records:
  - Without Index: MySQL would quickly scan all 10 records. There are so few that the overhead of using an index might actually be slower than just scanning them directly.
  - With Index: MySQL would still use the index if it exists and the optimizer deems it beneficial. However, the performance difference compared to a full scan would be negligible, or sometimes even slightly worse due to the overhead of going through the index structure. This highlights an important point: indexes are most beneficial for large tables. For very small tables, they might not offer much, or could even slightly decrease performance due to their overhead.
Does Indexing Create a New Table?
No, an index does not create a completely separate, visible "new table" that you can query directly like `SELECT * FROM idx_name_feature;`.
Instead, an index is a separate data structure that MySQL manages internally alongside your main table. Think of it as:
- Main Table (`feature`): Contains all your actual data rows.
- Index (`idx_name_feature`): Contains a sorted list of the `Name` values, each pointing to the specific row(s) in the `feature` table where that `Name` appears.
How to View Indexes:
You can't "select" from an index directly, but you can see which indexes exist on a table:
```sql
-- To see indexes on your 'table_name' table
SHOW INDEX FROM table_name;

-- To see the full table definition including indexes
SHOW CREATE TABLE table_name;
```
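The same information can also be queried from `information_schema`, which is handy when you want to filter or sort the results; a sketch:

```sql
-- One row per indexed column; NON_UNIQUE = 0 marks unique indexes,
-- SEQ_IN_INDEX gives the column's position within a composite index.
SELECT INDEX_NAME, SEQ_IN_INDEX, COLUMN_NAME, NON_UNIQUE
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'table_name'
ORDER BY INDEX_NAME, SEQ_IN_INDEX;
```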
Best Way to Check Indexing Performance (Using `EXPLAIN ANALYZE`)
The `EXPLAIN ANALYZE` command (available since MySQL 8.0.18; plain `EXPLAIN` on older versions shows only estimates, while `ANALYZE` adds actual execution times) is your best friend for understanding how MySQL executes a query and whether it's using an index.
Huge Records (e.g., 100,000 records)
Let's re-examine your examples:
Original Query (No Index on Name):
```sql
explain analyze select * from feature where name = "Cart View";
/*
EXPLAIN
-> Filter: (feature.Name = 'Cart View')  (cost=121 rows=118) (actual time=0.523..1.93 rows=5 loops=1)
    -> Table scan on feature  (cost=121 rows=1182) (actual time=0.135..1.35 rows=1182 loops=1)
*/
```
Explanation:
- `Table scan on feature`: This is the key indicator! It means MySQL is reading every single row of the `feature` table (all 1182 rows, as the actual output confirms).
- `cost=121`: An estimated cost (a relative unit for comparison; lower is better).
- `rows=1182`: MySQL scanned all rows.
- `actual time=0.135..1.35`: 0.135 milliseconds to return the first row, 1.35 milliseconds to complete the scan.
- `Filter: (feature.Name = 'Cart View')`: After scanning, it then filters those 1182 rows down to the 5 that match "Cart View".
Creating the Index:
```sql
CREATE INDEX idx_name_feature ON feature (Name);
```

This command tells MySQL to build an index named `idx_name_feature` on the `Name` column of the `feature` table. MySQL will now create that separate, sorted data structure.
Query After Creating Index:
```sql
explain analyze select * from feature where name = "Cart View"; -- Index
/*
EXPLAIN
-> Index lookup on feature using idx_name_feature (Name='Cart View')  (cost=1.75 rows=5) (actual time=0.095..0.0985 rows=5 loops=1)
*/
```
Explanation:
- `Index lookup on feature using idx_name_feature`: This is exactly what you want to see! It means MySQL used your newly created index.
- `cost=1.75`: The estimated cost is drastically lower (1.75 vs. 121), indicating a much more efficient operation.
- `rows=5`: MySQL only had to look up 5 entries in the index, which pointed directly to the 5 matching data rows.
- `actual time=0.095..0.0985`: Around 0.095 milliseconds to the first row and under 0.1 milliseconds in total, compared to over 1 millisecond for the table scan. This is a clear win!
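A related trick: InnoDB secondary indexes also store the primary key, so a query that touches only the indexed column and `Id` can be answered from the index alone (a "covering index"); a sketch against the same `feature` table:

```sql
-- No row fetch needed: the idx_name_feature entries already contain
-- both Name and the primary key Id. EXPLAIN shows "Using index" in
-- the Extra column when a query is covered like this.
EXPLAIN SELECT Id, Name FROM feature WHERE Name = 'Cart View';
```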
Primary Key Example:
```sql
explain analyze select * from feature where id = 3; -- Primary key
/*
EXPLAIN
-> Rows fetched before execution  (cost=0..0 rows=1) (actual time=500e-6..500e-6 rows=1 loops=1)
*/
```
Explanation:
- Rows fetched before execution: This indicates an extremely fast lookup. Primary keys are usually very highly optimized and often correspond to a "clustered index" in MySQL (specifically, InnoDB tables cluster data by the primary key). This means the data itself is physically stored in the order of the primary key, making lookups by ID incredibly fast.
- actual time=500e-6: This is 0.0005 milliseconds, practically instantaneous!
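Because InnoDB stores rows physically in primary-key order, range scans over `Id` are also cheap; a quick sketch:

```sql
-- Range scan over the clustered primary key: the matching rows are
-- physically adjacent on disk, so EXPLAIN shows type "range" on PRIMARY.
EXPLAIN SELECT * FROM feature WHERE id BETWEEN 1 AND 10;
```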
Indexing small tables: Indexing is generally not beneficial for tables with a small number of records.
For tables with very few records (like your company table with only 2), MySQL's optimizer intelligently opts for a full table scan, as the overhead of using an index often outweighs the benefits.
Let's look at your company table example where the index wasn't used despite being created:
```sql
explain analyze select * from company where name = "code"; -- Records count 2
```

Before:

```
-> Filter: (company.Name = 'code')  (cost=0.45 rows=1) (actual time=0.688..0.688 rows=0 loops=1)
    -> Table scan on company  (cost=0.45 rows=2) (actual time=0.105..0.136 rows=2 loops=1)
```

After:

```
-> Filter: (company.Name = 'code')  (cost=0.45 rows=1) (actual time=0.0587..0.0587 rows=0 loops=1)
    -> Table scan on company  (cost=0.45 rows=2) (actual time=0.0465..0.0533 rows=2 loops=1)
```
The reason the index isn't used here, even "After" creating it, is because the company table only has 2 records!
MySQL's optimizer is smart. For a table with just 2 records, it's often faster for MySQL to simply scan those 2 records directly from the table (a "table scan") than to go through the overhead of:
- Looking up the index.
- Traversing the index structure.
- Following the pointer(s) from the index to the actual data rows.
- The cost of using the index (even if minimal) might outweigh the benefit for such a tiny table. The `cost=0.45` and `rows=2` in your `EXPLAIN ANALYZE` output for both "Before" and "After" clearly show that MySQL decided a full table scan was the most efficient approach for a table of that size. The actual time is slightly faster "After", likely because of caching or other minor factors, but it is still fundamentally doing a table scan.
When to consider indexing: While there's no fixed number, indexes typically become useful when tables have hundreds to thousands of records or more, especially if queries frequently filter or join on specific columns.
MySQL Indexing Commands Explained
```sql
-- Replace 'company' with your actual table name if different.
-- Replace 'idx_code' with your desired index name.
-- Replace 'code' with the actual column name you want to index.

-- 1. Display existing indexes on the 'company' table.
SHOW INDEXES FROM company;

-- 2. Display the full table definition, which includes index definitions.
SHOW CREATE TABLE company;

-- 3. Create an index on the 'code' column of the 'company' table.
CREATE INDEX idx_code ON company(code);

-- 4. Drop the previously created index (or an existing one).
DROP INDEX idx_code ON company;

-- 5. Show the execution plan for a full table scan.
EXPLAIN SELECT * FROM company;

-- 6. Update statistics for the 'company' table.
--    (Always a good idea after significant data changes or index creation/deletion.)
ANALYZE TABLE company;

-- 7. Show the actual execution plan and performance metrics for a query.
EXPLAIN ANALYZE SELECT * FROM company WHERE name = 'code';

-- 8. Show the actual execution plan, forcing the use of a specific index.
--    (Useful for testing if an index improves performance or to see its overhead.)
EXPLAIN ANALYZE SELECT * FROM company FORCE INDEX (idx_code) WHERE name = 'code';
```
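`FORCE INDEX` is the strongest of MySQL's index hints; `USE INDEX` and `IGNORE INDEX` are softer variants worth knowing for the same kind of testing. A sketch, shown here filtering on the indexed `code` column (the value `'abc'` is hypothetical):

```sql
-- Suggest the index, but let the optimizer fall back to a table scan.
EXPLAIN SELECT * FROM company USE INDEX (idx_code) WHERE code = 'abc';

-- Tell the optimizer to pretend the index does not exist.
EXPLAIN SELECT * FROM company IGNORE INDEX (idx_code) WHERE code = 'abc';
```

`IGNORE INDEX` is especially useful for comparing a query's plan with and without an index, without actually dropping it.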
Here's a quick guide to understanding your indexing operations:
- `SHOW INDEXES FROM company;`
  - What it does: Displays all indexes currently defined on the `company` table.
  - Usefulness: Essential to verify if an index exists, its type (`UNIQUE`, `PRIMARY`, `MULTIPLE`), and the columns it covers.
- `CREATE INDEX idx_code ON company(code);`
  - What it does: Creates a non-clustered index named `idx_code` on the `code` column of the `company` table.
  - Usefulness: Improves read performance for queries filtering or sorting by the `code` column.
- `DROP INDEX idx_code ON company;`
  - What it does: Deletes the index named `idx_code` from the `company` table.
  - Usefulness: Used to remove unnecessary or poorly performing indexes, reducing write overhead and storage space.
- `EXPLAIN SELECT * FROM company;`
  - What it does: Shows the execution plan that MySQL's optimizer intends to use for the query.
  - Output you'd see: Likely `type: ALL` (a full table scan), since `SELECT *` without a `WHERE` clause usually doesn't benefit from indexes.
  - Usefulness: Helps predict query performance before execution.
- `ANALYZE TABLE company;`
  - What it does: Collects and updates statistics about the `company` table (e.g., number of rows, distinct values in columns).
  - Usefulness: Crucial for the MySQL optimizer! Up-to-date statistics help the optimizer make better decisions about which index (if any) to use, or whether a table scan is more efficient. This step is often missed but very important for index effectiveness. Run it after significant data changes.
- `EXPLAIN ANALYZE SELECT * FROM company WHERE name = "code";`
  - What it does: Executes the `SELECT` query and then shows the actual execution plan along with real performance metrics (actual time, rows processed).
  - Output you'd typically see for a small table (like 2 records): A table scan, because the optimizer determines it's faster to just read all 2 records than to use an index.
  - Usefulness: Provides detailed, real-world insight into how a query performed and whether indexes were truly utilized.
- `EXPLAIN ANALYZE SELECT * FROM company FORCE INDEX (idx_code) WHERE name = 'code';`
  - What it does: Executes the `SELECT` query but forces MySQL to use the `idx_code` index on the `company` table, even if the optimizer thinks a table scan would be better.
  - Output you'd see: It will attempt to use `idx_code`. However, for a 2-record table, the actual time will likely be similar to or even slightly higher than a table scan, demonstrating why the optimizer usually avoids indexes on tiny tables.
  - Usefulness: A powerful diagnostic tool to test the performance of a specific index. Use it when you suspect MySQL's optimizer is making a wrong decision, or to confirm the overhead of an index on a small table.
Key Takeaway: Indexes are a performance optimization for large datasets. For very small tables, the optimizer will often correctly determine that a full table scan is more efficient.