Default Blame Acceptor: Some Information on Index Rebuilding... Which Collected from various Blogs

There have been many discussions about whether rebuilding indexes is useful or not. Generally speaking, the need to rebuild b-tree indexes is very rare, basically because a b-tree index is largely self-managed or self-balanced.

The most common justifications given for rebuilding an index are:
- index becomes fragmented
- index grows and grows - deleted space is not re-used
- index clustering factor becomes out of sync

In fact most indexes remain both balanced and fragmentation-free because free leaf entries will be reused. Inserts/Updates and Deletes result in free slots being scattered around the index blocks, but these will typically be refilled.
The clustering factor reflects how sorted the table data is with respect to the given index key. Rebuilding an index never has an influence on the clustering factor but instead requires a table re-organization.

Secondly the impact of rebuilding the index can be quite significant, please read the following comments thoroughly:

1. Most scripts around depend on the index_stats dynamic table. This is populated by the command:

analyze index ... validate structure;

While this is a valid method to inspect the index, it grabs an exclusive table lock while analyzing the index. Especially for large indexes, this can be very dramatic, as DML operations on the table are not permitted during that time. While it can be run online without the locking considerations, it may consume additional time.

2. Redo activity and general performance may increase as a direct result of rebuilding an index.

Insert/update/delete causes the index to evolve over time as the index splits and grows. When the index is rebuild it will become more tightly packed; however as DML operations continue on the table the index splits have to be redone again until the index reaches it's equilibrium. As a result, the redo activity increases and the index splits are now more likely to impact performance directly as we consume more I/O, CPU, etc to serve the index restructuring. After a certain period of time the index may again experience 'issues' and may be re-flagged for a rebuild, causing the vicious cycle to continue. Therefore it is often better to leave the index in its natural equilibrium and/or at least prevent indexes from being rebuilt on a regular basis.

3. An index coalesce is often preferred instead of an index rebuild. It has the following advantages:

- does not require approximately 2 times the disk storage
- always online
- does not restructure the index, but combines index leaf blocks as much as possible, avoiding system overhead as explained in point 2.
Note: To re-allocate an index, to another tablespace for example a rebuild is required.

Due to the reasons listed above, it is strongly advised not to rebuild indexes on a regular basis but instead use proper diagnostics.

Please see the following note which lists a script that can be used to analyze the index structure. It does not use the 'analyze index validate structure' command but is based on the current table and index statistics to estimate the index size.

------------------------------------------------------------------------------------------------------------------------------------------

This script will verify the structure of a b-tree index based on the existing table and index statistics. The script calculates the following items:

- Estimate the size the index should be. Optimal packing can be specified

- The index layout

This script keeps a history of the the data gathered in the INDEX_HIST table. This can be useful to prevent a pattern in index rebuilding. The history log can be user defined

Important to note is that this script does not use the 'analyze table ... validate structure' but it relies on the internal function SYS_OP_LBID which is available as of 9i

Before reviewing this script and its suggestions it is highly recommended to review
NOTE 989093.1 - Index Rebuild, the Need vs the Implications

REQUIREMENTS

SQL, SQL*Plus, iSQL*Plus

CONFIGURING

1. Create a user that will contain the index statistics tables

2. Assign the 'dba' privilege to this user.

3. Execute the code located in the script section

If the script is run as a user other than SYS, you may encounter ORA-942 errors when creating the package body.

A role is lost when the procedure is called with definer rights, so unless the following SELECT privileges are granted explicitly the CREATE PACKAGE BODY fails:

grant select on dba_indexes to <user>;
grant select on dba_tables to <user>;
grant select on dba_ind_columns to <user>;
grant select on dba_tab_cols to <user>;
grant select on dba_objects to <user>;
grant select on v_$parameter to <user>;
INSTRUCTIONS

As the script relies on updated statistics, please gather statistics first to be able to see changes in the schema.

SQL> exec dbms_stats.gather_schema_stats('SCOTT');

Then run the next procedure:

SQL> exec index_util.inspect_schema ('SCOTT');

CAUTION

This sample code is provided for educational purposes only, and is not supported by Oracle Support. It has been tested internally, however, we do not guarantee that it will work for you. Ensure that you run it in your test environment before using.
SCRIPT

CREATE TABLE index_log (
owner VARCHAR2(30),
index_name VARCHAR2(30),
last_inspected DATE,
leaf_blocks NUMBER,
target_size NUMBER,
idx_layout CLOB);

ALTER TABLE index_log ADD CONSTRAINT pk_index_log PRIMARY KEY (owner,index_name);

CREATE TABLE index_hist (
owner VARCHAR2(30),
index_name VARCHAR2(30),
inspected_date DATE,
leaf_blocks NUMBER,
target_size NUMBER,
idx_layout VARCHAR2(4000));

ALTER TABLE index_hist ADD CONSTRAINT pk_index_hist PRIMARY KEY (owner,index_name,inspected_date);

--
-- Variables:
-- vMinBlks: Specifies the minimum number of leaf blocks for scanning the index
-- Indexes below this number will not be scanned/reported on
-- vScaleFactor: The scaling factor, defines the threshold of the estimated leaf block count
-- to be smaller than the supplied fraction of the current size.
-- vTargetUse : Supplied percentage utilisation. For example 90% equates to the default pctfree 10
-- vHistRet : Defines the number of records to keep in the INDEX_HIST table for each index entry
--

CREATE OR REPLACE PACKAGE index_util AUTHID CURRENT_USER IS
vMinBlks CONSTANT POSITIVE := 1000;
vScaleFactor CONSTANT NUMBER := 0.6;
vTargetUse CONSTANT POSITIVE := 90; -- equates to pctfree 10
vHistRet CONSTANT POSITIVE := 10; -- (#) records to keep in index_hist
procedure inspect_schema (aSchemaName IN VARCHAR2);
procedure inspect_index (aIndexOwner IN VARCHAR2, aIndexName IN VARCHAR2, aTableOwner IN VARCHAR2, aTableName IN VARCHAR2, aLeafBlocks IN NUMBER);
END index_util;
/

CREATE OR REPLACE PACKAGE BODY index_util IS
procedure inspect_schema (aSchemaName IN VARCHAR2) IS
begin
FOR r IN (select table_owner, table_name, owner index_owner, index_name, leaf_blocks
from dba_indexes
where owner = upper(aSchemaname)
and index_type in ('NORMAL','NORMAL/REV','FUNCTION-BASED NORMAL')
and partitioned = 'NO'
and temporary = 'N'
and dropped = 'NO'
and status = 'VALID'
and last_analyzed is not null
order by owner, table_name, index_name) LOOP

IF r.leaf_blocks > vMinBlks THEN
inspect_index (r.index_owner, r.index_name, r.table_owner, r.table_name, r.leaf_blocks);
END IF;
END LOOP;
commit;
end inspect_schema;
procedure inspect_index (aIndexOwner IN VARCHAR2, aIndexName IN VARCHAR2, aTableOwner IN VARCHAR2, aTableName IN VARCHAR2, aLeafBlocks IN NUMBER) IS
vLeafEstimate number;
vBlockSize number;
vOverhead number := 192; -- leaf block "lost" space in index_stats
vIdxObjID number;
vSqlStr VARCHAR2(4000);
vIndxLyt CLOB;
vCnt number := 0;
TYPE IdxRec IS RECORD (rows_per_block number, cnt_blocks number);
TYPE IdxTab IS TABLE OF IdxRec;
l_data IdxTab;
begin
select block_size into vBlockSize from dba_tablespaces;
select round (100 / vTargetUse * -- assumed packing efficiency
(ind.num_rows * (tab.rowid_length + ind.uniq_ind + 4) + sum((tc.avg_col_len) * (tab.num_rows) ) -- column data bytes
) / (vBlockSize - vOverhead)
) index_leaf_estimate
into vLeafEstimate
from (select /*+ no_merge */ table_name, num_rows, decode(partitioned,'YES',10,6) rowid_length
from dba_tables
where table_name = aTableName
and owner = aTableOwner) tab,
(select /*+ no_merge */ index_name, index_type, num_rows, decode(uniqueness,'UNIQUE',0,1) uniq_ind
from dba_indexes
where table_owner = aTableOwner
and table_name = aTableName
and owner = aIndexOwner
and index_name = aIndexName) ind,
(select /*+ no_merge */ column_name
from dba_ind_columns
where table_owner = aTableOwner
and table_name = aTableName
and index_owner = aIndexOwner
and index_name = aIndexName) ic,
(select /*+ no_merge */ column_name, avg_col_len
from dba_tab_cols
where owner = aTableOwner
and table_name = aTableName) tc
where tc.column_name = ic.column_name
group by ind.num_rows, ind.uniq_ind, tab.rowid_length;

IF vLeafEstimate < vScaleFactor * aLeafBlocks THEN
select object_id into vIdxObjID
from dba_objects
where owner = aIndexOwner
and object_name = aIndexName;
vSqlStr := 'SELECT rows_per_block, count(*) blocks FROM (SELECT /*+ cursor_sharing_exact ' ||
'dynamic_sampling(0) no_monitoring no_expand index_ffs(' || aTableName ||
',' || aIndexName || ') noparallel_index(' || aTableName ||
',' || aIndexName || ') */ sys_op_lbid(' || vIdxObjID ||
', ''L'', ' || aTableName || '.rowid) block_id, ' ||
'COUNT(*) rows_per_block FROM ' || aTableOwner || '.' || aTableName || ' GROUP BY sys_op_lbid(' ||
vIdxObjID || ', ''L'', ' || aTableName || '.rowid)) group by rows_per_block order by rows_per_block';
execute immediate vSqlStr BULK COLLECT INTO l_data;
vIndxLyt := '';

FOR i IN l_data.FIRST..l_data.LAST LOOP
vIndxLyt := vIndxLyt || l_data(i).rows_per_block || ' - ' || l_data(i).cnt_blocks || chr(10);
END LOOP;

select count(*) into vCnt from index_log where owner = aIndexOwner and index_name = aIndexName;

IF vCnt = 0
THEN insert into index_log values (aIndexOwner, aIndexName, sysdate, aLeafBlocks, round(vLeafEstimate,2), vIndxLyt);
ELSE vCnt := 0;

select count(*) into vCnt from index_hist where owner = aIndexOwner and index_name = aIndexName;

IF vCnt >= vHistRet THEN
delete from index_hist
where owner = aIndexOwner
and index_name = aIndexName
and inspected_date = (select MIN(inspected_date)
from index_hist
where owner = aIndexOwner
and index_name = aIndexName);
END IF;

insert into index_hist select * from index_log where owner = aIndexOwner and index_name = aIndexName;

update index_log
set last_inspected = sysdate,
leaf_blocks = aLeafBlocks,
target_size = round(vLeafEstimate,2),
idx_layout = vIndxLyt
where owner = aIndexOwner and index_name = aIndexName;

END IF;
END IF;
END inspect_index;
END index_util;
/

SAMPLE OUTPUT

To find out the indexes that meet the criteria:

SQL> select owner, index_name, last_inspected, leaf_blocks, target_size

from index_log

OWNER INDEX_NAME LAST_INSP LEAF_BLOCKS TARGET_SIZE

------------------------------ ------------------------------ --------- ----------- -----------

SYS I_ARGUMENT1 17-JUN-10 432 303

SYS I_ARGUMENT2 17-JUN-10 282 186

SYS I_COL1 17-JUN-10 288 182

SYS I_DEPENDENCY1 17-JUN-10 109 103

SYS I_DEPENDENCY2 17-JUN-10 136 95

SYS I_H_OBJ#_COL# 17-JUN-10 258 104

SYS WRH$_SQL_PLAN_PK 17-JUN-10 118 59

SYS WRI$_ADV_PARAMETERS_PK 17-JUN-10 210 121

SYS I_WRI$_OPTSTAT_H_OBJ#_ICOL#_ST 17-JUN-10 2268 1313

SYS I_WRI$_OPTSTAT_H_ST 17-JUN-10 1025 963

SYS I_WRI$_OPTSTAT_HH_OBJ_ICOL_ST 17-JUN-10 338 191
To verify the Index layout the following query can be used:

SQL> select idx_layout

from index_log

where owner='SCOTT'

and index_name='T_IDX';

IDX_LAYOUT

------------

104 - 1

204 - 1

213 - 1

219 - 1

221 - 2

222 - 1

223 - 2

224 - 1

225 - 1

230 - 1

231 - 3

235 - 3

236 - 1

238 - 3

239 - 2

241 - 1

242 - 2

243 - 1

245 - 3

247 - 1

249 - 1

250 - 1

252 - 3

255 - 1

257 - 2

263 - 2

264 - 1

267 - 1

268 - 1

276 - 1

283 - 1

296 - 1

345 - 1
The first column lists the number of rows in the index block and the second column lists the number of blocks that have this number of index entries, for example there are 3 blocks with 238 rows, and 1 block with 345 rows.

To find out the evolution over time for a specific index :

SQL> select to_char(inspected_date,'DD-MON-YYYY HH24:MI:SS') inspected_date,

leaf_blocks, target_size

from index_hist

where index_name='T_IDX';

INSPECTED_DATE LEAF_BLOCKS TARGET_SIZE

-------------------- ----------- -----------

10-MAR-2010 10:04:04 432 303

10-APR-2010 10:04:03 435 430

10-MAY-2010 10:04:02 431 301

=================================================================================================================================
There are many compelling reasons to manage indexes within Oracle. In an OLTP system, index space is often greater than the space allocated for tables, and fast row data access is critical for sub-second response time. As we may know, Oracle offers a wealth of index structures:

B-tree indexes –.This is the standard tree index that Oracle has been using since the earliest releases.

Bitmap indexes – Bitmap indexes are used where an index column has a relatively small number of distinct values (low cardinality). These are super-fast for read-only databases, but are not suitable for systems with frequent updates

Bitmap join indexes – This is an index structure whereby data columns from other tables appear in a multi-column index of a junction table. This is the only create index syntax to employ a SQL-like from clause and where clause.

create bitmap index
part_suppliers_state
on
inventory( parts.part_type, supplier.state )
from
inventory i,
parts p,
supplier s
where
i.part_id=p.part_id
and
i.supplier_id=s.supplier_id;

In addition to these index structures we also see interesting use of indexes at runtime. Here is a sample of index-based access plans:

Nested loop joins – This row access method scans an index to collect a series of ROWID’s.

Index fast-full-scans – This is a “multi-block read” access where the index blocks are accessed via a “db file scattered read” to load index blocks into the buffers. Please note that this method does not read the index nodes.

Star joins – The star index has changed in structure several times, originally being a single-concatenated index and then changing to a bitmap index implementation. STAR indexes are super-fast when joining large read-only data warehouse tables.

Index combine access – This is an example of the use of the index_combine hint. This execution plan combines bitmap indexes to quickly resolve a low-cardinality Boolean expression:

select /*+ index_combine(emp, dept_bit, job_bit) */
ename,
job,
deptno,
mgr
from
emp
where
job = 'SALESMAN'
and
deptno = 30
;

Here is the execution plan that shows the index combine process:

OPERATION
----------------------------------------------------------------------
OPTIONS OBJECT_NAME POSITION
------------------------------ ---------------------------- ----------
SELECT STATEMENT
2
TABLE ACCESS
BY INDEX ROWID EMP 1
BITMAP CONVERSION
TO ROWIDS 1
BITMAP AND
BITMAP INDEX
SINGLE VALUE DEPT_BIT 1
BITMAP INDEX
SINGLE VALUE JOB_BIT 2

While the internals of Oracle indexing are very complex and open to debate, there are some things that you can do to explore the internal structures of your indexes. Let’s take a closer look at the method that I use to reveal index structures.

Oracle index rebuilding scripts

Ken Adkins, a respected Oracle author, notes that it is often difficult to pinpoint the exact reason that indexes benefit from a rebuild:

“The DBAs were pulling out their hair until they noticed that the size of the indexes were too large for the amount of data in the tables, and remembered this old “myth”, and decided to try rebuilding the indexes on these tables.

The DELETE with the multiple NOT EXISTS went from running for 2 hours to delete 30,000 records, to deleting over 100,000 records in minutes. Simply by rebuilding the indexes….”

An Oracle ACE notes this script to rebuild his indexes. "I eventually wrote a simple query that generates a list of candidates for index rebuilds, and the commands necessary to rebuild the indexes once the tables reached a point where it was necessary. The query reads a table we built called TABLE_MODIFICATIONS that we loaded each night from DBA_TAB_MODIFICATIONS before we ran statistics. Monitoring must be turned on to use the DBA_TAB_MODIFICATIONS table."

select
‘exec analyzedb.reorg_a_table4(’||””||rtrim(t.table_owner)||””||’,'||””||
rtrim(t.table_name)||””||’);’,
t.table_owner||’.'||t.table_name name,
a.num_rows,
sum(t.inserts) ins,
sum(t.updates) upd,
sum(t.deletes) del,
sum(t.updates)+sum(t.inserts)+sum(t.deletes) tot_chgs,
to_char((sum(t.deletes)/(decode(a.num_rows,0,1,a.num_rows)))*100.0,’999999.99') per_del,
round(((sum(t.updates)+sum(t.inserts)+sum(t.deletes))/(decode(a.num_rows,0,1,a.num_rows)) *100.0),2) per_chg
from analyzedb.table_modifications t,
all_tables a
where t.timestamp >= to_date(’&from_date’,'dd-mon-yyyy’) and
t.table_owner = a.owner and t.table_owner not in (’SYS’,'SYSTEM’) and
t.table_name=a.table_name
having (sum(t.deletes)/(decode(a.num_rows,0,1,a.num_rows)))*100.0 >=5
group by t.table_owner, t.table_name, a.num_rows
order by num_rows desc, t.table_owner, t.table_name;

The Debate Continues

Today, a battle is raging between the “academics” who do not believe that indexes should be rebuilt without expensive studies, and the “pragmatists” who rebuild indexes on a schedule because their end-users report faster response times.

To date, none of the world’s Oracle experts has determined a reliable rule for index rebuilding, and no expert has proven that index rebuilds “rarely” help. Getting statistically valid “proof” from a volatile production system would be a phenomenal challenge. In a large production system, it would be a massive effort to trace LIO from specific queries to specific indexes before and after the rebuild.

Academic approach - Many Oracle experts claim that indexes rarely benefit from rebuilding, yet none has ever provided empirical evidence that this is the case, or what logical I/O conditions arise in those “rare” cases where indexes benefit from rebuilding.

Pragmatic approach – Many IT managers force their Oracle DBAs to periodically rebuild indexes because the end-user community reports faster response times following the rebuild. The pragmatists are not interested in “proving” anything, they are just happy that the end-users are happy. Even if index rebuilding were to be proven as a useless activity, the Placebo Effect on the end-users is enough to justify the task.

It is clear that all 70 of the index metrics interact together in a predictable way. Some scientist should be able to take this data and reverse-engineer the internal rules for index rebuilding, if any actually exist. For now, the most any Oracle professional can do is to explore their indexes and learn how the software manages b-tree structures.

When can we "prove" a benefit from an index rebuild? Here, Robin Schumacher proves that an index that is rebuilt in a larger tablespace will contain more index entries be block, and have a flatter structure:

"As you can see, the amount of logical reads has been reduced in half simply by using the new 16K tablespace and accompanying 16K data cache."

In an OracleWorld 2003 presentation titled “Oracle Database 10g: The Self-Managing Database” by Sushil Kumar of Oracle Corporation, Kumar states that the new Automatic Maintenance Tasks (AMT) Oracle10g feature will "automatically detect and re-build sub-optimal indexes.“ However, the 10g segment advisor only recommends index rebuilding from a space reclamation perspective, not for performance.

This Kim Floss article shows the Oracle 10g segment advisor recommending a rebuild of an index:

“The page lists all the segments (table, index, and so on) that constitute the object under review. The default view ("View Segments Recommended to Shrink") lists any segments that have free space you can reclaim.”

Oracle index rebuild advisor (Source: Oracle Corporation)

Inside Oracle b-tree indexes

There are many myths and legends surrounding the use of Oracle indexes, especially the ongoing passionate debate about rebuilding of indexes for improving performance. Some experts claim that periodic rebuilding of Oracle b-tree indexes greatly improves space usage and access speed, while other experts maintain that Oracle indexes should “rarely” be rebuilt. Interestingly, Oracle reports that the new Oracle10g Automatic Maintenance Tasks (AMT) will automatically detect indexes that are in need of re-building. Here are the pros and cons of this highly emotional issue:

Arguments for Index Rebuilding – Many Oracle shops schedule periodic index rebuilding, and report measurable speed improvements after they rebuild their Oracle b-tree indexes. In an OracleWorld 2003 presentation titled Oracle Database 10 g: The Self-Managing Database by Sushil Kumar of Oracle Corporation, Kumar states that the Automatic Maintenance Tasks (AMT) Oracle10g feature will automatically detect and rebuild sub-optimal indexes. “AWR provides the Oracle Database 10g a very good 'knowledge' of how it is being used. By analyzing the information stored in AWR, the database can identify the need of performing routine maintenance tasks, such as optimizer statistics refresh, rebuilding indexes, etc. The Automated Maintenance Tasks infrastructure enables the Oracle Database to automatically perform those operations.”

Arguments against Index Rebuilding – Some Oracle in-house experts maintain that Oracle indexes are super-efficient at space re-use and access speed and that a b-tree index rarely needs rebuilding. They claim that a reduction in Logical I/O (LIO) should be measurable, and if there were any benefit to index rebuilding, someone would have come up with “provable” rules.

So who is right? I suspect that they both are correct. There is a huge body of evidence that index rebuilding makes the end-users report faster response time, and I have to wonder if this is only a Placebo Effect, with no scientific basis. Some experts suspect a Placebo Effect may be at work here, and the end-users, knowing that they have new index trees, report a performance gain when none exists.

Because of their extreme flexibility, Oracle b-tree indexes are quite complex, and to really gather scientific evidence we must examine all of the index metrics. Getting a meaningful measure of the benefit of an index rebuild in a production environment is very difficult because the system is under heavy load and usage patterns change constantly. Plus, many IT managers require periodic index re-building because it is a low-risk job and it curries favor from the end-user community.

Where are the index details?

Most Oracle professionals are aware of the dba_indexes view, which is populated with index statistics when indexes are analyzed. The dba_indexes view contains a great deal of important information for the SQL optimizer, but there is still more to see. Oracle provides an analyze index xxx validate structure command that provides additional statistics into a temporary tables called index_stats, which, sadly, is overlaid after each command.

To get the full picture, we must devise a table structure that will collect data from both sources. Here is a method that will do the job:

Create a temporary table to hold data from dba_indexes and index_stats

Verify quality of dbms_stats analysis

Populate temporary table from dba_indexes

Validate indexes and send output into temp table

IMPORTANT: Collecting the index_stats information is very time consuming and expensive and will introduce serious locking issues on production databases. It is strongly recommended that you perform these checks during scheduled downtime, or on a representative test database.

10g note: Tested against Oracle10g 10.2.0.1.0 on Windows XP SP2, and found

IOT_REDUNDANT_PKEY_ELIM VARCHAR2(3)
DROPPED VARCHAR2(3)
NUM_KEYS NUMBER

Need to be defined in the table index_details, and this crashes the subsequent table population and procedure.

Let’s start by creating a table to hold our index data. I call this table index_details:

drop table index_details;

Create table index_details
(
-- ********* The following is from dba_indexes ******************
OWNER_NAME VARCHAR2(30),
INDEX_NAME VARCHAR2(30),
INDEX_TYPE VARCHAR2(27),
TABLE_OWNER VARCHAR2(30),
TABLE_NAME VARCHAR2(30),
TABLE_TYPE VARCHAR2(11),
UNIQUENESS VARCHAR2(9),
COMPRESSION VARCHAR2(8),
PREFIX_LENGTH NUMBER,
TABLESPACE_NAME VARCHAR2(30),
INI_TRANS NUMBER,
MAX_TRANS NUMBER,
INITIAL_EXTENT NUMBER,
NEXT_EXTENT NUMBER,
MIN_EXTENTS NUMBER,
MAX_EXTENTS NUMBER,
PCT_INCREASE NUMBER,
PCT_THRESHOLD NUMBER,
INCLUDE_COLUMN NUMBER,
FREELISTS NUMBER,
FREELIST_GROUPS NUMBER,
PCT_FREE NUMBER,
LOGGING VARCHAR2(3),
BLEVEL NUMBER,
LEAF_BLOCKS NUMBER,
DISTINCT_KEYS NUMBER,
AVG_LEAF_BLOCKS_PER_KEY NUMBER,
AVG_DATA_BLOCKS_PER_KEY NUMBER,
CLUSTERING_FACTOR NUMBER,
STATUS VARCHAR2(8),
NUM_ROWS NUMBER,
SAMPLE_SIZE NUMBER,
LAST_ANALYZED DATE,
DEGREE VARCHAR2(40),
INSTANCES VARCHAR2(40),
PARTITIONED VARCHAR2(3),
TEMPORARY VARCHAR2(1),
GENERATED VARCHAR2(1),
SECONDARY VARCHAR2(1),
BUFFER_POOL VARCHAR2(7),
USER_STATS VARCHAR2(3),
DURATION VARCHAR2(15),
PCT_DIRECT_ACCESS NUMBER,
ITYP_OWNER VARCHAR2(30),
ITYP_NAME VARCHAR2(30),
PARAMETERS VARCHAR2(1000),
GLOBAL_STATS VARCHAR2(3),
DOMIDX_STATUS VARCHAR2(12),
DOMIDX_OPSTATUS VARCHAR2(6),
FUNCIDX_STATUS VARCHAR2(8),
JOIN_INDEX VARCHAR2(3),
-- ********* The following is from index_stats ******************
HEIGHT NUMBER,
BLOCKS NUMBER,
NAMEx VARCHAR2(30),
PARTITION_NAME VARCHAR2(30),
LF_ROWS NUMBER,
LF_BLKS NUMBER,
LF_ROWS_LEN NUMBER,
LF_BLK_LEN NUMBER,
BR_ROWS NUMBER,
BR_BLKS NUMBER,
BR_ROWS_LEN NUMBER,
BR_BLK_LEN NUMBER,
DEL_LF_ROWS NUMBER,
DEL_LF_ROWS_LEN NUMBER,
DISTINCT_KEYSx NUMBER,
MOST_REPEATED_KEY NUMBER,
BTREE_SPACE NUMBER,
USED_SPACE NUMBER,
PCT_USED NUMBER,
ROWS_PER_KEY NUMBER,
BLKS_GETS_PER_ACCESS NUMBER,
PRE_ROWS NUMBER,
PRE_ROWS_LEN NUMBER,
OPT_CMPR_COUNT NUMBER,
OPT_CMPR_PCTSAVE NUMBER
)
tablespace tools
storage (initial 5k next 5k maxextents unlimited);

(Note: the index_stats table has a column named PCT_USED even though Oracle indexes do not allow changes to this value.)

Now that we have a table that will hold all of the index details, the next step is to populate the table with data from freshly-analyzed indexes. Remember, you should always run dbms_stats to get current index statistics. Here is the script.

insert into index_details
(
OWNER_NAME,
INDEX_NAME,
INDEX_TYPE,
TABLE_OWNER,
TABLE_NAME,
TABLE_TYPE,
UNIQUENESS,
COMPRESSION,
PREFIX_LENGTH,
TABLESPACE_NAME,
INI_TRANS,
MAX_TRANS,
INITIAL_EXTENT,
NEXT_EXTENT,
MIN_EXTENTS,
MAX_EXTENTS,
PCT_INCREASE,
PCT_THRESHOLD,
INCLUDE_COLUMN,
FREELISTS,
FREELIST_GROUPS,
PCT_FREE,
LOGGING,
BLEVEL,
LEAF_BLOCKS,
DISTINCT_KEYS,
AVG_LEAF_BLOCKS_PER_KEY,
AVG_DATA_BLOCKS_PER_KEY,
CLUSTERING_FACTOR,
STATUS,
NUM_ROWS,
SAMPLE_SIZE,
LAST_ANALYZED,
DEGREE,
INSTANCES,
PARTITIONED,
TEMPORARY,
GENERATED,
SECONDARY,
BUFFER_POOL,
USER_STATS,
DURATION,
PCT_DIRECT_ACCESS,
ITYP_OWNER,
ITYP_NAME,
PARAMETERS,
GLOBAL_STATS,
DOMIDX_STATUS,
DOMIDX_OPSTATUS,
FUNCIDX_STATUS,
JOIN_INDEX
)
select * from dba_indexes
where owner not like 'SYS%'
;

Now that we have gathered the index details from dba_indexes, we must loop through iterations of the analyze index xxx validate structure command to populate our table with other statistics. Here is the script that I use to get all index details.

/* INDEX.STATS contains 1 row from last execution */
/* of ANALYZE INDEX ... VALIDATE STRUCTURE */
/* We need to loop through validates for each */
/* index and populate the table. */

DECLARE
v_dynam varchar2(100);
cursor idx_cursor is
select owner_name, index_name from index_details;

BEGIN
for c_row in idx_cursor loop
v_dynam := 'analyze index '||c_row.owner_name||'.'||c_row.index_name||
' validate structure';
execute immediate v_dynam;
update index_details set
(HEIGHT, BLOCKS, NAMEx, PARTITION_NAME, LF_ROWS, LF_BLKS,
LF_ROWS_LEN, LF_BLK_LEN, BR_ROWS, BR_BLKS, BR_ROWS_LEN,
BR_BLK_LEN, DEL_LF_ROWS, DEL_LF_ROWS_LEN, DISTINCT_KEYSx,
MOST_REPEATED_KEY, BTREE_SPACE, USED_SPACE, PCT_USED,
ROWS_PER_KEY, BLKS_GETS_PER_ACCESS, PRE_ROWS, PRE_ROWS_LEN,
OPT_CMPR_COUNT, OPT_CMPR_PCTSAVE)
= (select * from index_stats)
where index_details.owner_name = c_row.owner_name
and index_details.index_name = c_row.index_name;
if mod(idx_cursor%rowcount,50)=0 then
commit;
end if;
end loop;
commit;

END;
/

update
index_details a
set
num_keys =
(select
count(*)
from
dba_ind_columns b
where
a.owner_name = b.table_owner
and
a.index_name = b.index_name
)
;

After running the script from listing 3, we should now have complete index details for any index that we desire. However with 70 different metrics for each index, it can be quite confusing about which columns are the most important. To make queries easy, I create a view that only displays the columns that I find the most interesting. Here is my view.

drop view indx_stats;

Create view idx_stats
as
select
OWNER_NAME ,
INDEX_NAME ,
INDEX_TYPE ,
UNIQUENESS ,
PREFIX_LENGTH ,
BLEVEL ,
LEAF_BLOCKS ,
DISTINCT_KEYS ,
AVG_LEAF_BLOCKS_PER_KEY ,
AVG_DATA_BLOCKS_PER_KEY ,
CLUSTERING_FACTOR ,
NUM_ROWS ,
PCT_DIRECT_ACCESS ,
HEIGHT ,
BLOCKS ,
NAMEx ,
PARTITION_NAME ,
LF_ROWS ,
LF_BLKS ,
LF_ROWS_LEN ,
LF_BLK_LEN ,
BR_ROWS ,
BR_BLKS ,
BR_ROWS_LEN ,
BR_BLK_LEN ,
DEL_LF_ROWS ,
DEL_LF_ROWS_LEN ,
DISTINCT_KEYSx ,
MOST_REPEATED_KEY ,
BTREE_SPACE ,
USED_SPACE ,
PCT_USED ,
ROWS_PER_KEY ,
BLKS_GETS_PER_ACCESS ,
PRE_ROWS ,
PRE_ROWS_LEN ,
num_keys,
sum_key_len
from
Index_details;
While most of these column descriptions are self-evident, there are some that are especially important:

CLUSTERING_FACTOR – This is one of the most important index statistics because it indicates how well sequenced the index columns are to the table rows. If clustering_factor is low (about the same as the number of dba_segments.blocks in the table segment) then the index key is in the same order as the table rows and index range scans will be very efficient, with minimal disk I/O. As clustering_factor increases (up to dba_tables.num_rows), the index key is increasingly out of sequence with the table rows. Oracle’s cost-based SQL optimizer relies heavily upon clustering_factor to decide whether to use the index to access the table.

HEIGHT - As an index accepts new rows, the index blocks split. Once the index nodes have split to a predetermined maximum level the index will “spawn” into a new level.

BLOCKS – This is the number of blocks consumed by the index. This is dependent on the db_block_size. In Oracle9i and beyond, many DBAs create b-tree indexes in very large blocksizes (db_32k_cache_size) because the index will spawn less. Robin Schumacher has noted in his book Oracle Performance Troubleshooting notes “As you can see, the amount of logical reads has been reduced in half simply by using the new 16K tablespace and accompanying 16K data cache. Clearly, the benefits of properly using the new data caches and multi-block tablespace feature of Oracle9i and above are worth your investigation and trials in your own database.“

PCT_USED – This metric is very misleading because it looks identical to the dba_tables pct_used column, but has a different meaning. Normally, the pct_used threshold is the freelist unlink threshold, while in index_stats pct_used is the percentage of space allocated in the b-tree that is being used.

Is there a criterion for index rebuilding?

If we believe the anecdotal reports that index rebuilding improved end-user reported performance, how can we analyze this data and see what the criteria (if any) might be for an index rebuild?

For example, here are the criteria used by a fellow Oracle DBA who swears that rebuilding indexes with these criteria has a positive effect on his system performance:

-- *** Only consider when space used is more than 1 block ***
btree_space > 8192
and
-- *** The number of index levels is > 3 ***
(height > 3
-- *** The % being used is < 75% ***
or pct_used < 75
-- *** Deleted > 20% of total ***
or (del_lf_rows/(decode(lf_rows,0,1,lf_rows)) *100) > 20)
;

In reality I suspect that the rules are far more complicated than this simple formula. To see the commonality between indexes of similar nature you can use the data from your new index_details table to write summary queries (Listing 5). Here we see the average number of index blocks, leaf rows and leaf blocks for indexes of different heights.

This gives us a high-level idea of Oracle threshold for spawning an index onto new levels. We can take this same approach and attempt to answer the following questions:

1 - At what point does an index spawn to another level (height)? It should be a function of blocksize, key length and the number of keys.

2 - The number of deleted leaf nodes may not be enough to trigger an index rebuild. This is because if clustering_factor is low (dba_indexes.clustering_factor ~= dba_segments.blocks), then the rows are added in order, and the index is likely to reuse the deleted leaf nodes. On the other hand, if clustering_factor is high (dba_indexes.clustering_factor ~= dba_tables.num_rows), and the majority of queries use the index with fast-full scans or index range scans, then a rebuild of the underlying table (to resequence the rows) may be beneficial.

To illustrate, assume I have an index on the last_name column of a 1,000,000 row table and the clustering_factor is close to the number of blocks, indicating that the rows are in the same sequence as the index. In this case, a bulk delete of all people whose last_name begins with the letter “K” would leave a dense cluster of deleted leaf nodes on adjacent data blocks within the index tablespace. This large section of space is more likely to be reused than many tiny chunks.

We can also use the data from our detail table to compute our own metrics. In the example query below, we create a meta-rule for indexes:

Dense Full Block Space - This is the index key space (number of table rows * index key length) as a function of the blocksize and free index space.

Percent of Free Blocks - This is the estimated number of free blocks within the index.

Using these metrics, we can analyze the system and produce some very interesting reports of index internals:

col c1 heading 'Average|Height' format 99
col c2 heading 'Average|Blocks' format 999,999
col c3 heading 'Average|Leaf|Rows' format 9,999,999,999
col c4 heading 'Average|Leaf Row|Length' format 999,999,999
col c5 heading 'Average|Leaf Blocks' format 9,999,999
col c6 heading 'Average|Leaf Block|Length' format 9,999,999

select
height c1,
avg(blocks) c2,
avg(lf_rows) c3,
avg(lf_rows_len) c4,
avg(lf_blks) c5,
avg(lf_blk_len) c6
from
index_details
group by
height
;

Average Average Average
Average Average Leaf Leaf Row Average Leaf Block
Height Blocks Rows Length Leaf Blocks Length
------- -------- -------------- ------------ ----------- ----------
1 236 12 234 1 7,996
2 317 33,804 691,504 106 7,915
3 8,207 1,706,685 41,498,749 7,901 7,583
4 114,613 12,506,040 538,468,239 113,628 7,988
As we see, we can compute and spin this data in an almost infinite variety of ways.

Reader Comments on index rebuilding:

It has been my experience that the percentage of deletes required to affect performance actually goes down as the table gets larger, perhaps because the I/O effect is magnified with more data.

At my previous job, our observation was that on 20 million row tables and upward as little as 5% deletes, regardless of other factors, would cause sufficient performance degradation to justify an index rebuild. Admittedly this was a site with fairly tight budget constraints, so we couldn't afford absolute, top of the line hardware. We also didn't have the luxury of making sure every tablespace was ideally placed for performance, but we did a pretty good job with what we had. I actually wrote a script that would calculate the % deletes and generate the index rebuild command.

Also, there is the rebuild online option for indices, which does work but it will cause noticeable performance degradation if you try and do it during busy times. I believe it was available in 8i, but it would generate ora-600 errors when used on busy tables in 8i.

- Oracle ACE

See my related notes on index rebuilding:

Oracle index rebuilding

Oracle index rebuilding - indexes rebuild script

Oracle myth busters index rebuild disk io wait event analysis

Oracle index rebuilding: rebuild indexes

Predict Oracle table index benefit from rebuild reorganization

Oracle table index rebuilding benefits

Oracle Index rebuilding - Why when how

Identifying which Oracle Indexes to Rebuild

Oracle tuning: Blocksize and index tree structures

Reader Feedback:

10g usage note:

Tested against Oracle10g 10.2.0.1.0 on Windows XP SP2, and found

IOT_REDUNDANT_PKEY_ELIM VARCHAR2(3)
DROPPED VARCHAR2(3)
NUM_KEYS NUMBER

not defined in the table index_details, and this crashes the subsequent table population and procedure.

=====================================================================================================================

Index Rebuild from ORACLE Forums
=====================================================================================================================
When to rebuild index ?
By shailesh.mishra on Sep 20, 2009

Index is one of intersting objects in database, which always attract DBAs and Developers to fine tune their queries. Because index is just like salt in food. It should be used optimized quantity. Some time index doesnt show actual expected performance. In those cases we need to chaeck whether index need to be rebuild or not.
I went through certain research docs which shows certain criteria for rebuilding index:

The number of deleted leaf nodes - the number of index nodes that have been logically deleted as a result of row deletes. Remember that Oracle leaves "dead" index nodes in the index when rows are deleted. This is done to speed up SQL deletes, since Oracle does not have to rebalance the index tree when rows are deleted.
Index height - the number of levels that are spawned by the index as a result in row inserts. When a large amount of rows are added to a table, Oracle may spawn additional levels of an index to accommodate the new rows. Hence, an Oracle index may have four levels, but only in those areas of the index tree where the massive inserts have occurred. Oracle indexes can support many millions of entries in three levels

Gets per index access - the amount of logical I/O that is required to fetch a row with the index. As you may know, a logical "get" is not necessarily a physical I/O since much of the index may reside in the Oracle buffer cache. Unfortunately, Oracle does not make it easy to capture this information. In Oracle we must issue these commands:
ANALYZE INDEX index_name COMPUTE STATISTICS
ANALYZE INDEX index_name VALIDATE STRUCTURE

After you analyze the report above, you may want to consider rebuilding any index where the height is more than three levels, since three levels will support millions of index entries.

As you may know, you can easily rebuild an Oracle index with the command:

ALTER INDEX index_name REBUILD

Seeking to other docs I found there are some myths about it. I am a big fan of richard foote concept on oracle database specially about indexes. he has broken some myth about oracle internals. Following some example is based on his some of documents.

Is deleted leaf blocks are reused? Yes. but depends upon how soon data will be reinserted and while B-Tree will balance the tree will reuse it.

SQL> create table test_empty_block (id number, value varchar2(10));
Table created.
SQL> begin
2 for i in 1..10000 loop
3 insert into test_empty_block values (i, 'Bowie');
4 end loop;
5 commit;
6 end;
7 /
PL/SQL procedure successfully completed.
SQL> create index test_empty_block_idx on test_empty_block (id);
Index created.

SQL> delete test_empty_block where id between 1 and 9990;
9990 rows deleted.
SQL> commit;
Commit complete.
SQL> analyze index test_empty_block_idx validate structure;
Index analyzed.
SQL> select lf_blks, del_lf_rows from index_stats;
LF_BLKS DEL_LF_ROWS
---------- -----------
21 9990

Now reinsert a similar volume but after the last current values
SQL> begin
2 for i in 20000..30000 loop
3 insert into test_empty_block values (i, 'Bowie');
4 end loop;
5 commit;
6 end;
7 /
PL/SQL procedure successfully completed.
SQL> analyze index test_empty_block_idx validate structure;
Index analyzed.
SQL> select lf_blks, del_lf_rows from index_stats;
LF_BLKS DEL_LF_ROWS
---------- -----------
21 0
Note all empty blocks have been reused and deleted rows cleanout.

Following select statement was executed after the 9990 deletions in previous example
SQL> select /*+ index test_empty_blocks */ * from test_empty_blocks
where id between 1 and 100000;
10 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST_EMPTY_BLOCKS'
2 1 INDEX (RANGE SCAN) OF 'TEST_EMPTY_BLOCKS_IDX' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
28 consistent gets
0 physical reads
0 redo size
549 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
10 rows processed

Deleted space is cleaned out by subsequent writes

Deleted space is cleaned out by delayed block cleanout

Fully emptied blocks are placed on freelist and recycled (although remain in the index structure)

Suggestions that deleted space can never be reused are wrong and yet another silly myth

Conditions for Rebuilds
Large free space (generally 50%+), which indexes rarely reach, and
Large selectivity, which most index accesses never reach, and
Response times are adversely affected, which rarely are.
Note requirement of some free space anyways to avoid insert and
subsequent free space issues
Benefit of rebuild based on various dependencies which include:
- Size of index
- Clustering Factor
- Caching characteristics
- Frequency of index accesses
- Selectivity (cardinality) of index accesses
- Range of selectivity (random or specific range)
- Efficiency of dependent SQL
- Fragmentation characteristics (does it effect portion of index frequently used)
- I/O characteristics of index (serve contention or I/O bottlenecks)
- The list goes on and on ....
Myths:

The vast majority of indexes do not require rebuilding

Oracle B-tree indexes can become "unbalanced" and need to be rebuilt is a myth

Deleted space in an index is "deadwood" and over time requires the index to be rebuilt is a myth

If an index reaches "x" number of levels, it becomes inefficient and requires the index to be rebuilt is a myth

If an index has a poor clustering factor, the index needs to be rebuilt is a myth

To improve performance, indexes need to be regularly rebuilt is a myth

I think above discussion will invoke us to test more results in quantitative analysis. I have given enough approach for it.
Test & Enjoy!!

Thursday, August 15, 2013

Some Information on Index Rebuilding... Which Collected from various Blogs

No comments:

Post a Comment

Total Pageviews

Reference