Refreshing Materialized Views
2016-03-01 14:46
https://docs.oracle.com/database/121/DWHSG/refresh.htm#DWHSG8358
This chapter discusses how to refresh materialized views, which is a key element in maintaining good performance and consistent data when working with materialized views in a data warehousing environment.
This chapter includes the following sections:
Refreshing Materialized Views
Using Materialized Views with Partitioned Tables
Using Partitioning to Improve Data Warehouse Refresh
Optimizing DML Operations During Refresh
The database maintains data in materialized views by refreshing them after changes to the base tables. The refresh method can be incremental or a complete refresh. There are two incremental refresh methods, known as log-based refresh and partition change tracking (PCT) refresh. The incremental refresh is commonly called fast refresh because it is usually faster than the complete refresh.
A complete refresh occurs when the materialized view is initially created when it is defined as BUILD IMMEDIATE, unless the materialized view references a prebuilt table or is defined as BUILD DEFERRED.
Users can perform a complete refresh at any time after the materialized view is created. The complete refresh involves executing the query that defines the materialized view. This process can be slow, especially if the database must read and process huge amounts
of data.
An incremental refresh eliminates the need to rebuild materialized views from scratch. Thus, processing only the changes can result in a very fast refresh time. Materialized views can be refreshed either on demand or at regular time intervals. Alternatively,
materialized views in the same database as their base tables can be refreshed whenever a transaction commits its changes to the base tables.
For materialized views that use the log-based fast refresh method, a materialized view log and/or a direct loader log keep a record of changes to the base tables. A materialized view log is a schema object that records changes to a base table so that a materialized
view defined on the base table can be refreshed incrementally. Each materialized view log is associated with a single base table. The materialized view log resides in the same database and schema as its base table.
The PCT refresh method can be used if the modified base tables are partitioned and the modified base table partitions can be used to identify the affected partitions or portions of data in the materialized view. When there have been some partition maintenance
operations on the base tables, this is the only incremental refresh method that can be used. The PCT refresh removes all data in the affected materialized view partitions or affected portions of data and recomputes them from scratch.
For each of these refresh methods, you have two options for how the refresh is performed, namely in-place refresh and out-of-place refresh. The in-place refresh executes the refresh statements directly on the materialized view. The out-of-place refresh creates
one or more outside tables and executes the refresh statements on the outside tables and then switches the materialized view or affected materialized view partitions with the outside tables. Both in-place refresh and out-of-place refresh achieve good performance
in certain refresh scenarios. However, the out-of-place refresh enables high materialized view availability during refresh, especially when refresh statements take a long time to finish.
Also adopting the out-of-place mechanism, a new refresh method called synchronous refresh is introduced in Oracle Database 12c, Release 1. It targets the common usage scenario in the data warehouse where
both fact tables and their materialized views are partitioned in the same way or their partitions are related by a functional dependency.
This refresh approach enables you to keep a set of tables and the materialized views defined on them always in sync. In this refresh method, the user does not directly modify the contents of the base tables but must use the APIs provided by the synchronous refresh package, which applies these changes to the base tables and materialized views at the same time to ensure their consistency. The synchronous refresh method is well suited for data warehouses, where the loading of incremental data is tightly controlled and occurs at periodic intervals.
When creating a materialized view, you have the option of specifying whether the refresh occurs ON DEMAND or ON COMMIT. In the case of ON COMMIT, the materialized view is changed every time a transaction commits, thus ensuring that the materialized view always contains the latest data. Alternatively, you can control the time when refresh of the materialized views occurs by specifying ON DEMAND. In the case of ON DEMAND materialized views, the refresh can be performed with refresh methods provided in either the DBMS_MVIEW or the DBMS_SYNC_REFRESH packages:
The DBMS_SYNC_REFRESH package contains the APIs for synchronous refresh, a new refresh method introduced in Oracle Database 12c, Release 1. For details, see Chapter 8, "Synchronous Refresh".
The DBMS_MVIEW package contains the APIs whose usage is described in this chapter. There are three basic types of refresh operations: complete refresh, fast refresh, and partition change tracking (PCT) refresh. These basic types have been enhanced in Oracle Database 12c, Release 1 with a new refresh option called out-of-place refresh.
The DBMS_MVIEW package contains three APIs for performing refresh operations:
DBMS_MVIEW.REFRESH: Refresh one or more materialized views.
DBMS_MVIEW.REFRESH_ALL_MVIEWS: Refresh all materialized views.
DBMS_MVIEW.REFRESH_DEPENDENT: Refresh all materialized views that depend on a specified master table or materialized view or list of master tables or materialized views.
See Also:
"Manual Refresh Using the DBMS_MVIEW Package" for more information
Performing a refresh operation requires temporary space to rebuild the indexes and can require additional space for performing the refresh operation itself. Some sites might prefer not to refresh all of their materialized views at the same time: as soon as some underlying detail data has been updated, all materialized views using this data become stale. Therefore, if you defer refreshing your materialized views, you can either rely on your chosen rewrite integrity level to determine whether or not a stale materialized view can be used for query rewrite, or you can temporarily disable query rewrite with an ALTER SYSTEM SET QUERY_REWRITE_ENABLED = FALSE statement.
Refreshing a materialized view automatically updates all of its indexes. In the case of full refresh, this requires temporary sort space to rebuild all indexes during refresh. This is because the full refresh truncates or deletes the table before inserting the new full data volume. If insufficient temporary space is available to rebuild the indexes, then you must explicitly drop each index or mark it UNUSABLE prior to performing the refresh operation.
If you anticipate performing insert, update or delete operations on tables referenced by a materialized view concurrently with the refresh of that materialized view, and that materialized view includes joins and aggregation, Oracle recommends you use ON COMMIT fast refresh rather than ON DEMAND fast refresh.
An additional option when performing refresh is to use out-of-place refresh, where outside tables are used to improve materialized view availability and refresh performance in certain situations.
See Also:
Oracle OLAP User's Guide for
information regarding the refresh of cube organized materialized views
"The Out-of-Place Refresh Option" for a discussion of out-of-place refresh
This section contains the following topics:
Complete Refresh
Fast Refresh
Partition Change Tracking (PCT) Refresh
The Out-of-Place Refresh Option
ON COMMIT Refresh
Manual Refresh Using the DBMS_MVIEW Package
Refresh Specific Materialized Views with REFRESH
Refresh All Materialized Views with REFRESH_ALL_MVIEWS
Refresh Dependent Materialized Views with REFRESH_DEPENDENT
Using Job Queues for Refresh
When Fast Refresh is Possible
Recommended Initialization Parameters for Parallelism
Monitoring a Refresh
Checking the Status of a Materialized View
Scheduling Refresh
A complete refresh occurs when the materialized view is initially defined as BUILD IMMEDIATE, unless the materialized view references a prebuilt table. For materialized views using BUILD DEFERRED,
a complete refresh must be requested before it can be used for the first time. A complete refresh may be requested at any time during the life of any materialized view. The refresh involves reading the detail tables to compute the results for the materialized
view. This can be a very time-consuming process, especially if there are huge amounts of data to be read and processed. Therefore, you should always consider the time required to process a complete refresh before requesting it.
There are, however, cases when the only refresh method available for an already built materialized view is complete refresh because the materialized view does not satisfy the conditions specified in the following section for a fast refresh.
Most data warehouses have periodic incremental updates to their detail data. As described in "Materialized
View Schema Design", you can use the SQL*Loader or any bulk load utility to perform incremental loads of detail data. Fast refresh of your materialized views is usually efficient, because instead of having to recompute the entire materialized view, the
changes are applied to the existing data. Thus, processing only the changes can result in a very fast refresh time.
When there have been some partition maintenance
operations on the detail tables, this is the only method of fast refresh that can be used. PCT-based refresh on a materialized view is enabled only if all the conditions described in "About
Partition Change Tracking" are satisfied.
In the absence of partition maintenance operations on detail tables, when you request a FAST method (method => 'F'), Oracle uses a heuristic rule to try log-based fast refresh before choosing PCT refresh. Similarly, when you request a FORCE method (method => '?'), Oracle chooses the refresh method based on the following attempt order: log-based fast refresh, PCT refresh, and complete refresh. Alternatively, you can request the PCT method (method => 'P'), and Oracle uses the PCT method provided all PCT requirements are satisfied.
Oracle can use TRUNCATE PARTITION on a materialized view if it satisfies the conditions in "Benefits of Partitioning a Materialized View" and hence, make the PCT refresh process more efficient.
See Also:
"About Partition Change Tracking" for more information regarding partition change
tracking
Beginning with Oracle Database 12c Release 1, a new refresh option is available to improve materialized view refresh performance and availability. This refresh option is called out-of-place refresh because it uses outside tables during refresh as opposed to the existing "in-place" refresh that directly applies changes to the materialized view container table. The out-of-place refresh option works with all existing refresh methods, such as FAST, COMPLETE, PCT, and FORCE.
Out-of-place refresh is particularly effective when handling situations with large amounts of data changes, where conventional DML statements do not scale well. It also enables you to achieve a very high degree of availability because the materialized views
that are being refreshed can be used for direct access and query rewrite during the execution of refresh statements. In addition, it helps to avoid potential problems such as materialized view container tables becoming fragmented over time or intermediate
refresh results being seen.
In out-of-place refresh, the entire or affected portions of a materialized view are computed into one or more outside tables. For partitioned materialized views, if partition level change tracking is possible, and there are local indexes defined on the materialized
view, the out-of-place method also builds the same local indexes on the outside tables. This refresh process is completed by either switching between the materialized view and the outside table or partition exchange between the affected partitions and the
outside tables. During refresh, the outside table is populated by direct load, which is efficient.
This section contains the following topics:
Types of Out-of-Place Refresh
Restrictions and Considerations with Out-of-Place Refresh
There are three types of out-of-place refresh:
out-of-place fast refresh
This offers better availability than in-place fast refresh. It also offers better performance when changes affect a large part of the materialized view.
out-of-place PCT refresh
This offers better availability than in-place PCT refresh. There are two different approaches for partitioned and non-partitioned materialized views. If truncation and direct load are not feasible, you should use out-of-place refresh when the changes are relatively
large. If truncation and direct load are feasible, in-place refresh is preferable in terms of performance. In terms of availability, out-of-place refresh is always preferable.
out-of-place complete refresh
This offers better availability than in-place complete refresh.
Using the refresh interface in the DBMS_MVIEW package with method => '?' and out_of_place => TRUE, out-of-place fast refresh is attempted first, then out-of-place PCT refresh, and finally out-of-place complete refresh. An example is the following:
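A call of this form might be sketched as follows (the materialized view name CAL_MONTH_SALES_MV is illustrative):

```sql
BEGIN
  DBMS_MVIEW.REFRESH(
    list           => 'CAL_MONTH_SALES_MV',
    method         => '?',       -- FORCE: try fast, then PCT, then complete
    atomic_refresh => FALSE,     -- out-of-place refresh requires non-atomic mode
    out_of_place   => TRUE);
END;
/
```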
Out-of-place refresh has all the restrictions that apply when using the corresponding in-place refresh. In addition, it has the following restrictions:
Only materialized join views and materialized aggregate views are allowed
No ON COMMIT refresh is permitted
No remote materialized views, cube materialized views, object materialized views are permitted
No LOB columns are permitted
Not permitted if materialized view logs, triggers, or constraints (except NOT NULL) are defined on the materialized view
Not permitted if the materialized view contains the CLUSTERING clause
Not applied to complete refresh within a CREATE or ALTER MATERIALIZED VIEW session or an ALTER TABLE session
Atomic mode is not permitted. If you specify atomic_refresh as TRUE and out_of_place as TRUE, an error is displayed
For out-of-place PCT refresh, there is the following restriction:
No UNION ALL or grouping sets are permitted
For out-of-place fast refresh, there are the following restrictions:
No UNION ALL, grouping sets or outer joins are permitted
Not allowed for materialized join views when more than one base table is modified with mixed DML statements
Out-of-place refresh requires additional storage for the outside table and the indexes for the duration of the refresh. Thus, you must have enough available tablespace or auto extend turned on.
The partition exchange in out-of-place PCT refresh impacts the global index on the materialized view. Therefore, if there are global indexes defined on the materialized view container table, Oracle disables the global indexes before doing the partition exchange and rebuilds the global indexes after the partition exchange. This rebuilding is additional overhead.
A materialized view can be refreshed automatically using the ON COMMIT method. Therefore, whenever a transaction commits which has updated the tables on which a materialized view is defined, those changes are automatically reflected in the materialized view. The advantage of using this approach is you never have to remember to refresh the materialized view. The only disadvantage is the time required to complete the commit will be slightly longer because of the extra processing involved. However, in a data warehouse, this should not be an issue because there are unlikely to be concurrent processes trying to update the same table.
When a materialized view is refreshed ON DEMAND, one of four refresh methods can be specified as shown in the following table. You can define a default option during the creation of the materialized view. Table 7-1 details the refresh options.

Table 7-1 ON DEMAND Refresh Methods

COMPLETE (C): Refreshes by recalculating the defining query of the materialized view.
FAST (F): Refreshes by incrementally applying changes to the materialized view.
FORCE (?): Attempts a fast refresh. If that is not possible, it does a complete refresh.
PCT (P): Refreshes by recomputing the rows in the materialized view affected by changed partitions in the detail tables.
Three refresh procedures are available in the DBMS_MVIEW package for performing ON DEMAND refresh. Each has its own unique set of parameters.
See Also:
Oracle Database Advanced Replication for information showing how to use DBMS_MVIEW in a replication environment
Oracle Database PL/SQL Packages and Types Reference for detailed information about the DBMS_MVIEW package
Use the DBMS_MVIEW.REFRESH procedure to refresh one or more materialized views. Some parameters are used only for replication, so they are not mentioned here. The required parameters to use this procedure are:
The comma-delimited list of materialized views to refresh
The refresh method: F-Fast, P-Fast_PCT, ?-Force, C-Complete
The rollback segment to use
Refresh after errors (TRUE or FALSE)
A Boolean parameter. If set to TRUE, the number_of_failures output parameter is set to the number of refreshes that failed, and a generic error message indicates that failures occurred. The alert log for the instance gives details of refresh errors. If set to FALSE, the default, then refresh stops after it encounters the first error, and any remaining materialized views in the list are not refreshed.
The following four parameters are used by the replication process. For warehouse refresh, set them to FALSE, 0, 0, 0.
Atomic refresh (TRUE or FALSE)
If set to TRUE, then all refreshes are done in one transaction. If set to FALSE, then each of the materialized views is refreshed in separate transactions. If set to FALSE, Oracle can optimize refresh by using parallel DML and truncate DDL on a materialized view. When a materialized view is refreshed in atomic mode, it is eligible for query rewrite if the rewrite integrity mode is set to STALE_TOLERATED. Atomic refresh cannot be guaranteed when refresh is performed on nested views.
Whether to use out-of-place refresh
This parameter works with all existing refresh methods (F, P, C, ?). So, for example, if you specify F and out_of_place => TRUE, then an out-of-place fast refresh is attempted. Similarly, if you specify P and out_of_place => TRUE, then out-of-place PCT refresh is attempted.
For example, to perform a fast refresh on the materialized view CAL_MONTH_SALES_MV, the DBMS_MVIEW package would be called as follows:
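A sketch of such a call, using named parameters (the materialized view name is illustrative):

```sql
BEGIN
  DBMS_MVIEW.REFRESH(
    list                 => 'CAL_MONTH_SALES_MV',
    method               => 'F',      -- fast refresh
    refresh_after_errors => TRUE);    -- continue past individual failures
END;
/
```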
Multiple materialized views can be refreshed at the same time, and they do not all have to use the same refresh method. To give them different refresh methods, specify multiple method codes in the same order as the list of materialized views (without commas). For example, the following specifies that CAL_MONTH_SALES_MV be completely refreshed and FWEEK_PSCAT_SALES_MV receive a fast refresh:
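A sketch of the call (the view names are illustrative; the characters of the method string pair positionally with the views in the list):

```sql
BEGIN
  -- 'C' applies to CAL_MONTH_SALES_MV, 'F' to FWEEK_PSCAT_SALES_MV
  DBMS_MVIEW.REFRESH(
    list   => 'CAL_MONTH_SALES_MV, FWEEK_PSCAT_SALES_MV',
    method => 'CF');
END;
/
```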
If the refresh method is not specified, the default refresh method as specified in the materialized view definition is used.
An alternative to specifying the materialized views to refresh is to use the procedure DBMS_MVIEW.REFRESH_ALL_MVIEWS. This procedure refreshes all materialized views. If any of the materialized views fails to refresh, then the number of failures is reported.
The parameters for this procedure are:
The number of failures (this is an OUT variable)
The refresh method: F-Fast, P-Fast_PCT, ?-Force, C-Complete
Refresh after errors (TRUE or FALSE)
A Boolean parameter. If set to TRUE, the number_of_failures output parameter is set to the number of refreshes that failed, and a generic error message indicates that failures occurred. The alert log for the instance gives details of refresh errors. If set to FALSE, the default, then refresh stops after it encounters the first error, and any remaining materialized views in the list are not refreshed.
Atomic refresh (TRUE or FALSE)
If set to TRUE, then all refreshes are done in one transaction. If set to FALSE, then each of the materialized views is refreshed in separate transactions. If set to FALSE, Oracle can optimize refresh by using parallel DML and truncate DDL on a materialized view. When a materialized view is refreshed in atomic mode, it is eligible for query rewrite if the rewrite integrity mode is set to STALE_TOLERATED. Atomic refresh cannot be guaranteed when refresh is performed on nested views.
Whether to use out-of-place refresh
This parameter works with all existing refresh methods (F, P, C, ?). So, for example, if you specify F and out_of_place => TRUE, then an out-of-place fast refresh is attempted. Similarly, if you specify P and out_of_place => TRUE, then out-of-place PCT refresh is attempted.
An example of refreshing all materialized views is the following:
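A sketch of such a call, performing a complete, non-atomic refresh of every materialized view and reporting the failure count:

```sql
DECLARE
  failures BINARY_INTEGER;    -- OUT: number of materialized views that failed
BEGIN
  DBMS_MVIEW.REFRESH_ALL_MVIEWS(
    number_of_failures   => failures,
    method               => 'C',
    refresh_after_errors => TRUE,
    atomic_refresh       => FALSE);
  DBMS_OUTPUT.PUT_LINE('Failures: ' || failures);
END;
/
```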
The third procedure, DBMS_MVIEW.REFRESH_DEPENDENT, refreshes only those materialized views that depend on a specific table or list of tables. For example, suppose the changes have been received for the orders table but not for customer payments. The refresh dependent procedure can be called to refresh only those materialized views that reference the orders table.
The parameters for this procedure are:
The number of failures (this is an OUT variable)
The dependent table
The refresh method: F-Fast, P-Fast_PCT, ?-Force, C-Complete
The rollback segment to use
Refresh after errors (TRUE or FALSE)
A Boolean parameter. If set to TRUE, the number_of_failures output parameter is set to the number of refreshes that failed, and a generic error message indicates that failures occurred. The alert log for the instance gives details of refresh errors. If set to FALSE, the default, then refresh stops after it encounters the first error, and any remaining materialized views in the list are not refreshed.
Atomic refresh (TRUE or FALSE)
If set to TRUE, then all refreshes are done in one transaction. If set to FALSE, then each of the materialized views is refreshed in separate transactions. If set to FALSE, Oracle can optimize refresh by using parallel DML and truncate DDL on a materialized view. When a materialized view is refreshed in atomic mode, it is eligible for query rewrite if the rewrite integrity mode is set to STALE_TOLERATED. Atomic refresh cannot be guaranteed when refresh is performed on nested views.
Whether it is nested or not
If set to TRUE, refresh all the dependent materialized views of the specified set of tables based on a dependency order to ensure the materialized views are truly fresh with respect to the underlying base tables.
Whether to use out-of-place refresh
This parameter works with all existing refresh methods (F, P, C, ?). So, for example, if you specify F and out_of_place => TRUE, then an out-of-place fast refresh is attempted. Similarly, if you specify P and out_of_place => TRUE, then out-of-place PCT refresh is attempted.
To perform a full refresh on all materialized views that reference the customers table, specify:
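A sketch of such a call (the customers table name is illustrative):

```sql
DECLARE
  failures BINARY_INTEGER;    -- OUT: number of materialized views that failed
BEGIN
  DBMS_MVIEW.REFRESH_DEPENDENT(
    number_of_failures => failures,
    list               => 'CUSTOMERS',   -- master table(s)
    method             => 'C');
END;
/
```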
Job queues can be used to refresh multiple materialized views in parallel. If queues are not available, fast refresh sequentially refreshes each view in the foreground process. To make queues available, you must set the JOB_QUEUE_PROCESSES parameter. This parameter defines the number of background job queue processes and determines how many materialized views can be refreshed concurrently. Oracle tries to balance the number of concurrent refreshes with the degree of parallelism of each refresh. The order in which the materialized views are refreshed is determined by dependencies imposed by nested materialized views and potential for efficient refresh by using query rewrite against other materialized views (see "Scheduling Refresh" for details). This parameter is only effective when atomic_refresh is set to FALSE.
If the process that is executing DBMS_MVIEW.REFRESH is interrupted or the instance is shut down, any refresh jobs that were executing in job queue processes are requeued and continue running. To remove these jobs, use the DBMS_JOB.REMOVE procedure.
See Also:
Oracle Database PL/SQL Packages and Types Reference for detailed information about the DBMS_JOB package
Not all materialized views may be fast refreshable. Therefore, use the package DBMS_MVIEW.EXPLAIN_MVIEW to determine what refresh methods are available for a materialized view.
If you are not sure how to make a materialized view fast refreshable, you can use the DBMS_ADVISOR.TUNE_MVIEW procedure, which provides a script containing the statements required to create a fast refreshable materialized view.
See Also:
Oracle Database SQL
Tuning Guide
Chapter 5, "Basic Materialized Views" for further information about the DBMS_MVIEW.EXPLAIN_MVIEW and DBMS_ADVISOR.TUNE_MVIEW procedures
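As a sketch, EXPLAIN_MVIEW writes its findings into the MV_CAPABILITIES_TABLE, which you create beforehand with the utlxmv.sql script shipped with the database (the materialized view name SALES_MV is illustrative):

```sql
-- Record the refresh capabilities of one materialized view
EXEC DBMS_MVIEW.EXPLAIN_MVIEW('SALES_MV');

-- Inspect which refresh methods are possible, and why not when they are not
SELECT capability_name, possible, msgtxt
FROM   mv_capabilities_table
WHERE  capability_name LIKE 'REFRESH%';
```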
The following initialization parameters need to be set properly for parallelism to be effective:
PARALLEL_MAX_SERVERS should be set high enough to take care of parallelism. You must consider the number of slaves needed for the refresh statement. For example, with a degree of parallelism of eight, you need 16 slave processes.
PGA_AGGREGATE_TARGET should be set for the instance to manage the memory usage for sorts and joins automatically. If the memory parameters are set manually, SORT_AREA_SIZE should be less than HASH_AREA_SIZE.
OPTIMIZER_MODE should equal ALL_ROWS.
Remember to analyze all tables and indexes for better optimization.
See Also:
Oracle
Database VLDB and Partitioning Guide
While a job is running, you can query the V$SESSION_LONGOPS view to tell you the progress of each materialized view being refreshed.
To look at the progress of which jobs are on which queue, use:
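One way to do this is to query the DBA_JOBS_RUNNING view:

```sql
SELECT * FROM DBA_JOBS_RUNNING;
```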
Three views are provided for checking the status of a materialized view: DBA_MVIEWS, ALL_MVIEWS, and USER_MVIEWS. To check if a materialized view is fresh or stale, issue the following statement:
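For example, querying USER_MVIEWS:

```sql
SELECT MVIEW_NAME, STALENESS, LAST_REFRESH_TYPE, COMPILE_STATE
FROM   USER_MVIEWS
ORDER  BY MVIEW_NAME;
```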
If the COMPILE_STATE column shows NEEDS_COMPILE, the other displayed column values cannot be trusted as reflecting the true status. To revalidate the materialized view, issue an ALTER MATERIALIZED VIEW mview_name COMPILE statement, then reissue the SELECT statement.
Several views are available that enable you to verify the status of base table partitions and determine which ranges of materialized view data are fresh and which are stale. The views are as follows:
DBA_MVIEWS: to determine partition change tracking (PCT) information for the materialized view.
DBA_MVIEW_DETAIL_RELATIONS: to display partition information for the detail table a materialized view is based on.
DBA_MVIEW_DETAIL_PARTITION: to determine which partitions are fresh.
DBA_MVIEW_DETAIL_SUBPARTITION: to determine which subpartitions are fresh.
The use of these views is illustrated in the following examples. Figure 7-1 illustrates
a range-list partitioned table and a materialized view based on it. The partitions are P1, P2, P3, and P4, while the subpartitions are SP1, SP2, and SP3.
Figure 7-1 Determining PCT Freshness
Examples of Using Views to Determine Freshness
This section illustrates examples of determining the PCT and freshness information for materialized views and their detail tables.
Example 7-1 Verifying the PCT Status of a Materialized View
Query USER_MVIEWS to access PCT information about the materialized view, as shown in the following:
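A sketch of such a query (the materialized view name PART_SALES_MV is illustrative):

```sql
SELECT MVIEW_NAME, NUM_PCT_TABLES, NUM_FRESH_PCT_REGIONS, NUM_STALE_PCT_REGIONS
FROM   USER_MVIEWS
WHERE  MVIEW_NAME = 'PART_SALES_MV';
```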
Example 7-2 Verifying the PCT Status in a Materialized View's Detail Table
Query USER_MVIEW_DETAIL_RELATIONS to access PCT detail table information, as shown in the following:
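A sketch of such a query (the materialized view name is illustrative):

```sql
SELECT MVIEW_NAME, DETAILOBJ_NAME, DETAILOBJ_PCT,
       NUM_FRESH_PCT_PARTITIONS, NUM_STALE_PCT_PARTITIONS
FROM   USER_MVIEW_DETAIL_RELATIONS
WHERE  MVIEW_NAME = 'PART_SALES_MV';
```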
Example 7-3 Verifying Which Partitions are Fresh
Query USER_MVIEW_DETAIL_PARTITION to access PCT freshness information for partitions, as shown in the following:
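A sketch of such a query (the materialized view name is illustrative):

```sql
SELECT MVIEW_NAME, DETAILOBJ_NAME, DETAIL_PARTITION_NAME,
       DETAIL_PARTITION_POSITION, FRESHNESS
FROM   USER_MVIEW_DETAIL_PARTITION
WHERE  MVIEW_NAME = 'PART_SALES_MV';
```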
Example 7-4 Verifying Which Subpartitions are Fresh
Query USER_MVIEW_DETAIL_SUBPARTITION to access PCT freshness information for subpartitions, as shown in the following:
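A sketch of such a query (the materialized view name is illustrative):

```sql
SELECT MVIEW_NAME, DETAILOBJ_NAME, DETAIL_SUBPARTITION_NAME,
       DETAIL_SUBPARTITION_POSITION, FRESHNESS
FROM   USER_MVIEW_DETAIL_SUBPARTITION
WHERE  MVIEW_NAME = 'PART_SALES_MV';
```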
Very often you have multiple materialized views in the database. Some of these can be computed by rewriting against others. This is very common in a data warehousing environment where you may have nested materialized views or materialized views at different levels of some hierarchy.
In such cases, you should create the materialized views as BUILD DEFERRED, and then issue one of the refresh procedures in the DBMS_MVIEW package to refresh all the materialized views. Oracle Database computes the dependencies and refreshes the materialized views in the right order. Consider the example of a complete hierarchical cube described in "Examples of Hierarchical Cube Materialized Views". Suppose all the materialized views have been created as BUILD DEFERRED, which only creates the metadata for the materialized views. You can then call one of the refresh procedures in the DBMS_MVIEW package to refresh all the materialized views in the right order:
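A sketch of such a call, refreshing everything that depends on a base table and letting the database order the work by dependency (the table name SALES is illustrative):

```sql
DECLARE
  failures BINARY_INTEGER;
BEGIN
  DBMS_MVIEW.REFRESH_DEPENDENT(
    number_of_failures => failures,
    list               => 'SALES',
    method             => 'C',
    nested             => TRUE);   -- refresh dependent views in dependency order
END;
/
```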
The procedure refreshes the materialized views in the order of their dependencies; each of the materialized views gets rewritten against the one prior to it in the list.
The same kind of rewrite can also be used while doing PCT refresh. PCT refresh recomputes rows in a materialized view corresponding to changed rows in the detail tables. And, if there are other fresh materialized views available at the time of refresh, it can
go directly against them as opposed to going against the detail tables.
Hence, it is always beneficial to pass a list of materialized views to any of the refresh procedures in the DBMS_MVIEW package (irrespective of the method specified) and let the procedure determine the order in which to refresh the materialized views.
This section contains the following topics with tips on refreshing materialized views:
Tips for Refreshing Materialized Views with Aggregates
Tips for Refreshing Materialized Views Without Aggregates
Tips for Refreshing Nested Materialized Views
Tips for Fast Refresh with UNION ALL
Tips for Fast Refresh with Commit SCN-Based Materialized View Logs
Tips After Refreshing Materialized Views
Following are some guidelines for using the refresh mechanism for materialized views with aggregates.
For fast refresh, create materialized view logs on all detail tables involved in a materialized view with the ROWID, SEQUENCE and INCLUDING NEW VALUES clauses.
Include all columns from the table likely to be used in materialized views in the materialized view logs.
Fast refresh may be possible even if the SEQUENCE option is omitted from the materialized view log. If it can be determined that only inserts or deletes will occur on all the detail tables, then the materialized view log does not require the SEQUENCE clause. However, if updates to multiple tables are likely or required or if the specific update scenarios are unknown, make sure the SEQUENCE clause is included.
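A materialized view log with all three clauses might be created as follows (the table and column names are illustrative):

```sql
CREATE MATERIALIZED VIEW LOG ON sales
  WITH ROWID, SEQUENCE (prod_id, cust_id, time_id, amount_sold)
  INCLUDING NEW VALUES;
```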
Use Oracle's bulk loader utility or direct-path INSERT (INSERT with the APPEND hint for loads). Starting in Oracle Database 12c, the database automatically gathers table statistics as part of a bulk-load operation (CTAS and IAS) similar to how statistics are gathered when an index is created. By gathering statistics during the data load, you avoid additional scan operations and provide the necessary statistics as soon as the data becomes available to the users. Note that, in the case of an IAS statement, statistics are only gathered if the table the data is being inserted into is empty.
This is a lot more efficient than conventional insert. During loading, disable all constraints and re-enable when finished loading. Note that materialized view logs are required regardless of whether you use direct load or conventional DML.
Try to optimize the sequence of conventional mixed DML operations, direct-path INSERT (APPEND) loads, and the fast refresh of materialized views. You can use fast refresh with a mixture of conventional DML and direct loads. Fast refresh can perform significant optimizations if it finds that only direct loads have occurred, as illustrated in the following:
Direct-path INSERT (SQL*Loader or INSERT /*+ APPEND */) into the detail table
Refresh materialized view
Conventional mixed DML
Refresh materialized view
You can use fast refresh with conventional mixed DML (INSERT, UPDATE, and DELETE) to the detail tables. However, fast refresh is able to perform significant optimizations in its processing if it detects that only inserts or deletes have been done to the tables, such as:
DML INSERT or DELETE to the detail table
Refresh materialized views
DML update to the detail table
Refresh materialized view
Even more optimal is the separation of INSERT and DELETE.
If possible, refresh should be performed after each type of data change (as shown earlier) rather than issuing only one refresh at the end. If that is not possible, restrict the conventional DML to the table to inserts only, to get much better refresh performance.
Avoid mixing deletes and direct loads.
Furthermore, for refresh ON COMMIT, Oracle keeps track of the type of DML done in the committed transaction. Therefore, do not perform direct-path INSERT and DML to other tables in the same transaction, as Oracle may not be able to optimize the refresh phase.
For ON COMMIT materialized
views, where refreshes automatically occur at the end of each transaction, it may not be possible to isolate the DML statements, in which case keeping the transactions short will help. However, if you plan to make numerous modifications to the detail table,
it may be better to perform them in one transaction, so that refresh of the materialized view is performed just once at commit time rather than after each update.
Oracle recommends partitioning the tables because it enables you to use:
Parallel DML
For large loads or refresh, enabling parallel DML helps shorten the length of time for the operation.
Partition change tracking (PCT) fast refresh
You can refresh your materialized views fast after partition maintenance operations on the detail tables. See "About Partition Change Tracking" for details on enabling PCT for materialized views.
Partitioning the materialized view also helps refresh performance as refresh can update the materialized view using parallel DML. For example, assume that the detail tables and materialized view are partitioned and have a parallel clause. The following sequence
would enable Oracle to parallelize the refresh of the materialized view.
Bulk load into the detail table.
Enable parallel DML with an ALTER SESSION ENABLE PARALLEL DML statement.
Refresh the materialized view.
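The three steps above can be sketched as follows (the object names are illustrative):

```sql
-- 1. Bulk load into the detail table
INSERT /*+ APPEND */ INTO sales SELECT * FROM new_sales;
COMMIT;

-- 2. Enable parallel DML for the session
ALTER SESSION ENABLE PARALLEL DML;

-- 3. Refresh the materialized view
EXEC DBMS_MVIEW.REFRESH('SALES_MV', method => 'F');
```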
For refresh using DBMS_MVIEW.REFRESH, set the parameter atomic_refresh to FALSE.
For COMPLETE refresh, this causes a TRUNCATE to delete existing rows in the materialized view, which is faster than a delete.
For PCT refresh, if the materialized view is partitioned appropriately, this uses TRUNCATE PARTITION to delete rows in the affected partitions of the materialized view, which is faster than a delete.
For FAST or FORCE refresh, if COMPLETE or PCT refresh is chosen, this is able to use the TRUNCATE optimizations described earlier.
When using DBMS_MVIEW.REFRESH with job queues, remember to set atomic_refresh to FALSE. Otherwise, job queues are not used. Set the number of job queue processes greater than the number of processors.
If job queues are enabled and there are many materialized views to refresh, it is faster to refresh all of them in a single command than to call them individually.
Use REFRESH FORCE to ensure refreshing a materialized view so that it can definitely be used for query rewrite. The best refresh method is chosen. If a fast refresh cannot be done, a complete refresh is performed.
Refresh all the materialized views in a single procedure call. This gives Oracle an opportunity to schedule refresh of all the materialized views in the right order taking into account dependencies imposed by nested materialized views and potential for efficient
refresh by using query rewrite against other materialized views.
If a materialized view contains joins but no aggregates, then having an index on each of the join column rowids in the detail table enhances refresh performance greatly, because this type of materialized view tends to be much larger than materialized views
containing aggregates. For example, consider the following materialized view:
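A join-only materialized view of this kind might be defined as follows (a sketch based on the sales, times, and customers sample tables; the rowid columns exist so that changed rows in the detail tables can be located during fast refresh):

```sql
CREATE MATERIALIZED VIEW detail_sales_mv
  PARALLEL
  BUILD IMMEDIATE
  REFRESH FAST AS
SELECT s.rowid AS sales_rid, t.rowid AS times_rid, c.rowid AS cust_rid,
       c.cust_state_province, t.week_ending_day, s.amount_sold
FROM   sales s, times t, customers c
WHERE  s.time_id = t.time_id
AND    s.cust_id = c.cust_id;
```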
Indexes should be created on each of the rowid columns in the materialized view.
Partitioning is highly recommended, as is enabling parallel DML in the session before invoking refresh, because it greatly enhances refresh performance.
This type of materialized view can also be fast refreshed if DML is performed on the detail table. It is recommended that the same procedure be applied to this type of materialized view as for a single table aggregate. That is, perform one type of change (direct-path
DML) and then refresh the materialized view. This is because Oracle Database can perform significant optimizations if it detects that only one type of change has been done.
Also, Oracle recommends that the refresh be invoked after each table is loaded, rather than load all the tables and then perform the refresh.
For refresh ON COMMIT, Oracle keeps track of the type of DML done in the committed transaction. Oracle therefore recommends that you do not perform direct-path and conventional DML to other tables in the same transaction because Oracle may not be able to optimize the refresh phase.
For example, the following is not recommended:
Direct load new data into the fact table
DML into the store table
Commit
Also, try not to mix different types of conventional DML statements if possible. This would again prevent using various optimizations during fast refresh. For example, try to avoid the following:
Insert into the fact table
Delete from the fact table
Commit
If many updates are needed, try to group them all into one transaction because refresh is performed just once at commit time, rather than after each update.
In a data warehousing environment, assuming that the materialized view has a parallel clause, the following sequence of steps is recommended:
Bulk load into the fact table
Enable parallel DML with an ALTER SESSION ENABLE PARALLEL DML statement
Refresh the materialized view
All underlying objects are treated as ordinary tables when refreshing materialized views. If the ON COMMIT refresh option is specified, then all the materialized views are refreshed in the appropriate order at commit time. In other words, Oracle builds a partially ordered set of materialized views and refreshes them such that, after the successful completion of the refresh, all the materialized views are fresh. The status of the materialized views can be checked by querying the appropriate USER_, DBA_, or ALL_MVIEWS view.
If any of the materialized views are defined as ON DEMAND refresh (irrespective of whether the refresh method is FAST, FORCE, or COMPLETE), you must refresh them in the correct order (taking into account the dependencies between the materialized views) because the nested materialized views are refreshed with respect to the current contents of the other materialized views (whether fresh or not). This can be achieved by invoking the refresh procedure against the materialized view at the top of the nested hierarchy and specifying the nested parameter as TRUE.
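For example, with a top-level materialized view named cost_mv (an illustrative name), the entire hierarchy can be refreshed in dependency order:

```sql
BEGIN
  -- nested => TRUE first refreshes the materialized views that
  -- cost_mv depends on, in dependency order, then cost_mv itself
  DBMS_MVIEW.REFRESH('cost_mv', nested => TRUE);
END;
/
```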
If a refresh fails during commit time, the list of materialized views that has not been refreshed is written to the alert log, and you must manually refresh them along with all their dependent materialized views.
Use the same DBMS_MVIEW procedures on nested materialized views that you use on regular materialized views.
These procedures have the following behavior when used with nested materialized views:
If REFRESH is applied to a materialized view my_mv that is built on other materialized views, then my_mv is refreshed with respect to the current contents of the other materialized views (that is, the other materialized views are not made fresh first) unless you specify nested => TRUE.
If REFRESH_DEPENDENT is applied to materialized view my_mv, then only materialized views that directly depend on my_mv are refreshed (that is, a materialized view that depends on a materialized view that depends on my_mv will not be refreshed) unless you specify nested => TRUE.
If REFRESH_ALL_MVIEWS is used, the order in which the materialized views are refreshed is guaranteed to respect the dependencies between nested materialized views.
GET_MV_DEPENDENCIES provides a list of the immediate (or direct) materialized view dependencies for an object.
You can use fast refresh for materialized views that use the UNION ALL operator by providing a maintenance column in the definition of the materialized view. For example, a materialized view with a UNION ALL operator can be made fast refreshable by adding such a column to each branch of the query. The form of the maintenance marker column (column MARKER in the original example) must be numeric_or_string_literal AS column_alias, where each UNION ALL member has a distinct value for numeric_or_string_literal.
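A sketch of such a marker column, with illustrative table and column names; each UNION ALL branch carries a distinct literal (1 and 2) as the marker:

```sql
CREATE MATERIALIZED VIEW union_sales_mv
  BUILD IMMEDIATE
  REFRESH FAST ON DEMAND AS
SELECT s.rowid AS rid, s.cust_id, s.amount_sold, 1 AS marker
FROM   sales s
WHERE  s.channel_id = 2
UNION ALL
SELECT s.rowid AS rid, s.cust_id, s.amount_sold, 2 AS marker
FROM   sales s
WHERE  s.channel_id = 3;
```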
You can often improve fast refresh performance, sometimes significantly, by ensuring that your materialized view logs on the base table contain a WITH COMMIT SCN clause. By optimizing materialized view log processing with commit SCN-based logs, the fast refresh process can save time. The materialized view refresh automatically uses the commit SCN-based materialized view log to save refresh time.
Note that only new materialized view logs can take advantage of COMMIT SCN. Existing materialized view logs cannot be altered to add COMMIT SCN; they must be dropped and recreated.
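The clause is specified when the materialized view log is created; a reconstruction of the stripped example (the column list is assumed):

```sql
CREATE MATERIALIZED VIEW LOG ON sales
  WITH ROWID (prod_id, time_id, quantity_sold, amount_sold),
  COMMIT SCN
  INCLUDING NEW VALUES;
```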
When a materialized view is created on both base tables with timestamp-based materialized view logs and base tables with commit SCN-based materialized view logs, an error (ORA-32414) is raised stating that materialized view logs are not compatible with each
other for fast refresh.
After you have performed a load or incremental load and rebuilt the detail table indexes, you must re-enable integrity constraints (if any) and refresh the materialized views and materialized view indexes that are derived from that detail data. In a data warehouse environment, referential integrity constraints are normally enabled with the NOVALIDATE or RELY options.
An important decision to make before performing a refresh operation is whether the refresh needs to be recoverable. Because materialized view data is redundant and can always be reconstructed from the detail tables, it might be preferable to disable logging on the materialized view. To disable logging and run incremental refresh non-recoverably, use the ALTER MATERIALIZED VIEW ... NOLOGGING statement prior to refreshing.
If the materialized view is being refreshed using the ON COMMIT method, then, following refresh operations, consult the alert log alert_SID.log and the trace file ora_SID_number.trc to check that no errors have occurred.
A major maintenance component
of a data warehouse is synchronizing (refreshing) the materialized views when the detail data changes. Partitioning the underlying detail tables can reduce the amount of time taken to perform the refresh task. This is possible because partitioning enables
refresh to use parallel DML to update the materialized view. Also, it enables the use of partition change tracking.
"Fast Refresh with Partition Change Tracking" provides additional information
about PCT refresh.
In a data warehouse, changes to the detail tables can often entail partition maintenance operations, such as DROP, EXCHANGE, MERGE, and ADD PARTITION. To maintain the materialized view after such operations used to require manual maintenance (see also CONSIDER FRESH) or complete refresh. You now have the option of using an addition to fast refresh known as partition change tracking (PCT) refresh.
For PCT to be available, the detail tables must be partitioned. The partitioning of the materialized view itself has no bearing on this feature. If PCT refresh is possible, it occurs automatically and no user intervention is required in order for it to occur.
See "About Partition Change Tracking" for PCT requirements.
The following examples illustrate the use of this feature:
PCT Fast Refresh Scenario 1
PCT Fast Refresh Scenario 2
PCT Fast Refresh Scenario 3
In this scenario, assume sales is a partitioned table using the time_id column and products is partitioned by the prod_category column. times is not a partitioned table.
Create the materialized view. The following materialized view satisfies requirements for PCT.
Run the DBMS_MVIEW.EXPLAIN_MVIEW procedure to determine which tables allow PCT refresh.
As can be seen from the partial sample output from EXPLAIN_MVIEW, any partition maintenance operation performed on the sales table allows PCT fast refresh. However, PCT is not possible after partition maintenance operations or updates to the products table, as there is insufficient information contained in the materialized view for PCT refresh to be possible. Note that the times table is not partitioned and hence can never allow for PCT refresh. Oracle Database applies PCT refresh if it can determine that the materialized view has sufficient information to support PCT for all the updated tables. You can verify which partitions are fresh and stale with views such as DBA_MVIEWS and DBA_MVIEW_DETAIL_PARTITION.
See "Analyzing Materialized View Capabilities" for information on how to use this procedure and also some details regarding PCT-related views.
Suppose at some later point, a SPLIT operation of one partition in the sales table becomes necessary. Insert some data into the sales table, then fast refresh the materialized view.
Fast refresh automatically performs a PCT refresh as it is the only fast refresh possible in this scenario. However, fast refresh will not occur if a partition maintenance operation occurs when any update has taken place to a table on which PCT is not enabled. This is shown in "PCT Fast Refresh Scenario 2".
"PCT Fast Refresh Scenario 1" would also be appropriate if the materialized view
was created using the
in the following:
In this scenario, the first three steps are the same as in "PCT Fast Refresh Scenario 1". Then, the SPLIT partition operation to the sales table is performed, but before the materialized view refresh occurs, records are inserted into the times table.
The same as in "PCT Fast Refresh Scenario 1".
The same as in "PCT Fast Refresh Scenario 1".
The same as in "PCT Fast Refresh Scenario 1".
After issuing the same SPLIT operation as in "PCT Fast Refresh Scenario 1", some data is inserted into the times table. Refresh the materialized view.
The materialized view is not fast refreshable because DML has occurred to a table on which PCT fast refresh is not possible. To avoid this occurring, Oracle recommends performing a fast refresh immediately after any partition maintenance operation on detail
tables for which partition tracking fast refresh is available.
If the situation in "PCT Fast Refresh Scenario 2" occurs, there are two possibilities: perform a complete refresh or switch to the CONSIDER FRESH option outlined in the following, if suitable. However, it should be noted that CONSIDER FRESH and partition change tracking fast refresh are not compatible. Once the ALTER MATERIALIZED VIEW ... CONSIDER FRESH statement has been issued, PCT refresh is no longer applied to this materialized view until a complete refresh is done. Moreover, you should not use CONSIDER FRESH unless you have taken manual action to ensure that the materialized view is indeed fresh.
A common situation in a data warehouse is the use of rolling windows of data. In this case, the detail table and the materialized view may contain, say, the last 12 months of data. Every month, new data for a month is added to the table and the oldest month is deleted (or maybe archived). PCT refresh provides a very efficient mechanism to maintain the materialized view in this case.
The new data is usually added to the detail table by adding a new partition and exchanging it with a table containing the new data.
Next, the oldest partition is dropped or truncated.
Now, if the materialized view satisfies all conditions for PCT refresh, fast refresh will automatically detect that PCT is available and perform a PCT refresh.
ETL (Extraction, Transformation
and Loading) is done on a scheduled basis to reflect changes made to the original source system. During this step, you physically insert the new, clean data into the production data warehouse schema, and take all of the other steps necessary (such as building
indexes, validating constraints, taking backups) to make this new data available to the end users. Once all of this data has been loaded into the data warehouse, the materialized views have to be updated to reflect the latest data.
The partitioning scheme of the data warehouse is often crucial in determining the efficiency of refresh operations in the data warehouse load process.
In fact, the load process is often the primary consideration in choosing the partitioning scheme of data warehouse tables and indexes.
The partitioning scheme of the largest data warehouse tables (for example, the fact table in a star schema) should be based upon the loading paradigm of the data warehouse.
Most data warehouses are loaded with new data on a regular schedule. For example, every night, week, or month, new data is brought into the data warehouse. The data being loaded at the end of the week or month typically corresponds to the transactions for the
week or month. In this very common scenario, the data warehouse is being loaded by time. This suggests that the data warehouse tables should be partitioned on a date column. In our data warehouse example, suppose the new data is loaded into the sales table every month. Furthermore, the sales table has been partitioned by month. These steps show how the load process proceeds to add the data for a new month (January 2001) to the table sales.
Place the new data into a separate table, sales_01_2001. This data can be directly loaded into sales_01_2001 from outside the data warehouse, or this data can be the result of previous data transformation operations that have already occurred in the data warehouse. sales_01_2001 has the exact same columns, data types, and so forth, as the sales table.
Gather statistics on the sales_01_2001 table.
Create indexes and add constraints on sales_01_2001. The indexes and constraints on sales_01_2001 should be identical to the indexes and constraints on sales. Index creation can be done in parallel and should use the NOLOGGING and the COMPUTE STATISTICS options.
For example:
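The statement was stripped from this copy; a plausible reconstruction of a parallel, NOLOGGING bitmap index build on the staging table (index and tablespace names assumed):

```sql
CREATE BITMAP INDEX sales_01_2001_customer_id_bix
  ON sales_01_2001 (customer_id)
  TABLESPACE sales_idx
  NOLOGGING PARALLEL 8 COMPUTE STATISTICS;
```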
Apply all constraints to the sales_01_2001 table that are present on the sales table. This includes referential integrity constraints. A typical constraint would be:
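A reconstruction of the stripped constraint example (constraint and column names assumed):

```sql
ALTER TABLE sales_01_2001
  ADD CONSTRAINT sales_customer_id
  FOREIGN KEY (customer_id) REFERENCES customer (customer_id)
  ENABLE NOVALIDATE;
```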
If the partitioned table sales has a primary or unique key that is enforced with a global index structure, ensure that the constraint on sales_01_2001 is validated without the creation of an index structure, for example by adding the primary key constraint with the DISABLE VALIDATE options. Creating the constraint with ENABLE would cause the creation of a unique index, which does not match a local index structure of the partitioned table. You must not have any index structure built on the nonpartitioned table to be exchanged for existing global indexes of the partitioned table; the exchange command would fail.
Add the sales_01_2001 table to the sales table. In order to add this new data to the sales table, you must do two things. First, you must add a new partition to the sales table, using an ALTER TABLE ... ADD PARTITION statement. This adds an empty partition to the sales table. Then, you can add the newly created table to this partition using the EXCHANGE PARTITION operation. This exchanges the new, empty partition with the newly loaded table.
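The two statements were lost in extraction; plausible reconstructions for the January 2001 partition:

```sql
-- First, add an empty partition for the new month
ALTER TABLE sales ADD PARTITION sales_01_2001
  VALUES LESS THAN (TO_DATE('01-FEB-2001', 'DD-MON-YYYY'));

-- Then swap the loaded staging table into that partition
ALTER TABLE sales EXCHANGE PARTITION sales_01_2001
  WITH TABLE sales_01_2001
  INCLUDING INDEXES WITHOUT VALIDATION
  UPDATE GLOBAL INDEXES;
```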
The EXCHANGE operation preserves the indexes and constraints that were already present on the sales_01_2001 table. For unique constraints (such as the unique constraint on sales_transaction_id), you can use the UPDATE GLOBAL INDEXES clause, as shown previously. This automatically maintains your global index structures as part of the partition maintenance operation and keeps them accessible throughout the whole process. If there were only foreign-key constraints, the exchange operation would be instantaneous.
Note that, if you use synchronous refresh, instead of performing Step 3, you must register the sales_01_2001 table and the partition operation using the DBMS_SYNC_REFRESH package. See Chapter 8, "Synchronous Refresh" for more information.
The benefits of this partitioning technique are significant. First, the new data is loaded with minimal resource utilization. The new data is loaded into an entirely separate table, and the index processing and constraint processing are applied only to the
new partition. If the sales table was 50 GB and had 12 partitions, then a new month's worth of data contains approximately four GB. Only the new month's worth of data must be indexed. None of the indexes on the remaining 46 GB of data must be modified at all. This partitioning scheme additionally ensures that the load processing time is directly proportional to the amount of new data being loaded, not to the total size of the sales table.
Second, the new data is loaded with minimal impact on concurrent queries. All of the operations associated with data loading are occurring on a separate sales_01_2001 table. Therefore, none of the existing data or indexes of the sales table is affected during this data refresh process. The sales table and its indexes remain entirely untouched throughout this refresh process.
Third, in case of the existence of any global indexes, those are incrementally maintained as part of the exchange command. This maintenance does not affect the availability of the existing global index structures.
The exchange operation can be viewed as a publishing mechanism. Until the data warehouse administrator exchanges the sales_01_2001 table into the sales table, end users cannot see the new data. Once the exchange has occurred, then any end user query accessing the sales table is immediately able to see the sales_01_2001 data.
Partitioning is useful not only for adding new data but also for removing and archiving data. Many data warehouses maintain a rolling window of data. For example, the data warehouse stores the most recent 36 months of sales data. Just as a new partition can be added to the sales table (as described earlier), an old partition can be quickly (and independently) removed from the sales table.
These two benefits (reduced resources utilization and minimal end-user impact) are just as pertinent to removing a partition as they are to adding a partition.
Removing data from a partitioned table does not necessarily mean that the old data is physically deleted from the database. There are two alternatives for removing old data from a partitioned table. First, you can physically delete all data from the database by dropping the partition containing the old data, thus freeing the allocated space.
Also, you can exchange the old partition with an empty table of the same structure; this empty table is created equivalent to steps 1 and 2 described in the load process. Assuming the new empty table stub is named sales_archive_01_1998, an EXCHANGE PARTITION statement empties partition sales_01_1998 into it. Note that the old data still exists as the exchanged, nonpartitioned table sales_archive_01_1998.
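The corresponding statements, reconstructed with the partition and stub names described above:

```sql
-- Alternative 1: physically drop the oldest partition
ALTER TABLE sales DROP PARTITION sales_01_1998;

-- Alternative 2: exchange it with an empty stub table, keeping the data
ALTER TABLE sales EXCHANGE PARTITION sales_01_1998
  WITH TABLE sales_archive_01_1998
  INCLUDING INDEXES WITHOUT VALIDATION;
```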
If the partitioned table was set up in a way that every partition is stored in a separate tablespace, you can archive (or transport) this table using Oracle Database's transportable tablespace framework before dropping the actual data (the tablespace). See "Transportation
Using Transportable Tablespaces" for further details regarding transportable tablespaces.
In some situations, you might not want to drop the old data immediately, but keep it as part of the partitioned table; although the data is no longer of main interest, there are still potential queries accessing this old, read-only data. You can use Oracle's
data compression to minimize the space usage of the old data. You also assume that at least one compressed partition is already part of the partitioned table.
See Also:
Oracle Database Administrator's
Guide for more information regarding table compression
Oracle
Database VLDB and Partitioning Guide for more information regarding partitioning and table compression
A typical scenario might not only need to compress old data, but also to merge several old partitions to reflect the granularity for a later backup of several merged partitions. Let us assume that a backup (partition) granularity is on a quarterly basis for any quarter, where the oldest month is more than 36 months behind the most recent month. In this case, you are therefore compressing and merging the three oldest monthly partitions, sales_01_1998, sales_02_1998, and sales_03_1998, into a new partition sales_q1_1998:
Create the new merged partition in parallel in another tablespace. The partition is compressed as part of the MERGE PARTITIONS operation.
The partition MERGE operation invalidates the local indexes for the new merged partition. You therefore have to rebuild them, for example with ALTER TABLE sales MODIFY PARTITION sales_q1_1998 REBUILD UNUSABLE LOCAL INDEXES.
Alternatively, you can choose to create the new compressed table outside the partitioned table and exchange it back. The performance and the temporary space consumption are identical for both methods:
Create an intermediate table to hold the new merged information; a CREATE TABLE ... AS SELECT statement inherits all NOT NULL constraints from the original table by default.
Create the equivalent index structure for table sales_q1_1998 as for the existing table sales.
Prepare the existing table sales for the exchange with the new compressed table sales_q1_1998. Because the table to be exchanged contains data actually covered by three partitions, you have to create one matching partition with the range boundaries you are looking for, which you do simply by dropping two of the existing partitions. Note that you have to drop the lower two partitions, sales_01_1998 and sales_02_1998, because the lower boundary of a range partition is always defined by the upper (exclusive) boundary of the previous partition.
You can now exchange table sales_q1_1998 with partition sales_03_1998. Unlike what the name of the partition suggests, its boundaries cover Q1-1998.
Both methods apply to slightly different business scenarios: Using the MERGE PARTITION approach invalidates the local index structures for the affected partition, but it keeps all data accessible all the time. Any attempt to access the affected partition through one of the unusable index structures raises an error. The limited availability time is approximately
the time for re-creating the local bitmap index structures. In most cases, this can be neglected, because this part of the partitioned table should not be accessed too often.
The CTAS approach, however, minimizes unavailability of any index structures close to zero, but there is a specific time window, where the partitioned table does not have all the data, because you dropped two partitions. The limited availability time is approximately
the time for exchanging the table. Depending on the existence and number of global indexes, this time window varies. Without any existing global indexes, this time window is a matter of a fraction to few seconds.
These examples are a simplification of the data warehouse rolling window load scenario. Real-world data warehouse refresh characteristics are always more complex. However, the advantages of this rolling window approach are not diminished in more complex scenarios.
Note that before you add single or multiple compressed partitions to a partitioned table for the first time, all local bitmap indexes must be either dropped or marked unusable. After the first compressed partition is added, no additional actions are necessary
for all subsequent operations involving compressed partitions. It is irrelevant how the compressed partitions are added to the partitioned table.
See Also:
Oracle
Database VLDB and Partitioning Guide for more information regarding partitioning and table compression
Oracle Database Administrator's
Guide for further details about partitioning and table compression.
This section describes the following two typical scenarios where partitioning is used with refresh:
Refresh Scenario 1
Refresh Scenario 2
Data is loaded daily. However, the data warehouse contains two years of data, so that partitioning by day might not be desired.
The solution is to partition by week or month (as appropriate). Use INSERT to add the new data to an existing partition. The INSERT operation only affects a single partition, so the benefits described previously remain intact. The INSERT operation could occur while the partition remains a part of the table. Inserts into a single partition can be parallelized.
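The stripped statement was presumably a direct-path insert into the named partition, along these lines (the staging table name new_sales is assumed):

```sql
INSERT /*+ APPEND */ INTO sales PARTITION (sales_01_2001)
SELECT * FROM new_sales;
```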
The indexes of this sales partition are maintained in parallel as well. An alternative is to use the EXCHANGE operation. You can do this by exchanging the sales_01_2001 partition of the sales table and then using an INSERT operation. You might prefer this technique when dropping and rebuilding indexes is more efficient than maintaining them.
New data feeds, although consisting primarily of data for the most recent day, week, and month, also contain some data from previous time periods.
Solution 1
Use parallel SQL operations (such as CREATE TABLE ... AS SELECT) to separate the new data from the data in previous time periods. Process the old data separately using other techniques.
New data feeds are not solely time based. You can also feed new data into a data warehouse with data from multiple operational systems on a business need basis. For example, the sales data from direct channels may come into the data warehouse separately from
the data from indirect channels. For business reasons, it may furthermore make sense to keep the direct and indirect data in separate partitions.
Solution 2
Oracle supports composite range-list partitioning. The primary partitioning strategy of the sales table could be range partitioning based on time_id, as shown in the example. However, the subpartitioning is a list based on the channel attribute. Each subpartition can now be loaded independently of each other (for each distinct channel) and added in a rolling window operation as discussed before. The partitioning strategy addresses the business needs in an optimal manner.
You can optimize DML performance through the following techniques:
Implementing an Efficient MERGE Operation
Maintaining Referential Integrity
Purging Data
Commonly, the data that is extracted from a source system is not simply a list of new records that needs to be inserted into the data warehouse. Instead,
this new data set is a combination of new records as well as modified records. For example, suppose that most of the data extracted from the OLTP systems will be new sales transactions. These records are inserted into the warehouse's sales table, but some records may reflect modifications of previous transactions, such as returned merchandise or transactions that were incomplete or incorrect when initially loaded into the data warehouse. These records require updates to the sales table.
As a typical scenario, suppose that there is a table called new_sales that contains both inserts and updates that are applied to the sales table. When designing the entire data warehouse load process, it was determined that the new_sales table would contain records with the following semantics:
If a given sales_transaction_id of a record in new_sales already exists in sales, then update the sales table by adding the sales_dollar_amount and sales_quantity_sold values from the new_sales table to the existing row in the sales table.
Otherwise, insert the entire new record from the new_sales table into the sales table.
This UPDATE-ELSE-INSERT operation is often called a merge. A merge can be executed using one SQL statement.
Example 7-5 MERGE Operation
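The statement itself was lost from this copy; a sketch consistent with the semantics described above (the column names are assumed):

```sql
MERGE INTO sales s
USING new_sales n
ON (s.sales_transaction_id = n.sales_transaction_id)
WHEN MATCHED THEN UPDATE SET
  -- add the incoming amounts to the existing row
  s.sales_quantity_sold = s.sales_quantity_sold + n.sales_quantity_sold,
  s.sales_dollar_amount = s.sales_dollar_amount + n.sales_dollar_amount
WHEN NOT MATCHED THEN INSERT
  (sales_transaction_id, sales_quantity_sold, sales_dollar_amount)
  VALUES (n.sales_transaction_id, n.sales_quantity_sold, n.sales_dollar_amount);
```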
In addition to using the MERGE statement for unconditional UPDATE ELSE INSERT functionality into a target table, you can also use it to:
Perform an UPDATE-only or INSERT-only statement.
Apply additional WHERE conditions for the UPDATE or INSERT portion of the MERGE statement.
The UPDATE operation can even delete rows if a specific condition yields true.
Example 7-6 Omitting the INSERT Clause
In some data warehouse applications, it is not allowed to add new rows to historical information, but only to update them. It may also happen that you do not want to update but only insert new information. The following example demonstrates a MERGE statement whose INSERT clause is omitted, so existing rows are updated and no new rows are added.
Example 7-7 Omitting the UPDATE Clause
The following statement illustrates an example of omitting an UPDATE clause, so new rows are inserted and existing rows are left unchanged. When the INSERT clause is omitted, Oracle Database performs a regular join of the source and the target tables. When the UPDATE clause is omitted, Oracle Database performs an antijoin of the source and the target tables. This makes the join between the source and target table more efficient.
Example 7-8 Skipping the UPDATE Clause
In some situations, you may want to skip the UPDATE operation when merging a given row into the table. In this case, you can use an optional WHERE clause in the UPDATE clause of the MERGE. As a result, the UPDATE operation only executes when a given condition is true; otherwise the row is left unchanged. The condition predicate can refer to both the target and the source table.
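A sketch of an UPDATE clause guarded by its own WHERE condition (the table, column, and status value are assumed):

```sql
MERGE INTO products p
USING product_changes s
ON (p.prod_id = s.prod_id)
WHEN MATCHED THEN
  UPDATE SET p.prod_list_price = s.prod_new_price
  -- the UPDATE fires only for rows that are not already obsolete
  WHERE p.prod_status <> 'OBSOLETE';
```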
Example 7-9 Conditional Inserts with MERGE Statements
You may want to skip the INSERT operation when merging a given row into the table. For this, an optional WHERE clause is added to the INSERT clause of the MERGE. As a result, the INSERT operation only executes when a given condition is true; otherwise the row is neither inserted nor updated. The condition predicate can refer to the source table only.
Example 7-10 Using the DELETE Clause with MERGE Statements
You may want to cleanse tables while populating or updating them. To do this, you may want to consider using the DELETE clause in a MERGE statement. When a row is updated by the MERGE and the updated row also satisfies the DELETE condition, it is deleted.
The DELETE operation is not the same as that of a complete DELETE statement. Only the rows from the destination of the MERGE can be deleted. The only rows that are affected by the DELETE operation are the ones that are updated by this MERGE statement. Thus, although a given row of the destination table meets the delete condition, if it does not join under the ON clause condition, it is not deleted.
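A sketch of the DELETE clause inside a MERGE (the status column and its values are assumed):

```sql
MERGE INTO products p
USING product_changes s
ON (p.prod_id = s.prod_id)
WHEN MATCHED THEN
  UPDATE SET p.prod_list_price = s.prod_new_price,
             p.prod_status     = s.prod_new_status
  -- only rows updated by this MERGE that end up OBSOLETE are removed
  DELETE WHERE p.prod_status = 'OBSOLETE';
```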
Example 7-11 Unconditional Inserts with MERGE Statements
You may want to insert all of the source rows into a table. In this case, the join between the source and target table can be avoided. By identifying special constant join conditions that always evaluate to FALSE, for example 1=0, such MERGE statements are optimized and the join condition is suppressed.
In some data warehousing environments, you might want to insert new data into tables in order to guarantee referential integrity. For example, a data warehouse may derive sales from an operational system that retrieves data directly from cash registers. sales is refreshed nightly. However, the data for the product dimension table may be derived from a separate operational system. The product dimension table may only be refreshed once for each week, because the product table changes relatively slowly. If a new product was introduced on Monday, then it is possible for that product's product_id to appear in the sales data of the data warehouse before that product_id has been inserted into the data warehouse's product table.
Although the sales transactions of the new product may be valid, this sales data does not satisfy the referential integrity constraint between the product dimension table and the sales fact table. Rather than disallow the new sales transactions, you might choose to insert the sales transactions into the sales table. However, you might also wish to maintain the referential integrity relationship between the sales and product tables. This can be accomplished by inserting new rows into the product table as placeholders for the unknown products.
As in previous examples, assume that the new data for the sales table is staged in a separate table, new_sales. Using a single INSERT statement (which can be parallelized), the product table can be altered to reflect the new products.
Occasionally, it is necessary to
remove large amounts of data from a data warehouse. A very common scenario is the rolling window discussed previously, in which older data is rolled out of the data warehouse to make room for new data.
However, sometimes other data might need to be removed from a data warehouse. Suppose that a retail company has previously sold products from XYZ Software, and that XYZ Software has subsequently gone out of business. The business users of the warehouse may decide that they are no longer interested in seeing any data related to XYZ Software, so this data should be deleted.
One approach to removing a large volume of data is to use parallel delete as shown in the following statement:
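The statement was lost in extraction; a plausible reconstruction (the category column and value are assumed):

```sql
DELETE /*+ PARALLEL(sales, 4) */ FROM sales
WHERE sales_product_id IN
  (SELECT product_id FROM product
   WHERE product_category = 'XYZ Software');
```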
This SQL statement spawns one parallel process for each partition. This approach is much more efficient than a series of DELETE statements, and none of the data in the sales table needs to be moved. However, this approach also has some disadvantages. When removing a large percentage of rows, the DELETE statement leaves many empty row-slots in the existing partitions. If new data is being loaded using a rolling window technique (or is being loaded using direct-path INSERT or load), then this storage space is not reclaimed. Moreover, even though the DELETE statement is parallelized, there might be more efficient methods. An alternative method is to re-create the entire sales table, keeping the data for all product categories except XYZ Software.
This approach may be more efficient than a parallel delete. However, it is also costly in terms of the amount of disk space, because the sales table must effectively be instantiated twice.
An alternative method to utilize less space is to re-create the sales table one partition at a time. Continue this process for each partition in the sales table.
https://docs.oracle.com/database/121/DWHSG/refresh.htm#DWHSG8358
This chapter discusses how to refresh materialized views, which is a key element in maintaining good performance and consistent data when working with materialized views in a data warehousing environment.
This chapter includes the following sections:
Refreshing Materialized Views
Using Materialized Views with Partitioned Tables
Using Partitioning to Improve Data Warehouse Refresh
Optimizing DML Operations During Refresh
Refreshing Materialized Views
The database maintains data in materializedviews by refreshing them after changes to the base tables. The refresh method can be incremental or a complete refresh. There are two incremental refresh methods, known as log-based refresh and partition change tracking (PCT) refresh. The incremental refresh
is commonly called
FASTrefresh as it usually performs
faster than the complete refresh.
A complete refresh occurs when the materialized view is initially created when it is defined as
BUILD
IMMEDIATE,
unless the materialized view references a prebuilt table or is defined as
BUILD
DEFERRED.
Users can perform a complete refresh at any time after the materialized view is created. The complete refresh involves executing the query that defines the materialized view. This process can be slow, especially if the database must read and process huge amounts
of data.
An incremental refresh eliminates the need to rebuild materialized views from scratch. Thus, processing only the changes can result in a very fast refresh time. Materialized views can be refreshed either on demand or at regular time intervals. Alternatively,
materialized views in the same database as their base tables can be refreshed whenever a transaction commits its changes to the base tables.
For materialized views that use the log-based fast refresh method, a materialized view log and/or a direct loader log keep a record of changes to the base tables. A materialized view log is a schema object that records changes to a base table so that a materialized
view defined on the base table can be refreshed incrementally. Each materialized view log is associated with a single base table. The materialized view log resides in the same database and schema as its base table.
The PCT refresh method can be used if the modified base tables are partitioned and the modified base table partitions can be used to identify the affected partitions or portions of data in the materialized view. When there have been some partition maintenance
operations on the base tables, this is the only incremental refresh method that can be used. The PCT refresh removes all data in the affected materialized view partitions or affected portions of data and recomputes them from scratch.
For each of these refresh methods, you have two options for how the refresh is performed, namely in-place refresh and out-of-place refresh. The in-place refresh executes the refresh statements directly on the materialized view. The out-of-place refresh creates
one or more outside tables and executes the refresh statements on the outside tables and then switches the materialized view or affected materialized view partitions with the outside tables. Both in-place refresh and out-of-place refresh achieve good performance
in certain refresh scenarios. However, the out-of-place refresh enables high materialized view availability during refresh, especially when refresh statements take a long time to finish.
Synchronous refresh, a new refresh method introduced in Oracle Database 12c, Release 1, also adopts the out-of-place mechanism. It targets the common data warehouse scenario in which both fact tables and their materialized views are partitioned in the same way, or their partitions are related by a functional dependency.
This approach keeps a set of tables and the materialized views defined on them always in sync. With this refresh method, the user does not directly modify the contents of the base tables but instead uses the APIs provided by the synchronous refresh package, which apply the changes to the base tables and materialized views at the same time to ensure their consistency. The synchronous refresh method is well suited to data warehouses where the loading of incremental data is tightly controlled and occurs at periodic intervals.
When creating a materialized view, you have the option of specifying whether the refresh occurs ON DEMAND or ON COMMIT.
In the case of ON COMMIT, the materialized view is changed every time a transaction commits, thus ensuring that the materialized view always contains the latest data. Alternatively, you can control the time when refresh of the materialized views occurs by specifying ON DEMAND.
In the case of ON DEMAND materialized views, the refresh can be performed with refresh methods provided in either the DBMS_SYNC_REFRESH or the DBMS_MVIEW packages:
The DBMS_SYNC_REFRESH package contains the APIs for synchronous refresh, a new refresh method introduced in Oracle Database 12c, Release 1. For details, see Chapter 8, "Synchronous Refresh".
The DBMS_MVIEW package contains the APIs whose usage is described in this chapter. There are three basic types of refresh operations: complete refresh, fast refresh, and partition change tracking (PCT) refresh. These basic types have been enhanced in Oracle Database 12c, Release 1 with a new refresh option called out-of-place refresh.
The DBMS_MVIEW package contains three APIs for performing refresh operations:
DBMS_MVIEW.REFRESH
Refresh one or more materialized views.
DBMS_MVIEW.REFRESH_ALL_MVIEWS
Refresh all materialized views.
DBMS_MVIEW.REFRESH_DEPENDENT
Refresh all materialized views that depend on a specified master table or materialized view or list of master tables or materialized views.
See Also:
"Manual Refresh Using the DBMS_MVIEW Package" for more information
Performing a refresh operation requires temporary space to rebuild the indexes and can require additional space for performing the refresh operation itself. Some sites might prefer not to refresh all of their materialized views at the same time: as soon as some underlying detail data has been updated, all materialized views using this data become stale. Therefore, if you defer refreshing your materialized views, you can either rely on your chosen rewrite integrity level to determine whether or not a stale materialized view can be used for query rewrite, or you can temporarily disable query rewrite with an ALTER SYSTEM SET QUERY_REWRITE_ENABLED = FALSE statement. After refreshing the materialized views, you can re-enable query rewrite as the default for all sessions in the current database instance by specifying ALTER SYSTEM SET QUERY_REWRITE_ENABLED as TRUE.
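The disable/re-enable sequence described above can be sketched as follows (a minimal sketch; the refresh runs between the two statements):

```sql
-- Disable query rewrite instance-wide while materialized views are stale
ALTER SYSTEM SET QUERY_REWRITE_ENABLED = FALSE;

-- ... refresh the stale materialized views here ...

-- Re-enable query rewrite as the default for all sessions
ALTER SYSTEM SET QUERY_REWRITE_ENABLED = TRUE;
```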
Refreshing a materialized view automatically updates all of its indexes. In the case of full refresh, this requires temporary sort space to rebuild all indexes during refresh. This is because the full refresh truncates or deletes the table before inserting the new full data volume. If insufficient temporary space is available to rebuild the indexes, then you must explicitly drop each index or mark it UNUSABLE prior to performing the refresh operation.
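When temporary space is scarce, an index can be marked unusable before the refresh and rebuilt afterwards, for example (the index name is illustrative):

```sql
-- Mark an index on the materialized view container table unusable
-- so the full refresh does not attempt to maintain it
ALTER INDEX sales_mv_idx UNUSABLE;

-- ... perform the refresh ...

-- Rebuild the index once the refresh has completed
ALTER INDEX sales_mv_idx REBUILD;
```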
If you anticipate performing insert, update or delete operations on tables referenced by a materialized view concurrently with the refresh of that materialized view, and that materialized view includes joins and aggregation, Oracle recommends you use ON COMMIT fast refresh rather than ON DEMAND fast refresh.
An additional option when performing refresh is to use out-of-place refresh, where outside tables are used to improve materialized view availability and refresh performance in certain situations.
See Also:
Oracle OLAP User's Guide for
information regarding the refresh of cube organized materialized views
"The Out-of-Place Refresh Option" for a discussion of out-of-place refresh
This section contains the following topics:
Complete Refresh
Fast Refresh
Partition Change Tracking (PCT) Refresh
The Out-of-Place Refresh Option
ON COMMIT Refresh
Manual Refresh Using the DBMS_MVIEW Package
Refresh Specific Materialized Views with REFRESH
Refresh All Materialized Views with REFRESH_ALL_MVIEWS
Refresh Dependent Materialized Views with REFRESH_DEPENDENT
Using Job Queues for Refresh
When Fast Refresh is Possible
Recommended Initialization Parameters for Parallelism
Monitoring a Refresh
Checking the Status of a Materialized View
Scheduling Refresh
Complete Refresh
A complete refresh occurs when the materialized view is initially defined as BUILD IMMEDIATE, unless the materialized view references a prebuilt table. For materialized views using BUILD DEFERRED, a complete refresh must be requested before it can be used for the first time. A complete refresh may be requested at any time during the life of any materialized view. The refresh involves reading the detail tables to compute the results for the materialized view. This can be a very time-consuming process, especially if there are huge amounts of data to be read and processed. Therefore, you should always consider the time required to process a complete refresh before requesting it.
There are, however, cases when the only refresh method available for an already built materialized view is complete refresh because the materialized view does not satisfy the conditions specified in the following section for a fast refresh.
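A complete refresh can be requested explicitly through DBMS_MVIEW.REFRESH, for example (the materialized view name is illustrative):

```sql
-- Request a complete refresh ('C') of a hypothetical materialized view
DBMS_MVIEW.REFRESH('SALES_MV', method => 'C');
```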
Fast Refresh
Most data warehouses have periodic incremental updates to their detail data. As described in "Materialized View Schema Design", you can use SQL*Loader or any bulk load utility to perform incremental loads of detail data. Fast refresh of your materialized views is usually efficient, because instead of having to recompute the entire materialized view, the changes are applied to the existing data. Thus, processing only the changes can result in a very fast refresh time.
Partition Change Tracking (PCT) Refresh
When there have been some partition maintenance operations on the detail tables, this is the only method of fast refresh that can be used. PCT-based refresh on a materialized view is enabled only if all the conditions described in "About Partition Change Tracking" are satisfied.
In the absence of partition maintenance operations on detail tables, when you request a FAST method (method => 'F') of refresh through procedures in the DBMS_MVIEW package, Oracle uses a heuristic rule to try log-based fast refresh before choosing PCT refresh. Similarly, when you request a FORCE method (method => '?'), Oracle chooses the refresh method based on the following attempt order: log-based fast refresh, PCT refresh, and complete refresh. Alternatively, you can request the PCT method (method => 'P'), and Oracle uses the PCT method provided all PCT requirements are satisfied.
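For example, the method codes can be passed explicitly (the materialized view name is illustrative):

```sql
-- FORCE: log-based fast refresh if possible, then PCT, then complete
DBMS_MVIEW.REFRESH('SALES_MV', method => '?');

-- Explicitly request PCT refresh; fails if PCT requirements are not met
DBMS_MVIEW.REFRESH('SALES_MV', method => 'P');
```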
Oracle can use TRUNCATE PARTITION on a materialized view if it satisfies the conditions in "Benefits of Partitioning a Materialized View" and hence make the PCT refresh process more efficient.
See Also:
"About Partition Change Tracking" for more information regarding partition change tracking
The Out-of-Place Refresh Option
Beginning with Oracle Database 12c Release 1, a new refresh option is available to improve materialized view refresh performance and availability. This refresh option is called out-of-place refresh because it uses outside tables during refresh, as opposed to the existing "in-place" refresh that directly applies changes to the materialized view container table. The out-of-place refresh option works with all existing refresh methods, such as FAST ('F'), COMPLETE ('C'), PCT ('P'), and FORCE ('?').
Out-of-place refresh is particularly effective when handling situations with large amounts of data changes, where conventional DML statements do not scale well. It also enables you to achieve a very high degree of availability because the materialized views
that are being refreshed can be used for direct access and query rewrite during the execution of refresh statements. In addition, it helps to avoid potential problems such as materialized view container tables becoming fragmented over time or intermediate
refresh results being seen.
In out-of-place refresh, the entire or affected portions of a materialized view are computed into one or more outside tables. For partitioned materialized views, if partition level change tracking is possible, and there are local indexes defined on the materialized
view, the out-of-place method also builds the same local indexes on the outside tables. This refresh process is completed by either switching between the materialized view and the outside table or partition exchange between the affected partitions and the
outside tables. During refresh, the outside table is populated by direct load, which is efficient.
This section contains the following topics:
Types of Out-of-Place Refresh
Restrictions and Considerations with Out-of-Place Refresh
Types of Out-of-Place Refresh
There are three types of out-of-place refresh:
out-of-place fast refresh
This offers better availability than in-place fast refresh. It also offers better performance when changes affect a large part of the materialized view.
out-of-place PCT refresh
This offers better availability than in-place PCT refresh. There are two different approaches for partitioned and non-partitioned materialized views. If truncation and direct load are not feasible, you should use out-of-place refresh when the changes are relatively large. If truncation and direct load are feasible, in-place refresh is preferable in terms of performance. In terms of availability, out-of-place refresh is always preferable.
out-of-place complete refresh
This offers better availability than in-place complete refresh.
Using the refresh interface in the DBMS_MVIEW package, with method => '?' and out_of_place => true, out-of-place fast refresh is attempted first, then out-of-place PCT refresh, and finally out-of-place complete refresh. An example is the following:
DBMS_MVIEW.REFRESH('CAL_MONTH_SALES_MV', method => '?', atomic_refresh => FALSE, out_of_place => TRUE);
Restrictions and Considerations with Out-of-Place Refresh
Out-of-place refresh has all the restrictions that apply when using the corresponding in-place refresh. In addition, it has the following restrictions:
Only materialized join views and materialized aggregate views are allowed
No ON COMMIT refresh is permitted
No remote materialized views, cube materialized views, or object materialized views are permitted
No LOB columns are permitted
Not permitted if materialized view logs, triggers, or constraints (except NOT NULL) are defined on the materialized view
Not permitted if the materialized view contains the CLUSTERING clause
Not applied to complete refresh within a CREATE or ALTER MATERIALIZED VIEW session or an ALTER TABLE session
Atomic mode is not permitted. If you specify atomic_refresh as TRUE and out_of_place as TRUE, an error is displayed
For out-of-place PCT refresh, there is the following restriction:
No UNION ALL or grouping sets are permitted
For out-of-place fast refresh, there are the following restrictions:
No UNION ALL, grouping sets or outer joins are permitted
Not allowed for materialized join views when more than one base table is modified with mixed DML statements
Out-of-place refresh requires additional storage for the outside table and the indexes for the duration of the refresh. Thus, you must have enough available tablespace or auto extend turned on.
The partition exchange in out-of-place PCT refresh impacts the global index on the materialized view. Therefore, if there are global indexes defined on the materialized view container table, Oracle disables the global indexes before doing the partition exchange and rebuilds the global indexes after the partition exchange. This rebuilding is additional overhead.
ON COMMIT Refresh
A materialized view can be refreshed automatically using the ON COMMIT method. Therefore, whenever a transaction commits that has updated the tables on which a materialized view is defined, those changes are automatically reflected in the materialized view. The advantage of using this approach is that you never have to remember to refresh the materialized view. The only disadvantage is that the time required to complete the commit will be slightly longer because of the extra processing involved. However, in a data warehouse, this should not be an issue because there are unlikely to be concurrent processes trying to update the same table.
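A hedged sketch of an ON COMMIT aggregate materialized view (the table and column names are illustrative; fast refresh on commit also requires a materialized view log on the detail table, and aggregate views typically need the COUNT columns shown to remain fast refreshable):

```sql
CREATE MATERIALIZED VIEW sales_by_prod_mv
  REFRESH FAST ON COMMIT
  AS SELECT prod_id,
            SUM(amount_sold)   AS sum_sold,
            COUNT(amount_sold) AS cnt_sold,
            COUNT(*)           AS cnt
     FROM sales
     GROUP BY prod_id;
```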
Manual Refresh Using the DBMS_MVIEW Package
When a materialized view is refreshed ON DEMAND, one of four refresh methods can be specified as shown in the following table. You can define a default option during the creation of the materialized view. Table 7-1 details the refresh options.
Table 7-1 ON DEMAND Refresh Methods
Refresh Option | Parameter | Description |
---|---|---|
COMPLETE | C | Refreshes by recalculating the defining query of the materialized view. |
FAST | F | Refreshes by incrementally applying changes to the materialized view. For local materialized views, it chooses the refresh method which is estimated by optimizer to be most efficient. The refresh methods considered are log-based FAST and FAST_PCT. |
FAST_PCT | P | Refreshes by recomputing the rows in the materialized view affected by changed partitions in the detail tables. |
FORCE | ? | Attempts a fast refresh. If that is not possible, it does a complete refresh. For local materialized views, it chooses the refresh method which is estimated by optimizer to be most efficient. The refresh methods considered are log-based FAST, FAST_PCT, and COMPLETE. |
Three refresh procedures are available in the DBMS_MVIEW package for performing ON DEMAND refresh. Each has its own unique set of parameters.
See Also:
Oracle Database Advanced
Replication for information showing how to use it in a replication environment
Oracle Database PL/SQL Packages
and Types Reference for detailed information about the
DBMS_MVIEWpackage
Refresh Specific Materialized Views with REFRESH
Use the DBMS_MVIEW.REFRESH procedure to refresh one or more materialized views. Some parameters are used only for replication, so they are not mentioned here. The required parameters to use this procedure are:
The comma-delimited list of materialized views to refresh
The refresh method: F-Fast, P-Fast_PCT, ?-Force, C-Complete
The rollback segment to use
Refresh after errors (TRUE or FALSE)
A Boolean parameter. If set to TRUE, the number_of_failures output parameter is set to the number of refreshes that failed, and a generic error message indicates that failures occurred. The alert log for the instance gives details of refresh errors. If set to FALSE, the default, then refresh stops after it encounters the first error, and any remaining materialized views in the list are not refreshed.
The following four parameters are used by the replication process. For warehouse refresh, set them to FALSE, 0, 0, 0.
Atomic refresh (TRUE or FALSE)
If set to TRUE, then all refreshes are done in one transaction. If set to FALSE, then each of the materialized views is refreshed non-atomically in separate transactions. If set to FALSE, Oracle can optimize refresh by using parallel DML and truncate DDL on the materialized views. When a materialized view is refreshed in atomic mode, it is eligible for query rewrite if the rewrite integrity mode is set to stale_tolerated.
Atomic refresh cannot be guaranteed when refresh is performed on nested views.
Whether to use out-of-place refresh
This parameter works with all existing refresh methods (F, P, C, ?). So, for example, if you specify F and out_of_place = true, then an out-of-place fast refresh is attempted. Similarly, if you specify P and out_of_place = true, then out-of-place PCT refresh is attempted.
For example, to perform a fast refresh on the materialized view cal_month_sales_mv, the DBMS_MVIEW package would be called as follows:
DBMS_MVIEW.REFRESH('CAL_MONTH_SALES_MV', 'F', '', TRUE, FALSE, 0,0,0, FALSE, FALSE);
Multiple materialized views can be refreshed at the same time, and they do not all have to use the same refresh method. To give them different refresh methods, specify multiple method codes in the same order as the list of materialized views (without commas).
For example, the following specifies that cal_month_sales_mv be completely refreshed and fweek_pscat_sales_mv receive a fast refresh:
DBMS_MVIEW.REFRESH('CAL_MONTH_SALES_MV, FWEEK_PSCAT_SALES_MV', 'CF', '', TRUE, FALSE, 0,0,0, FALSE, FALSE);
If the refresh method is not specified, the default refresh method as specified in the materialized view definition is used.
Refresh All Materialized Views with REFRESH_ALL_MVIEWS
An alternative to specifying the materialized views to refresh is to use the procedure DBMS_MVIEW.REFRESH_ALL_MVIEWS. This procedure refreshes all materialized views. If any of the materialized views fails to refresh, then the number of failures is reported.
The parameters for this procedure are:
The number of failures (this is an OUT variable)
The refresh method: F-Fast, P-Fast_PCT, ?-Force, C-Complete
Refresh after errors (TRUE or FALSE)
A Boolean parameter. If set to TRUE, the number_of_failures output parameter is set to the number of refreshes that failed, and a generic error message indicates that failures occurred. The alert log for the instance gives details of refresh errors. If set to FALSE, the default, then refresh stops after it encounters the first error, and any remaining materialized views in the list are not refreshed.
Atomic refresh (TRUE or FALSE)
If set to TRUE, then all refreshes are done in one transaction. If set to FALSE, then each of the materialized views is refreshed non-atomically in separate transactions. If set to FALSE, Oracle can optimize refresh by using parallel DML and truncate DDL on the materialized views. When a materialized view is refreshed in atomic mode, it is eligible for query rewrite if the rewrite integrity mode is set to stale_tolerated.
Atomic refresh cannot be guaranteed when refresh is performed on nested views.
Whether to use out-of-place refresh
This parameter works with all existing refresh methods (F, P, C, ?). So, for example, if you specify F and out_of_place = true, then an out-of-place fast refresh is attempted. Similarly, if you specify P and out_of_place = true, then out-of-place PCT refresh is attempted.
An example of refreshing all materialized views is the following:
DBMS_MVIEW.REFRESH_ALL_MVIEWS(failures,'C','', TRUE, FALSE, FALSE);
Refresh Dependent Materialized Views with REFRESH_DEPENDENT
The third procedure, DBMS_MVIEW.REFRESH_DEPENDENT, refreshes only those materialized views that depend on a specific table or list of tables. For example, suppose the changes have been received for the orders table but not for customer payments. The refresh dependent procedure can be called to refresh only those materialized views that reference the orders table.
The parameters for this procedure are:
The number of failures (this is an OUT variable)
The dependent table
The refresh method: F-Fast, P-Fast_PCT, ?-Force, C-Complete
The rollback segment to use
Refresh after errors (TRUE or FALSE)
A Boolean parameter. If set to TRUE, the number_of_failures output parameter is set to the number of refreshes that failed, and a generic error message indicates that failures occurred. The alert log for the instance gives details of refresh errors. If set to FALSE, the default, then refresh stops after it encounters the first error, and any remaining materialized views in the list are not refreshed.
Atomic refresh (TRUE or FALSE)
If set to TRUE, then all refreshes are done in one transaction. If set to FALSE, then each of the materialized views is refreshed non-atomically in separate transactions. If set to FALSE, Oracle can optimize refresh by using parallel DML and truncate DDL on the materialized views. When a materialized view is refreshed in atomic mode, it is eligible for query rewrite if the rewrite integrity mode is set to stale_tolerated.
Atomic refresh cannot be guaranteed when refresh is performed on nested views.
Whether it is nested or not
If set to
TRUE, refresh all the dependent materialized views of the
specified set of tables based on a dependency order to ensure the materialized views are truly fresh with respect to the underlying base tables.
Whether to use out-of-place refresh
This parameter works with all existing refresh methods (F, P, C, ?). So, for example, if you specify F and out_of_place = true, then an out-of-place fast refresh is attempted. Similarly, if you specify P and out_of_place = true, then out-of-place PCT refresh is attempted.
To perform a full refresh on all materialized views that reference the customers table, specify:
DBMS_MVIEW.REFRESH_DEPENDENT(failures, 'CUSTOMERS', 'C', '', FALSE, FALSE, FALSE);
Using Job Queues for Refresh
Job queues can be used to refresh multiple materialized views in parallel. If queues are not available, fast refresh sequentially refreshes each view in the foreground process. To make queues available, you must set the JOB_QUEUE_PROCESSES parameter. This parameter defines the number of background job queue processes and determines how many materialized views can be refreshed concurrently. Oracle tries to balance the number of concurrent refreshes with the degree of parallelism of each refresh. The order in which the materialized views are refreshed is determined by dependencies imposed by nested materialized views and potential for efficient refresh by using query rewrite against other materialized views (see "Scheduling Refresh" for details). This parameter is only effective when atomic_refresh is set to FALSE.
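For example, to allow up to four concurrent refresh jobs (the value is illustrative):

```sql
-- Enable four background job queue processes so materialized view
-- refreshes can run in parallel
ALTER SYSTEM SET JOB_QUEUE_PROCESSES = 4;
```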
If the process that is executing DBMS_MVIEW.REFRESH is interrupted or the instance is shut down, any refresh jobs that were executing in job queue processes are requeued and continue running. To remove these jobs, use the DBMS_JOB.REMOVE procedure.
See Also:
Oracle Database PL/SQL Packages
and Types Reference for detailed information about the
DBMS_JOBpackage
When Fast Refresh is Possible
Not all materialized views may be fast refreshable. Therefore, use the package DBMS_MVIEW.EXPLAIN_MVIEW to determine what refresh methods are available for a materialized view.
If you are not sure how to make a materialized view fast refreshable, you can use the DBMS_ADVISOR.TUNE_MVIEW procedure, which provides a script containing the statements required to create a fast refreshable materialized view.
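A sketch of using EXPLAIN_MVIEW (the materialized view name is illustrative; the MV_CAPABILITIES_TABLE must first be created with the utlxmv.sql script shipped with the database):

```sql
-- Populate MV_CAPABILITIES_TABLE with the capabilities of SALES_MV
DBMS_MVIEW.EXPLAIN_MVIEW('SALES_MV');

-- Inspect which refresh methods are possible, and why not if not
SELECT capability_name, possible, msgtxt
FROM mv_capabilities_table
WHERE capability_name LIKE 'REFRESH%';
```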
See Also:
Oracle Database SQL
Tuning Guide
Chapter 5, "Basic Materialized Views" for further information about the DBMS_MVIEW package
Recommended Initialization Parameters for Parallelism
The following initialization parameters need to be set properly for parallelism to be effective:
PARALLEL_MAX_SERVERS should be set high enough to take care of parallelism. You must consider the number of slaves needed for the refresh statement. For example, with a degree of parallelism of eight, you need 16 slave processes.
PGA_AGGREGATE_TARGET should be set for the instance to manage the memory usage for sorts and joins automatically. If the memory parameters are set manually, SORT_AREA_SIZE should be less than HASH_AREA_SIZE.
OPTIMIZER_MODE should equal all_rows.
Remember to analyze all tables and indexes for better optimization.
See Also:
Oracle
Database VLDB and Partitioning Guide
Monitoring a Refresh
While a job is running, you can query the V$SESSION_LONGOPS view to tell you the progress of each materialized view being refreshed.
SELECT * FROM V$SESSION_LONGOPS;
To look at the progress of which jobs are on which queue, use:
SELECT * FROM DBA_JOBS_RUNNING;
Checking the Status of a Materialized View
Three views are provided for checking the status of a materialized view: DBA_MVIEWS, ALL_MVIEWS, and USER_MVIEWS. To check if a materialized view is fresh or stale, issue the following statement:
SELECT MVIEW_NAME, STALENESS, LAST_REFRESH_TYPE, COMPILE_STATE FROM USER_MVIEWS ORDER BY MVIEW_NAME;

MVIEW_NAME         STALENESS      LAST_REF  COMPILE_STATE
----------         ---------      --------  -------------
CUST_MTH_SALES_MV  NEEDS_COMPILE  FAST      NEEDS_COMPILE
PROD_YR_SALES_MV   FRESH          FAST      VALID
If the compile_state column shows NEEDS COMPILE, the other displayed column values cannot be trusted as reflecting the true status. To revalidate the materialized view, issue the following statement:
ALTER MATERIALIZED VIEW [materialized_view_name] COMPILE;
Then reissue the SELECT statement.
Viewing Partition Freshness
Several views are available that enable you to verify the status of base table partitions and determine which ranges of materialized view data are fresh and which are stale. The views are as follows:
USER_MVIEWS
To determine partition change tracking (PCT) information for the materialized view.
USER_MVIEW_DETAIL_RELATIONS
To display partition information for the detail table a materialized view is based on.
USER_MVIEW_DETAIL_PARTITION
To determine which partitions are fresh.
USER_MVIEW_DETAIL_SUBPARTITION
To determine which subpartitions are fresh.
The use of these views is illustrated in the following examples. Figure 7-1 illustrates
a range-list partitioned table and a materialized view based on it. The partitions are P1, P2, P3, and P4, while the subpartitions are SP1, SP2, and SP3.
Figure 7-1 Determining PCT Freshness
Examples of Using Views to Determine Freshness
This section illustrates examples of determining the PCT and freshness information for materialized views and their detail tables.
Example 7-1 Verifying the PCT Status of a Materialized View
Query USER_MVIEWS to access PCT information about the materialized view, as shown in the following:

SELECT MVIEW_NAME, NUM_PCT_TABLES, NUM_FRESH_PCT_REGIONS, NUM_STALE_PCT_REGIONS FROM USER_MVIEWS WHERE MVIEW_NAME = 'MV1';

MVIEW_NAME  NUM_PCT_TABLES  NUM_FRESH_PCT_REGIONS  NUM_STALE_PCT_REGIONS
----------  --------------  ---------------------  ---------------------
MV1         1               9                      3
Example 7-2 Verifying the PCT Status in a Materialized View's Detail Table
Query USER_MVIEW_DETAIL_RELATIONS to access PCT detail table information, as shown in the following:

SELECT MVIEW_NAME, DETAILOBJ_NAME, DETAILOBJ_PCT, NUM_FRESH_PCT_PARTITIONS, NUM_STALE_PCT_PARTITIONS FROM USER_MVIEW_DETAIL_RELATIONS WHERE MVIEW_NAME = 'MV1';

MVIEW_NAME  DETAILOBJ_NAME  DETAILOBJ_PCT  NUM_FRESH_PCT_PARTITIONS  NUM_STALE_PCT_PARTITIONS
----------  --------------  -------------  ------------------------  ------------------------
MV1         T1              Y              3                         1
Example 7-3 Verifying Which Partitions are Fresh
Query USER_MVIEW_DETAIL_PARTITION to access PCT freshness information for partitions, as shown in the following:

SELECT MVIEW_NAME, DETAILOBJ_NAME, DETAIL_PARTITION_NAME, DETAIL_PARTITION_POSITION, FRESHNESS FROM USER_MVIEW_DETAIL_PARTITION WHERE MVIEW_NAME = 'MV1';

MVIEW_NAME  DETAILOBJ_NAME  DETAIL_PARTITION_NAME  DETAIL_PARTITION_POSITION  FRESHNESS
----------  --------------  ---------------------  -------------------------  ---------
MV1         T1              P1                     1                          FRESH
MV1         T1              P2                     2                          FRESH
MV1         T1              P3                     3                          STALE
MV1         T1              P4                     4                          FRESH
Example 7-4 Verifying Which Subpartitions are Fresh
Query USER_MVIEW_DETAIL_SUBPARTITION to access PCT freshness information for subpartitions, as shown in the following:

SELECT MVIEW_NAME, DETAILOBJ_NAME, DETAIL_PARTITION_NAME, DETAIL_SUBPARTITION_NAME, DETAIL_SUBPARTITION_POSITION, FRESHNESS FROM USER_MVIEW_DETAIL_SUBPARTITION WHERE MVIEW_NAME = 'MV1';

MVIEW_NAME  DETAILOBJ  DETAIL_PARTITION  DETAIL_SUBPARTITION_NAME  DETAIL_SUBPARTITION_POS  FRESHNESS
----------  ---------  ----------------  ------------------------  -----------------------  ---------
MV1         T1         P1                SP1                       1                        FRESH
MV1         T1         P1                SP2                       1                        FRESH
MV1         T1         P1                SP3                       1                        FRESH
MV1         T1         P2                SP1                       1                        FRESH
MV1         T1         P2                SP2                       1                        FRESH
MV1         T1         P2                SP3                       1                        FRESH
MV1         T1         P3                SP1                       1                        STALE
MV1         T1         P3                SP2                       1                        STALE
MV1         T1         P3                SP3                       1                        STALE
MV1         T1         P4                SP1                       1                        FRESH
MV1         T1         P4                SP2                       1                        FRESH
MV1         T1         P4                SP3                       1                        FRESH
Scheduling Refresh
Very often you have multiple materialized views in the database. Some of these can be computed by rewriting against others. This is very common in data warehousing environments where you may have nested materialized views or materialized views at different levels of some hierarchy.
In such cases, you should create the materialized views as BUILD DEFERRED, and then issue one of the refresh procedures in the DBMS_MVIEW package to refresh all the materialized views. Oracle Database computes the dependencies and refreshes the materialized views in the right order. Consider the example of a complete hierarchical cube described in "Examples of Hierarchical Cube Materialized Views". Suppose all the materialized views have been created as BUILD DEFERRED. Creating the materialized views as BUILD DEFERRED only creates the metadata for all the materialized views. Then, you can call one of the refresh procedures in the DBMS_MVIEW package to refresh all the materialized views in the right order:
DECLARE
  numerrs PLS_INTEGER;
BEGIN
  DBMS_MVIEW.REFRESH_DEPENDENT (
    number_of_failures => numerrs,
    list               => 'SALES',
    method             => 'C');
  DBMS_OUTPUT.PUT_LINE('There were ' || numerrs || ' errors during refresh');
END;
/
The procedure refreshes the materialized views in the order of their dependencies (first sales_hierarchical_mon_cube_mv, followed by sales_hierarchical_qtr_cube_mv, then sales_hierarchical_yr_cube_mv, and finally sales_hierarchical_all_cube_mv). Each of these materialized views gets rewritten against the one prior to it in the list.
The same kind of rewrite can also be used while doing PCT refresh. PCT refresh recomputes rows in a materialized view corresponding to changed rows in the detail tables. And, if there are other fresh materialized views available at the time of refresh, it can
go directly against them as opposed to going against the detail tables.
Hence, it is always beneficial to pass a list of materialized views to any of the refresh procedures in the DBMS_MVIEW package (irrespective of the method specified) and let the procedure figure out the order of doing refresh on materialized views.
Tips for Refreshing Materialized Views
This section contains the following topics with tips on refreshing materialized views:
Tips for Refreshing Materialized Views with Aggregates
Tips for Refreshing Materialized Views Without Aggregates
Tips for Refreshing Nested Materialized Views
Tips for Fast Refresh with UNION ALL
Tips for Fast Refresh with Commit SCN-Based Materialized View Logs
Tips After Refreshing Materialized Views
Tips for Refreshing Materialized Views with Aggregates
Following are some guidelines for using the refresh mechanism for materialized views with aggregates.For fast refresh, create materialized view logs on all detail tables involved in a materialized view with the
ROWID,
SEQUENCEand
INCLUDING
NEW
VALUESclauses.
Include all columns from the table likely to be used in materialized views in the materialized view logs.
Fast refresh may be possible even if the SEQUENCE option is omitted from the materialized view log. If it can be determined that only inserts or deletes will occur on all the detail tables, then the materialized view log does not require the SEQUENCE clause. However, if updates to multiple tables are likely or required, or if the specific update scenarios are unknown, make sure the SEQUENCE clause is included.
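A minimal sketch of such a log, assuming the sh.sales detail table (the column list is illustrative; include the columns your materialized views actually use):

```sql
-- Sketch of a log that supports fast refresh of aggregate mviews:
-- ROWID, SEQUENCE and INCLUDING NEW VALUES, plus the referenced columns.
CREATE MATERIALIZED VIEW LOG ON sales
  WITH ROWID, SEQUENCE (prod_id, cust_id, time_id, quantity_sold, amount_sold)
  INCLUDING NEW VALUES;
```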
Use Oracle's bulk loader utility or direct-path INSERT (INSERT with the APPEND hint for loads). Starting in Oracle Database 12c, the database automatically gathers table statistics as part of a bulk-load operation (CTAS and IAS), similar to how statistics are gathered when an index is created. By gathering statistics during the data load, you avoid additional scan operations and provide the necessary statistics as soon as the data becomes available to the users. Note that, in the case of an IAS statement, statistics are gathered only if the table the data is being inserted into is empty.
Direct-path INSERT is a lot more efficient than conventional insert. During loading, disable all constraints and re-enable them when finished loading. Note that materialized view logs are required regardless of whether you use direct load or conventional DML.
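A sketch of such a load, assuming a hypothetical staging table and constraint name:

```sql
-- Sketch: disable constraints, direct-path load, re-enable.
-- sales_staging and sales_customer_fk are hypothetical names.
ALTER TABLE sales DISABLE CONSTRAINT sales_customer_fk;
INSERT /*+ APPEND */ INTO sales SELECT * FROM sales_staging;
COMMIT;
ALTER TABLE sales ENABLE NOVALIDATE CONSTRAINT sales_customer_fk;
```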
Try to optimize the sequence of conventional mixed DML operations, direct-path INSERT and the fast refresh of materialized views. You can use fast refresh with a mixture of conventional DML and direct loads. Fast refresh can perform significant optimizations if it finds that only direct loads have occurred, as illustrated in the following:
Direct-path INSERT (SQL*Loader or INSERT /*+ APPEND */) into the detail table
Refresh materialized view
Conventional mixed DML
Refresh materialized view
You can use fast refresh with conventional mixed DML (INSERT, UPDATE, and DELETE) to the detail tables. However, fast refresh is able to perform significant optimizations in its processing if it detects that only inserts or deletes have been done to the tables, such as:
DML INSERT or DELETE to the detail table
Refresh materialized views
DML update to the detail table
Refresh materialized view
Even more optimal is the separation of INSERT and DELETE. If possible, refresh should be performed after each type of data change (as shown earlier) rather than issuing only one refresh at the end. If that is not possible, restrict the conventional DML to the table to inserts only, to get much better refresh performance. Avoid mixing deletes and direct loads.
Furthermore, for refresh ON COMMIT, Oracle keeps track of the type of DML done in the committed transaction. Therefore, do not perform direct-path INSERT and DML to other tables in the same transaction, as Oracle may not be able to optimize the refresh phase.
For ON COMMIT materialized views, where refreshes automatically occur at the end of each transaction, it may not be possible to isolate the DML statements, in which case keeping the transactions short will help. However, if you plan to make numerous modifications to the detail table, it may be better to perform them in one transaction, so that refresh of the materialized view is performed just once at commit time rather than after each update.
Oracle recommends partitioning the tables because it enables you to use:
Parallel DML
For large loads or refresh, enabling parallel DML helps shorten the length of time for the operation.
Partition change tracking (PCT) fast refresh
You can fast refresh your materialized views after partition maintenance operations on the detail tables. See "About Partition Change Tracking" for details on enabling PCT for materialized views.
Partitioning the materialized view also helps refresh performance, as refresh can update the materialized view using parallel DML. For example, assume that the detail tables and materialized view are partitioned and have a parallel clause. The following sequence would enable Oracle to parallelize the refresh of the materialized view.
Bulk load into the detail table.
Enable parallel DML with an ALTER SESSION ENABLE PARALLEL DML statement.
Refresh the materialized view.
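Steps 2 and 3 of the sequence above might look like the following sketch (the materialized view name is illustrative):

```sql
-- Sketch of steps 2 and 3: enable parallel DML in the session, then
-- fast refresh; cust_mth_sales_mv is an illustrative name.
ALTER SESSION ENABLE PARALLEL DML;
EXECUTE DBMS_MVIEW.REFRESH('cust_mth_sales_mv', method => 'F');
```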
For refresh using DBMS_MVIEW.REFRESH, set the parameter atomic_refresh to FALSE.
For COMPLETE refresh, this causes a TRUNCATE to delete existing rows in the materialized view, which is faster than a delete.
For PCT refresh, if the materialized view is partitioned appropriately, this uses TRUNCATE PARTITION to delete rows in the affected partitions of the materialized view, which is faster than a delete.
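For example, a non-atomic complete refresh that can truncate rather than delete might be requested as follows (the materialized view name is illustrative):

```sql
-- Sketch: atomic_refresh => FALSE lets a COMPLETE refresh truncate the
-- mview instead of deleting rows; the mview name is illustrative.
EXECUTE DBMS_MVIEW.REFRESH('cust_mth_sales_mv', method => 'C', atomic_refresh => FALSE);
```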
For FAST or FORCE refresh, if COMPLETE or PCT refresh is chosen, this is able to use the TRUNCATE optimizations described earlier.
When using DBMS_MVIEW.REFRESH with JOB_QUEUES, remember to set atomic to FALSE. Otherwise, JOB_QUEUES is not used. Set the number of job queue processes greater than the number of processors.
If job queues are enabled and there are many materialized views to refresh, it is faster to refresh all of them in a single command than to call them individually.
Use REFRESH FORCE to ensure that a materialized view is refreshed so that it can definitely be used for query rewrite. The best refresh method is chosen automatically: if a fast refresh cannot be done, a complete refresh is performed.
Refresh all the materialized views in a single procedure call. This gives Oracle an opportunity to schedule refresh of all the materialized views in the right order, taking into account dependencies imposed by nested materialized views and the potential for efficient refresh by using query rewrite against other materialized views.
Tips for Refreshing Materialized Views Without Aggregates
If a materialized view contains joins but no aggregates, then having an index on each of the join column rowids in the detail table enhances refresh performance greatly, because this type of materialized view tends to be much larger than materialized views containing aggregates. For example, consider the following materialized view:
CREATE MATERIALIZED VIEW detail_fact_mv BUILD IMMEDIATE AS SELECT s.rowid "sales_rid", t.rowid "times_rid", c.rowid "cust_rid", c.cust_state_province, t.week_ending_day, s.amount_sold FROM sales s, times t, customers c WHERE s.time_id = t.time_id AND s.cust_id = c.cust_id;
Indexes should be created on columns sales_rid, times_rid, and cust_rid.
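A sketch of those indexes (the index names are illustrative; the column aliases were created lowercase in double quotes above, so they are quoted here as well):

```sql
-- Sketch: index each stored rowid column of detail_fact_mv.
-- The quoted lowercase names match the aliases in the CREATE above.
CREATE INDEX fact_mv_sales_rid_ix ON detail_fact_mv ("sales_rid");
CREATE INDEX fact_mv_times_rid_ix ON detail_fact_mv ("times_rid");
CREATE INDEX fact_mv_cust_rid_ix  ON detail_fact_mv ("cust_rid");
```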
Partitioning is highly recommended, as is enabling parallel DML in the session before invoking refresh, because it greatly enhances refresh performance.
This type of materialized view can also be fast refreshed if DML is performed on the detail table. It is recommended that the same procedure be applied to this type of materialized view as for a single table aggregate. That is, perform one type of change (direct-path INSERT or DML) and then refresh the materialized view. This is because Oracle Database can perform significant optimizations if it detects that only one type of change has been done.
Also, Oracle recommends that the refresh be invoked after each table is loaded, rather than load all the tables and then perform the refresh.
For refresh ON COMMIT, Oracle keeps track of the type of DML done in the committed transaction. Oracle therefore recommends that you do not perform direct-path and conventional DML to other tables in the same transaction because Oracle may not be able to optimize the refresh phase.
For example, the following is not recommended:
Direct load new data into the fact table
DML into the store table
Commit
Also, try not to mix different types of conventional DML statements if possible. This would again prevent using various optimizations during fast refresh. For example, try to avoid the following:
Insert into the fact table
Delete from the fact table
Commit
If many updates are needed, try to group them all into one transaction because refresh is performed just once at commit time, rather than after each update.
In a data warehousing environment, assuming that the materialized view has a parallel clause, the following sequence of steps is recommended:
Bulk load into the fact table
Enable parallel DML with an ALTER SESSION ENABLE PARALLEL DML statement
Refresh the materialized view
Tips for Refreshing Nested Materialized Views
All underlying objects are treated as ordinary tables when refreshing materialized views. If the ON COMMIT refresh option is specified, then all the materialized views are refreshed in the appropriate order at commit time. In other words, Oracle builds a partially ordered set of materialized views and refreshes them such that, after the successful completion of the refresh, all the materialized views are fresh. The status of the materialized views can be checked by querying the appropriate USER_, DBA_, or ALL_MVIEWS view.
If any of the materialized views are defined as ON DEMAND refresh (irrespective of whether the refresh method is FAST, FORCE, or COMPLETE), you must refresh them in the correct order (taking into account the dependencies between the materialized views) because the nested materialized views are refreshed with respect to the current contents of the other materialized views (whether fresh or not). This can be achieved by invoking the refresh procedure against the materialized view at the top of the nested hierarchy and specifying the nested parameter as TRUE.
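As a sketch, with a hypothetical top-level materialized view name:

```sql
-- Sketch: refresh the top of the nested hierarchy; nested => TRUE first
-- refreshes the mviews it depends on, in dependency order.
-- top_level_mv is a hypothetical name.
EXECUTE DBMS_MVIEW.REFRESH('top_level_mv', nested => TRUE);
```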
If a refresh fails during commit time, the list of materialized views that have not been refreshed is written to the alert log, and you must manually refresh them along with all their dependent materialized views.
Use the same DBMS_MVIEW procedures on nested materialized views that you use on regular materialized views.
These procedures have the following behavior when used with nested materialized views:
If REFRESH is applied to a materialized view my_mv that is built on other materialized views, then my_mv is refreshed with respect to the current contents of the other materialized views (that is, the other materialized views are not made fresh first) unless you specify nested => TRUE.
If REFRESH_DEPENDENT is applied to materialized view my_mv, then only materialized views that directly depend on my_mv are refreshed (that is, a materialized view that depends on a materialized view that depends on my_mv will not be refreshed) unless you specify nested => TRUE.
If REFRESH_ALL_MVIEWS is used, the order in which the materialized views are refreshed is guaranteed to respect the dependencies between nested materialized views.
GET_MV_DEPENDENCIES provides a list of the immediate (or direct) materialized view dependencies for an object.
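The two procedures above can be sketched together as follows; the object name SH.SALES is illustrative, and the method and error-handling parameters are left at simple values:

```sql
-- Sketch: list the direct mview dependencies of an object, then refresh
-- all mviews in dependency order; SH.SALES is an illustrative name.
SET SERVEROUTPUT ON
DECLARE
  deps     VARCHAR2(4000);
  failures BINARY_INTEGER;   -- OUT parameter: number of failed refreshes
BEGIN
  DBMS_MVIEW.GET_MV_DEPENDENCIES('SH.SALES', deps);
  DBMS_OUTPUT.PUT_LINE(deps);
  DBMS_MVIEW.REFRESH_ALL_MVIEWS(failures, method => 'F');
END;
/
```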
Tips for Fast Refresh with UNION ALL
You can use fast refresh for materialized views that use the UNION ALL operator by providing a maintenance column in the definition of the materialized view. For example, a materialized view with a UNION ALL operator can be made fast refreshable as follows:
CREATE MATERIALIZED VIEW fast_rf_union_all_mv AS SELECT x.rowid AS r1, y.rowid AS r2, a, b, c, 1 AS marker FROM x, y WHERE x.a = y.b UNION ALL SELECT p.rowid, r.rowid, a, c, d, 2 AS marker FROM p, r WHERE p.a = r.y;
The form of the maintenance marker column (column MARKER in the example) must be numeric_or_string_literal AS column_alias, where each UNION ALL member has a distinct value for numeric_or_string_literal.
Tips for Fast Refresh with Commit SCN-Based Materialized View Logs
You can often significantly improve fast refresh performance by ensuring that your materialized view logs on the base table contain a WITH COMMIT SCN clause. By optimizing materialized view log processing WITH COMMIT SCN, the fast refresh process can save time. The following example illustrates how to use this clause:
CREATE MATERIALIZED VIEW LOG ON sales WITH ROWID (prod_id, cust_id, time_id, channel_id, promo_id, quantity_sold, amount_sold), COMMIT SCN INCLUDING NEW VALUES;
The materialized view refresh automatically uses the commit SCN-based materialized view log to save refresh time.
Note that only new materialized view logs can take advantage of COMMIT SCN. Existing materialized view logs cannot be altered to add COMMIT SCN unless they are dropped and recreated.
When a materialized view is created on base tables with both timestamp-based materialized view logs and commit SCN-based materialized view logs, an error (ORA-32414) is raised stating that the materialized view logs are not compatible with each other for fast refresh.
Tips After Refreshing Materialized Views
After you have performed a load or incremental load and rebuilt the detail table indexes, you must re-enable integrity constraints (if any) and refresh the materialized views and materialized view indexes that are derived from that detail data. In a data warehouse environment, referential integrity constraints are normally enabled with the NOVALIDATE or RELY options.
An important decision to make before performing a refresh operation is whether the refresh needs to be recoverable. Because materialized view data is redundant and can always be reconstructed from the detail tables, it might be preferable to disable logging on the materialized view. To disable logging and run incremental refresh non-recoverably, use the ALTER MATERIALIZED VIEW ... NOLOGGING statement prior to refreshing.
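For example (the materialized view name is illustrative):

```sql
-- Sketch: disable redo logging so subsequent refreshes run
-- non-recoverably; cust_mth_sales_mv is an illustrative name.
ALTER MATERIALIZED VIEW cust_mth_sales_mv NOLOGGING;
```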
If the materialized view is being refreshed using the ON COMMIT method, then, following refresh operations, consult the alert log alert_SID.log and the trace file ora_SID_number.trc to check that no errors have occurred.
Using Materialized Views with Partitioned Tables
A major maintenance component of a data warehouse is synchronizing (refreshing) the materialized views when the detail data changes. Partitioning the underlying detail tables can reduce the amount of time taken to perform the refresh task. This is possible because partitioning enables refresh to use parallel DML to update the materialized view. Also, it enables the use of partition change tracking.
"Fast Refresh with Partition Change Tracking" provides additional information about PCT refresh.
Fast Refresh with Partition Change Tracking
In a data warehouse, changes to the detail tables can often entail partition maintenance operations, such as DROP, EXCHANGE, MERGE, and ADD PARTITION. Maintaining the materialized view after such operations used to require manual maintenance (see also CONSIDER FRESH) or complete refresh. You now have the option of using an addition to fast refresh known as partition change tracking (PCT) refresh.
For PCT to be available, the detail tables must be partitioned. The partitioning of the materialized view itself has no bearing on this feature. If PCT refresh is possible, it occurs automatically and no user intervention is required in order for it to occur.
See "About Partition Change Tracking" for PCT requirements.
The following examples illustrate the use of this feature:
PCT Fast Refresh Scenario 1
PCT Fast Refresh Scenario 2
PCT Fast Refresh Scenario 3
PCT Fast Refresh Scenario 1
In this scenario, assume sales is a partitioned table using the time_id column and products is partitioned by the prod_category column. The table times is not a partitioned table.
Create the materialized view. The following materialized view satisfies requirements for PCT.
CREATE MATERIALIZED VIEW cust_mth_sales_mv BUILD IMMEDIATE REFRESH FAST ON DEMAND ENABLE QUERY REWRITE AS SELECT s.time_id, s.prod_id, SUM(s.quantity_sold), SUM(s.amount_sold), p.prod_name, t.calendar_month_name, COUNT(*), COUNT(s.quantity_sold), COUNT(s.amount_sold) FROM sales s, products p, times t WHERE s.time_id = t.time_id AND s.prod_id = p.prod_id GROUP BY t.calendar_month_name, s.prod_id, p.prod_name, s.time_id;
Run the DBMS_MVIEW.EXPLAIN_MVIEW procedure to determine which tables allow PCT refresh.
MVNAME            CAPABILITY_NAME POSSIBLE RELATED_TEXT MSGTXT
----------------- --------------- -------- ------------ ------------------------------------------
CUST_MTH_SALES_MV PCT             Y        SALES
CUST_MTH_SALES_MV PCT_TABLE       Y        SALES
CUST_MTH_SALES_MV PCT_TABLE       N        PRODUCTS     no partition key or PMARKER in SELECT list
CUST_MTH_SALES_MV PCT_TABLE       N        TIMES        relation is not a partitioned table
As can be seen from the partial sample output from EXPLAIN_MVIEW, any partition maintenance operation performed on the sales table allows PCT fast refresh. However, PCT is not possible after partition maintenance operations or updates to the products table, as there is insufficient information contained in cust_mth_sales_mv for PCT refresh to be possible. Note that the times table is not partitioned and hence can never allow for PCT refresh. Oracle Database applies PCT refresh if it can determine that the materialized view has sufficient information to support PCT for all the updated tables. You can verify which partitions are fresh and stale with views such as DBA_MVIEWS and DBA_MVIEW_DETAIL_PARTITION.
See "Analyzing Materialized View Capabilities" for information on how to use this procedure and also some details regarding PCT-related views.
Suppose at some later point, a SPLIT operation of one partition in the sales table becomes necessary.
ALTER TABLE SALES SPLIT PARTITION month3 AT (TO_DATE('05-02-1998', 'DD-MM-YYYY')) INTO (PARTITION month3_1 TABLESPACE summ, PARTITION month3 TABLESPACE summ);
Insert some data into the sales table.
Fast refresh cust_mth_sales_mv using the DBMS_MVIEW.REFRESH procedure.
EXECUTE DBMS_MVIEW.REFRESH('CUST_MTH_SALES_MV', 'F', '',TRUE,FALSE,0,0,0,FALSE);
Fast refresh automatically performs a PCT refresh as it is the only fast refresh possible in this scenario. However, fast refresh will not occur if a partition maintenance operation occurs when any update has taken place to a table on which PCT is not enabled.
This is shown in "PCT Fast Refresh Scenario 2".
"PCT Fast Refresh Scenario 1" would also be appropriate if the materialized view
was created using the
PMARKERclause as illustrated
in the following:
CREATE MATERIALIZED VIEW cust_sales_marker_mv BUILD IMMEDIATE REFRESH FAST ON DEMAND ENABLE QUERY REWRITE AS SELECT DBMS_MVIEW.PMARKER(s.rowid) s_marker, SUM(s.quantity_sold), SUM(s.amount_sold), p.prod_name, t.calendar_month_name, COUNT(*), COUNT(s.quantity_sold), COUNT(s.amount_sold) FROM sales s, products p, times t WHERE s.time_id = t.time_id AND s.prod_id = p.prod_id GROUP BY DBMS_MVIEW.PMARKER(s.rowid), p.prod_name, t.calendar_month_name;
PCT Fast Refresh Scenario 2
In this scenario, the first three steps are the same as in "PCT Fast Refresh Scenario 1". Then, the SPLIT partition operation to the sales table is performed, but before the materialized view refresh occurs, records are inserted into the times table.
The same as in "PCT Fast Refresh Scenario 1".
The same as in "PCT Fast Refresh Scenario 1".
The same as in "PCT Fast Refresh Scenario 1".
After issuing the same SPLIT operation as shown in "PCT Fast Refresh Scenario 1", some data is inserted into the times table.
ALTER TABLE SALES SPLIT PARTITION month3 AT (TO_DATE('05-02-1998', 'DD-MM-YYYY')) INTO (PARTITION month3_1 TABLESPACE summ, PARTITION month3 TABLESPACE summ);
Refresh cust_mth_sales_mv.
EXECUTE DBMS_MVIEW.REFRESH('CUST_MTH_SALES_MV', 'F', '', TRUE, FALSE, 0, 0, 0, FALSE, FALSE);
ORA-12052: cannot fast refresh materialized view SH.CUST_MTH_SALES_MV
The materialized view is not fast refreshable because DML has occurred to a table on which PCT fast refresh is not possible. To avoid this occurring, Oracle recommends performing a fast refresh immediately after any partition maintenance operation on detail
tables for which partition tracking fast refresh is available.
If the situation in "PCT Fast Refresh Scenario 2" occurs, there are two possibilities: perform a complete refresh or switch to the CONSIDER FRESH option outlined in the following, if suitable. However, it should be noted that CONSIDER FRESH and partition change tracking fast refresh are not compatible. Once the ALTER MATERIALIZED VIEW cust_mth_sales_mv CONSIDER FRESH statement has been issued, PCT refresh is no longer applied to this materialized view until a complete refresh is done. Moreover, you should not use CONSIDER FRESH unless you have taken manual action to ensure that the materialized view is indeed fresh.
A common situation in a data warehouse is the use of rolling windows of data. In this case, the detail table and the materialized view may contain, say, the last 12 months of data. Every month, new data for a month is added to the table and the oldest month is deleted (or perhaps archived). PCT refresh provides a very efficient mechanism to maintain the materialized view in this case.
PCT Fast Refresh Scenario 3
The new data is usually added to the detail table by adding a new partition and exchanging it with a table containing the new data.
ALTER TABLE sales ADD PARTITION month_new ...
ALTER TABLE sales EXCHANGE PARTITION month_new WITH TABLE month_new_table;
Next, the oldest partition is dropped or truncated.
ALTER TABLE sales DROP PARTITION month_oldest;
Now, if the materialized view satisfies all conditions for PCT refresh, you can fast refresh it.
EXECUTE DBMS_MVIEW.REFRESH('CUST_MTH_SALES_MV', 'F', '', TRUE, FALSE, 0, 0, 0, FALSE, FALSE);
Fast refresh will automatically detect that PCT is available and perform a PCT refresh.
Using Partitioning to Improve Data Warehouse Refresh
ETL (Extraction, Transformation and Loading) is done on a scheduled basis to reflect changes made to the original source system. During this step, you physically insert the new, clean data into the production data warehouse schema, and take all of the other steps necessary (such as building
indexes, validating constraints, taking backups) to make this new data available to the end users. Once all of this data has been loaded into the data warehouse, the materialized views have to be updated to reflect the latest data.
The partitioning scheme of the data warehouse is often crucial in determining the efficiency of refresh operations in the data warehouse load process.
In fact, the load process is often the primary consideration in choosing the partitioning scheme of data warehouse tables and indexes.
The partitioning scheme of the largest data warehouse tables (for example, the fact table in a star schema) should be based upon the loading paradigm of the data warehouse.
Most data warehouses are loaded with new data on a regular schedule. For example, every night, week, or month, new data is brought into the data warehouse. The data being loaded at the end of the week or month typically corresponds to the transactions for the
week or month. In this very common scenario, the data warehouse is being loaded by time. This suggests that the data warehouse tables should be partitioned on a date column. In our data warehouse example, suppose the new data is loaded into the sales table every month. Furthermore, the sales table has been partitioned by month. These steps show how the load process proceeds to add the data for a new month (January 2001) to the table sales.
Place the new data into a separate table, sales_01_2001. This data can be directly loaded into sales_01_2001 from outside the data warehouse, or this data can be the result of previous data transformation operations that have already occurred in the data warehouse. sales_01_2001 has the exact same columns, data types, and so forth, as the sales table.
Gather statistics on the sales_01_2001 table.
Create indexes and add constraints on sales_01_2001. Again, the indexes and constraints on sales_01_2001 should be identical to the indexes and constraints on sales. Indexes can be built in parallel and should use the NOLOGGING and the COMPUTE STATISTICS options. For example:
CREATE BITMAP INDEX sales_01_2001_customer_id_bix ON sales_01_2001(customer_id) TABLESPACE sales_idx NOLOGGING PARALLEL 8 COMPUTE STATISTICS;
Apply all constraints to the sales_01_2001 table that are present on the sales table. This includes referential integrity constraints. A typical constraint would be:
ALTER TABLE sales_01_2001 ADD CONSTRAINT sales_customer_id REFERENCES customer(customer_id) ENABLE NOVALIDATE;
If the partitioned table sales has a primary or unique key that is enforced with a global index structure, ensure that the constraint on sales_pk_jan01 is validated without the creation of an index structure, as in the following:
ALTER TABLE sales_01_2001 ADD CONSTRAINT sales_pk_jan01 PRIMARY KEY (sales_transaction_id) DISABLE VALIDATE;
The creation of the constraint with the ENABLE clause would cause the creation of a unique index, which does not match a local index structure of the partitioned table. You must not have any index structure built on the nonpartitioned table to be exchanged for existing global indexes of the partitioned table; the exchange command would fail.
Add the sales_01_2001 table to the sales table.
In order to add this new data to the sales table, you must do two things. First, you must add a new partition to the sales table, using an ALTER TABLE ... ADD PARTITION statement. This adds an empty partition to the sales table:
ALTER TABLE sales ADD PARTITION sales_01_2001 VALUES LESS THAN (TO_DATE('01-FEB-2001', 'DD-MON-YYYY'));
Then, you can add our newly created table to this partition using the EXCHANGE PARTITION operation.
This exchanges the new, empty partition with the newly loaded table.
ALTER TABLE sales EXCHANGE PARTITION sales_01_2001 WITH TABLE sales_01_2001 INCLUDING INDEXES WITHOUT VALIDATION UPDATE GLOBAL INDEXES;
The EXCHANGE operation preserves the indexes and constraints that were already present on the sales_01_2001 table. For unique constraints (such as the unique constraint on sales_transaction_id), you can use the UPDATE GLOBAL INDEXES clause, as shown previously. This automatically maintains your global index structures as part of the partition maintenance operation and keeps them accessible throughout the whole process. If there were only foreign-key constraints, the exchange operation would be instantaneous.
Note that, if you use synchronous refresh, instead of performing Step 3, you must register the sales_01_2001 table using the DBMS_SYNC_REFRESH.REGISTER_PARTITION_OPERATION procedure.
See Chapter 8, "Synchronous Refresh" for more information.
The benefits of this partitioning technique are significant. First, the new data is loaded with minimal resource utilization. The new data is loaded into an entirely separate table, and the index processing and constraint processing are applied only to the
new partition. If the sales table was 50 GB and had 12 partitions, then a new month's worth of data contains approximately four GB. Only the new month's worth of data must be indexed. None of the indexes on the remaining 46 GB of data must be modified at all. This partitioning scheme additionally ensures that the load processing time is directly proportional to the amount of new data being loaded, not to the total size of the sales table.
Second, the new data is loaded with minimal impact on concurrent queries. All of the operations associated with data loading are occurring on a separate sales_01_2001 table. Therefore, none of the existing data or indexes of the sales table is affected during this data refresh process. The sales table and its indexes remain entirely untouched throughout this refresh process.
Third, in case of the existence of any global indexes, those are incrementally maintained as part of the exchange command. This maintenance does not affect the availability of the existing global index structures.
The exchange operation can be viewed as a publishing mechanism. Until the data warehouse administrator exchanges the sales_01_2001 table into the sales table, end users cannot see the new data. Once the exchange has occurred, then any end user query accessing the sales table is immediately able to see the sales_01_2001 data.
Partitioning is useful not only for adding new data but also for removing and archiving data. Many data warehouses maintain a rolling window of data. For example, the data warehouse stores the most recent 36 months of sales data. Just as a new partition can be added to the sales table (as described earlier), an old partition can be quickly (and independently) removed from the sales table. These two benefits (reduced resource utilization and minimal end-user impact) are just as pertinent to removing a partition as they are to adding a partition.
Removing data from a partitioned table does not necessarily mean that the old data is physically deleted from the database. There are two alternatives for removing old data from a partitioned table. First, you can physically delete all data from the database
by dropping the partition containing the old data, thus freeing the allocated space:
ALTER TABLE sales DROP PARTITION sales_01_1998;
Also, you can exchange the old partition with an empty table of the same structure; this empty table is created equivalent to steps 1 and 2 described in the load process. Assuming the new empty table stub is named sales_archive_01_1998, the following SQL statement empties partition sales_01_1998:
ALTER TABLE sales EXCHANGE PARTITION sales_01_1998 WITH TABLE sales_archive_01_1998 INCLUDING INDEXES WITHOUT VALIDATION UPDATE GLOBAL INDEXES;
Note that the old data still exists as the exchanged, nonpartitioned table sales_archive_01_1998. If the partitioned table was set up in a way that every partition is stored in a separate tablespace, you can archive (or transport) this table using Oracle Database's transportable tablespace framework before dropping the actual data (the tablespace). See "Transportation Using Transportable Tablespaces" for further details regarding transportable tablespaces.
In some situations, you might not want to drop the old data immediately, but keep it as part of the partitioned table; although the data is no longer of main interest, there are still potential queries accessing this old, read-only data. You can use Oracle's
data compression to minimize the space usage of the old data. The following scenario assumes that at least one compressed partition is already part of the partitioned table.
See Also:
Oracle Database Administrator's Guide for more information regarding table compression
Oracle Database VLDB and Partitioning Guide for more information regarding partitioning and table compression
Refresh Scenarios
A typical scenario might not only need to compress old data, but also to merge several old partitions to reflect the granularity for a later backup of several merged partitions. Let us assume that the backup (partition) granularity is on a quarterly basis for any quarter where the oldest month is more than 36 months behind the most recent month. In this case, you are therefore compressing and merging sales_01_1998, sales_02_1998, and sales_03_1998 into a new, compressed partition sales_q1_1998.
Create the new merged partition in parallel in another tablespace. The partition is compressed as part of the MERGE operation:
ALTER TABLE sales MERGE PARTITIONS sales_01_1998, sales_02_1998, sales_03_1998 INTO PARTITION sales_q1_1998 TABLESPACE archive_q1_1998 COMPRESS UPDATE GLOBAL INDEXES PARALLEL 4;
The partition MERGE operation invalidates the local indexes for the new merged partition. You therefore have to rebuild them:
ALTER TABLE sales MODIFY PARTITION sales_q1_1998 REBUILD UNUSABLE LOCAL INDEXES;
Alternatively, you can choose to create the new compressed table outside the partitioned table and exchange it back. The performance and the temporary space consumption are identical for both methods:
Create an intermediate table to hold the new merged information. The following statement inherits all NOT NULL constraints from the original table by default:
CREATE TABLE sales_q1_1998_out TABLESPACE archive_q1_1998 NOLOGGING COMPRESS PARALLEL 4 AS SELECT * FROM sales WHERE time_id >= TO_DATE('01-JAN-1998','dd-mon-yyyy') AND time_id < TO_DATE('01-APR-1998','dd-mon-yyyy');
Create the equivalent index structure for table sales_q1_1998_out as for the existing table sales.
Prepare the existing table sales for the exchange with the new compressed table sales_q1_1998_out. Because the table to be exchanged contains data actually covered by three partitions, you have to create one matching partition with the range boundaries you are looking for. You simply have to drop two of the existing partitions. Note that you have to drop the lower two partitions, sales_01_1998 and sales_02_1998; the lower boundary of a range partition is always defined by the upper (exclusive) boundary of the previous partition:
ALTER TABLE sales DROP PARTITION sales_01_1998;
ALTER TABLE sales DROP PARTITION sales_02_1998;
You can now exchange table sales_q1_1998_out with partition sales_03_1998. Unlike what the name of the partition suggests, its boundaries now cover Q1-1998.
ALTER TABLE sales EXCHANGE PARTITION sales_03_1998
  WITH TABLE sales_q1_1998_out
  INCLUDING INDEXES WITHOUT VALIDATION UPDATE GLOBAL INDEXES;
Both methods apply to slightly different business scenarios. The MERGE PARTITION approach invalidates the local index structures for the affected partition, but it keeps all data accessible at all times. Any attempt to access the affected partition through one of the unusable index structures raises an error. The window of limited availability is approximately the time needed to re-create the local bitmap index structures. In most cases this can be neglected, because this part of the partitioned table should not be accessed often.
The CTAS approach, however, minimizes the unavailability of any index structures to close to zero, but there is a specific time window during which the partitioned table does not contain all the data, because you dropped two partitions. The window of limited availability is approximately the time needed to exchange the table. Depending on the existence and number of global indexes, this time window varies. Without any global indexes, it is a fraction of a second to a few seconds.
These examples are a simplification of the data warehouse rolling window load scenario. Real-world data warehouse refresh characteristics are always more complex. However, the advantages of this rolling window approach are not diminished in more complex scenarios.
Note that before you add single or multiple compressed partitions to a partitioned table for the first time, all local bitmap indexes must be either dropped or marked unusable. After the first compressed partition is added, no additional actions are necessary for any subsequent operations involving compressed partitions, regardless of how the compressed partitions are added to the partitioned table.
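This preparation step can be sketched as follows. This is a hedged sketch, not taken from the original example set: the index name sales_prod_bix and the new partition's boundary are assumptions for illustration.

```sql
-- sales_prod_bix is an assumed local bitmap index on sales;
-- mark it unusable before the first compressed partition is added.
ALTER INDEX sales_prod_bix UNUSABLE;

-- Add the first compressed partition (assumed boundary value):
ALTER TABLE sales ADD PARTITION sales_q2_1998
  VALUES LESS THAN (TO_DATE('01-JUL-1998','dd-mon-yyyy'))
  COMPRESS;

-- Rebuild the unusable local index partitions afterwards:
ALTER TABLE sales MODIFY PARTITION sales_q2_1998
  REBUILD UNUSABLE LOCAL INDEXES;
```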
See Also:
Oracle Database VLDB and Partitioning Guide for more information about partitioning and table compression
Oracle Database Administrator's Guide for further details about partitioning and table compression
Scenarios for Using Partitioning for Refreshing Data Warehouses
This section describes the following two typical scenarios in which partitioning is used with refresh:
Refresh Scenario 1
Refresh Scenario 2
Refresh Scenario 1
Data is loaded daily. However, the data warehouse contains two years of data, so partitioning by day might not be desired.
The solution is to partition by week or month (as appropriate). Use INSERT to add the new data to an existing partition. The INSERT operation only affects a single partition, so the benefits described previously remain intact. The INSERT operation could occur while the partition remains a part of the table. Inserts into a single partition can be parallelized:
INSERT /*+ APPEND */ INTO sales PARTITION (sales_01_2001)
  SELECT * FROM new_sales;
The indexes of this sales partition are maintained in parallel as well. An alternative is to use the EXCHANGE operation. You can do this by exchanging the sales_01_2001 partition of the sales table and then using an INSERT operation. You might prefer this technique when dropping and rebuilding indexes is more efficient than maintaining them.
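The EXCHANGE alternative can be sketched as follows. This is a hedged sketch: the staging table name sales_01_2001_tmp is an assumption, and it is assumed to have been created with the same column structure as sales.

```sql
-- Load the new rows into a non-partitioned staging table
-- (sales_01_2001_tmp is an assumed name, created like sales):
INSERT /*+ APPEND */ INTO sales_01_2001_tmp
  SELECT * FROM new_sales;

-- Build indexes on the staging table, then swap it into the
-- partitioned table; no data movement is required:
ALTER TABLE sales EXCHANGE PARTITION sales_01_2001
  WITH TABLE sales_01_2001_tmp
  INCLUDING INDEXES WITHOUT VALIDATION;
```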
Refresh Scenario 2
New data feeds, although consisting primarily of data for the most recent day, week, and month, also contain some data from previous time periods.
Solution 1
Use parallel SQL operations (such as CREATE TABLE ... AS SELECT) to separate the new data from the data in previous time periods. Process the old data separately using other techniques.
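The separation step can be sketched as two parallel CTAS operations. This is a hedged sketch: the staging table new_feed, the target table names, and the cutoff date are assumptions for illustration.

```sql
-- Split the incoming feed (assumed staging table new_feed) into
-- current-period rows and older rows, in parallel:
CREATE TABLE new_sales_current PARALLEL 4 NOLOGGING AS
  SELECT * FROM new_feed
  WHERE time_id >= TO_DATE('01-JAN-2001','dd-mon-yyyy');

CREATE TABLE new_sales_history PARALLEL 4 NOLOGGING AS
  SELECT * FROM new_feed
  WHERE time_id < TO_DATE('01-JAN-2001','dd-mon-yyyy');
```

The current-period rows can then be loaded with the single-partition INSERT or EXCHANGE techniques described earlier, while the historical rows are processed separately.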
New data feeds are not solely time based. You can also feed new data into a data warehouse with data from multiple operational systems on a business-need basis. For example, the sales data from direct channels may come into the data warehouse separately from the data from indirect channels. For business reasons, it may furthermore make sense to keep the direct and indirect data in separate partitions.
Solution 2
Oracle supports composite range-list partitioning. The primary partitioning strategy of the sales table could be range partitioning based on time_id, as shown in the example. The subpartitioning, however, is a list based on the channel attribute. Each subpartition can then be loaded independently of the others (for each distinct channel) and added in a rolling window operation as discussed before. This partitioning strategy addresses the business needs in an optimal manner.
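Such a composite range-list table can be sketched as follows. This is a hedged sketch: the table name sales_rl, the column definitions, the channel codes, and the partition boundaries are all assumptions for illustration.

```sql
-- Range partitioning on time_id, list subpartitioning on channel_id
-- (all names and values are illustrative assumptions):
CREATE TABLE sales_rl (
  time_id    DATE,
  channel_id CHAR(1),
  amount     NUMBER
)
PARTITION BY RANGE (time_id)
SUBPARTITION BY LIST (channel_id)
SUBPARTITION TEMPLATE (
  SUBPARTITION sp_direct   VALUES ('D'),
  SUBPARTITION sp_indirect VALUES ('I')
)
(
  PARTITION sales_q1_2001 VALUES LESS THAN
    (TO_DATE('01-APR-2001','dd-mon-yyyy')),
  PARTITION sales_q2_2001 VALUES LESS THAN
    (TO_DATE('01-JUL-2001','dd-mon-yyyy'))
);
```

Each (quarter, channel) subpartition can then be loaded and exchanged independently.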
Optimizing DML Operations During Refresh
You can optimize DML performance through the following techniques:
Implementing an Efficient MERGE Operation
Maintaining Referential Integrity
Purging Data
Implementing an Efficient MERGE Operation
Commonly, the data that is extracted from a source system is not simply a list of new records that need to be inserted into the data warehouse. Instead, this new data set is a combination of new records and modified records. For example, suppose that most of the data extracted from the OLTP systems consists of new sales transactions. These records are inserted into the warehouse's sales table, but some records may reflect modifications of previous transactions, such as returned merchandise or transactions that were incomplete or incorrect when initially loaded into the data warehouse. These records require updates to the sales table.
As a typical scenario, suppose that there is a table called new_sales that contains both inserts and updates that are applied to the sales table. When the entire data warehouse load process was designed, it was determined that the new_sales table would contain records with the following semantics:
If a given sales_transaction_id of a record in new_sales already exists in sales, then update the sales table by adding the sales_dollar_amount and sales_quantity_sold values from the new_sales table to the existing row in the sales table.
Otherwise, insert the entire new record from the new_sales table into the sales table.
This UPDATE-ELSE-INSERT operation is often called a merge. A merge can be executed using one SQL statement.
Example 7-5 MERGE Operation
MERGE INTO sales s USING new_sales n
ON (s.sales_transaction_id = n.sales_transaction_id)
WHEN MATCHED THEN UPDATE SET
  s.sales_quantity_sold = s.sales_quantity_sold + n.sales_quantity_sold,
  s.sales_dollar_amount = s.sales_dollar_amount + n.sales_dollar_amount
WHEN NOT MATCHED THEN
  INSERT (sales_transaction_id, sales_quantity_sold, sales_dollar_amount)
  VALUES (n.sales_transaction_id, n.sales_quantity_sold, n.sales_dollar_amount);
In addition to using the MERGE statement for unconditional UPDATE ELSE INSERT functionality into a target table, you can also use it to:
Perform an UPDATE-only or INSERT-only statement.
Apply additional WHERE conditions for the UPDATE or INSERT portion of the MERGE statement.
The UPDATE operation can even delete rows if a specific condition yields true.
Example 7-6 Omitting the INSERT Clause
In some data warehouse applications, adding new rows to historical information is not allowed; only updates to existing rows are permitted. It may also happen that you do not want to update existing rows but only insert new information. The following example demonstrates UPDATE-only functionality:
MERGE INTO Products D1             -- Destination table 1
USING Product_Changes S            -- Source/Delta table
ON (D1.PROD_ID = S.PROD_ID)        -- Search/Join condition
WHEN MATCHED THEN UPDATE           -- update if join
  SET D1.PROD_STATUS = S.PROD_NEW_STATUS;
Example 7-7 Omitting the UPDATE Clause
The following statement illustrates an example of omitting an UPDATE clause:
MERGE INTO Products D2             -- Destination table 2
USING New_Product S                -- Source/Delta table
ON (D2.PROD_ID = S.PROD_ID)        -- Search/Join condition
WHEN NOT MATCHED THEN              -- insert if no join
  INSERT (PROD_ID, PROD_STATUS)
  VALUES (S.PROD_ID, S.PROD_NEW_STATUS);
When the INSERT clause is omitted, Oracle Database performs a regular join of the source and the target tables. When the UPDATE clause is omitted, Oracle Database performs an antijoin of the source and the target tables. This makes the join between the source and target table more efficient.
Example 7-8 Skipping the UPDATE Clause
In some situations, you may want to skip the UPDATE operation when merging a given row into the table. In this case, you can use an optional WHERE clause in the UPDATE clause of the MERGE. As a result, the UPDATE operation only executes when a given condition is true. The following statement illustrates an example of skipping the UPDATE operation:
MERGE INTO Products P              -- Destination table 1
USING Product_Changes S            -- Source/Delta table
ON (P.PROD_ID = S.PROD_ID)         -- Search/Join condition
WHEN MATCHED THEN UPDATE           -- update if join
  SET P.PROD_LIST_PRICE = S.PROD_NEW_PRICE
  WHERE P.PROD_STATUS <> 'OBSOLETE';  -- Conditional UPDATE
This shows how the UPDATE operation would be skipped if the condition P.PROD_STATUS <> 'OBSOLETE' is not true. The condition predicate can refer to both the target and the source table.
Example 7-9 Conditional Inserts with MERGE Statements
You may want to skip the INSERT operation when merging a given row into the table. To do so, add an optional WHERE clause to the INSERT clause of the MERGE. As a result, the INSERT operation only executes when a given condition is true. The following statement offers an example:
MERGE INTO Products P              -- Destination table 1
USING Product_Changes S            -- Source/Delta table
ON (P.PROD_ID = S.PROD_ID)         -- Search/Join condition
WHEN MATCHED THEN UPDATE           -- update if join
  SET P.PROD_LIST_PRICE = S.PROD_NEW_PRICE
  WHERE P.PROD_STATUS <> 'OBSOLETE'  -- Conditional UPDATE
WHEN NOT MATCHED THEN              -- insert if no join
  INSERT (PROD_ID, PROD_STATUS, PROD_LIST_PRICE)
  VALUES (S.PROD_ID, S.PROD_NEW_STATUS, S.PROD_NEW_PRICE)
  WHERE S.PROD_STATUS <> 'OBSOLETE';  -- Conditional INSERT
This example shows that the INSERT operation would be skipped if the condition S.PROD_STATUS <> 'OBSOLETE' is not true; the INSERT only occurs when the condition is true. In the INSERT clause, the condition predicate can refer only to the source table.
Example 7-10 Using the DELETE Clause with MERGE Statements
You may want to cleanse tables while populating or updating them. To do this, consider using the DELETE clause in a MERGE statement, as in the following example:
MERGE INTO Products D
USING Product_Changes S
ON (D.PROD_ID = S.PROD_ID)
WHEN MATCHED THEN UPDATE
  SET D.PROD_LIST_PRICE = S.PROD_NEW_PRICE,
      D.PROD_STATUS = S.PROD_NEW_STATUS
  DELETE WHERE (D.PROD_STATUS = 'OBSOLETE')
WHEN NOT MATCHED THEN
  INSERT (PROD_ID, PROD_LIST_PRICE, PROD_STATUS)
  VALUES (S.PROD_ID, S.PROD_NEW_PRICE, S.PROD_NEW_STATUS);
Thus when a row is updated in Products, Oracle checks the delete condition D.PROD_STATUS = 'OBSOLETE' and deletes the row if the condition yields true.
The DELETE operation is not the same as that of a complete DELETE statement. Only rows in the destination table of the MERGE can be deleted, and the only rows affected by the DELETE are those that are updated by this MERGE statement. Thus, even if a given row of the destination table meets the delete condition, it is not deleted if it does not join under the ON clause condition.
Example 7-11 Unconditional Inserts with MERGE Statements
You may want to insert all of the source rows into a table. In this case, the join between the source and target table can be avoided. By specifying a constant join condition that always evaluates to FALSE, for example 1=0, such MERGE statements are optimized and the join condition is suppressed.
MERGE INTO Products P              -- Destination table 1
USING New_Product S                -- Source/Delta table
ON (1 = 0)                         -- Search/Join condition
WHEN NOT MATCHED THEN              -- insert if no join
  INSERT (PROD_ID, PROD_STATUS)
  VALUES (S.PROD_ID, S.PROD_NEW_STATUS);
Maintaining Referential Integrity
In some data warehousing environments, you might want to insert new data into tables in order to guarantee referential integrity. For example, a data warehouse may derive sales from an operational system that retrieves data directly from cash registers. sales is refreshed nightly. However, the data for the product dimension table may be derived from a separate operational system. The product dimension table may only be refreshed once each week, because the product table changes relatively slowly. If a new product was introduced on Monday, then it is possible for that product's product_id to appear in the sales data of the data warehouse before that product_id has been inserted into the data warehouse's product table.
Although the sales transactions of the new product may be valid, this sales data does not satisfy the referential integrity constraint between the product dimension table and the sales fact table. Rather than disallow the new sales transactions, you might choose to insert the sales transactions into the sales table. However, you might also wish to maintain the referential integrity relationship between the sales and product tables. This can be accomplished by inserting new rows into the product table as placeholders for the unknown products.
As in previous examples, assume that the new data for the sales table is staged in a separate table, new_sales. Using a single INSERT statement (which can be parallelized), the product table can be altered to reflect the new products:
INSERT INTO product
  (SELECT sales_product_id, 'Unknown Product Name', NULL, NULL ...
   FROM new_sales
   WHERE sales_product_id NOT IN (SELECT product_id FROM product));
Purging Data
Occasionally, it is necessary to remove large amounts of data from a data warehouse. A very common scenario is the rolling window discussed previously, in which older data is rolled out of the data warehouse to make room for new data.
However, sometimes other data might need to be removed from a data warehouse. Suppose that a retail company has previously sold products from XYZ Software, and that XYZ Software has subsequently gone out of business. The business users of the warehouse may decide that they are no longer interested in seeing any data related to XYZ Software, so this data should be deleted.
One approach to removing a large volume of data is to use parallel delete as shown in the following statement:
DELETE FROM sales
WHERE sales_product_id IN
  (SELECT product_id FROM product WHERE product_category = 'XYZ Software');
This SQL statement spawns one parallel process for each partition. This approach is much more efficient than a series of DELETE statements, and none of the data in the sales table needs to be moved. However, this approach also has some disadvantages. When removing a large percentage of rows, the DELETE statement leaves many empty row slots in the existing partitions. If new data is being loaded using a rolling window technique (or is being loaded using direct-path INSERT or load), then this storage space is not reclaimed. Moreover, even though the DELETE statement is parallelized, there might be more efficient methods. An alternative method is to re-create the entire sales table, keeping the data for all product categories except XYZ Software.
CREATE TABLE sales2 NOLOGGING PARALLEL (DEGREE 8)
  -- PARTITION ... (partitioning clause omitted)
  AS SELECT sales.* FROM sales, product
  WHERE sales.sales_product_id = product.product_id
    AND product_category <> 'XYZ Software';
-- create indexes, constraints, and so on
DROP TABLE sales;
RENAME sales2 TO sales;
This approach may be more efficient than a parallel delete. However, it is also costly in terms of disk space, because the sales table must effectively be instantiated twice.
An alternative method that uses less space is to re-create the sales table one partition at a time:
CREATE TABLE sales_temp AS SELECT * FROM sales WHERE 1=0;

INSERT INTO sales_temp
  SELECT sales.* FROM sales PARTITION (sales_99jan), product
  WHERE sales.sales_product_id = product.product_id
    AND product_category <> 'XYZ Software';

<create appropriate indexes and constraints on sales_temp>

ALTER TABLE sales EXCHANGE PARTITION sales_99jan WITH TABLE sales_temp;
Continue this process for each partition in the sales table.
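The per-partition loop can be sketched in dynamic PL/SQL. This is a hedged sketch, not part of the original example: it assumes the sales_temp staging pattern above, reads partition names from the USER_TAB_PARTITIONS dictionary view, and omits the index and constraint maintenance that a real load would need.

```sql
BEGIN
  FOR p IN (SELECT partition_name
            FROM user_tab_partitions
            WHERE table_name = 'SALES')
  LOOP
    -- Re-create the staging table empty for each partition:
    EXECUTE IMMEDIATE
      'CREATE TABLE sales_temp AS SELECT * FROM sales WHERE 1=0';
    -- Copy everything except the XYZ Software rows:
    EXECUTE IMMEDIATE
      'INSERT INTO sales_temp
         SELECT sales.* FROM sales PARTITION (' || p.partition_name || '), product
         WHERE sales.sales_product_id = product.product_id
           AND product_category <> ''XYZ Software''';
    -- <create appropriate indexes and constraints on sales_temp here>
    EXECUTE IMMEDIATE
      'ALTER TABLE sales EXCHANGE PARTITION ' || p.partition_name ||
      ' WITH TABLE sales_temp';
    EXECUTE IMMEDIATE 'DROP TABLE sales_temp';
  END LOOP;
END;
/
```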