Best Practices from Oracle Development's A‑Team

Partitioned Repository for WebCenter Content using Oracle Database 11g

Introduction

One of the biggest challenges for content management solutions is storage management, driven by the high volumes that come with the unstoppable growth of information.

Even if you have storage appliances and many terabytes available, tasks such as backup, compression, deduplication, storage relocation, encryption, and availability can become a nightmare.

Main Article

A standard option with Oracle WebCenter Content is to store content in the database. Oracle Database then lets you leverage features such as compression, deduplication, encryption, and seamless backup.

But with huge volumes, the challenge of keeping the WebCenter Content database up and running is passed to the DBA.

One solution is to use database partitions for your content repository. But what are the implications of this? Can you fit it to your business requirements?

Well, yes. It’s up to you how you manage that; you just need a good plan. While brainstorming your storage plan, ask yourself: Do you need to store petabytes of documents? Do you need everything online? Is there a way to logically separate the “good content” from the “legacy content”?

The first thing that probably comes to mind is to use the creation date of the document, but remember that a document can receive many revisions. You may prefer to consider the revision creation date instead. Your plan can also include more complex rules, such as partitioning per document type, per custom metadata field such as department, or a hybrid of date, DocType, and a specific virtual folder.

Taking this further, you can distribute your repository across different servers, different disks, and different disk types (such as SSD, SAS, SATA, tape, …), separated according to your business requirements. This separates the “hot” content from the legacy content and makes it easier to meet your compliance requirements.

If you plan to partition by revision, the simplest way is to use either the dId, the sequential unique ID assigned to every content revision created in WebCenter Content, or the dLastModified, the date column of the FileStorage table that records when the content was added to the database table using SecureFiles.

Using the scenario of a partitioned repository with hierarchical separation by date, we will transform the FileStorage table into a partitioned table using “Partition by Range” on the dLastModified column. (You can use the dId instead, or a join with other tables for other metadata such as dDocType, Security, etc.)
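As a rough illustration of the dId alternative mentioned above, the sketch below range-partitions a demo table on the numeric ID and uses a numeric interval. The table name FILESTORAGE_BYDID_DEMO, the partition name, and the interval of 100000 IDs per partition are assumptions made for illustration only; they are not part of the test scenario.

```sql
-- Hypothetical demo table: same columns as the FileStorage interim table
-- created later in this article, but partitioned on DID instead of
-- DLASTMODIFIED.
CREATE TABLE FILESTORAGE_BYDID_DEMO
  (
    DID           NUMBER(*,0) NOT NULL,
    DRENDITIONID  VARCHAR2(30 CHAR) NOT NULL,
    DLASTMODIFIED TIMESTAMP (6),
    DFILESIZE     NUMBER(*,0),
    DISDELETED    VARCHAR2(1 CHAR),
    BFILEDATA     BLOB
  )
  LOB (BFILEDATA) STORE AS SECUREFILE
  PARTITION BY RANGE (DID)
  INTERVAL (100000)   -- assumed value: one new partition per 100000 IDs
  (
    -- At least one fixed range partition is required below the interval range.
    PARTITION FILESTORAGE_DID_LEGACY VALUES LESS THAN (100000)
  );
```

Because dId grows with every new revision, a layout like this keeps older revisions together in the lower partitions, much like the date-based layout does.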

The test scenario below covers:

  • Previous data on the JDBC Storage to be migrated to the new partitioned JDBC Storage
  • Partition by Date
  • Automatic generation of new partitions based on a pre-defined interval (Available only with Oracle Database 11g+)
  • Deduplication and Compression for legacy data
  • Oracle WebCenter Content 11g PS5/PS6+ (If PS5, use MLR13+. PS6+ - 11.1.1.7+ recommended)

For the test case you need some data already stored using JDBC Storage to act as the “legacy” data. If you have not done this before, just create a storage rule pointing to the JDBC Storage:

File Store Providers

Enable the metadata StorageRule in the UI and upload some documents using this rule.

You can run this test case as the schema owner or as a DBA user. We will use the schema owner TESTS_OCS.

This is just a test, and a proper backup of your environment is highly recommended.

When you use the schema owner, you need some privileges. Use the dba user to grant the privileges needed:

REM Grant privileges required for online redefinition.
GRANT EXECUTE ON DBMS_REDEFINITION TO TESTS_OCS;
GRANT ALTER ANY TABLE TO TESTS_OCS;
GRANT DROP ANY TABLE TO TESTS_OCS;
GRANT LOCK ANY TABLE TO TESTS_OCS;
GRANT CREATE ANY TABLE TO TESTS_OCS;
GRANT SELECT ANY TABLE TO TESTS_OCS;

REM Privileges required to perform cloning of dependent objects.
GRANT CREATE ANY TRIGGER TO TESTS_OCS;
GRANT CREATE ANY INDEX TO TESTS_OCS;

In our test scenario we will separate the content into Legacy, Day1, Day2, Day3, and Future. The last one will be partitioned automatically using three tablespaces in round-robin mode. In a real scenario, the partition rule could be per month, per year, or any other rule that you choose.

Tablespaces for the test scenario:

CREATE TABLESPACE TESTS_OCS_PART_LEGACY DATAFILE 'tests_ocs_part_legacy.dat' SIZE 500K AUTOEXTEND ON NEXT 500K MAXSIZE UNLIMITED;
CREATE TABLESPACE TESTS_OCS_PART_DAY1 DATAFILE 'tests_ocs_part_day1.dat' SIZE 500K AUTOEXTEND ON NEXT 500K MAXSIZE UNLIMITED;
CREATE TABLESPACE TESTS_OCS_PART_DAY2 DATAFILE 'tests_ocs_part_day2.dat' SIZE 500K AUTOEXTEND ON NEXT 500K MAXSIZE UNLIMITED;
CREATE TABLESPACE TESTS_OCS_PART_DAY3 DATAFILE 'tests_ocs_part_day3.dat' SIZE 500K AUTOEXTEND ON NEXT 500K MAXSIZE UNLIMITED;
CREATE TABLESPACE TESTS_OCS_PART_ROUND_ROBIN_A DATAFILE 'tests_ocs_part_round_robin_a.dat' SIZE 500K AUTOEXTEND ON NEXT 500K MAXSIZE UNLIMITED;
CREATE TABLESPACE TESTS_OCS_PART_ROUND_ROBIN_B DATAFILE 'tests_ocs_part_round_robin_b.dat' SIZE 500K AUTOEXTEND ON NEXT 500K MAXSIZE UNLIMITED;
CREATE TABLESPACE TESTS_OCS_PART_ROUND_ROBIN_C DATAFILE 'tests_ocs_part_round_robin_c.dat' SIZE 500K AUTOEXTEND ON NEXT 500K MAXSIZE UNLIMITED;

Before you start, gather optimizer statistics on the current FileStorage table:

EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'FileStorage', cascade => TRUE);

Now check whether the redefinition process can be executed:

EXEC DBMS_REDEFINITION.CAN_REDEF_TABLE('TESTS_OCS', 'FileStorage', DBMS_REDEFINITION.CONS_USE_PK);

If no error messages are returned, you are good to go.
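The CONS_USE_PK option relies on the table having a primary key. A quick way to confirm this before starting, assuming you are connected as the schema owner, might look like the following sketch:

```sql
-- List the primary key constraint of FileStorage and its columns.
SELECT c.constraint_name, cc.column_name, cc.position
FROM   user_constraints  c
JOIN   user_cons_columns cc ON cc.constraint_name = c.constraint_name
WHERE  c.table_name      = 'FILESTORAGE'
AND    c.constraint_type = 'P'
ORDER  BY cc.position;
```

If the query returns no rows, DBMS_REDEFINITION.CONS_USE_ROWID is the alternative option flag to evaluate. That path is not covered by this test scenario.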

Create a Partitioned Interim FileStorage table.

You need to create a new table with the partition information to act as an interim table:

CREATE TABLE FILESTORAGE_Part
  (
    DID           NUMBER(*,0) NOT NULL ENABLE,
    DRENDITIONID  VARCHAR2(30 CHAR) NOT NULL ENABLE,
    DLASTMODIFIED TIMESTAMP (6),
    DFILESIZE     NUMBER(*,0),
    DISDELETED    VARCHAR2(1 CHAR),
    BFILEDATA     BLOB
  )
  LOB (BFILEDATA) STORE AS SECUREFILE
    (
      ENABLE STORAGE IN ROW
      NOCACHE LOGGING
      KEEP_DUPLICATES
      NOCOMPRESS
    )
  PARTITION BY RANGE (DLASTMODIFIED)
  INTERVAL (NUMTODSINTERVAL(1,'DAY'))
  STORE IN (TESTS_OCS_PART_ROUND_ROBIN_A, TESTS_OCS_PART_ROUND_ROBIN_B, TESTS_OCS_PART_ROUND_ROBIN_C)
  (
    PARTITION FILESTORAGE_PART_LEGACY VALUES LESS THAN (TO_DATE('05-APR-2012 12.00.00 AM', 'DD-MON-YYYY HH.MI.SS AM'))
      TABLESPACE TESTS_OCS_PART_LEGACY
      LOB (BFILEDATA) STORE AS SECUREFILE
        ( TABLESPACE TESTS_OCS_PART_LEGACY
          RETENTION NONE
          DEDUPLICATE
          COMPRESS HIGH
        ),
    PARTITION FILESTORAGE_PART_DAY1 VALUES LESS THAN (TO_DATE('06-APR-2012 07.25.00 PM', 'DD-MON-YYYY HH.MI.SS AM'))
      TABLESPACE TESTS_OCS_PART_DAY1
      LOB (BFILEDATA) STORE AS SECUREFILE
        ( TABLESPACE TESTS_OCS_PART_DAY1
          RETENTION AUTO
          KEEP_DUPLICATES
          COMPRESS
        ),
    PARTITION FILESTORAGE_PART_DAY2 VALUES LESS THAN (TO_DATE('06-APR-2012 07.55.00 PM', 'DD-MON-YYYY HH.MI.SS AM'))
      TABLESPACE TESTS_OCS_PART_DAY2
      LOB (BFILEDATA) STORE AS SECUREFILE
        ( TABLESPACE TESTS_OCS_PART_DAY2
          RETENTION AUTO
          KEEP_DUPLICATES
          NOCOMPRESS
        ),
    PARTITION FILESTORAGE_PART_DAY3 VALUES LESS THAN (TO_DATE('06-APR-2012 07.58.00 PM', 'DD-MON-YYYY HH.MI.SS AM'))
      TABLESPACE TESTS_OCS_PART_DAY3
      LOB (BFILEDATA) STORE AS SECUREFILE
        ( TABLESPACE TESTS_OCS_PART_DAY3
          RETENTION AUTO
          KEEP_DUPLICATES
          NOCOMPRESS
        )
  );

After the creation, you will see the defined partitions.

Defined Partitions

Note that only the fixed range partitions have been created; none of the interval partitions have been created.
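If you prefer a query to a GUI view, a sketch like the one below lists the partitions of the interim table from the data dictionary. It assumes you are connected as the schema owner. The INTERVAL column shows NO for the fixed range partitions and YES for partitions that Oracle creates automatically later.

```sql
-- Show each partition, its tablespace, whether it was created by the
-- interval mechanism, and its upper boundary.
SELECT partition_name, tablespace_name, interval, high_value
FROM   user_tab_partitions
WHERE  table_name = 'FILESTORAGE_PART'
ORDER  BY partition_position;
```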

Start the redefinition process:

BEGIN
    DBMS_REDEFINITION.START_REDEF_TABLE(
         uname => 'TESTS_OCS'
        ,orig_table => 'FileStorage'
        ,int_table => 'FileStorage_PART'
        ,col_mapping => NULL
        ,options_flag => DBMS_REDEFINITION.CONS_USE_PK
    );
END;

This operation can take some time to complete, depending on how much content you have and on the size of the table.

Using the DBA user, you can check the progress with this command (replace 1 with the SID of the session that is running the redefinition):

SELECT * FROM v$sesstat WHERE sid = 1;
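As a complementary check, long-running operations are often visible in V$SESSION_LONGOPS. The query below is a sketch only; whether the redefinition steps appear there depends on how the database executes the copy.

```sql
-- Operations still in progress, with a rough completion percentage.
SELECT sid, opname, target, sofar, totalwork,
       ROUND(sofar / totalwork * 100, 1) AS pct_done,
       time_remaining
FROM   v$session_longops
WHERE  totalwork > 0
AND    sofar < totalwork
ORDER  BY start_time;
```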

Copy dependent objects:

DECLARE
    redefinition_errors PLS_INTEGER := 0;
BEGIN
    DBMS_REDEFINITION.COPY_TABLE_DEPENDENTS(
         uname => 'TESTS_OCS'
        ,orig_table => 'FileStorage'
        ,int_table => 'FileStorage_PART'
        ,copy_indexes => DBMS_REDEFINITION.CONS_ORIG_PARAMS
        ,copy_triggers => TRUE
        ,copy_constraints => TRUE
        ,copy_privileges => TRUE
        ,ignore_errors => TRUE
        ,num_errors => redefinition_errors
        ,copy_statistics => FALSE
        ,copy_mvlog => FALSE
    );
    IF (redefinition_errors > 0) THEN
        DBMS_OUTPUT.PUT_LINE('>>> FileStorage to FileStorage_PART temp copy Errors: ' || TO_CHAR(redefinition_errors));
    END IF;
END;

With the DBA user, verify that there are no errors:

SELECT object_name, base_table_name, ddl_txt FROM DBA_REDEFINITION_ERRORS;

Two rows related to the constraints will be shown; this is expected.
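If the errors reported at this stage are not the expected ones, the redefinition can be cancelled before it is finished. The block below is a minimal sketch using the same schema and table names as the rest of this test; after aborting, the interim table can be dropped and recreated for another attempt.

```sql
-- Cancel an in-progress online redefinition and release its resources.
BEGIN
    DBMS_REDEFINITION.ABORT_REDEF_TABLE(
         uname      => 'TESTS_OCS'
        ,orig_table => 'FileStorage'
        ,int_table  => 'FileStorage_PART'
    );
END;
/
```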

Synchronize the interim table FileStorage_PART:

BEGIN
  DBMS_REDEFINITION.SYNC_INTERIM_TABLE(
    uname      => 'TESTS_OCS',
    orig_table => 'FileStorage',
    int_table  => 'FileStorage_PART');
END;

Gather statistics on the new table:

EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'FileStorage_PART', cascade => TRUE);

Complete the redefinition:

BEGIN
  DBMS_REDEFINITION.FINISH_REDEF_TABLE(
    uname      => 'TESTS_OCS',
    orig_table => 'FileStorage',
    int_table  => 'FileStorage_PART');
END;

During the execution the FileStorage table is locked in exclusive mode until the operation is finished.

After the last command, the FileStorage table is partitioned.

If you have content that falls outside the fixed range partitions, you should see new partitions created automatically, so no error is raised if you “forgot” to create all the future ranges. You will see something like:

Partitions with Legacy partition

You can now drop the FileStorage_PART table:

DROP TABLE FileStorage_PART PURGE;

To check that the FileStorage table is valid and partitioned, use the command:

SELECT num_rows, partitioned
FROM   user_tables
WHERE  table_name = 'FILESTORAGE';

You can list the contents of the FileStorage table in a specific partition, for example:

SELECT * FROM FileStorage PARTITION (FILESTORAGE_PART_LEGACY);

Some useful commands that you can use to check the partitions (note that you need to run them as a DBA user):

SELECT * FROM DBA_TAB_PARTITIONS WHERE table_name = 'FILESTORAGE';
SELECT * FROM DBA_TABLESPACES WHERE tablespace_name LIKE 'TESTS_OCS%';
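To confirm that the deduplication and compression settings landed on the intended partitions, the LOB partition dictionary view can be queried as well. This is a sketch that assumes you are connected as the schema owner, which is why it uses the USER_ view rather than a DBA_ view.

```sql
-- SecureFiles settings of the BFILEDATA LOB, one row per partition.
SELECT partition_name, tablespace_name, securefile, compression, deduplication
FROM   user_lob_partitions
WHERE  table_name  = 'FILESTORAGE'
AND    column_name = 'BFILEDATA'
ORDER  BY partition_position;
```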

After the redefinition process completes, you have a new FileStorage table that stores all content whose storage rule points to the JDBC Storage, partitioned according to the rule set when the temporary interim FileStorage_PART table was created.

At this point, you can test WebCenter Content by downloading the documents (original and renditions). Note that the content could already be in the cache area. Look in the weblayout directory to see whether a file with the same ID is there. Then click on the web rendition of your test file and check whether the file has been created and whether you can open it. If so, everything is working.

The redefinition process can be repeated many times. This allows you to test, over and over again, which layout works best.

Now some interesting maintenance actions related to the partitions:

        1. Make a tablespace read only.
      • No issues when viewing; WebCenter Content does not alter the revisions.
      • When you try to delete content that is stored in a read-only tablespace, an error occurs and the document is not deleted.
      • The only way to prevent these errors today is to create a custom component that checks the partitions. If a document is in a “Read Only” repository, the component executes the deletion of the metadata and marks the document to be deleted during the next database maintenance, such as a new redefinition.

 

        2. Take a tablespace offline for archiving purposes or any other reason.
      • When you try to open a document that is stored in this tablespace, you receive an error stating that the content could not be retrieved, but the other online tablespaces are not affected.
      • The same behavior applies when deleting documents.
      • Again, a custom component is the solution. If a document is “out of range”, the component can show a message that the repository for that document is offline. This can be extended with an option for the user to request that it be brought online again.

 

        3. Move some legacy content to an offline repository (table), using the Exchange option to move the content from one partition to an empty nonpartitioned table such as FileStorage_LEGACY. Note that this option removes the rows from the FileStorage table, so you will not be able to open the stored content. Always keep the indexes and constraints in mind.

 

        4. A redefinition that separates the original content (vault) from the renditions and separates by date at the same time. This could be an option for DAM environments that want a special place for the renditions and want to put the original files on lower-performance storage.
      • The process is the same. You just need to change the script of the interim table to use composite partitioning. It will be something like:
CREATE TABLE FILESTORAGE_RenditionPart
  (
    DID           NUMBER(*,0) NOT NULL ENABLE,
    DRENDITIONID  VARCHAR2(30 CHAR) NOT NULL ENABLE,
    DLASTMODIFIED TIMESTAMP (6),
    DFILESIZE     NUMBER(*,0),
    DISDELETED    VARCHAR2(1 CHAR),
    BFILEDATA     BLOB
  )
  LOB (BFILEDATA) STORE AS SECUREFILE
    (
      ENABLE STORAGE IN ROW
      NOCACHE LOGGING
      KEEP_DUPLICATES
      NOCOMPRESS
    )
  PARTITION BY LIST (DRENDITIONID)
  SUBPARTITION BY RANGE (DLASTMODIFIED)
  (
    PARTITION Vault VALUES ('primaryFile')
      ( SUBPARTITION FILESTORAGE_VAULT_LEGACY VALUES LESS THAN (TO_DATE('05-APR-2012 12.00.00 AM', 'DD-MON-YYYY HH.MI.SS AM')) LOB (BFILEDATA) STORE AS SECUREFILE
      , SUBPARTITION FILESTORAGE_VAULT_DAY1 VALUES LESS THAN (TO_DATE('06-APR-2012 07.25.00 PM', 'DD-MON-YYYY HH.MI.SS AM')) LOB (BFILEDATA) STORE AS SECUREFILE
      , SUBPARTITION FILESTORAGE_VAULT_DAY2 VALUES LESS THAN (TO_DATE('06-APR-2012 07.55.00 PM', 'DD-MON-YYYY HH.MI.SS AM')) LOB (BFILEDATA) STORE AS SECUREFILE
      , SUBPARTITION FILESTORAGE_VAULT_DAY3 VALUES LESS THAN (TO_DATE('06-APR-2012 07.58.00 PM', 'DD-MON-YYYY HH.MI.SS AM')) LOB (BFILEDATA) STORE AS SECUREFILE
      , SUBPARTITION FILESTORAGE_VAULT_FUTURE VALUES LESS THAN (MAXVALUE)
      )
   ,PARTITION WebLayout VALUES ('webViewableFile')
      ( SUBPARTITION FILESTORAGE_WEBLAYOUT_LEGACY VALUES LESS THAN (TO_DATE('05-APR-2012 12.00.00 AM', 'DD-MON-YYYY HH.MI.SS AM')) LOB (BFILEDATA) STORE AS SECUREFILE
      , SUBPARTITION FILESTORAGE_WEBLAYOUT_DAY1 VALUES LESS THAN (TO_DATE('06-APR-2012 07.25.00 PM', 'DD-MON-YYYY HH.MI.SS AM')) LOB (BFILEDATA) STORE AS SECUREFILE
      , SUBPARTITION FILESTORAGE_WEBLAYOUT_DAY2 VALUES LESS THAN (TO_DATE('06-APR-2012 07.55.00 PM', 'DD-MON-YYYY HH.MI.SS AM')) LOB (BFILEDATA) STORE AS SECUREFILE
      , SUBPARTITION FILESTORAGE_WEBLAYOUT_DAY3 VALUES LESS THAN (TO_DATE('06-APR-2012 07.58.00 PM', 'DD-MON-YYYY HH.MI.SS AM')) LOB (BFILEDATA) STORE AS SECUREFILE
      , SUBPARTITION FILESTORAGE_WEBLAYOUT_FUTURE VALUES LESS THAN (MAXVALUE)
      )
   ,PARTITION Special VALUES ('Special')
      ( SUBPARTITION FILESTORAGE_SPECIAL_LEGACY VALUES LESS THAN (TO_DATE('05-APR-2012 12.00.00 AM', 'DD-MON-YYYY HH.MI.SS AM')) LOB (BFILEDATA) STORE AS SECUREFILE
      , SUBPARTITION FILESTORAGE_SPECIAL_DAY1 VALUES LESS THAN (TO_DATE('06-APR-2012 07.25.00 PM', 'DD-MON-YYYY HH.MI.SS AM')) LOB (BFILEDATA) STORE AS SECUREFILE
      , SUBPARTITION FILESTORAGE_SPECIAL_DAY2 VALUES LESS THAN (TO_DATE('06-APR-2012 07.55.00 PM', 'DD-MON-YYYY HH.MI.SS AM')) LOB (BFILEDATA) STORE AS SECUREFILE
      , SUBPARTITION FILESTORAGE_SPECIAL_DAY3 VALUES LESS THAN (TO_DATE('06-APR-2012 07.58.00 PM', 'DD-MON-YYYY HH.MI.SS AM')) LOB (BFILEDATA) STORE AS SECUREFILE
      , SUBPARTITION FILESTORAGE_SPECIAL_FUTURE VALUES LESS THAN (MAXVALUE)
      )
  ) ENABLE ROW MOVEMENT;
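Maintenance actions 1 to 3 above map to a handful of statements, sketched below using the tablespace and partition names from the date-partitioned test scenario. The ALTER TABLESPACE statements are meant to be run as a DBA user. The FileStorage_LEGACY table is an assumption: it must already exist, be empty, and match FileStorage in columns, LOB storage, and constraints, and the exact requirements should be checked against your database version.

```sql
-- 1. Make the legacy tablespace read only, and later writable again.
ALTER TABLESPACE TESTS_OCS_PART_LEGACY READ ONLY;
ALTER TABLESPACE TESTS_OCS_PART_LEGACY READ WRITE;

-- 2. Take a tablespace offline, then bring it back online.
ALTER TABLESPACE TESTS_OCS_PART_DAY1 OFFLINE NORMAL;
ALTER TABLESPACE TESTS_OCS_PART_DAY1 ONLINE;

-- 3. Swap the legacy partition with an empty nonpartitioned table.
--    Assumption: FileStorage_LEGACY was created beforehand with a
--    structure compatible with FileStorage.
ALTER TABLE FileStorage
  EXCHANGE PARTITION FILESTORAGE_PART_LEGACY
  WITH TABLE FileStorage_LEGACY;
```

After the exchange, the rows that were in the legacy partition live in FileStorage_LEGACY and the partition itself is empty, which is why WebCenter Content can no longer open that content.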

The next post on partitioned repositories will include a sample component to handle the possible exceptions when you need to take a tablespace or partition offline or move it to another place.

Also, we can include some integration to the Retention Management and Records Management.

Another subject related to partitioning is the ability to create a FileStore Provider pointed to a different database, raising the level of the distributed storage vs. performance.

Let us know if this is important for you. If you have a use case not listed, please leave a comment.

Note: This blog post was originally posted on The Content Rave Blog
