Getting The Most Out Of Your WebCenter Content Database Metadata Search Engine configuration On Oracle 11g Database

Metadatasearch

 

Introduction

Being able to retrieve information as quickly as possible in a Content Management implementation is extremely important to provide knowledge workers with the data they need to do their jobs as efficiently as possible. This article is targeted at providing information on some recent discoveries on ways to keep your WebCenter Content Metadata search at peak performance. It needs to be noted that although this article is for WebCenter Content which is the application issues were experienced in, the remedies described in this article are database centric and require the skills of your database administer to implement. Also these techniques could also likely benefit any application which is query intensive.

Main Article

As most everybody is aware of WebCenter Content 11g provides three different search engine options that are available in the core product they are

    • Oracle Text Search
    • Database Full Text
    • Database Metadata

The first two options (Oracle Text & Database Full Text) provide you with capabilities to not only search based on the metadata that is tagged on the content but also search on the text within the checked-in documents themselves (e.g. Full Text Searching). With the third option (Database Metadata) the metadata criteria that are specified on searches are distilled down into standard SQL Select statements with where predicates comprised of the metadata fields specified as the search criteria that are then submitted to the database to execute and return results.

Recently I have been dealing with a couple of cases with customers whose WebCenter Content Server search engine configuration were Database Metadata and were experiencing significant performance issues with searches. The impact of the performance issues resulted in the JDBC connection timing out and WebCenter Content generating an error page to the users. The size of these customers repositories in terms of rows in the tables of the WebCenter Content schema used in searches were in the range of 40 to 80 million rows. These customers had applied the standard tuning to their WebCenter Content database schemas like

  • Ensuring indexes were applied to the most frequently referenced metadata fields from the DocMeta table used in searches.
  • Performed regular gathering of statistics on the database instance

Yet despite these standard tuning efforts they were still experiencing significant performance issues with searching due to latency on the database.

Upon analyzing the situation deeper using things like Automated Workload Repository Reports (AWR) it was realized that a great percentage of query execution time was being spent in User I/O. High User I/O time means the database is spending the majority of the time doing physical I/O on its disk subsystem to gather the data to return the results to a SQL query that was submitted. Anytime large amounts physical disk I/O is involved with any application or database it is going to be one of the slower parts due to the physical logistics of spindle turning and head seeks involved leading to higher latency. If you are experiencing performance issues with you WebCenter Content metadata search and are seeing the same symptoms appear in your AWR reports the first thing that should be done is to ensure that your database disk sub-system is not the thing introducing significant bottlenecks. If this has been eliminated and the health of the database disk subsystem has been deemed to be acceptable then the following configuration on the database has been effective at addressing these symptoms and restoring performance to the WebCenter Content metadata search.

There are four primary tables in the WebCenter Content database schema that are involved when the WebCenter Content search engine configuration is Database.Metadata they are:

  • Revisions
  • DocMeta
  • Documents
  • RevClasses

Using the advanced compression feature of the database, advanced compression was applied at the table space level which by compressing blocks will significantly reduce physical I/O. In addition to further speed performance parallelism (parallel degree ) was applied to the four tables outlined above along with parallelism was applied to all indexes applied to the tables. The best degree of parallelism (e.g. value of ) applied is dependent on the physical compute resources (e.g. CPU/Memory) that have been allocated to the database servers and some trial and error testing is required to find the best parallel degree to apply. With parallel degree you also want to be sure that you are striking a good balance between optimal performance and CPU saturation. If you over allocate parallel degree you risk inflicting CPU saturation and negatively impacting performance due to not having enough compute resources available for O/S kernel operations on the database servers. The increase in query performance will of course vary due to things like data volumes, queries, and hardware sizing. Testing in environments where this issue was experienced yielded average gains of 5.9 times (590%) to 10 times (1000%). As it pertains to advanced compression it needs to be noted that this is an additional licensable feature of the Oracle 11g database so you need to be sure you are licensed for it before exercising it as a method of tuning. Over and above the performance benefits that are outlined in this article, advanced compression will also provide you with storage savings due to the compression factor that is applied to reduce the physical footprint of the database on disk. The degree of compression gained will vary based on data patterns. In one instance where this was applied it took the physical storage footprint from 19G to 9G. This article has been focused on the tables involved in WebCenter Content metadata searching but it should be noted if there are any other queries that have been implemented in customizations to the WebCenter Content platform that suffer from the same symptoms these methods could also be applied to tables involved in those queries as well.

Implementation of any of the tuning changes outlined in this article should first be implemented in a non production environment that has comparable infrastructure and data volumes to a production environment. Testing to monitor impact and results should first be done directly against the database by submitting the application queries through a tool like SQL plus or SQL Developer. When testing your search queries in SQL plus use of bind variables (as opposed to literal values) should be applied to where predicates to have the execution of the queries the same as when they are received from the application. The AWR reports should be used as a way to monitor the results of the tuning. After the appropriate tuning has been achieved as identified by AWR reports and improved raw query response times then testing searches through the WebCenter Content application can be performed.

Add Your Comment