Scalable shared databases
Scalable shared databases let you attach a read-only reporting database to multiple server instances over a storage area network (SAN). A reporting database is a read-only database that is built from one or more production databases that are used exclusively for reporting purposes. To be made into a scalable shared database, a reporting database must reside on one or more dedicated read-only volumes. The primary purpose of these read-only volumes is to host the reporting database or a coordinated set of reporting databases. These volumes are known as reporting volumes.
Benefits
Scalable shared databases offer the following benefits:
- Provide workload scale-out of reporting databases by using commodity servers. A scalable shared database is a cost-effective way of making read-only data marts or data warehouses available to multiple server instances for reporting purposes, such as running queries or using SQL Server 2005 Reporting Services.
- Provide workload isolation. Each server uses its own memory, CPU, and tempdb database.
- Guarantee the same view of reporting data from all servers if all the server instances are configured identically. For example, all servers would use a single collation.
Note Optionally, you can update the reporting database on a second reporting volume. For more information, see the "Maximize the availability of a scalable shared database" section.
Restrictions
The following restrictions exist for a scalable shared database:
- The database must be on a read-only volume.
- The data files can be accessed over an SAN.
- Scalable shared databases are supported only on Microsoft Windows Server 2003 Service Pack 1 (SP1) or a later version of Windows Server 2003.
Update cycle of a reporting database
When you use a scalable shared database for a reporting database, it involves a three phase update cycle:
- Build phase: The update cycle of a reporting database starts with the build phase. Before a reporting database can be built, the administrator mounts the reporting volume on the production system and makes it read/write. When a volume is in a read/write state, the volume can only be mounted on one system. If the volume is mounted on more than one system, filesystem corruption might occur. The administrator then builds the database by using one of the data-copy methods provided by SQL Server 2005 for copying data or databases. After the database is built, the administrator sets the volume to read-only, and then dismounts it.
- Attach phase: The attach phase comes after the build phase. The attach phase makes the database available as a scalable shared database. The attach phase must be performed on each of reporting servers individually. To configure the reporting database as a scalable shared database, the administrator mounts the read-only reporting volumes onto a reporting server over the SAN. After the administrator makes sure that each volume is set to read-only, the administrator attaches the reporting database on an instance of SQL Server. The reporting database on an instance of SQL Server is also known as a reporting server instance. Because each reporting volume is read-only, attaching the database sets it to read-only. At this point, the reporting database becomes a scalable shared database that can be accessed by clients by using the reporting server.
Note If you use a second reporting volume when you update the reporting database, you must choose between a rolling upgrade and a synchronized upgrade. For more information, see the "Maximize the availability of a scalable shared database" section. - Detach phase: The third phase is the detach phase. Typically, the reporting database eventually becomes stale. The database must be refreshed to keep the reporting data current. The detach phase is the process of removing a stale reporting database from service as a scalable shared database. Before you can make an updated reporting database available on a particular reporting server, the detach phase must be completed on that server. When a reporting database must be refreshed, it must be detached from all the server instances. To start the detach phase, the database administrator first stops the query work load that is coming in to the database from all the server instances. On each server instance, the database administrator obtains exclusive access to the database, and then detaches it. The database administrator then dismounts the volume from each host system. When the detach phase is complete, the reporting volume is disconnected from the SAN.
Note To maximize the availability of reporting data, we recommend that you alternate update cycles between two reporting volumes as a best practice. When the first reporting volume is still mounted to the reporting servers, you can mount the second volume to the production server, and then build an up-to-date version of the reporting database. For more information, see the "Maximize the availability of a scalable shared database" section.
Note Each phase consists of a series of steps that must be performed by a user who has Database Administrator rights. In this article, that user will be referred to as the database administrator.
Important To configure a scalable shared database, the SAN environment must already be working correctly.
Examples of scalable shared databases
In subsequent update cycles, the database can be updated or rebuilt. The preferred method depends on your business requirements. You can use scalable shared databases in the following two ways:
- Data mart database: The simplest use of a scalable shared database is a data mart database. A data mart database is extracted periodically from the contents of a data warehouse and is used for reporting. To update the data mart database, drop the database and then replace it with a new version.
- Reporting from an updatable database: When the database that is being reported from does not have to be transformed from the source database, the database can be periodically updated. To periodically update the database, create a full backup of the production database, and then restore the database backup on the reporting volume or volumes.
Make sure that the environment is correct for a scalable shared database
A scalable shared database must be on a read-only volume that can be accessed over a SAN. The reporting servers must be running the following:
- Windows Server 2003 SP1 or a later version of Windows Server 2003
- SQL Server 2005 Enterprise Edition or a later version of SQL Server 2005
For supportability, we recommend that you limit your scalable shared database configurations to eight server instances. However, SQL Server 2005 does not limit the number of concurrent instances that can access a scalable shared database. Typically, each server instance runs on a separate reporting server. However, running multiple reporting server instances on a reporting server is supported.
Configure your environment
To make sure that your environment supports scalable shared databases, we recommend that you follow these guidelines:
- Make sure that the reporting servers for a particular reporting database are running on identical operating systems. Whenever you upgrade a reporting server, upgrade any other reporting servers that serve the same scalable shared database or databases. For example, if you apply a software update or service pack for Windows or SQL Server 2005 to any one of the reporting servers, apply the same software update or service pack to all the reporting servers.
Note Frequently, you can perform rolling upgrades of the reporting servers as long as you complete the rolling upgrade in a timely manner.
- Scalable shared databases are tested under a concurrent access workload by up to eight server instances of SQL Server 2005 Enterprise Edition. SQL Server 2005 does not enforce an instance limit. However, we recommend that you limit your scalable shared database configurations to eight server instances for each shared database.
- If the data files of the production database span multiple volumes, you must use the same number of reporting volumes. In contrast, because the reporting database is set to read-only, its log files can co-exist with data files on a reporting volume.
- To simplify the process of building or updating a reporting database, we recommend that the path of the reporting database be the same as the production database. This includes using both the same drive letter for the reporting volume and the same directory path for the database. For example, if the production database is on E:\SQLdata, use E as the drive letter of the reporting volume, if it is possible. Additionally, use \SQLdata as the directory of the reporting database, if it is possible. However, a script that has explicit paths can handle any differences. If the reporting volume uses a different drive letter than the production volume, you may have to make the following modifications:
- If you build the reporting database by restoring a database backup, the RESTORE DATABASE statement must have a WITH MOVE clause that specifies the full path of the restored data files.
- If your reporting database is a copy of the production database, the FOR ATTACH clause of the CREATE DATABASE statement must list every file. The FOR ATTACH clause must also specify its full path when you attach the reporting database. This is always a best practice.
Note As a best practice, use the same drive letter on every server when you mount a reporting volume onto your reporting servers. This practice helps you manage the volume across the different servers.
- The reporting database must be on a read-only volume that can be accessed over the SAN from all the reporting servers:
- After you mount the reporting volume onto a reporting server, make sure that the reporting volume is correctly mounted and that the data files can be accessed. To do this, enter DIR
<drive-letter>:\<database-directory> at a command prompt, where <drive-letter> is the letter assigned to the reporting volume, and
<database-directory> specifies the location of the database's data files on the volume. Run this test from each reporting server to make sure that you receive the same results for them all. - To make sure that the reporting database is set read-only, try to create a file on the volume. The easiest method is to try to copy or save a plain text file on the volume. The attempt should fail because the volume is read-only.
Note If you are performing these steps manually, we recommend that you repeat these tests in every update cycle when you remount the reporting volume on each reporting server. If you script the steps to move reporting volumes back and forth between the production server and the reporting servers, testing is no longer required after you are sure that your scripts are working correctly.
Phase 1: The build phase
Build or refresh a scalable shared database
A reporting database must be built and refreshed manually. This process is the first phase of the update cycle for a reporting database and is known as the build phase. The build phase may involve updating a stale database or building a new version.
Typically, the current version of a reporting database eventually becomes stale. The reporting database must be periodically refreshed to keep the reporting data up to date.
Complete the build phase
You can refresh a stale reporting database by updating the outdated data in the existing database or by rebuilding the database.
Note Before you can refresh an existing reporting database, the database must be detached from each reporting server instance. Additionally, the reporting volume must be dismounted from each reporting server. For more information, see the "Detach a scalable shared database" section.
To refresh a stale reporting database, follow these steps on the production server:
- Use your hardware vendor's utilities to unmask the logical unit numbers (LUNs) that correspond to the reporting volumes. This action makes the volumes accessible to the production server.
- Mount the reporting volume, and then mark it as read/write. To use the Diskpart command-line utility to mount the volume, enter the following commands at a command prompt:DiskPart
DISKPART> select volume=<drive-number>
DISKPART> assign letter=<drive-letter>
DISKPART> attribute clear readonly
DISKPART> exit
In this step,
<drive-number> is the volume number that is assigned by Windows, and <drive-letter> is the letter that is assigned to the reporting volume. - If you are refreshing an existing reporting database, follow these steps:
- Attach the database to a server instance. Typically, this would be the production server instance.
CREATE DATABASE <database_name> ON <filespec_list>
FOR ATTACH
- Set the database to read/write access by using the following Transact-SQL statement.
ALTER DATABASE <database_name> SET READ_WRITE
For more information, see SQL Server 2005 Books Online.
- Build the database.
To refresh a reporting database, you can update the outdated data, rebuild the database, or do whatever else you think is required to refresh the data. The administrator builds the database by using any one of the data-copy methods that are provided by SQL Server 2005 for copying data or databases. For more information, see the "Methods for building or updating a database" section.
Note In reporting databases, we recommend that page verify be set to checksum, the default. To change this setting, use ALTER DATABASE. - Set the database to read-only by using the following Transact-SQL statement.
ALTER DATABASE <database_name> SET READ_ONLY
- Detach the database by using the following Transact-SQL statement.
sp_detach_db @dbname='<database_name>'
In this step, <database_name> is the name of the database. - Mark the volume as read-only, and then dismount the volume from the production server. To use the Diskpart command-line utility to dismount the volume, enter the following commands at a command prompt.
DiskPart
DISKPART> select volume=<drive-number>
DISKPART> attribute set readonly
DISKPART> remove
In this step, <drive-number> is the volume number that is assigned by Windows, and
<drive-letter> is the letter that is assigned to the reporting volume. - Use your hardware vendor's utilities to mask the LUNs that correspond to the reporting volumes. This action makes the volumes inaccessible to the production server.
Now, the reporting database can be made available as a scalable shared database. For more information, see the "Attach a scalable shared database" section.
Methods for building or refreshing a database
Note When you build a reporting database, we recommend that you always use the same path for the production database and the reporting databases. Additionally, we recommend that you use the same drive letter for the production and reporting volume when the volume is mounted on the reporting servers, if it is possible.
SQL Server 2005 currently supports the following methods for porting data into a database or for porting a whole database:
- SQL Server Integration Services: You can create or copy a database by running Integration Services packages and by using the Execute SQL task or the Transfer Database task:
- The Execute SQL task runs SQL statements or stored procedures from a package. When you use the Execute SQL task, you can create a database by running a CREATE DATABASE statement. You can then populate the database by copying in one or more tables or views.
- The Transfer Database task can copy a database in the same server instance or between instances.
Note You can also create a database by using the SQL Server Import and Export Wizard, but you must copy at least one table or view.
- Backup and restore: You can restore a backup of a production database on the reporting volume. To do this, restore and recover a full database backup onto the reporting volume:
- If you are using the same drive letter, mount the reporting volume onto a different host, and then connect to a server instance there to restore the database.
- If the reporting volume uses a different drive letter than the production volume, the RESTORE DATABASE statement must have a WITH MOVE clause that specifies the drive letter of the reporting volume in the path of the restored database.
- Copy the production database onto the reporting volume: Before you can manually copy a database or use the Detach and Attach method of the Copy Database Wizard, you must take the database offline. After you copy the database, bring the database back online. However, the Copy Database Wizard offers an alternative method. The SMO Transfer method copies the database although the database remains online. Although the SMO Transfer method is slower than the Detach and Attach method, the SMO Transfer method preserves active connections to the database.
For more information about these data-copy methods, see SQL Server 2005 Books Online.
When the reporting database is ready, you must complete the build phase. For more information, see the "Phase 1: The build phase" section.
Phase 2: The attach phase
Attach a shared scalable database
After you build or update a reporting database and you dismount the reporting volume from the production server, an administrator must make the database available as a scalable shared database. This process is known as the attach phase.
Complete the attach phase
In this phase, an administrator must perform the following steps:
- Use your hardware vendor's utilities to unmask the LUNs that correspond to the reporting volumes. This action makes the volumes accessible to clients from each reporting server.
- On each reporting server, mount the volume that corresponds to the LUN.
Note To simplify the process of building or updating a reporting database, we recommend that you always mount its reporting volume by using the same drive letter as the production volume. For example, if the production database is on drive E on the production server, the reporting volume should also be mounted as drive E on each reporting server, if it is possible.
To use the Diskpart command-line utility to mount the volume, enter the following commands at a command prompt. DiskPart
DISKPART> select volume=<drive-number>
DISKPART> assign letter=<drive-letter>
DISKPART> exit
In this step, <drive-number> is the volume number that is assigned by Windows, and
<drive-letter> is the letter that you want to use for the reporting volume on the reporting server.
Note The reporting volume must be read-only. We recommend that it be marked as read-only before the volume is dismounted from the production server. If the volume was not marked as read-only, set the volume to read-only after you mount the volume on the first reporting server. For more information, see the "Phase 1: The build phase" section.
As a best practice, you should make sure that the volume is accessible as a read-only volume over the SAN after you mount a reporting volume to each reporting server. For more information, see the "Make sure that the environment is correct for a scalable shared database" section. - Attach the database to the reporting server instance or instances on each reporting server. For more information, see SQL Server 2005 Books Online.
The reporting database is now available as a scalable shared database, and queries can continue.
Phase 3: The detach phase
Detach a scalable shared database
Typically, the current version of a reporting database eventually becomes stale and must be refreshed to keep the reporting data up to date. The process of removing a stale reporting database from service as a scalable shared database is known as the detach phase. This phase is the third and final phase of the update cycle for a reporting database. Before you can make an updated reporting database available on a particular reporting server, the detach phase must be completed on that server.
Complete the detach phase
In this phase, an administrator must perform the following steps on each reporting server:
- Disable new queries on the database, and then let current queries complete gracefully, if it is possible.
- Detach the database from each reporting server instance by using the sp_detach_db @dbname='<database_name>'command.
In this step,
<database_name> is the name of the database. For more information about the sp_detach_db command, see SQL Server 2005 Books Online. - On each reporting server, dismount the reporting volume. To dismount the volume by using the Diskpart command-line utility, enter the following commands at a command prompt.
DiskPart
DISKPART> select volume <drive-number>
DISKPART> remove
In this step, <drive-letter> is the letter that assigned to the reporting volume. - Use your hardware vendor's utilities to mask the LUNs that correspond to the reporting volumes. This action makes the volumes inaccessible to clients from each reporting server.
Alternative strategies for detaching a stale reporting database
When you replace the stale version of a database, you must consider the business requirements for your reporting environment. You should assess which of the following business requirements take precedence in your environment:
- Preserving currently running transactions until they finish.
- Completing the update within a limited timeframe.
Based on which requirement takes precedence, you can decide how to manage the detach phase on each reporting server. You can manage the detach phase in the following ways:
- Let the transactions finish before you detach the reporting server: To preserve all in-progress transactions, you must start the detach phase by stopping incoming I/O activity to the reporting volume. Then, on each reporting server instance, wait to detach the database until all the current transactions are finished. When the database has been detached from all the server instances, you can dismount the reporting volume.
- Update the database during a limited timeframe: In this case, you should obtain exclusive access to the database on each server instance with a termination time that allows for your timeframe. If any queries do not finish within that termination time, they will be stopped. Those queries will have to wait until after the update to be restarted. After the queries are stopped, you can detach the database from each server instance, and then dismount the reporting volume from each reporting server.
At this point, you are ready for the next build phase. Alternatively, if you have already refreshed the database on another reporting volume like we recommend, you can now perform the attach phase for the alternative volume. For more information, see the "Maximize the availability of a scalable shared database" section.
Maximize the availability of a scalable shared database
To maximize the availability of reporting data, we recommend that you alternate update cycles between two reporting volumes. When the first reporting volume is still mounted to the reporting servers, you can mount the second volume to the production server and build an up-to-date version of the reporting database.
If you update the reporting database on a second reporting volume, consider the following options:
- If you want all your reporting databases to return identical results to clients, you must detach the old copy from all the server instances before you attach the new copy to any one of them.
- If you can tolerate clients receiving different results on different server instances when you update the reporting database, you can perform a rolling upgrade of the reporting database. You would finish the update cycle on one reporting server at a time.
Synchronized, time-sensitive updates of all reporting servers
This section describes several strategies for updating the content of a scalable shared database, depending on your business requirements:
- You must keep all reporting servers in sync.
- You must accomplish the update within a limited timeframe. This timeframe is more critical than preserving currently running transactions.
When you synchronize the database on all the reporting servers, the reporting database is unavailable between the detach phase for the stale version of the database and the attach phase of the fresh version.
To synchronize the update cycle on all the reporting server instances and finish the update cycle within a limited timeframe, follow these steps:
- To keep the content in sync, you must finish the detach phase on all the reporting servers before any one of the reporting servers can be updated. If any long-running queries are active on any server, you must stop them.
- After you dismount the first reporting volume from all the server instances, you can start to update the reporting servers. On each reporting server, mount another volume that contains a more current version of the reporting database. Attach that version to the local reporting server instance. As soon as the database is attached on a particular instance, stopped transactions can be restarted on that instance.
Rolling upgrades of reporting servers
A rolling upgrade lets you to refresh the reporting database on one reporting server when a stale reporting database remains temporarily available on another reporting server. For a while, both the stale version and the refreshed version of the database are available at the same time. Depending on your business requirements, a rolling upgrade can occur in a limited timeframe or the rolling upgrade can be relatively open-ended to let current transactions finish.
Let transactions finish before the rolling upgrade
In this strategy, a rolling upgrade lets the database administrator to wait for long-running transactions to finish on one reporting server when the database on another reporting server is refreshed. This strategy addresses the following business requirements:
- The reporting servers do not have to be kept in sync. This permits a rolling upgrade between the stale reporting database and the updated reporting database.
- You have an unlimited timeframe to accomplish the update, or your deadline is less critical than preserving currently running transactions.
To perform this form of rolling upgrade, follow these steps on one server instance at a time:
- To preserve all in-progress transactions, you must start the detach phase by stopping incoming I/O activity to the reporting volume. If a long-running query delays the upgrade on a server instance, wait for the query to finish before you take the server instance offline.
- After all the transactions are finished on this server instance, detach the reporting database.
- After you detach a particular reporting database from all the server instances, attach a more current version of the reporting database to that server instance.
- To make the server instance available again for reporting queries, attach an updated copy of the database.
Finish the rolling upgrade in a limited time
In this strategy, a rolling upgrade lets the database administrator to maintain uninterrupted reporting service by briefly letting the stale version of the database to remain available to new queries on some reporting servers. The service remains uninterrupted when you update the database on another reporting server. This strategy addresses the following business requirements:
- The reporting servers do not have to be kept in sync. This permits a rolling upgrade between the stale reporting database and the updated reporting database.
- You must accomplish the update in a limited timeframe. This deadline is more critical than preserving currently running transactions.
To perform this form of rolling upgrade, follow these steps on one reporting server at a time:
- Stop incoming I/O activity to the reporting volume, and, optionally, wait for short transactions to finish on a server instance before you detach its reporting database.
- Finish the detach phase on that server. For more information, see the "Detach a scalable shared database" section.
- Make the updated version of the reporting database available again for reporting queries. For more information, see the "Attach a shared scalable database" section.
This kind of rolling upgrade guarantees that the overall reporting capability is never interrupted. This strategy lets you to tolerate fairly long-running transactions on some of the server instances for a while. However, given the limited timeframe for updating all the reporting databases, if a long-running query significantly delays the upgrade on a server instance, you will have to stop that query. The query can wait to be rerun on the same server instance after its reporting database has been refreshed, or the query can be restarted sooner on an updated server.