Notice: This website is an unofficial Microsoft Knowledge Base (hereinafter KB) archive and is intended to provide a reliable access to deleted content from Microsoft KB. All KB articles are owned by Microsoft Corporation. Read full disclaimer for more details.

How to generate a script of the necessary database metadata to create a statistics-only database in SQL Server


View products that this article applies to.

INTRODUCTION

The query optimizer in Microsoft SQL Server uses the following types of information to determine an optimal query plan: 
  • The database metadata
  • The hardware environment
  • The database session state
Typically, you must simulate all these same types of information if you want to reproduce the behavior of the query optimizer on a test system.

Microsoft Customer Support Services may ask you to generate a script of the database metadata. Microsoft Customer Support Services uses this script of the database metadata to investigate an optimizer issue. This article describes the steps to generate the statistics script. The article also describes how the query optimizer uses the information.

↑ Back to the top


More Information

If you use SQL Server 2005, before you follow the steps to generate the script, make sure that SQL Server Management Studio is the SQL Server 2005 Service Pack 2 version or a later version. If you use earlier versions of SQL Server Management Studio in SQL Server 2005, the Script Wizard does not contain all the necessary options for the steps in this article to work correctly.

Script the whole database

When you generate a statistics-only clone database, it may be easier and more reliable to script the whole database instead of scripting individual objects. When you script the whole database, you receive the following benefits:
  • You avoid issues with missing dependent objects that are required to reproduce the issue.
  • You require significantly fewer steps to select the necessary objects.
Note If you generate a script for a database, and the metadata for the database contains thousands of objects, the scripting process consumes significant CPU resources. We recommend that you generate the script during off-peak hours. Or, you can use the second option to generate the script for individual objects.

To script each database that is referenced by your query, follow these steps:
  1. Open SQL Server Management Studio.
  2. In the Object Explorer, expand Databases, and then locate the database that you want to script.
  3. Right-click the database, point to Tasks, and then click Generate Scripts.
  4. In the Script Wizard, verify that the correct database is selected. Click to select the Script all objects in the selected database check box, and then click Next.
  5. In the Choose Script Options dialog box, change the following settings from the default value to the value that is listed in the following table.
    Scripting optionValue to select
    Ansi PaddingTrue
    Continue Scripting on ErrorTrue
    Generate Script for Dependent ObjectsTrue
    Include System Constraint NamesTrue
    Script CollationTrue
    Script Database CreateTrue
    Script LoginsTrue
    Script Object Level PermissionsTrue
    Script StatisticsScript Statistics and histograms
    Script IndexesTrue
    Script TriggersTrue
    Note The Script Logins option and the Script Object Level Permissions option may not be required unless the schema contains objects that are owned by logins other than dbo.
  6. Click Next.
  7. Click the Script to File option, and then enter a file name.
  8. Click Finish.

Script individual objects

You may only script individual objects that are referenced by a particular query instead of scripting the complete database. However, unless all database objects were created by using the WITH SCHEMABINDING clause, the dependency information in the sys.depends system table may not always be accurate. This inaccuracy may cause one of the following issues:
  • The scripting process does not script a dependent object.
  • The scripting process may script objects in the incorrect order. To run the script successfully, you must manually edit the generated script.
Therefore, we do not recommend that you script individual objects, unless the database has lots of objects and scripting would otherwise take too long. If you must use script individual objects, follow these steps:
  1. In SQL Server Management Studio, expand Databases, and then locate the database that you want to script.
  2. Right-click the database, point to Script Database As, point to CREATE to, and then click File.
  3. Enter a file name, and then click Save.

    The core database container will be scripted. This container includes files, file groups, the database, and properties.
  4. Right-click the database, point to Tasks, and then click Generate Scripts.
  5. Make sure that the correct database is selected, and then click Next.
  6. In the Choose Script Options dialog box, change the following settings from the default value to the value that is listed in the following table.
    Scripting optionValue to select
    Ansi PaddingTrue
    Continue Scripting on ErrorTrue
    Include System Constraint NamesTrue
    Generate Script for Dependent ObjectsTrue
    Script CollationTrue
    Script LoginsTrue
    Script Object Level PermissionsTrue
    Script StatisticsScript Statistics and histograms
    Script USE DATABASETrue
    Script IndexesTrue
    Script TriggersTrue
    Note The Script Logins option and the Script Object Level Permissions option may not be required unless the schema contains objects that are owned by logins other than dbo.
  7. In the Choose Object Types dialog box, select all database object types that the problematic query references.

    For example, if the query only references tables, select Tables. If the query references a view, select Views and Tables. If the problematic query uses a user-defined function, select Functions.
  8. When you have selected all the object types that are referenced by the query, click Next.
  9. A dialog box appears for each database object type that you selected in step 7. In each dialog box, select the specific tables, views, functions, or other database objects, and then click Next.
  10. Click the Script to File option, and then specify the same file name that you entered in step 3.
  11. Click Finish to start the scripting.
When the scripting has finished, send the script file to the Microsoft support engineer. The Microsoft Support Engineer may also request the following information:
  • The hardware configuration, including the number of processors and how much physical memory exists
  • The SET options that were active when you ran the query
Note You may have already provided this information by sending a SQLDiag report or a SQL Profiler trace. You may have also used another method to provide this information.

How the information is used

The following tables help explain how the query optimizer uses this information to select a query plan.

Metadata

ConstraintsThe query optimizer frequently uses constraints to detect contradictions between the query and the underlying schema. For example, if the query has a "WHERE col = 5" clause and a "CHECK (col < 5)" check constraint exists, the query optimizer knows that no rows will match.

The query optimizer makes similar types of deductions about nullability. For example, the "WHERE col IS NULL" clause is known to be true or false depending on the nullability of the column and whether the column is from the outer table of an outer join. The presence of FOREIGN KEY constraints is useful to determine cardinality and the appropriate join order. The query optimizer can use constraint information to eliminate joins or simplify predicates. These changes may remove the requirement to access the base tables.
StatisticsThe statistics information contains density and a histogram that shows the distribution of the leading column of the index and statistics key. Depending on the nature of the predicate, the query optimizer may use density, the histogram, or both to estimate the cardinality of a predicate. Up-to-date statistics are required for accurate cardinality estimates. The cardinality estimates are used as an input in estimating the cost of an operator. Therefore, you must have good cardinality estimates to obtain optimal query plans.
Table size (number of rows and pages)The query optimizer uses the histograms and density to calculate the probability that a given predicate is true or false. The final cardinality estimate is calculated by multiplying the probability by the number of rows that are returned by the child operator. The number of pages in the table or the index is a factor in estimating the IO cost. The table size is used to calculate the cost of a scan, and it is useful when you estimate the number of pages that will be accessed during an index seek.
Database optionsSeveral database options can affect optimization. The AUTO_CREATE_STATISTICS and AUTO_UPDATE_STATISTICS options affect whether the query optimizer will create new statistics or update statistics that are out of date. The PARAMETERIZATION level affects how the input query is parameterized before the input query is handed to the query optimizer. Parameterization can affect cardinality estimation and can also prevent matching against indexed views and other types of optimizations. The DATE_CORRELATION_OPTIMIZATION setting causes the optimizer to search for correlations between columns. This setting affects cardinality and cost estimation.

Environment

Session SET optionsThe ANSI_NULLS setting affects whether the "NULL = NULL" expression evaluates as true. Cardinality estimation for outer joins may change depending on the current setting. Additionally, ambiguous expressions may also change. For example, the "col = NULL" expression evaluates differently based on the setting. However, the "col IS NULL" expression always evaluates the same way.
Hardware resourcesThe cost for sort and hash operators depends on the relative amount of memory that is available to SQL Server. For example, if the size of the data is larger than the cache, the query optimizer knows that the data must always be spooled to disk. However, if the size of the data is much smaller than the cache, the operation is likely to be done in memory. SQL Server also considers different optimizations if the server has more than one processor and if parallelism has not been disabled by using a "MAXDOP" hint or the max degree of parallelism configuration option.

↑ Back to the top


Keywords: kb, kbexpertiseinter, kbexpertiseadvanced, kbsql2005engine, kbhowto, kbinfo

↑ Back to the top

Article Info
Article ID : 914288
Revision : 1
Created on : 1/7/2017
Published on : 1/27/2015
Exists online : False
Views : 526