MSFT Support response to INSERT..INTO SELECT..issue

Post date: Jul 11, 2010 1:54:48 PM

I had my systems contact email MSFT support about the issue. Their initial response was less than stellar:

Thank you for contacting Microsoft Professional Online Support Service. My name is Sunny Liu I am assigned to help you with this service request. For your information, the support incident ID is XXXXXXXXXXXXXXXXX. It is my pleasure going to work with you.

Before we move forward, I'd like to explain a bit about web response support. We will support you via emails and each response is usually within approximately 24 hours(excluding weekends). If the technical issue is very urgent and you prefer real-time communication via phone, you can contact our phone service by calling 1-800-936-4900. Please be sure to reference this case number with the Customer Service Representative when you call in so a duplicate issue is not created. Thank you a lot for your understanding.

From your problem description, I understand the issue is that you have concerns on the different behaviors of doing INSERT INTO ...SELECT ORDER BY clause on a local table and on a remote table. If there is anything I misunderstood, please do not hesitate to let me know.

With the article KB273586, we know when we use a SELECT INTO or INSERT… SELECT queries with the IDENTITY function and an ORDER BY clause, the identity values that are generated are not guaranteed to have the same order as the order that is provided by the ORDER BY clause. And the presence of TOP or SET ROWCOUNT also does not guarantee the order of the rows. This behavior is by design.

However we can get the IDENTITY values to be assigned in a sequential fashion that follows the ordering in the ORDER BY clause by creating a table that contains a column with theIDENTITY property and then running an INSERT .. SELECT … ORDER BY query to populate this table. The identity values that are generated depend on the position of the GetIdentity()function in the query tree (showplan). From the execution plan you provided, we can see the getidentity() function is after the ORDER BY clause, so the identity values obey to the ORDER BY clause. That is INSERT queries that use SELECT with ORDER BY to populate rows guarantees how identity values are computed.

|--Table Insert(OBJECT:([SourceDB].[schemaname].[tempDest]), SET:([SourceDB].[schemaname].[tempDest].[Name] = [SourceDB].[schemaname].[tempSource].[Name],[SourceDB].[schemaname].[tempDest].[KeyValue] = [Expr1008],[SourceDB].[schemaname].[tempDest].[SeqID] = [Expr1009]))

|--Compute Scalar(DEFINE:([Expr1008]=getidentity((1066276516),(18),NULL), [Expr1009]=newsequentialid()))

|--Top(ROWCOUNT est 0)

|--Sort(ORDER BY:([SourceDB].[schemaname].[tempSource].[Name] ASC))

|--Table Scan(OBJECT:([SourceDB].[schemaname].[tempSource]))

Hope the above explanation is clear for you. If anything else is unclear or you have any further questions about this issue, please feel free to let me know. I’m glad to do further discussion with you.

My response to the above:

Thanks for your response. The basic problem/issue however has not been addressed:

In my example, the REMOTE target table DOES have an IDENTITY column which according to your response and the article should guarantee ordering. Why is the ORDER BY clause while performing an INSERT into the remote table completely ignored as shown in the plan? It is obeyed for local inserts but ignored for remote.

MSFT now comes back with "fixed in SQL 2008" response

I have reproduced the problem on my SQL Server 2005. We can determine this behavior is by design or it’s a bug of SQL Server 2005. After doing research, I found we have two directions to work around the problem:

1) Use TOP N in the select statement to enforce the query to sort:

insert into RemoteServer.DestDB.schemaname.tempDest (Name)

select TOP 10000 Name

from tempSource

order by name asc

The execution plan of insert into a Remote table after adding TOP 10000 as below:

2) Create a index on the order by column (name) to sort the data in advance. This will avoid the query optimizer to do sort in its run time.

CREATE CLUSTERED INDEX [cl_name] ON [dbo].[tempSource]

(

[Name] ASC

) ON [PRIMARY]

After creating index on the [name] column, the execution plan of insert into Local table as below, that avoids sorting operation:

While the execution plan of insert into the Remote table as below:

Since there are some unexpected issues in the design of SQL Server 2005, which results in the remote insert with ORDER BY clause does not sort. Our developer team has noticed the problem and now the issue has been fixed in SQL Server 2008.

We respond:

Are there any plans from Microsoft to patch this problem in 2005?

The developer that I opened this case on behalf is already using the TOP method to work around this issue. He also wanted to confirm that this bug is fixed in 2008. Are you certain that it is fixed in 2008?

Thanks for your help.

MSFT Response and confirmation of fix in 2008

Since there is greater risk to modify the codes of the current version (it may result in other bugs), our developer decided not to fix the issue in SQL Server 2005. I appreciate your understanding.

Yes, the bug has been fixed in SQL Server 2008. Here is my test for you:

1) Create source and destination tables on a local SQL Server 2008 RTM(Build 10.0.1600):

create table tempDest(KeyValue INT IDENTITY(1,1),

Name NVARCHAR(20), SeqID uniqueidentifier DEFAULT(NEWSEQUENTIALID()))

create table tempSource(KeyValue INT IDENTITY(1,1),

Name NVARCHAR(20), SeqID uniqueidentifier DEFAULT(NEWSEQUENTIALID()))

2) Create destination table on a remote SQL Server 2008 database:

create table tempDest(KeyValue INT IDENTITY(1,1),

Name NVARCHAR(20), SeqID uniqueidentifier DEFAULT(NEWSEQUENTIALID()))

3) Populate source table on local db:

insert into tempSource (Name) values ('Rec3')

insert into tempSource (Name) values ('Rec2')

insert into tempSource (Name) values ('Rec1')

select * from tempSource order by name asc

4) Insert into Destination table on remote server/db:

--SET SHOWPLAN_TEXT on

--go

insert into [yusm].test.dbo.tempDest (Name)

select Name

from tempSource

order by name asc

The execution plan as below:

We can see the ORDER BY clause do Sort in the execution plan when performing the INSERT INTO a remote table.

5) Insert into Destination table on local server/db:

insert into tempDest (Name)

select Name

from tempSource

order by name asc

6) Select from remote destination table order by identity:

select * from [yusm].test.dbo.tempDest order by Keyvalue asc

Select from local destination table order by identity:

select * from tempDest order by Keyvalue asc

The query results from the remote destination table are the same as the local destination table now.

If anything else is unclear or you have any further questions about this issue, please feel free to let me know. Thank you.