Showing posts with label statement. Show all posts

Wednesday, March 21, 2012

Performance challenge with UNION ALL

Hi all

I have encountered a challenge while working with a UNION ALL statement. At first I was really happy: I had my T-SQL query go from 1 min+ to just above 1 sec. Nice...

But my joy didn't last. I have boiled my issue down to this: if I use a variable in my UNION ALL statement it will take approx. 50 sec; if I enter a literal number it will take approx. 1 sec. Below are the two queries. The table used contains parent-child relations, and my query needs to recursively fetch all children, and their children, and so on, from the parent with ID = 3939.

My question is now: since this is to be used in an SP, is there any way I can use a variable value in the UNION ALL without my query becoming that much slower?

Best regards

Anders

-- Fast example, approx 1 sec

DECLARE @tempTable TABLE (relationId int);
WITH usesRelations (relationId, childVersionId) AS
(
    SELECT Id, childVersionID
    FROM dbo.relations relation
    WHERE relation.ParentVersionID = 3939
    UNION ALL
    SELECT p.Id, p.childVersionID
    FROM dbo.relations AS p
    INNER JOIN usesRelations AS A ON A.childVersionId = p.ParentVersionId
)
INSERT INTO @tempTable SELECT DISTINCT relationId FROM usesRelations

-- Slow example, approx 50 sec
DECLARE @ObjectID int
SET @ObjectID = 3939
DECLARE @tempTable TABLE (relationId int);
WITH usesRelations (relationId, childVersionId) AS
(
    SELECT Id, childVersionID
    FROM dbo.relations relation
    WHERE relation.ParentVersionID = @ObjectID
    UNION ALL
    SELECT p.Id, p.childVersionID
    FROM dbo.relations AS p
    INNER JOIN usesRelations AS A ON A.childVersionId = p.ParentVersionId
)
INSERT INTO @tempTable SELECT DISTINCT relationId FROM usesRelations

That is not a fair comparison. Maybe SQL Server auto-parameterized the first statement and used the constant value as the value for the parameter. In the second batch you are using a variable, and SQL Server does not use variable values to estimate cardinality. You can use OPTION (RECOMPILE) on the SELECT statement, or create a stored procedure with parameters, or use sp_executesql, also with parameters (I see you are using table variables, so this last one will not be an option).
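For example, a minimal sketch of the OPTION (RECOMPILE) route, reusing the tables and the recursive CTE from your slow example. The hint is the only change; treat this as illustrative rather than a tested fix.

-- Hedged sketch: same recursive CTE as the slow example, but with
-- OPTION (RECOMPILE) so the optimizer can see the variable's value
-- when it compiles this statement.
DECLARE @ObjectID int
SET @ObjectID = 3939
DECLARE @tempTable TABLE (relationId int);
WITH usesRelations (relationId, childVersionId) AS
(
    SELECT Id, childVersionID
    FROM dbo.relations AS relation
    WHERE relation.ParentVersionID = @ObjectID
    UNION ALL
    SELECT p.Id, p.childVersionID
    FROM dbo.relations AS p
    INNER JOIN usesRelations AS A ON A.childVersionId = p.ParentVersionId
)
INSERT INTO @tempTable
SELECT DISTINCT relationId FROM usesRelations
OPTION (RECOMPILE);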

Code Snippet

create procedure dbo.p1
    @ObjectID int
as
set nocount on

DECLARE @tempTable TABLE (relationId int);

WITH usesRelations (relationId, childVersionId) AS
(
    SELECT Id, childVersionID
    FROM dbo.relations relation
    WHERE relation.ParentVersionID = @ObjectID
    UNION ALL
    SELECT p.Id, p.childVersionID
    FROM dbo.relations AS p
    INNER JOIN usesRelations AS A ON A.childVersionId = p.ParentVersionId
)
INSERT INTO @tempTable SELECT DISTINCT relationId FROM usesRelations

go

exec dbo.p1 3939

go

AMB

|||Thanks for the reply.

The query was to become a stored procedure anyway. I was only keeping it as a plain query while I tried optimizing it, but I see now that it might be a good idea to work with it as a stored procedure all the way.

I'm not sure I totally understand why it has such a huge impact in SQL Server to change a constant to a variable, but anyway...

Making the statement a stored procedure with a parameter instead makes it a perfect fit, and fast.

Thanks for helping me out on this one...

/Anders
|||

> I'm not sure I totally understand why it has such a huge impact in SQL Server to change a constant to a variable, but anyway...

I shouldn't have said that SQL Server does not use variables to estimate cardinality; instead, I should have said that the estimation is not as accurate as when parameters are present together with useful indexes or statistics.

SQL Server uses the values of the parameters at compilation time (when you invoke the SP and there is no plan in cache) to estimate cardinality based on the expressions in the WHERE and JOIN clauses. SQL Server does not know the value of variables at that time, or maybe it is afraid that those values could change before executing the statement, so it uses other estimations, like the "All Density" value from the statistics. If there is an index or statistic that is useful to estimate cardinality for those expressions, SQL Server uses the histogram when those values are known, like when you recompile the statement (SQL Server 2005).
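As a side note, here is a minimal sketch of the sp_executesql route mentioned earlier. It is illustrative only (it uses just the anchor member of the recursive CTE, and it does not combine easily with the table variable in the original batch), but it shows how the value becomes a real parameter that the optimizer can see at compile time.

-- Hedged sketch: passing the value as a parameter lets the optimizer
-- use it, together with the column histogram, for the estimate.
DECLARE @sql nvarchar(max)
SET @sql = N'SELECT Id, childVersionID
             FROM dbo.relations
             WHERE ParentVersionID = @ObjectID'
EXEC sp_executesql @sql, N'@ObjectID int', @ObjectID = 3939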

Statistics Used by the Query Optimizer in Microsoft SQL Server 2005

http://www.microsoft.com/technet/prodtechnol/sql/2005/qrystats.mspx

AMB

Tuesday, March 20, 2012

Performance benefits between a JOIN statement vs a nested SELECT

Is there a performance difference between these two SQL statements?
Using JOIN:
SELECT * FROM A, B WHERE A.ID = B.ID;
Using nested SELECT:
SELECT * FROM A WHERE A.ID = (SELECT ID FROM B);
Thanks|||There may be only a small amount of difference, but these two statements are
really very different. The join will return data from both A and B, meaning
that the entire row will be needed from B, instead of just the ID column.
And unless there is a 1 to 1 relationship between A and B, you are going to
get different results.
Performancewise, the second should be faster (and could be faster still
probably if you rewrote it using exists)
select *
from A
where exists (select *
from B
where a.id = b.id)
But the bottom line is that you are dealing with two different queries that
are not equivalent. So depending on the question you are trying to answer,
who cares which is fastest; what matters is which one gives you the correct answer.
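To illustrate the non-equivalence, here is a small hedged sketch with made-up temp tables and data (names and values invented for the example): when B has several rows matching the same ID, the join repeats the row from A, while EXISTS returns each row of A at most once.

-- Hedged sketch: invented data showing join vs EXISTS giving different row counts.
CREATE TABLE #A ([id] int)
CREATE TABLE #B ([id] int)
INSERT INTO #A VALUES (1)
INSERT INTO #B VALUES (1)
INSERT INTO #B VALUES (1)

SELECT A.* FROM #A AS A, #B AS B WHERE A.[id] = B.[id]              -- returns 2 rows
SELECT A.* FROM #A AS A
WHERE EXISTS (SELECT * FROM #B AS B WHERE A.[id] = B.[id])          -- returns 1 row

DROP TABLE #A
DROP TABLE #B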
----
Louis Davidson - drsql@.hotmail.com
SQL Server MVP
Compass Technology Management - www.compass.net
Pro SQL Server 2000 Database Design -
http://www.apress.com/book/bookDisplay.html?bID=266
Blog - http://spaces.msn.com/members/drsql/
Note: Please reply to the newsgroups only unless you are interested in
consulting services. All other replies may be ignored :)
"cheem" <cheem@.msn.com> wrote in message
news:%23i7qoxVGFHA.1836@.tk2msftngp13.phx.gbl...
> Is there a performance bonus between these 2 sql statements?
> Using JOIN:
> SELECT * FROM A, B WHERE A.ID = B.ID;
>
> Using nested SELECT:
> SELECT * FROM A WHERE A.ID = (SELECT ID FROM B);
> Thanks
>|||Otherwise you can use
Select * from A where id in (Select id from B)
Madhivanan
|||cheem
Be aware that if your id is NULL in the #b table you don't get the right output.
Look at the example below:
create table #a
(
[id] int null
)
insert into #a values (1)
insert into #a values (2)
create table #b
(
[id] int null
)
insert into #b values (null)
insert into #b values (2)
select * from #a where [id] not in (select [id] from #b) -- Wrong one
Always USE
select * from #a where [id] not in (select [id] from #b where #a.id = #b.id)
drop table #a,#b
"cheem" <cheem@.msn.com> wrote in message
news:%23i7qoxVGFHA.1836@.tk2msftngp13.phx.gbl...
> Is there a performance bonus between these 2 sql statements?
> Using JOIN:
> SELECT * FROM A, B WHERE A.ID = B.ID;
>
> Using nested SELECT:
> SELECT * FROM A WHERE A.ID = (SELECT ID FROM B);
> Thanks
>|||cheem,
If you are only interested in the column values of table A, and if
there is at most one matching row in table B for each row in table A,
then there are basically three methods to achieve the same thing:
Inner Join
--
1a) SELECT A.* FROM A, B WHERE A.ID = B.ID
1b) SELECT A.* FROM A INNER JOIN B ON B.ID = A.ID
Exists
--
2) SELECT * FROM A WHERE EXISTS (SELECT 1 FROM B WHERE B.ID = A.ID)
In
--
3) SELECT * FROM A WHERE ID IN (SELECT ID FROM B)
Which method is fastest depends on the relative table size, and the
number of distinct values in table B.
Method 1 has the advantage that the optimizer can use the MERGE JOIN
operator which is very efficient.
Method 2 has good performance overall, particularly if table A (or the
selection from table A) is relatively small compared to table B.
Method 3 is very fast if table B contains just a few unique values. If you use
method 3, then make sure B.ID never contains NULL.
So as usual: it depends.
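If you want to see which one wins on your own data, a simple hedged sketch for comparing the three forms (using the same A and B placeholders as above) is to run them with I/O and timing statistics on and compare the output in the Messages tab:

-- Hedged sketch: compare logical reads and elapsed time for each variant.
SET STATISTICS IO ON
SET STATISTICS TIME ON

SELECT A.* FROM A INNER JOIN B ON B.ID = A.ID
SELECT * FROM A WHERE EXISTS (SELECT 1 FROM B WHERE B.ID = A.ID)
SELECT * FROM A WHERE ID IN (SELECT ID FROM B)

SET STATISTICS IO OFF
SET STATISTICS TIME OFF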
Hope this helps,
Gert-Jan

Saturday, February 25, 2012

Perform several operations in 1 ALTER TABLE?

Hi there. Does anyone know of a way to perform several operations on a table within only 1 ALTER TABLE statement? I haven't found anything to date and don't even know if it's possible.
Thanks,
Angel|||What's the point? Are you worried about performance? You shouldn't be, because you shouldn't be altering your tables at all in production, other than to fix bugs or release new versions of a product. It shouldn't be an everyday thing.|||I'm almost certain that this is a homework question. If you post a URL to the assignment, or a scanned image of the handout you got in class, we can provide you with better help.

Yes, you can make more than one change in a single ALTER TABLE statement. No, you shouldn't do it because of the reasons cited by Jezemine and your present uncertainty in the process.
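For what it's worth, here is a minimal sketch of the syntax with a hypothetical table and made-up column names: a single ALTER TABLE can carry several comma-separated changes of the same kind, such as adding more than one column at once.

-- Hedged sketch: one ALTER TABLE statement adding two columns in one go.
ALTER TABLE dbo.SomeTable
    ADD CreatedDate datetime NULL,
        Notes varchar(200) NULL;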

-PatP

Performance issue

Looking for suggestions about what I might look for...
During periods of time when no INSERTs are taking place, SELECT statement
durations are around 16-30 ms for hundreds of requests per minute.
When a batch of INSERTs numbering in the many thousands is sent to the table
via an SP, for a while both INSERTs and SELECTs coexist at around the same
duration of 16-30 ms (watching this with Profiler).
Then, for some reason, some SELECTs and a few INSERTs will go up to the
4000-5000 ms range. Then things will quiet down, then go back up to 4000 ms.
These don't have to be on the same table and frequently are not: the INSERTs
are going to one table while the majority of the SELECTs are going to another
table in the same DB. This happens on all the servers we have, so it's not
specific to one server. They all have the same hardware and software.
I've looked for deadlocks and there are none. I've checked some of the common
counters in PERFMON such as Buffer Cache Hit Ratio. Looks OK. There's 4GB of
RAM on the server and 1GB available.
The MDF has its own separate drive and the LDF has its own drive.
It almost looks like there's some sort of contention between the INSERT and
the SELECT, but since they are in separate tables, I have no idea what to look
for.
any help appreciated.
The servers were recently upgraded from SQL 7. I'm not sure if this was
happening under SQL 7. I don't think so. I've updated statistics on all
servers and recreated the clustered indexes.
SQL 2000 with SP4.
thanks,
Don|||Try updating stats with FULLSCAN. Also, trace the Lock Escalation event.
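A minimal sketch of that first suggestion, with placeholder names standing in for your two tables:

-- Hedged sketch: full-scan statistics update on the affected tables,
-- so the optimizer estimates from a complete scan rather than a sample.
UPDATE STATISTICS dbo.YourInsertTable WITH FULLSCAN
UPDATE STATISTICS dbo.YourSelectTable WITH FULLSCAN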
--
Tom
----
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA
SQL Server MVP
Columnist, SQL Server Professional
Toronto, ON Canada
www.pinpub.com
.
"donsql22222" <donsql22222@.discussions.microsoft.com> wrote in message
news:776E5217-2EDB-40FF-BFA6-E8A3E1ADF980@.microsoft.com...
Looking for suggestions about what I might look for..
During periods of time when no INSERTS are taking place, SELECT statement
durations are around 16-30ms for hundreds of requests per minute.
When there is a batch of INSERTS that are sent to the table via SP that
number in the many thousands..for a while both INSERTS and SELECTS coexist
at
around the same duration of 16-30 ms..watching this with Profiler.
Then, for some reason..some SELECT and few INSERTS will go up to the 4000 -
5000 ms range. Then things will quiet down, then go back up to 4000ms. These
don't have to be on the same table and frequently are not...INSERTS are
going
to 1 table while the majority are going to another table in the same DB.
This
happens on all the servers we have so it's not specific to one server. They
all have the same hardware and software.
I've looked for deadlocks and there are none. I've set some of the common
things in PERFMON such as CACHE BUFFER HIT RATIO. Looks ok. There's 4GB of
RAM on the server and there's 1GB available.
The MDF has it's own separate drive and the LDF has it's own drive.
It's almost looks like theres some sort of contention between the INSERT and
the SELECT, but since it's in separate tables...i have no idea what to look
for.
any help appreciated.
The servers were recently upgraded from SQL 7 .. I'm not sure if this was
happening under SQL 7. I don't think so. I've updated statistics on all
servers and recreated the clustered indexes.
SQL 2000 with SP4.
thanks,
Don|||On Mon, 13 Feb 2006 16:38:26 -0800, "donsql22222"
<donsql22222@.discussions.microsoft.com> wrote:
>It's almost looks like theres some sort of contention between the INSERT and
>the SELECT, but since it's in separate tables...i have no idea what to look
>for.
Of course there's contention, for LRU pages in memory, all the more so
because they are different tables!
Run the queries during the slow period from QA with SET STATISTICS IO
ON, and look at the physical reads numbers going from 0 to whatever.
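A hedged sketch of that check, with a placeholder query standing in for one of your actual slow statements:

-- Hedged sketch: run a slow statement from Query Analyzer with I/O statistics
-- on and watch the "physical reads" count in the Messages output.
SET STATISTICS IO ON

SELECT TOP 100 * FROM dbo.YourSelectTable   -- placeholder for a real slow query

SET STATISTICS IO OFF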
J.