Wednesday, March 28, 2012

Performance difference over "not exists" and left join?

Hi. I have a query that inserts rows in a table where the rows do not
already exists. I have two techniques I use to do this one is "not exists":
where not exists
(
select
*
from
Supply
where
Supply.BranchNumber = XmlSupply.BranchNumber
and
Supply.RetailerCode = XMLSupply.RetailerCode
and
Supply.TitleCode = XMLSupply.TitleCode
and
Supply.IssueYear = XMLSupply.IssueYear
and
Supply.IssueNumber = XMLSupply.IssueNumber
)
and the other is a left join and using a "where is null":
left join
Supply s
on
s.BranchNumber = xs.BranchNumber
and
s.RetailerCode = xs.RetailerCode
and
s.TitleCode = xs.TitleCode
and
s.IssueYear = xs.IssueYear
and
s.IssueNumber = xs.IssueNumber
where
s.BranchNumber is null
I cannot see any performance difference between the two. But then the table
only has a small sample in at present. Is there a difference between them?
Should I favour one over the other?
McGy
[url]http://mcgy.blogspot.com[/url]Yes, you don't see any differences because the tables are small. I have seen
some performance improvement with LEFT JOIN on very large tables,so test it
ans see what is going on
"McGy" <anon@.anon.com> wrote in message
news:esRtvucRGHA.5296@.tk2msftngp13.phx.gbl...
> Hi. I have a query that inserts rows in a table where the rows do not
> already exists. I have two techniques I use to do this one is "not
> exists":
> where not exists
> (
> select
> *
> from
> Supply
> where
> Supply.BranchNumber = XmlSupply.BranchNumber
> and
> Supply.RetailerCode = XMLSupply.RetailerCode
> and
> Supply.TitleCode = XMLSupply.TitleCode
> and
> Supply.IssueYear = XMLSupply.IssueYear
> and
> Supply.IssueNumber = XMLSupply.IssueNumber
> )
> and the other is a left join and using a "where is null":
> left join
> Supply s
> on
> s.BranchNumber = xs.BranchNumber
> and
> s.RetailerCode = xs.RetailerCode
> and
> s.TitleCode = xs.TitleCode
> and
> s.IssueYear = xs.IssueYear
> and
> s.IssueNumber = xs.IssueNumber
> where
> s.BranchNumber is null
> I cannot see any performance difference between the two. But then the
> table
> only has a small sample in at present. Is there a difference between them?
> Should I favour one over the other?
> --
> McGy
> [url]http://mcgy.blogspot.com[/url]
>
>|||Check the execution plans. SQL Server will probably generate the same plans
if the queries are semantically identical and you'll get the same
performance if that's the case. Note that the queries will be the same
semantically only if you are including all the primary key columns from the
Supply table.
Hope this helps.
Dan Guzman
SQL Server MVP
"McGy" <anon@.anon.com> wrote in message
news:esRtvucRGHA.5296@.tk2msftngp13.phx.gbl...
> Hi. I have a query that inserts rows in a table where the rows do not
> already exists. I have two techniques I use to do this one is "not
> exists":
> where not exists
> (
> select
> *
> from
> Supply
> where
> Supply.BranchNumber = XmlSupply.BranchNumber
> and
> Supply.RetailerCode = XMLSupply.RetailerCode
> and
> Supply.TitleCode = XMLSupply.TitleCode
> and
> Supply.IssueYear = XMLSupply.IssueYear
> and
> Supply.IssueNumber = XMLSupply.IssueNumber
> )
> and the other is a left join and using a "where is null":
> left join
> Supply s
> on
> s.BranchNumber = xs.BranchNumber
> and
> s.RetailerCode = xs.RetailerCode
> and
> s.TitleCode = xs.TitleCode
> and
> s.IssueYear = xs.IssueYear
> and
> s.IssueNumber = xs.IssueNumber
> where
> s.BranchNumber is null
> I cannot see any performance difference between the two. But then the
> table
> only has a small sample in at present. Is there a difference between them?
> Should I favour one over the other?
> --
> McGy
> [url]http://mcgy.blogspot.com[/url]
>
>|||Thanks for that.
McGy
[url]http://mcgy.blogspot.com[/url]
"Dan Guzman" <guzmanda@.nospam-online.sbcglobal.net> wrote in message
news:OSzsTYdRGHA.4960@.TK2MSFTNGP12.phx.gbl...
> Check the execution plans. SQL Server will probably generate the same
plans
> if the queries are semantically identical and you'll get the same
> performance if that's the case. Note that the queries will be the same
> semantically only if you are including all the primary key columns from
the
> Supply table.
> --
> Hope this helps.
> Dan Guzman
> SQL Server MVP
> "McGy" <anon@.anon.com> wrote in message
> news:esRtvucRGHA.5296@.tk2msftngp13.phx.gbl...
them?
>|||McGy (anon@.anon.com) writes:
> Hi. I have a query that inserts rows in a table where the rows do not
> already exists. I have two techniques I use to do this one is "not
> exists":
>...
> and the other is a left join and using a "where is null":
>...
> I cannot see any performance difference between the two. But then the
> table only has a small sample in at present. Is there a difference
> between them? Should I favour one over the other?
Some people claim that LEFT JOIN is faster than NOT EXISTS, but the only
right thing is to benchmark for each case.
Personally, I prefer to write my query to express what I am looking for,
which means that in most cases, I use NOT EXISTS.
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx|||Let's do a quick test
use pubs
go
--query 1
select Title as [titles not sold]
from titles t left outer join sales s on t.title_id = s.title_id
where s.title_id is null
--query2
select Title as [titles not sold]
from titles t where not exists (select * from sales s where t.title_id =
s.title_id)
Hit CTRL + K, press F5
Now look at the execution plan
query 1 = 50.02%
query 2 = 49.98%
so it looks like query 2 is a little (very little) more efficient in this
case since the first query has to apply a filter (s.title_id is null)
http://sqlservercode.blogspot.com/
"McGy" <anon@.anon.com> wrote in message
news:esRtvucRGHA.5296@.tk2msftngp13.phx.gbl...
> Hi. I have a query that inserts rows in a table where the rows do not
> already exists. I have two techniques I use to do this one is "not
> exists":
> where not exists
> (
> select
> *
> from
> Supply
> where
> Supply.BranchNumber = XmlSupply.BranchNumber
> and
> Supply.RetailerCode = XMLSupply.RetailerCode
> and
> Supply.TitleCode = XMLSupply.TitleCode
> and
> Supply.IssueYear = XMLSupply.IssueYear
> and
> Supply.IssueNumber = XMLSupply.IssueNumber
> )
> and the other is a left join and using a "where is null":
> left join
> Supply s
> on
> s.BranchNumber = xs.BranchNumber
> and
> s.RetailerCode = xs.RetailerCode
> and
> s.TitleCode = xs.TitleCode
> and
> s.IssueYear = xs.IssueYear
> and
> s.IssueNumber = xs.IssueNumber
> where
> s.BranchNumber is null
> I cannot see any performance difference between the two. But then the
> table
> only has a small sample in at present. Is there a difference between them?
> Should I favour one over the other?
> --
> McGy
> [url]http://mcgy.blogspot.com[/url]
>
>|||Really good thread :-)... I also go with "NOT EXISTS", I have not got any
chance to use it on production environment or else i would have strongly
suggested that, as there was a design change in my dataware housing project.
I got a link just search for "NOT EXISTS" in that,
http://msdn.microsoft.com/library/d...ntbpwithdts.asp
In that MS suggest to use "NOT EXISTS" to insert new dimesnion records...
What my peers say is using 'NOT EXISTS' can minimise table scan while
searching for records...
1. Avoid using IN’s where possible. An IN can be replaced by an EXISTS or
a
NOT EXISTS which is invariably faster as an EXISTS works by creating a join
between the two tables. If you really want to use IN, place the values that
appear most commonly in the database towards the start of the list. This is
because the query executioner basically loops through the list left to right
looking for the presence of each value in turn.
Have a look in below blog also,
http://blogs.claritycon.com/blogs/t...egory/1007.aspx
Thanks,
Sree
[Please specify the version of Sql Server as we can save one thread and time
asking back if its 2000 or 2005]
"SQL" wrote:

> Let's do a quick test
> use pubs
> go
> --query 1
> select Title as [titles not sold]
> from titles t left outer join sales s on t.title_id = s.title_id
> where s.title_id is null
> --query2
> select Title as [titles not sold]
> from titles t where not exists (select * from sales s where t.title_id =
> s.title_id)
> Hit CTRL + K, press F5
> Now look at the execution plan
> query 1 = 50.02%
> query 2 = 49.98%
> so it looks like query 2 is a little (very little) more efficient in this
> case since the first query has to apply a filter (s.title_id is null)
> http://sqlservercode.blogspot.com/
>
> "McGy" <anon@.anon.com> wrote in message
> news:esRtvucRGHA.5296@.tk2msftngp13.phx.gbl...
>
>

No comments:

Post a Comment