
Wednesday, March 28, 2012

Performance for nightly jobs

I have a job that takes 9 hours that I need to optimize. The job is an sp
that populates a flat table. Here are the rules of the job.
Members own contracts. Contracts are made up of different products that are
grouped in product categories. A certain product may be in several different
contracts. My job needs to display the member, the product category, and a
distinct count of products in that category for the member.
I have optimized the query so far by taking it out of a cursor, but I have
one last query that is still hurting my performance.
The first pass (which has a slight flaw) looked like so...
SELECT a.member_id,
       b.product_category,
       COUNT(b.product_id)
FROM member_contract a (NOLOCK)
JOIN product b (NOLOCK)
    ON a.contract_id = b.contract_id
GROUP BY a.member_id,
         b.product_category
The result time was beautiful in which the whole query took less than 2 min
and the correct number of rows in the result set. Problem is that the
products are duplicated and I do not get an accurate count of products in a
product category for each member (I get a number larger than what is actually
designated for that particular member), so I put in the keyword DISTINCT.
This really dropped performance down. The new query looks like so...
SELECT a.member_id,
       b.product_category,
       COUNT(DISTINCT b.product_id)
FROM member_contract a (NOLOCK)
JOIN product b (NOLOCK)
    ON a.contract_id = b.contract_id
GROUP BY a.member_id,
         b.product_category
I have introduced the following indexes on the product table trying to
optimize this query.
1st
contract_id,
product_category,
product_id
2nd
product_category,
contract_id,
product_id
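Spelled out as DDL, those two indexes would look like the following (the index
names here are made up for illustration; only the key columns come from the
post):
CREATE NONCLUSTERED INDEX ix_product_contract_category
    ON product (contract_id, product_category, product_id)
CREATE NONCLUSTERED INDEX ix_product_category_contract
    ON product (product_category, contract_id, product_id)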
There is a PK & clustered index on the product table that is not the
product_id (it is simply an identity column that cannot be changed because I
have a full-text index on another unrelated column).
Table product has 3 million rows
Table member_contract has 33,000 rows
For those that have taken the time to read through my pain... thank you, I
really appreciate it.
Jared
Hi,
Please post DDL (CREATE TABLE statements) and sample data (INSERT
statements, just enough to duplicate your incorrect count issue); there may
be another way to re-write the query without the DISTINCT keyword that's
slowing you down.
"kirk1880" <kirk1880@.discussions.microsoft.com> wrote in message
news:C70994A5-DAB1-4501-BF48-8966FA33C73F@.microsoft.com...
> I have a job that takes 9 hours that I need to optimize. The job is an sp
> that populates a flat table. Here are the rules of the job.
> Members own contracts. Contracts are made up of different products that are
> grouped in product categories. A certain product may be in several different
> contracts. My job needs to display the member, the product category, and a
> distinct count of products in that category for the member.
> I have optimized the query so far by taking it out of a cursor but I have
> one last query that is still hurting my performance.
> The first pass (which has a slight flaw) looked like so...
> SELECT a.member_id,
> b.product_category,
> COUNT(b.product_id)
> FROM member_contract a (NOLOCK)
> JOIN product b (NOLOCK)
> ON a.contract_id = b.contract_id
> GROUP BY a.member_id,
> b.product_category
> The result time was beautiful in which the whole query took less than 2 min
> and the correct number of rows in the result set. Problem is that the
> products are duplicated and I do not get an accurate count of products in a
> product category for each member (I get a number larger than what is actually
> designated for that particular member), so I put in the keyword DISTINCT.
> This really dropped performance down. The new query looks like so...
>
> SELECT a.member_id,
> b.product_category,
> COUNT(DISTINCT b.product_id)
> FROM member_contract a (NOLOCK)
> JOIN product b (NOLOCK)
> ON a.contract_id = b.contract_id
> GROUP BY a.member_id,
> b.product_category
> I have introduced the following indexes on the product table trying to
> optimize this query.
> 1st
> contract_id,
> product_category,
> product_id
> 2nd
> product_category,
> contract_id,
> product_id
> There is a PK & clustered index on the product table that is not the
> product_id (it is simply an identity column that cannot be changed because I
> have a full-text index on another unrelated column).
>
> Table product has 3 million rows
> Table member_contract has 33,000 rows
> For those that have taken the time to read through my pain...Thank you I
> really appreciate it.
> Jared

CREATE TABLE member_contract
(
member_id int,
contract_id int
)
CREATE TABLE product
(
contract_id int,
product_category_id int,
product_id int
)
INSERT member_contract VALUES (1,1)
INSERT member_contract VALUES (1,2)
INSERT member_contract VALUES (2,1)
INSERT member_contract VALUES (2,3)
INSERT product VALUES (1, 1, 1)
INSERT product VALUES (1, 1, 1)
INSERT product VALUES (1, 1, 1)
INSERT product VALUES (1, 1, 2)
INSERT product VALUES (1, 2, 1)
INSERT product VALUES (1, 2, 1)
INSERT product VALUES (2, 1, 4)
INSERT product VALUES (2, 2, 2)
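For what it's worth, worked out by hand from the inserts above, the
COUNT(DISTINCT b.product_id) query should return these rows against this
sample (the plain COUNT version returns 5 instead of 3 for member 1,
category 1, which is the duplication problem in a nutshell):
member_id  product_category_id  count
1          1                    3    -- distinct products {1, 2, 4}
1          2                    2    -- distinct products {1, 2}
2          1                    2    -- distinct products {1, 2}
2          2                    1    -- distinct products {1}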
"Adam Machanic" wrote:

> Hi,
> Please post DDL (CREATE TABLE statements) and sample data (INSERT
> statements, just enough to duplicate your incorrect count issue); there may
> be another way to re-write the query without the DISTINCT keyword that's
> slowing you down.
>
> "kirk1880" <kirk1880@.discussions.microsoft.com> wrote in message
> news:C70994A5-DAB1-4501-BF48-8966FA33C73F@.microsoft.com...

Kirk,
How can a product change categories?
INSERT product VALUES (1, 1, 2)
INSERT product VALUES (1, 2, 1)
"kirk1880" <kirk1880@.discussions.microsoft.com> wrote in message
news:CE4E31B2-A028-42FB-8B3C-4E74DD983881@.microsoft.com...
> CREATE TABLE product
> (
> contract_id int,
> product_category_id int,
> product_id int
> )
>

The short answer is bad database design. The long answer is that category is a
loose term (and when I say loose I mean very loose): a product can be part of a
package, like needles, or it can be cross-referenced in the subscription drug
category.
Right now I am playing with grouping all the items to remove the dups and then
aggregating on the result of the correlated query. So far it is not helping
much.
"Adam Machanic" wrote:

> Kirk,
> How can a product change categories?
> INSERT product VALUES (1, 1, 2)
> INSERT product VALUES (1, 2, 1)
>
> "kirk1880" <kirk1880@.discussions.microsoft.com> wrote in message
> news:CE4E31B2-A028-42FB-8B3C-4E74DD983881@.microsoft.com...
>
>

Kirk,
I also notice that you don't have a primary key on the product table, and
you have duplicate data. Is that the case in the real database? I don't
think there's going to be a way to speed up this process until you deal with
your data integrity issues.
"kirk1880" <kirk1880@.discussions.microsoft.com> wrote in message
news:3B37AB13-099F-4FAD-B874-3F192A2B7C13@.microsoft.com...
> The short answer is bad database design. The long answer is that category is
> a loose term (and when I say loose I mean very loose): a product can be part
> of a package, like needles, or it can be cross-referenced in the subscription
> drug category.
> Right now I am playing with grouping all the items to remove the dups and
> then aggregating on the result of the correlated query. So far it is not
> helping much.
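
A minimal sketch of the rewrite being circled in this thread, using the table
and column names from the sample DDL above: collapse the duplicate (member,
category, product) combinations in a derived table first, then count plain
rows, so the outer aggregate no longer needs DISTINCT. Whether this actually
beats COUNT(DISTINCT ...) on the real 3-million-row table is something to
test, not a given.
SELECT d.member_id,
       d.product_category_id,
       COUNT(*) AS product_count
FROM (
    -- The inner GROUP BY removes both the duplicate rows in product and
    -- the double-counting of a product reachable through two contracts
    -- owned by the same member.
    SELECT a.member_id,
           b.product_category_id,
           b.product_id
    FROM member_contract a
    JOIN product b
        ON a.contract_id = b.contract_id
    GROUP BY a.member_id,
             b.product_category_id,
             b.product_id
) AS d
GROUP BY d.member_id,
         d.product_category_id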

Tuesday, March 20, 2012

Performance baselines

Hi,
I'm currently looking into moving a table to its own filegroup with
a view to improving performance.
I want some way to prove that there is a performance improvement, so I
want to take a performance baseline of current activity to compare to
once I've moved the table. I'm going to use
sys.dm_io_virtual_file_stats to get an overview of what IO has been
taken away from the main filegroup, but would also like to get some
table-level stats. The number of times the table is accessed should
not change, so I was wondering what table-level data is available for
me to show an improvement. That is, if there is an improvement!
Cheers
Simon

I think you can do one thing here:
Just note down the time taken (duration) for queries that are getting fired
on this table with the existing setup, during off-peak or peak hours as per
your environment, and then run them (queries or procedures) again once you
are done with the movement and compare the stats.
Manu
"simon.pope.public@.gmail.com" wrote:
> Hi,
> I'm currently looking into moving a table to its own filegroup with
> a view to improving performance.
> I want some way to prove that there is a performance improvement, so I
> want to take a performance baseline of current activity to compare to
> once I've moved the table. I'm going to use
> sys.dm_io_virtual_file_stats to get an overview of what IO has been
> taken away from the main filegroup, but would also like to get some
> table-level stats. The number of times the table is accessed should
> not change, so I was wondering what table-level data is available for
> me to show an improvement. That is, if there is an improvement!
> Cheers
> Simon

On 13 Sep, 20:40, manu <m...@.discussions.microsoft.com> wrote:
> I think you can do one thing here:
> Just note down the time taken (duration) for queries that are getting fired
> on this table with the existing setup, during off-peak or peak hours as per
> your environment, and then run them (queries or procedures) again once you
> are done with the movement and compare the stats.
> Manu
>
> "simon.pope.pub...@.gmail.com" wrote:
> > Hi,
> > I'm currently looking into moving a table to its own filegroup with
> > a view to improving performance.
> > I want some way to prove that there is a performance improvement, so I
> > want to take a performance baseline of current activity to compare to
> > once I've moved the table. I'm going to use
> > sys.dm_io_virtual_file_stats to get an overview of what IO has been
> > taken away from the main filegroup, but would also like to get some
> > table-level stats. The number of times the table is accessed should
> > not change, so I was wondering what table-level data is available for
> > me to show an improvement. That is, if there is an improvement!
> > Cheers
> > Simon
Cheers Manu, I think you're right. It is a performance enhancement
after all and so time taken to query data in the table is as good an
indicator as any.
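
A minimal sketch of the kind of before/after snapshot discussed above,
assuming SQL Server 2005 or later; MyTable is a placeholder for the table
being moved. Both DMVs accumulate since the last instance restart, so capture
the numbers at the same point in the job window before and after the move and
diff them.
-- File-level IO for the current database, per file
SELECT DB_NAME(vfs.database_id) AS database_name,
       vfs.file_id,
       vfs.num_of_reads,
       vfs.num_of_bytes_read,
       vfs.io_stall_read_ms,
       vfs.num_of_writes,
       vfs.num_of_bytes_written,
       vfs.io_stall_write_ms
FROM sys.dm_io_virtual_file_stats(DB_ID(), NULL) AS vfs
-- Table-level IO latch waits, per index on the table being moved
SELECT i.name AS index_name,
       ios.page_io_latch_wait_count,
       ios.page_io_latch_wait_in_ms
FROM sys.dm_db_index_operational_stats(DB_ID(), OBJECT_ID('MyTable'), NULL, NULL) AS ios
JOIN sys.indexes AS i
    ON i.object_id = ios.object_id
   AND i.index_id = ios.index_id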

Saturday, February 25, 2012

perfmon to remote server

I'm not running SQLServer on my own workstation.
I am running perfmon, just as a matter of course, showing CPU and disk
and net traffic. I'd also like to monitor some remote SQLServers.
When I first clicked on the Add Counters dialog and typed in the name
of a SQLServer (say, "MySQLServer1"), it didn't connect, nothing
happened. But after I opened a fresh connection to MySQLServer1 for
the query analyzer, when I went back to perfmon, now it does list
\\MySQLServer1. But, it shows no properties for it.
Do I need some more advanced admin privileges on the remote boxes (or
network generally) for this to work?
Thanks.
Josh

Local Admins to read performance counters. Probably more specialized
security could be given, but Local Admins have the rights.
Sincerely,
Anthony Thomas
"jxstern" wrote:
> I'm not running SQLServer on my own workstation.
> I am running perfmon, just as a matter of course, showing CPU and disk
> and net traffic. I'd also like to monitor some remote SQLServers.
> When I first clicked on the Add Counters dialog and typed in the name
> of a SQLServer (say, "MySQLServer1"), it didn't connect, nothing
> happened. But after I opened a fresh connection to MySQLServer1 for
> the query analyzer, when I went back to perfmon, now it does list
> \\MySQLServer1. But, it shows no properties for it.
> Do I need some more advanced admin privileges on the remote boxes (or
> network generally) for this to work?
> Thanks.
> Josh
>
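
If handing out Local Admin is not an option, Windows Server 2003 and later
also ship a built-in Performance Monitor Users local group whose members can
read performance counters remotely without full admin rights. A sketch, run
on the remote server (DOMAIN\josh is a placeholder account):
net localgroup "Performance Monitor Users" DOMAIN\josh /add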
