Monday, March 12, 2012

Performance Advice

Hi,

The app I work on has a few complex queries (SELECTs joining 5-9 tables, over data volumes of 200,000+ rows). These typically aggregate information from the tables in question and work on large volumes of data. We also have some stored procs that work on several rows of data (again in large volumes) and perform writes elsewhere.

Overall, the tables these two types of queries run on contain millions of rows, and the queries normally work on a subset that is still substantially large - 100,000 rows, for example.
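To give a sense of the shape (the table and column names below are made up purely for illustration; the real schema is different), a typical query looks roughly like this:

-- Hypothetical example only: real table/column names differ.
-- A typical query joins several large tables, filters down to a big subset,
-- and aggregates the result.
SELECT c.Region,
       p.Category,
       SUM(s.Quantity * s.UnitPrice) AS TotalValue,
       COUNT(*)                      AS LineCount
FROM   dbo.Sales      AS s
JOIN   dbo.Customers  AS c ON c.CustomerId  = s.CustomerId
JOIN   dbo.Products   AS p ON p.ProductId   = s.ProductId
JOIN   dbo.Warehouses AS w ON w.WarehouseId = s.WarehouseId
JOIN   dbo.Calendar   AS d ON d.DateKey     = s.DateKey
WHERE  d.FiscalYear = 2011
  AND  w.CountryCode = 'US'   -- narrows millions of rows to ~100,000
GROUP BY c.Region, p.Category;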

The app I work on is a decision-support type of tool, and we don't expect these queries to be used the way a typical transaction processing system is. They will probably only be used by a few people at a time (1-3).

I've noticed that when running this feature against the dev DB box (a 2 GHz desktop PC with 1 GB RAM), the CPU spikes to 100% and the DB server starts refusing connections to other users. In effect, until the long-running queries complete, the DB cannot be used by anyone else.

I've tried running the same app against a multi-processor server. I find that such queries still drive one of the CPUs to 100%, but connections are not refused and other users can still use the database, presumably because the other processors on the box pick up the load.

Although this is a somewhat odd question: what does this mean? Does it mean the query is poorly written? Yes, it is complex, and we've tried to tune it to the extent possible - indexes, join sequences, intermediate materialization, etc.

Is it wrong to throw hardware at such a problem? How much more might the query be tweaked? What is the threshold beyond which the query cannot be tweaked any further and simply needs better hardware? How does one determine this?

What I'm hoping to get from the above questions is just pointers for me to explore further and learn how I should proceed.

Any suggestions will be great.

Thanks,

Avinash

Hello,

I am not sure whether you are using the MAXDOP option for the query running in production on multiple CPUs. Please look up "Degree of Parallelism" in BOL.
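For example (just a sketch - the table names and the value of 4 are illustrative, and the right setting depends on your hardware and workload), the degree of parallelism can be controlled per query with a hint or server-wide with sp_configure:

-- Sketch only: table names and values are illustrative.
-- Per-query hint: control how many CPUs this statement may use
-- (0 = all available CPUs; a small number caps it).
SELECT c.Region, SUM(s.Quantity) AS TotalQty
FROM   dbo.Sales     AS s
JOIN   dbo.Customers AS c ON c.CustomerId = s.CustomerId
GROUP BY c.Region
OPTION (MAXDOP 4);

-- Server-wide default, set via sp_configure
-- ('max degree of parallelism' is an advanced option).
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 4;
RECONFIGURE;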

I hope this helps....

Thanks.....

|||

Do any other applications use the database(s) on this server? Are they affected by your complex queries?

If performance turns out to be a serious problem, and if you tend to frequently aggregate your data, then you could consider using Analysis Services as part of your solution.

Chris

|||

Hello Chris,

Many thanks for your response. I don't have an exact answer to your question, because what we build is a product and it may very well be installed on a shared database server.

But what I have seen in general is that the complex queries eat up 100% of just one CPU per user, and the number of CPUs pegged at 100% goes up with each additional user running the complex query. In my scenario, we don't expect more than 1 or 2 users running such queries. If the app is running on a quad-processor machine, other database users will not be completely stranded - assuming those other users don't place similar demands on the database.

I've seen other transaction-processing kinds of apps work fine against another database on the same database server. Each complex query chews up one processor, from what I've seen.

Honestly, I'm not conversant with Analysis Services; I will look into it. But I was hoping to get some ideas on capacity planning and on determining the base hardware requirement to service our needs, given some idea of the volumes we work with.

Any ideas will be appreciated.

Thanks,

Avinash

|||

Hi,

Firstly, if you use long-running transactions you might consider Windows Workflow Foundation. By coincidence, I have looked into this kind of situation myself. I have a database with one table averaging 3 million rows, a related table with 1.5 million rows, and another related table with 2 million rows, so I had to think about what to do. Even though my server has lower specs than yours, its CPU never spikes to 100%, so I suspect there is a problem in your SQL statement.

Once I had two tables with 100,000 and 250,000 rows and, like yours, my server crashed too; I looked at Task Manager and saw that the query had driven memory usage up to 7 GB! Anyway, I suggest you use the Index Tuning Wizard. If you haven't indexed your tables correctly, you should do that right away. If you have indexed them, then consider partitioning the tables. As I said, I run long queries and get a response in 0-1 seconds. If you clarify what your queries do (e.g. are they INSERT, UPDATE, or SELECT queries?) I can help further.
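As a rough sketch of what I mean (the table, column, and boundary values below are made up, not taken from your system), indexing the join and filter columns and, if that is already done, partitioning a large table by date might look like this:

-- Sketch only: names and values are illustrative.
-- 1. Index the columns used in joins and WHERE clauses.
CREATE NONCLUSTERED INDEX IX_Sales_DateKey_CustomerId
    ON dbo.Sales (DateKey, CustomerId)
    INCLUDE (Quantity, UnitPrice);

-- 2. If indexes are already in place, consider partitioning a big table
--    (here by year, using an int YYYYMMDD date key) so scans touch only
--    the relevant range.
CREATE PARTITION FUNCTION pfSalesByYear (int)
    AS RANGE RIGHT FOR VALUES (20090101, 20100101, 20110101, 20120101);

CREATE PARTITION SCHEME psSalesByYear
    AS PARTITION pfSalesByYear ALL TO ([PRIMARY]);

-- New or rebuilt table placed on the partition scheme, keyed by DateKey.
CREATE TABLE dbo.Sales_Partitioned
(
    SalesId    bigint NOT NULL,
    DateKey    int    NOT NULL,
    CustomerId int    NOT NULL,
    ProductId  int    NOT NULL,
    Quantity   int    NOT NULL,
    UnitPrice  money  NOT NULL,
    CONSTRAINT PK_Sales_Partitioned PRIMARY KEY (SalesId, DateKey)
) ON psSalesByYear (DateKey);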
|||

Hello Ibrahim,

My queries are mostly complex SELECTs with a whole bunch of WHERE conditions. They typically aggregate data across several thousand rows, and they need to sift through millions of rows to find that subset of data.

I also have some inserts, but those come with additional read logic, which again means SELECTs with complex conditions and aggregations.

The point that I seem to have accepted is this: when you say "I run long queries and get a response in 0-1 seconds", that cannot be an absolute statement, right? At some point your needs will require better hardware; it cannot be that any need can be satisfied by the existing hardware. What I'm trying to understand is how you determine that, for your needs, X type of hardware is non-negotiable.

Thanks,

Avinash

|||

In my past experience, I split a complex join statement into multiple smaller statements. That reduces CPU usage. You can also review your SQL statement for logic errors in order to reduce the number of intermediate records generated.
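As a rough sketch of the idea (table and column names are hypothetical), you can materialize the heavily filtered rows into a temp table first and then run the remaining joins and aggregation against that smaller set:

-- Sketch only: table/column names are illustrative.
-- Step 1: reduce the big fact table to the subset you actually need.
SELECT s.SalesId, s.CustomerId, s.ProductId, s.Quantity, s.UnitPrice
INTO   #FilteredSales
FROM   dbo.Sales    AS s
JOIN   dbo.Calendar AS d ON d.DateKey = s.DateKey
WHERE  d.FiscalYear = 2011;

CREATE CLUSTERED INDEX IX_FilteredSales ON #FilteredSales (CustomerId, ProductId);

-- Step 2: the remaining joins and aggregation now work on ~100,000 rows
-- instead of millions, which usually yields a simpler, cheaper plan.
SELECT c.Region, p.Category, SUM(f.Quantity * f.UnitPrice) AS TotalValue
FROM   #FilteredSales AS f
JOIN   dbo.Customers  AS c ON c.CustomerId = f.CustomerId
JOIN   dbo.Products   AS p ON p.ProductId  = f.ProductId
GROUP BY c.Region, p.Category;

DROP TABLE #FilteredSales;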
