Ok, here goes:
500 million rows, about 10 columns, about 16 bytes each column.
What is the best way to approach multiple aggregate functions against
this data?
I have pretty much any option available to me, including 8 dual 3 GHz
Xeons on gigabit Ethernet with 2 GB RAM each, that are at my disposal.
What is the best scheme for handling data of this size with expensive
queries (aggregates, etc.)?
Partitioned table structure (horizontal or vertical)?
Federated servers?
Any pointer in the right direction is appreciated. Right now my plan is
as follows:
8 servers in 4 pairs, each pair load balanced and containing the
same data as its mate in the pair,
partitioning the table between the 4 pairs.
Is this reasonable?
P.S. There is no concern about redundancy at this point.
Thanks
Chris
> What is the best scheme for handling data of this size with expensive
> queries (aggregates, etc.)?
Analysis Services?
A|||"WertmanTheMad" <cwertman@.webchamps.com> wrote in message
news:13990782.0411091224.45f30ed4@.posting.google.com...
Pre-aggregating your data and using Analysis Services is probably the
fastest method, but you may also want to take a look at Indexed Views in the BOL.
There are some assumptions and requirements with indexed views, however, and
you should read those carefully. Things to consider are how often the data
is modified, how often it is queried, etc. An indexed view can be helpful
if the data is relatively static and the computations are deterministic. It is
also helpful if queries will use the view on a regular basis (as
opposed to once a week or once a month).
500 million rows, however, is quite a lot, so I'm not sure whether or not
indexed views will be the most helpful.
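To make the indexed-view suggestion concrete, here is a minimal sketch of the
pattern, with hypothetical table and column names (WITH SCHEMABINDING and a
COUNT_BIG(*) column are requirements whenever the view contains a GROUP BY):

-- Pre-aggregate sales by store and month into a schema-bound view.
-- Assumes dbo.Sales exists and Amount is NOT NULL (a nullable column
-- could not be used directly in SUM inside an indexed view).
CREATE VIEW dbo.vSalesByStoreMonth
WITH SCHEMABINDING
AS
SELECT
    StoreId,
    SaleMonth,
    SUM(Amount)  AS TotalAmount,   -- stored, engine-maintained aggregate
    COUNT_BIG(*) AS RowCnt         -- required alongside GROUP BY
FROM dbo.Sales
GROUP BY StoreId, SaleMonth;
GO

-- Creating a unique clustered index is what actually materializes the view.
CREATE UNIQUE CLUSTERED INDEX IX_vSalesByStoreMonth
    ON dbo.vSalesByStoreMonth (StoreId, SaleMonth);

Once the clustered index exists, the aggregates are persisted and maintained
by the engine, so a GROUP BY on store and month can be satisfied from the
view rather than by scanning the 500-million-row base table.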
HTH
Rick Sawtell
MCT, MCSD, MCDBA|||"WertmanTheMad" <cwertman@.webchamps.com> wrote in message
news:13990782.0411091224.45f30ed4@.posting.google.com...
You could also consider an indexed view with the aggregated data in the view
definition.
http://snodland.blogspot.com
|||Well, I jacked up my test data to 1 billion rows (I feel like Dr. Evil
every time I say that) for testing. I'm going to partition it across
the 8 servers for now, and I've come up with a fairly clean partitioning
scheme (I think).
Someone mailed me a link to a rather lengthy and detailed MS
run-through on federated servers, and it does indeed look like it's going
to be my ticket. Because of the size, I am going to split the table
into 16 partitions (2 on each server), and since about 10 or so of the
partitions probably won't even be selected from more than once a week,
I am thinking this is the route.
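For reference, the federated-server setup described in BOL amounts to a
distributed partitioned view: each member server holds one key range, a CHECK
constraint on the partitioning column advertises that range to the optimizer,
and a UNION ALL view stitches the local and linked-server slices back into one
logical table. A rough sketch, with hypothetical table, column, and linked
server names:

-- On member server 1: the local slice of roughly 62.5 million rows.
CREATE TABLE dbo.BigTable_P01 (
    PartitionKey INT NOT NULL
        CONSTRAINT CK_BigTable_P01
        CHECK (PartitionKey BETWEEN 1 AND 62500000),
    Col1 CHAR(16) NOT NULL,
    -- ... remaining columns ...
    CONSTRAINT PK_BigTable_P01 PRIMARY KEY (PartitionKey)
);

-- On the server that fields the reports: a view that unions the local
-- slice with the slices held on the other (linked) member servers.
CREATE VIEW dbo.BigTable
AS
SELECT * FROM dbo.BigTable_P01
UNION ALL
SELECT * FROM SERVER2.FedDB.dbo.BigTable_P02
UNION ALL
SELECT * FROM SERVER3.FedDB.dbo.BigTable_P03;
-- ... and so on for the remaining partitions.

Because the CHECK constraints carve out non-overlapping ranges, a query with a
WHERE clause on PartitionKey only touches the member servers that can hold
matching rows; and since the federation here is read-only, the stricter rules
for updatable partitioned views (partitioning column in the primary key, etc.)
matter less.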
The only thing I am a little fuzzy about is replication from my main
DB (the one with the 1 billion rows) out to my federation. Does anyone have
any suggestions on replicating to a federation?
I should also have mentioned that my data is read-only on the federation
and the partitioned table; it is just being reported on, but to that end it
does need to be current.
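One option that fits a read-only federation that still has to stay current is
filtered transactional replication: publish the big table once per partition
range and push each filtered subscription to the member server that owns that
range. A very abbreviated sketch of the idea (publication, table, column, and
server names are hypothetical, and the snapshot/distribution agent setup, plus
the sp_articlefilter/sp_articleview calls that build the horizontal filter,
is omitted):

-- On the main (publisher) database.
EXEC sp_replicationdboption
     @dbname = N'MainDB', @optname = N'publish', @value = N'true';

EXEC sp_addpublication
     @publication = N'BigTable_P01_Pub', @repl_freq = N'continuous';

-- The article carries only the rows belonging to partition 1.
EXEC sp_addarticle
     @publication   = N'BigTable_P01_Pub',
     @article       = N'BigTable_P01',
     @source_object = N'BigTable',
     @filter_clause = N'PartitionKey BETWEEN 1 AND 62500000';

-- Push the filtered rows to the member server that owns that range.
EXEC sp_addsubscription
     @publication       = N'BigTable_P01_Pub',
     @subscriber        = N'SERVER1',
     @destination_db    = N'FedDB',
     @subscription_type = N'push';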
Thanks for the quick responses.
Any help on pointers to replication with partitioned tables would be
great.
Chris
|||Since your initial post was talking about multiple aggregates, instead of
looking up some small subset of detailed rows, you are making a mistake with
your implementation... You are setting up a huge OLTP database structure for
a problem that looks like an OLAP problem...
If you are talking about showing product sales by time period, by store,
some sort of aggregates like that - OLAP is definitely the answer... You
will find responses almost instantaneous.
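For concreteness, the kind of question being described is roughly the query
below (hypothetical table and column names); run against the relational detail
table it re-aggregates hundreds of millions of rows on every execution,
whereas a cube answers it from pre-computed aggregations:

-- Product sales by store and by month, straight from the detail rows.
SELECT
    StoreId,
    ProductId,
    DATEPART(year,  SaleDate) AS SaleYear,
    DATEPART(month, SaleDate) AS SaleMonth,
    SUM(Amount) AS TotalAmount,
    COUNT(*)    AS SaleCount
FROM dbo.Sales
GROUP BY
    StoreId,
    ProductId,
    DATEPART(year,  SaleDate),
    DATEPART(month, SaleDate);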
I don't mean to be pushy and arrogant, but please take another look at
Analysis Services for this... You'll have fewer licenses, fewer machines,
faster response - about the only downside is that you'll have to get
familiar with another set of tools, etc. (and that might not be bad, either).
Good luck on your project.
Wayne Snyder, MCDBA, SQL Server MVP
Mariner, Charlotte, NC
www.mariner-usa.com
(Please respond only to the newsgroups.)
I support the Professional Association of SQL Server (PASS) and its
community of SQL Server professionals.
www.sqlpass.org
"WertmanTheMad" <cwertman@.webchamps.com> wrote in message
news:13990782.0411091224.45f30ed4@.posting.google.com...
|||Also, it looks like you are willing to throw a lot of hardware at this
problem, but you're only going to put 2 GB on each server? That doesn't
make sense.
You should consider using Analysis Services, but pump one of those servers,
or a couple of them, up to 8 GB!
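As a footnote on the memory point: on 32-bit SQL Server 2000 of that era,
using more than a couple of GB meant enabling AWE and raising the memory cap,
roughly as below (a sketch only; it also assumes the /PAE boot switch is set,
the service account has the Lock Pages in Memory right, and the AWE change
only takes effect after the service is restarted):

-- Enable advanced options, turn on AWE, and raise the memory ceiling.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'awe enabled', 1;
EXEC sp_configure 'max server memory', 8192;   -- in MB, i.e. 8 GB
RECONFIGURE;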
Sincerely,
Anthony Thomas
"WertmanTheMad" <cwertman@.webchamps.com> wrote in message
news:13990782.0411091224.45f30ed4@.posting.google.com...