Saturday, February 25, 2012

Perform aggregate functions on uniqueidentifiers

For some reason, on SQL Server 2000 one cannot perform COUNT(X) where X is of
type uniqueidentifier. Will future versions of SQL Server (2003 or 2005) suffer
from this limitation?
We came across this problem when we had to execute a query with multiple
table joins.
Hasani,
The workaround that I use is to store them as BINARY(16).
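A minimal sketch of the BINARY(16) workaround in T-SQL. The tables Customers and Orders and their columns are hypothetical names chosen for illustration, not anything from the thread. The GUID is generated with NEWID() as usual but stored as its 16 raw bytes, so COUNT operates on an ordinary binary column:

```sql
-- Hypothetical tables for illustration; the names are not from the thread.
CREATE TABLE Customers (
    CustomerGuid BINARY(16)  NOT NULL PRIMARY KEY,
    Name         VARCHAR(50) NOT NULL
);

CREATE TABLE Orders (
    OrderGuid    BINARY(16) NOT NULL PRIMARY KEY,
    CustomerGuid BINARY(16) NOT NULL REFERENCES Customers (CustomerGuid)
);

-- Generate the GUID as usual, but keep it as 16 raw bytes.
DECLARE @customer BINARY(16);
SET @customer = CONVERT(BINARY(16), NEWID());

INSERT INTO Customers (CustomerGuid, Name) VALUES (@customer, 'Example');
INSERT INTO Orders (OrderGuid, CustomerGuid)
VALUES (CONVERT(BINARY(16), NEWID()), @customer);
INSERT INTO Orders (OrderGuid, CustomerGuid)
VALUES (CONVERT(BINARY(16), NEWID()), @customer);

-- COUNT over the binary key column, across a join.
SELECT c.Name, COUNT(o.OrderGuid) AS OrderCount
FROM Customers AS c
JOIN Orders AS o ON o.CustomerGuid = c.CustomerGuid
GROUP BY c.Name;

-- Convert back whenever the GUID form is needed.
SELECT CONVERT(UNIQUEIDENTIFIER, CustomerGuid) AS CustomerGuid
FROM Customers;
```

The trade-off in this sketch is that every reader of the table has to convert back to UNIQUEIDENTIFIER to see the familiar GUID text form.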
|||Clever, I'll tell my supervisor tomorrow.
"Adam Machanic" <amachanic@.hotmail._removetoemail_.com> wrote in message
news:Oc23mQglEHA.592@.TK2MSFTNGP11.phx.gbl...
> Hasani,
> The workaround that I use is to store them as BINARY(16).
|||"Hasani (remove nospam from address)" <hblackwell@.n0sp4m.popstick.com> wrote
in message news:%23gbFEiglEHA.2892@.tk2msftngp13.phx.gbl...
> clever, i'll tell my supervisor tomorrow.
If you want to get even trickier, you can experiment with doing something
like this when you store the GUID:
SELECT CONVERT(BINARY(6), GETDATE()) + CONVERT(BINARY(10), NEWID()) AS DateGUID
This reduces the uniqueness a bit (removes 6 of the 16 bytes), but not
too much, because there are only so many rows you can insert every 3
milliseconds (the resolution of DATETIME). The upside is that you can now
cluster on your GUID column without destroying INSERT performance.
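One way the expression above might be put to work is as a column default on a clustered primary key. This is only a sketch: the table Events, its columns, and the constraint names are hypothetical, and it assumes the 6-byte date prefix sorts chronologically, as the technique implies.

```sql
-- Hypothetical table; Events and its column names are illustrative only.
CREATE TABLE Events (
    EventKey BINARY(16) NOT NULL
        CONSTRAINT DF_Events_EventKey
        DEFAULT (CONVERT(BINARY(6), GETDATE()) + CONVERT(BINARY(10), NEWID())),
    Payload  VARCHAR(100) NOT NULL,
    CONSTRAINT PK_Events PRIMARY KEY CLUSTERED (EventKey)
);

-- The key is filled in by the default; callers supply only the payload.
INSERT INTO Events (Payload) VALUES ('first');
INSERT INTO Events (Payload) VALUES ('second');

-- Keys generated later carry a later date prefix, so new rows tend to
-- land near the end of the clustered index rather than on random pages.
SELECT EventKey, Payload
FROM Events
ORDER BY EventKey;
```

Rows inserted within the same DATETIME tick share a prefix, so their relative order falls back to the random GUID bytes.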
|||Will SQL Server allow binary column types as primary keys?
|||"Hasani (remove nospam from address)" <hblackwell@.n0sp4m.popstick.com> wrote
in message news:eP7D5uglEHA.712@.TK2MSFTNGP09.phx.gbl...
> Will SQL Server allow binary column types as primary keys?
Yes. When I have used GUIDs as primary keys (rarely, I don't think it's
a great idea most of the time), I have used the BINARY(16) technique. More
recently I've used the date concatenation technique in a project and it
worked out very well.
|||What are your reasons for not using a GUID as a primary key?
We currently use integers as a primary key, but we use a stored procedure to
generate a unique random non-sequential integer, and we store this value in
a table to stop duplicates. In that scenario, I'm arguing that we should
just use uniqueidentifier types because we seem to just be reinventing the
wheel, but then someone mentioned the aggregate function thing with
uniqueidentifier types. I'm not aware of any penalties associated with using
uniqueidentifier types though, other than that they require more bytes per
column than an int.
|||Hasani (remove nospam from address) wrote:
> What are your reasons for not using a GUID as a primary key?
> We currently use integers as a primary key, but we use a stored
> procedure to generate a unique random non-sequential integer, and we
> store this value in a table to stop duplicates. In that scenario, I'm
> arguing that we should just use uniqueidentifier types because we
> seem to just be reinventing the wheel, but then someone mentioned the
> aggregate function thing with uniqueidentifier types. I'm not aware
> of any penalties associated with using uniqueidentifier types though,
> other than that they require more bytes per column than an int.
You're right that a UID takes a lot more bytes per row than an
INT IDENTITY: four times the storage (16 bytes versus 4), which translates to
a much larger index when using a uniqueidentifier. And as Adam
eloquently mentioned, using a UID as a clustered key does not work well
because you get a lot of page splitting and head movement on the drives.
Adding a date component as a prefix to the UID prevents much of the page
splitting, increasing insert performance. However, using a UID as the
clustered key means propagating that key to all non-clustered indexes,
making them much larger as well.
If you can, I would stick with an INT IDENTITY column for a PK.
David G.
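The size difference David describes can be checked directly with DATALENGTH, which reports the number of bytes a value occupies. The Widgets table below is a hypothetical name used only to illustrate the INT IDENTITY key he recommends:

```sql
-- Bytes used by each key type.
SELECT DATALENGTH(CAST(1 AS INT)) AS IntBytes,   -- 4
       DATALENGTH(NEWID())        AS GuidBytes;  -- 16

-- Hypothetical table using an INT IDENTITY clustered primary key.
CREATE TABLE Widgets (
    WidgetId INT IDENTITY(1, 1) NOT NULL PRIMARY KEY CLUSTERED,
    Name     VARCHAR(50) NOT NULL
);

-- The key is assigned by the server in increasing order.
INSERT INTO Widgets (Name) VALUES ('a');
INSERT INTO Widgets (Name) VALUES ('b');

SELECT WidgetId, Name
FROM Widgets
ORDER BY WidgetId;
```

Because every non-clustered index on a table carries the clustered key, the extra 12 bytes per row are paid again in each of those indexes.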
|||"Hasani (remove nospam from address)" <hblackwell@.n0sp4m.popstick.com> wrote
in message news:OUyrc5glEHA.2892@.tk2msftngp13.phx.gbl...
> What are your reasons for not using a GUID as a primary key?
> We currently use integers as a primary key, but we use a stored procedure
> to generate a unique random non-sequential integer, and we store this value
> in a table to stop duplicates. In that scenario, I'm arguing that we should
> just use uniqueidentifier types because we seem to just be reinventing the
> wheel, but then someone mentioned the aggregate function thing with
> uniqueidentifier types. I'm not aware of any penalties associated with
> using uniqueidentifier types though, other than that they require more
> bytes per column than an int.
I think David G pointed out most of the issues in his post, so I'll
instead refer to the only times I have had to use a GUID, which is when the
application itself was responsible for creating the key. Applications
cannot reliably create unique integers, so GUIDs are pretty much the only
choice (or natural primary keys, if there's one available).
Also, why would you want to use a non-sequential random integer instead
of an IDENTITY?
|||Maybe I contradicted myself when I said non-sequential random...
We essentially need a random number generator to use as a primary key value.
I don't know if SQL Server supports it. All I've seen is a unique number
generator that increments by one on every insert. It's unique but not random.
The problem is, this value is going to be made public and we don't want to
make it obvious that it's just an incrementing value (think cookies and
web sessions).
What we currently do (sometimes) is have 2 columns: one that's an
auto-incrementing int that's a primary key, and the other a
uniqueidentifier column that isn't a primary key (but may have a constraint
to make sure there are no duplicates), and we would make the uniqueidentifier
value public so in a cookie, it would always look random.
I don't feel comfortable in this scenario because you have 2 columns that are
doing the same thing (preserving/ensuring uniqueness). So I'm trying to look
at all the tradeoffs of using a uniqueidentifier instead of an int, and vice
versa.
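The two-column design described above could be sketched roughly as follows. The table WebSessions, its columns, and the constraint names are hypothetical; the thread does not show the real schema.

```sql
-- Hypothetical table: an internal INT IDENTITY key plus a public GUID.
CREATE TABLE WebSessions (
    SessionId   INT IDENTITY(1, 1) NOT NULL
        CONSTRAINT PK_WebSessions PRIMARY KEY CLUSTERED,
    PublicToken UNIQUEIDENTIFIER NOT NULL
        CONSTRAINT DF_WebSessions_PublicToken DEFAULT (NEWID())
        CONSTRAINT UQ_WebSessions_PublicToken UNIQUE NONCLUSTERED,
    UserName    VARCHAR(50) NOT NULL
);

-- Both keys are generated by the server.
INSERT INTO WebSessions (UserName) VALUES ('alice');

-- The cookie would carry only PublicToken; look the row up by it.
DECLARE @token UNIQUEIDENTIFIER;
SELECT @token = PublicToken FROM WebSessions WHERE UserName = 'alice';

SELECT SessionId, UserName
FROM WebSessions
WHERE PublicToken = @token;
```

In this sketch the two columns do different jobs: SessionId keeps the clustered index and any foreign keys narrow, while PublicToken exists only so that the value exposed outside the database does not reveal the insert order.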
