I've come accross this post http://blogs.msdn.com/benjguin/archive/2005/05/10/416110.aspx where it compares the performance of a T-SQL cursor approach with a VB.NET CLR Stored procedure, and finds the latter faster.
Using the same example, I wrote a client application that performs the same operation using a DataReader, and compared it with invoking the SQLCLR stored procedure. Using the client-side cursor that scans 1 million rows is around 5 times _faster_ than the stored procedure, even when running the client in a different box.
What am I missing?
The code is below.
Sub Main()
Using cn As New SqlConnection("server=192.168.103.64\ctp;database=test;integrated security=yes")
cn.Open()
Dim stopWatch As New Stopwatch()
stopWatch.Start()
Dim cmd As New SqlCommand("select cast(a.ID * b.ID as decimal) from tableA a, tableA b", cn)
Dim total As SqlTypes.SqlDecimal
Using reader As SqlDataReader = cmd.ExecuteReader()
While reader.Read()
total = total + reader.GetDecimal(0)
End While
End Using
stopWatch.Stop()
Console.WriteLine(stopWatch.ElapsedMilliseconds)
stopWatch.Reset()
stopWatch.Start()
cmd = New SqlCommand("VBCursor", cn)
cmd.CommandType = CommandType.StoredProcedure
cmd.ExecuteNonQuery()
stopWatch.Stop()
Console.WriteLine(stopWatch.ElapsedMilliseconds)
End Using
End Sub
If you need to aggregate things, then either a built-in or a custom aggregate is the way to go, and it's very unlikely that you'll find a faster way of doing that.
As to why you're seeing this performance difference, here are some thoughts:
When you execute a top-level SQL statement that returns rows to the client, once the server started to produce rows those rows are packaged in TDS "packets" and sent to the client, a bunch of rows at a time, so processing is very efficient.
On the other hand, when you're using the context connection, we need to transition between native and managed for every row (at least in this version).
That means that while we do save the cost of the network hop in the context connection, the networking layer is very efficient in packaging and sending the data to the client; also, for context connections there is still per-row cost that we have to pay. For fast networks, the difference can almost dissapear for these extreme cases.
Our own testing indicates that in many cases of this kind performance should be more or less on par. Also, we continue to do tunning of the inproc code paths.
All that said, 5 times is clearly something I wouldn't expect. Did you try running the code inside the server but with a regular connection instead of a context connection?
Pablo Castro
Program Manager - ADO.NET Team
Microsoft Corp.
No comments:
Post a Comment