Bridging the SQL/CLR Gap: the SqlTypes Library
The types exposed by the
.NET Framework and by SQL Server are in many cases similar, but
generally incompatible. A few major issues come up when dealing with SQL
Server and .NET interoperability from the perspective of data types:
First and
foremost, all native SQL Server data types are nullable—that is, an
instance of any given type can either hold a valid value in the domain
of the type or represent an unknown (NULL). Types in .NET generally do not support this idea (note that neither C#'s null nor VB .NET's Nothing is the same as SQL Server's NULL).
The
second difference between the type systems has to do with
implementation. Format, precision, and scale of the types involved in
each system differ dramatically. For example, .NET's DateTime type supports a much larger range and much greater precision than does SQL Server's DATETIME type.
The
third major difference has to do with runtime behavior of types in
conjunction with operators. In SQL Server, virtually all operations
involving at least one NULL instance of a type result in NULL.
As the .NET Framework types are not natively nullable, this is not
generally a concern for its type system. In addition to nullability,
differences may result from handling overflows, underflows, and other
potential errors inconsistently. For instance, adding 1 to a 32-bit integer with the value of 2147483647 (the maximum 32-bit integer value) in a .NET language may result in the value "wrapping around," producing −2147483648. In SQL Server, this behavior will never occur—instead, an overflow exception will result.
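The wraparound behavior described here can be reproduced with a few lines of C#. The following is only a minimal sketch; the class and method names are hypothetical and chosen for illustration. It contrasts unchecked integer arithmetic, which wraps silently, with checked arithmetic, which throws.

```csharp
using System;

public static class OverflowDemo
{
    // Adds one without overflow checking: the result wraps around.
    public static int AddOneUnchecked(int value)
    {
        return unchecked(value + 1);
    }

    // Adds one with overflow checking: throws OverflowException on overflow.
    public static int AddOneChecked(int value)
    {
        return checked(value + 1);
    }

    public static void Main()
    {
        // 2147483647 + 1 wraps to int.MinValue
        Console.WriteLine(AddOneUnchecked(int.MaxValue) == int.MinValue);

        try
        {
            AddOneChecked(int.MaxValue);
        }
        catch (OverflowException)
        {
            Console.WriteLine("Overflow detected");
        }
    }
}
```

Which of the two behaviors you get by default in .NET depends on the language and compiler settings, which is part of why results can differ from what SQL Server would produce for the same operation.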
NOTE
The .NET 2.0 Framework adds something called a nullable type,
which further muddies the waters. Nullable types allow developers to
treat value type variables similarly to reference type variables in
that they can be assigned null. This has nothing to do with
data access or database NULL values, and many developers were disappointed upon the release of .NET 2.0 to discover that ADO.NET
does not accept nullable types as arguments to SQL parameters. Again,
do not fall into the trap of confusing .NET's idea of null with SQL
Server's idea of NULL—these are very different concepts.
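To make the distinction concrete, the following sketch shows that an empty nullable value and ADO.NET's representation of a database NULL (DBNull.Value) are different things. The ToDbValue helper is hypothetical, written only for this illustration; it shows one way such a conversion might be done by hand.

```csharp
using System;

public static class NullDemo
{
    // Hypothetical helper: maps a .NET nullable value to DBNull.Value,
    // the object ADO.NET uses to represent a database NULL.
    public static object ToDbValue(int? value)
    {
        return value.HasValue ? (object)value.Value : DBNull.Value;
    }

    public static void Main()
    {
        int? missing = null;

        // Boxing an empty int? yields an ordinary null reference
        object boxed = missing;
        Console.WriteLine(boxed == null);          // True

        // DBNull.Value is a real object, not a null reference
        Console.WriteLine(DBNull.Value == null);   // False

        // The helper bridges the two representations explicitly
        Console.WriteLine(ToDbValue(missing) is DBNull);
        Console.WriteLine(ToDbValue(42));
    }
}
```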
In order to provide a layer of abstraction between the two type paradigms, the .NET 2.0 Framework ships with a namespace called System.Data.SqlTypes.
This namespace includes a series of structures that map SQL Server
types and behaviors into .NET. Each of these structures implements
nullability through the INullable interface, which exposes an IsNull property that allows callers to determine whether a given instance of the type is NULL. Furthermore, these types conform to the same range, precision, and operator rules as SQL Server's native types.
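As a brief illustration of these rules, the sketch below exercises SqlInt32 and SqlDateTime from System.Data.SqlTypes. The class name is hypothetical; the types and members used are part of the namespace itself.

```csharp
using System;
using System.Data.SqlTypes;

public static class SqlTypesDemo
{
    public static void Main()
    {
        SqlInt32 unknown = SqlInt32.Null;
        SqlInt32 one = new SqlInt32(1);

        // NULL propagates through arithmetic, as it does in T-SQL
        SqlInt32 sum = unknown + one;
        Console.WriteLine(sum.IsNull);     // True

        // A comparison involving NULL yields a NULL SqlBoolean,
        // not true or false
        SqlBoolean equal = (unknown == one);
        Console.WriteLine(equal.IsNull);   // True

        // Overflow raises an exception rather than wrapping around
        try
        {
            Console.WriteLine(SqlInt32.MaxValue + one);
        }
        catch (OverflowException)
        {
            Console.WriteLine("Arithmetic overflow");
        }

        // The range mirrors SQL Server's DATETIME, which starts in 1753
        Console.WriteLine(SqlDateTime.MinValue.Value.Year);
    }
}
```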
Proper use of the SqlTypes
types is, simply put, the most effective way of ensuring that data
marshaled into and out of SQLCLR routines is handled correctly by each
type system. It is my recommendation that, whenever possible, all
methods exposed as SQLCLR objects use SqlTypes
types as both input and output parameters, rather than standard .NET
types. This will require a bit more development work upfront, but it
should "futureproof" your code to some degree and help avoid type
incompatibility issues.
Wrapping Code to Promote Cross-Tier Reuse
One of the primary selling
points (as well as use cases) for SQLCLR integration, especially in
shops that use the .NET Framework for application development, is the
ability to easily move or share code between tiers when it makes sense.
Unfortunately, some of the design necessities of working in the SQLCLR
environment do not translate well to the application tier, and vice
versa. One such example is use of the SqlTypes;
although it is recommended that they be used for all interfaces in
SQLCLR routines, that prescription does not make sense in the
application tier, because the SqlTypes
do not support the full range of operators and options that the native
.NET types support. Using them for everything would make data access
simple, but would rob you of the ability to do many complex data
manipulation tasks, and would therefore be more of a hindrance than a
helpful change.
Rewriting code or
maintaining multiple versions customized for different tiers simply does
not promote maintainability. In the best-case scenario, any given piece
of logic used by an application should be coded in exactly one
place—regardless of how many different components use the logic, or
where it's deployed. This is one of the central design goals of
object-oriented programming, and it's important to remember that it also
applies to code being reused inside of SQL Server.
Instead of rewriting routines and types to make them compatible with the SqlTypes
and implement other database-specific logic, I recommend that you get
into the habit of designing wrapper methods and classes. These wrappers
should map the SqlTypes inputs and
outputs to the .NET types actually used by the original code, and call
into the original routines via assembly references. Wrappers are also a
good place to implement database-specific logic that may not exist in
the original routines.
In addition to
the maintainability benefits for the code itself, creating wrappers has a
couple of other advantages. First of all, unit tests will not need to
be rewritten—the same tests that work in the application tier will still
apply in the data tier (although you may want to write secondary unit
tests for the wrapper routines). Secondly—and perhaps more
importantly—wrapping your original assemblies can help maintain a
least-privileged coding model and help enhance security.
A Simple Example: E-Mail Address Format Validation
It's quite common on web
forms to be asked for your e-mail address, and you've no doubt
encountered forms that tell you whether you've entered an e-mail address
that does not comply with the standard format. This is a quicker—but
obviously less effective—way to validate an e-mail address than actually
sending an e-mail and waiting for a response, and it gives the user
immediate feedback if something is obviously incorrect.
In addition to using this
logic for front-end validation, it makes sense to implement the same
thing in the database in order to drive a CHECK
constraint. That way, any data that makes its way to the
database—regardless of whether it already went through the check in the
application—will be double-checked for correctness.
Following is a simple example method that uses a regular expression to validate the format of an address:
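//Requires: using System.Text.RegularExpressions;
public static bool IsValidEmailAddress(string emailAddress)
{
    //Validate the e-mail address. The pattern is not anchored, so
    //IsMatch succeeds if any part of the input looks like an address.
    Regex r =
        new Regex(@"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*");
    return (r.IsMatch(emailAddress));
}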
This code could, of course,
be used as-is in both SQL Server and the application tier—using it in
SQL Server would simply require loading the assembly and registering the
function. But this has some issues: the most obvious is the lack of
proper NULL handling. As-is, this method will throw an ArgumentNullException when a NULL is passed in. Depending on your business requirements, a better choice would probably be to return either NULL or false.
Another potential issue is methods that require slightly different
logic in the database vs. the application tier. In the case of this
method, it's difficult to imagine how you might enhance the logic for
use in a different tier, but for other methods, such modification would
present a maintainability challenge.
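The NULL-handling problem is easy to reproduce outside of SQL Server. The following sketch repeats the validation method inside a UtilityMethods class and calls it from a hypothetical console harness (NullInputDemo), first with ordinary strings and then with a null reference.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UtilityMethods
{
    // Same logic as the validation method shown earlier
    public static bool IsValidEmailAddress(string emailAddress)
    {
        Regex r =
            new Regex(@"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*");
        return r.IsMatch(emailAddress);
    }
}

public static class NullInputDemo
{
    public static void Main()
    {
        // Well-formed and malformed inputs behave as expected
        Console.WriteLine(
            UtilityMethods.IsValidEmailAddress("someone@example.com")); // True
        Console.WriteLine(
            UtilityMethods.IsValidEmailAddress("not an address"));      // False

        // A null reference is rejected by Regex.IsMatch itself
        try
        {
            UtilityMethods.IsValidEmailAddress(null);
        }
        catch (ArgumentNullException)
        {
            Console.WriteLine("ArgumentNullException on null input");
        }
    }
}
```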
The solution is
to catalog the assembly containing this method in SQL Server, but not
expose the method as a SQLCLR UDF. Instead, create a wrapper method that
uses the SqlTypes and internally
calls the initial method. This means that the initial method will not
have to be modified in order to create a version that properly
interfaces with the database, and the same assembly can be deployed in
any tier. Following is a sample that shows a wrapper method created over
the IsValidEmailAddress method, in order to expose a SQLCLR UDF version that properly supports NULL inputs and outputs. Note that I've created the inner method in a class called UtilityMethods and have also included a using statement for the namespace used in the UtilityMethods assembly.
[Microsoft.SqlServer.Server.SqlFunction]
public static SqlBoolean IsValidEmailAddress(
    SqlString emailAddress)
{
    //Return NULL on NULL input
    if (emailAddress.IsNull)
        return (SqlBoolean.Null);

    //Unwrap the SqlString and call the original method
    bool isValid = UtilityMethods.IsValidEmailAddress(emailAddress.Value);
    return (new SqlBoolean(isValid));
}
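Because the wrapper is an ordinary static method, it can be exercised outside of SQL Server as well. The sketch below makes several assumptions for the sake of a self-contained example: the SqlFunction attribute is omitted so the code runs without a SQL Server host, the wrapper lives in a hypothetical class named EmailFunctions, and a second hypothetical wrapper, IsValidEmailAddressStrict, is added to show the alternative policy of returning false rather than NULL for NULL input.

```csharp
using System;
using System.Data.SqlTypes;
using System.Text.RegularExpressions;

public static class UtilityMethods
{
    // Original application-tier method, left unmodified
    public static bool IsValidEmailAddress(string emailAddress)
    {
        Regex r =
            new Regex(@"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*");
        return r.IsMatch(emailAddress);
    }
}

// Hypothetical class name; the [SqlFunction] attribute is omitted here
// so that the sketch can run outside of SQL Server.
public static class EmailFunctions
{
    // NULL in, NULL out
    public static SqlBoolean IsValidEmailAddress(SqlString emailAddress)
    {
        if (emailAddress.IsNull)
            return SqlBoolean.Null;

        bool isValid =
            UtilityMethods.IsValidEmailAddress(emailAddress.Value);
        return new SqlBoolean(isValid);
    }

    // Hypothetical variant: treat NULL input as invalid
    public static SqlBoolean IsValidEmailAddressStrict(
        SqlString emailAddress)
    {
        SqlBoolean result = IsValidEmailAddress(emailAddress);
        return result.IsNull ? SqlBoolean.False : result;
    }
}

public static class WrapperDemo
{
    public static void Main()
    {
        SqlString good = new SqlString("someone@example.com");

        Console.WriteLine(
            EmailFunctions.IsValidEmailAddress(SqlString.Null).IsNull);
        Console.WriteLine(
            EmailFunctions.IsValidEmailAddress(good).IsTrue);
        Console.WriteLine(
            EmailFunctions.IsValidEmailAddressStrict(SqlString.Null).IsFalse);
    }
}
```

All three lines print True. Both wrappers delegate to the same unmodified inner method, so the choice of NULL policy stays in the data tier and the application-tier code never needs to know about it.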
Note that this
technique is usable not only for loading assemblies from the application
tier into SQL Server, but also for going the other way—migrating logic
back out of the data tier. Given the nature of SQLCLR, the potential for
code mobility should always be kept in mind, and developers should
consider designing methods using wrappers even when creating code
specifically for use in the database—this will maximize the potential
for reuse later, when or if the same logic needs to be migrated to
another tier, or even if the logic needs to be reused more than once
inside of the data tier itself.
Cross-assembly references
have other benefits as well, when working in the SQLCLR environment. By
properly leveraging references, it is possible to create a much more
robust, secure SQLCLR solution. The following sections introduce the
security and reliability features that are used by the SQLCLR hosted
runtime, and show how you can exploit them via assembly references to
manage security on a granular level.