What's New With VS and Data

Roger Jennings discusses the new data features that made it into Visual Studio and SQL Server 2005, and a couple features that were dropped late in the process.

Roger Jennings writes frequently for VSM, in addition to serving as an advisor to the VSM editorial team on issues pertaining to data and Visual Studio, including SQL Server. Roger spoke recently with VSM Editor In Chief Patrick Meader about the new data features in Visual Studio and SQL Server.

PM: What do you see as the key new features of SQL Server 2005 for developers?

RJ: Obviously, CLR support draws the most attention from .NET coders, but there are plenty of other additions to SQL Server 2005 that will appeal to organizations needing a developer-oriented, enterprise-grade relational database management system. New varchar(max) and varbinary(max) columns replace the aging text and image data types. Cell-level encryption and integrated key management prevent disclosure of confidential information, even to DBAs with SA privileges. ReportViewer controls let Windows and Web apps display fully formatted tabular and crosstab reports—as well as charts and graphs—without running a Report Services server. The native XML data type and XQuery enables fine-grained, indexed searches on specific element and attribute values.

Multiple Active Result Sets (MARS) enable multiple concurrent SqlDataReader read and write operations on multiple SqlCommand objects that share a single SqlConnection object. MARS brings SQL Server 2005 to parity with Oracle and other RDBMSs that enable multiple readers on a single connection. The VS Data team's Angel Saenz-Badillos is "concerned that this feature is going to be misused." In other words, MARS gives .NET developers the opportunity to "shoot themselves in the foot"—forewarned is forearmed.

Developers should evaluate SQL Server 2005's Reporting Services, Report Builder, built-in native Web services, Service Broker for asynchronous server-to-server messaging with guaranteed delivery, and Query Notifications for invalidating a Web app's cache. However, don't expect to put projects that use these new features or run managed code in SQL Server's process without agreement by your DBA(s).

PM: How about new SQL Server 2005 features for data architects and DBAs?

RJ: Microsoft hopes this release will transcend SQL Server's department-level stigma by providing dual-core Xeon support in 64-bit SQL Server 2005 Standard Edition with a single-core license, adding 64-bit Opteron and Itanium versions, and delivering fail-over clustering. But there's no "free lunch" with the Standard Edition; a single CPU license increases from $4,999 to $5,999. High-availability features in the Enterprise Edition include partitioning, online indexing, online restore, transactional replication with Oracle databases, and advanced data and text mining. Enterprise Edition licenses remain at $24,911 per CPU. Unfortunately, there's no upgrade discount for SQL Server 2000 licensees.

The Gartner Group attributes much of the 10.3 percent growth of the total relational database market from 2003 to 2004 to new business intelligence features—data analysis, warehousing, and reporting—as well as the weakening U.S. dollar. Gartner's May 2005 report, which states there is no clear winner in the overall RDBMS market share race, gave IBM 34.1 percent, Oracle 33.7 percent, and Microsoft 20 percent of the total market. Steve Ballmer countered Gartner's 2004 revenue data at the San Francisco release event with the number of RDBMS server licenses for IBM DB2 (72,000) 10.0 percent, Oracle (245,000) 33.8 percent, and SQL Server 2000 (406,000) 56.1 percent. It didn't take a high business I.Q. to attribute a substantial part of Microsoft's lead-in licensed units to inclusion of Analysis Services (online analytical processing, or OLAP) with SQL Server 2000 Standard Edition. So the SQL Server team beefed up Analysis Services, included Reporting Services in all SQL Server 2005 editions, reworked Data Transformation Services into SQL Server Integration Services (SSIS), and added new data-mining features. They also integrated Analysis Services, Reporting Services, and SSIS management with the VS 2005 UI and SQL Server Management Studio (SSMS) as Business Intelligence Development Studio (BIDS).

When you consider SQL Server 2005's new business intelligence features and management tools, the extra $1,000 for the Standard Edition is a bargain. eWeek's Lisa Vass quotes an Edgewood Solutions LLC survey that claims new SQL Server 2005 features were sufficient "to make 31 percent of respondents consider switching from another database." (In this case, "another database" probably is Oracle.) If you don't need the Standard Edition's BI features, consider the $3,995 per CPU Workgroup Edition.

PM: Can you tell me about other significant features that benefit DBAs?

RJ: Other new features that benefit DBAs directly include T-SQL enhancements, user and schema separation (for example, Northwind.Sales.Orders replaces Northwind.dbo.Orders), more granular database permissions, backup media set mirroring, full-text catalogs in database backups, and online restore.

Microsoft announced on SQL Server 2005's official release date—November 7, 2005—Enterprise Edition support for scalable shared databases. Knowledge Base article Q910378 describes this new scale-out feature: "The scalable shared database feature allows you to attach a read-only reporting database to multiple server instances over a storage area network (SAN). A reporting database is a read-only database that is built from one or more production databases that are used exclusively for reporting purposes. To be made into a scalable shared database, a reporting database must reside on one or more dedicated, read-only volumes whose primary purpose is to host the reporting database or a coordinated set of reporting databases. These volumes are known as reporting volumes." Reporting databases must run under Windows Server 2003 and require a complex three-phase build, attach, and detach exercise.

PM: What role does XML play in SQL Server 2005?

RJ: SQL Server 2005 elevates XML to a native data type with extended XQuery 1.0 syntax for data-retrieval and update capabilities. XML columns store well-formed XML documents or fragments as binary large objects (BLOBs). The new native XML data type supports XML namespaces, validates documents against one or more XML schemas, and enables creating primary XML indexes and adding PATH, VALUE, PROPERTY, or all three secondary XML indexes. XML indexes speed execution of XQuery expressions, but are much more resource-intensive and take longer to generate than conventional non-clustered relational indexes. Non-standard XQuery methods permit updating XML content.

Storing documents in XML columns is an alternative to storing XML content as a text BLOB in a varchar(max) column, or shredding it into multiple relational columns. Shredding replaces full-text search on varchar columns with indexed SELECT queries on columns that contain element or attribute values. Choosing the optimum approach for XML storage, retrieval, and updating isn't a simple task.

SQL Server 2005's new setup program adds SQLXML 4.0, which replaces the SQLXML 3.0 Web download for SQL Server 2000. SQLXML 4.0's FOR XML clause adds a new PATH mode, supports nested FOR XML queries, and provides a TYPE directive to specify the XML data type for the result.

PM: Similarly, what's new in VS related to data, but not part of SQL Server specifically?

RJ: The Microsoft Developer Tools Roadmap promised that VS 2005 would "radically enhance the experience of manipulating and retrieving data" by simplifying "data source design from within the development environment." The roadmap—along with many other pre-release documents for VS and SQL Server 2005—went 404 on November 7, 2005, but Microsoft did manage to deliver on its commitment to streamline data access and manipulation in VS 2005.

ADO.NET 2.0 and a plethora of new data-bound controls and components greatly reduce the amount of code required to create Windows GUIs and Web pages that perform "create, retrieve, update, and delete" (CRUD) operations on client/server and ISAM tables. Smart-client (a.k.a. Windows) projects let you drag tables from the new Data Sources window and drop them on a form to generate an almost-complete master-details editing form in less than five minutes. New data components—such as TableAdapters, Binding Sources, and BindingNavigators—simplify record navigation. The DataGridView control replaces the less-than-stellar DataGrid and exposes an extraordinary number of events for programming data-management operations. You also can bind controls to business objects, XML InfoSet documents, and Web services.

ASP.NET 2.0 Web developers get a new bound GridView control to replace ASP.NET 1.x's DataGrid. The GridView provides declarative formatting, sorting, editing, deleting, and paging capabilities, but doesn't support insert operations. The DetailsView and FormView controls simplify creating master-details pages and enable inserts. Smart tags simplify adding and editing templates for data-bound Web controls.

PM: Five years is a long time. Does this time show—or not show—as much as you'd like in SQL Server 2005?

RJ: As Steve Ballmer said in his keynote for the San Francisco release event, SQL Server 2005, VS 2005, and BizTalk Server 2006 were "a little bit long in the making." I was surprised by Paul Flessner's September 14, 2005 "SQL Server 2005 Update" letter, which warned early adopters not to expect the promised Database Mirroring feature and SQL Server 2005 Express Manager in the October 2005 release-to-manufacturing (RTM) bits. Paul committed to release Database Mirroring for general use only "as soon as ... it is ready." The SQL Server group recently posted the November CTP of Express Manager's replacement—a scaled-down version of SSMS called SQL Server Management Studio Express Edition (SSMSEE). The RTM version is likely to appear in the first half of 2006. For VS developers who have Express Edition only, any management tool with a GUI is better than none. You can't install SSMSEE side-by-side with the RTM version of SSMS. Don't count on the unsupported SQL Server Express Utility (SSEUtil.exe) to compensate for the demise of the developmentally disabled Express Manager, but you'll need it to activate a User Instance for SMSSEE.

Removal of XQuery 1.0 support from the .NET 2.0 Framework was another disappointment. The official explanation is that it's impossible to guarantee forward compatibility and lack of XQuery in .NET 2.0 won't block "any mainstream customer scenarios." Microsoft has been touting XQuery 1.0 extensions to .NET since early 2002, and the lack of a final W3C XQuery 1.0 recommendation didn't stop the SQL Server team from implementing an XQuery 1.0 subset.

PM: What is your favorite new feature in this release?

RJ: It's not easy to pick a single favorite new feature, but I spend a lot of time creating and manipulating XML InfoSets and developing Web services. So I'd go for the native XML data type, which appears to me to be superior to XML storage and processing in current IBM DB2 and Oracle implementations.

On the tools front, I especially like integration of SQL Enterprise Manager and Query Analyzer tools into SSMS with the VS 2005 IDE. SQL Server 2005's business intelligence features are major marketing points, so I also like the capability to design and execute SSIS packages and create Analysis Services or Report Services projects from VS 2005's New Project dialog.

PM: SQL Server 2005 Express Edition improves on MSDE 2000 in a lot of ways. Can you walk us through what it includes, where it stops short, and how you might take advantage of it if you're an enterprise/business developer?

RJ: SQL Server Express (SSX) has restrictions similar to MSDE 1.0 and 2000 with one important exception: SSX doesn't have a query execution governor. Instead of a maximum of five simultaneous queries, SSX will execute as many simultaneous T-SQL batches or stored procedures as a single CPU and a 1 GB buffer pool will handle. MSDE 2000 supports two processors and up to 2 GB of RAM, but MSDE's database size limit is 2 GB. SSX ignores a second processor, if it's present, but lets databases grow to 4 GB without a log-file size limit. There's no restriction on the number of databases you can attach to SSX. Thus, SSX is more than competitive with Windows builds of open-source databases, such as MySQL and PostgreSQL, as well as "free" databases from Oracle and Sybase. The beta version of Oracle 10g Express Edition for Windows or Linux has a 1 GB RAM limit and restricts user data to a maximum of 4 GB, but supports Oracle Text for searching. The Express version of Sybase Adaptive Server Enterprise database for Linux supports 2 GB of RAM, but only 5 GB total database storage.

SSX delivers almost all new SQL Server 2005 Standard Edition features. Exceptions are graphical management tools, full-text search, high-availability features, SQL Agent, business intelligence capabilities, replication as a publisher, Service Broker, and native XML Web services. SSX supports subscriptions to merge, snapshot, and transactional replication publishers, and can interact with messages from Service Brokers of other SQL Server 2005 editions. Microsoft promises to add full-text search and provide a scaled-down version of SSMS to SSX in 2006. SSX enables User Instances, which permit XCopy deployment of projects with attachable SSX databases. SSX installation and patching is much simpler than that for MSDE.

SSX is the logical choice for client-side data storage and processing that exceeds the capabilities of persisted DataSets or local replication subscribers. SSX also is an excellent candidate for use with Web sites that have moderate traffic, data processing requirements, or both. You also can expect Office 12 data access to be optimized for SSX with the new SQL Native Client rather than the MDAC stack. SQL Native Client combines OLE DB and ODBC drivers in a single component and supports new SQL Server 2005 data types.

PM: Is the innate support for .NET language, including the support for writing stored procedures in C# or VB.NET, the most misunderstood feature in SQL Server 2005?

RJ: I'm not sure if CLR support in SQL Server 2005 remains the most misunderstood feature, but it clearly has the potential to be the most misused feature. Microsoft set the stage for SQLCLR misuse by promoting managed code as a crutch for VB and C# developers who weren't competent to write T-SQL statements for stored procedures, user-defined functions, or other database objects. Substituting managed code for T-SQL stored procedures or batches that perform CRUD operations on tables or views almost always results in slower execution and increased resource requirements. Creating user-defined data types (UDDTs) with an SQL Server Project is a chancy process, because modifying the UDDT requires dropping all table columns of the type before dropping and re-creating the UDDT. It's not a good idea to create simple UDDTs, such as the ubiquitous demo data type that combines X and Y double values in a single column of the Point data type.

An appropriate use of SQLCLR stored procedures is to replace extended stored procedures (XProcs), including OLE Automation stored procedures (sp_OAs). Another potential use for SQLCLR stored procedures or table-valued functions is as a substitute for cursor-based operations that suffer from poor performance, excess resource consumption, or both.

PM: What are LINQ, DLinq, and its related acronyms, and why will developers care about these technologies in the future?

RJ: Language Integrated Query (LINQ) is a technical preview of a new object-relational mapping (O-RM) technology (DLinq) and a set of XML document management enhancements (XLinq) that are expected to be included in the next version of Visual Studio—currently codenamed "Orcas." LINQ is the successor to Microsoft Research's Cω (C-Omega, formerly X# and Xen) extensions to the .NET type system and C# language. The objective of LINQ, DLinq, and XLinq is to eliminate, or at least minimize, the "impedance mismatch" between query languages—such as T-SQL—and programming languages like VB and C#. LINQ depends on VS 2005's generic and nullable types, and forthcoming lambda functions, anonymous types, anonymous arrays, object initializers, and extension methods in VB 9.0 and C# 3.0.

You can expect LINQ to get additional attention as WinFS gains maturity, because LINQ will replace ObjectSpaces as the mapping and querying method for WinFS entities. Developers will use DLinq to write SQL-style queries directly in VB or C# with IntelliSense and compiler-based type-checking. Currently, only the C# tech preview supports DLinq; Microsoft promises a VB implementation of DLinq before the end of 2005.