DB design and app development: Why can't we be friends?
- By Alan R. Earls
How much does an application developer need to know about a relational database? How much programming or application knowledge does a DBA or DB designer need to have? The questions play out differently in different organizations at different times.
Increasingly, it seems the barriers between development and data are blurring, with the designer often acting as the focal point of communication between the two camps. Useful tools are only part of the equation when pursuing success.
Erik Veerman, a senior consultant at Intellinet Corp., Charlotte, N.C., who has been working on a major SQL Server-based project called NetMon for Atlanta-based Cox Communications Inc., makes just such a case. The project involved using many products from the SQL Server suite, including Analysis Services along with an ETL tool called Data Transformation Services. Reporting is handled by Reporting Services, which was in beta form at the time the project began, but has now been formally released by Microsoft.
Veerman said it is not unusual to encounter application developers with a limited or incomplete understanding of databases. Indeed, he suggested that when databases perform poorly, most of the time it is not a vendor issue, but a matter of poorly thought-out design. "When you get down to a comparison of the DBs, they are all pretty good," he said.
Veerman said there is a paradigm shift underway in the database world. "In the past, you had DBAs that were primarily responsible for keeping systems up and running," perhaps with some responsibility for optimization. But now, he said, more organizations are breaking out a separate area of responsibility for design -- database architects. And, at a later stage of the project, specialized business intelligence developers now focus on analysis and reporting capabilities.
"In any data warehouse project that involves a lot of high-volume, disparate systems, there are a lot of challenges," he admitted. In general, said Veerman, Intellinet has been able to accomplish everything it needs through SQL Server Suite, which includes a wide range of tools. However, he admitted that there are "some very good tools besides what Microsoft supplies, but they are typically higher in price." Indeed, Veerman argued that the Microsoft tools provide a lower TCO than alternatives. "We have been doing this sort of work for six or seven years and we feel we can do anything," he added.
Joe McKendrick, an analyst at Santa Cruz, Calif.-based Evans Data Corp., said many companies have multiple database environments ranging from DB2 and Oracle to emerging products like Open Source MySQL.
But McKendrick noted that it is not all technical skills. Developers must also have business skills and the ability to work with other IT areas. "The ability to translate applications for a Web environment and deploy Web services is part of that," he said. Database security is another area where companies are putting extra effort, he said.
Carl Olofson, program director for information management and data integration software at Framingham, Mass.-based analyst firm IDC, agreed with McKendrick's point that knowing how the database will be used is the first order of business.
Taking a slightly different tack, Betsy Burton, an analyst at Gartner Inc., Stamford, Conn., contended that the dividing lines between general application development and database development are fading because "there is very little application development anymore that doesn't include databases. Developers know this but management doesn't," she added.
Cuong Do provided ADT with his perspective on developers' everyday struggles with databases. Do is a little like one of the characters on the popular "CSI: Crime Scene Investigation" television show, except instead of prodding corpses and dusting for fingerprints, Do dissects computer applications, especially databases. Do is a director in the electronic evidence consulting practice of FTI, a Washington, D.C., consulting firm. His areas of expertise include database management and application development, and he is particularly adept in the areas of data analysis, financial application development and Web development. His company specializes in information technology forensics -- rebuilding "lost" systems and assembling data in support of litigation, and other kinds of investigations. Thus, Do has an interesting perspective on the flaws and merits of the leading commercial DB products and on what it takes to kick-start them to life.
Do has also done some pure database development over the course of his career. With that combined perspective he offers a long list of the challenges and complexities, ranging from infrastructure to budget to anticipating growth -- as well as the more straightforward "housekeeping" issues -- a developer needs to master.
In other kinds of applications, he noted, there are relatively straightforward ways to optimize performance but in the world of databases it is usually much more complex and highly dependent on who is using it and how they are using it.
To be successful in a development effort, Do said it is important that you start by understanding the intent of the database and the underlying business issues it addresses. Then, he said, you need to look at budget constraints, hardware requirements and what is already available, including software licenses. But that is only the beginning, he noted, because databases are complex and demand planning. Things like bandwidth requirements, likely peak processing periods, how frequently data changes, RAID levels, whether you will need multiple servers, and how loads will be balanced are among the additional issues that need to be researched and understood before any actual development begins.
"Most of the time you also have to consider the conversion of older data, too," said Do, as well as look ahead to see whether what you are planning to build will, itself, be easily scaled or migrated to handle eventual growth. Security must also be addressed. Early on it is important to establish who has access and who can do what. Frequently, such questions are divided into group levels and user levels. The group can look at a specific table and update that table and certain records. Or, by focusing on individual users, they can nail down specific permissions and determine what portion of the data individuals will be able to view.
Getting closer to the time when you can begin actual hands-on development, further tasks include scoping out data tables, establishing naming conventions for fields and preparing to normalize data. "Ideally, normalization should save space and provide good performance," said Do. "What you want to avoid is things like bad data and lookups that don't exist," he added. Rules -- again based on understanding the data and the business applications -- play a key role.
Once you design the database and tables, you can also consider whether you want to employ indexing, which can help your server run with less utilization so the user will get data as quickly as possible.
According to Do, store procedures are also important. These are SQL select statements and commands that are compiled and that an application can call on. "They are much faster than what you run in embedded SQL at the application level," said Do.
Assuming you master everything involved with architecting your database -- and that you have put in place all the right features -- you will still have to contend with the real world where things almost never work as well as they are supposed to. Once a database grows a bit, for instance, you are likely to see bottlenecks. Indeed, said Do, "performance tuning and indexing is where people spend most of their time." Fortunately, there is an abundance of tools. Do cited SQL Server, which has its own performance tuner and a profiler that allows you to record all the activity going on in that database. He said it can report on the "cost" of each query and how long it takes, while a separate indexing performance tool tells you if there are things you could index to improve the performance of that query. Third-party tools like Rational Rose (Rational was recently acquired by IBM) can also test queries to see if they are efficient.
Finally, Do recommends thinking through your maintenance needs and what kind of talent will be needed or available to handle those tasks.
Vendors weigh in
Microsoft's emphasis on ease of development has informed the course of SQL Server. The company bills it as a complete database platform with integrated business intelligence, development and management tools at no additional charge. Microsoft officials say they have focused on providing business value out of the box with features like OLAP, data mining, data warehousing, Accelerator for BI, SQLXML and Web services toolkits, Notification Services, SQL Server CE, SQL Server 2000 (64-bit) and MSDE.
With SQL Server, the company aims to provide administrators and developers with a full set of database tools out of the box. The product also includes additional tools and components that can extend databases functionality with full-text searching capabilities, OLAP, data mining, reporting and natural language queries. Downloadable tools, including extended XML capabilities in SQL Server 2000, a View Mapper and tools for creating XML Web services via SQLXML 3.0, are also available.
Meanwhile, at Oracle Corp., Redwood Shores, Calif., Ted Farrell, chief architect for application development tools, noted that the company has been building development tools for 20 years and is now focusing on ways to boost the productivity of the whole database development team. For example, he said, the knowledge of the DBA is crucial to a database development effort, but it is not always fully engaged.
He said one of Oracle's goals is to unify the database development experience by building a level of abstraction above the technology so that everyone can focus more on business logic rather than on development details. "One of our slogans is productivity with choice," he said. In practice, that means groups can approach database development in different ways, depending on their strengths. According to Farrell, a product called the Oracle Application Development Framework can help unify the effort to produce final results.
"You can pick the work style and technology that will boost production for the entire group," he said.
Farrell believes that third-party products have a place -- a natural result of the size of the market -- but he insists that Oracle, with lots of feedback from demanding customers, has been in an ideal position to enhance the development process with its own toolset. "I think independent tool vendors are almost becoming a thing of the past since they don't control their own platform," he said.
Jeff Jones, IBM's director of strategy for DB2 Information Management Software, said the first thing a database developer needs to understand is where the data will come from. "That sounds simplistic, but the choices of where you are going to get data are not simple anymore," he said. Database developers are challenged to reach out across a distributed data environment. Jones said IBM's DB2 Information Integrator was designed with that challenge in mind. Indeed, he said the Integrator product is reflective of the extremely distributed nature of data, which is compelling a manage-in-place approach -- almost the opposite of the traditional tendency to centralize information. The Integrator understands how to work with the SQL in other platforms, such as Oracle or Informix, and provides linkages via ODBC. The end result, Jones said, is that you can write a query that can gathers data from multiple sources but still looks like a single database.
Working out the meaning of data elements with end users is also an important development task. "Ironing out sources and agreements with the end-user community is your number one job," he said.
IBM also provides a large tool set through WebSphere studio. "Since the beginning, we have been tightly woven to DB2 so the connectivity can be fast and native," said Jones.
Looking to the future, Jones said IBM's goal is self-managed or "autonomic" computing. "As we develop ever more complex capabilities we must recognize that usage can't get more complex," he said. As an example of how IBM is trying to make complexity go away, Jones described a cost-based optimizer within the toolset that can rewrite SQL without human involvement, and a configuration advisor that can help get a DB2 database up and running, as well as tuned.
Another capability called Health Center can monitor and correct problems based on pre-written policies that, in turn, are created through a process that implements complex conditional thresholds.
"Our big focus is to make the DBA's life simpler so they can focus on design and higher order things rather than just on backup and recovery," he said.
The third-party view
Robin Schumacher, vice president of product management at Embarcadero Technologies Inc., a San Francisco tool vendor, defends the role of third-party suppliers. "Probably the most interesting thing we see today is that many developers have to switch back and forth between different platforms," he said. Although SQL provides a common thread, the procedural code differs sharply among different vendors.
"This is where developers often have trouble," he added. To help people up the learning curve, Embarcadero offers Rapid SQL, a single tool that supports the movement of SQL code between platforms. "If I'm an Oracle developer and I need to do work in Microsoft SQL, our tool will create the appropriate procedural language," he explained. It is a capability that can jumpstart projects and expand the capability of individuals, he suggested.
Tuning and troubleshooting are also an area where Embarcadero has responded to the developer community, Schumacher noted. "If you have a database application where the processor used to complete a task in a minute and it now takes 30 minutes, there could be hundreds of lines of code involved," he said. So the Embarcadero tool can work through the code, line by line, with a timing mechanism that indicates which lines take the longest to run. "We also have a profiling tool that can show you the history of the application and help identify when performance began to degrade and what code changes were associated," he said.
The tool does not simply point fingers; it can also optimize through an iterative process. "For example, it might rewrite the code 40 times and identify which rewrite was the fastest," he said.
"Especially in smaller shops, you are more likely to find developers moving into the DBA role," he said. "But if they don't know how to lay the proper foundation, they can get in trouble."
That is where database design tools come in. Not surprisingly, Embarcadero has offerings there, too. "These tools help you lay out the interrelationships in a data model and then generate the database itself from those models," he said. Not forgetting that databases can have a long life, Schumacher points out that tools -- again Embarcadero offers some -- also need to help with maintenance. One Embarcadero tool creates documentation that can be easily shared and updated. DBArtisan, the company's administration tool, works across multiple platforms and "can automatically create objects," he explained.
"As a former DBA I can't tell you how many times I have had developers say in response to a problem, 'Gee, I just modified 1,000 lines of code,' but now they need to put the code back to where it was two weeks ago. We have a change manager feature that a developer can run every night -- it can even be run automatically," Schumacher said.
Schumacher listed two final issues involving database development. For one thing, he said, developers are always given the job of migrating data, a task that can be simplified with Database Studio. And testing is always a major challenge. Here, again, without the proper tools, developers may be burdened with the need to write their own test scripts and then try to draft some colleagues into "banging on the application" manually. In addition to being labor-intensive, that style of testing may not provide a very realistic perspective on what 1,000 users will do to the system when it is up and running. So, he noted, Embarcadero offers its own automated "extreme" testing capability.
"The way I look at it is as a developer, where at each stage you need tools for certain issues," said Bryan Huddleston, product marketing manager for database development solutions at Irvine, Calif.-based Quest Software Inc. Developers, in his view, now rely on a range of technologies. For instance, in the case of Oracle, that means everything from PL/SQL to Java and C++. Development tools need to span that range, he said.
Quest offers a product called Toad that provides a GUI to Oracle to speed up PL/SQL development and simplify administration. Toad has advanced editors that allow users to work on multiple files simultaneously -- even different file types such as SQL, PL/SQL, HTML, Java and text. Hot keys, auto-correct, type-ahead, syntax highlighting, version control and numerous other productivity features speed development, while editing and testing are made easier with integrated result sets, explain plans, tracing and DBMS_OUTPUT views.
But David Antsey, president of the Oracle Development Tools User Group (ODTUG), warns that real database development expertise does not come cheap. Too many tools, he said, rely on object technology, which is not the same thing as the SQL that provides the bedrock database functionality. "They are like vinegar and water," that just do not mix well, he said. What he would like to see are more developers with a higher level of comfort with SQL. Of course, he admitted, the problem goes beyond the nitty-gritty of coding. He said the age-old problem is that the IT world has not always embraced or understood the importance of the database on the back end.
"That's just as true now as ever," he said. Furthermore, said Antsey, "there has always been a mindset in the OO world that the database is just a big garage where you store your junk." Getting past that mindset and understanding the subtleties, challenges -- and pleasures -- of the database is what is needed, he said, if anyone is going to master the technology.
Meanwhile, database design may take significant new directions. Viewers predict a trend in the works that would see business process modeling, database design and UML object system design used in some type of combination. Pivotal database design tools such as Computer Associates' venerable ERwin may show the way. Early last month, the vendor released a version of its AllFusion Modeling Suite that is anchored by the ERwin Data Modeler and includes an IDE for automating design processes, synchronizing data models and troubleshooting architectural problems.
Please see the following related stories:
"Getting real about real-time BI" by Alan R. Earls
"New wrinkles in data integration business" by Alan R. Earls
"Bringing DB developers into the fold" by Alan R. Earls
"Sorting out the meaning of BPM" by Wayne Eckerson