Beginning SQL Programming: Pt. 4

Thursday Aug 9th 2001 by DatabaseJournal.com Staff
Share:

Our final excerpt covers data warehouse with multiple SQL requesters, the human roles in DBMS, when to use SQL, scaling, speed, price, universality, and analytic capabilities, among other areas.

Data Warehouse with Multiple SQL Requesters

Last, let us look at an example that is probably realistic for most businesses. We have one datastore for the enterprise that consists of several data servers and a mainframe, collectively called a Data Warehouse or a Distributed Data Processing system. That conglomerate of a datastore receives SQL requests for internal business operations from a business logic server. The business logic server is processing operations as directed by employees using PCs and front-end software on their desks.

The company also has a website serviced by a web server. The web server interacts with visitors and then sends SQL requests to the same enterprise data warehouse. Although there are many brands of software involved and many platforms they are all able to speak and understand SQL.

We can trace the information flow in this graphic by starting with the PCs on the right. They are connected to either the Department Server or the Web Server. These servers can then create a SQL request using their business logic software and pass the request to one or more of the Data Servers. This is usually done through a high-speed LAN which only connects servers (represented on the far left). After the SQL Request has been processed the results can return from left to right through the servers and to the PCs.

Data Warehousing Diagram
(Click image for full size)

The Human Roles in a DBMS

With the increasing size and complexity of computing tasks, IT departments frequently grow to have several specialists contributing to the database solutions of the company. If you are coming from a background in desktop databases you may not have a clear idea of the titles and roles of the players. Obviously, in a small company one person may have more than one of these roles –  as a SQL programmer you may even be expected to cover responsibilities beyond just programming.

The DataBase Administrator (DBA) specializes in installing and maintaining the DBMS, including the physical devices, backups and recoveries. Frequently the DBA is the person responsible for maintaining the security scheme as well as troubleshooting the DBMS. Note that the DBA is not usually specifically responsible for any given database. Rather, the DBA keeps the DBMS running as a service for all the users that have databases on that DBMS. Each DBMS vendor has training and certification programs for its particular systems.

The Systems Administrator (sometimes called the System Operator or "SysOp") specializes in the Operating System (OS) and connectivity of the servers, both data servers and others. "SysAdmins" perform back-ups of the OS and monitor the traffic load between servers. "SysAdmins" are likely also to be involved in the security settings of servers needed by SQL programmers.

A newer position is the Security Administrator. Valuable corporate data is now exposed through the web to a degree that would horrify the last generation of IT professionals. As hackers improve their sophistication and means of collaboration, IT shops have responded by hiring individuals with specific training in establishing hardware and software safeguards and in monitoring the servers for signs of intrusion. SQL programmers may have to work closely with a Security Administrator to develop ways of using data that minimize risk, for example, creating Views rather than permitting full table access (see Chapter 16).

A Database Designer creates the overall scheme of the datastore, generally using specialized database design tools. The designer will not only understand the principles of databases but also the client's business model. The result from the designer's desk will be a layout of tables, relationships and queries needed to satisfy the client's needs.

The Database Programmer writes code and interfaces to implement the design. The programmer may be less familiar with the business rules and more familiar with how to write SQL statements and front ends to implement the design. Whereas Database Designers are more involved at the start of a project, programmers are involved for as long as the database is in use and changes and amendments need to be performed. Many programmers joining a team will have no contact with the designer who left after the database deployment. On legacy systems it may even be difficult to find any manuals regarding how the system was designed.

Last, never forget the user who is, directly or indirectly, paying your fee.

When to Use SQL

My students frequently ask, "When should I be using SQL?" They may have created a database in a lower level desktop system and are considering switching to a DBMS that will require SQL. SQL is a language, not a software product, so the real question is best phrased as "When do I need to use a heavy-duty DBMS, which would then require me to start communicating in SQL?" There are several factors to consider which we'll look at in turn in this section.

Scaling

Many small sites or applications start in desk top systems such as Access. As they grow, a number of problems become apparent with the Access JET database engine. The primary problem is that it was never designed to support many users at once. Database software designed for the desktop generally fails when more than a few people try to use the data at the same time. Microsoft Access will perform well if five users are on a LAN. When you get to ten or twenty concurrent users, problems begin. Obviously a desktop database cannot handle the problems introduced by large numbers of concurrent hits from a web site.

Second, the Access file-based system becomes fragile when the amount of data starts to climb into the gigabyte range. Lower scale systems like Access also lack the security that more powerful systems offer. Access databases, for example, are quite easy to copy and walk away with. The individual installations can also be readily accessed.

So for these reasons companies move from Access type desk top systems to a more robust DBMS. Almost all of these heavy duty DBMS rely on SQL as the main form of communication. In summary, although there is nothing about SQL itself that concerns scaling, there is a need to use SQL in order to communicate with DBMS that scale well.

Speed

When SQL-enabled DBMS first appeared they were slower than previous DBMS. However, as they have taken over the market during the last ten years, SQL data engines have been the focus of an intense effort to improve performance. There are heated contests between the major vendors for bragging rights to the fastest machine with the lowest transaction price. But the intense competition drives the vendors to produce faster, more robust DBMS that work at lower and lower costs per transaction.

When you consider your quest for improved speed, consider that pure speed problems will be evident at all times, not just at peak usage. They are usually the result of more complex queries, particularly with the advanced techniques we study later like multiple joins on data that cannot be indexed. Although SQL itself does not cure speed problems, the implementation of faster DBMS does, and those faster DBMS will probably require communications in SQL.

Price

It is more expensive to run a server-centric DBMS (that only speaks SQL) than it is to use a desktop system such as Paradox or Access (for which you do not necessarily need to use SQL). First, the software is more expensive. Second, in most cases you have to run more expensive operating systems in order to support the DBMS. Third, you have to tune the OS differently to optimize for a DBMS than for other applications, so you generally need a server dedicated to the DBMS. Last, you will need personnel with more expensive qualifications.

However, the price of hardware and software is frequently the smallest item in an IT budget. As your data center grows, at some point the reliability, performance and standardization benefits of a DBMS that uses SQL will outweigh the cost.

A note on price –  you can get started on a SQL-centric DBMS for almost nothing. In fact, we have included the 120-day trial version of Microsoft SQL Server 2000 with this book. You can download other trial versions such as Oracle Personal Edition. But remember, that these are trial versions; when you are ready to build your business on a SQL-centric DBMS you will be paying a considerable amount more than you did for Access.

Universality

An alternative to using SQL statements is to write code in a procedural language like C++. The problem with this approach is that you are then closely tied to the procedural language, the metadata and the specific DBMS. If there is a change in the structure of the tables the code must change. If a new DBMS is installed, most of the code must be revised to mesh with the new DBMS system of pointers, recordset definitions, etc. But by using SQL statements almost all changes are handled by the DBMS, behind the scenes from the programmer.

Analytic Capabilities

An advanced feature of SQL (which is beyond the scope of this book) is the emergence of analytical tools that allow managers to extract business knowledge from large amounts of data. These tasks frequently require the constructions of multi-dimensional aggregates of data. These tools are referred to by the generic names of Online Analytical Processing (OLAP), Decision Support Systems (DSS) and Executive Information Systems (EIS). Each vendor then has proprietary trademarks for their version. All of these tools are only available in full-scale DBMS that require SQL for their means of communication.

When to Skip Moving to a SQL-Centric DBMS

As much as we have talked about the power of SQL, we must be honest in describing those situations where an alternative system would be better. You are probably better off staying with software that does not require SQL (like Access) in most of these situations:

  • Only one or a few people will use the database at once.
  • The data can be stored within 50% of the disk capacity available to you.
  • You are comfortable with the amount of media and operator time required by your backup plan.
  • You do not expect to significantly expand the quantity of data or number of users.
  • You do not plan to make the database available on an Internet site.
  • The vast majority of the data you use is in a form that is not relational, such as files in folders.

If you the answer is yes to all of the above then a desktop solution like Microsoft Access will be cheaper and easier to learn. Most of what you will need to do can be performed through wizards and the drag 'n' drop interface, thus avoiding the need to invest in the hardware, software and learning for a SQL implementation.

As we discussed earlier on in this chapter, you can still use SQL with Access even though the system does not require it. Hence why we can use a desktop system such as Microsoft Access in order to learn SQL.


Development of a system in a SQL-centric, heavy-duty DBMS is more time-consuming than in a drag and drop DBMS like Access. You may consider prototyping in Access to get a set of forms and reports for the revision and approval process with your clients. Then with a model approved you can create the final product with far fewer hours spent in coding SQL. You may even find that a lot of the queries you made in Access can be cut and pasted into your final code.

Summary

Structured Query Language, tested and true over many years, is the standard way to communicate with a database. SQL is almost universally understood by both front ends and back ends. Front ends present a user interface such as Visual Basic, C++, ASP, and Microsoft Office applications. They have the ability to connect to a datastore and send it a SQL statement. Back ends are Database Management Systems such as Oracle, Microsoft SQL Server, Sybase and sometimes, desktop systems like Access. Back ends can accept a SQL statement and return a result. SQL is also used for communication directly between different DBMS without a user interface.

SQL is a set-based language rather then a procedural language. The programmer creates a single SQL statement describing the desired result and sends it to the DBMS. The DBMS then uses internal code to achieve that result. The internal code will vary among DBMS vendors, but that does not matter to the SQL programmer using ANSI-SQL. Regardless of the DBMS the result will be the same.

In the next chapter we will discuss the specifics of setting up several back ends and front ends. Then we study the code to create a connection and transport your SQL statements from front to back. We'll also cover how to handle the results when they return to the front end.

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved