Expert One-on-One: Oracle: Pt. 4

Thursday Oct 11th 2001 by DatabaseJournal.com Staff
Share:

This fourth and final installment looks at making the database run faster and the DBA-developer relationship.

How Do I Make it Run Faster?

The question in the heading is one I get asked all the time. Everyone is looking for the fast = true switch, assuming 'database tuning' means that you tune the database. In fact, it is my experience that more than 80 percent (frequently much more, 100 percent) of all performance gains are to be realized at the application level — not the database level. You cannot tune a database until you have tuned the applications that run on the data.

As time goes on there are some switches we can 'throw' at the database level to help lessen the impact of egregious programming blunders. For example, Oracle 8.1.6 adds a new parameter, CURSOR_SHARING=FORCE. This feature implements an 'auto binder' if you will. It will silently take a query written as SELECT * FROM EMP WHERE EMPNO = 1234 and rewrite it for us as SELECT * FROM EMP WHERE EMPNO = :x. This can dramatically decrease the number of hard parses, and decrease the library latch waits we discussed in the Architecture sections — but (there is always a but) it can have some side effects. You may hit an issue (a.k.a. 'bug') with regards to this feature, for example in the first release:

ops$tkyte@ORA8I.WORLD> alter session set cursor_sharing=force;
Session altered.

ops$tkyte@ORA8I.WORLD> select * from dual where dummy='X'and 1=0;
select * from dual where dummy='X'and 1=0
                               *
ERROR at line 1:
ORA-00933: SQL command not properly ended

ops$tkyte@ORA8I.WORLD> alter session set cursor_sharing=exact;
Session altered.

ops$tkyte@ORA8I.WORLD> select * from dual where dummy='X'and 1=0;
no rows selected

The way they rewrote the query (because of the lack of whitespace between 'X' and the word AND) didn't work in 8.1.6. The query ended up being:

select * from dual where dummy=:SYS_B_0and :SYS_B_1=:SYS_B_2;

The key word AND became part of the bind variable :SYS_B_0. In 8.1.7, however, this query is rewritten as:

select * from dual where dummy=:”SYS_B_0”and :”SYS_B_1”=:”SYS_B_2”;

This works syntactically but might negatively affect your program's performance. For example, in the above, notice how 1=0 (also False) is rewritten to be :”SYS_B_1” = :”SYS_B_2”. The optimizer no longer has all of the information at parse time, it can no longer see that this query returns zero rows (before it even executes it). While I don't expect you to have lots of queries with 1=0 in them, I would expect you to have some queries that do use literals in them on purpose. You may have a column with very skewed values in it, for example 90 percent of the values of the column are greater than 100, 10 percent are less then 100. Further, 1 percent is less then 50. You would want the query:

select * from t where x < 50;

to use an index, and the query:

select * from t where x > 100;

to not use an index. If you use CURSOR_SHARING = FORCE, the optimizer will not have the 50 or 100 values to consider when optimizing — hence it will come up with a generic plan that probably does not use the index (even if 99.9 percent of your queries are of the type WHERE x < 50).

Additionally, I have found that while CURSOR_SHARING = FORCE runs much faster than parsing and optimizing lots of unique queries, I have also found it to be slower than using queries where the developer did the binding. This arises not from any inefficiency in the cursor sharing code, but rather in inefficiencies in the program itself. In Chapter 10, Tuning Strategies and Tools, we'll discover how parsing of SQL queries can affect our performance. In many cases, an application that does not use bind variables is not efficiently parsing and reusing cursors either. Since the application believes each query is unique (it built them as unique statements) it will never use a cursor more than once. The fact is that if the programmer had used bind variables in the first place, they could have parsed a query once and reused it many times. It is this overhead of parsing that decreases the overall potential performance you could see.

Basically, it is important to keep in mind that simply turning on CURSOR_SHARING = FORCE will not necessarily fix your problems. It may very well introduce new ones. CURSOR_SHARING is, in some cases, a very useful tool, but it is not a silver bullet. A well- developed application would never need it. In the long term, using bind variables where appropriate, and constants when needed, is the correct approach.

Even if there are some switches that can be thrown at the database level, and they are truly few and far between, problems relating to concurrency issues and poorly executing queries (due to poorly written queries or poorly structured data) cannot be fixed with a switch. These situations require rewrites (and frequently a re-architecture). Moving datafiles around, changing the multi-block read count, and other 'database' level switches frequently have a minor impact on the overall performance of an application. Definitely not anywhere near the 2, 3, ... N times increase in performance you need to achieve to make the application acceptable. How many times has your application been 10 percent too slow? 10 percent too slow, no one complains about. Five times too slow, people get upset. I repeat: you will not get a 5-times increase in performance by moving datafiles around. You will only achieve this by fixing the application — perhaps by making it do significantly less I/O.

Performance is something you have to design for, to build to, and to test for continuously throughout the development phase. It should never be something to be considered after the fact. I am amazed at how many times people wait until the application has been shipped to their customer, put in place and is actually running before they even start to tune it. I've seen implementations where applications are shipped with nothing more than primary keys — no other indexes whatsoever. The queries have never been tuned or stress tested. The application has never been tried out with more than a handful of users. Tuning is considered to be part of the installation of the product. To me, that is an unacceptable approach. Your end users should be presented with a responsive, fully tuned system from day one. There will be enough 'product issues' to deal with without having poor performance be the first thing they experience. Users are expecting a few 'bugs' from a new application, but at least don't make them wait a painfully long time for them to appear on screen.

The back cover of this book talks of the importance of a DBA knowing what the developers are trying to accomplish and of developers knowing how to exploit the DBA's data management strategies. It's certainly true that the most successful information systems are based on a symbiotic relationship between the DBA and the application developer. In this section I just want to give a developer's perspective on the division of work between developer and DBA (assuming that every serious development effort has a DBA team).

As a developer, you should not necessarily have to know how to install and configure the software. That should be the role of the DBA and perhaps the SA (System Administrator). Setting up Net8, getting the listener going, configuring MTS, enabling connection pooling, installing the database, creating the database, and so on — these are functions I place in the hands of the DBA/SA.

In general, a developer should not have to know how to tune the operating system. I myself generally leave this task to the SAs for the system. As a software developer for database applications you will need to be competent in use of your operating system of choice, but you shouldn't expect to have to tune it.

Perhaps one of the biggest concerns of the DBA is how to back up and restore a database, and I would say that this is the sole responsibility of the DBA. Understanding how rollback and redo work — yes, that is something a developer has to know. Knowing how to perform a tablespace point in time recovery is something a developer can skip over. Knowing that you can do it might come in handy, but actually having to do it — no.

Tuning at the database instance level, figuring out what the optimum SORT_AREA_SIZE should be — that's typically the job of the DBA. There are exceptional cases where a developer might need to change some setting for a session, but at the database level, the DBA is responsible for that. A typical database supports more than just a single developer's application. Only the DBA who supports all of the applications can make the right decision.

Allocating space and managing the files is the job of the DBA. Developers will contribute their estimations for space (how much they feel they will need) but the DBA/SA will take care of the rest.

Basically, developers do not need to know how to run the database. They need to know how to run in the database. The developer and the DBA will work together on different pieces of the same puzzle. The DBA will be visiting you, the developer, when your queries are consuming too many resources, and you will be visiting them when you cannot figure out how to make the system go any faster (that's when instance tuning can be done, when the application is fully tuned).

This will all vary by environment, but I would like to think that there is a division. A good developer is usually a very bad DBA, and vice versa. They are two different skillsets, two different mindsets, and two different personalities in my opinion.

Summary

Here we have taken a somewhat anecdotal look at why you need to know the database. The examples I have given are not isolated — they happen every day, day in and day out. I observe a continuous cycle of this happening over and over again and again. Let's quickly recap the key points. If you are developing with Oracle:

  • You need to understand the Oracle architecture. You don't have to know it so well that you are able to rewrite the server if you wanted but you should know it well enough that you are aware of the implications of using a particular feature.
  • You need to understand locking and concurrency control and that every database implements this differently. If you don't, your database will give 'wrong' answers and you will have large contention issues — leading to poor performance.
  • Do not treat the database as a black box, something you need not understand. The database is the most critical piece of most applications. To try to ignore it would be fatal.
  • Do not re-invent the wheel. I've seen more then one development team get in trouble, not only technically but on a personal level, due to a lack of awareness what Oracle provides for free. This will happen when it is pointed out that the feature they just spent the last couple of months implementing was actually a core feature of the database all along.
  • Solve problems as simply as possible, using as much of Oracle's built-in functionality as possible. You paid a lot for it.
  • Software projects come and go, programming languages and frameworks come and go. We developers are expected to have systems up and running in weeks, maybe months, and then move on to the next problem. If we re-invent the wheel over and over, we will never come close to keeping up with the frantic pace of development. Just as you would never build your own hash table class in Java — since it comes with one — you should use the database functionality you have at your disposal. The first step to being able to do that, of course, is to understand what it is you have at your disposal. Read on.
Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved