Linked Servers

Thursday Apr 4th 2002 by Neil Boyle
Share:

Linked servers provide connectivity to external data sources, allowing you to access data and run procedures on servers other than the one you are currently on. Neil Boyle's latest article explores linked servers in complete detail, with illustrations and code examples to assist in fully explaining the topic.

Linked servers provide connectivity to external data sources, allowing you to access data and run procedures in SQL Server other than the one you are currently running on.

Moreover, linked servers provide connectivity to just about any data source you can get an ODBC driver for, providing you with a DML interface to Oracle databases, Excel spreadsheets, flat files and many others. In many cases, this makes linked servers a viable alternative to using DTS or BCP to get data from external sources.

In Part One of this article we will look at connecting SQL Servers together as linked servers, and in Part Two we will consider connectivity to other data sources.

To get the best out of the example code, you will need to be familiar with the SQL Server client connectivity tool and with setting up and permissioning Login and User IDs.

All the examples here are created using SQL Server 2000, though they should work fine on SQL Server 7 as well.

Linked SQL Servers

Linking two servers creates a "one way" connection between the two servers:

Two servers linked together
Click image for full size

So if I log on to my MiyajimaXP server and set up Fuji as a linked server, I will then be able to query data and run stored procedures on the Fuji servers without directly logging on to the Fuji server. It does not set up a two way link, though--I cannot yet go to the Fuji server and access the MiyajimaXP server from there. To accomplish this, I would have to set up another linked server going back the other way.

Setting up a basic Linked Server

Step one in setting up a linked server is to ensure that you have Network Connectivity between the two servers, and to set up an alias using the SQL Server Client Connectivity tool. Once that is done, we need to tell the "local" server about the linked (or "remote") server. Here's the SQL to set up a linked server:

sp_addlinkedserver 
    'Fuji,
    , N'SQL Server'

In this example I have used the Alias name set up using the Client Connectivity tool--that's just to keep things simple for now--later on we will look at using logical names for servers and using multiple links.

Once that's done, we have to handle permissioning between the two servers--we need to specify what user and password we will use when logging in to the linked server.

By default, any logins that exist on both servers will be allowed to execute queries across the link, provided that the passwords are identical on each server--otherwise the user will get a "Login failed" message. Ensuring user passwords and permissions are in sync for both servers may be an administrative headache for the DBA, but there you go.

Setting up linked server logins

There are other approaches you can take to linked server security. One is to ensure that any login across the linked server uses only a standard login ID at the remote end. This command will ensure that anyone that accesses the Fuji server logs into that server with the same Login ID.

sp_addlinkedsrvlogin 
    @rmtsrvname = FUJI
    , @useself = false
    , @locallogin = NULL
    , @rmtuser = fuji_PUBS
    , @rmtpassword = fuji_PUBS

This command maps all the Local logins to the login "fuji_PUBS" on the remote server--so all the accesses over the linked server will inherit the permissions of fuji_PUBS on the remote server. Using this technique, user passwords and permissions do not need to be kept in sync, but if you change the password for fuji_PUBS on the linked server, you will have to reset the Linked server login also.

Testing the link

A simple way to test the link is to run this basic query--the first line shows your connectivity on the local server and the second line shows the connectivity details on the remote server:

select @@serverName, user_name(), db_name(), @@spid

select * from 
   openquery (FUJI, 'select @@serverName, user_name(), db_name(), @@spid')

In this simple example you can also see the basic syntax for running queries against a linked server, but more on that later.

More on logins

The fuji_PUBS login will now become the default login for anyone accessing the linked server, and they will of course inherit the permissions of that login on the remote server. We can override this for specific users by mapping their logon ID to another login on the linked server. This example sets up a link between a login called 'Tokyo' that exists on both servers:

sp_addlinkedsrvlogin 
    @rmtsrvname = FUJI
    , @useself = true
    , @locallogin = tokyo
    , @rmtuser = tokyo
    , @rmtpassword = Tokyo

Now, if anyone signs in as "Tokyo" and accesses the linked server, they will log in to the linked server with their own ID, while all other users will still get the default ID on the linked server.

Dropping Logins

You can drop a specific linked server login with the command:

sp_droplinkedsrvlogin 
    @rmtsrvname = FUJI
    , @locallogin = Tokyo

Dropping Servers

If you need to drop an entire server, you can do so with the sp_dropserver command. All the logins associated with that server must first be dropped before the server itself can be dropped, but the easy way to do that is to specify the 'droplogins' parameter of the sp_dropserver SP, which gets it to do the hard work for you.

sp_dropserver fuji, droplogins

Logical Server Names

You might want to set up multiple links to the same target server, an obvious reason for this being to lock down security. You might want one set of users to access one database and another to access a different database, but not let the two groups have access to both databases.

Multiple links to the same target server
Click image for full size

The code below will set up two logical linked servers to the same machine--one will be designed to access the PUBS database (called fuji_PBS_DB) while the other (fuji_NORTHWIND_DB) gives access to the Northwind database.

To test the code, you will need to set up two logins on your target server, called fuji_NORTHWIND and fuji_PUBS, and grant access to each in the relevant databases.

sp_addlinkedserver    	
fuji_PBS_DB, droplogins
   @server='fuji_PUBS_DB', 	
   @srvproduct='',
   @provider='SQLOLEDB', 
   @datasrc='FUJI'
go

sp_addlinkedsrvlogin 
   @rmtsrvname = fuji_PUBS_DB
   , @useself = false
   , @locallogin = NULL
   , @rmtuser = fuji_PUBS
   , @rmtpassword = fuji_PUBS
go

sp_addlinkedserver    	
fuji_PBS_DB, droplogins
   @server='fuji_NORTHWIND_DB', 	
   @srvproduct='',
   @provider='SQLOLEDB', 
   @datasrc='FUJI'
go

sp_addlinkedsrvlogin 
   @rmtsrvname = fuji_NORTHWIND_DB
   , @useself = false
   , @locallogin = NULL
   , @rmtuser = fuji_NORTHWIND
   , @rmtpassword = fuji_NORTHWIND
go

Again, we can run our simple test to demonstrate the difference between the two servers:

select * from openquery 
   (fuji_PUBS_DB, 'select @@serverName, 
    user_name(), db_name(), @@spid')

select * from openquery 
   (fuji_NORTHWIND_DB, 'select @@serverName, 
    user_name(), db_name(), @@spid')

[Each of the above selects originally appeared on a single line. We've wrapped them to multiple lines here for Web formatting purposes. -Ed.]

A Point About Security

What we have done here is to set up two linked servers and map all the logins from the local server onto one specific login on the linked server. This means that anyone who knows about the existence of the linked server setup and has access to the local server will also have access to the relevant data on the linked server as well.

Given that I said we wanted to separate the two groups of users and not give them access to the other group's data, this is not exactly an ideal solution.

In this instance it would be better to map individual logins on the local server to their required login on the linked server.

Another Point About Security

As this technique can 'open up' your servers if you do not use it carefully, it's probably best not to shout about it. Do not give the functionality directly to end users--encapsulate it in stored procedures, functions or client-side code, and keep strict controls on how users and developers use it. Users only need their data--they don't need to know how it got there.

Using Your Linked Server

Having set all this up, what can we to with it? Let's look at some examples:

We have already seen the basic syntax for accessing the remote data:

Select * from openquery (server, 'your query here')

This causes SQL Server to open a connection on the target server, log in, execute the query you specified, return the data to the server you are working on, and finally shut down the link.

This query does exactly the same thing, but the syntax is much neater and more flexible to use:

select * from fuji_PUBS_DB.pubs.dbo.authors

With this simplified syntax we can easily join tables on different servers. On my remote server I changed the name of a couple of people in the authors table of the PUBS database, and produced this simple query to return the data that was different across the two servers:

select r.au_fname as remote_fname, 
  r.au_lname as remote_lname, 
  l.*
from  authors l 		
join  fuji_PUBS_DB.pubs.dbo.authors r
on    l.au_id = r.au_id
where l.au_fname <> r.au_fname OR l.au_lname <> r.au_lname

Performance Issues

The trip to the remote server consists of a large overhead, not only in logging in to the remote server, but in passing the data back across the network. When I ran a SHOWPLAN performance analysis on the above query, it estimated that the trip to the remote server took up more than twice the resources it needed to access the data locally. When I padded out the authors table on both servers with some extra data and run the showplan again, the trip to the remote server then took three times as much resources.

If you pass back more data than you need from the remote server, you will decrease the speed of the query and possibly impact performance on the network as a whole. In extreme cases, your Network Manager may well come and shout at you!

Cutting down on traffic

Let's say for argument's sake that I "suspect" certain rows from the pubs table on the remote server are incorrect. I can improve performance only by bringing these suspect rows back across the network.

By simply coding the row IDs into the Where clause, SQL Server 2000 is smart enough to include the clause in the query it sends to the remote server, so only the rows I specify are sent back, not the entire table.

select r.au_fname as remote_fname, 
  r.au_lname as remote_lname, 
  l.*
from  authors l 		
join  fuji_PUBS_DB.pubs.dbo.authors r
on    l.au_id = r.au_id
where (
      l.au_fname <> r.au_fname OR
      l.au_lname <> r.au_lname
      )
and   r.au_id in (
      '172-32-1176', 
      '213-46-8915', 
      '238-95-7766', 
      '267-41-2394' )

Note also that I chose to display in the result set the entire row from the local server, but only the au_fname and au_lname columns from the remote server. If I had used "select *" then SQL Server would have had to pull back all the columns, even though I was not particularly interested in looking at them here. SQL Server 2000 was smart enough to realize it only needed to pull back those two columns, plus the au_id column I used in the where statement--a neat bit of programming from the guys at MS!

You can check if your remote query is optimized in this using Query Analyser by displaying the execution plan for your SQL and hovering the cursor over the "remote query" icon.

Calling Stored Procedures

Stored procedures on the remote server can be executed easily, but first the server needs to be configured for RPC, which is just a one-line command:

sp_serveroption fuji_PUBS_DB, [rpc out], true

This only needs to be done once--we can now call stored procedures using the same 4-part naming style we used to reference tables:

exec fuji_PUBS_DB.master.dbo.sp_who2

Updating Remote Data

Updating tables on the linked server is not a problem either. Here is a simple example based on the select statement we used to join the authors tables on the remote servers:

UPDATE fuji_PUBS_DB.pubs.dbo.authors
SET   au_fname = r.au_fname
,     au_lname = r.au_lname
from  fuji_PUBS_DB.pubs.dbo.authors r
join  authors l
      on l.au_id = r.au_id
where (
      l.au_fname <> r.au_fname OR
      l.au_lname <> r.au_lname
      )

With this UPDATE command there needs to be two separate trips made to the linked server: Trip one pulls the data from the linked server back into the local server, where the join takes place. The second trip performs the actual update once SQL Server decides which rows need updating.

More performance issues

In this example I have left out the list of "suspect" rows, but if I had left them in, the query performance would be better because the rows could be filtered at the linked server end, resulting in less data flying over the network, and less work to do in the join.

Like all update transactions, updates to linked servers are done in a transaction--and as the transaction will take longer than a purely "local" transaction, this means more potential for locking and blocking on the remote server, so distributed transactions like this need special care to ensure they are designed efficiently.

You can also wrap up multiple updates in a single distributed transaction--though obviously the concerns about resource utilization still apply, only more so. A full run-down on Distributed Transactions would probably double the size of this article though, so I will let MSDN take over on this issue!

Loopback Linked Servers

You can set up a linked server to point to itself, but why would you want to?

One reason Microsoft cites is that you may only have one server to develop on, but need to test linked servers for a live environment with more than one server (See, you can try this at home!)

Loopback linked servers do come with a couple of drawbacks though--you cannot use them for Distributed Partitioned Views, and you cannot use them for nested transactions (which limits their potential for testing Update statements).

One other good reason for using Loopback Lined Servers is the Application Role. Once an Application Role has been started, you will not be able to directly access a different database on the same server, even if your Login ID has the necessary data rights. This is because the Application Role suspends all the rights associated with the login ID and replaces them with its own. However, you can still access Linked servers, so you can get at the other databases in this manner.

Note that although Network Traffic will not be an issue with Loopback Linked Servers, there are still processing overheads to deal with, as the server still has to initiate a new connection every time you access the Loopback Linked Server.

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved