While databases and spreadsheets are indeed very versatile,
sometimes you have a need to work with text files. Perhaps you were sent a text
file containing data you need to use, or perhaps you need to be able to produce
a text file to send to someone.
While it's possible to read the text file and manipulate it
using VBA, depending on how the data's arranged in the file, it's also possible
to treat the text file as a data table, and use the standard approaches to dealing
with tables instead. This has a major advantage in that you can use SQL to
filter the text in the file.
In this article, I'm going to talk about a couple of ways
that you can do just that.
A couple of warnings
Before going any further, let me state up front that while it's
possible to read from a text file as though it was a table, it's not possible
to update a text file as though it was a table. Figure 1 shows the error
message you'll get it you try.
Figure 1: The error message when trying to update a text
file - Unfortunately, it's just not possible to update a linked text table.
This doesn't mean that you cannot write to a text file: it
just means that you cannot delete data from or insert it to a text file as you
can from a table, nor can you update data existing in the file.
Another potential issue is that, by default, you're limited
(at least in Access) to what file extensions you can use. Out of the box, the
only file extensions you can use are txt, csv, tab and asc.
If you need to get around that restriction, one option is to
programmatically rename the file, work with it, then name it back to its
original name. In VBA, you can use the Name statement to rename files. The following
code will rename file.xyz (in folder C:\Folder) to File.txt:
Name "C:\Folder\File.xyz" As "C:\Folder\File.txt"
Sometimes, though, you may be working with files where you
cannot rename the file, either because you don't have sufficient permissions on
the file, or because others are also using the file. In that case, you might be
able to copy the file to a location where you do have permissions and work with
the copy. In VBA, you can use the FileCopy statement. The following code will
make a copy of File.xyz from folder E:\Folder to folder C:\Data, renaming the
file to File.txt at the same time:
FileCopy "E:\Folder\File.xyz", "C:\Data\File.txt"
If that's not possible, the other option is to modify the
System Registry. If youre working with an MDB file in either the Access 2000
file format or the Access 2002/2003 file format, you're looking for the DisabledExtensions
value under the HKEY_LOCAL_MACHINE\Software\Microsoft\Jet\4.0\Engines\Text subkey
in the registry, as shown in Figure 2.
Figure 2: System Registry entries associated with text
files when using MDB files- Registry changes should never be taken lightly, as
they can cause serious problems that may require reinstallation of the
operating system.
If youre working with an ACCDB file, youre looking for the DisabledExtensions
value under the HKEY_LOCAL_MACHINE\Software\Microsoft\Office\12.0\Access
Connectivity Engine\Engines\Text subkey in the registry, as shown in Figure 3.
Figure 3: System Registry entries associated with text
files when using ACCDB files The same cautions about making Registry changes
apply.
Double-click on the DisabledExtensions label, and the Edit
String window will open up, allowing you to add additional extensions to the
list of extensions allowed, as shown in Figure 4.
Figure 4: The Edit String window - Add the additional
extension(s) to be allowed to the list.
It may seem counterintuitive to add an extension to the DisabledExtensions
value when you want to enable use of that extension. However, note that the
first character in the value is an exclamation point (!). Placing an
exclamation mark at the beginning of the Value data allows you to modify those
files with the extensions listed. When you do not place an exclamation mark at
the beginning of the Value data, you cannot modify files with the extension
listed.
Finally, I'm making a simplifying assumption that all files
are what I'll call "proper" file names, i.e.: they have only a single
period in the file name. While I know that it's possible to use filenames like This
is the text file I created. Joe created a different file, but we won't use
it.txt, the code I'm presenting won't work properly with names like that.
While it's not difficult to rewrite the code so that it will work with file
names like that, I didn't want to introduce that complexity.
Simplistic Approach
The easiest approach in Access, of course, is to manually
link to or import your text file. When you go through File | Get External Data
and select a text file, the Import Text Wizard (see Figure 5) walks you through
the process of defining what the table should look like.
Figure 5: The Import Text Wizard - The wizard should
appear automatically when you go to import or link a text file, and walks you
through the process.
If you click on the Advanced button (bottom left hand corner
of wizard), you get the option to refine the definition, and even to save the
specification in the database (see Figure 6).
Figure 6: The Data Link Specification dialog - This
dialog allows you to specify a number of characteristics of the data, and to
save the specification should you wish to reuse it.
If you save the specification, it's actually stored in two
system tables (which are normally hidden), MSysIMEXSpecs and MSysIMEXColumns.
Once the specification has been stored there, you can then use it
programmatically by passing the name of the specification as a parameter to the
TransferText method:
DoCmd.TransferText acExportDelim, _
"Standard Output", _
"qryAprilDetails", "C:\Txtfiles\April.txt"
In that example, the details stored as specification
"Standard Output" would be used when exporting the data produced by qryAprilDetails
to table C:\Txtfiles\April.txt.
Note that while the TransferText method most often is used to
import text files, the ability to create linked tables exists as well:
depending on the characteristics of your text file, youd use either acLinkDelim
or acLinkFixed instead of acExportDelim in the example above.
While it's not possible to programmatically create or
manipulate stored specifications, it is possible to import existing
specifications from one database into another. When you go to through File |
Get External Data | Import and select the database, click on the Options button
in the bottom right hand corner of the Import Objects dialog (see Figure 7).
One of the options available to you is to import the Import/Export Specs, as
shown in Figure 8.
Figure 7: The Import Objects dialog - Note that the
Options button does not appear when linking.
Figure 8: The Import Objects options - This dialog gives
you some additional capabilities when importing.
Using this technique, you can create linked tables whenever
you need to, and you can export data to text files.
Using IISAM
While the technique outlined above works with Access, there
are times you might want to go beyond those capabilities. As well, you might
want to be able to use the same approach from other Office applications such as
Word or Excel. This is where you might use IISAM (Installable ISAM).
As Michael Kaplan explained in http://msdn.microsoft.com/en-us/library/aa160682(office.11).aspx
"ISAM (Indexed Sequential Access Method) describes a kind of file type
where records are read and written in sequential order but can also be
retrieved using indexes and keys. Most desktop database systems (Jet, dBase,
Paradox) are, in some sense of the term, ISAM systems. IISAM (Installable ISAM)
allows you to load the code to read and write a specific database on an
as-needed basis. Jet comes with IISAMs for dBase, Excel, and text files. The
amount of functionality supported for each file type will vary from IISAM to
IISAM."
When using the text IISAM, the format of the text file is
determined by using a schema information file. This file is always named
Schema.ini, and must be in the same directory as the text file. Essentially,
each entry in the Schema.ini file specifies:
-
The name of the text file
-
The format of the text file (options are delimited, using any
delimiter wanted other than double quotation marks or fixed width)
-
The names, widths and types of all of the fields in the text file
-
The character set used (ANSI or OEM. If not provided, the
information is determined from the Windows Registry)
-
Any special data conversions (date/time format, numeric format,
currency format)
Examples are:
[Employees.txt]
ColNameHeader=True
CharacterSet=ANSI
Format=TabDelimited
Col1=EmployeeID Integer 3
Col2=LastName Char Width 20
Col3=FirstName Char Width 10
Col4=Title Char Width 30
Col5=TitleOfCourtesy Char Width 25
Col6=BirthDate Date 10
Col7=HireDate Date 10
Col8=Address Char Width 60
Col9=City Char Width 15
Col10=Region Char Width 15
Col11=PostalCode Char Width 10
Col12=Country Char Width 15
Col13=HomePhone Char Width 24
Col14=Extension Char Width 4
Col15=Photo OLE
Col16=Notes LongChar
Col17=ReportsTo Integer 3
The specification above indicates that the file Employees.txt
has column headers, that there are seventeen columns in the file, that the text
uses the ANSI character set and that the columns are delimited with tabs. (It
may be worth pointing out that the fifteenth column, named Photo, is really
only a place holder. The IISAM only works with text, so you cannot use it with
binary data such as images)
If you wanted fixed width columns, without column headers,
and you wanted to ensure that the dates were in yyyy-mm-dd format (regardless
of what the machine's Short Date format was set to through Regional Settings),
you could use:
[EmployeesExample1.txt]
ColNameHeader=False
CharacterSet=ANSI
Format=FixedLength
Col1=EmployeeID Integer Width 4
Col2=LastName Char Width 20
Col3=FirstName Char Width 10
Col4=Title Char Width 30
Col5=TitleOfCourtesy Char Width 25
Col6=BirthDate Date Width 10
Col7=HireDate Date Width 10
Col8=Address Char Width 60
Col9=City Char Width 15
Col10=Region Char Width 15
Col11=PostalCode Char Width 10
Col12=Country Char Width 15
Col13=HomePhone Char Width 24
Col14=Extension Char Width 4
Col15=Photo OLE Width 10
Col16=Notes LongChar Width 450
Col17=ReportsTo Integer Width 4
DateTimeFormat=yyyy-mm-dd
Note that because the specification above is for FixedLength,
it's necessary to provide a width for each column (it's optional for delimited
files).
It's not my intention to go through all of the details of the
Schema.ini file. If you want more information, check http://msdn.microsoft.com/en-us/library/ms709353(VS.85).aspx
Working with Schema.ini files
Once you've got the Schema.ini file properly defined in the
same folder as the text file, you can then take advantage of what's in it when
using text files. Why might you want to do this? One big reason would be that
some text files don't have headers in them, so there's no easy way of letting
Access know what the column names should be. Since the Schema.ini file contains
the field names, your linked table will have correct column names, as opposed
to the Field1, Field2, Field3 names Access will assign by default.
There are three distinct ways that you can use the Schema.ini
file to read data. They are:
-
You can import the data into a table in your application
-
You can create a linked table that retrieves the text data
-
You can create a query that retrieves the data without having to
create a table.
For all of the code snippets that follow, assume that strTable
contains the name you want to assign to the table or query you're creating (one
of the three options above), strFolder contains the full path to the folder
where the text file and Schema.ini file exists (without a terminating slash),
and strFile contains the name of the text file.
The following code will create a table (named whatever's
contained in strTableName) containing all of the data contained in strFile (in
folder strFolder):
Dim strSQL As String
strSQL = "SELECT * INTO [" & _
strTable & "] FROM " & _
"[Text; Database=" & _
strFolder & ";].[" & _
Replace(strFile, ".", "#") & "]"
dbCurr.Execute strSQL, dbFailOnError
The reason for the square brackets on either side of strTable
is to handle the case where strTable contains blanks in the name (which I
always consider to be a bad idea!). Note that the name of the file gets
modified so that the period before the file extension is replaced by an octothorp
(#).
Importing the data from a text file into your application is
often an appropriate thing to do, but if the contents of the text file can
change over time, you may wish to link to it, rather than import. I'm using DAO
in the following code to create a linked table (again, named whatever's
contained in strTable) that's linked to strFile (in folder strFolder).
Set tdfNew = dbCurr.CreateTableDef(strTable)
tdfNew.Connect = "Text;DATABASE=" & strFolder & _
";TABLE=" & strFile
tdfNew.SourceTableName = strFile
dbCurr.TableDefs.Append tdfNew
You also have the option of creating a query that retrieves
the current values from the text file, although I'm not sure that there's any
advantage to doing so instead of creating a linked table. Again, I'll use DAO
to create the query (named whatever's contained in strTable) that returns the
data from file strFile in folder strFolder:
Set qdfNew = dbCurr.CreateQueryDef(strTable)
qdfNew.SQL = "SELECT * FROM [Text;Database=" & _
Me.txtFolder & _
";].[" & Replace(strFile, ".", "#") & "]"
(Note that if you're using Access 2000 or Access 2002, your
database may be missing the necessary reference to the DAO library, so that
neither of the previous two snippets will work. Go into the VB Editor and
select Tools | References from the menu. If Microsoft DAO 3.6 Object library
isn't one of the entries checked at the top of the list, scroll through the
list of available references until you find that reference, select it by
clicking on the check box then close the dialog)
The sample database that
accompanies this article includes eight text files with it:
-
Cats.BadfileName.txt
-
Cats.txt
-
CSVCategoriesHeader.txt
-
CSVCategoriesNoHeader.txt
-
Employees.txt
-
EmployeesExample1.txt
-
TabCategoriesHeader.txt
-
TabCategoriesHeader_Subset.txt
Theres also a Schema.ini file that contains definitions for
each of those eight files.
In the sample database, form frmImportUsingSchema (shown in
Figure 9) lets you select one of those eight files and then supply either a
table name (if you select Import to Table or Link to Table) or query name (if
you select Create linked query). When you click on the Import button, the table (or query) will be created for you, and you'll be able to use the data from the text file.
Figure 9: Form frmImportUsingSchema - This form allows
you to import or link to a file.
Exporting data using the Schema.ini file
Schema.ini files can also be used to output data. The easiest
way I've found is to use a simple SQL statement. The following code will create
a text file (named whatever is contained in strFile in whatever folder is
contained in strFolder) containing all of the data contained in strTable:
Dim strSQL As String
strSQL = "SELECT * INTO " & _
"[Text;" & _
"Database=" & srFolder & ";].[" & _
strFile & "] " & _
FROM [" & strTable & "]"
dbCurr.Execute strSQL, dbFailOnError
Form frmExportUsingSchema (shown in Figure 10) lets you
export data from either table Categories or Employees to one of the eight files
listed above. The problem youll encounter demonstrating how this works though,
is that since whats in the Schema.ini file dictates the name of the file to
which the data will be exported and frmImportUsingSchema required that the
files exist in order to be able to import or link to them, the files to which
youll be exporting the data already exist in the folder.
Figure 10: Form frmExportUsingSchema - This form allows
you to export from a table to a file.
My advice is to backup the files once youve played with frmImportUsingSchema
so that you can then replace those files via frmExportUsingSchema
Using the Schema.ini file with ADO
It's also possible to create an ADO connection that will let
you read data from the text file. The files included with the sample database
include djsArticle5_WordExample.doc.
If you open that document in Word 2003, you should see a
custom command bar (shown in Figure 11) that will let you invoke the code
Figure 11: Custom command bar Clicking on the Import
Data button will run the code, clicking on the Open VBEditor button will let
you see the code.
Depending on your Macro security settings, you may not be
able to run the sample code (which means you've got your system protected, a
good thing!). If you get the security warning message shown in Figure 12, you
can change your settings temporarily to use the sample. Select Tools | Macro |
Security from the menu bar (as illustrated in Figure 13), then set the Macro
security level to Medium or Low (as illustrated in Figure 14).
Figure 12: Word 2003 Macro Security warning Word is
trying to protect you from running unknown code.
Figure 13: Navigating to the Macro Security settings
Remember that Word 2003 hides rarely-used menu options. You might have to
search for the Menu option under the Tools menu.
Figure 14: Changing the Macro Security level Changing
to Low will allow you to run the code (changing to Medium means you'll get a
prompt). Remember to change your settings back when you're done!
If you're using Word 2007, the process will be slightly
different. When you open the document, you'll likely see a security warning
that macros have been disabled just below the ribbon (see Figure 15). Click on
the Options button and enable the contents, as shown in Figure 16.
Figure 15: Word 2007 Macro Security warning Security
in all the Office products is stronger than previously.
Figure 16: Enabling the contents in Word 2007 This
setting will only apply to the current document.
Once that is done, rather than the command bar shown in
Figure 11, a new group (Custom Toolbars) appears under the Add-Ins tab, as
shown in Figure 17.
Figure 17: Custom Toolbars in Word 2007 The toolbars
appear as groups in the Add-Ins tab!
To connect to the text file using ADO, you use a connection string along the lines of...
Driver={Microsoft Text Driver (*.txt; *.csv)};
Dbq=C:\Folder\;Extensions=asc,csv,tab,txt;
...and your SQL simply refers to the file name. Just to show that
these techniques can be applied to more than simply Access, I've included a
simple Word document along with this article to show how you can use this. I'm
assuming that the Schema.ini file and the text file Cats.txt exist in the same
folder as the Word document. This code reads the contents of the text file and
inserts it into a table in the Word document: To see the macro run, you need to
ensure that macros are enabled when you open the document. Once thats done,
you need to run the macro ImportData.
Because I'm going to use Late Binding, I define the ADO
constants I want to use, as well as a number of variables:
Const adOpenStatic As Long = 3
Const adLockOptimistic As Long = 3
Const adCmdText As Long = &H1
Dim objConnection As Object
Dim objRecordset As Object
Dim lngColumn As Long
Dim lngRow As Long
Dim lngTableNumber As Long
Dim strConnection As String
Dim strMessage As String
Dim strSchema As String
Dim strSQL As String
Dim strTable As String
As I said, the code assumes that both Schema.ini and Cats.txt
exist in the same folder as the Word document, so it's a good idea to check
that both assumptions are true:
strSchema = ActiveDocument.Path & "\Schema.ini"
strTable = ActiveDocument.Path & "\Cats.txt"
If Len(Dir(strSchema)) = 0 Then
strMessage = strMessage & _
strSchema & " does not exist." & vbCrLf
End If
If Len(Dir(strTable)) = 0 Then
strMessage = strMessage & _
strTable & " does not exist." & vbCrLf
End If
If Len(strMessage) > 0 Then
MsgBox strMessage & _
"Cannot proceed with the demonstration.", _
vbOK + vbCritical
Else
Since I want to be able to refer to the table I'm going to
create, it's necessary to know how many tables are already in the document. Once
I've done that, I instantiate the ADO Connection object and open it, then
instantiate the ADO Recordset and open it:
lngTableNumber = ActiveDocument.Tables.Count
Set objConnection = _
CreateObject("ADODB.Connection")
strConnection = _
"Driver={Microsoft Text Driver " & _
(*.txt; *.csv)};" & _
"Dbq=" & ActiveDocument.Path & ";" & _
"Extensions=asc,csv,tab,txt"
objConnection.Open strConnection
Set objRecordset = _
CreateObject("ADODB.Recordset")
strSQL = "SELECT * FROM Cats.txt"
objRecordset.Open strSQL, objConnection, _
adOpenStatic, adLockOptimistic, adCmdText
Now I simply work with the data in the recordset. First, in
order to know how many rows to add for the table, I need to get the number of
rows in the recordset. To ensure that the RecordCount property returns the
correct value, it's necessary to move to the end of the recordset.
With objRecordset
If .BOF = False And .EOF = False Then
.MoveLast
.MoveFirst
I add the new table:
ActiveDocument.Tables.Add _
Range:=Selection.Range, _
NumRows:=.RecordCount + 1, _
NumColumns:=.Fields.Count, _
DefaultTableBehavior:= _
wdWord9TableBehavior, _
AutoFitBehavior:=wdAutoFitContent
lngTableNumber = lngTableNumber + 1
lngRow = 1
I set the column headings to the names of the fields. (Note
that I'm being lazy and simply setting the font to Bold. Realistically, you
should be applying a style, not fiddling with the Font properties).
For lngColumn = 1 To .Fields.Count
ActiveDocument.Tables(lngTableNumber). _
Cell(Row:=lngRow, Column:=lngColumn). _
Range.Text = .Fields(lngColumn - 1).Name
ActiveDocument.Tables(lngTableNumber). _
Cell(Row:=lngRow, Column:=lngColumn). _
Range.Font.Bold = True
Next lngColumn
Finally, I loop through the recordset, putting values into
every cell of the table:
Do Until objRecordset.EOF
lngRow = lngRow + 1
For lngColumn = 1 To (.Fields.Count)
If IsNull(.Fields(lngColumn).Value) _
= False Then
ActiveDocument.Tables(lngTableNumber). _
Cell(Row:=lngRow, Column:=lngColumn). _
Range.Text = _
.Fields(lngColumn - 1).Value
End If
Next lngColumn
.MoveNext
Loop
Just a little clean-up, and I'm done:
End If
End With
End If
objRecordset.Close
Set objRecordset = Nothing
objConnection.Close
Set objConnection = Nothing
Conclusion
We've looked at several different ways in which it's possible
to interact with text files using SQL techniques that are familiar to us as
database programmers. It's really unfortunate that it's not possible to update
tables as well, but unfortunately that's a limitation we have to live with.
»
See All Articles by Columnist Doug Steele