Perl scripts
These are Perl scripts that I find useful when trying to import text files
into Access, or discover why they won't import. The links are directly to the
.pl files: right-click and download to your computer. Perl itself can be
downloaded free from www.activeperl.com.
- CountDelims.pl
- This script reports on the number of commas, tabs or other delimiters in
each row of a text file. It's useful for identifying which rows won't import
and why.
- CSVTab.pl
- This script converts a CSV file with "..." text delimiters (otherwise
known as a comma-delimited file with "..." text qualifiers) to a
tab-delimited file without delimiters.
- FillInCSVNulls.pl
- Takes a csv file in which fields may contain null (empty) values. For
each null
value, substitutes the last non-null value in that field. E.g.
ABC,FGD,123,TEST
,,124,TEST
,,125,TEST2
FGD,HJY,123,DATA
HJY,YUI,458,DATA2
,,852,DATA
converts to
ABC,FGD,123,TEST
ABC,FGD,124,TEST
ABC,FGD,125,TEST2
FGD,HJY,123,DATA
HJY,YUI,458,DATA2
HJY,YUI,852,DATA
- Interleave.pl
- This interleaves alternate lines from two text files (line 1 from file
1, line 1 from file 2, line 2 from file 1 and so on). Easily modified to
control what happens when one file is shorter than the other.
- TaggedLineToDelim.pl
- This script takes a text file like this, with each line containing a
field name and value:
FieldName1: Value
FieldName2: Value
...
FieldNameN: Value
blank line(s) between records-
FieldName1: Value
...
and converts it into a standard CSV file with field names in the first
row. You can specify which character(s) separate the field names from the
values.
- Blank lines are treated as record separators. Provided each non-blank line
fits the “field name: value” pattern, it doesn't matter if fields
within a record are out of order or missing. Commas and quotes in the field
values are correctly handled (at least in Windows terms).
- Prn-Tab.pl
- This script takes a text file like this, with each record containing the
same fields in the same order:
FieldName1: Value
FieldName2: Value
...
FieldNameN: Value (no blank line needed
between records)
FieldName1: Value
...
FieldNameN: Value
...
and converts it into a tab-delimited file with field names in the first
line.
Blank lines between fields or records are ignored. Field name and value must
be separated by a colon followed by white space (modify the script to use a
different separator). The field values may contain
colons but must not contain newlines. Fields may be empty, but every record
must contain the same fields in the same order (by contrast TaggedLineToDelim.pl
can handle missing and out-of-order fields, but requires a blank line
between records). .
- TopAndTail.pl
- This script removes a specified number of lines from beginning and/or
end of a text file.
- txtnrm.pl
- This is more of a program than a script, designed to overcome Access's
inability to import a text file with more than 255 fields. It transforms a
"wide" text file - with hundreds or thousands of fields, fixed width or
delimited - into a tall, narrow one with just three fields:
- KEY
- This is either a field that uniquely identifies each row in the
original file, or an automatically-generated line number.
- FIELD
- This identifies the field in the original file
- VALUE
- This of course is the value of that field in that record in the
original file.
-
You can then link an Access table to this new file, and use append
queries to import the data into properly normalised Access tables.