Storing Methods With Perl

By Alex Osipov
LAST EDITED: Tuesday, August 6, 2002 0:13 AM


Section 2 - Flat Files:

Programmers often use flat files when storing small amounts of data, such as short-lived caching information. For example, on one project I needed to store the unique IP address of each visitor and the time the entry occurred. I used flat files for this task because it was not very data intensive, and the information was cleared every 15 minutes.
When doing something like this, you can take two different approaches. You can create a separate file for each visitor, something that I like to call flat-files (this is what I did, as I needed to store extra information), or you can use a single file for all entries.
When creating many different files, you need to ensure that each file has a unique filename; otherwise files will start to overwrite one another after some time. You can use the Digest::SHA1 module to generate a 160-bit signature from random data (only in incredibly rare cases will two signatures be the same), although there are a number of different ways to do this. Once you generate the unique name you can start to create the flat file.
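
As a rough sketch of the name generation mentioned above, the snippet below hashes a few values that are unlikely to repeat together. It assumes Digest::SHA, the module bundled with current Perl releases, which exports the same sha1_hex function as Digest::SHA1. The make_unique_name helper, the seed ingredients, and the .cache suffix are illustrative choices, not part of the original project.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

# Hypothetical helper: build a 40-character hex name from data that is
# unlikely to repeat (time, process id, a random number, a counter).
my $counter = 0;
sub make_unique_name {
    my $seed = join('|', time, $$, rand(), $counter++);
    return sha1_hex($seed);
}

# The directory and suffix are assumptions made for this example.
my $unique_filename = './' . make_unique_name() . '.cache';
print "$unique_filename\n";
```

The counter is there so that two calls within the same second and the same process still hash different input.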

# Load the LOCK_EX constant used by flock.
use Fcntl qw(:flock);
# Open file for write only or die.
open(my $fh, '>', $unique_filename) or die("Error: $!");
# Lock the file exclusively or die.
flock($fh, LOCK_EX) or die("Error: $!");
# Save the remote ip address, a null, and then the time.
print $fh $ENV{REMOTE_ADDR}, "\0", time;
# Close the file and release lock or die.
close($fh) or die("Error: $!");

That takes care of saving the data in flat-files. Retrieving data from a structure like this is very simple.

# We open the file for reading only or die.
open(my $fh, '<', $unique_filename) or die("Error: $!");
# Read the first line from open file.
my $line = <$fh>;
# Close the file or die.
close($fh) or die("Error: $!");
# Separate the data using split.
my ($remote_addr, $create_time) = split(/\0/, $line);

In this example, $ENV{REMOTE_ADDR} and the time since the epoch are saved in the $unique_filename file and read back from it. Be careful to watch for security risks when using a variable in an open (for more information read the perlsec man page or view it online at http://www.perl.com/pub/doc/manual/html/pod/perlsec.html). Using the same fundamental ideas you can create much more complex data structures within flat-files.
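
To make the security warning above a little more concrete, here is a minimal sketch of checking a filename before it ever reaches open. It assumes the names are 40-character SHA-1 hex digests. The safe_cache_name helper and the sample inputs are hypothetical and only illustrate the idea of accepting a known-good pattern rather than trying to filter out bad characters.

```perl
use strict;
use warnings;

# Hypothetical check: accept only names made of 40 lowercase hex digits,
# which is the shape of a SHA-1 hex digest. Anything else returns undef.
sub safe_cache_name {
    my ($name) = @_;
    return $name =~ /\A([0-9a-f]{40})\z/ ? $1 : undef;
}

# Try one well-formed name and two that should never be opened.
for my $candidate ('a' x 40, '../etc/passwd', 'ls |') {
    my $ok = safe_cache_name($candidate);
    print defined $ok ? "accepted: $ok\n" : "rejected: $candidate\n";
}
```

Passing the mode as a separate argument, as in open($fh, '<', $name), also helps, because characters such as > or | in the name are then treated as part of the filename instead of as instructions to open.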
As I mentioned earlier, the other way of using flat files is to create one larger file for all entries. Retrieving data from this kind of flat file database gets slower as the data grows, so only use it if it offers a real benefit to your programs. You've been warned! The basic ideas for using this type of flat file database are virtually the same as for flat-files.

Rather than opening the file for writing as we did in the flat-files example, we have to open the file for appending, because overwriting the file would destroy the earlier entries. We must also separate each entry with a delimiter (I will use the newline character), and we no longer need $unique_filename in open because the filename is static.

# Load the LOCK_EX constant used by flock.
use Fcntl qw(:flock);
# Open file for append or die.
open(my $fh, '>>', './cache.db') or die("Error: $!");
# Lock the file exclusively or die.
flock($fh, LOCK_EX) or die("Error: $!");
# Save the unique id, a null, remote ip address, a null, and then the time since epoch.
print $fh $unique_id, "\0", $ENV{REMOTE_ADDR}, "\0", time, "\n";
# Close the file and release lock or die.
close($fh) or die("Error: $!");
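
The record layout used here (fields separated by a null, entries separated by a newline) only works if no field contains either delimiter. The following sketch wraps the layout in a pair of helpers that enforce that rule. The names encode_entry and decode_entry and the sample values are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: join fields with nulls and end the entry with a
# newline, refusing any field that contains one of the two delimiters.
sub encode_entry {
    my (@fields) = @_;
    for my $field (@fields) {
        die("Error: field contains a delimiter") if $field =~ /[\0\n]/;
    }
    return join("\0", @fields) . "\n";
}

# Hypothetical helper: reverse encode_entry. The -1 limit keeps empty
# trailing fields instead of silently dropping them.
sub decode_entry {
    my ($line) = @_;
    chomp($line);
    return split(/\0/, $line, -1);
}

my $record = encode_entry('abc123', '192.0.2.7', 1028592780);
my ($unique_id, $remote_addr, $create_time) = decode_entry($record);
print "$unique_id $remote_addr $create_time\n";
```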

To retrieve data from the file we still need a unique value like the one used for $unique_filename, because the program needs something to search for in order to pick out a certain entry. You could use the remote ip address or the time, but I personally prefer a unique id for each visitor (which I save as a cookie and retrieve any time a script is run by the user).

Once you know the unique id of the entry you want to retrieve from the flat file database, you can do the following.

# Declare the fields we want to fill in, plus a flag for the search.
my ($unique_id, $remote_addr, $create_time);
my $found = 0;
# Open the file for read only or die.
open(my $fh, '<', './cache.db') or die("Error: $!");
# Loop through each entry in the flat file and look for the one we need.
while (my $line = <$fh>) {
    # Remove the newline character at the end of the line.
    chomp($line);
    # Separate the data on line using split.
    ($unique_id, $remote_addr, $create_time) = split(/\0/, $line);
    # Check if the unique id that we saved earlier matches the one
    # that we are looking for this time, where $our_id is the id that
    # we are looking for. If the two ids match, we break out of the loop.
    if ($unique_id eq $our_id) {
        $found = 1;
        last;
    }
}
# Close the file or die.
close($fh) or die("Error: $!");
unless ($found) {
    die("Error: Could not find entry $our_id in the flat file database.");
}

In this example $unique_id, $remote_addr, and $create_time are filled in from the first line of cache.db whose id matches $our_id; if no line matches, the program dies. You can adapt this for your own programs with minimal effort. Let me mention this again: this can be very inefficient when dealing with large amounts of data, as the program must loop through every line until the entry is found. Another limitation of this small example is that it stops at the first matching entry in the cache.db file. That is what most people want, but if you want to retrieve all entries, or the most recent one, a little more work is required. (There are different ways of sorting and matching data which can speed this process up significantly.)
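
As one possible take on the "little more work" mentioned above, the sketch below collects every entry for an id and then picks the most recent one. It builds its own sample file with File::Temp so it can run on its own. The find_entries helper, the sample ids, and the sample addresses are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Build a small sample file in the same "id\0ip\0time\n" layout.
my ($out, $db_file) = tempfile(UNLINK => 1);
for my $row (['abc', '10.0.0.1', 100], ['xyz', '10.0.0.2', 150], ['abc', '10.0.0.3', 200]) {
    print $out join("\0", @$row), "\n";
}
close($out) or die("Error: $!");

# Hypothetical helper: return every entry for $our_id, oldest first.
sub find_entries {
    my ($file, $our_id) = @_;
    my @matches;
    open(my $fh, '<', $file) or die("Error: $!");
    while (my $line = <$fh>) {
        chomp($line);
        my ($unique_id, $remote_addr, $create_time) = split(/\0/, $line);
        next unless defined $unique_id && $unique_id eq $our_id;
        push @matches, { ip => $remote_addr, time => $create_time };
    }
    close($fh) or die("Error: $!");
    my @sorted = sort { $a->{time} <=> $b->{time} } @matches;
    return @sorted;
}

my @entries = find_entries($db_file, 'abc');
my $latest  = $entries[-1];
print scalar(@entries), " entries, latest from $latest->{ip}\n";
# prints "2 entries, latest from 10.0.0.3"
```

This still reads the whole file on every lookup, so it does nothing for speed; it only changes which entries are returned.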
I will cover some other ways of storing data in flat files, as well as other methods of storing data, in the following pages.


Section 3: SQL »

Last updated: Monday, November 27, 2006 - 11:02 AM Eastern Daylight Time