rupertb
2nd October 2002, 15:53
I need conceptual help regarding reading from a file (TIME.HIS) into a new baan table. I'm using seq.open, seq.eof, seq.gets and string.scan to read and format the data in my session. This file (TIME.HIS) is constantly written to by baan. I don't want to lock the file at any given point for this reason. However I need to clean out those lines I've read in from the top of the file and inserted into my table. Seq.unlink won't work here. How can I delete x number of lines from an ascii file?
Rupert
hklett
2nd October 2002, 16:47
Maybe it could work if you rename the file periodically and read from the renamed file while the writing process writes to the original file.
I think the behavior of this procedure depends on the underlying operating system and on how the writing process works.
Does it write several lines to an open file, or does it open and close the file for each line it writes?
The problem is that a sequential file isn't a database.
günther
2nd October 2002, 16:49
First of all: You can only "delete" lines from the beginning if you overwrite them. That would be possible, but I wouldn't do that.
As you mention the file name TIME.HIS, I guess you mean the file in $BSE/lib? This file has "records" (lines) and "fields" separated by the pipe character. So you can easily read that file from the beginning and parse each line into separate fields. Before you insert the fields into your (database) table, you have to check whether a matching record already exists; if it does, you have already processed that line.
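A rough shell sketch of that "skip what is already processed" check, in case it helps. It does not use the database at all: it keeps a hypothetical side file (imported.txt) of lines handled by earlier runs and filters them out with grep. The three-field layout and the field names user/session/stamp are assumptions for illustration only, not the real TIME.HIS layout. Note that truly identical lines are treated as one record.

```shell
#!/bin/sh
# Sketch only: skip lines that an earlier run already handled.
# Assumptions: imported.txt is a hypothetical bookkeeping file, and the
# three pipe-separated fields are made up for this example.
set -e
workdir=$(mktemp -d)
cd "$workdir"

printf 'alice|sess1|0900\nbob|sess2|0915\n' > TIME.HIS   # sample data
printf 'alice|sess1|0900\n' > imported.txt               # handled earlier

# -F: fixed strings, -x: whole-line match, -v: keep the non-matching lines.
# grep exits with status 1 when nothing is selected, hence the "|| true".
grep -Fxv -f imported.txt TIME.HIS > new.txt || true

new_count=0
while IFS='|' read -r user session stamp; do
    # Placeholder for the real insert into the database table.
    echo "insert: user=$user session=$session stamp=$stamp"
    new_count=$((new_count + 1))
done < new.txt

# Remember the new lines so the next run skips them.
cat new.txt >> imported.txt
```

The file is still read from the top on every run, so this only avoids the duplicate inserts, not the repeated reading.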
One more hint: If you rename the file first, a process that has already opened the file can continue writing to it. When that process closes the file, you will see the data written (okay, a flush would be enough). A new open from another process would create TIME.HIS as a new file.
So, I think the bshell is the process that writes TIME.HIS. If that is internally coded like seq.open("TIME.HIS", "a") / seq.puts() / seq.close(), you could try file.rename("TIME.HIS", "TIME.BAK") / wait 10 seconds / process TIME.BAK.
günther
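For what it's worth, the rename / wait / process sequence described above could look roughly like this at shell level. This is only a sketch under the same assumption as in the post, namely that the writer opens TIME.HIS in append mode for each entry and therefore creates a fresh file after the rename. The sample lines and the 1 second pause are stand-ins (the post suggests 10 seconds), and the "writer" is simulated with a plain append.

```shell
#!/bin/sh
# Sketch only: rename TIME.HIS, let the writer start a fresh file,
# then process the renamed copy at leisure.
set -e
workdir=$(mktemp -d)      # stand-in for the directory holding TIME.HIS
cd "$workdir"
printf 'line1|a\nline2|b\n' > TIME.HIS    # sample data, not the real layout

mv TIME.HIS TIME.BAK      # rename on the same filesystem, no lock taken
sleep 1                   # grace period for a writer that still has the old file open
printf 'line3|c\n' >> TIME.HIS            # simulates the writer recreating TIME.HIS

processed=0
while IFS= read -r line; do
    # Placeholder for parsing the line and inserting it into the table.
    processed=$((processed + 1))
done < TIME.BAK

rm TIME.BAK               # the processed lines are gone for good
echo "processed $processed lines"
```

Nothing is read twice with this scheme, and the live TIME.HIS restarts at 0 bytes after each rename.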
rupertb
2nd October 2002, 17:03
Thanks for the replies chaps,
Guenther, the only problem I see with copying the file is that the original TIME.HIS would continue growing, and that's what I want to avoid. I also don't want to constantly try to insert duplicate records into the database (wasted resources). I was hoping to use TIME.HIS as a buffer: baan adds lines at the end of the file and my program flushes them out from the beginning of the file.
Rupert
günther
2nd October 2002, 17:20
I know what you mean; I'm careful with my resources too. But please be careful: I did not write COPY, I wrote RENAME (or MOVE)!!!
The newly created TIME.HIS will start with 0 bytes; no growing!
Btw. With that concept, you would not have to read the file again and again.
günther
NPRao
3rd October 2002, 04:14
Rupert,
1. Can you please state why you want to store the info from the TIME.HIS file in a database?
2. Another solution is to use unix commands like - grep.
We had a situation where we wanted to remove the history specific to a user, so we used (with that user's name in place of USERNAME) -
$ grep -v "USERNAME" TIME.HIS > TIME.HIS.BAK
$ mv TIME.HIS.BAK TIME.HIS
This file is always growing big if not properly taken care of.
This way we lose only a fraction of the history and also save space.
I logged a case with BaaN Support asking for an enhancement to store this data in a table rather than in a file.
Then the queries will be executed much faster. Right now, I guess the sessions - ttaad2402m000 - Print User History, ttaad2202m000 - Delete User History take a long time for processing.
Also, there is no option to cap the file size or roll over the file as for the $BSE/log files. :(
rupertb
3rd October 2002, 09:47
Once again, thanks! We're a few months away from migrating to c4 from b2 (at last!). Unfortunately we let our users' requests for enhancements get the better of us. My intention is to parse the TIME.HIS information to find those sessions (enhancements) that have never been used, the ones that seemed like a GREAT idea at the time. Then I'll be able to motivate deleting those and limit my workload over the next 6 months. Unfortunately the standard print user history is too slow :( But thanks NPRao, what I'll do is read in, say, 1000 lines from TIME.HIS, insert them into my table, and then execute a shell script calling something like 'sed' to remove those same 1000 lines. I'll put this in a job. Then I can play with the timing of the job to keep TIME.HIS at a more or less fixed file size. And of course I can do my analysis of the user history from the suitably indexed table.
Guenther, I was re-reading your suggestions as well. That can work too; in fact that might be the option I'll use. I need to ensure that TIME.HIS is never locked, irrespective of the command I use, be it 'mv' or 'sed'.
Thanks :D
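The sed idea above might look something like the sketch below. The batch size of 1000 comes from the post; the sample data and the file names batch.txt and TIME.HIS.NEW are made up for the example. Neither sed nor mv locks the file, but be aware that any line appended between the second sed call and the mv would be lost, so this is only safe if the job runs when nothing is writing.

```shell
#!/bin/sh
# Sketch only: take the first 1000 lines as a batch, then drop them.
# batch.txt and TIME.HIS.NEW are hypothetical working file names.
set -e
workdir=$(mktemp -d)
cd "$workdir"
# Sample data: 2500 numbered lines standing in for the real history.
awk 'BEGIN { for (i = 1; i <= 2500; i++) print "entry" i }' > TIME.HIS

batch=1000
sed -n "1,${batch}p" TIME.HIS > batch.txt      # the lines to import
# ... import batch.txt into the table here ...
sed "1,${batch}d" TIME.HIS > TIME.HIS.NEW      # everything after the batch
mv TIME.HIS.NEW TIME.HIS                       # replace the original
```

Writing to a second file and renaming it is used here instead of an in-place option, since in-place editing is not available in every sed.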
Han Brinkman
3rd October 2002, 13:52
At least on Unix it's possible to use an option of ls to find out the last time a file was accessed. Wouldn't that help to find out which customizations haven't been used for a long time?
Rgrds,
Han
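A small sketch of the access-time idea, for illustration. The ls option in question is -u, which shows the last access time instead of the modification time, and find -atime can list files not accessed for a given number of days. The file names script_a and script_b and the 180 day threshold are hypothetical. Keep in mind that access times are only reliable if the filesystem is not mounted with an option such as noatime.

```shell
#!/bin/sh
# Sketch only: spot files that have not been accessed for a long time.
# script_a and script_b are hypothetical stand-ins for customized objects.
set -e
workdir=$(mktemp -d)
cd "$workdir"
touch script_a script_b
touch -a -t 200001010000 script_a   # pretend script_a was last read in 2000

ls -lu                              # -u: list with last access time

# Files whose last access is more than 180 days ago.
stale=$(find . -type f -atime +180)
echo "not accessed recently: $stale"
```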
NPRao
3rd October 2002, 16:48
Rupert,
Thinking it over, I got another idea using a shell script.
Here is the basic logic; I guess you can put this in a function and call it in a loop until the file is empty.
# First 1000 lines go to a work file for parsing
head -n 1000 TIME.HIS > TIME.HIS.TEMP
# Everything from line 1001 onward becomes the new TIME.HIS
tail -n +1001 TIME.HIS > TIME.HIS.BAK
mv TIME.HIS.BAK TIME.HIS
So for every 1000 lines you parse the TIME.HIS.TEMP file and put the data into the table.
Hope it helps you. :p
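The "function called in a loop" idea above could be sketched as follows. The function name take_batch, the counters, and the sample data are made up for the example, and the import step is only a placeholder. As with any head/tail/mv approach, lines appended while a batch is being cut off can be lost, so this assumes nothing writes to TIME.HIS during the run.

```shell
#!/bin/sh
# Sketch only: drain TIME.HIS in batches of 1000 lines.
# take_batch, imported and batches are hypothetical names.
set -e
workdir=$(mktemp -d)
cd "$workdir"
# Sample data: 2500 numbered lines standing in for the real history.
awk 'BEGIN { for (i = 1; i <= 2500; i++) print "entry" i }' > TIME.HIS

batch=1000
imported=0
batches=0

# Move the first $batch lines of TIME.HIS into TIME.HIS.TEMP.
take_batch() {
    head -n "$batch" TIME.HIS > TIME.HIS.TEMP
    tail -n "+$((batch + 1))" TIME.HIS > TIME.HIS.BAK
    mv TIME.HIS.BAK TIME.HIS
}

# -s is true while the file exists and is not empty.
while [ -s TIME.HIS ]; do
    take_batch
    # Placeholder: parse TIME.HIS.TEMP and insert the data into the table.
    imported=$((imported + $(wc -l < TIME.HIS.TEMP)))
    batches=$((batches + 1))
done
rm -f TIME.HIS.TEMP
echo "imported $imported lines in $batches batches"
```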