
Best ways for reading and editing text files

Author
19 Mar 2006 11:52 PM
Jim Carlock
I have some Apache log files, around 500KB in size, that I need
to delete specific lines from and then save as another file with a
different name.

Before I embark on this, I'd like to get some thoughts on how to
handle this.

Private Function GetTextFromFile(sFile As String) As String
  Dim iFile As Long
  iFile = FreeFile
  Open sFile For Input As #iFile
  ' Read the whole file into a string in one call
  GetTextFromFile = Input$(LOF(iFile), iFile)
  Close #iFile
End Function

And I've not quite decided how to delete the lines from the text
file. Basically, there will be an IP address at the start of each
line that I'll be searching for, and then I'll delete the whole line.

If anyone has a link to where this might have been discussed in the
past, that'd be great. I just started looking and it's definitely a
worthwhile topic.

Jim Carlock
Post replies to the group.

Author
20 Mar 2006 2:33 AM
Larry Serflaten
"Jim Carlock" <anonymous@localhost> wrote
> I have some Apache log files, around 500KB in size, that I need
> to delete specific lines from and then save as another file with a
> different name.
>
> Before I embark on this, I'd like to get some thoughts on how to
> handle this.
>
> Private Function GetTextFromFile(sFile As String) As String
>   Dim iFile As Long
>   iFile = FreeFile
>   Open sFile For Input As #iFile
>   ' Read the whole file into a string in one call
>   GetTextFromFile = Input$(LOF(iFile), iFile)
>   Close #iFile
> End Function
>
> And I've not quite decided how to delete the lines from the text
> file. Basically, there will be an IP address at the start of each
> line that I'll be searching for, and then I'll delete the whole line.
>


If you are only doing a handful of files, then the straightforward
method you have started with would be easy to implement.  Once
you have the file in a string, you search for the IP and the following
EOL characters using InStr(), and delete the line by copying everything
after the EOL to where you found the IP, using Mid().  Then, after doing
all the deletions, you could write the new file to disk.
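
A minimal sketch of the string approach described above. The function
name is hypothetical, and two things are assumptions: that each log
line starts with the IP followed by a space, and that lines end in
vbLf or vbCrLf. For brevity it rebuilds the string with Left$/Mid$
rather than copying in place with the Mid statement, which is simpler
but does more copying on large files.

```vb
' Sketch: remove every line that starts with sIP from sText.
' RemoveLinesStartingWith is a hypothetical name, not an existing API.
Private Function RemoveLinesStartingWith(ByVal sText As String, _
                                         ByVal sIP As String) As String
  Dim sKey As String
  Dim lPos As Long
  Dim lEOL As Long

  ' Assumption: the IP is followed by a space, so that
  ' "10.0.0.1" does not also match "10.0.0.12".
  sKey = vbLf & sIP & " "

  ' Prefix a line feed so a match on the first line looks like any other.
  sText = vbLf & sText

  lPos = InStr(1, sText, sKey, vbBinaryCompare)
  Do While lPos > 0
    ' lPos is the line feed just before the matching line.
    ' Find the line feed that ends the matching line.
    lEOL = InStr(lPos + 1, sText, vbLf, vbBinaryCompare)
    If lEOL = 0 Then
      ' Last line has no trailing EOL: drop everything after lPos.
      sText = Left$(sText, lPos)
    Else
      ' Keep text up to and including lPos, then skip past lEOL.
      sText = Left$(sText, lPos) & Mid$(sText, lEOL + 1)
    End If
    ' Search again from the same spot; the next line may match too.
    lPos = InStr(lPos, sText, sKey, vbBinaryCompare)
  Loop

  ' Strip the line feed added at the start.
  RemoveLinesStartingWith = Mid$(sText, 2)
End Function
```

Usage would be along the lines of
sNew = RemoveLinesStartingWith(GetTextFromFile(sFile), "192.168.0.1")
followed by writing sNew to the new file.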

Do recognise that by reading the file into a string you will convert
the data from the file's ANSI to a string's Unicode (for typical .txt files).
Using a Byte array would avoid that conversion.

If you use a byte array, instead of moving data around for deletions,
you might just keep track of the positions of the data you want to delete
(using a second array) and skip through the file's array writing only the
portions you want saved into the new file.

EX:
Private Sub Form_Load()
  Dim data() As Byte
  Dim find() As Byte
  Dim pos As Long

  ' Demo file data (Opened for Binary reads)
  ' Get #file, , data
  data = StrConv("This is file data in an array", vbFromUnicode)

  ' Find string
  find = StrConv("file", vbFromUnicode)
  pos = InStrB(data, find)

  ' Pick and choose what to save
  ' Put #file, , MidB(data, pos, 9)
  Debug.Print StrConv(MidB(data, pos, 9), vbUnicode)

End Sub

HTH
LFS

Author
20 Mar 2006 3:58 AM
Jim Carlock
Hey Larry,

How do I get the length of b() (size of a byte array)?

Private Sub ProcessFile()
  Dim b() As Byte
  b = GetTextFromFile(cdg.FileName)
  Call MsgBox(CStr(Len(b)), vbOKOnly, "Length of b()")
End Sub

Private Function GetTextFromFile(sFile As String) As Byte()
  Dim iFile As Long
  Dim b() As Byte
  iFile = FreeFile
  'Open sFile For Input As #iFile
  Open sFile For Binary As #iFile
  'GetTextFromFile = Input$(LOF(iFile), iFile)
  GetTextFromFile = InputB(LOF(iFile), #iFile)
  Close #iFile
End Function

Is UBound() the right way to go with this? Both Len() and LenB()
seem to want a string and error out with a strange message:

Compile error:
Variable required - can't assign to this expression

Thanks much.

Jim Carlock
Post replies to the group.

Author
20 Mar 2006 8:30 AM
J French
On Sun, 19 Mar 2006 22:58:35 -0500, "Jim Carlock"
<anonymous@localhost> wrote:

>Hey Larry,
>
>How do I get the length of b() (size of a byte array)?
>
>Private Sub ProcessFile()
>  Dim b() As Byte
>  b = GetTextFromFile(cdg.FileName)
>  Call MsgBox(CStr(Len(b)), vbOKOnly, "Length of b()")
>End Sub

I would open the file in Binary mode:

ReDim B(1 To FileLen(TheFile$))

Get #Fle, 1, B()

However, I'm not entirely happy with this solution. It is OK if the
files are 500KB long, but if they really grow you'll run into
problems.
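
For reference, the binary read above could be fleshed out roughly as
follows. This is only a sketch: ReadFileBytes is a hypothetical name,
and the early exit for an empty file is an added assumption (ReDim
with bounds 1 To 0 would raise a run-time error).

```vb
' Sketch: read a whole file into a 1-based Byte array.
' ReadFileBytes is a hypothetical helper name.
Private Function ReadFileBytes(ByVal sFile As String) As Byte()
  Dim iFile As Integer
  Dim b() As Byte
  Dim lSize As Long

  lSize = FileLen(sFile)
  ' Empty file: return an unallocated array rather than fail on ReDim.
  If lSize = 0 Then Exit Function

  ReDim b(1 To lSize)

  iFile = FreeFile
  Open sFile For Binary Access Read As #iFile
  ' Read from byte 1; Get fills the array to its dimensioned size.
  Get #iFile, 1, b
  Close #iFile

  ReadFileBytes = b
End Function
```

No ANSI to Unicode conversion takes place here, since the bytes are
never passed through a String.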

Also the 'logic' for removing/extracting lines might be simpler if you
tackle each line one at a time
- the real hit is in the disk access
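
A rough sketch of the line-at-a-time idea, with hypothetical names
throughout. It assumes the log uses CR or CRLF line endings, because
Line Input only treats those as line terminators; a log with bare LF
endings would come back as one long line and would need splitting on
vbLf instead. It also assumes the IP at the start of each line is
followed by a space.

```vb
' Sketch: copy sSource to sDest, skipping lines that start with sIP.
' FilterLogFile is a hypothetical name.
Private Sub FilterLogFile(ByVal sSource As String, _
                          ByVal sDest As String, _
                          ByVal sIP As String)
  Dim iIn As Integer
  Dim iOut As Integer
  Dim sLine As String
  Dim sKey As String

  ' Assumption: the IP is followed by a space in each log line.
  sKey = sIP & " "

  iIn = FreeFile
  Open sSource For Input As #iIn
  ' Ask for the second file number only after the first file is open.
  iOut = FreeFile
  Open sDest For Output As #iOut

  Do While Not EOF(iIn)
    Line Input #iIn, sLine
    ' Keep the line unless it begins with the IP.
    If Left$(sLine, Len(sKey)) <> sKey Then
      Print #iOut, sLine
    End If
  Loop

  Close #iOut
  Close #iIn
End Sub
```

Only one line is held in memory at a time, so the file size stops
being a concern. Print # writes each kept line with a CRLF ending.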

If you search Google Groups for :-

   "J French" cReadFileStream

You'll find something I've posted a number of times

Personally, for the sake of simplicity I would live with the ANSI <->
Unicode conversion
- it is an overhead, and might bite you in Far Eastern locales
- but the alternative is quite convoluted if you are doing anything
remotely complex

Author
20 Mar 2006 1:39 PM
Larry Serflaten
"Jim Carlock" <anonymous@localhost> wrote

> How do I get the length of b() (size of a byte array)?

<snipped for brevity>

> Is UBound() the right way to go with this? Both Len() and LenB()
> seem to want a string and error out with a strange message:

Yes, Use both UBound and LBound to determine the size of an
'unknown' array.

size = UBound(b) - LBound(b) + 1
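
Wrapped up as a small helper, that calculation might look like the
sketch below. ByteCount is a hypothetical name, and the error handler
is an addition: UBound raises run-time error 9 on a dynamic array
that has never been dimensioned, so the sketch reports 0 in that case.

```vb
' Sketch: number of elements in a Byte array, or 0 if unallocated.
' ByteCount is a hypothetical helper name.
Private Function ByteCount(b() As Byte) As Long
  ' UBound fails with error 9 on an array that was never dimensioned.
  On Error GoTo NotAllocated
  ByteCount = UBound(b) - LBound(b) + 1
  Exit Function

NotAllocated:
  ByteCount = 0
End Function
```

In ProcessFile this would replace the Len(b) call, for example
Call MsgBox(CStr(ByteCount(b)), vbOKOnly, "Length of b()").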


But, as I said at first, and as J French also advises, if you are
only doing a few files, then go with what you had (using strings).
You'll hardly notice a difference between the two for a small number
of files, so long as you read the entire file then do your work on it in
memory, and write it out after that.

As J French also indicated, loading the entire file into memory would
not be the best approach if the files got significantly larger (megabyte
sizes), but you indicated that the files would be around 500K in size,
something that should easily fit in memory.

LFS