
Best way to extract a word from a sentence

Author
2 Feb 2006 5:56 AM
michael
Hi

I have been using UniBasic (aka DataBasic/PickBasic) for years. If I
want to extract the first word from a sentence I would code :-

WORD=FIELD(SENTENCE,' ',1)

In the above example, SENTENCE is a string. WORD is also a string.

In VB, it seems, I could use the Split() function instead (or a
combination of instr and mid). What this - Split() function - does,
though, is take my string and create an array out of it. That's very
nice if that's what you want to do. I don't. I want a string to be
extracted from a string. I would have thought it quite common for
people to want to do this.

Btw, in UniBasic, if I want the third thru fifth words out of a
sentence I could code :-

PHRASE=FIELD(SENTENCE,' ',3,3)

This, it seems to me, is the sort of thing I would expect to find in
any half-decent programming language. Although I haven't been using VB
for long, I have come to regard it as something considerably less than
half-decent.

Am I missing something?

Mike.

Author
2 Feb 2006 6:55 AM
Dmitriy Antonov
<mich***@preece.net> wrote in message
news:1138859813.927754.72010@g14g2000cwa.googlegroups.com...
> Hi
>
> I have been using UniBasic (aka DataBasic/PickBasic) for years. If I
> want to extract the first word from a sentence I would code :-
>
> WORD=FIELD(SENTENCE,' ',1)
>
> In the above example, SENTENCE is a string. WORD is also a string.
>
> In VB, it seems, I could use the Split() function instead (or a
> combination of instr and mid). What this - Split() function - does,
> though, is take my string and create an array out of it. That's very
> nice if that's what you want to do. I don't. I want a string to be
> extracted from a string. I would have thought it quite common for
> people to want to do this.
>
> Btw, in UniBasic, if I want the third thru fifth words out of a
> sentence I could code :-
>
> PHRASE=FIELD(SENTENCE,' ',3,3)
>
> This, it seems to me, is the sort of thing I would expect to find in
> any half-decent programming language. Although I haven't been using VB
> for long, I have come to regard it as something considerably less than
> half-decent.
>
> Am I missing something?
>
> Mike.
>

I wouldn't normally write it this way, but you can do it like this:

The first word
WORD = Split(SENTENCE, " ")(0)

The third word
WORD = Split(SENTENCE, " ")(2)

I don't think it is harder or less convenient than in your example.

While VB doesn't have a function that does it in exactly the same way
(unless I missed it too), that doesn't make it less decent, and, conversely,
the existence of such a function wouldn't make it more decent either.
What you need is to obtain part of a string partitioned by a particular
delimiter. But what if I want to use multiple delimiters? Another person
might need to divide text using all upper-case letters as delimiters. There
can be an endless number of different requirements, and there is no language
that has direct means to cover all possibilities.

Dmitriy.

Author
2 Feb 2006 7:23 AM
R. MacDonald
Hello, Mike,

Any time you change languages, you will find yourself wishing for the
"good old days" when you could use the techniques you already knew.
It's called inertia, I think  ;-)

Anyway, for a single word, it's just as easy as your old UniBasic.  For
example, using:

     Split("Try this on for size.")(1)

will return the string "this".  (The array is zero based.)
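
If the zero-based index is a nuisance, a small wrapper can hide it. The following is only a sketch: the name WordAt is made up for illustration, and it assumes single spaces between words.

```vb
' Hypothetical helper (not a built-in): 1-based word lookup built on Split,
' roughly mirroring FIELD(SENTENCE,' ',N) for a single word.
Function WordAt(ByVal Sentence As String, ByVal N As Long) As String
    Dim Words() As String

    Words = Split(Sentence, " ")
    ' Split is zero-based, so word N lives at index N - 1.
    ' Out-of-range requests fall through and return an empty string.
    If N >= 1 And N - 1 <= UBound(Words) Then WordAt = Words(N - 1)
End Function
```

With this in place, WordAt("Try this on for size.", 2) gives "this", and asking for a word past the end gives an empty string rather than a subscript error.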

For your phrase problem, well, I can't think of a single function
equivalent, but it should be trivial for you to create one by iterating
the index of the resulting array.   Something like:

   Function Phrase(ByVal Sentence As String, ByVal Start As Integer, _
                   ByVal Count As Integer) As String

       Dim intWord As Integer
       Dim varParsed As Variant
       Dim strPhrase As String

       varParsed = Split(Sentence, , Start)
       If (Start - 1 > UBound(varParsed)) Then Exit Function
       varParsed = varParsed(Start - 1)
       varParsed = Split(varParsed, , Count + 1)
       If (Count > UBound(varParsed)) Then Count = UBound(varParsed) + 1

       For intWord = 0 To Count - 1
           strPhrase = strPhrase & varParsed(intWord) & " "
       Next intWord
       Phrase = Trim$(strPhrase)

   End Function

Cheers,
Randy


mich***@preece.net wrote:
> Hi
>
> I have been using UniBasic (aka DataBasic/PickBasic) for years. If I
> want to extract the first word from a sentence I would code :-
>
> WORD=FIELD(SENTENCE,' ',1)
>
> In the above example, SENTENCE is a string. WORD is also a string.
>
> In VB, it seems, I could use the Split() function instead (or a
> combination of instr and mid). What this - Split() function - does,
> though, is take my string and create an array out of it. That's very
> nice if that's what you want to do. I don't. I want a string to be
> extracted from a string. I would have thought it quite common for
> people to want to do this.
>
> Btw, in UniBasic, if I want the third thru fifth words out of a
> sentence I could code :-
>
> PHRASE=FIELD(SENTENCE,' ',3,3)
>
> This, it seems to me, is the sort of thing I would expect to find in
> any half-decent programming language. Although I haven't been using VB
> for long, I have come to regard it as something considerably less than
> half-decent.
>
> Am I missing something?
>
> Mike.
>
Author
2 Feb 2006 8:39 AM
Rick Rothstein [MVP - Visual Basic]
> For your phrase problem, well, I can't think of a single function
> equivalent, but it should be trivial for you to create one by iterating
> the index of the resulting array.   Something like:
>
>    Function Phrase(ByVal Sentence As String, ByVal Start As Integer, _
>                    ByVal Count As Integer) As String
>
>        Dim intWord As Integer
>        Dim varParsed As Variant
>        Dim strPhrase As String
>
>        varParsed = Split(Sentence, , Start)
>        If (Start - 1 > UBound(varParsed)) Then Exit Function
>        varParsed = varParsed(Start - 1)
>        varParsed = Split(varParsed, , Count + 1)
>        If (Count > UBound(varParsed)) Then Count = UBound(varParsed) + 1
>
>        For intWord = 0 To Count - 1
>            strPhrase = strPhrase & varParsed(intWord) & " "
>        Next intWord
>        Phrase = Trim$(strPhrase)
>
>    End Function

As long as we are committed to using Split, we can "simplify" the above
routine and remove the loop as follows...

Function Phrase(Sentence As String, Start As Integer, _
                         Count As Integer) As String

  Dim SubString As String
  Dim Words() As String

  On Error GoTo NoPhrase
  SubString = Split(Sentence, , Start)(Start - 1)
  Words = Split(SubString, , Count + 1)
  If Count = UBound(Words) Then Words(Count) = ""
  Phrase = Trim$(Join(Words))

NoPhrase:
End Function

Rick
Author
2 Feb 2006 9:34 AM
Rick Rothstein [MVP - Visual Basic]
> As long as we are committed to using Split, we can "simplify" the above
> routine and remove the loop as follows...
>
> Function Phrase(Sentence As String, Start As Integer, _
>                          Count As Integer) As String
>
>   Dim SubString As String
>   Dim Words() As String
>
>   On Error GoTo NoPhrase
>   SubString = Split(Sentence, , Start)(Start - 1)
>   Words = Split(SubString, , Count + 1)
>   If Count = UBound(Words) Then Words(Count) = ""
>   Phrase = Trim$(Join(Words))
>
> NoPhrase:
> End Function

Of course, we can reduce this by a couple more lines by removing the
intermediate 'SubString' variable...

Function Phrase(Sentence As String, Start As Integer, _
                         Count As Integer) As String

  Dim Words() As String

  On Error GoTo NoPhrase
  Words = Split(Split(Sentence, , Start)(Start - 1), , Count + 1)
  If Count = UBound(Words) Then Words(Count) = ""
  Phrase = Trim$(Join(Words))

NoPhrase:
End Function

Rick
Author
2 Feb 2006 10:38 AM
Mike D Sutton
> I have been using UniBasic (aka DataBasic/PickBasic) for years. If I
> want to extract the first word from a sentence I would code :-
>
> WORD=FIELD(SENTENCE,' ',1)
>
> In the above example, SENTENCE is a string. WORD is also a string.
>
> In VB, it seems, I could use the Split() function instead (or a
> combination of instr and mid). What this - Split() function - does,
> though, is take my string and create an array out of it. That's very
> nice if that's what you want to do. I don't. I want a string to be
> extracted from a string. I would have thought it quite common for
> people to want to do this.
>
> Btw, in UniBasic, if I want the third thru fifth words out of a
> sentence I could code :-
>
> PHRASE=FIELD(SENTENCE,' ',3,3)
>
> This, it seems to me, is the sort of thing I would expect to find in
> any half-decent programming language. Although I haven't been using VB
> for long, I have come to regard it as something considerably less than
> half-decent.

I would hope that the Field() call is not quite as brain-dead as VB's Split() function, which allows 0-length
tokens.  Here's an alternative solution which doesn't allow 0-length tokens and should be a lot faster than the
Split() method on long strings, since it doesn't need to tokenize the entire input buffer or allocate a huge array of
strings:

'***
Private Function Field(ByRef inString As String, _
    ByRef inDelimiter As String, ByVal inStartToken As Long, _
    Optional ByVal inNumTokens As Long = 1, _
    Optional ByVal inOffset As Long = 0, _
    Optional ByVal inCompareMode As VbCompareMethod = vbBinaryCompare) As String
    Dim DelimPos As Long
    Dim LastPos As Long
    Dim TokenIdx As Long
    Dim StartPos As Long, EndPos As Long
    Dim GotTokens As Boolean

    ' Validate input parameters
    If (Len(inString) < 1) Or (Len(inDelimiter) < 1) Or _
        (inStartToken < 1) Or (inNumTokens < 1) Or _
        (inOffset < 0) Or (inOffset >= Len(inString)) Then Exit Function

    DelimPos = inOffset + 1
    LastPos = DelimPos

    Do ' Find next delimiter
        DelimPos = InStr(DelimPos, inString, inDelimiter, inCompareMode)

        If (DelimPos) Then
            If (DelimPos > LastPos) Then
                TokenIdx = TokenIdx + 1

                ' Check for start token
                If (TokenIdx = inStartToken) Then StartPos = LastPos

                ' Check for end token
                If (TokenIdx = (inStartToken + inNumTokens - 1)) Then
                    EndPos = DelimPos
                    GotTokens = True
                    Exit Do
                End If
            End If

            ' Increment search position
            DelimPos = DelimPos + Len(inDelimiter)
            LastPos = DelimPos
        End If
    Loop While DelimPos

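    ' Check for a trailing token (text after the last delimiter).
    ' It counts as one more token, and it may itself be the start token.
    If ((Not GotTokens) And (LastPos <= Len(inString))) Then
        TokenIdx = TokenIdx + 1
        If (TokenIdx = inStartToken) Then StartPos = LastPos
        GotTokens = (TokenIdx = (inStartToken + inNumTokens - 1))
        EndPos = Len(inString) + 1 ' One past the end, so Mid$ keeps the last character
    End If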

    ' Extract return string
    If (GotTokens) Then Field = Mid$(inString, StartPos, EndPos - StartPos)
End Function
'***
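
For anyone who has not run into the 0-length token behaviour, here is a quick illustration. It is a throwaway sketch (the Sub name is made up) that can be run from the Immediate window.

```vb
' Illustration only: Split keeps the empty string between adjacent delimiters.
Sub ShowEmptyTokens()
    Dim Parts() As String

    Parts = Split("a  b", " ")          ' note the two spaces between a and b
    Debug.Print UBound(Parts) + 1       ' number of elements returned
    Debug.Print "[" & Parts(1) & "]"    ' the middle element, shown in brackets
End Sub
```

Split hands back three elements here, and the middle one is an empty string, so a sentence with doubled spaces throws the word numbering off by one.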

Hope this helps,

    Mike


- Microsoft Visual Basic MVP -
E-Mail: ED***@mvps.org
WWW: Http://EDais.mvps.org/
Author
2 Feb 2006 12:42 PM
J French
On 1 Feb 2006 21:56:53 -0800, mich***@preece.net wrote:

>Hi
>
>I have been using UniBasic (aka DataBasic/PickBasic) for years. If I
>want to extract the first word from a sentence I would code :-
>
>WORD=FIELD(SENTENCE,' ',1)
>
>In the above example, SENTENCE is a string. WORD is also a string.
>
>In VB, it seems, I could use the Split() function instead (or a
>combination of instr and mid). What this - Split() function - does,
>though, is take my string and create an array out of it. That's very
>nice if that's what you want to do. I don't. I want a string to be
>extracted from a string. I would have thought it quite common for
>people to want to do this.

You are right, Split() is incredibly inefficient for what you want as
it really creates an Array even if you just do :-

        S = Split(Tgt$, Delim)(2)

This is probably what you want, but remember there are commas, full
stops etc in sentences so you'll need to sort those out first.

'
' #########################################################################
'
Public Function StrExtStr$(StringIn$, Delimiter$, Nth%)

     ' mod 26/10/99 JF - Allowed Delim over 1 Byte
     ' 27/6/03  JF - Fixed Start

     Dim QEnd&, Count&, Found&, Start&, DLen%

     If Nth < 1 Then
        Exit Function
     End If

     DLen = Len(Delimiter$)
     Start = 1 - DLen  ' 27/6/03 - eg: 1 -> 0
     QEnd = 1 - DLen
     While Found = 0

         Start = QEnd + DLen
         QEnd = InStr(Start, StringIn$, Delimiter$)
         Count = Count + 1

         If Count = Nth Or _
             QEnd = 0 Then _
             Found = 1

     Wend
     If Count = Nth Then
        If QEnd = 0 Then QEnd = Len(StringIn$) + 1
        StrExtStr$ = Mid$(StringIn$, Start, QEnd - Start)
     End If

End Function
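
As for sorting out the commas and full stops first, one possible approach is to turn them into spaces before extracting words. This is a rough sketch only: the function name and the list of punctuation marks are assumptions, so adjust both to suit your text.

```vb
' Hypothetical helper: replace common punctuation with spaces so that a
' space-delimited extraction sees only words.
Function StripPunctuation(ByVal Text As String) As String
    Dim Marks As Variant
    Dim i As Long

    ' Assumed set of marks; extend as needed
    Marks = Array(",", ".", ";", ":", "!", "?")
    For i = LBound(Marks) To UBound(Marks)
        Text = Replace(Text, Marks(i), " ")
    Next i

    ' Collapse the runs of spaces left behind by the replacements
    Do While InStr(Text, "  ") > 0
        Text = Replace(Text, "  ", " ")
    Loop

    StripPunctuation = Trim$(Text)
End Function
```

For example, StripPunctuation("Hello, world. Bye!") gives "Hello world Bye", which can then be passed to a word-extraction routine with " " as the delimiter.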
