
Working with unicode string

Author
20 Mar 2009 1:48 PM
Jack B. Pollack
I have a unicode string that I am reading from a file

eg "123" reads in as:  1  zero  2  zero  3  zero


I want to use the Replace function. There must be an easier way of dealing
with this type of string than the way I am doing it now (I need to keep the
new string as Unicode too):

s1="1" & Chr$(0) & "2" & Chr$(0) & "3"
s2="A" & Chr$(0) & "B" & Chr$(0) & "C"

a$ = Replace(a$, s1, s2)

Thanks

Author
20 Mar 2009 3:35 PM
Eduardo

Convert the string with this function:

a$ = StrConv(a$, vbFromUnicode)
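
For the all-ASCII case in the question, here is a minimal sketch of what that call does. The Sub name and the test string are made up for illustration, and it assumes the file was read through VB's ordinary string I/O, so that each byte of the file became one character of the string.

```vb
Sub StrConvDemo()
    Dim a As String

    ' Simulate the string as read from the file: "1", zero, "2", zero, "3", zero.
    a = "1" & Chr$(0) & "2" & Chr$(0) & "3" & Chr$(0)
    Debug.Print Len(a)              ' 6 characters, one per file byte

    ' vbFromUnicode converts each character to one ANSI byte; the six bytes
    ' are then stored back in a String, two bytes per character.
    a = StrConv(a, vbFromUnicode)
    Debug.Print Len(a)              ' 3 characters

    ' Ordinary string literals now match.
    a = Replace(a, "123", "ABC")
    Debug.Print a
End Sub
```

StrConv(a, vbUnicode) is the reverse call and should give back the zero-interleaved form for writing. Both directions go through the system ANSI code page, so this sketch only claims to be safe for plain ASCII data.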
Author
20 Mar 2009 4:52 PM
Jim Mack
Eduardo wrote:
> Convert the string with this function:
>
> a$ = StrConv(a$, vbFromUnicode)

That may not give the OP what he needs, depending on the source of the
file data. If the file was generated in a different locale, the
conversion you show could corrupt the string. To be completely
language-agnostic, you should use a byte array as the intermediate.

--
   Jim Mack
   Twisted tees at http://www.cafepress.com/2050inc
   "We sew confusion"
Author
20 Mar 2009 5:12 PM
Eduardo
"Jim Mack" <jmack@mdxi.nospam.com> wrote in message
news:eQwT9wXqJHA.4516@TK2MSFTNGP02.phx.gbl...

> That may not give the OP what he needs, depending on the source of the
> file data. If the file was generated in a different locale, the
> conversion you show could corrupt the string. To be completely
> language-agnostic, you should use a byte array as the intermediate.

Yes, you are right. I was thinking of an ANSI string encoded as Unicode.
PS: I realize I was still too sleepy to be posting to the newsgroup.
Author
20 Mar 2009 5:58 PM
Tony Proctor
Absolutely right, Jim! By default, VB's textual I/O will assume the data is in
the active ANSI character set. Anything that isn't part of 7-bit ASCII will
therefore get mangled and will not represent the correct characters in
memory.

    Tony Proctor

Author
20 Mar 2009 7:34 PM
Jack B. Pollack
"Eduardo" <m*@mm.com> wrote in message news:gq0gij$dkh$1@aioe.org...
> Convert the string with this function:
>
> a$ = StrConv(a$, vbFromUnicode)
>
>
Thanks. Despite the hot debate it worked fine.
Author
20 Mar 2009 7:50 PM
Bob Butler
"Jack B. Pollack" <N@NE.nothing> wrote in message
news:%23XEWsLZqJHA.4028@TK2MSFTNGP03.phx.gbl...
> Thanks. Despite the hot debate it worked fine.

It may have worked for the specific text you tried it with; it may even work
with every text you'll encounter in this task; that doesn't make it correct.

If the file has Unicode text, then use byte arrays and binary I/O to do the
read/write operations.
Author
20 Mar 2009 7:51 PM
Tony Proctor
With A-Z and other Latin characters, I don't doubt it, Jack.

Try something like the Euro character. I think the Unicode value for this is
&H20AC (needs checking), but the Windows Latin-1 (code page 1252) ANSI code
is just &H80.

    Tony Proctor
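
A small sketch for checking those two values side by side. The Sub name is made up, and the ANSI result assumes a system whose ANSI code page is Windows-1252; under another code page the conversion can produce a different byte or a substitute character.

```vb
Sub EuroDemo()
    Dim euro As String
    Dim wide() As Byte
    Dim ansi() As Byte

    euro = ChrW$(&H20AC)                  ' Euro sign, U+20AC

    wide = euro                           ' raw UTF-16 bytes of the string
    ansi = StrConv(euro, vbFromUnicode)   ' converted to the system ANSI code page

    Debug.Print "Code point: "; Hex$(AscW(euro))
    Debug.Print "UTF-16 bytes: "; Hex$(wide(0)); " "; Hex$(wide(1))
    Debug.Print "ANSI byte count:"; UBound(ansi) + 1; " first byte: "; Hex$(ansi(0))
End Sub
```

On a Windows-1252 system the string should hold the two bytes AC 20, while the ANSI conversion should yield the single byte 80. So a character like this has a different byte pattern in each encoding, which plain ASCII text does not expose.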

Author
20 Mar 2009 4:01 PM
Jim Mack
Jack B. Pollack wrote:
> I have a unicode string that I am reading from a file
>
> eg "123" reads in as:  1  zero  2  zero  3  zero
>
>
> I want to use the replace function. There must be an easier way of
> dealing with
> this type of string then the way I am doing now (need to keep new
> string as unicode also):
>
> s1="1" & Chr$(0) & "2" & Chr$(0) & "3"
> s2="A" & Chr$(0) & "B" & Chr$(0) & "C"
>
> Replace(a$, s1,s2)

Read the file string into a byte array instead of a string, then
simply assign that array to a string. You'll have a true Unicode
string you can search and replace, etc.

To write it back, reverse the process.
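
A minimal sketch of that round trip, under stated assumptions: the helper names and file paths are hypothetical, the file is taken to be UTF-16 little-endian with no byte-order mark, and it is not empty.

```vb
' Read the whole file as raw bytes and hand them to a String unchanged.
Function ReadUnicodeFile(ByVal path As String) As String
    Dim f As Integer
    Dim b() As Byte

    f = FreeFile
    Open path For Binary Access Read As #f
    ReDim b(0 To LOF(f) - 1)    ' assumes the file is not empty
    Get #f, , b                 ' raw bytes, no ANSI conversion
    Close #f

    ReadUnicodeFile = b         ' bytes become the string's UTF-16 data as-is
End Function

' Reverse the process: String to raw bytes, then binary write.
Sub WriteUnicodeFile(ByVal path As String, ByVal content As String)
    Dim f As Integer
    Dim b() As Byte

    b = content                 ' the string's UTF-16 data as raw bytes
    ' Binary mode does not truncate, so remove any existing file first.
    If Len(Dir$(path)) > 0 Then Kill path
    f = FreeFile
    Open path For Binary Access Write As #f
    Put #f, , b
    Close #f
End Sub

Sub ReplaceInFile()
    Dim s As String
    s = ReadUnicodeFile("C:\temp\in.txt")      ' hypothetical path
    s = Replace(s, "123", "ABC")               ' plain literals work
    WriteUnicodeFile "C:\temp\out.txt", s      ' hypothetical path
End Sub
```

If the file starts with a byte-order mark (bytes FF FE), it ends up as the first character of the string and would need to be skipped or preserved explicitly; that case is not handled here.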

--
   Jim Mack
   Twisted tees at http://www.cafepress.com/2050inc
   "We sew confusion"