Matching Unicode Characters

Author
15 Aug 2010 11:59 AM
David Kaye
My program loops through folders and filenames and reads file info into a
database.  The trouble is that the directory structure presents Unicode, but
an API call to WIN32_FIND_DATA or WIN32_FIND_DATAW brings back text in 8-bit
ASCII (ASCII plus the IBM character set 128-255). 

Case in point is the name, Pavel Bielcík, a name which uses an 8-bit ASCII "i"
character (which passes fine) but uses a "c" with a little accent mark over
the top, a character that exists only in Unicode. 

So, the "c" with the accent is changed into a conventional "c" in my
WIN32_FIND_DATA call, but a filelistbox is expecting the name with the proper
Unicode "c" character instead.  Thus, the file list box can't find the folder
and presents an error.

WHAT I NEED TO DO is to convert the text in one direction or the other, either
strip the Unicode accent mark or read the WIN32_FIND_DATA correctly.  I can't
seem to get either one to work.  StrConv to and from Unicode will not work
because "to Unicode" turns any given text into double-bytes, and "from
Unicode" turns the text into "???????" 

Help!

Author
15 Aug 2010 3:55 PM
Nobody
"David Kaye" <sfdavidka***@yahoo.com> wrote in message
news:i48kqe$rja$1@news.eternal-september.org...
> My program loops through folders and filenames and reads file info into a
> database.  The trouble is that the directory structure presents Unicode, but
> an API call to WIN32_FIND_DATA or WIN32_FIND_DATAW brings back text in 8-bit
> ASCII (ASCII plus the IBM character set 128-255).
>
> Case in point is the name, Pavel Bielcík, a name which uses an 8-bit ASCII "i"
> character (which passes fine) but uses a "c" with a little accent mark over
> the top, a character that exists only in Unicode.
>
> So, the "c" with the accent is changed into a conventional "c" in my
> WIN32_FIND_DATA call, but a filelistbox is expecting the name with the proper
> Unicode "c" character instead.  Thus, the file list box can't find the folder
> and presents an error.
>
> WHAT I NEED TO DO is to convert the text in one direction or the other, either
> strip the Unicode accent mark or read the WIN32_FIND_DATA correctly.  I can't
> seem to get either one to work.  StrConv to and from Unicode will not work
> because "to Unicode" turns any given text into double-bytes, and "from
> Unicode" turns the text into "???????"

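You need to use FindFirstFileW and WIN32_FIND_DATAW, and declare the file name
parameter As Long so you can pass StrPtr. Inside WIN32_FIND_DATAW the name
fields are fixed-size inline buffers, so declare cFileName As Byte (0 To 519)
rather than As String, or VB will convert it to ANSI. Example air code:

Dim wfd As WIN32_FIND_DATAW
Dim sFileName As String
Dim hFind As Long

hFind = FindFirstFileW(ByVal StrPtr("\\?\C:\*"), wfd)
If hFind <> -1 Then                    ' -1 = INVALID_HANDLE_VALUE
    sFileName = wfd.cFileName          ' Byte array copies into the String as UTF-16
    sFileName = TrimNull(sFileName)    ' cut at the first null character
    FindClose hFind
End If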
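
For completeness, here is a rough sketch of declarations that would go with a Unicode enumeration like this. It assumes the name fields of WIN32_FIND_DATAW are declared as inline Byte arrays (260 and 14 wide characters, which is how the Win32 structure lays them out) and that the file name parameter is passed as a pointer via StrPtr. ListNames is a hypothetical helper name, and the body of TrimNull is my guess at what such a helper does; neither comes from your code, so adjust to taste.

```vb
Option Explicit

Private Const INVALID_HANDLE_VALUE As Long = -1

Private Type FILETIME
    dwLowDateTime As Long
    dwHighDateTime As Long
End Type

' Name fields are inline buffers, not pointers. Byte arrays stop VB
' from converting them to ANSI when the structure crosses the Declare.
Private Type WIN32_FIND_DATAW
    dwFileAttributes As Long
    ftCreationTime As FILETIME
    ftLastAccessTime As FILETIME
    ftLastWriteTime As FILETIME
    nFileSizeHigh As Long
    nFileSizeLow As Long
    dwReserved0 As Long
    dwReserved1 As Long
    cFileName(0 To 519) As Byte            ' 260 wide characters
    cAlternateFileName(0 To 27) As Byte    ' 14 wide characters
End Type

Private Declare Function FindFirstFileW Lib "kernel32" ( _
    ByVal lpFileName As Long, lpFindFileData As WIN32_FIND_DATAW) As Long
Private Declare Function FindNextFileW Lib "kernel32" ( _
    ByVal hFindFile As Long, lpFindFileData As WIN32_FIND_DATAW) As Long
Private Declare Function FindClose Lib "kernel32" ( _
    ByVal hFindFile As Long) As Long

' Assumed helper: returns the text before the first null character.
Private Function TrimNull(ByVal s As String) As String
    Dim pos As Long
    pos = InStr(1, s, vbNullChar)
    If pos > 0 Then
        TrimNull = Left$(s, pos - 1)
    Else
        TrimNull = s
    End If
End Function

' Hypothetical helper: collects every name matching sPattern,
' for example "\\?\C:\*", as a full Unicode String.
Public Function ListNames(ByVal sPattern As String) As Collection
    Dim names As Collection
    Dim wfd As WIN32_FIND_DATAW
    Dim hFind As Long
    Dim sFileName As String

    Set names = New Collection
    hFind = FindFirstFileW(StrPtr(sPattern), wfd)
    If hFind <> INVALID_HANDLE_VALUE Then
        Do
            sFileName = wfd.cFileName      ' raw bytes copied in as UTF-16
            names.Add TrimNull(sFileName)
        Loop While FindNextFileW(hFind, wfd) <> 0
        FindClose hFind
    End If
    Set ListNames = names
End Function
```

Something like Set col = ListNames("\\?\C:\*") should then give you a Collection of names with the accented characters intact, since nothing in the path from the API to the String goes through an ANSI conversion.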

If you still can't get it to work, post your declaration and sample code,
and what error you are getting.