BBC BASIC for Windows
« Unicode in Folder names »

hellomike
Re: Unicode in Folder names
« Reply #6 on: Apr 2nd, 2015, 4:22pm »

As before, just getting confirmation of how things work (and don't) helps a great deal!

Quote:
I worry that you are turning something which is intrinsically very simple into something difficult.


Yep, my approach was needlessly difficult. The theory behind it all isn't complex, but it isn't really "very simple" either; then again, once understood, everything is simple.

It confused me that the API "FindFirstFileW" wasn't really documented on MSDN, and it took me a while to realize that the call works with wide strings for both input and output and that a wide string is terminated by 0x0000 (two zero bytes).

So there is progress: the following code now lists the folder names correctly, after I wrote a function that makes a wide string version of the initial rootdir (D:\X30Share).
Code:
      CP_UTF8 = &FDE9
      VDU 23,22,640;512;8,16,16,128+8 : REM Select UTF-8 mode
      *font Courier New
      rootpath$="D:\X30Share"

      N%=FNscandir(FNANSItoWide(rootpath$))
      PRINT '"There were ";N%;" files in root path"
      END

      DEF FNscandir(path$)
      LOCAL dir%,sh%,res%,n%,utf8%
      DIM dir% LOCAL 317,utf8% LOCAL 260
      SYS "FindFirstFileW",path$+"\"+CHR$0+"*"+CHR$0+CHR$0+CHR$0,dir% TO sh%
      IF sh%<>-1 THEN
        REPEAT
          IF dir%!44<>&0000002E AND dir%!44<>&002E002E THEN
            IF !dir% AND &10 THEN
              SYS "WideCharToMultiByte",CP_UTF8,0,dir%+44,-1,utf8%,260,0,0
              PRINT $$utf8%
              REM Now I have to somehow append the double byte string at dir%+44
              REM to path$ in order to do the recurse call to this function
              REM path$+=$$(dir%+44) won't work...
            ELSE
              n%+=1
            ENDIF
          ENDIF
          SYS "FindNextFileW",sh%,dir% TO res%
        UNTIL res%=0
        SYS "FindClose",sh%
      ENDIF
      =n%:REM Return number of files in the folder

      REM --------------------------------------------------------------------
      DEF FNANSItoWide(a$)
      LOCAL wide$,i%

      FOR i%=1 TO LENa$
        wide$+=MID$(a$,i%,1)+CHR$0
      NEXT
      =wide$ 

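The test for "." and ".." also had to change: with two bytes per character, the 32-bit value at dir%+44 is &0000002E for "." (the character plus its terminator) and &002E002E for the first two characters of "..".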

I will manage appending the double byte string to path$ but had a strange error.
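
For reference, the append mentioned in the REM lines could be done along the following lines. This is only a sketch under stated assumptions: FNwidestr is a hypothetical helper name (not a BB4W library routine), and demo% is just a test buffer. The helper copies the NUL-terminated UTF-16 name at a given address into an ordinary string, two bytes per character, so that it can be concatenated with the wide path$.

```bbcbasic
REM Demo: put the UTF-16LE string "ab" in memory and read it back
DIM demo% 7
demo%!0 = &00620061 : REM bytes 61 00 62 00 = "a","b"
demo%!4 = 0 : REM wide terminator (0x0000)
w$ = FNwidestr(demo%)
PRINT LEN w$ : REM 4 bytes, i.e. two 16-bit characters
END

REM Hypothetical helper: copy the wide string at p% into a BASIC string
DEF FNwidestr(p%)
LOCAL w$
REM Stop at the first 16-bit character whose two bytes are both zero
WHILE ?p% <> 0 OR p%?1 <> 0
  w$ += CHR$(?p%) + CHR$(p%?1)
  p% += 2
ENDWHILE
= w$
```

Inside FNscandir the recursive call might then look like n% += FNscandir(path$ + "\" + CHR$0 + FNwidestr(dir%+44)), assuming that files in sub-folders should be counted too.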
I gathered that the returned names at dir%+44 now occupy twice as many bytes, so I thought I would enlarge the memory area for dir% to be on the safe side, and changed only that line:
Code:
      DIM dir% LOCAL 511,utf8% LOCAL 260 


After listing the folder names, the program stops with the error

Not in a function

and highlights the "=n%" line.

I'm using BB4W V5.95a.

Regards,

Mike

rtr2
Re: Unicode in Folder names
« Reply #7 on: Apr 2nd, 2015, 4:44pm »

on Apr 2nd, 2015, 4:22pm, hellomike wrote:
I thought I would enlarge the memory area for dir% to be on the safe side, and changed only that line
Code:
      DIM dir% LOCAL 511,utf8% LOCAL 260 


You were quite right that more memory needs to be allocated to dir%, but rather than erring on the "safe side" you in fact didn't increase it enough! If you look at the definition of WIN32_FIND_DATA at MSDN you'll find that the wide-character version occupies 592 bytes, so your program requires as a minimum:

Code:
      DIM dir% LOCAL 591,utf8% LOCAL 260 

Strictly speaking the 260 should be increased as well, because the theoretical maximum path length when encoded as UTF-8 is longer than MAX_PATH bytes, but in practice you would be unlikely to exceed that.
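
The figures above can be checked with a little arithmetic. The sketch below adds up the fields of WIN32_FIND_DATA as laid out in the MSDN definition (a 44-byte fixed part, then cFileName with MAX_PATH characters and cAlternateFileName with 14 characters); the variable names are purely illustrative. The UTF-8 figure is a worst-case bound, on the basis that each 16-bit UTF-16 unit needs at most 3 bytes when converted to UTF-8.

```bbcbasic
MAX_PATH% = 260
REM Fixed part: attributes (4), three FILETIMEs (3*8),
REM file size high/low (2*4), two reserved DWORDs (2*4)
fixed% = 4 + 3*8 + 2*4 + 2*4
REM ANSI version: one byte per character in both name fields
ansi% = fixed% + MAX_PATH% + 14
REM Wide version: two bytes per character in both name fields
wide% = fixed% + 2*MAX_PATH% + 2*14
REM Worst case for the UTF-8 copy of cFileName, terminator included
utf8max% = 3*MAX_PATH%
PRINT "Offset of cFileName: "; fixed%
PRINT "ANSI structure size: "; ansi%
PRINT "Wide structure size: "; wide%
PRINT "UTF-8 worst case:    "; utf8max%
```

This gives 44 for the name offset (hence dir%+44), 318 bytes for the ANSI structure (hence the original DIM dir% LOCAL 317, since DIM reserves one more byte than the number given) and 592 bytes for the wide structure. The worst-case UTF-8 size comes to 780 bytes.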

Quote:
I'm using BB4W V5.95a.

I would not advise that if you are working with UTF-16 strings. Windows (particularly 64-bit Windows) sometimes requires that such strings are WORD-aligned, i.e. at an even memory address, and BB4W v6.00a guarantees that when using DIM ... LOCAL. However v5.95a does not. Try this program on both v5.95a and v6.00a to see what I mean:

Code:
      FOR N% = 1 TO 10
        PROC1(N%)
      NEXT
      END

      DEF PROC1(S%)
      DIM dir% LOCAL S%
      PRINT dir%
      ENDPROC 
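
For anyone who has to stay on v5.95a for a while, a possible workaround is to allocate one spare byte and round the address up to an even value. This is offered only as a sketch, not as something taken from the BB4W documentation; PROCaligned, raw% and buf% are illustrative names.

```bbcbasic
FOR N% = 1 TO 10
  PROCaligned(N%)
NEXT
END

DEF PROCaligned(size%)
LOCAL raw%, buf%
REM DIM reserves size%+1 bytes, so there is one spare byte for rounding
DIM raw% LOCAL size%
REM Round up to the next even address; size% bytes remain usable at buf%
buf% = (raw% + 1) AND -2
REM The last column is the low bit of buf%, which is always 0
PRINT raw%, buf%, buf% AND 1
ENDPROC
```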

Richard.
« Last Edit: Apr 2nd, 2015, 5:07pm by rtr2 »

hellomike
Re: Unicode in Folder names
« Reply #8 on: Apr 4th, 2015, 2:55pm »

Yes, I see the difference between v5 and v6 when running the code snippet.

I will continue development using BB4W v6.x.

Thanks for all the help and tips.

Mike