BBC BASIC for Windows
« UTF-8 editor »

Ken Down
Guest
« Thread started on: Jun 5th, 2010, 04:34am »

Richard Russell's simple text editor is very useful, but I cannot get it to accept or output UTF-8 text. I've put in the recommended code to tell it to use UTF-8, but any "funny" characters just come out on screen as ||||| and are saved likewise.

Any ideas, please?

admin
Administrator
« Reply #1 on: Jun 5th, 2010, 09:20am »

on Jun 5th, 2010, 04:34am, Guest-Ken Down wrote:
Richard Russell's simple text editor is very useful, but I cannot get it to accept or output UTF-8 text.

Are we talking about TEXTEDIT.BBC (or something else based on a Windows Edit Control)? If so, Windows Edit Controls do not directly support UTF-8 encoding. What you will have to do is to configure the Edit Control as Unicode (UTF-16, or more precisely UCS-2 encoding) then use MultiByteToWideChar and WideCharToMultiByte respectively to convert UTF-8 to UCS-2 and UCS-2 to UTF-8. It's not too difficult.

To convert the Edit Control to Unicode see this Wiki article, but use the "RichEdit20W" class rather than "RichEdit20A":

http://bb4w.wikispaces.com/Using+Rich+Edit+controls

Do be careful when allocating buffers to ensure they are big enough (UCS-2 requires two bytes per character).

Richard.

Ken Down
Guest
« Reply #2 on: Jun 6th, 2010, 4:10pm »

It's the TEXTEDIT.BBC example program, not an edit box. I presume, therefore, that the detailed instructions you have given do not apply.

admin
Administrator
« Reply #3 on: Jun 6th, 2010, 4:54pm »

on Jun 6th, 2010, 4:10pm, Guest-Ken Down wrote:
It's the TEXTEDIT.BBC example program, not an edit box. I presume, therefore, that the detailed instructions you have given do not apply.

TEXTEDIT.BBC does use a Windows Edit Control (an "edit box", if you prefer):

Code:
Hedit% = FN_createwindow("EDIT", "", 0, 0, @vdu%!208, @vdu%!212, 0, &200044, 0) 

Therefore the instructions I gave apply in full.

Richard.

Ken Down
Guest
« Reply #4 on: Jun 6th, 2010, 8:23pm »

Oh, OK. I didn't realise that a window was an edit box.
I'll try working it out.
Thanks.

Ken Down
Guest
« Reply #5 on: Jun 27th, 2010, 07:40am »

Hmmmm. I finally got around to trying this out. I loaded the example program, TEXTEDIT.BBC. I copied the section of the Rich Edit control instructions which begins SYS "LoadLibrary", "RICHED20.DLL" and ends SCF_ALL = 4, and put it just before the call to FN_createwindow.

I then altered the call to FN_createwindow
Hedit% = FN_createwindow("RichEdit20W","",0,0,@vdu%!208,@vdu%!212,0,WS_BORDER,0)

When I ran the program the edit box appeared with a nice border around it. It accepted and displayed correctly some Hebrew characters, but it would not accept RETURN, just beeped when I pressed that key.

I returned the style parameter to &200044 and the border disappeared but it would now accept RETURN.

However when I saved what I had entered the Hebrew characters appeared as ???? (which isn't much of an improvement).

I have tried putting VDU23,22,800;600;8,16,16,8+128 (which is supposed to set the font to UTF-8) right at the start of the program, but it makes no difference.

I presume there is something simple which I have overlooked, but I can't see what it might be. Any help gratefully accepted.

admin
Administrator
« Reply #6 on: Jun 27th, 2010, 09:19am »

on Jun 27th, 2010, 07:40am, Guest-Ken Down wrote:
it would not accept RETURN, just beeped when I pressed that key.

Check your style values. You probably need to include ES_MULTILINE (4) and possibly ES_WANTRETURN (&1000). See the list of RichEdit styles here:

http://msdn.microsoft.com/en-us/library/bb774367.aspx
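For reference, the &200044 style used in the original TEXTEDIT.BBC call breaks down as WS_VSCROLL (&200000) + ES_AUTOVSCROLL (&40) + ES_MULTILINE (4), so replacing it with WS_BORDER alone removes the multi-line behaviour. A minimal sketch of building the style from named constants follows. The numeric values are the standard Windows ones; the variable names, the Style% variable and the decision to combine a border with ES_WANTRETURN are illustrative assumptions, and FN_createwindow is assumed to come from WINLIB5.

```bbcbasic
      REM Sketch only: create a Unicode RichEdit control with an explicit style
      INSTALL @lib$+"WINLIB5"
      SYS "LoadLibrary", "RICHED20.DLL"

      REM Standard Windows style values, named here for readability
      WS_BORDER = &800000
      WS_VSCROLL = &200000
      ES_MULTILINE = 4
      ES_AUTOVSCROLL = &40
      ES_WANTRETURN = &1000

      REM The original &200044 plus a border and ES_WANTRETURN
      Style% = WS_BORDER OR WS_VSCROLL OR ES_MULTILINE OR ES_AUTOVSCROLL OR ES_WANTRETURN
      Hedit% = FN_createwindow("RichEdit20W", "", 0, 0, @vdu%!208, @vdu%!212, 0, Style%, 0)
```

Dropping WS_BORDER from the expression gives the original appearance with ES_WANTRETURN added.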

Quote:
However when I saved what I had entered the Hebrew characters appeared as ???? (which isn't much of an improvement).

It's a little hard to comment without seeing your code. I explained before that you would need to use SYS "WideCharToMultiByte" to convert the UCS-2 text returned from the RichEdit control to the UTF-8 text that you want to save to file. You must also use SYS "SendMessageW" (rather than the regular SYS "SendMessage") to get the UCS-2 data in the first place. My guess would be that you've used one or other of those calls incorrectly.

This is what I would expect your code to look like (or something very similar):

Code:
      DEF FNsaveas : LOCAL F%, L%, N%, U%
      SYS "GetSaveFileName", fs{} TO F%
      IF F% PROCtitle ELSE = FALSE
      REM Execution continues into FNsave below (an executed DEF line is skipped)
      DEF FNsave : LOCAL F%, L%, N%, U% : IF ?Fn% = 0 THEN = FNsaveas
      REM Get the text length in characters, then the text itself as UCS-2
      SYS "SendMessageW", Hedit%, WM_GETTEXTLENGTH, 0, 0 TO L%
      SYS "GlobalAlloc", 0, 2*(L%+1) TO F%
      SYS "SendMessageW", Hedit%, WM_GETTEXT, L%+1, F%
      REM First call returns the UTF-8 size in bytes, second call converts
      SYS "WideCharToMultiByte", CP_UTF8, 0, F%, L%, 0, 0, 0, 0 TO N%
      SYS "GlobalAlloc", 0, N% TO U%
      SYS "WideCharToMultiByte", CP_UTF8, 0, F%, L%, U%, N%, 0, 0
      SYS "GlobalFree", F%
      REM Write the N% bytes of UTF-8 to the file named at Fn%
      OSCLI "SAVE """+$$Fn%+""" "+STR$~U%+"+"+STR$~N%
      SYS "GlobalFree", U%
      = TRUE

Quote:
I have tried putting VDU23,22,800;600;8,16,16,8+128 (which is supposed to set the font to UTF-8) right at the start of the program, but it makes no difference.

As I've explained before, TEXTEDIT.BBC does not use BBC BASIC's VDU emulator for its output, therefore that command is irrelevant and will have no effect.

Richard.
« Last Edit: Jun 27th, 2010, 09:46am by admin »

Ken Down
Guest
« Reply #7 on: Jun 28th, 2010, 9:18pm »

Ok, I'll play around with that over the next few days. Thanks for your patience and expertise.

Ken Down
Guest
« Reply #8 on: Jul 5th, 2010, 5:05pm »

I presume that when loading a file back in again, I would need to use the opposite call, "MultiByteToWideChar"? Would the parameters be the same as in the two calls to "WideCharToMultiByte" in the save routine?

admin
Administrator
« Reply #9 on: Jul 5th, 2010, 6:06pm »

on Jul 5th, 2010, 5:05pm, Guest-Ken Down wrote:
I presume that when loading a file back in again, I would need to use the opposite call, "MultiByteToWideChar"?

That's correct.

Quote:
Would the parameters be the same as in the two calls to "WideCharToMultiByte" in the save routine?

MultiByteToWideChar has fewer parameters (six rather than eight). Look it up in your preferred Windows API Reference. APIViewer will give you the declaration in BBC BASIC syntax, but not tell you what the parameters mean (I don't advise guessing)!

See Frequently Asked Question #8:

http://www.bbcbasic.co.uk/bbcwin/faq.html#q8
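A load routine mirroring the save code earlier in the thread might look something like the sketch below. PROCloadutf8, file$ and hedit% are hypothetical names and are not part of TEXTEDIT.BBC; CP_UTF8 (65001) and WM_SETTEXT (&C) are the standard Windows values. The sketch assumes the control was created with the "RichEdit20W" class, reads the file one byte at a time for simplicity rather than speed, and does not strip a UTF-8 byte order mark if the file has one.

```bbcbasic
      REM Hypothetical helper (not part of TEXTEDIT.BBC): load a UTF-8 file
      REM into a Unicode ("RichEdit20W") control whose handle is hedit%.
      DEF PROCloadutf8(file$, hedit%)
      LOCAL F%, I%, L%, N%, U%, W%
      CP_UTF8 = 65001 : REM Windows code page number for UTF-8
      WM_SETTEXT = &C
      F% = OPENIN(file$)
      IF F% = 0 THEN ENDPROC
      L% = EXT#F% : REM File length in bytes
      IF L% = 0 THEN CLOSE #F% : ENDPROC
      SYS "GlobalAlloc", 0, L% TO U%
      FOR I% = 0 TO L%-1 : U%?I% = BGET#F% : NEXT : REM Simple, not fast
      CLOSE #F%
      REM First call: zero-length output buffer, so the return value is
      REM the number of UCS-2 characters needed.
      SYS "MultiByteToWideChar", CP_UTF8, 0, U%, L%, 0, 0 TO N%
      REM Two bytes per character, plus room for a two-byte NUL terminator.
      SYS "GlobalAlloc", 0, 2*(N%+1) TO W%
      REM Second call: perform the conversion into the new buffer.
      SYS "MultiByteToWideChar", CP_UTF8, 0, U%, L%, W%, N%
      W%?(2*N%) = 0 : W%?(2*N%+1) = 0 : REM Terminate the wide string
      SYS "SendMessageW", hedit%, WM_SETTEXT, 0, W%
      SYS "GlobalFree", W%
      SYS "GlobalFree", U%
      ENDPROC
```

It could be called as PROCloadutf8($$Fn%, Hedit%) once a file name has been chosen. Note the six parameters to MultiByteToWideChar: code page, flags, source address, source length in bytes, destination address, and destination size in characters.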

Richard.