Topic: Edit control with Unicode-only language (Read 7438 times)

Nick
« Thread started on: Sep 1st, 2011, 2:04pm »
I am making a text editor for 'foreign' languages.
The example TEXTEDIT.BBC does work with say Russian or Hebrew. These languages have ANSI equivalents, so no problem. I can type in Persian, a right-to-left language, - it works fine!
BUT I am working with Tajik - Tajik is a language that uses characters which have a Unicode encoding, but do not exist in any ANSI codepage (i.e. there was never a standardized legacy encoding for these characters).
I can PASTE Tajik text into the box - it displays no problem - but I cannot TYPE text into the box (as I easily can on say Notepad/Word)
Can you point me in the right direction on how to initialise the edit control so that it displays the Unicode characters directly without 'ANSI issues'?
Thanks!
Nick
« Last Edit: Sep 1st, 2011, 2:08pm by Nick »

admin
« Reply #1 on: Sep 1st, 2011, 4:05pm »
on Sep 1st, 2011, 2:04pm, Nick wrote: I can PASTE Tajik text into the box - it displays no problem - but I cannot TYPE text into the box (as I easily can on say Notepad/Word)
Have you tried using a RichEdit Control rather than the regular Edit Control? That might be all that is required, but possibly you'll also need to use a Unicode RichEdit Control (class name RichEdit20W) for it to accept the Tajik input from the keyboard.
Richard.

Nick
« Reply #2 on: Sep 1st, 2011, 6:30pm »
on Sep 1st, 2011, 4:05pm, Richard Russell wrote: Have you tried using a RichEdit Control rather than the regular Edit Control? That might be all that is required, but possibly you'll also need to use a Unicode RichEdit Control (class name RichEdit20W) for it to accept the Tajik input from the keyboard.
Sorry, I didn't explain. My code (unlike TEXTEDIT.BBC) is a rich edit control:
Snippet showing Rich Edit initialisation code (especially bit in pink):
INSTALL @lib$+"WINLIB5A"
DIM CHARFORMAT{cbSize%, \
\              dwMask%, \
\              dwEffects%, \
\              yHeight%, \
\              yOffset%, \
\              crTextColor%, \
\              bCharSet&, \
\              bPitchAndFamily&, \
\              szFaceName&(31), \
\              padding&(1) }
fontsize% = 32
fontname$ = "Microsoft sans serif"
bckgrdcolour% = &FFFFFF : REM white
textcolour% = &000000 : REM black
SYS "LoadLibrary", "Msftedit.dll" TO hRichEditDLL%
IF hRichEditDLL%=0 THEN ERROR 100,"Failed to load Msftedit.dll"
id_richedit% = 100
hRichEdit% = FN_createwindow(@hwnd%, \
\                            "RichEdit50W", \
\                            "", \
\                            0, \
\                            0, \
\                            @vdu.tr%, \
\                            @vdu.tb%, \
\                            id_richedit%, \
\                            WS_BORDER OR ES_MULTILINE, \
\                            0 )
CHARFORMAT.crTextColor% = textcolour%
REM Set Richedit font
CHARFORMAT.cbSize% = DIM(CHARFORMAT{})
CHARFORMAT.dwMask% = CFM_BOLD OR CFM_ITALIC OR CFM_UNDERLINE OR \
\                    CFM_STRIKEOUT OR CFM_FACE OR CFM_COLOR OR CFM_SIZE
CHARFORMAT.dwEffects% = CFE_BOLD OR CFE_UNDERLINE OR CFE_ITALIC
CHARFORMAT.yHeight% = fontsize%*20
CHARFORMAT.szFaceName&() = fontname$
SYS "SendMessage", hRichEdit%, EM_SETCHARFORMAT, SCF_ALL, CHARFORMAT{}
SYS "SetWindowText", hRichEdit%, "This is some initial text"
SYS "SetWindowText", @hwnd%, "Tajik text box test"
Any other possibilities?
Thanks
Nick

admin
« Reply #3 on: Sep 1st, 2011, 9:22pm »
I'm surprised if a Unicode edit control isn't accepting the keyboard input, when Notepad is. As far as I'm aware Notepad is basically an edit control wrapped in a fairly minimal supporting application.
You might want to check out the EM_SETLANGOPTIONS message to see if there's something you need to configure there:
http://msdn.microsoft.com/en-us/library/bb774250.aspx
Richard.
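
A minimal sketch of how the language options could be inspected from BBC BASIC, assuming hRichEdit% is the handle returned by FN_createwindow in the earlier snippet. The message numbers and IMF_ flag values are quoted from richedit.h from memory and should be checked against the SDK headers. Clearing IMF_AUTOFONT is purely an experiment; nothing in this thread establishes that it is the relevant flag.

```bbcbasic
REM Sketch only: read the RichEdit language options, then clear one flag
REM Assumes hRichEdit% already holds the handle of the RichEdit control
WM_USER = &400
EM_SETLANGOPTIONS = WM_USER + 120
EM_GETLANGOPTIONS = WM_USER + 121
IMF_AUTOKEYBOARD = 1 : REM automatic keyboard-layout switching
IMF_AUTOFONT = 2 : REM automatic font switching

SYS "SendMessage", hRichEdit%, EM_GETLANGOPTIONS, 0, 0 TO opts%
PRINT "Current language options = &"; ~opts%

REM Experiment (an assumption, not a known fix): switch automatic font selection off
SYS "SendMessage", hRichEdit%, EM_SETLANGOPTIONS, 0, opts% AND NOT IMF_AUTOFONT
```

Printing the value before changing anything gives a baseline, so any experiment can be undone by sending the original opts% back.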

Nick
« Reply #4 on: Sep 1st, 2011, 10:10pm »
on Sep 1st, 2011, 9:22pm, Richard Russell wrote: You might want to check out the EM_SETLANGOPTIONS message to see if there's something you need to configure there:
OK Richard, I will have a further poke around the docs!
Question:
In the FN_createwindow call, there is this:
hRichEdit% = FN_createwindow(@hwnd%, \
\                            "RichEdit50W", \
What is the significance of the string "RichEdit50W"?
I guess I am wondering exactly what in FN_createwindow tells it to initialise a Unicode (not ANSI) box...
??
Thanks
Nick

admin
« Reply #5 on: Sep 1st, 2011, 10:26pm »
on Sep 1st, 2011, 10:10pm, Nick wrote: I guess I am wondering exactly what in FNcreatewindow tells it to initialise a Unicode (not ANSI) box...
The W (wide) signifies that it's Unicode. RichEdit50A would be ANSI.
Richard.

Nick
« Reply #6 on: Sep 5th, 2011, 8:26pm »
on Sep 1st, 2011, 10:26pm, Richard Russell wrote: The W (wide) signifies that it's Unicode. RichEdit50A would be ANSI.
I have tried an experiment. Looking at the Winapi docs, I found the TEXTMODE enumeration.
( http://msdn.microsoft.com/en-us/library/bb774364(v=VS.85).aspx#TM_RICHTEXT )
Two of its flags affect whether or not the control accepts Unicode. TM_RICHTEXT and TM_MULTICODEPAGE
Of the latter it says:
"The control allows multiple code pages and Unicode text into the control. This is the default setting."
So I tried an experiment. Right at the end of the code which initialises the Rich Edit control, I put the following:
TM_MULTICODEPAGE = 32
TM_RICHTEXT = 2
TMflags% = TM_MULTICODEPAGE OR TM_RICHTEXT
SYS "SendMessage", hRichEdit%, WM_SETTEXT, 0, ""
SYS "SendMessage", hRichEdit%, EM_SETTEXTMODE, TMflags%, 0 TO TMset%
PRINT TAB(80,0);"Settextmode result= ";TMset%;" Value sent= ";TMflags%
SYS "SendMessage", hRichEdit%, EM_GETTEXTMODE, 0, 0 TO TMreturned%
PRINT TAB(80,1);"Gettextmode=";TMreturned%;
The docs say that the control must have no text when EM_SETTEXTMODE is sent. If I send the empty string as above then TMset% comes back zero, which the docs say indicates that the text mode has been set successfully. If I simply insert a character into the string passed to WM_SETTEXT, then EM_SETTEXTMODE returns non-zero, indicating it failed. So it seems to be working...
BUT the following call to get back the TEXTMODE ALSO returns 0!!
In other words, it thinks it has set my choice, but the control does not return my choice when asked immediately afterwards.
I had thought that using CHARFORMAT2 (instead of the existing CHARFORMAT) structure might make the control ignore the instruction, so I upgraded the code to include CHARFORMAT2 - it makes no difference.
The control is still not accepting Unicode-only characters.
I will keep looking through the docs... 
Thanks
Nick
P.S. My initialisation code starts:
SYS "LoadLibrary", "Msftedit.dll" TO hRichEditDLL%
hRichEdit% = FN_createwindow(@hwnd%, \
\                            "RichEdit50W", \
http://msdn.microsoft.com/en-us/library/bb774286(v=VS.85).aspx

admin
« Reply #7 on: Sep 5th, 2011, 9:41pm »
on Sep 5th, 2011, 8:26pm, Nick wrote:
TM_MULTICODEPAGE = 32
TM_RICHTEXT = 2
Where are your declarations of EM_SETTEXTMODE and EM_GETTEXTMODE?
Quote: SYS "LoadLibrary", "Msftedit.dll" TO hRichEditDLL%
Do you check the value of hRichEditDLL% in your code, to confirm the DLL is being loaded successfully?
Quote: hRichEdit% = FN_createwindow(@hwnd%, "RichEdit50W", \
Have you tried using an earlier version of the RichEdit control? If you don't specifically need the features only available in RichEdit50W you might have more success with an older version.
Richard.

Nick
« Reply #8 on: Sep 5th, 2011, 9:51pm »
on Sep 5th, 2011, 9:41pm, Richard Russell wrote: Where are your declarations of EM_SETTEXTMODE and EM_GETTEXTMODE?
Sorry - I only posted a snippet. They are in a PROC, called near the start of the programme:
EM_GETTEXTMODE = (WM_USER + 90)
EM_SETTEXTMODE = (WM_USER + 89)
Quote: Do you check the value of hRichEditDLL% in your code, to confirm the DLL is being loaded successfully?
Yes:
IF hRichEditDLL%=0 THEN ERROR 100,"Failed to load RICHED20.DLL"
Quote: Have you tried using an earlier version of the RichEdit control?
Yes - no difference.
Thanks
Nick
P.S. The behaviour is easily demonstrable - although I am working in Tajik, it shows up with 'standard' Windows languages.
Use "Regional and language" in Win XP to install Tatar.
Then open Notepad and type "QWERTY" - do the same in what you believe to be a rich edit control. You will get question marks for some characters.
The strange thing is that it PASTES Unicode no problem...even right-to-left like Persian...
« Last Edit: Sep 5th, 2011, 9:57pm by Nick »

admin
« Reply #9 on: Sep 6th, 2011, 08:12am »
on Sep 5th, 2011, 9:51pm, Nick wrote: The strange thing is that it PASTES unicode no problem...even right-to-left like Persian...
As it's keyboard-specific it sounds as though the problem may be related to the IME rather than to the RichEdit control itself. I think you probably need to start asking for help on a Microsoft forum.
The top man as far as internationalisation issues are concerned is Michael Kaplan. His blog is here:
http://blogs.msdn.com/b/michkap/
Richard.

Nick
« Reply #10 on: Sep 6th, 2011, 4:26pm »
on Sep 6th, 2011, 08:12am, Richard Russell wrote: As it's keyboard-specific it sounds as though the problem may be related to the IME rather than to the RichEdit control itself. I think you probably need to start asking for help on a Microsoft forum.
Yes, that sounds very likely.
Thanks for the excellent pointer to this blog - I will read it!
I did have a thought: we know that Notepad works perfectly. It should be possible to find out the handle for its main edit control, and then read off the various structures that can affect text options (like CHARFORMAT etc).
Either way, I have options.
The good news is that I am decreasingly making mistakes with BBC4W syntax and increasingly finding answers in the Winapi docs - must be progress!
Thanks
Nick
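
The first step of that idea, finding Notepad's edit control, might look like the sketch below. It assumes a Notepad window is open, that its window class is "Notepad" with a child of class "Edit" (true of the classic Notepad, but worth confirming with a spy utility), and that hRichEdit% is the handle of your own control. IsWindowUnicode is used because it takes no pointer argument.

```bbcbasic
REM Sketch only: locate Notepad's edit control and compare its Unicode status
REM Assumes Notepad is running and hRichEdit% holds your own control's handle
SYS "FindWindow", "Notepad", 0 TO hNotepad%
IF hNotepad% = 0 THEN ERROR 100, "Notepad window not found"
SYS "FindWindowEx", hNotepad%, 0, "Edit", 0 TO hEdit%
IF hEdit% = 0 THEN ERROR 100, "Edit control not found inside Notepad"

REM A non-zero result means the window was registered as a native Unicode window
SYS "IsWindowUnicode", hEdit% TO notepad%
SYS "IsWindowUnicode", hRichEdit% TO mine%
PRINT "Notepad edit control: "; notepad%
PRINT "My RichEdit control:  "; mine%
```

Reading structures such as CHARFORMAT out of Notepad is a different matter: messages in the WM_USER range that carry pointers are not marshalled between processes, and Notepad hosts a plain Edit control rather than a RichEdit one, so this sketch deliberately stops at a call that needs no buffer.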

Nick
« Reply #11 on: Sep 13th, 2011, 4:58pm »
Richard,
RECAP: typing unicode characters which have no legacy code page equivalent into a rich edit control yields "?" characters.
[By the way, you can rule out any issues arising from the fact that it is my application. Just build a rich edit control as per the example on the Wiki (minimalist code at the end), then type "qwerty" into it using the "Tatar" language. You will get the odd "?". Then try it in Notepad - flawless....PASTE from Notepad - flawless...]
I have done more work on this problem before posting this. In fact I have spent over two DAYS reading up on Unicode controls etc. Eventually I found a Russian programmer on the masm32.com forum. In one interchange he described the problem I am experiencing. This is that although the rich edit control is indeed fully unicode aware, and I can paste unicode in no problem, and sniffing the data shows it to be unicode data, when I TYPE into the control (say in the Tatar language), I get some characters in the control as "?" (i.e. &3F00).
The thread is here:
http://www.masm32.com/board/index.php?PHPSESSID=0af4a38a9fb821fe7beab00cece0466a&topic=14448.0
There is one contributor - a Russian guy ("Antariy") - who seems to be pretty clued up. I have had some correspondence with him and he has helped me to understand a little better where the problem may lie.
What is needed is for "the system" to pass unicode to the edit control as pure unicode, and NOT pass them through the 'thunking' that involves going to code pages and back again.
I am not a programmer, nor particularly experienced with BBC4W, but I want to respectfully ask this question:
is it possible that "something" about the BBC main app window (inside which my edit control is embedded) forces "the system" to route KBD input through the code-page-thunking even though the control itself is unicode aware?
(I could PM you with the correspondence he sent if you like)
It is so exasperating because in every other respect BBC4W is *brilliant* and exactly what people like me need. But if in fact there is something about which the user has no control (the initialisation of the main window) that prevents a rich edit control working with unicode as it should do, then it is a great shame...
Thoughts?
Thanks
Nick
************* FROM WIKI **************
SYS "LoadLibrary", "RICHED20.DLL"
EM_SETBKGNDCOLOR = 1091
EM_SETCHARFORMAT = 1092
ES_MULTILINE = 4
WS_BORDER = &800000
WS_CHILD = &40000000
WS_VISIBLE = &10000000
CFM_BOLD = 1
CFM_ITALIC = 2
CFM_UNDERLINE = 4
CFM_STRIKEOUT = 8
CFM_FACE = &20000000
CFM_COLOR = &40000000
CFM_SIZE = &80000000
CFE_BOLD = 1
CFE_ITALIC = 2
CFE_UNDERLINE = 4
CFE_STRIKEOUT = 8
SCF_SELECTION = 1
SCF_ALL = 4
INSTALL @lib$+"WINLIB5"
hre% = FN_createwindow("RichEdit20W", "", 200, 50, 140, 200, 0, \
\                      WS_BORDER OR ES_MULTILINE, 0)
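
The code-page round trip described in this post can be reproduced in isolation, without any edit control. The sketch below converts a single UTF-16 character to the system ANSI code page with WideCharToMultiByte. U+0493 (Cyrillic small letter ghe with stroke, one of the extra Tajik letters) is my choice of test character, not something taken from the thread. On a machine whose ANSI code page has no slot for it, the expected result is the byte &3F, though the exact outcome depends on the system code page.

```bbcbasic
REM Sketch only: what a Unicode-to-ANSI conversion does to one Tajik letter
CP_ACP = 0 : REM the system's default ANSI code page
DIM wide% 3, ansi% 3
!wide% = &0493 : REM U+0493 as UTF-16, followed by a 16-bit NUL
?ansi% = 0

REM Convert exactly one wide character into at most four ANSI bytes
SYS "WideCharToMultiByte", CP_ACP, 0, wide%, 1, ansi%, 4, 0, 0 TO n%
PRINT "Bytes produced: "; n%
PRINT "First byte = &"; ~?ansi%; " (&3F is the question mark)"
```

If the byte printed is &3F, any path that pushes typed characters through the ANSI code page will lose them in the same way, whereas pasted text that stays in UTF-16 is unaffected.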
« Last Edit: Sep 13th, 2011, 5:16pm by Nick »

admin
« Reply #12 on: Sep 13th, 2011, 9:46pm »
on Sep 13th, 2011, 4:58pm, Nick wrote: is it possible that "something" about the BBC main app window (inside which my edit control is embedded) forces "the system" to route KBD input through the code-page-thunking even though the control itself is unicode aware?
The only possibility I can think of is that the Unicode Edit Control is somehow behaving differently because it knows it's been called from an ANSI application. Why it should I don't know (I would probably call it a bug) but you can try creating the control by calling CreateWindowExW rather than CreateWindowExA.
Quote: But if in fact there is something about which the user has no control
Ultimately you can always regain 'control'. In a worst-case scenario it would only be necessary to write a thin Unicode wrapper which opened the Edit Control. Your BBC BASIC program could then communicate with the wrapper and, hey presto, the Edit Control thinks it's being accessed from a Unicode application!
Richard.

Nick
« Reply #13 on: Sep 13th, 2011, 10:44pm »
on Sep 13th, 2011, 9:46pm, Richard Russell wrote: The only possibility I can think of is that the Unicode Edit Control is somehow behaving differently because it knows it's been called from an ANSI application. Why it should I don't know
Thank you for the honest reply!! I thought it was just me being thick. Please indulge my inexperience, but when I use FN_createwindow(), my edit control is then subclassed to the main app window (@hwnd%) - right?
What effect does this have on the message chain for keyboard related messages? Do they get sent to the main app window and then passed to the edit control?
Presumably that is something that happens within the runtime generated windproc for the main window. I further presume that it is @hwnd% that decides how to handle keydown messages etc and is therefore responsible for the undesirable thunking - right?
Or if it simply passes the keydown etc message on to the control, maybe it is not passing the messages wide - so the edit control thinks it is being sent ansi, not unicode?
Of course if the edit control has focus and keydown messages are sent directly to the edit control, the above is rubbish!
Quote: (I would probably call it a bug) but you can try creating the control by calling CreateWindowExW rather than CreateWindowExA.
The Russian guy did suggest this, and I changed WINLIB5 so that ALL the system functions in FN_createwindow were wide. The app still worked....and still gave me question marks!
Quote: Ultimately you can always regain 'control'. In a worst-case scenario it would only be necessary to write a thin Unicode wrapper which opened the Edit Control.
Forgive me, but do you mean "write a wrapper in something other than BBC4W" - or could I do this in BBC4W?
Any thoughts are very welcome at this stage!
Nick

admin
« Reply #14 on: Sep 14th, 2011, 08:07am »
on Sep 13th, 2011, 10:44pm, Nick wrote: Please indulge my inexperience, but when I use FN_createwindow(), my edit control is then subclassed to the main app window (@hwnd%) - right?
No. What may be misleading you is that, in order to ensure CreateWindowEx is called from the context of the thread containing the message pump, @hwnd% is subclassed very briefly (for microseconds). As soon as the new window has been created the subclassing is undone and everything returns to normal. No subclassing is required for a child window like an Edit Control to work.
Also, don't mix up subclassing the parent window (which is what happens very briefly) with subclassing the Edit Control itself (which was mentioned as an issue in the thread to which you linked and which apparently has to be done using SetWindowLongW).
Quote: What effect does this have on the message chain for keyboard related messages? Do they get sent to the main app window and then passed to the edit control?
None and No! Even if the parent window were subclassed it would have no effect either. That's not the way keyboard input works in Windows: the keyboard messages are sent directly to the window which has the focus, not via its parent.
Quote: Presumably that is something that happens within the runtime generated windproc for the main window. I further presume that it is @hwnd% that decides how to handle keydown messages etc and is therefore responsible for the undesirable thunking - right?
No, as far as I am aware that is all completely wrong!
Quote: Of course if the edit control has focus and keydown messages are sent directly to the edit control, the above is rubbish!
I'm afraid so.
Quote: Forgive me, but do you mean "write a wrapper in something other than BBC4W" - or could I do this in BBC4W?
I meant in something like C. But I would still be very surprised if it was necessary. Let's hope Microsoft hasn't done something really silly (it has happened before!).
One thing you could try would be deliberately to subclass the Edit Control (or to use one of the 'spy' utilities) to see just what WM_KEYDOWN and WM_CHAR messages etc. it is receiving when you provide keyboard input. Also, you could try sending what you believe to be the appropriate messages (what are they in the case of the language you are using?) to the Edit Control to see if the behaviour is the same as, or different from, the real keyboard. You might even be able to do something with a keyboard accelerator.
Richard.
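
The "send the messages yourself" experiment could be sketched as follows. It assumes hRichEdit% is the RichEdit control's handle and that the control is empty beforehand. U+0493 is an illustrative Tajik letter of my choosing, and whether the control treats an injected WM_CHAR exactly like real keyboard input is something the experiment is meant to reveal, not something assumed here.

```bbcbasic
REM Sketch only: inject one character as a Unicode WM_CHAR, then read it back
REM Assumes hRichEdit% is the RichEdit control's handle and that it is empty
WM_CHAR = &102
WM_GETTEXT = &D
DIM buf% 31 : REM room for 16 UTF-16 code units

REM Send U+0493 through the wide entry point so no ANSI conversion is involved
SYS "SendMessageW", hRichEdit%, WM_CHAR, &0493, 0

REM Fetch the text as UTF-16 and show the first code unit in hex
SYS "SendMessageW", hRichEdit%, WM_GETTEXT, 16, buf%
PRINT "First code unit stored = &"; ~(!buf% AND &FFFF)
```

If &493 comes back, the control stores the character correctly when it arrives as a wide WM_CHAR, which would point at the delivery of real keystrokes rather than at the control. If &3F comes back, the loss is happening inside the control or its window procedure. Repeating the test with SYS "SendMessage", which as far as I know resolves to the ANSI entry point, gives a comparison case.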