Topic: Edit control with Unicode-only language (Read 7464 times)
admin
Administrator
Re: Edit control with Unicode-only language
« Reply #3 on: Sep 1st, 2011, 9:22pm »
on Sep 1st, 2011, 6:30pm, Nick wrote: I'm surprised if a Unicode edit control isn't accepting the keyboard input, when Notepad is. As far as I'm aware Notepad is basically an edit control wrapped in a fairly minimal supporting application.
You might want to check out the EM_SETLANGOPTIONS message to see if there's something you need to configure there:
http://msdn.microsoft.com/en-us/library/bb774250.aspx
Richard.

Nick
« Reply #4 on: Sep 1st, 2011, 10:10pm »
on Sep 1st, 2011, 9:22pm, Richard Russell wrote: You might want to check out the EM_SETLANGOPTIONS message to see if there's something you need to configure there:
OK Richard, I will have a further poke around the docs!
Question:
In the FN_createwindow call, there is this:
hRichEdit% = FN_createwindow(@hwnd%, \
\            "RichEdit50W", \
What is the significance of the string in red ("RichEdit50W")?
I guess I am wondering exactly what in FNcreatewindow tells it to initialise a Unicode (not ANSI) box...
??
Thanks
Nick

admin
« Reply #5 on: Sep 1st, 2011, 10:26pm »
on Sep 1st, 2011, 10:10pm, Nick wrote: I guess I am wondering exactly what in FNcreatewindow tells it to initialise a Unicode (not ANSI) box...
The W (wide) signifies that it's Unicode. RichEdit50A would be ANSI.
Richard.

Nick
« Reply #6 on: Sep 5th, 2011, 8:26pm »
on Sep 1st, 2011, 10:26pm, Richard Russell wrote: The W (wide) signifies that it's Unicode. RichEdit50A would be ANSI.
I have tried an experiment. Looking at the Winapi docs, I found the TEXTMODE enumeration.
( http://msdn.microsoft.com/en-us/library/bb774364(v=VS.85).aspx#TM_RICHTEXT )
Two of its flags affect whether or not the control accepts Unicode. TM_RICHTEXT and TM_MULTICODEPAGE
Of the latter it says:
"The control allows multiple code pages and Unicode text into the control. This is the default setting."
So I tried an experiment. Right at the end of the code which initialises the Rich Edit control, I put the following:
TM_MULTICODEPAGE = 32
TM_RICHTEXT = 2
TMflags% = TM_MULTICODEPAGE OR TM_RICHTEXT
SYS "SendMessage", hRichEdit%, WM_SETTEXT, 0, ""
SYS "SendMessage", hRichEdit%, EM_SETTEXTMODE, TMflags%, 0 TO TMset%
PRINT TAB(80,0);"Settextmode result= ";TMset%;" Value sent= ";TMflags%
SYS "SendMessage", hRichEdit%, EM_GETTEXTMODE, 0, 0 TO TMreturned%
PRINT TAB(80,1);"Gettextmode=";TMreturned%;
The docs say that the control must have no text when EM_SETTEXTMODE is called. If I send the empty string as above then TMset% comes back zero, which the docs say indicates that the TEXTMODE has been set successfully. If I simply insert a character into the string passed to WM_SETTEXT, then EM_SETTEXTMODE returns non-zero, indicating it failed. So it seems to be working...
BUT the following call to get back the TEXTMODE ALSO returns 0!!
In other words, it thinks it has set my choice, but the control does not return my choice when asked immediately afterwards.
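The numbers in play here can be checked away from Windows. What follows is only a sketch, in Python rather than BBC BASIC, using the TEXTMODE values from richedit.h and the usual WM_USER base of &400; the helper name describe_textmode is made up for illustration. It turns a TEXTMODE value into flag names so that the value sent (34) and the value read back (0) can be compared:

```python
# Sketch: decode a TEXTMODE bit mask into flag names.
# Flag values are those defined in richedit.h; describe_textmode is a
# hypothetical helper, not part of any Windows or BBC BASIC library.

WM_USER = 0x0400
EM_SETTEXTMODE = WM_USER + 89
EM_GETTEXTMODE = WM_USER + 90

TEXTMODE_FLAGS = {
    1: "TM_PLAINTEXT",
    2: "TM_RICHTEXT",
    4: "TM_SINGLELEVELUNDO",
    8: "TM_MULTILEVELUNDO",
    16: "TM_SINGLECODEPAGE",
    32: "TM_MULTICODEPAGE",
}


def describe_textmode(value: int) -> list:
    """Return the names of all TEXTMODE flags set in value, lowest bit first."""
    return [name for bit, name in sorted(TEXTMODE_FLAGS.items()) if value & bit]


if __name__ == "__main__":
    print("EM_SETTEXTMODE =", EM_SETTEXTMODE, " EM_GETTEXTMODE =", EM_GETTEXTMODE)
    print("sent    :", describe_textmode(32 | 2))
    print("returned:", describe_textmode(0))
```

A returned value of 0 carries none of these flags, not even the rich/plain choice, so it does not describe any text mode at all.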
I had thought that using the existing CHARFORMAT structure (instead of CHARFORMAT2) might be making the control ignore the instruction, so I upgraded the code to use CHARFORMAT2. It makes no difference.
The control is still not accepting Unicode-only characters.
I will keep looking through the docs... 
Thanks
Nick
P.S. My initialisation code starts:
SYS "LoadLibrary", "Msftedit.dll" TO hRichEditDLL%
hRichEdit% = FN_createwindow(@hwnd%, \
\            "RichEdit50W", \
http://msdn.microsoft.com/en-us/library/bb774286(v=VS.85).aspx

admin
« Reply #7 on: Sep 5th, 2011, 9:41pm »
on Sep 5th, 2011, 8:26pm, Nick wrote: TM_MULTICODEPAGE = 32 TM_RICHTEXT = 2
Where are your declarations of EM_SETTEXTMODE and EM_GETTEXTMODE?
Quote: SYS "LoadLibrary", "Msftedit.dll" TO hRichEditDLL%
Do you check the value of hRichEditDLL% in your code, to confirm the DLL is being loaded successfully?
Quote: hRichEdit% = FN_createwindow(@hwnd%, "RichEdit50W", \
Have you tried using an earlier version of the RichEdit control? If you don't specifically need the features only available in RichEdit50W you might have more success with an older version.
Richard.

Nick
« Reply #8 on: Sep 5th, 2011, 9:51pm »
on Sep 5th, 2011, 9:41pm, Richard Russell wrote: Where are your declarations of EM_SETTEXTMODE and EM_GETTEXTMODE?
Sorry - I only posted a snippet. They are in a PROC, called near the start of the programme:
EM_GETTEXTMODE = (WM_USER + 90)
EM_SETTEXTMODE = (WM_USER + 89)
Quote: Do you check the value of hRichEditDLL% in your code, to confirm the DLL is being loaded successfully?
Yes:
IF hRichEditDLL%=0 THEN ERROR 100,"Failed to load RICHED20.DLL"
Quote: Have you tried using an earlier version of the RichEdit control?
Yes - no difference.
Thanks
Nick
P.S. The behaviour is easily demonstrable. Although I am working in Tajik, it also shows up with 'standard' Windows languages.
Use "Regional and Language Options" in Windows XP to install Tatar.
Then open Notepad and type "QWERTY", and do the same in what you believe to be a rich edit control. You will get question marks for some characters.
The strange thing is that it PASTES Unicode no problem... even right-to-left like Persian...
« Last Edit: Sep 5th, 2011, 9:57pm by Nick »

admin
« Reply #9 on: Sep 6th, 2011, 08:12am »
on Sep 5th, 2011, 9:51pm, Nick wrote: The strange thing is that it PASTES unicode no problem...even right-to-left like Persian...
As it's keyboard-specific it sounds as though the problem may be related to the IME rather than to the RichEdit control itself. I think you probably need to start asking for help on a Microsoft forum.
The top man as far as internationalisation issues are concerned is Michael Kaplan. His blog is here:
http://blogs.msdn.com/b/michkap/
Richard.

Nick
« Reply #10 on: Sep 6th, 2011, 4:26pm »
on Sep 6th, 2011, 08:12am, Richard Russell wrote: As it's keyboard-specific it sounds as though the problem may be related to the IME rather than to the RichEdit control itself. I think you probably need to start asking for help on a Microsoft forum.
Yes, that sounds very likely.
Thanks for the excellent pointer to this blog - I will read it!
I did have a thought: we know that Notepad works perfectly. It should be possible to find out the handle for its main edit control, and then read off the various structures that can affect text options (like CHARFORMAT etc).
Either way, I have options.
The good news is that I am decreasingly making mistakes with BBC4W syntax and increasingly finding answers in the Winapi docs - must be progress!
Thanks
Nick

Nick
« Reply #11 on: Sep 13th, 2011, 4:58pm »
Richard,
RECAP: typing Unicode characters which have no legacy code page equivalent into a rich edit control yields "?" characters.
[By the way, you can rule out any issues arising from the fact that it is my application. Just build a rich edit control as per the example on the Wiki (minimalist code at the end), then type "qwerty" into it using the "Tatar" language. You will get the odd "?". Then try it in Notepad - flawless... PASTE from Notepad - flawless...]
I have done more work on this problem before posting this. In fact I have spent over two DAYS reading up on Unicode controls etc. Eventually I found a Russian programmer on the masm32.com forum. In one interchange he described the problem I am experiencing. This is that although the rich edit control is indeed fully unicode aware, and I can paste unicode in no problem, and sniffing the data shows it to be unicode data, when I TYPE into the control (say in the Tatar language), I get some characters in the control as "?" (i.e. &3F00).
The thread is here:
http://www.masm32.com/board/index.php?PHPSESSID=0af4a38a9fb821fe7beab00cece0466a&topic=14448.0
There is one contributor - a Russian guy ("Antariy") - who seems to be pretty clued up. I have had some correspondence with him and he has helped me to understand a little better where the problem may lie.
What is needed is for "the system" to pass unicode to the edit control as pure unicode, and NOT pass them through the 'thunking' that involves going to code pages and back again.
I am not a programmer, nor particularly experienced with BBC4W, but I want to respectfully ask this question:
is it possible that "something" about the BBC main app window (inside which my edit control is embedded) forces "the system" to route KBD input through the code-page-thunking even though the control itself is unicode aware?
(I could PM you with the correspondence he sent if you like)
It is so exasperating because in every other respect BBC4W is *brilliant* and exactly what people like me need. But if in fact there is something about which the user has no control (the initialisation of the main window) that prevents a rich edit control working with unicode as it should do, then it is a great shame...
Thoughts?
Thanks
Nick
************* FROM WIKI **************
SYS "LoadLibrary", "RICHED20.DLL"
EM_SETBKGNDCOLOR = 1091
EM_SETCHARFORMAT = 1092
ES_MULTILINE = 4
WS_BORDER = &800000
WS_CHILD = &40000000
WS_VISIBLE = &10000000
CFM_BOLD = 1
CFM_ITALIC = 2
CFM_UNDERLINE = 4
CFM_STRIKEOUT = 8
CFM_FACE = &20000000
CFM_COLOR = &40000000
CFM_SIZE = &80000000
CFE_BOLD = 1
CFE_ITALIC = 2
CFE_UNDERLINE = 4
CFE_STRIKEOUT = 8
SCF_SELECTION = 1
SCF_ALL = 4
INSTALL @lib$+"WINLIB5"
hre% = FN_createwindow("RichEdit20W", "", 200, 50, 140, 200, 0, \
\                      WS_BORDER OR ES_MULTILINE, 0)
« Last Edit: Sep 13th, 2011, 5:16pm by Nick »

admin
« Reply #12 on: Sep 13th, 2011, 9:46pm »
on Sep 13th, 2011, 4:58pm, Nick wrote: is it possible that "something" about the BBC main app window (inside which my edit control is embedded) forces "the system" to route KBD input through the code-page-thunking even though the control itself is unicode aware?
The only possibility I can think of is that the Unicode Edit Control is somehow behaving differently because it knows it's been called from an ANSI application. Why it should I don't know (I would probably call it a bug) but you can try creating the control by calling CreateWindowExW rather than CreateWindowExA.
Quote: But if in fact there is something about which the user has no control
Ultimately you can always regain 'control'. In a worst-case scenario it would only be necessary to write a thin Unicode wrapper which opened the Edit Control. Your BBC BASIC program could then communicate with the wrapper and, hey presto, the Edit Control thinks it's being accessed from a Unicode application!
Richard.

Nick
« Reply #13 on: Sep 13th, 2011, 10:44pm »
on Sep 13th, 2011, 9:46pm, Richard Russell wrote: The only possibility I can think of is that the Unicode Edit Control is somehow behaving differently because it knows it's been called from an ANSI application. Why it should I don't know
Thank you for the honest reply!! I thought it was just me being thick. Please indulge my inexperience, but when I use FN_createwindow(), my edit control is then subclassed to the main app window (@hwnd%) - right?
What effect does this have on the message chain for keyboard related messages? Do they get sent to the main app window and then passed to the edit control?
Presumably that is something that happens within the runtime-generated window procedure for the main window. I further presume that it is @hwnd% that decides how to handle keydown messages etc. and is therefore responsible for the undesirable thunking - right?
Or if it simply passes the keydown etc. message on to the control, maybe it is not passing the messages wide - so the edit control thinks it is being sent ANSI, not Unicode?
Of course if the edit control has focus and keydown messages are sent directly to the edit control, the above is rubbish!
Quote: (I would probably call it a bug) but you can try creating the control by calling CreateWindowExW rather than CreateWindowExA.
The Russian guy did suggest this, and I changed WINLIB5 so that ALL the system functions in FN_createwindow were wide. The app still worked....and still gave me question marks!
Quote: Ultimately you can always regain 'control'. In a worst-case scenario it would only be necessary to write a thin Unicode wrapper which opened the Edit Control.
Forgive me, but do you mean "write a wrapper in something other than BBC4W" - or could I do this in BBC4W?
Any thoughts are very welcome at this stage!
Nick

admin
« Reply #14 on: Sep 14th, 2011, 08:07am »
on Sep 13th, 2011, 10:44pm, Nick wrote: Please indulge my inexperience, but when I use FN_createwindow(), my edit control is then subclassed to the main app window (@hwnd%) - right?
No. What may be misleading you is that, in order to ensure CreateWindowEx is called from the context of the thread containing the message pump, @hwnd% is subclassed very briefly (for microseconds). As soon as the new window has been created the subclassing is undone and everything returns to normal. No subclassing is required for a child window like an Edit Control to work.
Also, don't mix up subclassing the parent window (which is what happens very briefly) with subclassing the Edit Control itself (which was mentioned as an issue in the thread to which you linked and which apparently has to be done using SetWindowLongW).
Quote: What effect does this have on the message chain for keyboard related messages? Do they get sent to the main app window and then passed to the edit control?
None and No! Even if the parent window were subclassed it would have no effect either. That's not the way keyboard input works in Windows: the keyboard messages are sent directly to the window which has the focus, not via its parent.
Quote: Presumably that is something that happens within the runtime generated windproc for the main window. I further presume that it is @hwnd% that decides how to handle keydown messages etc and is therefore responsible for the undesirable thunking - right?
No, as far as I am aware that is all completely wrong!
Quote: Of course if the edit control has focus and keydown messages are sent directly to the edit control, the above is rubbish!
I'm afraid so. 
Quote: Forgive me, but do you mean "write a wrapper in something other than BBC4W" - or could I do this in BBC4W?
I meant in something like C. But I would still be very surprised if it was necessary. Let's hope Microsoft hasn't done something really silly (it has happened before!).
One thing you could try would be deliberately to subclass the Edit Control (or to use one of the 'spy' utilities) to see just what WM_KEYDOWN and WM_CHAR messages etc. it is receiving when you provide keyboard input. Also, you could try sending what you believe to be the appropriate messages (what are they in the case of the language you are using?) to the Edit Control to see if the behaviour is the same as, or different from, the real keyboard. You might even be able to do something with a keyboard accelerator.
Richard.

Nick
« Reply #15 on: Sep 14th, 2011, 5:09pm »
on Sep 14th, 2011, 08:07am, Richard Russell wrote: One thing you could try would be deliberately to subclass the Edit Control (or to use one of the 'spy' utilities) to see just what WM_KEYDOWN and WM_CHAR messages etc. it is receiving when you provide keyboard input.
Thank you for your patience Richard. And too for the helpful note about the true nature of the brief subclassing. As I said, a learning curve!
I used Winspector to trap the WM_CHAR messages. One of the letters that shows the thunking problem is the letter "W" when typed in Tatar.
Winspector shows the WM_CHAR results for this key press:
wParam: 0x000004e9 lParam: 0x00110001
This is the correct unicode, I should get a "ө" but I get a "?"
Then I changed the code to 'manually' send a WM_CHAR message with precisely the same parameters to the edit control after it was initialised. It works! I get a "ө"
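For anyone wanting to check the Winspector figures by hand, here is a small sketch, in Python rather than BBC BASIC, that unpacks the two parameters. It assumes the documented WM_CHAR layout (wParam is the UTF-16 code unit; lParam bits 0-15 are the repeat count and bits 16-23 the scan code); decode_wm_char is a made-up helper name:

```python
# Sketch: unpack the WM_CHAR parameters captured by a spy utility.
# Assumes the documented WM_CHAR layout; decode_wm_char is hypothetical.
import unicodedata


def decode_wm_char(wparam: int, lparam: int) -> dict:
    """Split WM_CHAR parameters into character, repeat count and scan code."""
    ch = chr(wparam & 0xFFFF)  # wParam carries one UTF-16 code unit
    return {
        "char": ch,
        "name": unicodedata.name(ch, "<unnamed>"),
        "repeat_count": lparam & 0xFFFF,     # bits 0-15
        "scan_code": (lparam >> 16) & 0xFF,  # bits 16-23
    }


if __name__ == "__main__":
    info = decode_wm_char(0x000004E9, 0x00110001)
    for key, value in info.items():
        print(f"{key}: {value}")
```

With the values above this gives the character "ө", a repeat count of 1 and scan code &11, which is the W key position on a standard PC keyboard, so the message itself looks correct.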
I am not sure where that leaves me. I think the Russian guy is saying that, for some reason that is not clear, if you have a Unicode control that was created by an ANSI window, it reverts to the code-page thunking... (He seemed to suggest it showed most in Windows XP, but my Windows 7 machine also gives the dreaded "?" in the control.)
I have asked him to write his explanation in Russian and I will get it translated for here.
Any other ideas in the mean time?
Thanks
Nick

admin
« Reply #16 on: Sep 14th, 2011, 10:32pm »
on Sep 14th, 2011, 5:09pm, Nick wrote: I think the Russian guy is saying that for some reason that is not clear, if you have a unicode control that was created by an ANSI window, it reverts to the code-page thunking...
How is he suggesting the Edit Control is even aware that it was created by an ANSI application? If it is created by calling CreateWindowExA then, fine, it is reasonable for it to assume it was created by an ANSI app. But if it is created by calling CreateWindowExW, which I understand you have already tried doing, why would it not think it had been created by a Unicode application?
There's nothing 'magic' about an application being a Unicode app. In a language like 'C' it comes about purely by defining the constant UNICODE, which results in the pre-processor expanding macros and defining function prototypes etc. such that they are applicable to Unicode (so, for example, CreateWindowEx is expanded by the pre-processor to CreateWindowExA if UNICODE is not defined and to CreateWindowExW if UNICODE is defined).
The other thing I don't understand is that you say the Edit Control behaves differently when receiving a WM_CHAR from the keyboard as opposed to receiving a WM_CHAR from you. But how does it know? It's just a message that arrives via its message pump, and such messages don't include any information from which the Edit Control could deduce the source, as far as I know.
If it's the case that you can display the correct character by sending a WM_CHAR message to the control, could you not somehow arrange for all keyboard input to be routed via your code and sent to the Edit Control by that route? I'm not too sure how you'd do that - it would involve hijacking the input focus mechanism somehow - but perhaps it could be accomplished using a keyboard accelerator.
Richard.

Nick
« Reply #17 on: Sep 15th, 2011, 06:02am »
on Sep 14th, 2011, 10:32pm, Richard Russell wrote: How is he suggesting the Edit Control is even aware that it was created by an ANSI application?
Hello Richard,
Phew - this is proving interesting. I have pushed him on this point. He did TWO things. Firstly he explained in more detail:
"Main window do not subclassed controls, yes. It just contains them. BUT, the point is: message pump, which servicing main window - servicing all the windows in thread. Windows controls are all windows. I.e., message pump servicing not only main window, it servicing messages from all the windows in thread - from main window, from edit box, from RichEdit, from buttons etc etc.
And, if message pump is not Unicode-aware, there is have a place thunking between intermixed content. Even if some window (main window or child-control - have no difference) is Unicode aware, Unicode created etc, it will not support some Unicode features, like keyboard input, because ASCII-APIs message pump is not ready to get and translate and service such input. In this case Unicode will be converted to ASCII, non-ACP (non system language codepage) chars will be dropped to "?", and then, if target window is Unicode, char will be converted again to Unicode, but it will already be just "?'..."
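The double conversion he describes is easy to reproduce in isolation. This is only a sketch, in Python rather than BBC BASIC, and it assumes Windows-1251 as the ANSI code page (an assumption; the thread does not say which code page the machines use). It is not what the message pump literally executes, just the same Unicode to code page to Unicode round trip:

```python
# Sketch of the "thunking" round trip: Unicode -> ANSI code page -> Unicode.
# Assumption: the ANSI code page is Windows-1251; thunk is a made-up name.


def thunk(ch: str, codepage: str = "cp1251") -> str:
    """Convert text to the code page and back, as an ANSI layer would."""
    ansi = ch.encode(codepage, errors="replace")  # unmappable chars become b"?"
    return ansi.decode(codepage)


if __name__ == "__main__":
    for ch in ("\u0436", "\u04e9"):  # Cyrillic zhe, Cyrillic barred o
        out = thunk(ch)
        print(f"U+{ord(ch):04X} {ch} -> {out} (bytes {out.encode('utf-16-le').hex()})")
```

A letter that exists in the code page survives the trip, while "ө" comes back as "?", whose UTF-16 bytes are 3F 00, matching what was seen when sniffing the control's data.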
Secondly, to back up his point, he knocked up a very basic edit control/dialogue app in Power BASIC and compiled it and sent it to me. It too displays precisely the same shortcoming with my Tatar input.
Then he hacked his own EXECUTABLE, changed just three bytes, and it worked *PERFECTLY* in any non-code paged language (such as my desired Tajik).
What were the three bytes? Here is his explanation:
"I'll try to write small example app. I'll use PowerBASIC v.9 compiler, and its runtime dialogs creation. Yes, I'll use runtime specially, because it does not support Unicode as well, message pump of runtime is ASCII. I'll compile the program, and this EXE would be usual EXE with standard runtime, and it will not fully support Unicode. And, I'll make a copy of executable with only 3 bytes changed in it: I'll replace ending "A" to "W" in 3 functions only - main functions servicing message pump: PeekMessage, IsDialogMessage, DispatchMessage.
Second, patched executable will support full range of Unicode input, at least on WinXP. You can make binary comparison "fc /b ..." and see, than executables differents only by 3 bytes."
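To make the size of that change concrete, here is a toy sketch in Python. The byte string is a made-up stand-in for the import names inside an executable, not a real EXE, and patch_a_to_w and diff_offsets are hypothetical helpers. It flips the trailing "A" to "W" on the three function names and then lists the differing offsets, roughly what "fc /b" would report:

```python
# Toy illustration of the three-byte patch. The blob is NOT a real
# executable; it only mimics NUL-terminated import names.

NAMES = (b"PeekMessage", b"IsDialogMessage", b"DispatchMessage")


def patch_a_to_w(image: bytes) -> bytes:
    """Replace the ANSI import names with their wide (W) equivalents."""
    for name in NAMES:
        image = image.replace(name + b"A\x00", name + b"W\x00")
    return image


def diff_offsets(a: bytes, b: bytes) -> list:
    """Return the offsets at which two equal-length blobs differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    original = (b"PeekMessageA\x00IsDialogMessageA\x00"
                b"DispatchMessageA\x00CreateWindowExA\x00")
    patched = patch_a_to_w(original)
    for off in diff_offsets(original, patched):
        print(f"{off:08X}: {original[off]:02X} {patched[off]:02X}")
```

Only the three message-pump names change (41 becomes 57, "A" to "W"); other imports such as CreateWindowExA are left alone.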
So, to the extent that he backed up his explanation with working code, he seems to have the right understanding. I think he is basically saying that the inherent "ANSIness" of the main window predisposes what he calls the 'message pump' for the thread NOT to handle Unicode messaging correctly, even if the thread subsequently creates fully Unicode-aware controls.
Anyway, he offered to do the same with a simple BBC BASIC programme and make my app work by hacking YOUR runtime generated code!
By the way the source of the Power Basic programme he sent is below - it too uses W for dialogue creation.
If you want to play with his small executables (only 20K each) let me know. And I will keep you posted on the results of his BBC4W hack...
Thanks Richard
Nick
************** SOURCE in Power Basic *********
#INCLUDE "win32api.inc"

DECLARE FUNCTION MessageBoxW LIB "USER32.DLL" ALIAS "MessageBoxW" (BYVAL hWnd AS DWORD, lpText AS ASCIIZ, lpCaption AS ASCIIZ, BYVAL dwType AS DWORD) AS LONG
DECLARE FUNCTION GetDlgItemTextW LIB "USER32.DLL" ALIAS "GetDlgItemTextW" (BYVAL hDlg AS DWORD, BYVAL nIDDlgItem AS LONG, lpString AS ASCIIZ, BYVAL nMaxCount AS LONG) AS DWORD

CALLBACK FUNCTION hDlgProc() AS LONG
  SELECT CASE AS LONG CBMSG
    CASE %WM_COMMAND
      SELECT CASE AS LONG LOWRD(CBWPARAM)
        CASE 1001
          LOCAL buff AS STRING
          LOCAL dw AS DWORD
          '######### get the buffer for text, Unicode - 2 bytes per char
          'this can be CONTROL SEND runtime statement, but I using this to
          'be more clear - CONTROL SEND is just a wrapper around SendDlgItemMessage call
          dw=SendDlgItemMessage(CBHNDL,1000,%WM_GETTEXTLENGTH,0,0)
          buff=SPACE$(dw*2+2)
          GetDlgItemTextW(CBHNDL,1000,BYVAL STRPTR(buff),dw+1)
          MessageBoxW(CBHNDL,BYVAL STRPTR(buff),$NUL+$NUL,0)
      END SELECT
  END SELECT
END FUNCTION

FUNCTION PBMAIN() AS LONG
  IF LoadLibrary("RICHED20.DLL")=0 THEN
    MSGBOX "Cannot load RICHED20.DLL!"
    EXIT FUNCTION
  END IF
  LOCAL hDlg AS DWORD
  DIALOG FONT "Arial",14,0,1
  DIALOG NEW 0,"Example",,,200,150,%WS_SYSMENU OR %WS_MINIMIZEBOX TO hDlg
  CONTROL ADD "RICHEDIT20W",hDlg,1000,"",0,0,200,110,%WS_VISIBLE OR %WS_CHILD OR %WS_TABSTOP OR %ES_MULTILINE OR %ES_WANTRETURN
  CONTROL ADD BUTTON,hDlg,1001,"Message text out",10,120,100,10
  DIALOG SHOW MODAL hDlg CALL hDlgProc
END FUNCTION