Topic: Schroedinger’s String
Andy Parkes
Developer
« Thread started on: Jan 6th, 2014, 6:00pm »
Until recently I thought that I understood how BB4W strings worked, but I've stumbled on some behaviour that I don't understand. It's difficult to explain, so I have uploaded a short program, which I've called 'Schroedinger’s String', to the Wiggio group Temp folder to try to demonstrate the behaviour I am referring to.
http://wiggio.com/yui/folder/stream_file.php?doc_key=maAMVhKzNA8ReAl0MpQvh5XxLwN1Q4EV3M4EyYIf7cA=
It seems that if you make a string variable LOCAL in more than one procedure, it is possible for it to have the same string address pointer in both procedures?
If you run the program I've uploaded, it will give you 2 options. Choosing option 1 produces the sort of results that I would have expected (i.e. different string addresses). Run the program a second time and choose option 2, and it produces results that I would not have expected (i.e. the same string address every time). Well, I just don't know what’s going on there! I would also have expected the same behaviour (either the same address or different addresses) from both routes through the program.
Can anyone shed any light?
Thank you, and a Happy New Year to you all.
Andy

admin
Administrator
« Reply #1 on: Jan 6th, 2014, 7:31pm »
on Jan 6th, 2014, 6:00pm, Andy Parkes wrote: Can anyone shed any light?
Try running your program in BB4W version 6 (i.e. via BB4W_v6.bas, which is the only form in which it is currently released). You will probably find that the results are more in keeping with your expectations; here is what I get:
Code:
In _test3 BEFORE _test4
a$ = TEST3
pointer: 56237586
length: 5
In _test4 BEFORE assigning the string
a$ =
pointer: 0
length: 0
In _test4 AFTER assigning the string
a$ = TEST4 STRING
pointer: 56237594
length: 12
Back in _test3 AFTER _test4
a$ = TEST3
pointer: 56237586
length: 5
_test3 string at _test4 string length: TEST3TEST
>
The critical difference, which explains your observations, is that in BB4W v5 strings made LOCAL are temporarily stored on the stack, whereas in BB4W v6 they are temporarily stored on the heap. There are various pros and cons to the two approaches, and I think the issue was discussed some time ago on the Yahoo! group.
Richard.
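
The uploaded test program is not reproduced in the thread. As a rough illustration of the kind of test being discussed, the following minimal sketch prints the address and length of a$ before, during and after a procedure that makes a$ LOCAL. The procedure name PROC_inner and the string contents are assumptions made for this example, and the addresses printed will depend on the BB4W version and the state of the heap.

```bbcbasic
      REM Minimal sketch: watch a$ while it is made LOCAL in a procedure.
      REM PROC_inner and the string contents are hypothetical.
      a$ = "OUTER"
      PRINT "Before call:  "; a$; "  pointer: "; !^a$; "  length: "; LEN(a$)
      PROC_inner
      PRINT "After call:   "; a$; "  pointer: "; !^a$; "  length: "; LEN(a$)
      END

      DEF PROC_inner
      LOCAL a$ : REM saves the caller's a$ and leaves a$ empty
      PRINT "Inner, unset: "; a$; "  pointer: "; !^a$; "  length: "; LEN(a$)
      a$ = "INNER STRING"
      PRINT "Inner, set:   "; a$; "  pointer: "; !^a$; "  length: "; LEN(a$)
      ENDPROC
```

Whatever the addresses turn out to be, a$ should read OUTER again after the call. Only the place where the saved copy is kept differs between v5 and v6.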

Andy Parkes
Developer
« Reply #2 on: Jan 7th, 2014, 12:00pm »
Hi Richard,
Thanks very much for your help, that seems to explain it. Let me double check that I've got this right:
It appears that in the current version of BB4W, when a string is assigned to a LOCAL variable, it is first placed on the heap (as I had thought). If that variable is then made LOCAL again in another procedure (called from the first), the string assigned in the first procedure is copied to the stack (rather than the address of the string). If possible, the same address on the heap will then be used for a string assignment in the second procedure (this might not be possible if other things have been created on the heap in the meantime). When the second procedure returns, the string from the first procedure is copied back from the stack to the heap. Again, the same address on the heap will be used if possible.
I've made some small changes to the program I previously uploaded to show this more clearly:
http://wiggio.com/yui/folder/stream_file.php?doc_key=4M+BQAHGRiJ2jploDrIwOvrplAPO7eYhX6yad/fXRzk=
A recent project has led me on a merry dance as I've encountered vanishing and corrupted strings because I had not understood that the strings themselves were saved on the stack.
Thanks again
Andy

admin
Administrator
« Reply #3 on: Jan 7th, 2014, 12:17pm »
on Jan 7th, 2014, 12:00pm, Andy Parkes wrote: It appears that in the current version of BB4W, when a string is assigned to a LOCAL variable, it is first placed on the heap (as I had thought).
Not really. It's irrelevant whether the string was previously declared as LOCAL - the interpreter doesn't even know that. Assigning a string does the same thing wherever and whenever it happens.
Quote: If that variable is then made LOCAL again in another procedure (called from the first), the string assigned in the first procedure is copied to the stack
Again you're overcomplicating things. Making a string LOCAL temporarily stores it on the stack (in v5) or on the heap (v6); whether it was previously made LOCAL, or indeed defined at all, makes no difference.
Quote: A recent project has led me on a merry dance as I've encountered vanishing and corrupted strings
How the string handling works internally should be completely transparent to a user's program. If you have encountered 'vanishing' or 'corrupted' strings you have found a MASSIVE BUG in BBC BASIC for Windows. If that is indeed the case please engage with me VERY URGENTLY so that this may be investigated and if necessary fixed.
Richard.

admin
Administrator
« Reply #4 on: Jan 7th, 2014, 1:10pm »
on Jan 7th, 2014, 12:00pm, Andy Parkes wrote: I've made some small changes to the program I previously uploaded to show this more clearly
That code is somewhat flawed in that PROC_printChrsOnStack fails to take account of the fact that it uses some stack itself, so what is being printed as the contents of the stack includes some irrelevant data. I would suggest either passing the current stack pointer to the routine as a parameter:
Code:
      DIM S% LOCAL -1
      PROC_printChrsOnStack(S%, !^a$)
or, knowing how much stack is used for the procedure call and its own LOCAL variables, adjusting the starting point of the stack dump accordingly.
I remain extremely concerned at your claim to have suffered corrupted strings in BB4W. No bugs of that seriousness have been discovered for several years, so it would be an awful shock to have one confirmed now.
Richard.
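
A minimal sketch of the first suggestion follows. It assumes a hypothetical dump routine, PROC_dump(start%, count%), rather than the PROC_printChrsOnStack from the uploaded program, whose code is not shown in the thread. The caller captures the stack pointer with DIM S% LOCAL -1 and passes it in, so the dump starts at the caller's stack position instead of the routine's own. The 64-byte count and the clamp to HIMEM are also assumptions.

```bbcbasic
      REM Sketch: dump the stack as seen by the caller, not by the dump routine.
      REM PROC_test, PROC_dump and the 64-byte count are hypothetical.
      a$ = "STACKED"
      PROC_test
      END

      DEF PROC_test
      LOCAL a$ : REM in v5 the old contents of a$ are saved on the stack
      DIM S% LOCAL -1 : REM S% = current stack pointer, nothing is reserved
      PROC_dump(S%, 64)
      ENDPROC

      DEF PROC_dump(start%, count%)
      LOCAL I%, C%
      REM Never read beyond the top of the stack
      IF start% + count% > HIMEM THEN count% = HIMEM - start%
      FOR I% = 0 TO count% - 1
        C% = start%?I%
        REM Show printable ASCII, replace everything else with a dot
        IF C% >= 32 AND C% < 127 THEN PRINT CHR$(C%); ELSE PRINT ".";
      NEXT
      PRINT
      ENDPROC
```

Because start% is fixed by the caller, the stack used by PROC_dump itself lies below the dumped region and does not appear in the output.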

Andy Parkes
Developer
« Reply #5 on: Jan 7th, 2014, 7:50pm »
Hi Richard,
No no, don't worry, it's definitely not a bug in BB4W! That was a poor, but honestly accidental, choice of words. I apologise if it read like that. Instead of 'corrupt' I might more accurately have said that I encountered a confusing variety of partially or entirely incorrect output relative to my expectations, based upon the understanding that I had at the time. I never for a second assumed that it was genuinely corrupt data, and did not intend it to be read as a claim to having discovered a bug. As is always the case, the output is predictable when I know what I'm doing.
It happened when I began reading and assigning strings indirectly from/to structure string members. I have chosen to do this indirectly as part of an attempt to simplify the interface to a library I am working on. Of course, it's taken me into this new territory, which is also a good reason to have explored the idea.
I can see that my choice of words has overcomplicated my explanation, but your notes confirm that which I have recently learned - that strings are created on the heap, and copied into the stack (in version 5) when made local.
Quote: what is being printed as the contents of the stack includes some irrelevant data
Thank you, I did notice that one, but I did not think that it mattered in this case, since I only wanted to reveal any ASCII characters that would prove that one of the strings exists on the stack. But I appreciate having the concept reinforced that when a procedure is called, it and its variables will add to the stack, and I also read into your comment the advice that it's good practice to make the meaning of the code clear.
Apologies again for having caused you concern.
Kind Regards
Andy

admin
Administrator
« Reply #6 on: Jan 7th, 2014, 8:38pm »
on Jan 7th, 2014, 7:50pm, Andy Parkes wrote: I never for a second assumed that it was genuinely corrupt data, and did not intend it to be read as a claim to having discovered a bug. As is always the case, the output is predictable when I know what I'm doing.
I'm grateful for the assurance (I probably wouldn't have slept tonight otherwise!) but I'm still puzzled as to the kind of code you have which is sensitive to the details of how strings are managed internally.
The critical thing about BBC BASIC's strings (as with most other dialects of BASIC, in fact) is that they are 'movable', therefore you cannot assume that a given string will remain at a constant memory address throughout its life.
You're safe if the string variable and the corresponding memory contents are accessed in consecutive statements, or with the only intervening code having nothing to do with strings, for example:
Code:
      PRINT a$
      a$ += CHR$(0)
      PRINT $$!^a$
but this definitely isn't guaranteed to work:
Code:
      PRINT a$
      S% = !^a$
      a$ += CHR$(0)
      PRINT $$S%
Are you, somewhere in your program, assuming that strings are fixed?
Richard.
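
If a stable address really is needed, one option (not proposed in the thread) is to copy the characters into a fixed block reserved with DIM and use that address instead of the string's own. The following is only a sketch; the buffer name buf% and its 256-byte size are assumptions.

```bbcbasic
      REM Sketch: copy a movable string into a fixed buffer.
      DIM buf% 255 : REM reserves 256 bytes at a fixed address
      a$ = "Hello"
      REM Copy with a NUL terminator, but only if it fits in the buffer
      IF LEN(a$) < 256 THEN $$buf% = a$
      a$ += " world" : REM a$ may now move, buf% does not
      PRINT $$buf%
```

The copy in buf% is a snapshot, so it still holds the original five characters after a$ has been extended. Anything that needs the current contents must copy again.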

Andy Parkes
Developer
« Reply #7 on: Jan 8th, 2014, 10:42am »
Hi Richard,
I'm glad you've asked, because in answering, I've discovered that there is still something that I don't quite understand.
The program accessible from the Wiggio link below, reproduces the difficulty I was encountering and also demonstrates my solution. However, the fact that the same output is generated in BB4W_v6 tells me that this has nothing to do with LOCAL string management after all (although the insight I've gained into this was worth the extra confusion).
http://wiggio.com/yui/folder/stream_file.php?doc_key=QVteex27abiyemnhNemnSSNGUImoPAe4Y5ZCmINRI+I=
The issue occurs when I take a string variable (LOCAL or not), and indirectly set its header to point to a string at a known memory address. I can then, as I expected, use this string variable to directly access the string at the given memory address - OK! But for some reason that I don't yet understand, repeating this process for different memory addresses does not appear to work in my example?
I understand the examples in your previous post, such that when the length of a string is changed, it may (or may not) be relocated on the heap to accommodate the new length of the string. But as I am not changing the length of any strings, I think there is something else at play here.
Andy

admin
Administrator
« Reply #8 on: Jan 8th, 2014, 12:53pm »
on Jan 8th, 2014, 10:42am, Andy Parkes wrote: The program accessible from the Wiggio link below, reproduces the difficulty I was encountering and also demonstrates my solution.
Your program appears to be implementing a linked list using structures. Have you seen my example of doing this on the Wiki:
http://bb4w.wikispaces.com/Linked+lists+using+structures
One thing you'll notice is that I never explicitly manipulate the 'format pointer' of the structure, I use only built-in language features to look after that; only the 'data pointer' is explicitly loaded. In my opinion that makes the code cleaner and easier to follow.
Another thing about your code that I'm not enthusiastic about is this way of discovering the size of a structure:
Code:
      pFrmt% = !^Node{}
      dataSize% = !(pFrmt%)
It works, but using the built-in DIM() function is more straightforward:
Code:
      dataSize% = DIM(Node{})
Quote: The issue occurs when I take a string variable (LOCAL or not), and indirectly set its header to point to a string at a known memory address.
Don't do it! BBC BASIC's string handling and garbage collection routines assume that the memory occupied by a string was allocated by the interpreter for that specific purpose. The sort of thing that can easily go wrong if you 'do it yourself' is that two different string variables can end up pointing to the same memory address, with potentially nasty consequences!
Quote: But for some reason that I don't yet understand, repeating this process for different memory addresses does not appear to work in my example?
It's pretty easy to see why FN_getMemberBAD is going to fail. You poke an address into the descriptor of LOCAL string variable name$, so when the function exits the interpreter will free the local variable in the usual way, hence 'freeing' memory that it didn't allocate in the first place (and which definitely isn't 'free')!
In over a decade of programming in BBC BASIC for Windows I have never been tempted to play tricks of the sort you are using. Partly that's because I know they won't work (reliably), but partly it's because I have never had cause to - there has always been a more 'legitimate' way of achieving the same result.
For example in the Wiki article to which I link above the equivalent routine to your FN_getMemberGOOD and FN_getMemberBAD is FNgetitem:
Code:
      DEF FNgetitem(RETURN n{}, n%)
      IF n% = 0 THEN = ""
      !(^n{}+4) = n%
      = n.item$
Richard.
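
As a rough illustration of the two points above (sizing a structure with DIM(), and loading only the data pointer), here is a minimal sketch. The structure names first{} and view{} and the member link% are assumptions for this example; item$ follows the member name used in FNgetitem. It relies only on the idioms quoted in the thread: !^s{} for the format pointer and !(^s{}+4) for the data pointer.

```bbcbasic
      REM Sketch: two structures with the same layout (names are hypothetical)
      DIM first{link%, item$}
      DIM view{link%, item$}

      REM Both lines should report the same size
      PRINT "Size via DIM():          "; DIM(first{})
      PRINT "Size via format pointer: "; !(!^first{})

      first.item$ = "held by first{}"

      REM Load only the data pointer, so view{} now looks at first{}'s data.
      REM The format pointer of view{} is left alone.
      !(^view{}+4) = !(^first{}+4)
      PRINT view.item$
```

The string is still assigned and read through an ordinary structure member, so the interpreter allocated its memory and remains responsible for it. No string descriptor is ever poked directly.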

Andy Parkes
Developer
« Reply #9 on: Jan 8th, 2014, 7:27pm »
Hi Richard,
That's fantastic, thanks very much. I see that I've missed quite a few fundamental things, and while I apologise for not having first studied the linked list example on the wiki, I've learned more by doing it the hard way.
Quote: when the function exits the interpreter will free the local variable in the usual way, hence 'freeing' memory that it didn't allocate in the first place (and which definitely isn't 'free')!
Thank you, I've finally made the connection! So that’s why it was writing the next string insertion over the top of the previous one, because the internal string management freed the memory occupied by the string, every time my function fetched it.
I see that the technique is never going to be a good idea. Even if I were to use a global string variable for the same purpose, it's just creating an unnecessary risk. I recognise that it's a hack, and the wrong way to go about solving the problem.
Right, well I have some reworking to do of the project which started this, but it will be so much the better for it.
Thank you very much for your help
Kind Regards
Andy