
Key: SVC-2286
Type: Bug
Status: Resolved
Resolution: Misfiled
Priority: Normal
Assignee: Unassigned
Reporter: Domchi Underwood
Votes: 1
Watchers: 2
2. Second Life Service - SVC

Changing states in LSL clears event queue - which means events get lost

Created: 03/May/08 10:39 AM   Updated: 22/Jun/08 03:12 AM
Component/s: Scripts
Affects Version/s: 1.21.0 Server
Fix Version/s: None

Issue Links:
Relates
 


Description
As documented on the LSL Wiki (http://www.lslwiki.net/lslwiki/wakka.php?wakka=events), changing states in LSL clears the event queue. This means that if a script receives a linked message (sent with llMessageLinked) while it is changing states, the message is deleted.

In my case, this has just caused a subtle and hard-to-find bug which manifested only under specific conditions in lag-free sims.

I think this needs to be fixed. Luckily, backward compatibility shouldn't be a problem in this case, and I can't think of a case where deleting the event queue is a good thing.

It's really a pity - I believe this issue is the only thing that makes linked messages a bit unreliable.



Harleen Gretzky added a comment - 03/May/08 11:43 AM
I always thought they got queued and the next time the script entered the state they fired off.

Domchi Underwood added a comment - 03/May/08 01:17 PM
For a simple example, put these two scripts inside a single prim and touch the prim. The return message gets lost if you leave the llSleep call active (uncommented).

— 8< ----------------------------------------------------------------------------------------

default
{
    touch_start(integer total_number)
    {
        llSay(0, "Touched.");
        llMessageLinked(LINK_THIS, 0, "ping", "");
        // comment out the llSleep below for this script to stop losing the return pong linked message
        llSleep(5.0);
        state another;
    }
}

state another
{
    state_entry()
    {
        llSetTimerEvent(10);
    }

    link_message(integer sender_num, integer num, string msg, key id)
    {
        if (msg == "pong") llSay(0, "Got return message!");
    }

    timer()
    {
        llSay(0, "Returning to default state...");
        state default;
    }
}

— 8< ----------------------------------------------------------------------------------------

default
{
    link_message(integer sender_num, integer num, string msg, key id)
    {
        if (msg == "ping")
        {
            llSay(0, "Received ping, responding...");
            llMessageLinked(LINK_THIS, 0, "pong", "");
        }
    }
}

— 8< ----------------------------------------------------------------------------------------


Haravikk Mistral added a comment - 04/May/08 05:24 AM
This is actually a fair point. However, I'm unsure about the assumption that it won't break functionality. While I'm not aware of any such poorly scripted items, some scripts may not filter link messages and may simply respond to them differently depending on state. For example:

-----------------------

default {
state_entry() { llMessageLinked(LINK_THIS, 0, "", NULL_KEY); llMessageLinked(LINK_THIS, 0, "", NULL_KEY); }
link_message(integer x, integer y, string msg, key id) { state received; }
}

state received {
state_entry() { llOwnerSay("Received first message"); }
link_message(integer x, integer y, string msg, key id) { state end; }
}

state end {
state_entry() { llOwnerSay("Ended"); }
}

----------------------------

If events were retained across state changes, the above script would immediately switch to the end state, rather than sitting in the received state until it receives another message. Granted, this example is pointless, but there may be similar cases in-world somewhere that would break if this fundamental behaviour were changed.

A better alternative would be a new function:

llRetainEvents(integer flag);

This function allows you to specify whether you want to retain events during a state-transition by setting flag to TRUE (keep events) or FALSE (clear events). Default behaviour is llRetainEvents(FALSE) which is the same as it is now.


Lex Neva added a comment - 04/May/08 10:15 AM
Harleen: that's not how it works.

Domchi: the system has always worked this way. I know at least some of my scripts are designed to take advantage of this, and they purposefully switch states in some cases in order to clear out some unwanted events. For a simple example, imagine a lightswitch that takes a second to flip. I want to make it so that clicking 4 times in a row doesn't mean that 4 touch events get put into the queue, causing the light to switch 4 times in a row. To accomplish this, I'll have two states, "on" and "off", knowing that, when switching states, any extra events in the queue will get dropped. This can be a very useful and sensible way to program this kind of device.


Lex Neva added a comment - 04/May/08 10:15 AM
I should note additionally that there's no other way to clear the event queue.

Domchi Underwood added a comment - 04/May/08 10:17 AM
Haravikk, I'm pretty sceptical of that. Why would someone send a linked message just to rely on it being ignored?

Even if such scripts do exist (that is, if someone is sending messages and knowingly using this mechanism to ignore them), I still think this kind of behaviour is a bug and should be corrected, since in the majority of cases it's something you don't expect and it probably introduces a lot of subtle bugs. In other words, the bug I had because of this was probably the subtlest and hardest-to-find bug I have ever encountered in LSL.


Domchi Underwood added a comment - 04/May/08 10:24 AM
Lex, thanks for the example - so someone is using it after all. But why don't you simply create a boolean called "switching", set it to true when you start the flip, reset it to false when the flip is done, and simply ignore all the events in the meantime?

Or create an intermediate "switching" state which would ignore those events - isn't that a "cleaner" solution to such problems?


Domchi Underwood added a comment - 04/May/08 10:41 AM
I thought about this a bit more and still think it's a bug.

Perhaps a function to clear the event queue would be appropriate (though I think that can be done with an empty intermediate state).

The current situation basically means that if I have one script which reacts to linked messages by changing states, and more than one script which sends linked messages, I can never be sure that all the messages are received, and I cannot fix that with any kind of defensive programming. Basically, I have no control.

This can be especially bad if two scripts share a state or variable, where a change of that state/variable can originate in either of the two scripts.


Lex Neva added a comment - 04/May/08 10:52 AM
Hmm... well, let's code this out and see what we come up with.

My original example code:

turnOff() {
// some process that takes one second
}

turnOn() {
// some process that takes one second
}

default {
state_entry() { turnOff(); state off; }
}

state off {
touch_start(integer num) { turnOn(); state on; }
}


state on {
touch_start(integer num) { turnOff(); state off; }
}

Fairly simple and straightforward code.

Your first suggestion is to set a boolean called "switching" that's only TRUE while the flip is in progress. I can't just do this, though:

turnOn() {
switching = TRUE;
// do something that takes one second
switching = FALSE;
}

The problem there is that events are going to be queueing while I'm on the middle line in the function, but they'll actually run after I've left the function because LSL is single-threaded. It won't work to just check the state of the switching variable when events come in... by the time those events get run, turnOn() will be long finished. In fact, the touch_start() event will also have finished... so I'm going to already be in the other state, but events that queued up while turnOn() was running will actually be running in the "on" state. Suddenly keeping track of that "switching" variable becomes a mess.

I can potentially code around this IF (and only if) turnOn() can be broken up into two parts, startTurningOn() and stopTurningOn(), with a 1-second pause in the middle. Then I could do this:

state off {
touch_start(integer num) {
if (!switching) { switching = TRUE; startTurningOn(); llSetTimerEvent(1.0); }
}

timer() {
llSetTimerEvent(0.0); // timers survive state changes, so stop it here
stopTurningOn(); switching = FALSE; state on;
}
}

That just got a lot more complicated to write, but it should work... except there's still a little bit of a problem. Any time while that timer() event is running, more touches might be coming in and queueing up. Those touches would still be there and would run in the "on" state, which means I failed at making my switch not queue up events. Plus, this whole thing is not possible if I can't break up turnOn() (e.g. if the 1-second pause is caused by LSL-enforced script delay).

So the intermediate "switching" state would be the only solution:

turnOff() {
// some process that takes one second
}

turnOn() {
// some process that takes one second
}

default {
state_entry() { turnOff(); state off; }
}

state off {
touch_start(integer num) { turnOn(); state switching_to_on; }
}

state switching_to_on {
state_entry() { state on; }
}

state switching_to_off {
state_entry() { state off; }
}

state on {
touch_start(integer num) { turnOff(); state switching_to_off; }
}

Should work, right? But how do I know whether all of those events got cleared out of the queue in the intermediate states? I can't just llSleep()... the state_entry() will still be running. Maybe I need to call llSetTimerEvent() and allow those events to process for a second before I actually change states... it gets messy. Plus, I personally think that the switching state is much uglier than my original code, and it's not obvious to the uninitiated WHY I've made an intermediate state.

Before I go, I just thought of a more concrete example of why keeping events could cause problems:

default {
state_entry() { state listen1; }
}

state listen1 {
state_entry() { llListen(1, "", "", ""); }

listen(integer channel, string name, key id, string message) { llSay(0, "I expect that you said this on channel 1: " + message); }

touch_start(integer num) { state listen2; }
}


state listen2 {
state_entry() { llListen(2, "", "", ""); }

listen(integer channel, string name, key id, string message) { llSay(0, "I expect that you said this on channel 2: " + message); }

touch_start(integer num) { state listen1; }
}

In this script, the code reasonably expects that if it's listening on channel 1, then listen() events are only going to come through for chat on channel 1. But if events are kept between state transitions, that expectation is violated. If a touch event and then a listen event come in rapid succession, the touch event will switch to the other state and then a listen event will immediately process on an unexpected channel. Worse yet, that listen event will process before the state_entry() event processes! This means that state_entry() will no longer be a place where initialization can be done, and all event code will need to check to be sure that events from the previous state haven't carried over. That'd be a huge mess.


Lex Neva added a comment - 04/May/08 10:53 AM
Well, I can see your problem... but I think I've somewhat longwindedly shown that keeping events around has its own set of pitfalls, and changing this behavior suddenly would confuse even more people.

Haravikk Mistral added a comment - 04/May/08 12:12 PM
Any more love for an llRetainEvents() function then? I don't see that this can be done without changing basic behaviour, but I can see advantages in it being available for those that want it.

Personally I try to design shared 'service' scripts in an offer/commit type situation. Basically the first message my script receives is a request. If my script is available, it immediately "locks" itself (using a unique key) and replies. If after the reply I then receive a "commit" request with that unique key, I begin executing. If I receive new requests while "locked", they are automatically rejected.

Basic behaviour for a 'client' script in this case then is to send a request, and start a timer with a reasonable response-time. If a reject response is received then the script just adds some time to its timer to give the 'service' script a chance to respond. Once the timer completes the script just tries again. The result is that it keeps trying until it gets through, or gives up. If the request is responded to with an 'offer' response then it stops its timer and commits the action.
There's also the special case of a client that receives an offer but no longer wants to commit, in which case it requests a rollback, which causes the service to "unlock" itself again.
The service automatically unlocks itself if it doesn't receive a commit request in response to its offer in a timely fashion.

This is a fairly typical distributed-system model. It's not as easy to code as just assuming events are retained, but it's designed not to fail.
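
The service side of an offer/commit scheme like the one described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the message names ("request", "offer", "reject", "commit", "rollback"), the OFFER_TIMEOUT value and the use of llGenerateKey() for the lock token are all hypothetical choices, not taken from any actual script in this thread.

```lsl
// Hypothetical protocol strings and timeout; adjust to your own scripts.
float OFFER_TIMEOUT = 5.0;   // seconds to wait for a commit before unlocking
key   lock = NULL_KEY;       // NULL_KEY means the service is unlocked

unlock()
{
    lock = NULL_KEY;
    llSetTimerEvent(0.0);
}

default
{
    link_message(integer sender, integer num, string msg, key id)
    {
        if (msg == "request")
        {
            if (lock != NULL_KEY)
            {
                // Already locked for another client: reject immediately.
                llMessageLinked(sender, 0, "reject", NULL_KEY);
                return;
            }
            // Lock with a fresh token and offer it to the client.
            lock = llGenerateKey();
            llMessageLinked(sender, 0, "offer", lock);
            llSetTimerEvent(OFFER_TIMEOUT);
        }
        else if (msg == "commit" && lock != NULL_KEY && id == lock)
        {
            llSetTimerEvent(0.0);
            // ... do the actual work here ...
            unlock();
        }
        else if (msg == "rollback" && id == lock)
        {
            // The client changed its mind: release the lock.
            unlock();
        }
    }

    timer()
    {
        // No commit arrived in time: release the lock.
        unlock();
    }
}
```

Because this sketch never changes state, its own event queue is never cleared. Replies go to every script in the sending prim, so client scripts would need to ignore "offer" and "reject" messages that were not meant for them.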


Domchi Underwood added a comment - 04/May/08 11:35 PM
Lex, regarding the switch... I usually delegate long-running tasks to another script. So my instinct would be to write something like this:

— 8< ----------------------------------

integer switch_in_progress = FALSE;

default
{
    touch_start(integer num)
    {
        if (!switch_in_progress)
        {
            switch_in_progress = TRUE;
            llMessageLinked(LINK_THIS, 0, "flip", "");
        }
    }

    link_message(integer sender_num, integer num, string msg, key id)
    {
        if (msg == "finished") switch_in_progress = FALSE;
    }
}

— 8< ----------------------------------

// default is off
default
{
    state_entry()
    {
        // here goes 1-second flip to off
        llMessageLinked(LINK_THIS, 0, "finished", "");
    }

    link_message(integer sender_num, integer num, string msg, key id)
    {
        if (msg == "flip") state on;
    }
}

state on
{
    state_entry()
    {
        // here goes 1-second flip to on
        llMessageLinked(LINK_THIS, 0, "finished", "");
    }

    link_message(integer sender_num, integer num, string msg, key id)
    {
        if (msg == "flip") state default;
    }
}

— 8< ----------------------------------

Maybe not as straightforward as your original code, but not that complicated either.

Now, my problem is exactly the opposite of yours - doing the same on/off switching thing, but not ignoring any user touches in the meantime. Let's say that user input is sacred and shouldn't be ignored. Yes, I could do that without using states trivially, but suppose that I want to use states because of other reasons (different on/off behaviour that could get ugly if state is saved in variables).

My problem is that due to the said issue, there is NO WAY of being sure that I get all the user input if my script listens to input and uses states. I can make switching as fast as possible (not doing any long operations while switching states), and hope that I catch everything, but I can never be sure.

And I usually find myself in a situation where I do want to receive all user input (or linked messages, or whatever) - and then choose to ignore it if I wish.

To give an example... consider your last example with listens on channel 1 and 2 depending on state. It's trivially easy to fix - just add "if (channel == 1)" and "if (channel == 2)" to the listen events.

So, to summarize: I don't know which behaviour is logical. To me, not clearing the event queue is, but I guess this ultimately depends on individual preferences. My problem is that I can emulate clearing the event queue with a bit more programming, but I can't emulate keeping the queue in the current situation.

The solution I like the most would be to correct this behaviour and add a function to clear the queue - llClearEventQueue() for example.

Second best would be to add a script property to set the desired behaviour. For backward compatibility, the default could be to clear the event queue when switching states.


Haravikk Mistral added a comment - 05/May/08 02:59 AM
llClearEventQueue() is not the better solution in this case, as it still changes behaviour for existing scripts. We can't expect script authors to go around correcting faults in their scripts; not everyone tracks their sales or has an auto-update system to do this.

As I've already said, llRetainEvents(integer flag) or some similarly named function is the best solution that I can see as it doesn't change existing scripts, but lets new ones decide which behaviour they want.


Aimee Trescothick added a comment - 05/May/08 06:10 AM
While the existing behavior may be limiting and a source of bugs, it is as intended and useful in some circumstances, so not really a bug in itself.

I agree with Haravikk, it should be an optional flag to avoid breaking existing content. SVC-2297 created as a New Feature request for llRetainEvents so it's voteable, feel free to expand the description.


Aimee Trescothick made changes - 05/May/08 06:10 AM
Field | Original Value | New Value
Link | | This issue is related to by SVC-2297 [ SVC-2297 ]

Lex Neva added a comment - 05/May/08 09:26 AM
Well, what we're facing is controversial behavior with no clear winner. I think in a case like that, precedent has to win. llRetainEvents() could work, but it's a little hacky, IMO.

You could solve your problem by adding another script, Domchi. It'd just be a simple script that queues up messages (touches, link messages, etc). It would send them one at a time to the consumer script, and that script would acknowledge when it had received a message. It ought to be a fairly simple solution that allows you to continue to use state transitions in your main logic.
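
A rough sketch of such a buffering script is shown below. The numeric message codes (MSG_IN, MSG_OUT, MSG_ACK) and the RETRY interval are assumptions made up for this illustration. Producers are assumed to send their messages with num = MSG_IN, and the consumer is assumed to reply with num = MSG_ACK once it is ready for the next message.

```lsl
// Assumed protocol (hypothetical): producers send num = MSG_IN, the buffer
// forwards with num = MSG_OUT, the consumer acknowledges with num = MSG_ACK.
integer MSG_IN  = 100;
integer MSG_OUT = 101;
integer MSG_ACK = 102;
float   RETRY   = 2.0;    // seconds before an unacknowledged message is resent

list    pending;          // queued message strings, oldest first
integer waiting = FALSE;  // TRUE while the head of the queue awaits its ack

sendNext()
{
    if (waiting) return;
    if (llGetListLength(pending) == 0) return;
    waiting = TRUE;
    llMessageLinked(LINK_THIS, MSG_OUT, llList2String(pending, 0), NULL_KEY);
    llSetTimerEvent(RETRY);
}

default
{
    link_message(integer sender, integer num, string msg, key id)
    {
        if (num == MSG_IN)
        {
            pending += [msg];
            sendNext();
        }
        else if (num == MSG_ACK && waiting)
        {
            // The consumer has handled the head of the queue: drop it.
            pending = llDeleteSubList(pending, 0, 0);
            waiting = FALSE;
            llSetTimerEvent(0.0);
            sendNext();
        }
    }

    timer()
    {
        // No ack yet: the consumer may have lost the message during a
        // state change, so send the same message again.
        if (waiting)
            llMessageLinked(LINK_THIS, MSG_OUT, llList2String(pending, 0), NULL_KEY);
    }
}
```

The retry timer means a message can occasionally be delivered twice if the acknowledgement is slow, so the consumer should tolerate duplicates. The buffer itself never changes state, which is what keeps its own queue intact.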


Domchi Underwood added a comment - 05/May/08 10:03 AM
I think we have reached some sort of consensus.

I agree that llRetainEvents() is hacky, however it's probably the most realistic solution because of the backwards compatibility.

However, I would still argue that this might be a case where LL should consider breaking backwards compatibility.

On one hand, you have all the items which rely on this behaviour (and we don't really know how many - maybe few, maybe a lot), and breaking them really hurts. If something like that is to be broken, LL should at least announce it long before it happens (maybe with the next big LSL version?) so that scripters have enough time to adapt... or, since they control the platform in peculiar way, maybe check when script was last modified and implement new default behaviour only for scripts which are last modified after a certain date.

On the other hand, you have the classic argument... think of the children..! And by that I mean, how many scripters consider this behaviour when writing scripts? In this issue, we had three scripters who assumed three different things. I assumed that events don't get deleted. Harleen Gretzky thought that they fire the next time the script enters the same state. And of course Lex has read the documentation.

But how many scripters in the future (and present scripters as well) will NOT assume that events get deleted? How many scripters won't intuitively search for llRetainEvents() function? This is the real question which should be weighted against not breaking backward compatibility.

For me, it doesn't matter anymore. I've spent my couple of hours debugging a strange bug which made my script behave completely differently depending on the sim. Now that I know about this issue, I've fixed my scripts so that this is taken into account. But how many scripters will have to spend the same couple of hours hunting bugs, and even worse, how many people have bought items in which scripters didn't assume that linked messages and events can be lost?

And even worse, how many items that have worked well thus far will stop working when Mono suddenly introduces much better performance and threads start executing LSL much faster, sending more linked messages in a shorter period of time...


Lex Neva added a comment - 05/May/08 10:13 AM
A consensus of sorts... but I still strongly feel that the current behavior makes the most sense. It's not just because I've read the documentation... I've been scripting for 3 years and never been inconvenienced by this functionality, and I make use of it constantly.

> or, since they control the platform in peculiar way, maybe check when script was last modified and
> implement new default behaviour only for scripts which are last modified after a certain date.

Eek, please no! I'd really hate to go in and correct a typo in my script, only to have to completely rewrite it to handle this new behavior.

> But how many scripters in the future (and present scripters as well) will NOT assume that events
> get deleted? How many scripters won't intuitively search for llRetainEvents() function? This is the
> real question which should be weighted against not breaking backward compatibility.

Well, llListen()s are cleared on state transition. In fact, the only things that aren't cleared are global variables and llSetTimerEvent()s, and in my opinion, the latter is a serious and longstanding bug.


Domchi Underwood added a comment - 05/May/08 10:30 AM
>> implement new default behaviour only for scripts which are last modified after a certain date.
> Eek, please no! I'd really hate to go in and correct a typo in my script, only to have to completely rewrite it to handle this new behavior.

Yeah, you're right. Last created after a certain date? Or simply a bad idea...

> Well, llListen()s are cleared on state transition. In fact, the only things that aren't cleared are global variables and llSetTimerEvent()s, and in my opinion, the latter is a serious and longstanding bug.

Well, my original problem was that if you receive linked message just before switching states, that linked message is eaten. I've had it happen when two linked messages are sent one after another, and where the first one changes the state.


Strife Onizuka added a comment - 31/May/08 06:30 PM - edited
I'm resolving this as Misfiled because it's not a bug or a misfeature, it was intentionally designed this way. Furthermore it's in the documentation and has been there for at least four years.

Personally I never use states, I don't like the idea of missing events.
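
For scripts that cannot afford to miss events, one way to follow this approach is to keep the mode in a global variable instead of using states. A minimal sketch, assuming a simple on/off switch that must also receive every linked message (the variable name isOn and the "pong" message are only illustrative):

```lsl
integer isOn = FALSE;   // replaces separate "on" and "off" states

default
{
    touch_start(integer total_number)
    {
        // Toggle the mode without a state change, so nothing is discarded.
        isOn = !isOn;
        if (isOn) llSay(0, "Switched on.");
        else llSay(0, "Switched off.");
    }

    link_message(integer sender_num, integer num, string msg, key id)
    {
        // Handled in every mode, and never dropped by a state transition.
        if (msg == "pong") llSay(0, "Got return message!");
    }
}
```

The trade-off is that every handler has to check the mode variable itself, which is the kind of bookkeeping that states are otherwise meant to remove.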


Strife Onizuka made changes - 31/May/08 06:30 PM
Status | Open [ 1 ] | Resolved [ 5 ]
Resolution | | Misfiled [ 6 ]

Christopher Omega added a comment - 22/Jun/08 03:12 AM - edited
Oops. I really should've searched for something similar before posting my own JIRA complaining about this "feature" of state transitions. I wholeheartedly disagree with Strife that this issue is resolved. For precisely the reasons he stated, event queue clearing makes state transitions dangerous! State transitions should not be dangerous, and I don't think they were originally intended to be so. In fact, the current (albeit long-standing) behavior is a hack to work around a bug caused by a bad design decision. See SVC-2560 if you want to hear more from me about that.

I would totally be for an llRetainEvents(TRUE) workaround for this. As much as I hate the current behavior, I hate breaking backward compatibility even more, even though LL has set a precedent for doing so when they made llListen state-scoped. It used to maintain global scope just like llSetTimerEvent.

Grumble Grumble!


Sue Linden made changes - 13/Nov/08 12:10 PM
Workflow | jira-2007-12-22a [ 55575 ] | jira-2008-11-14 [ 82444 ]

Sue Linden made changes - 13/Nov/08 04:44 PM
Workflow | jira-2008-11-14 [ 82444 ] | jira-2008-11-14a [ 91434 ]